Edge ETL Data Collector Service Pattern
This subflow pushes data to an outbound HTTP endpoint. It accepts JSON input (an object or an array) in msg.payload, which is treated as the data to be pushed. The flow is intended to run as a service at the edge, acting as an intermediate data collector. The main goal is a robust and reliable data collection process that is not affected by loss of network connectivity or other disturbances at the edge.
The main logic of the flow appears in the following figure.

The Node-RED input layer can be used to adapt to the diverse protocols needed on the Edge IoT side. An example of a local HTTP In interface that feeds the Edge ETL subflow appears in the following figure.

Arbitrary preprocessing may then be performed in the inner flow, to reduce the need to do this later on the Cloud side, where it may involve huge volumes of data coming from multiple sources. After that, the flow tries to push the data to the main Cloud storage service, applying the retry pattern. The retry settings are configured in the subflow UI.
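
The retry step can be illustrated outside Node-RED with a minimal Python sketch. It is not the subflow's actual implementation: the function name push_with_retry, the injected send callable, the linear back-off and the default values are all assumptions made for illustration.

```python
import time
from typing import Any, Callable


def push_with_retry(
    payload: Any,
    send: Callable[[Any], None],
    max_retries: int = 3,          # assumed default; the subflow UI sets the real value
    base_delay_s: float = 1.0,     # assumed back-off step, not taken from the subflow
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try to push the payload; return True on success, False once retries are exhausted."""
    for attempt in range(1, max_retries + 1):
        try:
            send(payload)          # hypothetical callable that performs the HTTP POST
            return True
        except OSError:            # ConnectionError, timeouts and URLError are OSError subclasses
            if attempt < max_retries:
                sleep(base_delay_s * attempt)  # wait a little longer before each new attempt
    return False
```

A False return value is the point at which the caller would hand the payload over to local storage instead of dropping it.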

If the push is still unsuccessful after a number of retries, the data are stored locally in an SQLite database (default location: /data/lost_data.db). A cron job periodically checks this local database of lost data and retries pushing them to the central service, while avoiding the insertion of duplicate values. The subflow internals appear in the following figure.
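
The store-and-replay behaviour can be sketched in Python as follows. This is an illustration under assumptions, not the subflow's code: the table name lost_data, its columns, and the use of a content hash as the key that prevents duplicates are all hypothetical choices. Only the default path /data/lost_data.db comes from the description above.

```python
import hashlib
import json
import sqlite3
from typing import Any, Callable


def open_store(path: str = "/data/lost_data.db") -> sqlite3.Connection:
    """Open the local store and create the (hypothetical) table if it is missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS lost_data (digest TEXT PRIMARY KEY, body TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def store_lost(conn: sqlite3.Connection, payload: Any) -> None:
    """Save a payload that could not be pushed; an identical payload is stored only once."""
    body = json.dumps(payload, sort_keys=True)
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    conn.execute(
        "INSERT OR IGNORE INTO lost_data (digest, body) VALUES (?, ?)", (digest, body)
    )
    conn.commit()


def replay_lost(conn: sqlite3.Connection, send: Callable[[Any], None]) -> int:
    """Retry every stored payload; delete a row only after its push succeeded."""
    rows = conn.execute("SELECT digest, body FROM lost_data").fetchall()
    pushed = 0
    for digest, body in rows:
        try:
            send(json.loads(body))  # hypothetical callable that performs the HTTP POST
        except OSError:
            break                   # still offline: keep the remaining rows for the next run
        conn.execute("DELETE FROM lost_data WHERE digest = ?", (digest,))
        conn.commit()
        pushed += 1
    return pushed
```

In this sketch replay_lost plays the role of the periodic job. Keying rows by content hash treats two byte-identical payloads as one record, which is only appropriate if each payload carries something that distinguishes it, such as a timestamp.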

The target POST endpoint can be configured in the node UI, as well as the number of retries before temporarily giving up on pushing. Credentials for the HTTP out node need to be set in the “pushout” node. Several tester flows are also provided for local testing, including the use of dummy connection details to simulate a network outage.

If more than one instance of the subflow needs to be deployed on a server (e.g. for pushing different data points to different endpoints), suitable modifications are needed, since all instances would otherwise reuse the same database.
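
One possible modification, shown here only as a hedged Python sketch, is to give each instance its own database file derived from its target endpoint. The helper name db_path_for and the file naming scheme are assumptions and are not part of the subflow.

```python
import hashlib
from pathlib import PurePosixPath


def db_path_for(endpoint_url: str, base_dir: str = "/data") -> str:
    """Return a database path that is stable per endpoint and distinct across endpoints."""
    # A short hash of the endpoint URL keeps the file name safe and predictable.
    tag = hashlib.sha256(endpoint_url.encode("utf-8")).hexdigest()[:12]
    return str(PurePosixPath(base_dir) / f"lost_data_{tag}.db")
```

Two instances pushing to different endpoints would then keep their unsent data apart, while a restarted instance would find its own file again.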
Owner: Harokopio University of Athens
PoC: Prof. George Kousiouris
email: gkousiou@hua.gr
Release Date: 09/07/2022
License: Apache License v2.0
Field of Use: Application design