Empowering Visual Internet-of-Things Mashups with Self-Healing Capabilities

JP
June 03, 2021

Presentation of “Empowering Visual Internet-of-Things Mashups with Self-Healing Capabilities” at the SERP4IoT workshop, co-located with ICSE 2021.

Paper pre-print: https://arxiv.org/abs/2103.07395

Transcript

  1. Empowering Visual Internet-of-Things Mashups with
    Self-Healing Capabilities
    João Pedro Dias, André Restivo, Hugo Sereno Ferreira
    {jpmdias,arestivo,hugosf}@fe.up.pt
    June 3rd, 2021
    3rd International Workshop on Software Engineering Research & Practices for the
    Internet of Things (SERP4IoT 2021)
    Co-located with the 43rd ACM/IEEE International Conference on Software Engineering (ICSE 2021)

  2. Index
    • Context
    • Self-healing for IoT
    • Related Work
    • Previous Work
    • Self-Healing Extensions
    • Experiments & Results
    • Final Remarks

  3. Context: Internet-of-Things
    • Rapid IoT expansion across application domains:
    - Rushed development of devices and systems by competing vendors;
    - Overall neglect of interoperability standards and best practices;
    - Result is a highly-complex, heterogeneous, and frangible ecosystem.
    • More and more reports of devices that stop working or behave in
    unforeseeable ways:
    - e.g., smart locks that open at random, doorbells that do not work without Internet,
    unsafe thermostat temperature adjustments...
    • Few or no safeguards or fallbacks:
    - Ignoring years of research in other fields, e.g., mission-critical systems.
    • Traditional development approaches are becoming unsuitable for IoT
    development:
    - Visual programming solutions (and other low-code approaches) have been proposed as
    an alternative.

  4. Context: Visual Programming and Node-RED
    • Node-RED is the most popular visual programming solution used for IoT
    development:
    - Builds flows by dragging and dropping nodes and links;
    - Provides both a development environment and a runtime;
    - Additional nodes can be added by implementing them in JavaScript.
    • There are few available nodes that allow developers to improve the
    resilience of flows, and none come with the default node palette.
    - This is a current limitation of most low-code IoT development solutions.
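
As an illustration of how an additional node is implemented in JavaScript, here is a minimal sketch of a node module. It relies on the `RED.nodes.createNode` and `RED.nodes.registerType` calls of the Node-RED runtime API; the node type `drop-invalid` and its filtering rule are hypothetical and are not part of Node-RED or of the extensions presented in this talk.

```javascript
// Sketch of a Node-RED node module. In a real package this function would be
// exported with `module.exports = registerDropInvalid;` and paired with an
// HTML file describing the node in the editor.
// "drop-invalid" is a hypothetical node type used only for illustration.
function registerDropInvalid(RED) {
  function DropInvalidNode(config) {
    RED.nodes.createNode(this, config); // standard node initialisation
    const node = this;

    node.on("input", function (msg, send, done) {
      // Fall back to node.send on runtimes older than Node-RED 1.0.
      send = send || function () { node.send.apply(node, arguments); };

      // Forward only messages whose payload is a finite number.
      if (typeof msg.payload === "number" && Number.isFinite(msg.payload)) {
        send(msg);
      } else {
        node.warn("dropping non-numeric payload");
      }
      if (done) done();
    });
  }

  RED.nodes.registerType("drop-invalid", DropInvalidNode);
}
```

Once installed, such a node appears in the palette and can be wired into a flow like any built-in node.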

  5. Self-healing for IoT
    • Several authors have proposed autonomic computing as an approach
    to mitigate some of the management issues of complex IoT systems.
    • An autonomic computing system should be able to:
    - Configure itself (self-configuration);
    - Constantly improve its performance (self-optimization);
    - Protect itself against malicious attacks (self-protection);
    - Automatically detect, diagnose, and repair system defects (self-healing).

  6. Related Work
    • Angarita et al. introduce the notion of responsible objects, stating that things
    should be self-aware of their context and apply smart self-healing
    decisions.
    - However, the transactional nature of the proposed solution has several shortcomings.
    • Aktas et al. and Leotta et al. were among the few to propose the use of
    runtime verification mechanisms to detect system problems. The former use
    a complex-event processing approach and the latter use formal
    specifications of the system (i.e., UML).
    • Szydlo et al., Blackstock et al., and others have proposed solutions to
    improve the reliability of Node-RED itself:
    - Approaches include partitioning Node-RED flows across instances or converting flows
    (or nodes) into code that can be run by edge devices.
    - These approaches either disregard typical edge-tier capabilities or assume computational
    power above what is typical of constrained devices.

  7. Previous Work
    • “Visual Self-healing Modelling for Reliable Internet-of-Things Systems”:
    - We proposed an approach to improve the reliability of Node-RED flows by leveraging
    existing nodes (abstracted into sub-flows), and found several limitations and shortcomings
    of the approach and of Node-RED itself.
    • “A Pattern-Language for Self-Healing Internet-of-Things Systems”:
    - We systematized existing knowledge from several fields regarding reliability,
    fault-tolerance, and self-healing into a pattern-language with 27 patterns.
    - Defined two pattern categories:
    - Error Detection (probes)
    - Recovery and Maintenance of Health

  8. Previous Work: Pattern-Language
    Resource Monitor

  9. Self-healing for IoT: Node-RED Extensions (SHEN)
    • A set of 17 nodes that can be used to add self-healing capabilities to
    Node-RED flows.
    • A single node might leverage more than one self-healing pattern.
    - There are nodes that do both detection and recovery (or maintenance of health),
    while others do only one of the parts.
    - Some nodes only provide specific use cases of the general pattern (e.g., Kalman
    noise filter).
    • Some self-healing patterns cannot be implemented in Node-RED
    alone, e.g., because they depend on the devices' features or exposed
    interfaces.
    • The extension only encompasses nodes (and flows) with reactive behavior.
    - We consider proactive (e.g., preventive) approaches as a future research direction.

  10. Experiments & Results: SmartLab

  11. Experiments & Results: Sensor Failure
    • A sensor device emits temperature and humidity readings every 60 seconds.
    - Possible errors: the sensor does not emit a reading, values are out-of-spec for the sensor, or the sensor misbehaves in other ways (e.g.,
    stuck-at readings). Additionally, Node-RED can restart (e.g., due to a crash).
    • The extra nodes detect (1) if the device stops emitting values (heartbeat) and (2) if values are
    out-of-spec (threshold-check). If any issue appears, (3) missing values are compensated (compensate). If
    Node-RED restarts for some reason, the last reading is injected into the flow output (checkpoint).
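
The detection and compensation steps of this scenario can be sketched in plain JavaScript, independently of Node-RED. This is only an illustration under stated assumptions, not the code of the extension nodes: the sensor range, the grace period, the three-value history, and the choice of the mean as compensation strategy are all assumed, and the checkpoint step is left out.

```javascript
// Assumed limits and timing; the slides only state the 60 s reporting period.
const SPEC = { min: -40, max: 80 }; // hypothetical sensor range (°C)
const PERIOD_MS = 60000;            // one reading every 60 seconds
const GRACE_MS = 5000;              // assumed tolerance before flagging a miss

// Heartbeat check: true when no reading arrived within period + grace.
function heartbeatMissed(lastSeenMs, nowMs) {
  return nowMs - lastSeenMs > PERIOD_MS + GRACE_MS;
}

// Threshold check: true when the value lies outside the sensor's range.
function outOfSpec(value) {
  return !Number.isFinite(value) || value < SPEC.min || value > SPEC.max;
}

// Compensation: mean of the most recent valid readings (one possible strategy).
function compensate(history) {
  if (history.length === 0) return null;
  return history.reduce((a, b) => a + b, 0) / history.length;
}

// Combine the checks. `reading` is undefined when nothing arrived at this tick.
function handleTick(state, reading, nowMs) {
  if (reading !== undefined && !outOfSpec(reading)) {
    state.lastSeenMs = nowMs;
    state.history = [...state.history, reading].slice(-3); // keep last 3 values
    return { value: reading, compensated: false };
  }
  const missing = reading === undefined && heartbeatMissed(state.lastSeenMs, nowMs);
  if (missing || reading !== undefined) {
    // Either the heartbeat was missed or the value was out-of-spec.
    return { value: compensate(state.history), compensated: true };
  }
  return null; // nothing arrived yet, but still within the grace period
}
```

A caller would keep one `state` object per sensor, for example `{ lastSeenMs: 0, history: [] }`, and invoke `handleTick` both when a reading arrives and on a periodic timer.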

  12. Experiments & Results: Load Spike
    • An NFC reader is used to authenticate accesses to the lab.
    - The usage frequency varies during the day, and the load can require extra resources.
    • The extra nodes (1) classify the frequency of readings (timing) as slow, fast, or
    normal, given a configured interval of 15 s per reading. If there is a load spike, the
    requests are balanced among the available resources.
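
The timing classification and the balancing step can be sketched as follows. The 15 s interval comes from the scenario; the 10% margin and the round-robin strategy are assumptions made for this example, since the slides do not state how the extension nodes decide or balance.

```javascript
const EXPECTED_MS = 15000; // 15 s per reading, as configured in the scenario
const MARGIN = 0.1;        // assumed tolerance (10%), not stated in the slides

// Classify the gap between two consecutive readings.
function classifyInterval(prevMs, currMs) {
  const gap = currMs - prevMs;
  if (gap < EXPECTED_MS * (1 - MARGIN)) return "fast";
  if (gap > EXPECTED_MS * (1 + MARGIN)) return "slow";
  return "normal";
}

// Round-robin balancer over a list of workers (one simple balancing strategy).
function makeBalancer(workers) {
  let next = 0;
  return function pick() {
    const worker = workers[next];
    next = (next + 1) % workers.length;
    return worker;
  };
}
```

A flow could route requests through `pick()` only while readings are classified as "fast", and send them to a single resource otherwise.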

  13. Experiments & Results: Redundancy
    • With more than one Node-RED instance running (each with different flows
    configured), if one fails, the other (which becomes the main one) must enable
    a specific flow (which ensures the maintenance of health of the system).
    • The extra nodes make it possible (1) to manage different instances by exchanging ping
    and election messages (redundancy) and (2) to enable or disable flows at
    runtime (flow-control), thus allowing such self-healing behaviors to be configured
    (RUNTIME ADAPTATION).
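
A minimal sketch of the ping-and-election idea follows. The slides do not specify the election rule, so this example assumes that the alive instance with the highest identifier becomes the main one; the 10 s ping timeout and the instance identifiers are likewise hypothetical.

```javascript
const PING_TIMEOUT_MS = 10000; // assumed: a peer counts as down after 10 s of silence

// Peers whose last ping is recent enough to count as alive.
function alivePeers(lastPingById, nowMs) {
  return Object.keys(lastPingById).filter(
    (id) => nowMs - lastPingById[id] <= PING_TIMEOUT_MS
  );
}

// Election rule (assumption): the alive instance with the highest id becomes main.
function electMain(selfId, lastPingById, nowMs) {
  const candidates = [selfId, ...alivePeers(lastPingById, nowMs)];
  return candidates.sort().pop();
}

// Flow control: the recovery flow should be enabled only on the main instance.
function shouldEnableRecoveryFlow(selfId, lastPingById, nowMs) {
  return electMain(selfId, lastPingById, nowMs) === selfId;
}
```

Each instance would update `lastPingById` whenever a ping arrives and re-evaluate `shouldEnableRecoveryFlow` periodically. This simple rule does not handle network partitions, where both sides could elect themselves.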

  14. Experiments & Results: Summary
    • With the presented validation steps, we showcase the feasibility of using self-
    healing mechanisms within Node-RED flows to improve the system dependability.
    • However, we consider that these scenarios show only a portion of the possibilities
    of configuration/use of the self-healing extensions.
    • The experimental scenarios, although inspired by real-world use cases, have been
    hand-picked with prior knowledge of the system, which is one of the considered threats to
    validity.
    • We also consider that using a real deployed testbed enhances the quality of the
    experiments. Nonetheless, it also poses some limitations/threats:
    - Limits the number of devices used during the experiments due to additional costs;
    - Makes it more difficult to replicate;
    - Capturing failures-over-time requires long-running experiments;
    - The users that typically interact with our system exhibit a level of expertise that is not
    representative of most IoT deployment scenarios.

  15. Final Remarks
    • Based on previous work, with the self-healing extensions we have enabled Node-
    RED users to improve the overall system dependability via the addition of self-
    healing mechanisms.
    • We have also encountered several limitations in the current version which we
    consider as future work:
    - Some nodes have issues dealing with some cases (e.g., the redundancy node is unable to deal
    with runtime network partitioning);
    - Node-RED's extension points limit what we can do without modifying Node-RED itself or
    the end-devices;
    - Most of the nodes do not take acceptable delays/margins into consideration (e.g., a delay of
    1 s can be ignored for most smart home applications);
    - The nodes for device/service discovery, device registry, and resource monitoring are limited
    due to the nature of IoT (e.g., lack of standards and the heterogeneity of communication
    protocols);
    - To better understand the limitations of our approach, we need to be able to deliberately
    provoke failures, a process that is currently mostly manual.

  16. Empowering Visual Internet-of-Things Mashups with
    Self-Healing Capabilities
    Thank You!
    João Pedro Dias, André Restivo, Hugo Sereno Ferreira
    {jpmdias,arestivo,hugosf}@fe.up.pt
    Self-Healing Node-RED Extensions (SHEN)
    https://github.com/jpdias/node-red-contrib-self-healing (Stars: 17)
    https://www.npmjs.com/package/node-red-contrib-self-healing (Total downloads: 1,644)
