Elastic Stream Processing for the Internet of Things

Presentation of our research paper "Elastic Stream Processing for the Internet of Things" at the IEEE Cloud 2016 conference in San Francisco, USA.

Christoph Hochreiner

June 29, 2016

Transcript

  1. Elastic Stream Processing for the Internet of Things

    Christoph Hochreiner, Michael Vögler, Stefan Schulte, Schahram Dustdar
  2. Motivational Scenario: Challenges

    • Raw sensor data must not be exported to other countries for legal reasons
    • The analysis algorithm must not be hosted outside the company's premises
    • The system needs to be configurable
    • Data should be processed close to the sensor to reduce latency
  3. Requirements

    • Inherent hybrid cloud support to account for legal and business-related regulations
    • Reconfiguration at runtime
    • Computational resource elasticity
    • Cost efficiency
  4. Related Systems

    Systems: System S Storm Spark Cloud Data Flow Stream Cloud Distributed Storm Cloud Dataflow AWS IOT
    • Hybrid Cloud Support: ✔ ✔ ✔ ✔
    • Reconfiguration at Runtime: (✔)
    • Resource Elasticity at Runtime: (✔) ✔ ✔ (✔) ✔ ✔
    • Cost Efficiency: (✔) (✔)
  5. Evaluation Scenario

    Objective: analyze individual taxi rides, which are composed of location-based time series.

    [Figure: operations Distance, Speed, Average Speed, Aggregation, Analysis and Monitor; legend: Data Transfer, Data Aggregation, Processing operation]
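    To make the operations named on this slide more concrete, here is a minimal Python sketch of how distance, per-segment speed and average speed could be derived from one ride's time series. It is not the authors' implementation: the `Sample` layout, the haversine formula, the Earth radius constant and the choice to average per-segment speeds are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


@dataclass(frozen=True)
class Sample:
    """One item of a ride's location-based time series (hypothetical layout)."""
    timestamp_s: float
    lat_deg: float
    lon_deg: float


def distance_km(a: Sample, b: Sample) -> float:
    """Great-circle (haversine) distance between two samples."""
    lat1, lat2 = math.radians(a.lat_deg), math.radians(b.lat_deg)
    dlat = lat2 - lat1
    dlon = math.radians(b.lon_deg - a.lon_deg)
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def segment_speeds_kmh(ride: list[Sample]) -> list[float]:
    """Speed of each consecutive pair of samples; zero-length intervals are skipped."""
    speeds = []
    for prev, cur in zip(ride, ride[1:]):
        hours = (cur.timestamp_s - prev.timestamp_s) / 3600
        if hours > 0:
            speeds.append(distance_km(prev, cur) / hours)
    return speeds


def analyze_ride(ride: list[Sample]) -> dict[str, float]:
    """Aggregate one ride into total distance and mean per-segment speed."""
    total_km = sum(distance_km(p, c) for p, c in zip(ride, ride[1:]))
    speeds = segment_speeds_kmh(ride)
    average = sum(speeds) / len(speeds) if speeds else 0.0
    return {"distance_km": total_km, "average_speed_kmh": average}
```

    In a streaming setting each function would correspond to one operator that consumes the previous operator's output; here they are plain functions so the arithmetic is easy to follow.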
  6. Evaluation Preliminaries: Service Level Agreement

    • Report generation time: the maximum processing duration, from the moment the last time-series item is posted until the analysis is finished, is 60 seconds.
    • Node granularity: each Operator Node and each Processing Node is represented by a virtual machine.
  7. Evaluation Preliminaries: Resource Provisioning Approaches

    • Elastic provisioning: threshold-based resource allocation based on the CPU load of the Processing Nodes and the load on the incoming message queue.
    • Under-provisioning: fixed provisioning of Processing Nodes that just fails to comply with the SLA.
    • Over-provisioning: fixed provisioning of a minimal set of Processing Nodes that yields 100 % SLA compliance.
    • Node allocation for baselines: node assignment for the baselines is identical to that of the elastic scenario.
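    The threshold-based approach described above can be sketched roughly as follows. This is only an illustration: the slides do not state the actual thresholds, step size or rule combination, so the `Thresholds` values, the one-node step and the "scale up if either signal is high, scale down only if both are low" rule are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical limits; the slides do not give the real values."""
    cpu_high: float = 0.8   # scale up above this average CPU load (0..1)
    cpu_low: float = 0.3    # consider scaling down below this load
    queue_high: int = 100   # scale up above this many queued messages
    queue_low: int = 10     # consider scaling down below this backlog


def scaling_decision(cpu_load: float, queue_length: int, nodes: int,
                     th: Thresholds = Thresholds(), min_nodes: int = 1) -> int:
    """Return the desired number of Processing Nodes for the next interval."""
    # Either signal being high indicates the current nodes cannot keep up.
    if cpu_load > th.cpu_high or queue_length > th.queue_high:
        return nodes + 1
    # Release a node only when both signals are low and a spare node exists.
    if cpu_load < th.cpu_low and queue_length < th.queue_low and nodes > min_nodes:
        return nodes - 1
    return nodes
```

    For example, `scaling_decision(0.9, 20, nodes=2)` asks for a third node, while `scaling_decision(0.1, 0, nodes=1)` stays at one node because of the lower bound. A rule like this reacts only after load has already risen, which is one reason a purely threshold-based policy can introduce delays.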
  8. Evaluation Results

                                      Elastic        Under-         Over-
                                      provisioning   provisioning   provisioning
    Cost for Processing Nodes         2160.66        1855           2665
    Total Makespan (sec)              6653           6975           6655
    Average Report Generation (sec)   77             355            35
    Total Delays                      21             75             0
    SLA Adherence (%)                 28             0              100
  9. Evaluation Results: Cost

    Elastic provisioning reduces the cost for Processing Nodes by roughly 20 % compared to over-provisioning (2160.66 vs. 2665).
  10. Evaluation Results: Total Makespan

    The total makespan of elastic provisioning (6653 sec) is similar to that of the over-provisioning scenario (6655 sec) and about 5 % shorter than that of the under-provisioning scenario (6975 sec).
  11. Evaluation Results: Average Report Generation

    With elastic provisioning, average report generation is 4.3 times faster than in the under-provisioning scenario and only about 2 times slower than in the over-provisioning one.
  12. Evaluation Results: SLA

    At 77 seconds, the average report generation duration of elastic provisioning is slightly above the 60-second SLA.
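    The comparisons drawn on the result slides can be recomputed from the table values with a few lines of Python. The dictionary layout and helper names are illustrative assumptions; the numbers are copied from the results table. The slides quote rounded figures, so the computed values need not match them exactly.

```python
# Values copied from the Evaluation Results table.
RESULTS = {
    "elastic": {"cost": 2160.66, "makespan_s": 6653, "avg_report_s": 77},
    "under":   {"cost": 1855.0,  "makespan_s": 6975, "avg_report_s": 355},
    "over":    {"cost": 2665.0,  "makespan_s": 6655, "avg_report_s": 35},
}
SLA_REPORT_S = 60  # report generation time limit from the SLA slide


def relative_reduction(value: float, baseline: float) -> float:
    """Fraction by which `value` is smaller than `baseline`."""
    return (baseline - value) / baseline


elastic, under, over = RESULTS["elastic"], RESULTS["under"], RESULTS["over"]

# Cost saved by elastic provisioning relative to over-provisioning.
cost_saving = relative_reduction(elastic["cost"], over["cost"])
# Makespan shortened relative to under-provisioning.
makespan_gain = relative_reduction(elastic["makespan_s"], under["makespan_s"])
# How much faster than under-provisioning, and slower than over-provisioning.
speedup_vs_under = under["avg_report_s"] / elastic["avg_report_s"]
slowdown_vs_over = elastic["avg_report_s"] / over["avg_report_s"]
# Seconds by which the elastic average exceeds the SLA limit.
sla_overshoot_s = elastic["avg_report_s"] - SLA_REPORT_S

print(f"cost saving vs. over-provisioning:   {cost_saving:.1%}")
print(f"makespan gain vs. under-provisioning: {makespan_gain:.1%}")
print(f"report speed-up vs. under:            {speedup_vs_under:.2f}x")
print(f"report slow-down vs. over:            {slowdown_vs_over:.2f}x")
print(f"average overshoot of the SLA:         {sla_overshoot_s} s")
```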
  13. Requirements Revisited

    • Inherent hybrid cloud support to account for legal and business-related regulations ✔
    • Reconfiguration at runtime ✔
    • Computational resource elasticity ✔
    • Cost efficiency ✔
  14. Lessons Learned

    • Threshold-based resource allocation can lead to delays and negatively impact the QoS
    • VM-based provisioning causes delays due to the long startup duration
    • The redundant infrastructure of Operator Nodes causes high computational resource requirements
  15. Outlook

    • Investigate predictive scheduling approaches
    • Implement a more lightweight system design
    • Pool the Operator Node infrastructure to reduce the computational overhead

    https://github.com/chochreiner/VISP-Runtime
  16. Elastic Resource Provisioning

    $$\min \sum_{p_i \in P} p_i + \sum_{p_i \in P} p_i^{BTU} + u \cdot N + d \cdot N$$

    • $p_i$: specific processing node
    • $p_i^{BTU}$: remaining BTU for a specific processing node
    • $u$: upscaling decision variable
    • $d$: downscaling decision variable