Pratipad: A Declarative Framework for Describing Bidirectional Dataflow in IoT Systems with Elixir

Kentaro Kuribayashi

October 15, 2021

Transcript

  1. Pratipad: A Declarative Framework for Describing Bidirectional Dataflow in IoT Systems with Elixir

    Kentaro Kuribayashi from GMO Pepabo, Inc.
  2. Kentaro Kuribayashi (@kentaro)

    He is:
    • A software engineer
    • From Tokyo, Japan
    • CTO at GMO Pepabo, Inc.
    • A graduate student at JAIST
  3. Elixir is eating the IoT world!

    • Dr. Takase, a co-leader of NervesJP, illustrates a future vision of IoT systems (right figure).
    • He envisions that Elixir can empower such systems by running on every layer: device, edge, and cloud.
    Cite: Hideki Takase, “ElixirでIoT!?ナウでヤングでcoolなNervesフレームワーク”, p. 21 https://www.slideshare.net/takasehideki/elixiriotcoolnerves-236780506/21
  4. I have a dream

    I’d like to
    • Advance the vision of connecting things across the layers.
    • Realize the vision by connecting them to each other via the distributed Erlang network.
    • Mitigate issues in developing complicated IoT systems.
    Everything becomes an Erlang node and connects to the others via the distributed network; that is how we can save this complicated world.
  5. Complicated IoT systems?

    Layer models (top to bottom):
    • Three-layer model: Application, Network, Perception
    • Five-layer model: Application, Middleware, Processing, Transport, Perception
    • Our model: Cloud, Edge, Device
    What the layers do, from the bottom up:
    - Sensing the physical environment and actuating devices.
    - Transporting the sensor data to the upper layer.
    - Processing the data, i.e., formatting, transforming, extending, etc.
    - Analysing the data to produce meaningful information.
    - Providing services to users.
    - Sending actuation orders to the device layer.
  6. What makes IoT systems complicated?

    The 3 factors below cause the complexity:
    1. Various options of programming languages and communication protocols.
    2. Various and bidirectional data acquisition methods.
    3. Poor visibility of the dataflow throughout the IoT system.
  7. Here comes Pratipad!

    Pratipad is
    • A solution to the problems mentioned earlier.
    • A declarative framework for describing bidirectional dataflow.
    Note: The word “pratipad” is borrowed from Sanskrit. It means “way” in English.
    Cite: kentaro/pratipad: “A Declarative Framework for Describing Bidirectional Dataflow” https://github.com/kentaro/pratipad
  8. An example IoT system using Pratipad

    [Figure: in My House, the device layers in Room 1 and Room 2 connect over a local area network to the edge layer, which connects over a wide area network to the cloud layer, an external API, and users. The layers run on Elixir and are linked by mTLS connections.]
    • Device layer: senses CO2 concentration, air pressure, and humidity.
    • Edge layer: aggregates, transforms, and adds more information.
    • Cloud layer: visualizes, analyzes, and sends back actuation orders.
    • External API: provides additional metadata (e.g. precipitation).
    • Users: monitor the situation and take some actions (e.g. open the window).
    The LED blinks when the cloud system sends a notification prompting users to open the window.
  9. Proposal 1: All we need is Elixir!

    Pratipad provides a method to design and implement IoT systems in an integrated manner using the same programming language and communication protocol.
    1. All 3 layers are implemented in Elixir.
      ◦ Device: Nerves + Pratipad.Client
      ◦ Edge: Pratipad
      ◦ Cloud: Phoenix + Pratipad.Client
    2. They are connected to each other via the Erlang distribution protocol.
      ◦ All connections are over TLS.
      ◦ Client certificates are required for authentication.
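The idea that the layers talk to each other as Erlang nodes can be sketched with plain message passing. This is a minimal sketch, not Pratipad's API: the module name DistributionSketch, the registered name :edge_inbox, and the message shape are all hypothetical. It relies only on the fact that send/2 accepts a {registered_name, node} tuple, so the same call addresses a process on the local node or on a connected remote node.

```elixir
defmodule DistributionSketch do
  # Hypothetical helper: send a reading to the process registered as `name`
  # on `target_node`. The sender's node name travels with the message.
  def report(target_node, name, reading) do
    send({name, target_node}, {:reading, node(), reading})
    :ok
  end
end

# To stay runnable without a cluster, the "edge" here is a process on this node.
# With real distribution, `node()` below would be replaced by the edge node's name.
Process.register(self(), :edge_inbox)
DistributionSketch.report(node(), :edge_inbox, %{co2_ppm: 420})

receive do
  {:reading, from, reading} -> IO.inspect({from, reading})
after
  1_000 -> IO.puts("no reading received")
end
```

The sketch leaves out TLS and client certificates, which are configured at the VM level rather than in application code.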
  10. Proposal 2: Flexible and bidirectional dataflow

    Pratipad provides a framework that supports the push, pull, and demand methods as well as bidirectional dataflow.
    1. The data is retrieved from the device layer. There are 3 ways to do that, and Pratipad supports all of them:
      a. Push
      b. Pull
      c. Demand
    2. Bidirectional dataflow is also supported.
  11. Proposal 3: Declarative dataflow notation

    Pratipad provides a notation that lets us grasp the dataflow spanning the 3 layers in a single view. Pratipad allows us to
    1. Describe dataflow in a declarative manner.
    2. Separate the dataflow itself from how the data is processed.
  12. Why Erlang distribution?

    • We could adopt other protocols such as HTTP or MQTT for IoT systems.
    • However, Elixir is built on top of Erlang/OTP, which has a legendary track record in telecom systems.
    • IoT systems are the present-day counterpart of such systems.
    All 3 layers are connected to each other through the distributed Erlang network.
  13. Joe Armstrong once said

    “At this time, we were only interested in connecting conventional sequential computers with no shared memory. Our idea was to connect stock hardware through TCP/IP sockets and run a cluster of machines behind a corporate firewall. We were not interested in security since we imagined all our computers running on a private network with no external access. This architecture led to a form of all-or-nothing security that makes distributed Erlang suitable for programming cluster applications running on a private network, but unsuitable for running distributed applications where various degrees of trust are involved.”
    Cite: Joe Armstrong, “A history of Erlang”

  14. Securing Erlang distribution

    IoT systems, by nature, need distributed networks in the “external” world. To realize such a distribution with Erlang/OTP, we have to ensure that
    1. All connections are over TLS (Transport Layer Security) to prevent MITM (man-in-the-middle) attacks.
    2. Client certificates are required for authentication, so that those without permission cannot join the network.
  15. Enable TLS for distributed Erlang network

    Live example from the pratipad_example_device project (rel/vm.args.eex):

        -proto_dist inet_tls
        -ssl_dist_optfile /etc/<%= System.get_env("PRATIPAD_DEVICE") %>.tls.conf
        -start_epmd false
        -erl_epmd_port 44300

    Cite: pratipad_example_device/rel/vm.args.eex https://github.com/kentaro/pratipad_example_device/blob/main/rel/vm.args.eex
    ※ epmd is prevented from running because it cannot be secured; it is vulnerable to brute-force attacks. Moreover, anyone who breaks into the network can learn about all the nodes in it.
  16. Client certificate authentication (mTLS)

    Live example from the pratipad_example_device project (device001.tls.conf):
    1. With the settings in the right figure, device001 can act as both a TLS server and a TLS client, and the two sides authenticate each other in the mTLS manner.
    2. If both verify and fail_if_no_peer_cert are set, servers must verify client certificates to authenticate them.

        [
          {server, [
            {cacertfile, "/etc/ca.crt"},
            {certfile, "/etc/device001.pratipad.local.crt"},
            {keyfile, "/etc/device001.pratipad.local.key"},
            {secure_renegotiate, true},
            {fail_if_no_peer_cert, true},
            {verify, verify_peer}
          ]},
          {client, [
            {cacertfile, "/etc/ca.crt"},
            {certfile, "/etc/device001.pratipad.local.crt"},
            {keyfile, "/etc/device001.pratipad.local.key"},
            {secure_renegotiate, true},
            {fail_if_no_peer_cert, true},
            {verify, verify_peer}
          ]}
        ].

    Cite: pratipad_example_device/rootfs_overlay/etc/device001.tls.conf https://github.com/kentaro/pratipad_example_device/blob/main/rootfs_overlay/etc/device001.tls.conf
  17. It’s on the Broadway

    • Broadway is a library to “build concurrent and multi-stage data ingestion and data processing pipelines with Elixir.”
    • Pratipad is built on top of Broadway to realize flexible and bidirectional dataflow in IoT systems.
    Cite: Broadway https://elixir-broadway.org/
  18. Producer/consumer pattern and Erlang distribution

    kentaro/off_broadway_otp_distribution allows us to use Broadway with Erlang distribution.
    [Figure: a producer and consumers running on Erlang nodes; messages are retrieved via the distributed Erlang network protocol.]
    Cite: kentaro/off_broadway_otp_distribution: “An OTP distribution connector for Broadway” https://github.com/kentaro/off_broadway_otp_distribution
  19. Bidirectional Broadways to connect the 3 layers

    Pratipad uses two Broadways to realize the dataflow between the 3 layers.
    [Figure: the device layer, edge layer, and cloud layer, with Pratipad running a Forwarder Broadway and a Backwarder Broadway between them.]
  20. How messages flow

    The dataflow is constituted by the combination of Pratipad and Broadway.
    [Figure: across the device, edge, and cloud layers, the Forwarder Broadway links Pratipad.Client, off_broadway_otp_distribution, Broadway.handle_message/3, Broadway.handle_batch/4, and Pratipad.Client; the Backwarder Broadway links Pratipad.Client instances through off_broadway_otp_distribution.]
  21. Data acquisition methods

    There are 3 ways to retrieve data from the devices:
    1. Push
    2. Pull
    3. Demand
    Pratipad supports all of the data acquisition methods mentioned above!
  22. Push method

    The dataflow starts from the device layer.
    • The device layer is responsible for how many messages are emitted.
    • It’s useful when the time resolution of sensor data is important for the target IoT system.
  23. Pull method

    The dataflow starts from the cloud layer.
    • The cloud layer is responsible for how many messages are emitted.
    • It’s useful when you need to control the message flow rate according to the service level offered to each system user.
  24. Demand method

    The dataflow starts from the edge layer.
    • The edge layer is responsible for how many messages are emitted.
    • It’s useful when the edge layer becomes rate-limiting for the throughput of the dataflow.
    • In this mode, the edge layer demands messages from the device layer only when it can handle them.
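The demand method can be illustrated in plain Elixir, without Pratipad or Broadway. In this minimal sketch the module name DemandSketch, the Agent-backed buffer, and the function names are hypothetical: the consuming side states how many readings it can handle right now, and the device side hands over at most that many.

```elixir
defmodule DemandSketch do
  # Hypothetical stand-in for the device layer: an Agent buffering readings.
  def start_device(readings) when is_list(readings) do
    Agent.start_link(fn -> readings end)
  end

  # The edge side calls this with the number of messages it can handle now.
  # At most `demand` readings are removed from the buffer and returned.
  def demand(device, demand) when is_integer(demand) and demand >= 0 do
    Agent.get_and_update(device, fn buffer -> Enum.split(buffer, demand) end)
  end
end

{:ok, device} = DemandSketch.start_device([410, 415, 420, 430, 455])

# The edge can only handle two readings at first, then asks for up to ten more.
first_batch = DemandSketch.demand(device, 2)
second_batch = DemandSketch.demand(device, 10)
IO.inspect({first_batch, second_batch})
```

Because the consumer sets the batch size, a slow edge layer is never handed more than it asked for; the unread readings simply stay in the buffer.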
  25. Dataflow notation

    Dataflow can be configured in one line.
    • You can grasp the whole dataflow and how the data is processed at a glance.
    • Dataflow notation consists of the 3 types below:
      ◦ Input
      ◦ Processing
      ◦ Direction

        Push <~> P1 <~> P2 <~> P3 <~> Output

    In this example, Push is the input, P1 to P3 are the processing, and <~> is the direction.
  26. Dataflow and processors

    Dataflow and processors can be configured separately, as in the right figures.
    • The names of processors are linked to their implementation as Elixir modules.
    • Processors are supposed to implement the process/2 callback of the Pratipad.Processor behaviour.
    • Do anything you want to do with messages in that callback.

    Dataflow:

        Push <~> P1 <~> P2 <~> P3 <~> Output

    Processor:

        defmodule P1 do
          alias Pratipad.Processor
          use Processor

          @impl GenServer
          def init(initial_state) do
            {:ok, initial_state}
          end

          @impl Processor
          def process(message, state) do
            # do something with the message
          end
        end
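Linking a processor name to a module that implements a behaviour can be sketched without Pratipad itself. The ProcessorSketch behaviour below is a hypothetical stand-in, not the definition of Pratipad.Processor: the {message, state} return shape, the Co2Classifier module, and the 1000 ppm threshold are assumptions made only for illustration.

```elixir
defmodule ProcessorSketch do
  # Hypothetical behaviour; the {message, state} return shape is an assumption.
  @callback process(message :: term(), state :: term()) :: {term(), term()}
end

defmodule Co2Classifier do
  @behaviour ProcessorSketch

  @impl ProcessorSketch
  def process(%{co2_ppm: ppm} = message, state) do
    # Extend the message with a level; the 1000 ppm threshold is arbitrary.
    level = if ppm > 1000, do: :high, else: :normal

    # Count how many messages this processor has seen so far.
    {Map.put(message, :level, level), Map.update(state, :seen, 1, &(&1 + 1))}
  end
end

{message, state} = Co2Classifier.process(%{co2_ppm: 1200}, %{})
IO.inspect(message)
IO.inspect(state)
```

Keeping the per-message logic in its own module is what lets the one-line dataflow stay free of processing details.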
  27. Notations that Pratipad provides

    Type        Notation           Description
    Input       Push               Push mode
    Input       Pull               Pull mode
    Input       Demand             Demand mode
    Processing  P                  A single processor
    Processing  [P1, P2, …, PN]    Multiprocessor for sequential processing
    Processing  {P1, P2, …, PN}    Multiprocessor for concurrent processing
    Direction   ~>                 Unidirectional dataflow
    Direction   <~>                Bidirectional dataflow
  28. Sequential and concurrent processing

    Pratipad provides 2 types of processing to meet your demand.
    • Sequential: the message is processed sequentially. The message can be modified by the processors.
    • Concurrent: the message is processed concurrently. The message is identical to the original.
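The difference between the two processing types can be sketched in plain Elixir. This is an interpretation under stated assumptions rather than Pratipad's implementation: processors are modelled as anonymous functions, and ProcessingSketch and its function names are hypothetical.

```elixir
defmodule ProcessingSketch do
  # Sequential: each processor receives the previous processor's output,
  # so the message can change along the way.
  def sequential(message, processors) do
    Enum.reduce(processors, message, fn processor, acc -> processor.(acc) end)
  end

  # Concurrent: every processor receives the original message in its own task.
  # Results are collected for inspection; the message passed on is unchanged.
  def concurrent(message, processors) do
    results =
      processors
      |> Task.async_stream(fn processor -> processor.(message) end)
      |> Enum.map(fn {:ok, result} -> result end)

    {message, results}
  end
end

add_unit = fn msg -> Map.put(msg, :unit, "ppm") end
double = fn msg -> Map.update!(msg, :value, &(&1 * 2)) end

IO.inspect(ProcessingSketch.sequential(%{value: 400}, [add_unit, double]))
IO.inspect(ProcessingSketch.concurrent(%{value: 400}, [add_unit, double]))
```

Sequential processing suits transformations that build on each other, while concurrent processing suits independent side effects that all need the same input.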
  29. Leveraging the Elixir syntax with macros

    Both unidirectional and bidirectional notations are implemented by macros.

        defmacro left ~> right do
          quote do
            handle_unidirectional_op(unquote(left), unquote(right))
          end
        end

        defp handle_unidirectional_op(left, right)
             when left == Push or left == Demand do
          %Dataflow{
            mode: @input_mode_map[left],
            forward: %Forward{
              processors: [right]
            },
            backward_enabled: false
          }
        end

    Cite: pratipad/lib/pratipad/dataflow/dsl.ex https://github.com/kentaro/pratipad/blob/main/lib/pratipad/dataflow/dsl.ex
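How operators such as ~> and <~> can be turned into a small DSL is shown in the self-contained sketch below. It is not the code in dsl.ex: FlowSketch, FlowExample, chain/3, and the plain map used to hold the flow are hypothetical, and atoms stand in for the input and processor names. The sketch depends only on Elixir allowing modules to define these two operators and on both being left-associative.

```elixir
defmodule FlowSketch do
  # Hypothetical DSL: `~>` builds a unidirectional flow, `<~>` a bidirectional one.
  defmacro left ~> right do
    quote do
      FlowSketch.chain(unquote(left), unquote(right), false)
    end
  end

  defmacro left <~> right do
    quote do
      FlowSketch.chain(unquote(left), unquote(right), true)
    end
  end

  # Appending a stage to a flow already started by an earlier operator.
  def chain(%{stages: stages} = flow, right, _bidirectional) do
    %{flow | stages: stages ++ [right]}
  end

  # Starting a new flow from the first two operands.
  def chain(left, right, bidirectional) do
    %{stages: [left, right], bidirectional: bidirectional}
  end
end

defmodule FlowExample do
  import FlowSketch

  # The operator is left-associative, so the leftmost pair is combined first
  # and each later stage is appended to the resulting flow.
  def flow, do: :push <~> :p1 <~> :p2 <~> :output
end

IO.inspect(FlowExample.flow())
```

The macros only rewrite the operator expression into ordinary function calls, so the flow itself remains a plain data structure that other code can inspect.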
  30. How to handle messages in Broadway and processors

    Broadway.handle_message/3 delegates the messages to processors.
    Given the dataflow below, where P1 to P3 implement Pratipad.Processor:

        Push <~> P1 <~> P2 <~> P3 <~> Output

    [Figure: between the device layer and the cloud layer, Pratipad.Broadway.Forward receives messages through off_broadway_otp_distribution and passes them through Broadway.handle_message/3 and Broadway.handle_batch/4. The process function of each processor is invoked (P1.process/1, P2.process/1, P3.process/1). The steps are numbered ① to ⑤.]
  31. Wrap-up and a future

    Internet of Things == Distributed network of Erlang nodes
    • Elixir can be the best candidate for building IoT systems.
    • Building such systems in Elixir means that everything
      ◦ Runs as an Erlang node.
      ◦ Connects to the others via the distributed network.
    • Pratipad can help Elixir in the IoT world.
    Next big thing: Toward the multi-tenant Erlang distribution.