
Pratipad: A Declarative Framework for Describing Bidirectional Dataflow in IoT Systems with Elixir

Kentaro Kuribayashi

October 15, 2021


Transcript

  1. Pratipad
    A Declarative Framework for
    Describing Bidirectional Dataflow in
    IoT Systems with Elixir
    Kentaro Kuribayashi from GMO Pepabo, Inc.


  2. He is:
    ● A software engineer
    ● From Tokyo, Japan
    ● CTO at GMO Pepabo, Inc.
    ● A graduate student at JAIST
    @kentaro
    @kentaro
    Kentaro Kuribayashi


  3. Agenda
    1. Background
    2. Proposals
    2.1. Secure Distribution
    2.2. Thanks to Broadway
    2.3. Elixir Magic
    3. Conclusion


  4. 1. Background


  5. Elixir is eating the IoT world!
    ● Dr. Takase, a co-leader of NervesJP,
    illustrates a future vision of IoT
    systems (right figure).
    ● He envisions that Elixir can
    empower such systems by running on
    all of the layers: device, edge,
    and cloud.
    Cite: Hideki Takase “ElixirでIoT!?ナウでヤングで coolなNervesフレームワーク p.21” https://www.slideshare.net/takasehideki/elixiriotcoolnerves-236780506/21


  6. I have a dream
    I’d like to
    ● Advance the vision by connecting things across the layers.
    ● Realize the vision by connecting them to each other via the distributed Erlang network.
    ● Mitigate issues in developing complicated IoT systems.
    Everything becomes an Erlang node and connects to the others via the distributed
    network, which is how we can save this complicated world.


  7. Complicated IoT systems?
    Three-layer model: Application, Network, Perception
    Five-layer model: Application, Middleware, Processing, Transport, Perception
    Our model: Cloud, Edge, Device
    Roles of the layers:
    - Sensing the physical environment and actuating devices.
    - Transporting the sensor data to the upper layer.
    - Processing the data, i.e., formatting, transforming, extending, etc.
    - Analysing the data to produce meaningful information.
    - Providing services to users.
    - Sending actuation orders to the device layer.


  8. What makes IoT systems complicated?
    The 3 factors below cause the complexity:
    1. Various options of programming languages and communication protocols.
    2. Various and bidirectional data acquisition methods.
    3. Poor visibility of the dataflow throughout the IoT system.


  9. Here comes Pratipad!
    Pratipad is
    ● A solution to the problems
    mentioned earlier.
    ● A declarative framework for
    describing bidirectional dataflow.
    Note: The word “pratipad” is borrowed from
    Sanskrit. It means “way” in English.
    Cite: kentaro/pratipad: “A Declarative Framework for Describing Bidirectional Dataflow” https://github.com/kentaro/pratipad


  10. 2. Proposals


  11. An example IoT system using Pratipad
    Figure: an example system spanning the 3 layers.
    My House (Local Area Network)
    - Device Layer (Room 1, Room 2): sensing CO2 concentration, air pressure, and humidity
    - Edge Layer: aggregate, transform, add more info
    Wide Area Network (mTLS connection)
    - Cloud Layer: visualize, analyze, send back actuation order
    - External API: provide additional metadata (e.g. precipitation)
    - Users: monitor situation, do some actions (e.g. open the window)
    Legend: ・・・ Running on Elixir
    An LED blinks when the cloud system sends a notification to prompt users to open the window.


  12. Proposal 1: All we need is Elixir!
    Pratipad provides a method to design and implement IoT systems in an integrated
    manner, using the same programming language and communication protocol throughout.
    1. All the 3 layers are implemented in Elixir.
    ○ Device: Nerves + Pratipad.Client
    ○ Edge: Pratipad
    ○ Cloud: Phoenix + Pratipad.Client
    2. They are connected to each other via the Erlang distribution protocol.
    ○ All connections are over TLS.
    ○ Client certificates are required for authentication.


  13. Proposal 2: Flexible and bidirectional dataflow
    Pratipad provides a framework that supports the push, pull, and demand methods,
    as well as bidirectional dataflow.
    1. The data is retrieved from the device layer. There are 3 ways to do
    that, and Pratipad supports all of them:
    a. Push
    b. Pull
    c. Demand
    2. Bidirectional dataflow is also supported.


  14. Proposal 3: Declarative dataflow notation
    Pratipad provides a notation that lets us grasp the dataflow across the
    3 layers in a single view.
    Pratipad allows us to
    1. Describe dataflow in a declarative manner.
    2. Separate the dataflow itself from how the data is processed.


  15. 2.1. Secure Distribution


  16. Why Erlang distribution?
    ● We could adopt other protocols,
    such as HTTP or MQTT, for IoT systems.
    ● However, Elixir is built on top of
    Erlang/OTP, which has a legendary
    track record in telecom systems.
    ● An IoT system is the present-day
    counterpart of such a system.
    Figure: All 3 layers are connected to each other over a
    distributed Erlang network.


  17. Joe Armstrong once said
    At this time, we were only interested in connecting conventional sequential
    computers with no shared memory. Our idea was to connect stock hardware
    through TCP/IP sockets and run a cluster of machines behind a corporate firewall.
    We were not interested in security since we imagined all our computers running
    on a private network with no external access. This architecture led to a form of
    all-or-nothing security that makes distributed Erlang suitable for programming
    cluster applications running on a private network, but unsuitable for running
    distributed applications where various degrees of trust are involved.
    Cite: Joe Armstrong “A history of Erlang”



  18. Securing Erlang distribution
    IoT systems, by nature, need distributed networks in the “external” world.
    To realize such a distribution with Erlang/OTP, we have to ensure that
    1. All connections are over TLS (Transport Layer Security) to prevent MITM
    (Man-In-The-Middle) attacks.
    2. Client certificates are required for authentication, so that those who don’t
    have permission cannot join the network.


  19. Enable TLS for distributed Erlang network
    Live example from the pratipad_example_device project (rel/vm.args.eex):
        -proto_dist inet_tls
        -ssl_dist_optfile /etc/<%= System.get_env("PRATIPAD_DEVICE") %>.tls.conf
        -start_epmd false
        -erl_epmd_port 44300
    Cite: pratipad_example_device/rel/vm.args.eex https://github.com/kentaro/pratipad_example_device/blob/main/rel/vm.args.eex
    ※ epmd is not started because it cannot be secured; it is vulnerable to brute-force
    attacks. Moreover, anyone who breaks into the network can learn all of its nodes through it.


  20. Client certificate authentication (mTLS)
    Live example from the pratipad_example_device project:
    1. With the settings below, device001 can act as both a TLS server and a
    TLS client, and both sides authenticate each other in the mTLS manner.
    2. When verify is set to verify_peer and fail_if_no_peer_cert is true,
    the server requires a client certificate and rejects peers whose
    certificate cannot be verified.
    Cite: pratipad_example_device/rootfs_overlay/etc/device001.tls.conf https://github.com/kentaro/pratipad_example_device/blob/main/rootfs_overlay/etc/device001.tls.conf
    device001.tls.conf:
        [
          {server,
            [{cacertfile, "/etc/ca.crt"},
             {certfile, "/etc/device001.pratipad.local.crt"},
             {keyfile, "/etc/device001.pratipad.local.key"},
             {secure_renegotiate, true},
             {fail_if_no_peer_cert, true},
             {verify, verify_peer}
            ]},
          {client,
            [{cacertfile, "/etc/ca.crt"},
             {certfile, "/etc/device001.pratipad.local.crt"},
             {keyfile, "/etc/device001.pratipad.local.key"},
             {secure_renegotiate, true},
             {fail_if_no_peer_cert, true},
             {verify, verify_peer}
            ]}
        ].
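
A configuration file like the one above can be sanity-checked before a device ships. The following is a minimal sketch, assuming the file is a single Erlang term as shown on the slide; `TlsConfCheck` and `enforces_client_certs?/1` are hypothetical names for illustration and are not part of Pratipad or the example project.

```elixir
defmodule TlsConfCheck do
  # Hypothetical helper, not part of Pratipad: reads an ssl_dist_optfile-style
  # file and reports whether the server side enforces client certificates.
  def enforces_client_certs?(path) do
    # :file.consult/1 parses Erlang terms terminated by a period.
    {:ok, [conf]} = :file.consult(String.to_charlist(path))
    server = Keyword.get(conf, :server, [])

    Keyword.get(server, :verify) == :verify_peer and
      Keyword.get(server, :fail_if_no_peer_cert) == true
  end
end
```

Calling `TlsConfCheck.enforces_client_certs?("/etc/device001.tls.conf")` on the device would be expected to return `true` for the settings shown on the slide; the check only inspects the options and does not validate the certificates themselves.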


  21. 2.2. Thanks to Broadway


  22. It’s on the Broadway
    ● Broadway is a library to “build
    concurrent and multi-stage data
    ingestion and data processing
    pipelines with Elixir.”
    ● Pratipad is built on top of
    Broadway to realize flexible and
    bidirectional dataflow in IoT
    systems.
    Cite: Broadway https://elixir-broadway.org/


  23. Producer/consumer pattern and Erlang distribution
    kentaro/off_broadway_otp_distribution allows us to use Broadway with Erlang
    distribution.
    Cite: kentaro/off_broadway_otp_distribution: “An OTP distribution connector for Broadway” https://github.com/kentaro/off_broadway_otp_distribution
    Figure: Erlang nodes → Producer → Consumers
    The producer retrieves messages via the distributed Erlang network protocol.
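
The idea of delivering messages to a named process on a given node can be sketched in plain Elixir. This is only an illustration of message passing with `send/2` and a `{name, node}` destination, not how off_broadway_otp_distribution is implemented; `DistSketch`, the `:edge_producer` name, and the `{:reading, value}` message shape are assumptions made for the example, and the local `node()` stands in for a remote node.

```elixir
defmodule DistSketch do
  # Hypothetical stand-in for a producer process on the edge node.
  # It forwards every reading it receives to `reply_to`.
  def start_receiver(name, reply_to) do
    pid = spawn(fn -> loop(reply_to) end)
    Process.register(pid, name)
    pid
  end

  defp loop(reply_to) do
    receive do
      {:reading, value} ->
        send(reply_to, {:received, value})
        loop(reply_to)
    end
  end
end

DistSketch.start_receiver(:edge_producer, self())

# In a real cluster the second element would be the remote node's name.
send({:edge_producer, node()}, {:reading, 415})

receive do
  {:received, value} -> IO.inspect(value)
after
  1_000 -> IO.puts("no message received")
end
```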


  24. Bidirectional Broadways to connect the 3 layers
    Pratipad uses two Broadways to realize the dataflow between the 3 layers.
    Figure: Device Layer ⇄ Edge Layer (Pratipad) ⇄ Cloud Layer
    - Forwarder Broadway
    - Backwarder Broadway


  25. How messages flow
    The dataflow is constituted by the combination of Pratipad and Broadway.
    Figure: Device Layer, Edge Layer, Cloud Layer
    - Forwarder Broadway: Pratipad.Client, off_broadway_otp_distribution,
    Broadway.handle_message/3, Broadway.handle_batch/4, Pratipad.Client
    - Backwarder Broadway: off_broadway_otp_distribution, with a
    Pratipad.Client on each side


  26. Data acquisition methods
    There are 3 ways to retrieve data from the devices:
    1. Push
    2. Pull
    3. Demand
    Pratipad supports all of the data acquisition methods mentioned above!


  27. Push method
    The dataflow starts from the device
    layer.
    ● The device layer is responsible for
    how many messages are emitted.
    ● It’s useful when the time resolution
    of sensor data is important for the
    target IoT system.


  28. Pull method
    The dataflow starts from the cloud
    layer.
    ● The cloud layer is responsible for
    how many messages are emitted.
    ● It’s useful when you need to control
    the message flow rate according to
    the service level offered to each
    system user.


  29. Demand method
    The dataflow starts from the edge layer.
    ● The edge layer is responsible for
    how many messages are emitted.
    ● It’s useful when the edge layer
    becomes rate-limiting for the
    throughput of the dataflow.
    ● In this mode, the edge layer
    demands messages from the device
    layer only when it can handle them.
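
The demand method can be pictured as the edge layer asking for a bounded number of messages at a time. Below is a minimal sketch in plain Elixir, not Pratipad's or Broadway's implementation; `DemandSketch`, `handle_demand/2`, and the list-based buffer are hypothetical and chosen only to show the idea.

```elixir
defmodule DemandSketch do
  # Hypothetical device-side buffer: hands over at most `demand` readings
  # and keeps the remainder for the next request.
  def handle_demand(buffer, demand) when is_integer(demand) and demand >= 0 do
    Enum.split(buffer, demand)
  end
end

buffer = [400, 410, 420, 430, 440]

# The edge layer can currently handle two messages, so it demands two.
{batch, rest} = DemandSketch.handle_demand(buffer, 2)
IO.inspect(batch) # prints [400, 410]
IO.inspect(rest)  # prints [420, 430, 440]
```

Because the consumer decides the batch size, a slow edge layer is never handed more messages than it asked for.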


  30. 2.3. Elixir Magic


  31. Dataflow notation
    Dataflow can be configured in one line:
        Push <~> P1 <~> P2 <~> P3 <~> Output
    ● You can grasp the whole dataflow
    and how the data is processed at a
    glance.
    ● The notation consists of 3 types
    of elements:
    ○ Input (e.g. Push)
    ○ Processing (e.g. P1, P2, P3)
    ○ Direction (e.g. <~>)


  32. Dataflow and processors
    Dataflow and processors can be
    configured separately, as shown below.
    ● The names of processors are linked
    to their implementation as Elixir
    modules.
    ● Processors are supposed to
    implement the process/2 callback of
    the Pratipad.Processor behaviour.
    ● Do anything you want to do with
    messages in that callback.
    Dataflow:
        Push <~> P1 <~> P2 <~> P3 <~> Output
    Processor:
        defmodule P1 do
          alias Pratipad.Processor
          use Processor

          # Set up the processor's initial state.
          @impl GenServer
          def init(initial_state) do
            {:ok, initial_state}
          end

          # Called for each message that flows through this processor.
          @impl Processor
          def process(message, state) do
            # do something with the message
          end
        end


  33. Notations that Pratipad provides
    Type        Notation          Description
    Input       Push              Push mode
                Pull              Pull mode
                Demand            Demand mode
    Processing  P                 A single processor
                [P1, P2, …, PN]   Multiprocessor for sequential processing
                {P1, P2, …, PN}   Multiprocessor for concurrent processing
    Direction   ~>                Unidirectional dataflow
                <~>               Bidirectional dataflow


  34. Sequential and concurrent processing
    Pratipad provides 2 types of processing to meet your needs.
    ● Sequential: the message is processed sequentially, and it can be
    modified by the processors.
    ● Concurrent: the message is processed concurrently, and the outgoing
    message is identical to the original.
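
The difference between the two processing types can be sketched in plain Elixir. This is an illustration under stated assumptions, not Pratipad's implementation: `ProcessingSketch` is a hypothetical module, and processors are modelled as anonymous functions rather than modules implementing Pratipad.Processor.

```elixir
defmodule ProcessingSketch do
  # Sequential: each processor receives the previous processor's output,
  # so the message can be modified along the way.
  def sequential(message, processors) do
    Enum.reduce(processors, message, fn processor, acc -> processor.(acc) end)
  end

  # Concurrent: every processor receives the same original message in its
  # own task; the results are discarded and the original is passed on.
  def concurrent(message, processors) do
    processors
    |> Task.async_stream(fn processor -> processor.(message) end)
    |> Stream.run()

    message
  end
end

double = fn x -> x * 2 end
increment = fn x -> x + 1 end

IO.inspect(ProcessingSketch.sequential(5, [double, increment])) # prints 11
IO.inspect(ProcessingSketch.concurrent(5, [double, increment])) # prints 5
```

Concurrent processing suits side effects such as logging or notifications, where no processor needs to see another's result.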


  35. Leveraging the Elixir syntax with macros
    Both unidirectional and bidirectional notations are implemented by macros.
        defmacro left ~> right do
          quote do
            handle_unidirectional_op(unquote(left), unquote(right))
          end
        end

        defp handle_unidirectional_op(left, right) when left == Push or left == Demand do
          %Dataflow{
            mode: @input_mode_map[left],
            forward: %Forward{
              processors: [right]
            },
            backward_enabled: false
          }
        end
    Cite: pratipad/lib/pratipad/dataflow/dsl.ex https://github.com/kentaro/pratipad/blob/main/lib/pratipad/dataflow/dsl.ex
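
Elixir lets a module define the `~>` and `<~>` operators itself, which is what makes this notation possible. The following self-contained sketch shows the technique in a simplified form. It is not Pratipad's DSL: `FlowSketch`, `connect/3`, the plain map, and the lowercase atoms `:push`, `:p1`, and `:p2` are all assumptions made for illustration.

```elixir
defmodule FlowSketch do
  # Hypothetical, simplified DSL: builds a plain map instead of Pratipad's structs.
  defmacro left ~> right do
    quote do
      FlowSketch.connect(unquote(left), unquote(right), false)
    end
  end

  defmacro left <~> right do
    quote do
      FlowSketch.connect(unquote(left), unquote(right), true)
    end
  end

  # First link of the chain: an input mode followed by a processor.
  def connect(mode, processor, bidirectional) when mode in [:push, :pull, :demand] do
    %{mode: mode, processors: [processor], bidirectional: bidirectional}
  end

  # Later links: append the next processor to the existing flow.
  def connect(%{processors: processors} = flow, processor, bidirectional) do
    %{
      flow
      | processors: processors ++ [processor],
        bidirectional: flow.bidirectional or bidirectional
    }
  end
end

defmodule FlowSketchDemo do
  import FlowSketch

  # Both operators are left-associative, so chains build up from the left.
  def bidirectional_flow, do: :push <~> :p1 <~> :p2
  def unidirectional_flow, do: :demand ~> :p1
end

IO.inspect(FlowSketchDemo.bidirectional_flow())
IO.inspect(FlowSketchDemo.unidirectional_flow())
```

Because the operators expand at compile time into ordinary function calls, the one-line notation stays declarative while the data structure behind it remains easy to inspect.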


  36. How to handle messages in Broadway and processors
    Broadway.handle_message/3 delegates the messages to processors.
    Given the dataflow:
        Push <~> P1 <~> P2 <~> P3 <~> Output
    Figure: Device Layer → Pratipad.Broadway.Forward
    (off_broadway_otp_distribution, Broadway.handle_message/3,
    Broadway.handle_batch/4) → Cloud Layer
    P1, P2, and P3 implement Pratipad.Processor. The process function of
    each processor is invoked in turn:
    ① P1.process/2
    ② P2.process/2
    ③ P3.process/2


  37. 3. Conclusion


  38. Wrap-up and the future
    Internet of Things == Distributed network of Erlang nodes
    ● Elixir can be the best candidate for building IoT systems.
    ● Building such systems in Elixir means that everything
    ○ Runs as an Erlang node.
    ○ Connects to the others via the distributed network.
    ● Pratipad can help Elixir succeed in the IoT world.
    Next big thing: Toward multi-tenant Erlang distribution.


  39. Thank you!
