Taming IoT Data - Making Sense of Sensors with SQL Streaming @VoxxedCluj 2019

* Abstract:

We are living in a data streaming era, yet until recently it has been particularly hard to leverage existing stream processing technologies. On the one hand, dealing with data in motion has its inherent challenges. On the other hand, most frameworks and APIs that allow for stream processing are hard to employ and/or operate. NOT so for KSQL, the newest kid in Apache Kafka’s ecosystem.

Based on a simplified version of an IoT use case, this session gives a gentle introduction to KSQL, Kafka’s SQL streaming engine for the masses. Join this fast-paced tour, during which we discuss a streaming IoT architecture. Concretely, we are going to:

(1) ingest smart home energy data into Apache Kafka,
(2) use KSQL for flexible, powerful and scalable SQL-only stream processing,
(3) send raw data as well as pre-processed results to an operational NoSQL data store,
(4) reactively serve data to clients in near real-time and
(5) finally feed informative live charts.
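
To make step (2) more concrete, here is a minimal plain-Python sketch of the kind of tumbling-window aggregation that a windowed KSQL query expresses declaratively. It is not KSQL and not taken from the talk: the field names (`device_id`, `ts_ms`, `watts`), the one-minute window and the sample readings are all assumptions chosen for illustration.

```python
from collections import defaultdict

WINDOW_MS = 60_000  # assumed window size: one minute


def tumbling_avg(readings, window_ms=WINDOW_MS):
    """Average `watts` per (device_id, window start) over fixed, non-overlapping windows."""
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running sum, count]
    for r in readings:
        # Align the event timestamp to the start of its window.
        window_start = r["ts_ms"] - r["ts_ms"] % window_ms
        acc = sums[(r["device_id"], window_start)]
        acc[0] += r["watts"]
        acc[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


# Hypothetical smart home energy readings (not real data).
readings = [
    {"device_id": "fridge", "ts_ms": 1_000, "watts": 100.0},
    {"device_id": "fridge", "ts_ms": 59_000, "watts": 140.0},
    {"device_id": "fridge", "ts_ms": 61_000, "watts": 90.0},
    {"device_id": "oven", "ts_ms": 5_000, "watts": 2000.0},
]
result = tumbling_avg(readings)
```

Unlike this batch-style sketch, a streaming engine updates such aggregates continuously as each record arrives, so the input never has to be held in a list first.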

* Video Recording:
https://www.youtube.com/watch?v=BkIwgWYRTYc&list=PLRsbF2sD7JVo4wqpokeojf07YfZsn5iUq&index=11

Hans-Peter Grahsl

October 31, 2019

Transcript

  1. Taming IoT Data:
    Making Sense of Sensors with
    SQL Streaming

  2. ? streaming ?
    @hpgrahsl | #VoxxedDays Cluj-Napoca, 31st Oct 2019, România 2

  3. "... data processing
    that is designed with
    infinite data sets
    in mind."
    — Tyler Akidau

  4. Streaming
    is a big deal

  5. EVENTS
    ...events everywhere

  6. impatient and demanding
    NOW!

  7. Streaming Technologies
    ✓ purpose-built for data-in-motion
    ✓ events are 1st class citizens
    ✓ faster results & accurate answers

  8. biggest
    Challenge?

  10. not a mess

  11. BUT A MAZE

  14. Apache
    Kafka

  15. Apache Kafka
    ✓ pub / sub to event streams
    ✓ (permanently) store event streams
    ✓ process streams in near real-time
    ➔ horizontal scalability
    ➔ high fault-tolerance

  16. Event
    Streaming
    PLATFORM

  17. APIs for Everything

  18. Everything built for Streaming

  19. Kafka's
    streaming
    SQL
    engine

  20. declarative
    stream
    processing
    language

  21. skyrocketing
    developer
    productivity

  22. unleash
    streaming for
    the masses

  23. KSQL in your Blood Cell

  24. KSQL in a Nutshell
    ✓ built on top of Kafka Streams
    ✓ NO(!) coding skills required
    ✓ SQL only ➔ not embedded
    ✓ extremely low entry barrier
    ✓ familiar syntax & semantics
    ✓ concise & expressive

  25. KSQL in a Nutshell
    the usual suspects OOTB:
    ✓ projections, filters
    ✓ joins, aggregations
    ✓ windowing
    something missing?
    ✓ UDF & UDAF
    ✓ UDTF pending

  26. KSQL Queries
    ✓ per-record streaming with ms latency
    ✓ compiled into Kafka Streams apps
    ✓ distributed execution: KSQL servers
    ✓ 2 modes: interactive vs. headless

  27. KSQL's Interactive Mode
    ✓ KSQL servers accessed via REST API
    ✓ offers ad-hoc analytics of streams
    ✓ users can share streams & tables
    ✓ used for exploration and during development

  32. KSQL's Headless Mode
    ✓ application == SQL file
    ✓ KSQL servers run streaming queries
    ✓ use case specific isolation
    ✓ "locked-down" ➡ NO REST API access
    ✓ used for production deployments

  34. SHOW ME!

  35. Demo Scenario ...

  36. Demo Scenario: Data Ingestion

  37. Demo Scenario: Data Processing

  38. KSQL in Action

  39. "You think that's a
    database table
    you're querying now?"
    — Morpheus

  40. "Instead, only try to
    realize the truth - there is
    no database table."
    — Spoon Boy

  41. Demo Scenario: Data Integration

  42. Demo Scenario: Data Serving

  43. MISSION
    accomplished

  44. KSQL Wrap-Up
    ✓ SQL... and nothing but SQL
    ✓ use cases of any size (XS ... XXXL)
    ✓ scalable & fault-tolerant
    ✓ deployable anywhere in any way
    ✓ no additional infrastructure

  45. Even if your obsession tells you to do batching,
    I'd like you to walk away and stream with
    KSQL
    The choice is yours folks!

  46. reach out to me
    @hpgrahsl

  47. THANK YOU
    ? Questions ?
    https://bit.ly/2NF3CGL
