Taming IoT Data: Making Sense of Sensors with SQL Streaming @VoxxedDays Zurich

as presented at VoxxedDays Zürich 2019, Switzerland

Abstract:

We are living in a data streaming era, yet until recently it has been particularly hard to leverage existing stream processing technologies. On the one hand, dealing with data in motion has its inherent challenges. On the other hand, most frameworks and APIs that allow for stream processing are typically very hard to employ and/or operate. NOT so for KSQL, the newest kid in Apache Kafka's ecosystem. Based on a simplified version of an IoT use case, this session gives a gentle introduction to KSQL, Kafka's SQL streaming engine for the masses.

Join this whirlwind tour, during which we discuss a fully fledged streaming IoT architecture. Concretely, we are going to:

(1) ingest smart home energy data into Apache Kafka, (2) use KSQL for flexible, powerful and scalable SQL-only stream processing, (3) send raw data as well as pre-processed results to an operational NoSQL data store, (4) reactively serve data to clients in near real-time and (5) finally build informative live charts.
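
To give a feel for what step (2) computes, here is a minimal plain-Python sketch of a tumbling-window average, the kind of result a streaming SQL query with a windowed GROUP BY would produce. It is not KSQL and not taken from the talk: the field names (`device_id`, `ts_ms`, `watts`), the one-minute window size and the sample readings are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple


class Reading(NamedTuple):
    """Hypothetical smart home energy reading (field names are assumptions)."""
    device_id: str
    ts_ms: int      # event time in milliseconds
    watts: float


WINDOW_MS = 60_000  # assumed tumbling window size: one minute


def window_start(ts_ms: int, size_ms: int = WINDOW_MS) -> int:
    """Align a timestamp to the start of the tumbling window it falls into."""
    return ts_ms - (ts_ms % size_ms)


def average_per_window(readings: Iterable[Reading]) -> Dict[Tuple[str, int], float]:
    """Consume readings one record at a time, keeping a running (sum, count)
    per (device, window start), and return the average per key."""
    state: Dict[Tuple[str, int], Tuple[float, int]] = defaultdict(lambda: (0.0, 0))
    for r in readings:
        key = (r.device_id, window_start(r.ts_ms))
        total, count = state[key]
        state[key] = (total + r.watts, count + 1)
    return {key: total / count for key, (total, count) in state.items()}


if __name__ == "__main__":
    sample = [
        Reading("fridge", 1_000, 100.0),
        Reading("fridge", 30_000, 200.0),
        Reading("fridge", 61_000, 50.0),   # falls into the next window
        Reading("oven", 5_000, 1500.0),
    ]
    for (device, start), avg in sorted(average_per_window(sample).items()):
        print(device, start, avg)
```

A real streaming engine would additionally emit updated results continuously and deal with late or out-of-order records; this sketch only shows the grouping and aggregation logic over a finite sample.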

Conference Page: https://voxxeddays.com/zurich

YouTube Recording: https://www.youtube.com/watch?v=fqqHG3pT5xQ

Hans-Peter Grahsl

March 19, 2019

Transcript

  1. Taming IoT Data: Making Sense of Sensors with SQL Streaming

  2. $ whoami • Hans-Peter Grahsl • working & living in Graz • technical trainer at • independent consultant & engineer • associate lecturer • irregular conference speaker @hpgrahsl | #VDZ19 @VoxxedZurich, 19th March 2019, Switzerland
  3. WHAT IS STREAMING?
  4. "a type of data processing that is designed with infinite

    data sets in mind" — Tyler Akidau
  5. Streaming == BIG DEAL 1. unbounded data sets are prevalent ➡ never-ending data streams need purpose-built systems 2. people crave timely information ➡ stream processing technology helps lower latencies
  6. BIGGEST Challenge?
  7. These and many many more...

  8. Today the choice is mine

  9. Apache Kafka
  10. STREAMING PLATFORM

  18. Kafka's streaming SQL engine

  19. declarative stream processing language
  20. skyrocketing developer productivity

  21. unlocks streaming for the masses

  22. KSQL's Nature • built on top of Kafka Streams • SQL only (not embedded) • NO(!) coding skills required • extremely low entry barrier • familiar syntax and semantics • concise and expressive • joins, aggregations, windowing • UD(A)Fs and UDTFs coming soon...

  29. KSQL Queries • per-record streaming with millisecond latency • compiled into Kafka Streams applications • follow same execution model • distributed over multiple KSQL servers • two operation modes / deployment options: interactive vs. headless
  30. KSQL interactive mode • KSQL servers accessed via REST API • offers ad-hoc stream analytics • share streams & tables across users • used for exploration and during development

  35. KSQL headless mode • streaming queries given by a SQL file • KSQL servers process SQL file • use case specific isolation • "locked-down" ➡ NO REST API access • used for production deployments

  45. step 1: ingest sensor data

  46. step 2: KSQL streaming
  47. "You think that's a database table you are querying now?"

    — Morpheus
  48. "Instead, only try to realize the truth... there is no

    database table." — Spoon Boy
  49. step 3: connecting NoSQL

  50. step 4: reactive notifications

  51. step 5: live dashboards
  52. MISSION accomplished

  53. KSQL wrap-up • streaming with SQL ... and nothing but SQL • scalable & fault-tolerant • deployable anywhere: cloud or on-prem • viable for use cases of any size (XS ... XXXL) • exactly-once delivery guarantee semantics
  54. "If I have been faster it's by streaming on the

    shoulders of Apache Kafka." — my other self
  55. Your obsession tells you to do batching. I tell you to walk away and stream with KSQL. The choice is yours, folks!
  56. THANK YOU • Q & A? https://bit.ly/2FaLr7w
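
Slide 30 notes that in interactive mode the KSQL servers are accessed via a REST API. As a rough sketch of what such an access could look like from Python, the snippet below builds (but does not send) a request that submits one statement. The server address `http://localhost:8088`, the `/ksql` path, the payload shape and the content type reflect my understanding of the KSQL REST API and are assumptions here, not something shown in the deck; check the documentation for the server version you run.

```python
import json
import urllib.request

# Assumption: a KSQL server reachable locally on port 8088.
KSQL_SERVER = "http://localhost:8088"


def build_ksql_request(statement: str, server: str = KSQL_SERVER) -> urllib.request.Request:
    """Build a POST request carrying a single KSQL statement.

    The request is only constructed here, not sent, so the sketch
    can be inspected without a running server.
    """
    body = json.dumps({"ksql": statement, "streamsProperties": {}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{server}/ksql",
        data=body,
        headers={"Content-Type": "application/vnd.ksql.v1+json; charset=utf-8"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_ksql_request("LIST STREAMS;")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To run it against a live server (requires one to be reachable):
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp))
```

In headless mode, as slide 35 states, this kind of access is not available; the queries come from a SQL file instead.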