
From ksqlDB with LOVE: Detecting 007 with a Dash of Machine Learning @ Graz Kafka Meetup #5

* Abstract:

While many people are drawn to stream processing technologies, many hesitate to adopt them because the landscape and ecosystem feel somewhat intimidating at first sight. Fear not: ksqlDB (the event streaming database for Apache Kafka) takes away much of the burden so that application developers can focus on solving business needs. After a brief introduction to ksqlDB, this session continues with a memorable and fun example featuring the fictional Secret Service agent 007. Step by step, you are walked through an end-to-end streaming scenario: near real-time face identification in synthetic CCTV images. The example helps you gain a better understanding of ksqlDB’s learning curve, its extensibility options, and its integration paths with machine-learning tools and services.

* Recording: https://videos.confluent.io/watch/zkbpHyW6ECKvfJdV1prpib

Hans-Peter Grahsl

July 16, 2020

Transcript

  1. Hans-Peter Grahsl ‣ technical trainer ‣ independent engineer & consultant ‣ associate lecturer ‣ occasional conference speaker @hpgrahsl | #ConfluentVUG #ksqlDB #ApacheKafka | 2020-07-16
  2. KSQL in a Nutshell ‣ ANSI SQL inspired ‣ familiar syntax & semantics ‣ concise & expressive ‣ built on top of Kafka Streams ‣ NO(!) coding skills required ‣ entry barrier? "none"
  3. KSQL in a Nutshell ‣ usual suspects OOTB: ‣ projections, filters ‣ joins, aggregations ‣ windowing
  4. Criteria 1: Learning Curve ‣ SQL == widespread & successful 4GL ‣ algorithms & data structures handled for us ‣ KSQL ‣ very easy to learn ‣ quick implementation cycles ‣ productivity-wise hard to beat
  5. What if I told you it's as easy to build a STREAMING app as it is to build a CRUD app?
  6. Connecting with Sources HINT: connector examples shown in demo later
  7. Connecting with Sinks HINT: connector examples shown in demo later
  8. Instead, only try to realize the truth... ksqlDB is NO DATABASE as we know it.
  9. Data Concepts in ksqlDB STREAM ‣ immutable append-only sequence ‣ captures events representing a series of facts
  10. Data Concepts in ksqlDB TABLE ‣ mutable collection of events ‣ holds the last known value for each key ‣ also results from stateful operations
  11. ksqlDB: PUSH Queries ‣ act as subscription to query results ‣ fit asynchronous & reactive data flows ‣ run indefinitely ‣ new data causes continuous updates
  12. ksqlDB: PULL Queries ‣ fetch point-in-time results ‣ fit request / response data flows ‣ terminate immediately ‣ lookup current state of materialized views
  13. ksqlDB Functions ‣ choose from three categories ‣ scalar ➔ UDF ‣ aggregation ➔ UDAF ‣ table ➔ UDTF
  14. Criteria 2: Extensibility Options ‣ custom functions (UDFs, UDAFs & UDTFs) ‣ enable flexible & powerful capabilities ‣ but Java code needed HINT: custom UDF example shown in demo later
  15. Criteria 3: ML Integration Paths ‣ call fully-managed ML services (external) ‣ run your own model server (co-located) ‣ package home-brewed model into UDF (embedded) ‣ completely separated: integrate ML results via Connectors
  16. Criteria 4: Deployment Options ‣ run however you want ‣ bare metal, VMs, containers ‣ deploy wherever you need ‣ on-premises, private / public cloud, hybrid ‣ something fully-managed?
  17. Unfortunately, no one can be told what ksqlDB is... You have to TRY IT for yourselves! https://ksqldb.io
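
The STREAM / TABLE and PUSH / PULL slides describe semantics that may be easier to see in a toy model. The sketch below is plain Python, not ksqlDB code, and every name in it (`Stream`, `Table`, `subscribe`, `pull`, the sample agents and locations) is a hypothetical stand-in chosen for illustration. It assumes a stream is an append-only log, a table keeps the last known value per key, a push query behaves like a subscription, and a pull query is a point-in-time lookup.

```python
class Stream:
    """Toy append-only sequence of (key, value) events (illustrative only)."""

    def __init__(self):
        self._events = []        # facts are only ever appended
        self._subscribers = []   # callbacks standing in for push queries

    def append(self, key, value):
        self._events.append((key, value))
        for callback in self._subscribers:
            callback(key, value)  # new data causes continuous updates

    def subscribe(self, callback):
        """Push-style: the callback keeps receiving every new event."""
        self._subscribers.append(callback)

    def events(self):
        return tuple(self._events)  # read-only snapshot of the log


class Table:
    """Toy materialized view: last known value for each key."""

    def __init__(self, stream):
        self._state = {}
        stream.subscribe(self._upsert)

    def _upsert(self, key, value):
        self._state[key] = value  # a newer event overwrites the older value

    def pull(self, key):
        """Pull-style: return the current state once, then finish."""
        return self._state.get(key)


sightings = Stream()
last_seen = Table(sightings)

pushed = []  # what a push-style subscriber has received so far
sightings.subscribe(lambda key, value: pushed.append((key, value)))

sightings.append("007", "London")
sightings.append("006", "Graz")
sightings.append("007", "Vienna")

print(sightings.events())     # the stream still holds all three facts
print(last_seen.pull("007"))  # the table holds only the latest value per key
print(pushed)                 # the subscriber saw each event as it arrived
```

The stream keeps every event, while the table collapses them to one row per key. That is the distinction the two "Data Concepts" slides draw.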
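
The three function categories on the "ksqlDB Functions" slide differ mainly in how many rows go in and how many come out. Real custom functions require Java, as the deck notes, so the following is only a rough Python analogy. The function names and sample data are assumptions made for illustration and are not part of any ksqlDB API.

```python
from collections import Counter


def mask_name(name):
    """Scalar (UDF-like): one input value in, one output value out."""
    if not name:
        return name
    return name[0] + "*" * (len(name) - 1)


def count_per_agent(rows):
    """Aggregation (UDAF-like): many (agent, location) rows in, one count per agent out."""
    return dict(Counter(agent for agent, _ in rows))


def explode_locations(agent, locations):
    """Table function (UDTF-like): one input row in, zero or more rows out."""
    for location in locations:
        yield (agent, location)


rows = [("007", "London"), ("006", "Graz"), ("007", "Vienna")]

print(mask_name("Bond"))
print(count_per_agent(rows))
print(list(explode_locations("007", ["London", "Vienna"])))
```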