Processing Streaming Data with KSQL

Apache Kafka is a de facto standard streaming data processing platform: it is widely deployed as a messaging system and has a robust data integration framework (Kafka Connect) and stream processing API (Kafka Streams) to meet the needs that commonly attend real-time message processing. But there’s more!

Kafka now offers KSQL, a declarative, SQL-like stream processing language that lets you define powerful stream-processing applications easily. What once took some moderately sophisticated Java code can now be done at the command line with a familiar and eminently approachable syntax. Come to this talk for an overview of KSQL with live coding on live streaming data.
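
As a rough illustration of the logic such a declarative statement stands in for, here is a plain-Python sketch (not Kafka Streams or KSQL itself) of the fraud-detection query from the slide transcript: count authorization attempts per card in 5-minute tumbling windows and keep the counts above 3. The function name, the `(timestamp_ms, card_number)` event layout, and the sample data are assumptions made for the example; a real deployment would read from a Kafka topic and deal with late or out-of-order events, which this sketch ignores.

```python
from collections import Counter

WINDOW_MS = 5 * 60 * 1000  # 5-minute tumbling window, in milliseconds


def possible_fraud(attempts, threshold=3, window_ms=WINDOW_MS):
    """Count attempts per (window, card) and keep counts above the threshold.

    attempts: iterable of (timestamp_ms, card_number) pairs (hypothetical layout).
    Returns a dict mapping (window_start_ms, card_number) -> count.
    """
    counts = Counter()
    for timestamp_ms, card_number in attempts:
        # Tumbling windows do not overlap: each event falls in exactly one.
        window_start = timestamp_ms - (timestamp_ms % window_ms)
        counts[(window_start, card_number)] += 1
    # Equivalent of HAVING count(*) > 3
    return {key: n for key, n in counts.items() if n > threshold}


# Made-up sample data: card "1111" is tried four times in the first window.
attempts = [
    (0, "1111"), (60_000, "1111"), (120_000, "1111"), (180_000, "1111"),
    (200_000, "2222"),
    (300_000, "1111"),  # lands in the next window, so it is counted separately
]
flagged = possible_fraud(attempts)
print(flagged)  # {(0, '1111'): 4}
```

The point of the comparison is only that windowing, grouping, and filtering each need explicit code here, while the KSQL statement names them in a single query.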

Viktor Gamov

October 06, 2018

Transcript

  1. Processing Streaming Data with KSQL @gamussa #SQLSaturday

  2. @gamussa #SQLSaturday @confluentinc KSQL is a Declarative Stream Processing Language

  3. KSQL is the Streaming SQL Engine for Apache Kafka
  4. Who am I? Solutions Architect, Developer Advocate, @gamussa in internetz. Hey you, yes, you, go follow me in twitter ©
  5. Stream Processing by Analogy: Kafka Cluster, Connect API, Stream Processing, Connect API

    $ cat < in.txt | grep "ksql" | tr a-z A-Z > out.txt
  6. Kafka is a Streaming Platform: The Log, Connectors, Connectors, Producer, Consumer, Streaming Engine
  7. Streaming is the toolset for dealing with events as they move!
  8. What exactly is Stream Processing? authorization_attempts, possible_fraud
  9. What exactly is Stream Processing? authorization_attempts, possible_fraud

    CREATE STREAM possible_fraud AS
    SELECT card_number, count(*)
    FROM authorization_attempts
    WINDOW TUMBLING (SIZE 5 MINUTE)
    GROUP BY card_number
    HAVING count(*) > 3;
  16. Table-Stream Duality

  17. Do you think that’s a table you are querying?

  18. Streams to Tables

  20. Stream/Table Duality

  22. TABLE, STREAM, TABLE

    TABLE: [Gary 1], [Gary 1, Viktor 1], [Gary 2, Viktor 1], [Gary 2, Viktor 1, Soby 1]
    STREAM: (“Gary”, 1) (“Viktor”, 1) (“Gary”, 2) (“Soby”, 1)
    TABLE: [Gary 1], [Gary 1, Viktor 1], [Gary 2, Viktor 1], [Gary 2, Viktor 1, Soby 1]
  23. Join Streams and Tables: Compacted Topic, Join, Stream, Table, Kafka, Kafka Streams, Topic
  24. Demo

  25. Where is KSQL not such a great fit?

    BI reports (Tableau etc.)
    • No indexes
    • No JDBC (most BI tools are not good with continuous results!)
    Ad-hoc queries
    • Limited span of time usually retained in Kafka
    • No indexes
  26. Resources and Next Steps

    https://github.com/confluentinc/ksql
    http://confluent.io/ksql
    https://slackpass.io/confluentcommunity #ksql
  27. Thanks! @gamussa viktor@confluent.io

    We are hiring! https://www.confluent.io/careers/
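
The stream/table duality on slide 22 can be sketched in a few lines of plain Python, using the slide's own data. Replaying a stream of (key, value) updates builds the table, and diffing successive table snapshots gives the stream back. The function names are hypothetical and nothing here touches Kafka or KSQL; it only illustrates the idea, assuming each event simply overwrites the value stored for its key.

```python
def stream_to_table(events):
    """Replay a changelog of (key, value) updates into a table.

    Returns the final table and a snapshot of the table after each event.
    """
    table = {}
    snapshots = []
    for key, value in events:
        table[key] = value            # latest value per key wins
        snapshots.append(dict(table))
    return table, snapshots


def table_to_stream(snapshots):
    """Recover the changelog by emitting whatever changed between snapshots."""
    previous = {}
    for snapshot in snapshots:
        for key, value in snapshot.items():
            if previous.get(key) != value:
                yield (key, value)
        previous = snapshot


# The events shown on slide 22.
stream = [("Gary", 1), ("Viktor", 1), ("Gary", 2), ("Soby", 1)]

table, snapshots = stream_to_table(stream)
changelog = list(table_to_stream(snapshots))

print(table)                # {'Gary': 2, 'Viktor': 1, 'Soby': 1}
print(changelog == stream)  # True
```

The four snapshots match the four table states on the slide, ending with Gary 2, Viktor 1, Soby 1.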