Processing Streaming Data with KSQL

Apache Kafka is a de facto standard platform for processing streaming data. It is widely deployed as a messaging system, and it has a robust data integration framework (Kafka Connect) and a stream processing API (Kafka Streams) to meet the needs that commonly attend real-time message processing. But there’s more!

Kafka now offers KSQL, a declarative, SQL-like stream processing language that lets you define powerful stream-processing applications easily. What once took some moderately sophisticated Java code can now be done at the command line with a familiar and eminently approachable syntax. Come to this talk for an overview of KSQL with live coding on live streaming data.

Viktor Gamov

October 06, 2018

Transcript

  1. @gamussa #SQLSaturday @confluentinc

    Who am I? Solutions Architect, Developer Advocate. @gamussa in internetz. Hey you, yes, you, go follow me on Twitter.
  2. Stream Processing by Analogy

    Kafka Cluster: Connect API, Stream Processing, Connect API
    $ cat < in.txt | grep "ksql" | tr a-z A-Z > out.txt
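The shell pipeline on this slide maps onto the three stages of a streaming application: a source (`cat`), processing steps (`grep`, `tr`), and a sink (`> out.txt`). As a rough illustration of the same shape, here is a minimal Python sketch built from generators. The function names are hypothetical and chosen only for this example; nothing here is Kafka or KSQL API.

```python
from typing import Iterable, Iterator, List


def source(lines: Iterable[str]) -> Iterator[str]:
    """Plays the role of `cat < in.txt`: emit records one at a time."""
    for line in lines:
        yield line


def keep_matching(records: Iterable[str], needle: str) -> Iterator[str]:
    """Plays the role of `grep "ksql"`: pass through only matching records."""
    return (r for r in records if needle in r)


def to_upper(records: Iterable[str]) -> Iterator[str]:
    """Plays the role of `tr a-z A-Z` (equivalent for ASCII input only)."""
    return (r.upper() for r in records)


def run_pipeline(lines: Iterable[str]) -> List[str]:
    """Plays the role of `> out.txt`: collect whatever reaches the sink."""
    return list(to_upper(keep_matching(source(lines), "ksql")))


if __name__ == "__main__":
    sample = ["intro to ksql", "unrelated line", "ksql windows"]
    print(run_pipeline(sample))
```

Each stage consumes records lazily from the previous one, which is the property the analogy relies on: no stage needs the whole input before it starts working.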
  3. Kafka is a Streaming Platform

    The Log, Connectors, Producer, Consumer, Streaming Engine
  4. What exactly is Stream Processing?

    authorization_attempts → possible_fraud

    CREATE STREAM possible_fraud AS
      SELECT card_number, count(*)
      FROM authorization_attempts
      WINDOW TUMBLING (SIZE 5 MINUTE)
      GROUP BY card_number
      HAVING count(*) > 3;
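To make the windowed query concrete, the sketch below reproduces its logic in plain Python over a finite list of events. It is an illustration under stated assumptions, not how KSQL executes the statement: KSQL evaluates the query continuously as events arrive, while this version is a one-shot batch. The `(timestamp_ms, card_number)` event shape and the function name `possible_fraud` are assumptions made for the example; windows are aligned to the epoch, as tumbling windows are in Kafka Streams.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

WINDOW_MS = 5 * 60 * 1000  # WINDOW TUMBLING (SIZE 5 MINUTE)


def possible_fraud(
    attempts: Iterable[Tuple[int, str]], threshold: int = 3
) -> Dict[Tuple[int, str], int]:
    """Count attempts per (window start, card) and keep counts above threshold.

    `attempts` is an iterable of (timestamp_ms, card_number) pairs; this
    event shape is an assumption made for the sketch.
    """
    counts: Counter = Counter()
    for timestamp_ms, card_number in attempts:
        # Tumbling windows do not overlap: each event belongs to exactly one
        # window, identified here by its epoch-aligned start time.
        window_start = timestamp_ms - timestamp_ms % WINDOW_MS
        counts[(window_start, card_number)] += 1  # GROUP BY card_number, per window
    # HAVING count(*) > 3
    return {key: n for key, n in counts.items() if n > threshold}


if __name__ == "__main__":
    events = [(0, "A"), (1_000, "A"), (2_000, "A"), (3_000, "A"), (4_000, "B")]
    print(possible_fraud(events))
```

Because the windows tumble rather than slide, four attempts that straddle a window boundary (say, two just before and two just after) would not be flagged by this query.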
  10. TABLE → STREAM → TABLE

    TABLE (successive states): {Gary 1}; {Gary 1, Viktor 1}; {Gary 2, Viktor 1}; {Gary 2, Viktor 1, Soby 1}
    STREAM (changes): ("Gary", 1), ("Viktor", 1), ("Gary", 2), ("Soby", 1)
    TABLE (rebuilt from the stream): {Gary 1}; {Gary 1, Viktor 1}; {Gary 2, Viktor 1}; {Gary 2, Viktor 1, Soby 1}
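The table and stream on this slide carry the same information in two forms: replaying the stream of changes rebuilds every state of the table, and comparing successive table states recovers the stream. A minimal Python sketch of both directions follows, using the slide's own data. The function names are hypothetical, and deletions (tombstones) are deliberately left out to keep the example short.

```python
from typing import Dict, List, Tuple

Change = Tuple[str, int]
Snapshot = Dict[str, int]


def table_from_stream(changelog: List[Change]) -> List[Snapshot]:
    """Replay changes as upserts and record the table state after each one."""
    table: Snapshot = {}
    snapshots: List[Snapshot] = []
    for key, value in changelog:
        table[key] = value  # the latest value per key wins
        snapshots.append(dict(table))
    return snapshots


def stream_from_table(snapshots: List[Snapshot]) -> List[Change]:
    """Emit one change for every key that is new or has a new value."""
    previous: Snapshot = {}
    changes: List[Change] = []
    for snapshot in snapshots:
        for key, value in snapshot.items():
            if key not in previous or previous[key] != value:
                changes.append((key, value))
        previous = snapshot
    return changes


if __name__ == "__main__":
    changelog = [("Gary", 1), ("Viktor", 1), ("Gary", 2), ("Soby", 1)]
    states = table_from_stream(changelog)
    print(states[-1])
    print(stream_from_table(states) == changelog)
```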
  11. Where is KSQL not such a great fit?

    BI reports (Tableau etc.)
    • No indexes
    • No JDBC (most BI tools are not good with continuous results!)

    Ad-hoc queries
    • Limited span of time usually retained in Kafka
    • No indexes