Slide 1

Slide 1 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Calculating the Value of Pie Real-Time Survey Analysis With Apache Kafka®

Slide 2

Slide 2 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ I like houseplants.

Slide 3

Slide 3 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ I also like baking.

Slide 4

Slide 4 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/

Slide 5

Slide 5 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Google Forms?

Slide 6

Slide 6 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Google Forms?

Slide 7

Slide 7 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Kafka- and Telegram-based Survey-issuing Application?

Slide 8

Slide 8 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Surveying the Problem Landscape
● Surveys are EVERYWHERE
● Great business value
  ○ External: what do your customers think of your product?
  ○ Internal: how are your employees doing?
● Often batch-processed

Slide 9

Slide 9 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Real-Time Survey Analysis

Slide 10

Slide 10 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/

Slide 11

Slide 11 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Enhanced Recipe ? !

Slide 12

Slide 12 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Kafka? What’s that?

Slide 13

Slide 13 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Kafka? What’s that? A distributed event streaming platform.

Slide 14

Slide 14 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ A distributed event streaming platform.

Slide 15

Slide 15 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Paradigm Shift
[diagram] Batch processing: a processor periodically turns accumulated input into batch results. Real-time processing: the processor emits a real-time update for every incoming event.
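To make the shift concrete, here is a tiny illustrative Python sketch (not from the talk): the batch path computes one result once all responses are in, while the real-time path emits an updated result for every incoming response, which is what the ksqlDB queries later in the deck do.

from collections import Counter

events = ["Pumpkin Pie", "Apple Pie", "Apple Pie", "Pecan Pie"]

# Batch: wait for all responses, then compute one result at the end
batch_result = Counter(events)
print("batch:", dict(batch_result))

# Real-time: update (and publish) the running result as each event arrives
running = Counter()
for response in events:
    running[response] += 1
    print("update:", dict(running))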

Slide 16

Slide 16 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ A distributed event streaming platform.

Slide 17

Slide 17 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Thinking in Events
● Natural way to reason about things
● Indicate that something has happened
  ○ When
  ○ What/Who
● Immutable pieces of information

Slide 18

Slide 18 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Thinking in Events
● Natural way to reason about things
● Indicate that something has happened
  ○ When
  ○ What/Who
● Immutable pieces of information
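As a concrete illustration (the field values here are invented), a single survey entry can be thought of as an immutable event that records when something happened and who/what was involved:

# A hypothetical survey event: a record of something that happened,
# capturing when it happened and who/what was involved. Once written,
# it is never updated in place, only followed by newer events.
survey_entry_event = {
    "timestamp": "2022-11-24T17:03:21Z",   # when
    "user_id": "8675309",                  # who
    "survey_id": "1",                      # what it relates to
    "response": "Apple Pie",               # what happened
}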

Slide 19

Slide 19 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ A distributed event streaming platform.

Slide 20

Slide 20 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Kafka Storage
● Topics
  ○ Basic storage unit
  ○ Read/Write:
    ■ Producer and consumer clients
    ■ Completely decoupled
● Partitions
  ○ Immutable, append-only logs
  ○ Data is replicated at this level
[diagram: a Kafka Topic with partitions P0, P1, P2]
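A hedged sketch of what creating such a topic could look like with confluent-kafka's admin client; the broker address, partition count, and replication factor below are placeholder assumptions (in Confluent Cloud you would more likely use the UI or CLI):

from confluent_kafka.admin import AdminClient, NewTopic

# Illustrative only: create a topic with three partitions, each replicated
# three times. Broker address and settings are assumptions.
admin = AdminClient({"bootstrap.servers": "localhost:9092"})
futures = admin.create_topics(
    [NewTopic("survey-entries", num_partitions=3, replication_factor=3)]
)
for topic, future in futures.items():
    future.result()  # raises if topic creation failed
    print(f"Created topic {topic}")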

Slide 21

Slide 21 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Recipe ? !

Slide 22

Slide 22 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ ! Recipe ?

Slide 23

Slide 23 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Kafka, but make it simple.
● Fully-managed, cloud-based Kafka
● Auxiliary tools:
  ○ Kafka Connect
  ○ ksqlDB
  ○ Schema management
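For reference, pointing the Python clients at a fully managed cluster is mostly a configuration change; the endpoint and API key/secret below are placeholders:

from confluent_kafka import Producer

# Placeholder values; a Confluent Cloud cluster is reached over SASL_SSL
# with an API key/secret instead of a plain local bootstrap address.
conf = {
    "bootstrap.servers": "<BOOTSTRAP_ENDPOINT>",
    "security.protocol": "SASL_SSL",
    "sasl.mechanisms": "PLAIN",
    "sasl.username": "<API_KEY>",
    "sasl.password": "<API_SECRET>",
}
producer = Producer(conf)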

Slide 24

Slide 24 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ ! Recipe ?

Slide 25

Slide 25 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Knowing Your Pie Data

Slide 26

Slide 26 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ survey-entries
{
  "doc": "Contains individual survey entries for a user and survey question.",
  "fields": [
    { "name": "survey_id", "type": "string" },
    { "name": "user_id", "type": "string" },
    { "name": "name", "type": "string" },
    { "default": null, "name": "company", "type": ["null", "string"] },
    { "default": null, "name": "location", "type": ["null", "string"] },
    { "name": "response", "type": "string" }
  ],
  "name": "surveyEntry",
  "namespace": "survey.bot",
  "type": "record"
}

Slide 27

Slide 27 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ survey-respondents
{
  "doc": "Respondents to survey questions.",
  "fields": [
    { "name": "user_id", "type": "string" },
    { "name": "name", "type": "string" },
    { "default": null, "name": "company", "type": ["null", "string"] },
    { "default": null, "name": "location", "type": ["null", "string"] }
  ],
  "name": "surveyRespondent",
  "namespace": "survey.bot",
  "type": "record"
}

Slide 28

Slide 28 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ survey-responses
{
  "doc": "Contains individual survey responses for a user and survey question.",
  "fields": [
    { "name": "survey_id", "type": "string" },
    { "name": "user_id", "type": "string" },
    { "name": "response", "type": "string" }
  ],
  "name": "surveyResponse",
  "namespace": "survey.bot",
  "type": "record"
}

Slide 29

Slide 29 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ survey-questions
{
  "doc": "Details for a single question survey.",
  "fields": [
    { "name": "survey_id", "type": "string" },
    { "name": "question", "type": "string" },
    { "name": "summary", "type": "string" },
    { "name": "options", "type": { "items": "string", "type": "array" } },
    { "name": "enabled", "type": "boolean" }
  ],
  "name": "surveyQuestion",
  "namespace": "survey.bot",
  "type": "record"
}
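The deck doesn't show how these Avro schemas reach Schema Registry; the serializer can auto-register them, or they can be registered explicitly. A minimal sketch, assuming the survey-entries schema above is saved to a local survey_entry.avsc file and a placeholder registry URL:

from confluent_kafka.schema_registry import SchemaRegistryClient, Schema

# Illustrative sketch: register the survey-entries value schema under the
# conventional <topic>-value subject name. URL and file path are assumptions.
sr_client = SchemaRegistryClient({"url": "<SCHEMA_REGISTRY_URL>"})

with open("survey_entry.avsc") as f:   # hypothetical file holding the Avro schema above
    schema_str = f.read()

schema_id = sr_client.register_schema("survey-entries-value", Schema(schema_str, "AVRO"))
print(f"Registered schema id {schema_id}")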

Slide 30

Slide 30 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Creating a Survey
{
  "survey_id": "1",
  "question": "Which Thanksgiving Pie is your favorite?",
  "summary": "Thanksgiving Pie",
  "options": [
    "Pumpkin Pie",
    "Pecan Pie",
    "Apple Pie",
    "Thanksgiving Leftover Pot Pie",
    "Other"
  ],
  "enabled": true
}
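The talk doesn't show how this survey definition lands on the survey-questions topic; a hypothetical, minimal way to publish it could look like this (JSON used here for brevity, while the real topic uses Avro):

import json
from confluent_kafka import Producer

# Hypothetical helper, not from the talk: publish a survey definition
# to the survey-questions topic. Broker address is a placeholder.
producer = Producer({"bootstrap.servers": "localhost:9092"})

survey = {
    "survey_id": "1",
    "question": "Which Thanksgiving Pie is your favorite?",
    "summary": "Thanksgiving Pie",
    "options": ["Pumpkin Pie", "Pecan Pie", "Apple Pie",
                "Thanksgiving Leftover Pot Pie", "Other"],
    "enabled": True,
}

producer.produce("survey-questions", key=survey["survey_id"], value=json.dumps(survey))
producer.flush()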

Slide 31

Slide 31 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Building a Telegram Bot

Slide 32

Slide 32 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Telegram as a Producer
● python-telegram-bot library
  ○ Wrapper for Telegram API
  ○ Define conversation handlers
● Produce data to Kafka
  ○ Capture data using conversation handlers
  ○ Create survey-entry message, serialize, and produce
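A minimal sketch of the bot wiring, assuming python-telegram-bot v20+ and the survey_handler shown on the next slide; the token is a placeholder:

from telegram.ext import Application

# Build the bot application, register the conversation handler, and start polling.
app = Application.builder().token("<TELEGRAM_BOT_TOKEN>").build()
app.add_handler(survey_handler)   # survey_handler as defined on the next slide
app.run_polling()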

Slide 33

Slide 33 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Conversation Handler
survey_handler = ConversationHandler(
    entry_points=[CommandHandler('survey', survey_command)],
    states={
        SURVEY_STATE.NAME: [
            MessageHandler(filters.TEXT & ~filters.COMMAND, name_command),
            CommandHandler('cancel', cancel_command)
        ],
        ...
        SURVEY_STATE.RESPONSE: [
            MessageHandler(filters.Regex(
                "^(Pumpkin Pie|Pecan Pie|Apple Pie|Thanksgiving Leftover Pot Pie|Other)$"
            ), response_command),
            CommandHandler('cancel', cancel_command)
        ],
        SURVEY_STATE.CONFIRM: [
            CommandHandler('y', confirm_command),
            CommandHandler('n', cancel_command)
        ]
    },
    fallbacks=[CommandHandler('cancel', cancel_command)]
)

Slide 34

Slide 34 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Command Handler
async def name_command(update: Update, context: ContextTypes.DEFAULT_TYPE) -> int:
    # capture and store name data
    context.user_data['name'] = update.message.text

    # get chat_id as user_id
    context.user_data['user_id'] = update.message.chat_id

    # prompt for company information
    await update.message.reply_text(
        "Enter the company that you work for or use /skip to go to the next question."
    )
    return SURVEY_STATE.COMPANY

Slide 35

Slide 35 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Producer Code
def send_entry(entry):
    # send survey entry message
    try:
        # set up Kafka producer for survey entries
        producer = clients.producer(clients.entry_serializer())

        # prep key and value for message
        k = str(metadata.get('survey_id'))
        value = SurveyEntry.dict_to_entry(entry)

        logger.info("Publishing survey entry message for key %s", k)
        producer.produce(config['topics']['survey-entries'], key=k, value=value)
    except Exception as e:
        logger.error("Got exception %s", e)
        raise e
    finally:
        producer.poll()
        producer.flush()
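The clients helper module isn't shown in the deck; a plausible sketch of what producer() and entry_serializer() might look like, using confluent-kafka's Schema Registry integration. The SurveyEntry.entry_to_dict helper, the survey_entry.avsc path, and the endpoints are assumptions:

from confluent_kafka import SerializingProducer
from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka.schema_registry.avro import AvroSerializer
from confluent_kafka.serialization import StringSerializer
from survey_entry import SurveyEntry   # hypothetical project module

def entry_serializer():
    # Build an Avro serializer for surveyEntry values from Schema Registry.
    sr_client = SchemaRegistryClient({"url": "<SCHEMA_REGISTRY_URL>"})
    with open("survey_entry.avsc") as f:   # hypothetical schema file
        # entry_to_dict is a hypothetical helper converting a SurveyEntry to a dict
        return AvroSerializer(sr_client, f.read(), SurveyEntry.entry_to_dict)

def producer(value_serializer):
    # String keys, Avro values; connection details are placeholders.
    return SerializingProducer({
        "bootstrap.servers": "<BOOTSTRAP_ENDPOINT>",
        "key.serializer": StringSerializer("utf_8"),
        "value.serializer": value_serializer,
    })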

Slide 36

Slide 36 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Taking a Thanksgiving Survey

Slide 37

Slide 37 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/

Slide 38

Slide 38 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ ! Recipe ?

Slide 39

Slide 39 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ ksqlDB Processing

Slide 40

Slide 40 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Load Stream of survey-entries
CREATE STREAM survey_entries WITH (
    kafka_topic = 'survey-entries',
    value_format = 'AVRO'
);

Slide 41

Slide 41 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Extracting Respondents
CREATE STREAM survey_respondents WITH (
    kafka_topic = 'survey-respondents',
    value_format = 'AVRO'
) AS
    SELECT user_id, name, company, location
    FROM survey_entries
EMIT CHANGES;

Slide 42

Slide 42 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Extracting Respondents
CREATE STREAM survey_respondents WITH (
    kafka_topic = 'survey-respondents',
    value_format = 'AVRO'
) AS
    SELECT user_id, name, company, location
    FROM survey_entries
EMIT CHANGES;

Slide 43

Slide 43 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Extracting Respondents
CREATE STREAM survey_respondents WITH (
    kafka_topic = 'survey-respondents',
    value_format = 'AVRO'
) AS
    SELECT user_id, name, company, location
    FROM survey_entries
EMIT CHANGES;

Slide 44

Slide 44 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Extracting Respondents
CREATE STREAM survey_respondents WITH (
    kafka_topic = 'survey-respondents',
    value_format = 'AVRO'
) AS
    SELECT user_id, name, company, location
    FROM survey_entries
EMIT CHANGES;

Slide 45

Slide 45 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Extracting Responses
CREATE STREAM survey_responses WITH (
    kafka_topic = 'survey-responses',
    value_format = 'AVRO'
) AS
    SELECT user_id, survey_id, response
    FROM survey_entries
EMIT CHANGES;

Slide 46

Slide 46 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Questions Table
CREATE TABLE survey_questions (
    id STRING PRIMARY KEY
) WITH (
    kafka_topic = 'survey-questions',
    value_format = 'AVRO'
);

Slide 47

Slide 47 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Analysis
CREATE TABLE survey_results_live WITH (
    kafka_topic = 'survey-results-live',
    value_format = 'AVRO'
) AS
    SELECT question AS question,
           HISTOGRAM(response) AS results
    FROM survey_entries
    GROUP BY question
EMIT CHANGES;

Slide 48

Slide 48 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Results
What is your Favorite Thanksgiving Pie?
{ Pumpkin Pie=1, Pecan Pie=1, ... Apple Pie=3 }
{ Pumpkin Pie=1, Pecan Pie=1, ... Apple Pie=4 }
{ Pumpkin Pie=2, Pecan Pie=1, ... Apple Pie=4 }
{ Pumpkin Pie=2, Pecan Pie=2, ... Apple Pie=4 }
...
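In the talk these results are pushed out via Kafka Connect (see the later slides); for a hand-rolled dashboard-style view, a consumer of survey-results-live could look roughly like this (endpoints, group id, and the example output are placeholder assumptions):

from confluent_kafka import DeserializingConsumer
from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka.schema_registry.avro import AvroDeserializer

# Illustrative consumer that prints each live histogram update as it arrives.
sr_client = SchemaRegistryClient({"url": "<SCHEMA_REGISTRY_URL>"})
consumer = DeserializingConsumer({
    "bootstrap.servers": "<BOOTSTRAP_ENDPOINT>",
    "group.id": "survey-dashboard",
    "auto.offset.reset": "earliest",
    "value.deserializer": AvroDeserializer(sr_client),
})
consumer.subscribe(["survey-results-live"])

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    print(msg.value())   # e.g. {'RESULTS': {'Pumpkin Pie': 2, 'Apple Pie': 4, ...}}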

Slide 49

Slide 49 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Analysis
CREATE TABLE survey_results_live WITH (
    kafka_topic = 'survey-results-live',
    value_format = 'AVRO'
) AS
    SELECT question AS question,
           HISTOGRAM(response) AS results
    FROM survey_entries
    GROUP BY question
EMIT CHANGES;

Slide 50

Slide 50 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Analysis
CREATE TABLE survey_results_final WITH (
    kafka_topic = 'survey-results-final',
    value_format = 'AVRO'
) AS
    SELECT question AS question,
           HISTOGRAM(response) AS results
    FROM survey_entries
    WINDOW TUMBLING (SIZE 48 HOURS, GRACE PERIOD 10 MINUTE)
    GROUP BY question
EMIT FINAL;

Slide 51

Slide 51 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Analysis
CREATE TABLE survey_results_final WITH (
    kafka_topic = 'survey-results-final',
    value_format = 'AVRO'
) AS
    SELECT question AS question,
           HISTOGRAM(response) AS results
    FROM survey_entries
    WINDOW TUMBLING (SIZE 48 HOURS, GRACE PERIOD 10 MINUTE)
    GROUP BY question
EMIT FINAL;

Slide 52

Slide 52 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Alerting with Telegram
● Kafka Connect
  ○ Connects Kafka and external systems
  ○ Independent framework
  ○ Configuration-based
● Kafka Connect HTTP Sink Connector
  ○ Fully-managed in Confluent Cloud
  ○ Migrates the events from the survey_results_final topic as they happen

Slide 53

Slide 53 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Alerting with Telegram
● Kafka Connect
  ○ Connects Kafka and external systems
  ○ Independent framework
  ○ Configuration-based
● Kafka Connect HTTP Sink Connector
  ○ Fully-managed in Confluent Cloud
  ○ Migrates the events from the survey_results_final topic as they happen
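The connector itself only needs configuration; as a mental model, here is roughly what it automates, sketched by hand (the token, chat id, endpoints, and the assumption of a plain-text value are illustrative, not the connector's actual behavior):

import requests
from confluent_kafka import Consumer

# Hand-rolled equivalent of the HTTP sink: read final survey results and
# POST them to the Telegram Bot API's sendMessage endpoint.
BOT_TOKEN = "<TELEGRAM_BOT_TOKEN>"
CHAT_ID = "<CHAT_ID>"

consumer = Consumer({
    "bootstrap.servers": "<BOOTSTRAP_ENDPOINT>",
    "group.id": "survey-alerter",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["survey-results-final"])

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    requests.post(
        f"https://api.telegram.org/bot{BOT_TOKEN}/sendMessage",
        json={"chat_id": CHAT_ID,
              "text": f"Final results: {msg.value().decode('utf-8')}"},
    )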

Slide 54

Slide 54 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Recipe ? !

Slide 55

Slide 55 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Recipe++ ? !

Slide 56

Slide 56 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Stream Processing Use Case Recipes

Slide 57

Slide 57 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Looking to get started? LinkTree Resources

Slide 58

Slide 58 text

dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/ Questions?