Scaling Apps with Kafka

Pooja Mistry
September 15, 2021

Transcript

  1. Or how I learned to build a complete streaming app with four simple SQL statements in ksqlDB. Data In Motion
  2. Copyright 2021, Confluent, Inc. All rights reserved. This document may not be reproduced in any manner without the express written permission of Confluent, Inc.

    “You could not step twice into the same river; for other waters are ever flowing on to you. Unless that water is data and the river is Kafka, then, sure.” Heraclitus - Probably
  3. The Rise of Data in Motion

    Data as a continuous stream of events. 80% of Fortune 100 companies use Apache Kafka.
  4. Transforming our customers’ apps and data architecture

    Each row reads: without event streaming → with event streaming.

    • Auto / Transport: Batch-driven scheduling → Real-time ETA
    • Banking: Nightly credit-card fraud checks → Real-time credit card fraud prevention
    • Retail: Batch inventory updates → Real-time inventory management
    • Healthcare: Batch claims processing → Real-time claims processing
    • Media: Batch data pipelines - production supply chain → Real-time data pipeline
    • Manufacturing: Scheduled equipment maintenance → Automated, predictive maintenance
    • Defense (U.S. Defense Agencies): Reactive cyber-security forensics → Automated SIEM and Anomaly Detection
  5. Confluent Transforms Data Usage Throughout Enterprises

    Retail: Drive consumer analytics & streamline operations
    • Inventory Management
    • Personalized Promotions
    • Product Development & Introduction
    • Sentiment Analysis
    • Streaming Enterprise Messaging
    • Systems of Scale for High Traffic Periods

    Healthcare: Provide patients better choices & doctors better insight
    • Connected Health Records
    • Data Confidentiality & Accessibility
    • Dynamic Staff Allocation Optimization
    • Integrated Treatment
    • Proactive Patient Care
    • Real-Time Monitoring

    Capital Markets: Combat fraud & remain competitive
    • Capital Management
    • Early-On Fraud Detection
    • Market Risk Recognition & Investigation
    • Preventive Regulatory Scanning
    • Real-Time What-If Analysis
    • Trade Flow Monitoring

    Automotive: Amplify vehicle intelligence & safety
    • Advanced Navigation
    • Environmental Factor Processing
    • Fleet Management
    • Predictive Maintenance
    • Threat Detection & Real-Time Response
    • Traffic Distribution Optimization

    Common In All Industries: Infrastructure Use Cases
    • Data Pipelines
    • Messaging
    • Microservice / Event Sourcing
    • Stream Processing
    • Data Integration
    • Streaming ETL
  6. Confluent Customers by Industry

    FINANCIAL SERVICES, INSURANCE, TECH, HEALTHCARE, COMMUNICATIONS & MEDIA, AUTOMOTIVE/TRANSPORTATION, CONSUMER/RETAIL, TRAVEL
  7. Kafka is powerful … but hard

    Install, Configure, Make secure, Build apps, Debug, Find data, Get data in/out, Monitor pipelines, Upgrade, Monitor apps, Alert errors
  8. Enterprise Data Architecture is a Giant Mess

    Diagram labels: LINE OF BUSINESS 01, LINE OF BUSINESS 02, PUBLIC CLOUD
  9. Service Oriented Architecture
  10. Service Oriented Architecture?
  11. Most stream processing architectures are complex

    Diagram labels: DB, CONNECTOR, APP, STREAM PROCESSING (three DBs, three CONNECTORs, three APPs)
  12. Manage: Make changes to Kafka objects and services and see real-time statuses.

    • Create/edit topics
    • Change cluster settings
    • Manage connectors
    • Manage ksqlDB

    Monitor: See metrics data for Kafka and connected services over a period of time.

    • Broker throughput
    • Topic throughput
    • Under Replicated Partitions
    • Disk usage over time

    Deploy: Manage Kafka and connected services at scale.

    • Upgrade a cluster
    • Restart a cluster
    • Add a new broker
  13. Confluent Products

    Unrestricted Developer Productivity (DEVELOPER)
    • Event Streaming Database: ksqlDB
    • Rich Pre-built Ecosystem: Connectors | Hub | Schema Registry
    • Multi-language Development: Non-Java Clients | REST Proxy

    Efficient Operations at Scale (OPERATOR)
    • Performance & Elasticity: Auto Data Balancer | Tiered Storage
    • Flexible DevOps Automation: Operator | Ansible
    • GUI-driven Mgmt & Monitoring: Control Center

    Production-stage Prerequisites (ARCHITECT)
    • Global Resilience: Multi-region Clusters | Replicator
    • Data Compatibility: Schema Registry | Schema Validation
    • Enterprise-grade Security: RBAC | Secrets | Audit Logs

    Freedom of Choice: Fully Managed Cloud Service | Self-managed Software

    Committer-driven Expertise: Training | Partners | Enterprise Support | Professional Services

    Apache Kafka

    Legend: Open Source | Community licensed
  14. Complete Technology Ecosystem

    Diagram label: Data Diode
  15. Confluent Delivers A Complete Event Streaming Platform

    Inputs: Database Changes, Log Events, IoT Data, Web Events, Other Events

    Platform layers:
    • Apache Kafka®: Core | Connect API | Streams API
    • Performance & Scalability: Tiered Storage | Self-Balancing Clusters | k8s Operator
    • Security & Resiliency: RBAC | Audit Logs | Schema Validation | Multi-Region Clusters | Replicator | Cluster Linking
    • Development & Connectivity: Connectors | Non-Java Clients | REST Proxy | Schema Registry | ksqlDB
    • Management & Monitoring: Control Center | Proactive Support

    DATA INTEGRATION: Hadoop, Database, Data Warehouse, CRM, Other

    REAL-TIME APPLICATIONS: Customer 360, Fraud Detection, Inventory Management, Analytics & ML, Other

    Deployment: Datacenter or Public Cloud; Confluent Cloud (Confluent fully-managed) or Confluent Platform (Customer self-managed)

    Legend: COMMUNITY FEATURES | COMMERCIAL FEATURES | OPEN SOURCE FEATURES
  16. Most stream processing architectures are complex

    Diagram labels: DB, CONNECTOR, APP, STREAM PROCESSING (three DBs, three CONNECTORs, three APPs)
  17. Most stream processing architectures are complex

    Diagram labels: DB, CONNECTOR, APP, STREAM PROCESSING (three DBs, three CONNECTORs, three APPs), with callouts 1, 2, 3, 4
  18. Our unfair advantage

    Diagram labels: Confluent, Processing, Runtime, Schema, Kafka Streams, Confluent Schema Registry, Query, Event Capture, Replication, Event Storage, Kafka Core, Cluster Linking, Kafka Connect, State Stores
  19. Data in motion with Confluent

    Kafka producer/consumer, Kafka Streams, ksqlDB
  20. Stream processing approach comparison

    Kafka producer/consumer:

        // Fragment: assumes an existing consumer plus imports for Duration, HashMap, Map,
        // ConsumerRecord, and ConsumerRecords. stateStore, StateStoreException, RetryUtils,
        // and MAX_RETRIES are placeholders from the slide, not real Kafka APIs.
        ConsumerRecords<String, Integer> records = consumer.poll(Duration.ofMillis(100));
        Map<String, Integer> counts = new HashMap<>();
        for (ConsumerRecord<String, Integer> record : records) {
            // Sum the values seen for each key in this batch
            counts.merge(record.key(), record.value(), Integer::sum);
        }
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            int attempts = 0;
            while (attempts++ < MAX_RETRIES) {
                try {
                    // Add the batch total to the count already held in the state store
                    int stateCount = stateStore.getValue(entry.getKey());
                    stateStore.setValue(entry.getKey(), entry.getValue() + stateCount);
                    break;
                } catch (StateStoreException e) {
                    RetryUtils.backoff(attempts);
                }
            }
        }

    Kafka Streams:

        builder
            .stream("input-stream", Consumed.with(Serdes.String(), Serdes.String()))
            .groupBy((key, value) -> value)
            .count()
            .toStream()
            .to("counts", Produced.with(Serdes.String(), Serdes.Long()));

    ksqlDB:

        SELECT x, count(*) FROM stream GROUP BY x EMIT CHANGES;
  21. Stream processing technology organization

    Layers, from bottom to top: Kafka producer/consumer, Kafka Streams, ksqlDB. Each layer encapsulates and uses the layer beneath it.
  22. An architecture with fewer moving parts

    Diagram labels: DB, APP, PULL, PUSH, CONNECTORS, STREAM PROCESSING, STATE STORES, ksqlDB
  23. An architecture with fewer moving parts

    Diagram labels: DB, APP, PULL, PUSH, CONNECTORS, STREAM PROCESSING, STATE STORES, ksqlDB, with callouts 1, 2
  24. Build a complete streaming app with 4 SQL statements

    Capture data:

        CREATE SOURCE CONNECTOR jdbcConnector WITH (
          'connector.class' = '...JdbcSourceConnector',
          'connection.url' = '...',
          …);

    Perform continuous transformations:

        CREATE STREAM purchases AS
          SELECT viewtime, userid, pageid,
                 TIMESTAMPTOSTRING(viewtime, 'yyyy-MM-dd HH:mm:ss.SSS')
          FROM pageviews;

    Create materialized views:

        CREATE TABLE orders_by_country AS
          SELECT country, COUNT(*) AS order_count, SUM(order_total) AS order_total
          FROM purchases
          LEFT JOIN user_profiles ON purchases.customer_id = user_profiles.customer_id
          WINDOW TUMBLING (SIZE 5 MINUTES)
          GROUP BY country
          EMIT CHANGES;

    Serve lookups against materialized views:

        SELECT * FROM orders_by_country WHERE country='usa';
  25. DEMO TIME
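
For readers without a Kafka cluster at hand, the grouped count on slide 20 and the five-minute tumbling window on slide 24 can be approximated in plain Java. This is only a minimal sketch with no Kafka or ksqlDB dependency, not how ksqlDB is implemented. The names `TumblingCountSketch`, `Purchase`, `windowStart`, and `countByWindow`, the `country@windowStart` key format, and the sample events are all hypothetical. It assumes events arrive in memory as a list, and it prints every updated count as a loose analogue of `EMIT CHANGES`.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TumblingCountSketch {

    // Hypothetical event: a purchase tagged with a country and an event time in milliseconds.
    public static final class Purchase {
        final String country;
        final long timestampMs;

        public Purchase(String country, long timestampMs) {
            this.country = country;
            this.timestampMs = timestampMs;
        }
    }

    // Start of the fixed-size, non-overlapping (tumbling) window that contains the timestamp.
    public static long windowStart(long timestampMs, long windowSizeMs) {
        return timestampMs - Math.floorMod(timestampMs, windowSizeMs);
    }

    // Counts purchases per (country, window). Each update is printed as it happens,
    // and the returned map plays the role of a materialized view.
    public static Map<String, Long> countByWindow(List<Purchase> purchases, long windowSizeMs) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (Purchase p : purchases) {
            String key = p.country + "@" + windowStart(p.timestampMs, windowSizeMs);
            long updated = counts.merge(key, 1L, Long::sum);
            System.out.println(key + " -> " + updated);
        }
        return counts;
    }

    public static void main(String[] args) {
        long fiveMinutesMs = 5 * 60 * 1000L;
        List<Purchase> purchases = Arrays.asList(
            new Purchase("usa", 10_000L),
            new Purchase("usa", 20_000L),
            new Purchase("canada", 30_000L),
            new Purchase("usa", 310_000L));

        Map<String, Long> view = countByWindow(purchases, fiveMinutesMs);

        // Point lookup against the finished map, loosely analogous to the pull query on slide 24.
        System.out.println("lookup usa@0 = " + view.get("usa@0"));
    }
}
```

The sketch keeps all state in one in-process map. Per slides 20 to 22, the state stores and stream processing layers bundled into ksqlDB take over that bookkeeping, along with the retry handling that the hand-written consumer loop has to do itself.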