[DevNexus-2018] Apache Kafka A Streaming Data Platform
Viktor Gamov
February 22, 2018
Technology
Transcript
Apache Kafka: A Streaming Data Platform
@gamussa @confluentinc
Who am I? Solutions Architect, Developer Advocate, @gamussa in internetz. Hey you, yes, you, go follow me on Twitter.
A company is built on DATA FLOWS, but all we have is DATA STORES
Streaming Platform
1. Pub/Sub
2. Store
3. Process
Core abstraction
DB - table
Hadoop - file
Messaging - ?
LOGS
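The log abstraction named on this slide can be illustrated with a few lines of Python. This is only a minimal in-memory sketch, not Kafka's storage format: the class name `Log` and the methods `append` and `read` are hypothetical names chosen for this illustration.

```python
class Log:
    """Append-only sequence of records; each record gets the next offset."""

    def __init__(self):
        self._records = []

    def append(self, record):
        """Add a record at the end of the log and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def read(self, offset, max_records=10):
        """Return up to max_records records starting at the given offset."""
        return self._records[offset:offset + max_records]


log = Log()
first = log.append("order created")   # offset 0
second = log.append("order paid")     # offset 1
print(log.read(first))                # every record from offset 0 onward
```

Records are never modified in place, so a reader only has to remember the offset it reached in order to resume later.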
Producing to Kafka [diagram over a time axis]
Producing to Kafka - With Key [diagram: keys A, B, C, D over a time axis]
hash(key) % numPartitions = N
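The formula on this slide can be tried out in a small sketch. The function name `partition_for` is hypothetical, and `zlib.crc32` is used here only as a deterministic stand-in hash; it is an assumption of this example and not the hash function the Kafka client uses.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition number: hash(key) % numPartitions."""
    # crc32 is a stand-in hash chosen for this sketch; it gives the same
    # result on every run, unlike Python's built-in hash() for strings.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


for key in ["A", "B", "C", "D", "A"]:
    print(key, "->", partition_for(key, 4))
```

Because the result depends only on the key and the partition count, every message with the same key lands in the same partition, which is what preserves per-key ordering.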
Producing to Kafka - No Key [diagram over a time axis]
Messages will be produced in a round-robin fashion
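Round-robin placement of keyless messages, as described on this slide, might look like the sketch below. The helper name `round_robin_partitions` is hypothetical, and the code models only the selection order, not a real producer.

```python
from itertools import cycle


def round_robin_partitions(num_partitions):
    """Yield partition numbers 0, 1, ..., n-1 and then start over."""
    return cycle(range(num_partitions))


chooser = round_robin_partitions(3)
# Seven keyless messages are spread evenly over three partitions.
assigned = [next(chooser) for _ in range(7)]
print(assigned)  # [0, 1, 2, 0, 1, 2, 0]
```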
Consuming From Kafka - Single Consumer [diagram]
Consuming From Kafka - Grouped Consumers [diagram sequence: consumer groups C1 and C2; then group members labelled with partitions 0, 1, 2, 3; final build labelled 0,3 / 1 / 2 / 3]
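One possible reading of the final build of this slide is that a group member has left and its partition has moved to a remaining member. The sketch below models that with a simple modulo assignment. The function `assign` and the consumer names `c0` to `c3` are hypothetical, and the strategy is an assumption of this example rather than a description of Kafka's own assignors.

```python
def assign(partitions, consumers):
    """Spread partitions over consumers; each partition gets exactly one owner."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(sorted(partitions)):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# Four partitions, four consumers in the group: one partition each.
before = assign([0, 1, 2, 3], ["c0", "c1", "c2", "c3"])
# Consumer c3 leaves the group: its partition is handed to a remaining member.
after = assign([0, 1, 2, 3], ["c0", "c1", "c2"])
print(before)  # {'c0': [0], 'c1': [1], 'c2': [2], 'c3': [3]}
print(after)   # {'c0': [0, 3], 'c1': [1], 'c2': [2]}
```

Within one group a partition is read by a single member at a time, so adding members beyond the partition count would leave the extra members idle in this model.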
Producers / Consumers [diagram]
Kafka Connect does hard work so you don’t
1. Scale out
Streaming Platform
1. Pub/Sub
2. Store
3. Process
Why Store?
Scalability of a filesystem
- Throughput: 100s of MB/s
- TBs per server
- Commodity hardware
- O(1) writes
Guarantees of a database
- Persistence
- Strict ordering
Distributed by Design
- Replication
- Fault Tolerance
- Partitioning
- Scale
Partition Leadership and Replication [diagram: Brokers 1-4 hosting leader and follower replicas of Topic1 partitions 1-4, three replicas per partition]
Partition Leadership and Replication - node failure [same diagram after a broker fails]
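The effect of a node failure on partition leadership can be sketched as follows. The replica placement in `replicas` is a hypothetical layout (the exact placement in the slide's diagram is not recoverable from the transcript), and `elect_leaders` is an illustrative rule that picks the first surviving replica; it is an assumption of this example, not Kafka's actual election procedure.

```python
def elect_leaders(replicas, live_brokers):
    """For each partition, pick the first replica hosted on a live broker."""
    leaders = {}
    for partition, brokers in replicas.items():
        # None means every replica of the partition is on a failed broker.
        leaders[partition] = next((b for b in brokers if b in live_brokers), None)
    return leaders


# Hypothetical placement: 4 partitions, 3 replicas each, spread over brokers 1-4.
replicas = {
    "partition1": [1, 2, 3],
    "partition2": [2, 3, 4],
    "partition3": [3, 4, 1],
    "partition4": [4, 1, 2],
}

healthy = elect_leaders(replicas, {1, 2, 3, 4})
broker4_down = elect_leaders(replicas, {1, 2, 3})
print(healthy)       # one leader per broker
print(broker4_down)  # partition4 is now led by broker 1
```

Only the partition whose leader lived on the failed broker changes hands; the others keep their leaders, and all four partitions stay available because each still has a live replica.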
Streaming Platform
1. Pub/Sub
2. Store
3. Process
What is Stream Processing?
A machine for combining streams of events
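As a rough illustration of "combining streams of events", the sketch below merges two timestamp-ordered streams and derives a running count per user from the result. The event tuples, the stream names `orders` and `payments`, and the functions `combine` and `count_by_user` are all hypothetical; this is plain Python, not the Kafka Streams or KSQL API.

```python
import heapq
from collections import Counter

# Each event is (timestamp, user, kind); both streams are ordered by timestamp.
orders = [(1, "alice", "order"), (4, "bob", "order")]
payments = [(2, "alice", "payment"), (3, "alice", "payment")]


def combine(*streams):
    """Merge timestamp-ordered streams into one timestamp-ordered stream."""
    return heapq.merge(*streams, key=lambda event: event[0])


def count_by_user(events):
    """Emit (user, running count) after every incoming event."""
    counts = Counter()
    for _timestamp, user, _kind in events:
        counts[user] += 1
        yield user, counts[user]


for update in count_by_user(combine(orders, payments)):
    print(update)
```

Both functions work lazily, one event at a time, so the same shape would apply to streams that never end.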
https://www.confluent.io/download/
We are hiring! https://www.confluent.io/careers/
One more thing…
A Major New Paradigm
Thanks! Questions? @gamussa viktor@confluent.io
We are hiring! https://www.confluent.io/careers/