Keigo Suda
November 25, 2016
Data Integration Patterns with Apache Kafka & Kafka Connect (or rather, Implementing ETL)
Transcript
Data Integration Patterns with Apache Kafka & Kafka Connect (or rather, Implementing ETL)
Future Architect, Inc. / Keigo Suda
2016/11/25, D&S Data Night vol.04
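What this talk covers
- I had the chance to use Kafka Connect as an ETL tool, so I will share what I learned along the way (I hit pretty much every pitfall):
  - Where it fits
  - (Briefly) implementation points and pitfalls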
About me
- Future Architect, Inc., Keigo Suda
- I work with large volumes of data in enterprise settings
- Lately I have been building data platforms for IoT
Before getting to the main topic
Do you know Kafka Connect?
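Kafka Connect
- A new feature introduced in Kafka ver 0.9
- A framework for integrating other systems with Kafka
- A pluggable mechanism
- Operated through configuration files
- You implement only how data is pulled out or put in
- Diagram callout: Kafka Connect is this part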
Connectors
https://www.confluent.io/product/connectors/
Let's see, searching for "Kafka Connect"...
It looks rather blatant, but let's give it a try
There is almost no information out there
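The current state of Kafka Connect (personal opinion)
- My impression is that there are few published use cases, either in Japan or abroad
- The documentation itself is lacking, so the details have to be checked in the source code
- Various plugins are being published, but their implementation styles are not consistent, so there are few good examples to follow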
Kafka Connect in ETL (personal opinion)
[Diagram; its labels are not recoverable from the transcript]
Kafka Connect basics
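Kafka Connect basics
- Diagram callout: this part is Kafka Connect
- Source: http://docs.confluent.io/2.0.0/connect/userguide.html#getting-started
Kafka Connect basics (diagram labels)
- The system the data comes from
- The data integration job
- The actual unit of data integration
- The process that actually copies the data
- Source: http://docs.confluent.io/2.0.0/connect/userguide.html#getting-started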
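Stream & Partition (RDB example)
- Source: http://www.slideshare.net/KaufmanNg/data-pipelines-with-kafka-connect
Stream & Partition (RDB example, diagram labels)
- The system the data comes from
- The actual unit of data integration
- The process that actually copies the data
- N:1
- Source: http://www.slideshare.net/KaufmanNg/data-pipelines-with-kafka-connect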
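The N:1 label suggests that several partitions (for an RDB, typically tables) end up handled by one copying process. As a rough illustration only, the sketch below splits a list of partitions into at most `max_tasks` contiguous groups. The function name `group_partitions` and the table names are hypothetical, and this is not the framework's own code.

```python
from typing import List, Sequence


def group_partitions(partitions: Sequence[str], max_tasks: int) -> List[List[str]]:
    """Split partitions into at most max_tasks contiguous groups of near-equal size."""
    if max_tasks < 1:
        raise ValueError("max_tasks must be at least 1")
    num_groups = min(max_tasks, len(partitions))
    groups: List[List[str]] = []
    if num_groups == 0:
        return groups
    per_group, leftover = divmod(len(partitions), num_groups)
    start = 0
    for i in range(num_groups):
        # The first `leftover` groups take one extra partition.
        size = per_group + (1 if i < leftover else 0)
        groups.append(list(partitions[start:start + size]))
        start += size
    return groups


# Hypothetical table names standing in for the partitions of an RDB source.
tables = ["orders", "customers", "items", "payments", "shipments"]
assignments = group_partitions(tables, max_tasks=2)
for task_id, assigned in enumerate(assignments):
    print(f"task {task_id}: {assigned}")
```

With five tables and two tasks, each task copies several tables, which is the N:1 relationship in miniature.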
Worker & Connector
- Source: http://docs.confluent.io/2.0.0/connect/userguide.html#getting-started
Aaah... please, no more terminology...
CONNECTOR / WORKER / STREAM / PARTITION / STANDALONE MODE / DISTRIBUTED MODE / TASK
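Actual usage (Standalone mode)
$ bin/connect-standalone.sh config/connect-standalone.properties connector1.properties
- Started as a daemon from the command line
- The following must be specified at startup:
  - The configuration file for the Worker, which is the process that does the running (first argument)
  - The configuration file specific to the Connector actually being used (second argument)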
Contents of the configuration files
Connector settings:
name=local-file-source
connector.class=FileStreamSource
tasks.max=1
file=test.txt
topic=connect-test
Worker settings:
bootstrap.servers=localhost:9092
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
...
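To make the split between the two files concrete, here is a minimal sketch that parses both sets of settings and checks each for a few keys. The parser handles only plain `key=value` lines, which is a simplification of the real properties format. The helper names and the lists of keys being checked are assumptions made for illustration.

```python
from typing import Dict, List


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blank lines and # or ! comments."""
    props: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key=value line; ignored in this simplified parser
        props[key.strip()] = value.strip()
    return props


def missing_keys(props: Dict[str, str], required: List[str]) -> List[str]:
    """Return the required keys that are absent from props."""
    return [key for key in required if key not in props]


# Settings copied from the slide; the worker file is cut off at "..." there.
WORKER_PROPERTIES = """\
bootstrap.servers=localhost:9092
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
"""

CONNECTOR_PROPERTIES = """\
name=local-file-source
connector.class=FileStreamSource
tasks.max=1
file=test.txt
topic=connect-test
"""

worker = parse_properties(WORKER_PROPERTIES)
connector = parse_properties(CONNECTOR_PROPERTIES)

# Illustrative checklists only, not an official list of mandatory settings.
print("worker missing:", missing_keys(worker, ["bootstrap.servers", "key.converter", "value.converter"]))
print("connector missing:", missing_keys(connector, ["name", "connector.class", "tasks.max"]))
```

Keeping the worker settings (how to reach Kafka, how to serialize) apart from the connector settings (what to copy) is what lets one worker file be reused across different connectors.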
Implementation
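What the rest of this talk covers
- A variety of plugins are now being published
- Still, it is early days, so there are many occasions where you write your own
- A brief look at the main implementation points and how to put Kafka Connect to use for ETL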
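Implementing Kafka Connect
- Kafka Connect has Source processing and Sink processing.
- The idea is the same for Sink and Source, but the Sink side is a little easier to follow in terms of implementation, so I use it as the example.
- Diagram labels: Source, Sink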
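What to implement
- At a minimum, implementing a Connector and a Task is enough
- APIs are already provided for both, so you only fill in the bodies
- Diagram callouts: the data integration job (here); the process that actually copies the data (here)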
Connector (mostly boilerplate)
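The slide's code is not in the transcript, so the following is only a sketch of why a connector tends to be boilerplate. It is a plain Python stand-in, not the real Java API, where the corresponding class is `SinkConnector` with `start`, `taskClass`, `taskConfigs` and `stop`. The class names and the settings passed in are hypothetical.

```python
from typing import Dict, List, Type


class SampleSinkTask:
    """Placeholder for the task class; the copying logic would live there."""


class SampleSinkConnector:
    """Stand-in with the general shape of a connector: mostly pass-through code."""

    def __init__(self) -> None:
        self._props: Dict[str, str] = {}

    def start(self, props: Dict[str, str]) -> None:
        # Keep the connector settings handed over at startup.
        self._props = dict(props)

    def task_class(self) -> Type[SampleSinkTask]:
        # Tell the framework which task implementation to instantiate.
        return SampleSinkTask

    def task_configs(self, max_tasks: int) -> List[Dict[str, str]]:
        # Give every task its own copy of the same settings.
        return [dict(self._props) for _ in range(max_tasks)]

    def stop(self) -> None:
        self._props = {}


connector = SampleSinkConnector()
connector.start({"name": "sample-sink", "topic": "connect-test"})
configs = connector.task_configs(2)
print(connector.task_class().__name__, configs)
```

Nothing here touches the data itself, which is why this half can usually be copied from a template.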
Task
- This is the important part!! (for a source, it is pull)
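As a rough picture of where the copying logic sits, the sketch below models a sink task that is handed batches of records. It is a Python stand-in rather than the real Java API, where a sink task receives batches through `SinkTask.put` and a source task is asked for records through `SourceTask.poll`. The `Record` type, the `prefix` setting and the driver loop are assumptions for illustration.

```python
from typing import Dict, Iterable, List, NamedTuple


class Record(NamedTuple):
    """Simplified stand-in for a record delivered to a sink task."""
    topic: str
    offset: int
    value: str


class SampleSinkTask:
    """Stand-in sink task: start once, receive batches via put, then stop."""

    def __init__(self) -> None:
        self.written: List[str] = []  # stands in for the external system
        self._prefix = ""

    def start(self, props: Dict[str, str]) -> None:
        # "prefix" is a hypothetical connector-specific setting.
        self._prefix = props.get("prefix", "")

    def put(self, records: Iterable[Record]) -> None:
        # The copying logic: write each record to the destination.
        for record in records:
            self.written.append(f"{self._prefix}{record.value}")

    def stop(self) -> None:
        # Release connections or file handles here in a real task.
        pass


# Driver loop playing the role of the framework.
task = SampleSinkTask()
task.start({"prefix": "copied:"})
batches = [
    [Record("connect-test", 0, "a"), Record("connect-test", 1, "b")],
    [Record("connect-test", 2, "c")],
]
for batch in batches:
    task.put(batch)
task.stop()
print(task.written)
```

The framework decides when batches arrive; the task only decides what to do with each one.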
Deploy
- Compile the Task and Connector you implemented
- Just place them somewhere on the classpath!
Connector settings:
name=sample-sink
connector.class=SampleSinkConnector
tasks.max=1
topic=connect-test
・・・・
Summary
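What is good about Kafka Connect
- Simple data copying and filtering can be implemented easily, even without knowing how Kafka works internally.
- It can be operated through configuration files, which keeps implementations from varying between developers.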
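What is bad about Kafka Connect
- Complex processing is not impossible, but information is scarce to begin with and there are few implementation examples, so you basically implement while reading the source (at that point you may be happier just using the API directly).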
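Incremental load example (kafka-connect-jdbc)
- It determines which incremental mode was specified
- After that, it just assembles a query
- Hm? Is it really OK to be this naive?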
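The slide's code listing is not in the transcript, so here is a simplified sketch of the "pick the mode, then assemble a query" pattern it describes. The mode names follow the values of the plugin's `mode` setting (`bulk`, `incrementing`, `timestamp`, `timestamp+incrementing`). The function name, default column names and exact SQL text are assumptions, not the plugin's actual code.

```python
def build_incremental_query(
    table: str,
    mode: str,
    incrementing_column: str = "id",        # hypothetical column name
    timestamp_column: str = "updated_at",   # hypothetical column name
) -> str:
    """Assemble a parameterized SELECT for the given incremental mode."""
    base = f"SELECT * FROM {table}"
    inc, ts = incrementing_column, timestamp_column
    if mode == "bulk":
        # No filtering: the whole table is read every time.
        return base
    if mode == "incrementing":
        # Only rows whose ID is greater than the last one seen.
        return f"{base} WHERE {inc} > ? ORDER BY {inc} ASC"
    if mode == "timestamp":
        # Only rows modified after the last timestamp seen.
        return f"{base} WHERE {ts} > ? ORDER BY {ts} ASC"
    if mode == "timestamp+incrementing":
        # Timestamp first; the ID breaks ties among rows with the same timestamp.
        return (
            f"{base} WHERE ({ts} = ? AND {inc} > ?) OR {ts} > ? "
            f"ORDER BY {ts} ASC, {inc} ASC"
        )
    raise ValueError(f"unknown mode: {mode}")


for mode in ("bulk", "incrementing", "timestamp", "timestamp+incrementing"):
    print(mode, "->", build_incremental_query("orders", mode))
```

Even in this toy version, correctness depends entirely on the chosen column being monotonic and reliably updated, which is one reading of the slide's worry about naivety.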
What is bad about Kafka Connect
- When you want to use the convenient plugins, most of them are made by Confluent, so you inevitably end up depending on the Confluent Platform feature set.
Summary
- Well suited to simple data copying via Kafka (I think)
- Confluent….