
Stream Processing with YugabyteDB CDC and Spring

Takeshi
January 31, 2023

YugabyteDB Japan Hour 2023-01-31

Transcript

  1. YugabyteDB Japan Hour | 2023-01-31
    Stream Processing with YugabyteDB CDC and Spring
    OGAWA, Takeshi (Tagbangers, Inc.)
    https://github.com/ogawa-takeshi/yb-cdc-spring
  2. ©2023 Tagbangers, Inc. Agenda
    • What is Change Data Capture (CDC)?
    • About YugabyteDB CDC
    • Handling YugabyteDB CDC streams with Spring Cloud Stream
    • Building a stream pipeline with Spring Cloud Data Flow
  3. What is Change Data Capture (CDC)?
    • A technique for tracking changes made to a database; many databases support it
    • How CDC is implemented, and which features it offers, differs from database to database
    Diagram: Database (INSERT, UPDATE, DELETE) -> Change Data Capture -> Messaging (Kafka, RabbitMQ, Amazon Kinesis, etc.) -> Actions (Full-text search, Data Warehouse, Microservices, Replication, …)
  4. The CDC setup presented in this talk
    Diagram: YugabyteDB (Database) -> CDC Stream -> Spring Boot (Change Data Capture) -> Change Event -> Kafka (Messaging)
  5. Features of YugabyteDB CDC
    • Streams are consumed through Debezium
    • Ordering is guaranteed per tablet
    • Delivery is guaranteed at-least-once
    • Stream expiration is time based or disk size based
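Because delivery is at-least-once, a consumer may see the same change event more than once and should apply events idempotently. The following is a minimal sketch of one way to do that, assuming each event carries a tablet identifier and a sequence number that increases within a tablet. Both fields and the class name `IdempotentApplier` are hypothetical and are not part of the YugabyteDB or Debezium APIs.

```java
import java.util.HashMap;
import java.util.Map;

/** Skips change events that were already applied, tracked per tablet. */
public class IdempotentApplier {
    // Highest sequence number applied so far for each tablet (hypothetical field).
    private final Map<String, Long> lastApplied = new HashMap<>();

    /** Returns true if the event is new and was applied, false if it is a redelivery. */
    public boolean apply(String tabletId, long sequence) {
        long last = lastApplied.getOrDefault(tabletId, -1L);
        if (sequence <= last) {
            return false; // duplicate or stale redelivery: ignore it
        }
        lastApplied.put(tabletId, sequence);
        return true;
    }

    public static void main(String[] args) {
        IdempotentApplier applier = new IdempotentApplier();
        System.out.println(applier.apply("tablet-1", 1)); // true
        System.out.println(applier.apply("tablet-1", 1)); // false (redelivered)
        System.out.println(applier.apply("tablet-2", 1)); // true (ordering is per tablet)
    }
}
```

Tracking state per tablet rather than globally mirrors the per-tablet ordering guarantee: sequence numbers from different tablets are never compared with each other.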
  6. Ways to handle YugabyteDB CDC change events
    All three options build on the Yugabyte Debezium Connector and emit Change Events:
    • Yugabyte CDCSDK Server: runs standalone
    • Yugabyte Kafka Connector: runs on a Kafka Connect cluster
    • Your App + Debezium Engine: embedded in the application
  7. Debezium CDC Message Example
    {
      "op": "u",
      "source": { ... },
      "ts_ms": "...",
      "before": { "field1": "oldvalue1", "field2": "oldvalue2" },
      "after": { "field1": "newvalue1", "field2": "newvalue2" }
    }
  8. The CDC setup presented in this talk - Spring Cloud Stream
    Diagram: YugabyteDB (Database) -> Spring Boot with Spring Cloud Stream (Change Data Capture) -> Kafka (Messaging)
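Inside the Spring Boot application, each change event has the Debezium shape from the message example: an `op` code plus `before` and `after` row images. As a rough illustration of what a downstream handler can do with that shape, the sketch below computes which fields changed. The `ChangeEvent` record is a simplified stand-in invented for this example (it omits `source` and `ts_ms`), not a Debezium class, and JSON parsing is left out.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class ChangeEventDiff {
    /** Simplified stand-in for a change event: only op, before and after. */
    public record ChangeEvent(String op, Map<String, String> before, Map<String, String> after) {}

    /** Returns field -> "old -> new" for every field whose value differs. */
    public static Map<String, String> changedFields(ChangeEvent event) {
        // before is absent on inserts and after is absent on deletes.
        Map<String, String> before = event.before() == null ? Map.of() : event.before();
        Map<String, String> after = event.after() == null ? Map.of() : event.after();

        Set<String> fields = new TreeSet<>(before.keySet());
        fields.addAll(after.keySet());

        Map<String, String> diff = new TreeMap<>();
        for (String field : fields) {
            String oldValue = before.get(field);
            String newValue = after.get(field);
            if (!Objects.equals(oldValue, newValue)) {
                diff.put(field, oldValue + " -> " + newValue);
            }
        }
        return diff;
    }

    public static void main(String[] args) {
        ChangeEvent update = new ChangeEvent("u",
                Map.of("field1", "oldvalue1", "field2", "oldvalue2"),
                Map.of("field1", "newvalue1", "field2", "newvalue2"));
        // prints {field1=oldvalue1 -> newvalue1, field2=oldvalue2 -> newvalue2}
        System.out.println(changedFields(update));
    }
}
```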
  9. Functional Programming Model
    @Bean
    Function<String, String> upper() {
        return String::toUpperCase;
    }

    @Bean
    Consumer<String> cons() {
        return System.out::println;
    }

    @Bean
    Supplier<Long> currentTime() {
        return System::currentTimeMillis;
    }
    Abstracts the messaging infrastructure.
  10. Spring Cloud Stream Applications
    Application Type | Programming Model
    Source           | java.util.function.Supplier
    Processor        | java.util.function.Function
    Sink             | java.util.function.Consumer
    Each application type is a Spring Boot application.
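The mapping above can be illustrated without any messaging system: a source, a processor and a sink are just a `Supplier`, a `Function` and a `Consumer` chained together. The sketch below wires them directly in one JVM. In Spring Cloud Stream the hand-off between them would go through a binder such as Kafka instead; `PipelineSketch` and `runOnce` are names made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

public class PipelineSketch {
    /** Pulls one item from the source, transforms it, and hands it to the sink. */
    public static <A, B> void runOnce(Supplier<A> source, Function<A, B> processor, Consumer<B> sink) {
        sink.accept(processor.apply(source.get()));
    }

    public static void main(String[] args) {
        List<String> received = new ArrayList<>();

        Supplier<String> source = () -> "change event";      // Source
        Function<String, String> processor = String::toUpperCase; // Processor
        Consumer<String> sink = received::add;                // Sink

        runOnce(source, processor, sink);
        System.out.println(received); // [CHANGE EVENT]
    }
}
```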
  11. The CDC setup presented in this talk - Spring Cloud Data Flow
    Diagram: a pipeline managed by Spring Cloud Data Flow: YugabyteDB -> Source -> Kafka -> Processor -> Kafka -> Sink, Sink (the Source, Processor and Sinks are each a Spring Boot application)
  12. Spring Cloud Data Flow
    • Runs on Local, Cloud Foundry, and Kubernetes
    • Supports both Stream Processing and Batch Processing
    • API / CLI
    • Dashboard (Web UI)
  13. Steps
    • Create a YugabyteDB CDC Stream
    • Build a YugabyteDB CDC Source Application with Spring Cloud Stream
    • Deploy to Spring Cloud Data Flow
  14. Steps
    • Create a YugabyteDB CDC Stream
    • Build a YugabyteDB CDC Source Application with Spring Cloud Stream
    • Deploy to Spring Cloud Data Flow
  15. Creating a Yugabyte CDC Stream
    > yb-admin \
        --master_addresses <master-addresses> \
        create_change_data_stream ysql.<namespace_name>
    CDC Stream ID: d540f5e4890c4d3b812933cbfd703ed3
    The printed ID identifies the CDC Stream.
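The stream ID printed by `yb-admin` is needed again later as the connector's `database.streamid` setting, so a script may want to capture it. The sketch below extracts it from the command output with a regular expression. It assumes the output contains a line of the form `CDC Stream ID: <32 lowercase hex characters>`, which is inferred only from the IDs shown in these slides; `StreamIdParser` is a hypothetical helper, not part of any YugabyteDB tooling.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StreamIdParser {
    // Assumed format: "CDC Stream ID: " followed by 32 lowercase hex characters.
    private static final Pattern STREAM_ID =
            Pattern.compile("CDC Stream ID:\\s*([0-9a-f]{32})");

    /** Returns the stream ID found in the yb-admin output, if any. */
    public static Optional<String> parse(String output) {
        Matcher matcher = STREAM_ID.matcher(output);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String output = "CDC Stream ID: d540f5e4890c4d3b812933cbfd703ed3";
        System.out.println(parse(output).orElse("not found"));
        // prints d540f5e4890c4d3b812933cbfd703ed3
    }
}
```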
  16. Steps
    • Create a YugabyteDB CDC Stream
    • Build a YugabyteDB CDC Source Application with Spring Cloud Stream
    • Deploy to Spring Cloud Data Flow
  17. Spring Initializr
    > curl -G https://start.spring.io/starter.tgz \
        -d type=maven-project \
        -d bootVersion=2.7.8 \
        -d jvmVersion=17 \
        -d groupId=playground \
        -d artifactId=yb-cdc-source \
        -d name=yb-cdc-source \
        -d packageName=playground \
        -d dependencies=cloud-stream,kafka-streams,… \
        -o yb-cdc-source.zip
  18. Maven Dependencies
    <dependencies>
      <dependency>
        <groupId>org.springframework.cloud.fn</groupId>
        <artifactId>cdc-debezium-supplier</artifactId>
        <version>1.2.1</version>
      </dependency>
      <dependency>
        <groupId>io.debezium</groupId>
        <artifactId>debezium-connector-yugabytedb</artifactId>
        <version>1.9.5.y.15</version>
      </dependency>
      ...
    </dependencies>
    <build>
      <extensions>
        <extension>
          <groupId>com.yugabyte</groupId>
          <artifactId>maven-s3-wagon</artifactId>
          <version>0.1.3</version>
        </extension>
      </extensions>
      ...
    </build>
    <repositories>
      <repository>
        <id>maven.release.yugabyte.repo</id>
        <url>s3://repository.yugabyte.com/maven/release</url>
      </repository>
      <repository>
        <id>maven.yugabyte.repo</id>
        <url>s3://repository.yugabyte.com/maven/</url>
        <releases>
          <enabled>true</enabled>
          <updatePolicy>never</updatePolicy>
        </releases>
      </repository>
    </repositories>
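  19. Reusing CDC Debezium Supplier
    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;
    import org.springframework.cloud.fn.supplier.cdc.CdcSupplierConfiguration;
    import org.springframework.context.annotation.Import;

    @SpringBootApplication
    @Import(CdcSupplierConfiguration.class) // 👈 add this line
    public class YbCdcSource {
        public static void main(String[] args) {
            SpringApplication.run(YbCdcSource.class, args);
        }
    }
    👍 All it takes is embedding the CDC Debezium (Supplier) from Spring Cloud Stream Applications.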
  20. Yugabyte CDC Source Application
    Diagram: YugabyteDB -> CDC Stream -> Spring Boot application -> Kafka
    Inside the Spring Boot application (Spring Cloud Stream):
    • CDC Debezium Supplier, using the Yugabyte Debezium Connector
    • Spring Cloud Stream Kafka Binder
  21. Steps
    • Create a YugabyteDB CDC Stream
    • Build a YugabyteDB CDC Source Application with Spring Cloud Stream
    • Deploy to Spring Cloud Data Flow
      • Register the application
      • Create the stream
      • Deploy the stream
  22. Registering the application
    dataflow:> app register
        --type source
        --name yb-cdc-source
        --uri maven://playground:yb-cdc-source:jar:0.0.1-SNAPSHOT
        --metadata-uri maven://playground:yb-cdc-source:jar:metadata:0.0.1-SNAPSHOT
  23. spring-cloud-dataflow-apps-metadata-plugin
    <build>
      <plugins>
        ...
        <plugin>
          <groupId>org.springframework.cloud</groupId>
          <artifactId>spring-cloud-dataflow-apps-metadata-plugin</artifactId>
          <version>${spring-cloud-dataflow-apps-metadata-plugin.version}</version>
          <executions>
            <execution>
              <id>aggregate-metadata</id>
              <phase>compile</phase>
              <goals>
                <goal>aggregate-metadata</goal>
              </goals>
            </execution>
          </executions>
        </plugin>
      </plugins>
    </build>
  24. Creating the stream
    dataflow:> stream create --name yb-cdc --definition "yb-cdc-source
        --spring.cloud.function.definition=cdcSupplier
        --spring.cloud.stream.kafka.default.producer.messageKeyExpression=headers['cdc_key']
        --cdc.name=yugabyte
        --cdc.offset.storage=kafka
        --cdc.config.connector.class=io.debezium.connector.yugabytedb.YugabyteDBConnector
        --cdc.config.bootstrap.servers=kafka:9092
        --cdc.config.snapshot.mode=never
        --cdc.config.offset.storage.partitions=1
        --cdc.config.offset.storage.replication.factor=1
        --cdc.config.offset.storage.topic=yb-cdc-source-offsets
        --cdc.config.table.include.list=public.demo
        --cdc.config.database.hostname=yb-tserver
        --cdc.config.database.port=5433
        --cdc.config.database.user=yugabyte
        --cdc.config.database.password=yugabyte
        --cdc.config.database.dbname=yugabyte
        --cdc.config.database.streamid=74dba4a58d4c4cc0b0a3b5914c47ec30
        --cdc.config.database.master.addresses=yb-master:7100
        --cdc.config.database.server.id=1
        --cdc.config.database.server.name=yugabyte
        --cdc.schema=false
        --cdc.connector=postgres
        | log"
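A definition with this many options is easy to mistype by hand. As a hedged sketch, the helper below assembles a `source --key=value ... | sink` definition string from a map of properties. `StreamDefinitionBuilder` is a hypothetical utility written for this example and not part of Spring Cloud Data Flow; it performs no quoting or escaping, so it only suits values without spaces or quotes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class StreamDefinitionBuilder {
    /** Builds "source --k1=v1 --k2=v2 | sink"; values are inserted as-is, without escaping. */
    public static String definition(String source, Map<String, String> properties, String sink) {
        String options = properties.entrySet().stream()
                .map(entry -> "--" + entry.getKey() + "=" + entry.getValue())
                .collect(Collectors.joining(" "));
        String left = options.isEmpty() ? source : source + " " + options;
        return left + " | " + sink;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the options in insertion order.
        Map<String, String> properties = new LinkedHashMap<>();
        properties.put("cdc.name", "yugabyte");
        properties.put("cdc.config.database.port", "5433");

        System.out.println(definition("yb-cdc-source", properties, "log"));
        // prints yb-cdc-source --cdc.name=yugabyte --cdc.config.database.port=5433 | log
    }
}
```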
  25. Summary
    • With YugabyteDB, CDC is also scalable and highly available
    • With Spring Cloud Stream, you can draw on Spring's rich ecosystem to build CDC-based stream pipelines
    https://github.com/ogawa-takeshi/yb-cdc-spring