Apache Kafka is one of the most useful open-source platforms for the message hub in microservices, and it connects easily to a wide range of data sources.
This connectivity can be used to implement the Transactional Outbox and Strangler Fig patterns.
Kafka Connectivity
➢ Apache Kafka can easily connect to various data sources.
➢ Consistency between Kafka events and DB events is useful for microservices.

Kafka Connector: Kafka can connect various data sources.
• Low code: a Kafka connector can be built from a provided image and config files.
• Source & Sink Connector: connectors move data into Kafka (Source) and out of Kafka (Sink) for data integration.
• Various data sources: self-developed services, managed services, DBs and so on.

Kafka and Debezium: Change Data Capture (CDC)

Use in microservices
• Development: Transactional Outbox pattern, one of the methods for achieving a distributed transaction.
• Migration: Strangler Fig pattern, synchronizing the DB for update and the DB for query (CQRS pattern).
Why can Apache Kafka be adopted as the hub of a microservices architecture?
https://www.confluent.jp/what-is-apache-kafka/

Microservices Architecture
• Overview: divide the system into smaller units called services and couple them loosely.
• Merit: improve business agility by allowing systems to change flexibly.
• Hub: there are many examples of microservices using Apache Kafka as a hub. Why?
Overview
➢ Apache Kafka is one of the best-known open-source distributed event streaming platforms.

Used by thousands of companies for
• high-performance data pipelines
• streaming analytics
• data integration
• mission-critical applications

1. High scalability
• Kafka can easily handle dozens of nodes and distributes processing by "partitions".
2. Fault tolerance
• Each partition, and so each message in it, is replicated to several brokers.
• Messages can be delivered under several delivery semantics.
3. Loosely coupled
• Services are loosely coupled by using Kafka as the message hub.
• Kafka has a rebalancing function for service configuration changes.
4. Connectivity
• The Kafka ecosystem provides low-code integration functions to connect to external components.
Kafka messaging
➢ Kafka distributes partitions across multiple brokers to hold messages for high scalability.

Producers produce messages to a Topic, and consumers subscribe to messages from the Topic.

1. Kafka has "Topics", each consisting of several partitions.
2. The partitions of one topic are distributed across several broker nodes.
3. Messages are processed on a per-partition basis, so performance scales with the number of partitions.
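The per-partition processing above can be sketched as follows. This is a minimal illustration, not Kafka's actual partitioner: the function names are hypothetical, and CRC32 stands in for the hash that Kafka clients really use. The point it shows is that messages with the same key always land in the same partition, so their order is preserved there while different partitions can be processed in parallel.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a message key to a partition index.

    Assumption: CRC32 is only a stand-in hash for illustration.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(messages, num_partitions):
    """Group (key, value) messages by partition, keeping arrival order per partition."""
    partitions = {p: [] for p in range(num_partitions)}
    for key, value in messages:
        partitions[partition_for(key, num_partitions)].append((key, value))
    return partitions


if __name__ == "__main__":
    msgs = [("order-1", "created"), ("order-2", "created"), ("order-1", "paid")]
    for index, items in route(msgs, 3).items():
        print(index, items)
```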
Message store
➢ Each message is replicated to several brokers (as replicas of its partition) to guard against message loss.
➢ Messages can be delivered under several delivery semantics.

Types of message semantics
• At most once: a message is delivered to consumers at most once, so the message can be lost.
• At least once: a message is delivered to consumers at least once, so the message can be delivered many times.
• Exactly once: a message is delivered to consumers exactly once.

Attention: exactly-once delivery is guaranteed only inside Kafka.
- A producer can produce the same message many times.
- A consumer may be unable to send its ACK to Kafka after processing the message (e.g. network disconnection).
Services must therefore be designed to be idempotent.
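One common way to make a consumer idempotent is to remember the IDs of messages already processed and skip repeats. The sketch below assumes every message carries a unique "id" field; the class name, the field names, and the in-memory set are hypothetical (a real service would keep the processed IDs in durable storage, ideally updated in the same transaction as the business data).

```python
class IdempotentConsumer:
    """Applies each message at most once, even if it is delivered many times."""

    def __init__(self):
        self.processed_ids = set()  # assumption: kept in memory for illustration only
        self.balance = 0

    def handle(self, message: dict) -> bool:
        """Return True if the message was applied, False if it was a duplicate."""
        msg_id = message["id"]
        if msg_id in self.processed_ids:
            return False  # redelivery under at-least-once semantics: ignore
        self.balance += message["amount"]
        self.processed_ids.add(msg_id)
        return True


if __name__ == "__main__":
    consumer = IdempotentConsumer()
    deliveries = [
        {"id": "m-1", "amount": 100},
        {"id": "m-1", "amount": 100},  # duplicate delivery
        {"id": "m-2", "amount": 50},
    ]
    for delivery in deliveries:
        consumer.handle(delivery)
    print(consumer.balance)
```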
➢ Services are loosely coupled by using Kafka as the message hub.
➢ Kafka has a rebalancing function for configuration changes.

Loose coupling between services
• Service configuration changes don't influence other services.
• Without a messaging hub, connected services must be changed whenever a new service is added.

Loose coupling between service and Kafka
• Rebalancing: Kafka automatically adapts to configuration changes inside a service.
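Rebalancing can be pictured as recomputing which instance of a service reads which partition whenever the set of instances changes. The sketch below is a simplified round-robin assignment under that assumption; the function name is hypothetical, and Kafka itself offers several assignment strategies that behave differently in detail.

```python
def assign_partitions(partitions, consumers):
    """Spread partitions over the consumers of one group in round-robin order."""
    assignment = {consumer: [] for consumer in consumers}
    ordered = sorted(consumers)
    if not ordered:
        return assignment
    for i, partition in enumerate(sorted(partitions)):
        assignment[ordered[i % len(ordered)]].append(partition)
    return assignment


if __name__ == "__main__":
    partitions = range(6)
    print(assign_partitions(partitions, ["a", "b"]))
    # A third instance joins the service: the same partitions are redistributed.
    print(assign_partitions(partitions, ["a", "b", "c"]))
```

Other services are untouched by the change: they keep producing to and consuming from the same topic.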
Connectivity
➢ The Kafka ecosystem provides low-code integration functions to connect to external components (Kafka Connect).

[Figure: DB, Service, and Managed service → Source Connector (worker 1, worker 2, worker 3) → Kafka → Sink Connector (worker 1, worker 2, worker 3) → DB, Service, and Managed service]

• A provided image and a config file are needed to build a connector.
• Image files are mainly provided by Confluent or on GitHub.
➢ In a microservices architecture, each service typically has its own database to keep services loosely coupled.

Monolith (Component 1 to Component 4 sharing one DB)
• There is one DB.
• All components can access it.

Microservices (Service 1 with DB 1, Service 2 with DB 2, Service 3 with DB 3)
• Each service has its own DB.
• Each DB can be accessed only by the relevant service.
➢ When multiple DBs are updated, a consistent state of the data cannot be maintained.

Sequence (Service 1 with DB 1, Service 2 with DB 2)
1. Request
2. Update & Commit
3. Request
4. Update
5. Failure occurs
6. Rollback
⇒ The commit at step 2 cannot be rolled back.

Problem
Multiple DBs cannot be committed at the same time.
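The problem can be reproduced with two independent databases. The sketch below uses two in-memory SQLite connections as stand-ins for DB 1 and DB 2; the table names and the simulated failure are hypothetical. After the failure, DB 2 rolls back cleanly, but the row committed to DB 1 at step 2 is still there.

```python
import sqlite3


def make_db(table: str) -> sqlite3.Connection:
    """Create an independent in-memory database with one table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, status TEXT)")
    conn.commit()
    return conn


def place_order(db1, db2, fail_in_service2: bool) -> None:
    # Step 2: Service 1 updates DB 1 and commits.
    db1.execute("INSERT INTO orders (id, status) VALUES (1, 'CREATED')")
    db1.commit()
    try:
        # Steps 3-4: Service 2 is requested and updates DB 2.
        db2.execute("INSERT INTO payments (id, status) VALUES (1, 'PAID')")
        if fail_in_service2:
            raise RuntimeError("step 5: failure in Service 2")
        db2.commit()
    except RuntimeError:
        db2.rollback()  # step 6: DB 2 is rolled back
        db1.rollback()  # no effect: the step 2 commit is already durable


def count(conn, table: str) -> int:
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


if __name__ == "__main__":
    db1, db2 = make_db("orders"), make_db("payments")
    place_order(db1, db2, fail_in_service2=True)
    print(count(db1, "orders"), count(db2, "payments"))
```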
➢ A typical solution to this issue is a distributed transaction, such as the Saga or TCC pattern.

Saga pattern
• The update is committed in each service, respectively.
• On failure, a compensating transaction cancels the committed data.

TCC (Try-Confirm-Cancel) pattern
• Two phases across services: (i) commit OK/NG check, (ii) commit request after all services have responded OK.
➢ We evaluated Saga as superior for large systems from the viewpoints of throughput and loose coupling.

                  Saga                                         TCC
Process outline   Compensating transaction                     2-phase commit
Throughput        Good: 1-phase process in the normal case     Not good: always a 2-phase process
Loose coupling    Good: independent commit at each service     Not good: commit after checks at all other services
Consistency       Not good: an inconsistent state can exist    Good: only temporary data can exist

Problem
To resolve the inconsistent state early, (i) the data commit and (ii) the requests to other services must be done without conflict in each individual service.
The larger the system, the greater the performance problem.
Saga pattern
➢ In an individual service, (i) the data commit and (ii) the requests to other services can be temporarily inconsistent with each other.

Sequence (Service 1, DB 1, Kafka, Service 2)
1. Request
2. Update & Commit
3. Request
After step 2, the system state becomes temporarily inconsistent.

Problem & Solution
• Steps 2 and 3 must be processed without conflict:
  if step 2 fails, step 3 must not be processed;
  if step 2 succeeds, step 3 must be processed.
• Kafka is suitable for the solution.
Transactional outbox pattern
➢ One implementation method for the Saga pattern is the transactional outbox pattern.
➢ Using an outbox table, the DB update and the requests to other services are processed without conflict.

[Figure: Service 1 writes the business data table (1. CUD) and the outbox table (2. C) in the RDB; the Message relay reads the outbox table (3. Read) and sends requests to Other services (4. Request)]

Sequence
1. Service 1 updates the business data table.
2. Service 1 inserts into the outbox table and commits.
3. The message relay reads the outbox table.
4. The message relay sends the data of the outbox table to other services.

Is there any tool?
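The four steps above can be sketched with a single SQLite database. This is an illustration under stated assumptions, not a production relay: the table layout, the "sent" flag, the topic name, and the polling function are all hypothetical. The key point is that the business row and the outbox row are written in one local transaction, so a request is relayed only if the update was committed.

```python
import json
import sqlite3
import uuid


def init_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT)")
    conn.execute(
        "CREATE TABLE outbox (id TEXT PRIMARY KEY, topic TEXT, payload TEXT, "
        "sent INTEGER DEFAULT 0)"
    )
    conn.commit()
    return conn


def create_order(conn, order_id: str, fail: bool = False) -> None:
    """Steps 1-2: write business data and the outbox row, then commit both together."""
    try:
        conn.execute("INSERT INTO orders (id, status) VALUES (?, 'CREATED')", (order_id,))
        conn.execute(
            "INSERT INTO outbox (id, topic, payload) VALUES (?, ?, ?)",
            (str(uuid.uuid4()), "order-events", json.dumps({"order_id": order_id})),
        )
        if fail:
            raise RuntimeError("simulated failure before commit")
        conn.commit()
    except RuntimeError:
        conn.rollback()  # neither the order nor the outbox row survives


def relay_once(conn, publish) -> int:
    """Steps 3-4: read unsent outbox rows and hand them to `publish`."""
    rows = conn.execute("SELECT id, topic, payload FROM outbox WHERE sent = 0").fetchall()
    for row_id, topic, payload in rows:
        publish(topic, payload)
        conn.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (row_id,))
    conn.commit()
    return len(rows)


if __name__ == "__main__":
    db = init_db()
    create_order(db, "o-1")
    create_order(db, "o-2", fail=True)
    relay_once(db, lambda topic, payload: print(topic, payload))
```

If the relay crashes after publishing but before marking the row as sent, the row is published again on the next poll, so consumers still need to be idempotent.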
Transactional outbox pattern with Debezium
➢ Debezium provides a Kafka connector that acts as the message relay of the transactional outbox pattern, and a library for services.

[Figure: Service 1, using the debezium-quarkus-outbox library, writes the business data table (1. CUD) and the outbox table (2. C) in the RDB; the Debezium Connector produces to Kafka (4. Produce); Other services consume (5. Consume)]

Sequence
1. Service 1 updates its DB (business data).
2. The library inserts the update information into the outbox table.
3. Steps 1 and 2 are committed at the same time.
4. The Debezium connector detects the change in the outbox table.
5. The Debezium connector sends the data of the outbox table to Kafka.
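A connector like this is typically registered by posting a JSON document to the Kafka Connect REST API (POST /connectors). The sketch below only builds such a document and prints it. It assumes a PostgreSQL database, which the slides do not specify, and the host, credentials, database name, topic prefix, and connector name are hypothetical placeholders. Property names follow recent Debezium releases and may differ between versions, so treat them as a starting point to check against the documentation of the version in use.

```python
import json


def outbox_connector_request(name: str = "outbox-connector") -> dict:
    """Build a Kafka Connect registration body for an outbox-reading connector."""
    config = {
        # Assumption: PostgreSQL; other databases use a different connector class.
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        # Hypothetical connection settings.
        "database.hostname": "order-db",
        "database.port": "5432",
        "database.user": "orderuser",
        "database.password": "orderpw",
        "database.dbname": "orderdb",
        "topic.prefix": "order",
        # Capture changes from the outbox table only.
        "table.include.list": "public.outbox",
        # Route each outbox row to a topic using Debezium's outbox event router.
        "transforms": "outbox",
        "transforms.outbox.type": "io.debezium.transforms.outbox.EventRouter",
        "transforms.outbox.route.by.field": "aggregatetype",
    }
    return {"name": name, "config": config}


if __name__ == "__main__":
    print(json.dumps(outbox_connector_request(), indent=2))
```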
A sample of the transactional outbox pattern (e-commerce) is provided on GitHub.
https://github.com/debezium/debezium-examples/tree/master/saga
Outbox table
➢ The outbox table holds metadata and a payload describing the update.
➢ One of the metadata columns indicates the Kafka topic to send to.

#  Column              Content                                                                       Example
1  ID                  ID of the outbox table row                                                    f9a78388-883a-45d7-b4f6-7d47f8c89187
2  AGGREGATETYPE       Process name, used to indicate the Kafka topic (defined by business logic)    credit-approval
3  AGGREGATEID         ID of the process event (defined by business logic)                           57974d15-a2f5-49a7-b18a-71c636193bdd
4  TYPE                Request type in the process (defined by business logic)                       Request
5  PAYLOAD             Update information about the business data                                    (omitted)
6  TIMESTAMP           Timestamp of the request                                                      2021-08-16 07:12:42.605371
7  TRACINGSPANCONTEXT  ID for distributed tracing (Jaeger)                                           uber-trace-id=76d506f0a064b52d\:a7c822eda5e59186\:76d506f0a064b52d\:1
Compensating transaction
➢ When the order service receives an error response from one service, it sends a compensating transaction to the other.

Communication in the compensating transaction case (Order service, Kafka, Customer service, Payment service)
1. requests
2. request
3. request
4. success
5. error
6. responses
7. compensating transaction request
8. compensating transaction request

The library sends compensating transactions to the other services, except for the service where the error occurred.
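The compensation rule can be sketched as a small coordinator: send the request to every participant, and if any of them reports an error, send a compensating request to the ones that succeeded. The function, the participant callables, and the state names below are hypothetical; in the sample, this communication goes through Kafka rather than direct calls.

```python
def run_saga(participants: dict, request: dict):
    """Run one saga step across participants.

    `participants` maps a service name to an (action, compensate) pair of
    callables. `action` returns True on success and False on error.
    Returns the final state and the list of compensated services.
    """
    succeeded, failed = [], []
    for name, (action, _compensate) in participants.items():
        (succeeded if action(request) else failed).append(name)

    if not failed:
        return "COMPLETED", []

    # Compensate every service except those where the error occurred.
    for name in succeeded:
        participants[name][1](request)
    return "ABORTED", succeeded


if __name__ == "__main__":
    log = []
    services = {
        "customer": (lambda r: True, lambda r: log.append("customer:cancel")),
        "payment": (lambda r: False, lambda r: log.append("payment:cancel")),
    }
    print(run_saga(services, {"order_id": "o-1"}), log)
```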
➢ The Debezium connector is used to loosely couple a monolithic system during migration.

[Figure: Service 1 in the monolithic System writes its data to the RDB (1. CUD); the Debezium Connector reads the Transaction Log as a replica (2. Replica) and produces to Kafka (3. Produce); Other services consume (4. Consume)]

The Debezium connector receives the whole transaction log from the RDB by being set up as a replica of the RDB.
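On the consuming side, a new service can rebuild its own copy of the monolith's data by applying the change events in order. The sketch below assumes a simplified form of the Debezium change event, keeping only the "op", "before", and "after" fields, and assumes rows are keyed by an "id" column; real events carry more fields, and the function name is hypothetical.

```python
def apply_change_event(replica: dict, event: dict) -> None:
    """Apply one change event to an in-memory copy of a table.

    op codes: "c" = create, "u" = update, "d" = delete, "r" = snapshot read.
    """
    op = event["op"]
    if op in ("c", "u", "r"):
        row = event["after"]
        replica[row["id"]] = row
    elif op == "d":
        replica.pop(event["before"]["id"], None)


if __name__ == "__main__":
    replica = {}
    events = [
        {"op": "c", "before": None, "after": {"id": 1, "status": "CREATED"}},
        {"op": "u", "before": {"id": 1, "status": "CREATED"}, "after": {"id": 1, "status": "PAID"}},
        {"op": "c", "before": None, "after": {"id": 2, "status": "CREATED"}},
        {"op": "d", "before": {"id": 2, "status": "CREATED"}, "after": None},
    ]
    for change in events:
        apply_change_event(replica, change)
    print(replica)
```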
➢ Apache Kafka with Debezium is useful for a microservices architecture not only in the development phase but also in the migration phase.

Kafka characteristics
1. High scalability
2. Fault tolerance
3. Loosely coupled
4. Connectivity (especially important)

• Transactional Outbox pattern: (1) updating data and (2) requests to other services become consistent.
• Strangler Fig pattern: functions in the legacy system are gradually split into services.
• Hadoop®, Hadoop, Apache Kafka®, Kafka, Apache ZooKeeper™, ZooKeeper, and associated open-source project names are trademarks of the Apache Software Foundation.
• The Debezium name and logo are trademarks of Red Hat.
• Elasticsearch is a trademark of Elasticsearch BV, registered in the U.S.
• MySQL and Oracle are registered trademarks of Oracle and/or its affiliates.
• Twitter is a registered trademark of Twitter, Inc.
• All other company names, product names, service names, and other proper nouns mentioned herein are trademarks or registered trademarks of their respective companies.
• TM and ® marks are not indicated in the text and figures in this presentation.