Apache Kafka is one of the most useful OSS options for a message hub in microservices and can easily connect to various data sources. This capability can be applied to realize the Transactional Outbox and Strangler Fig patterns.
Apache Kafka for Microservices
OSS Tech Talk, 2021.10.29
Kiminori Kurihara, Hitachi, Ltd., R&D Group
© Hitachi, Ltd. 2021. All rights reserved.
Summary
Kafka Connectivity
➢ Apache Kafka can easily connect to various data sources.
➢ Consistency between Kafka and DB events is useful for microservices.
Kafka Connector: Kafka can connect to various data sources.
• Low code: using the provided image and config files, a Kafka connector can be built.
• Source & Sink Connectors: connectors that bring data into Kafka (Source) and out of Kafka (Sink) for data integration.
• Various data sources: self-developed services, managed services, DBs, and so on.
Use in microservices: Change Data Capture (CDC) with Kafka and Debezium
• Development: Transactional Outbox pattern, one method for achieving distributed transactions.
• Migration: Strangler Fig pattern, synchronizing the DB for updates with the DB for queries (CQRS pattern).
Agenda
1. Background
2. Apache Kafka
  2-1. Apache Kafka Overview
  2-2. High scalability
  2-3. Fault tolerance
  2-4. Loosely coupled
  2-5. Connectivity
3. Distributed transaction in microservices
  3-1. Database per service
  3-2. Problem: Consistency
  3-3. Solution: Distributed transaction (Saga pattern / TCC pattern)
  3-4. Saga vs. TCC
4. Using Kafka in microservices
  4-1. Problem in Saga pattern
  4-2. Transactional Outbox pattern
  4-3. Debezium + Kafka
  4-4. Transactional Outbox pattern with Debezium (sample)
  4-5. Other use case: Strangler Fig pattern
5. Conclusion
1. Background
➢ Why can Apache Kafka be adopted as the hub of a microservices architecture?
• Overview: a microservices architecture divides the system into smaller units called services and loosely couples them.
• Merit: it improves business agility by allowing systems to change flexibly.
• Why Kafka?: there are many examples of microservices using Apache Kafka as a hub.
https://www.confluent.jp/what-is-apache-kafka/
2-1. Apache Kafka Overview
➢ Apache Kafka is one of the most famous OSS distributed event streaming platforms, used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications.
1. High scalability: Kafka can easily handle dozens of nodes and distributes processing by "partitions".
2. Fault tolerance: each partition is replicated across several brokers, and messages can be delivered with several semantics.
3. Loosely coupled: services are loosely coupled by using Kafka as a message hub, and Kafka has a rebalancing function for service configuration changes.
4. Connectivity: the Kafka ecosystem provides low-code integration functions to connect to external components.
2-2. High scalability
➢ Kafka distributes partitions across multiple brokers to hold messages, which gives high scalability.
Kafka messaging: producers produce messages to a Topic; consumers subscribe to messages from that Topic.
1. A Kafka "Topic" consists of several partitions.
2. The partitions of one Topic are distributed across several broker nodes.
3. Messaging is performed on a per-partition basis, so performance scales with the number of partitions.
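A minimal producer sketch of this per-partition behavior, assuming a broker at kafka:9092 and a hypothetical topic "orders" (neither appears in the slides): records with the same key always land in the same partition, and adding partitions and consumers scales throughput.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class OrderProducer {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka:9092");   // assumed broker address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // The record key ("order-42") determines the target partition,
            // so all events of one order stay ordered within one partition.
            RecordMetadata meta = producer
                    .send(new ProducerRecord<>("orders", "order-42", "{\"status\":\"CREATED\"}"))
                    .get();
            System.out.printf("written to partition %d at offset %d%n", meta.partition(), meta.offset());
        }
    }
}
```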
2-3. Fault tolerance
➢ Message store: each partition is replicated across several brokers so that messages are not lost.
➢ Messages can be delivered with several semantics.
Types of message delivery semantics:
• At most once: a message is delivered to consumers at most once, so the message can be lost.
• At least once: a message is delivered to consumers at least once, so the message can be delivered many times.
• Exactly once: a message is delivered to consumers exactly once.
Attention: exactly-once is guaranteed only inside Kafka. A producer can still produce the same message many times, and a consumer may fail to send an ACK to Kafka after processing a message (e.g. network disconnection), so designing the processing to be idempotent is necessary.
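A minimal consumer sketch of at-least-once delivery, reusing the hypothetical "orders" topic from above: offsets are committed only after processing, so a crash between processing and commit causes redelivery, which is exactly why the processing itself should be idempotent.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class OrderConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka:9092");
        props.put("group.id", "order-processor");
        props.put("enable.auto.commit", "false");    // commit offsets manually
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("orders"));
            while (true) {
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(1))) {
                    // Must be idempotent: after a crash before commitSync(),
                    // the same record will be delivered again.
                    process(record.key(), record.value());
                }
                consumer.commitSync();   // at-least-once: commit only after processing
            }
        }
    }

    private static void process(String key, String value) { /* idempotent business logic */ }
}
```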
2-4. Loosely coupled
➢ Services are loosely coupled by using Kafka as the message hub.
➢ Kafka has a rebalancing function for configuration changes.
• Loose coupling between services: without a messaging hub, the connected services must change whenever a new service is added; with Kafka, service configuration changes do not influence other services.
• Loose coupling between a service and Kafka: rebalancing lets Kafka automatically adapt to configuration changes inside a service (e.g. adding or removing consumers).
2-5. Apache Kafka - Connectivity
➢ The Kafka ecosystem provides low-code integration functions to connect to external components (Kafka Connect).
• Source Connectors (running as workers) bring data from DBs, services, and managed services into Kafka.
• Sink Connectors (running as workers) deliver data from Kafka to DBs, services, and managed services.
• To build a connector, you only need the provided image and a config file; the image files are mainly provided by Confluent or on GitHub.
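As an illustration of the low-code approach, the sketch below registers a connector by posting its config to the Kafka Connect REST API. The endpoint, the connector name, the topic, and the connection settings are assumptions made up for this example, and the JDBC sink plugin must already be installed in the Connect image.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RegisterConnector {
    public static void main(String[] args) throws Exception {
        // Hypothetical JDBC sink config; real connector classes and options
        // depend on the plugin you deploy in the Connect worker image.
        String config = """
            {
              "name": "orders-jdbc-sink",
              "config": {
                "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
                "topics": "orders",
                "connection.url": "jdbc:postgresql://db:5432/orders",
                "connection.user": "app",
                "connection.password": "secret",
                "tasks.max": "3"
              }
            }""";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://connect:8083/connectors"))   // assumed Connect REST endpoint
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(config))
                .build();

        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```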
3-1. Database per service
➢ In a microservices architecture, each service typically has its own database for loose coupling.
• Monolith: there is one DB, and all components (Components 1-4) can access it.
• Microservices: each service (Services 1-3) has its own DB (DBs 1-3), and each DB can be accessed only by the relevant service.
3-2. Problem: consistency
➢ When multiple DBs are updated, a consistent state of the data is not maintained.
Problem: multiple DBs cannot be committed at the same time.
Sequence: 1. Request to Service 1 → 2. Service 1 updates DB 1 and commits → 3. Request to Service 2 → 4. Service 2 updates DB 2 → 5. A failure occurs → 6. Service 2 rolls back DB 2.
⇒ The commit at step #2 cannot be rolled back.
3-3. Solution: Distributed transaction
➢ The typical solution to this issue is a distributed transaction, as in the Saga or TCC pattern.
• Saga pattern: updates are committed in each service respectively; on failure, a compensation transaction cancels the committed data.
• TCC (try-confirm/cancel) pattern: two phases across services: (i) a commit OK/NG check and (ii) a commit request after all services respond OK.
3-3-1. Saga pattern
➢ Solution 1, the Saga pattern: send a compensating transaction when a failure occurs.
Sequence: 1. Request to Service 1 → 2. Service 1 updates DB 1 and commits → 3. Request to Service 2 → 4. Service 2 updates DB 2 → 5. A failure occurs → 6. Service 2 rolls back DB 2 → 7. Service 2 sends a compensating transaction request → 8. Service 1 commits a cancellation of step #2.
Solution: the compensating transaction sent on failure cancels the already committed data.
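A minimal sketch of the failure branch on the Service 2 side, assuming hypothetical topic names ("order-events", "order-compensation") and a processPayment business method that are not in the slides: when the local update fails, Service 2 publishes a compensating event so Service 1 can cancel its already committed change.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class PaymentServiceSagaStep {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka:9092");
        props.put("group.id", "payment-service");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
             KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            consumer.subscribe(List.of("order-events"));
            while (true) {
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(1))) {
                    try {
                        processPayment(record.value());          // step 4: local update & commit
                    } catch (Exception failure) {                // step 5: failure occurs
                        // step 7: ask the upstream service to cancel its committed update (step 8)
                        producer.send(new ProducerRecord<>("order-compensation", record.key(), record.value()));
                    }
                }
            }
        }
    }

    private static void processPayment(String orderEvent) { /* business logic and DB commit */ }
}
```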
3-3-2. TCC pattern - normal case
➢ Solution 2, the TCC pattern: a 2-phase commit across services.
Sequence (normal case): 1. Request to Service 1 → 2. Service 1 prepares the commit on DB 1 → 3. Request to Service 2 → 4. Service 2 prepares the commit on DB 2 → 5. Service 2 replies "commit OK" → 6. Service 1 commits DB 1 → 7. Service 1 sends the commit direction → 8. Service 2 commits DB 2.
Solution: two phases across services: (i) a commit OK/NG check and (ii) a confirm/cancel request.
3-3-2. TCC pattern - failure case
➢ Solution 2, the TCC pattern: a 2-phase commit across services.
Sequence (failure case): 1. Request to Service 1 → 2. Service 1 prepares the commit on DB 1 → 3. Request to Service 2 → 4. Service 2 prepares the commit on DB 2 → 5. A failure occurs → 6. Service 2 rolls back DB 2 → 7. Service 2 replies "commit NG" → 8. Service 1 cancels its commit preparation on DB 1.
Solution: two phases across services: (i) a commit OK/NG check and (ii) a confirm/cancel request.
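To make the two phases concrete, here is a minimal sketch of the participant interface a TCC coordinator could call. The method names (tryReserve, confirm, cancel) are illustrative assumptions, not an API from the slides.

```java
// Minimal sketch of a TCC participant, with illustrative method names.
public interface TccParticipant {
    // Phase (i): tentatively reserve resources and report OK (true) / NG (false).
    boolean tryReserve(String transactionId);

    // Phase (ii-a): make the tentative change permanent after all participants reported OK.
    void confirm(String transactionId);

    // Phase (ii-b): discard the tentative change if any participant reported NG.
    void cancel(String transactionId);
}
```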
3-4. Saga vs. TCC
➢ We evaluated that Saga is superior for large systems from the viewpoints of throughput and loose coupling.
• Process outline: Saga uses a compensation transaction; TCC uses a 2-phase commit.
• Throughput: Saga is good (a 1-phase process in the normal case); TCC is not good (always a 2-phase process).
• Loose coupling: Saga is good (independent commits at the respective services); TCC is not good (commit only after checks at all other services).
• Consistency: Saga is not good (an inconsistent state can exist); TCC is good (only temporary data can exist).
• Problem: with Saga, for early resolution of the inconsistent state, (i) the data commit and (ii) the requests to other services in an individual service must be done without conflict; with TCC, the larger the system, the greater the performance problem.
4-1. Problem in Saga pattern
➢ In the Saga pattern, (i) the data commit and (ii) the requests to other services in an individual service can leave the system temporarily inconsistent.
Sequence: 1. Request to Service 1 → 2. Service 1 updates DB 1 and commits → 3. Service 1 sends a request (via Kafka) to Service 2.
After step 2, the system state is temporarily inconsistent, so steps 2 and 3 must be processed without conflict: if 2 fails, 3 must not be processed; if 2 succeeds, 3 must be processed.
Problem & solution: steps 2 and 3 must be done without conflict ⇒ Kafka is suitable for the solution.
4-2. Transactional outbox pattern
➢ One implementation method for the Saga pattern is the transactional outbox pattern: updating the DB and sending requests to other services are processed without conflict by using an outbox table.
Sequence:
1. Service 1 updates the business data table (CUD) in its RDB.
2. Service 1 inserts a record into the outbox table in the same transaction and commits.
3. A message relay reads the outbox table.
4. The message relay sends the outbox data to the other services.
Is there any tool that can act as the message relay?
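A minimal sketch of steps 1-2 with plain JDBC, assuming hypothetical table and column names (purchase_order, and an outbox table with the columns shown later): the business update and the outbox insert are committed in one local transaction, so either both become visible to the message relay or neither does.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.UUID;

public class OrderServiceOutboxWrite {
    public static void placeOrder(String orderId, String payloadJson) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://db:5432/orders", "app", "secret")) {   // assumed DB
            conn.setAutoCommit(false);
            try {
                // Step 1: update the business data table.
                try (PreparedStatement ps = conn.prepareStatement(
                        "INSERT INTO purchase_order (id, status) VALUES (?, 'CREATED')")) {
                    ps.setString(1, orderId);
                    ps.executeUpdate();
                }
                // Step 2: insert the event into the outbox table in the SAME transaction.
                try (PreparedStatement ps = conn.prepareStatement(
                        "INSERT INTO outbox (id, aggregatetype, aggregateid, type, payload) "
                      + "VALUES (?, ?, ?, ?, ?::jsonb)")) {
                    ps.setString(1, UUID.randomUUID().toString());
                    ps.setString(2, "order");            // used for Kafka topic routing
                    ps.setString(3, orderId);
                    ps.setString(4, "OrderCreated");
                    ps.setString(5, payloadJson);
                    ps.executeUpdate();
                }
                conn.commit();   // both rows, or neither, become visible to the message relay
            } catch (Exception e) {
                conn.rollback();
                throw e;
            }
        }
    }
}
```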
4-3. Debezium + Kafka
➢ Debezium converts an "update of an RDB" into an "event to Kafka".
• Debezium is an OSS distributed platform for Change Data Capture (CDC).
• Debezium provides a Kafka connector, the "Debezium connector".
• The Debezium connector converts an "update of the RDB" into an "event to Kafka".
Debezium connector system: https://access.redhat.com/documentation/ja-jp/red_hat_integration/2020-q2/html-single/debezium_user_guide/index
4-4. Transactional outbox pattern with Debezium
➢ Debezium provides the Kafka connector that acts as the message relay of the transactional outbox pattern, and a library for services (debezium-quarkus-outbox).
Sequence:
1. Service 1 updates its business data in the RDB.
2. The library inserts the update info into the outbox table.
3. Steps 1 & 2 are committed at the same time.
4. The Debezium connector detects the change of the outbox table.
5. The Debezium connector produces the outbox data to Kafka, and the other services consume it.
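A minimal sketch of how a service can use the debezium-quarkus-outbox library: the service implements an ExportedEvent and fires it as a CDI event inside its transaction, and the extension writes it to the outbox table in that same transaction. The exact interface, package namespace (javax vs. jakarta), and event type parameters depend on the Debezium and Quarkus versions; the class and field names below are illustrative assumptions.

```java
import java.time.Instant;
import javax.enterprise.context.ApplicationScoped;
import javax.enterprise.event.Event;
import javax.inject.Inject;
import javax.transaction.Transactional;
import com.fasterxml.jackson.databind.JsonNode;
import io.debezium.outbox.quarkus.ExportedEvent;

// Sketch of an outbox event emitted through the debezium-quarkus-outbox extension.
class OrderCreatedEvent implements ExportedEvent<String, JsonNode> {
    private final String orderId;
    private final JsonNode payload;
    private final Instant timestamp = Instant.now();

    OrderCreatedEvent(String orderId, JsonNode payload) {
        this.orderId = orderId;
        this.payload = payload;
    }

    @Override public String getAggregateId()   { return orderId; }        // AGGREGATEID
    @Override public String getAggregateType() { return "order"; }        // AGGREGATETYPE -> topic routing
    @Override public String getType()          { return "OrderCreated"; } // TYPE
    @Override public JsonNode getPayload()     { return payload; }        // PAYLOAD
    @Override public Instant getTimestamp()    { return timestamp; }      // TIMESTAMP
}

@ApplicationScoped
class OrderService {
    @Inject Event<ExportedEvent<?, ?>> event;

    @Transactional
    void placeOrder(String orderId, JsonNode orderJson) {
        // Step 1: update the business data table here (e.g. persist an entity) ...
        // Steps 2 & 3: the fired event is written to the outbox table in the SAME transaction.
        event.fire(new OrderCreatedEvent(orderId, orderJson));
    }
}
```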
4-4-1. Sample
➢ A sample of the transactional outbox pattern (an e-commerce example) is provided on GitHub:
https://github.com/debezium/debezium-examples/tree/master/saga
4-4-1-1. Sample - outbox table
➢ The outbox table has metadata and a payload about the update.
➢ One piece of metadata (AGGREGATETYPE) indicates the Kafka topic to send to.
1. ID: ID of the outbox record, e.g. f9a78388-883a-45d7-b4f6-7d47f8c89187
2. AGGREGATETYPE: process name used to determine the Kafka topic (defined by the business logic), e.g. credit-approval
3. AGGREGATEID: ID of the process event (defined by the business logic), e.g. 57974d15-a2f5-49a7-b18a-71c636193bdd
4. TYPE: request type in the process (defined by the business logic), e.g. Request
5. PAYLOAD: update information about the business data (omitted)
6. TIMESTAMP: timestamp of the request, e.g. 2021-08-16 07:12:42.605371
7. TRACINGSPANCONTEXT: ID for distributed tracing (Jaeger), e.g. uber-trace-id=76d506f0a064b52d:a7c822eda5e59186:76d506f0a064b52d:1
4-4-1-2. Sample - compensating transaction
➢ When the order service receives an error response from one service, it sends a compensating transaction to the other.
Communication in the compensating-transaction case:
1. The order service sends requests, which are delivered via Kafka as 2. a request to the customer service and 3. a request to the payment service.
4. The customer service succeeds, while 5. the payment service returns an error.
6. Both services respond to the order service.
7.-8. The library sends compensating transaction requests to the other services, except the service where the error occurred.
4-5. Other use case
➢ The Debezium connector can also be utilized to loosen the coupling of a monolithic system, e.g. during migration.
Debezium connector system:
1. Service 1 updates its data (CUD) in the RDB, which records the change in the transaction log.
2. The Debezium connector receives the whole transaction log from the RDB by being set up as a replica of the RDB.
3. The Debezium connector produces the changes to Kafka.
4. The other services consume the change events.
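A minimal sketch of step 4, assuming the Debezium connector is configured with the default JSON converter (change events wrapped in a "payload" envelope with "op", "before", and "after" fields) and a hypothetical CDC topic name; the envelope layout and topic naming depend on the connector configuration.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ChangeEventConsumer {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka:9092");
        props.put("group.id", "other-service");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        ObjectMapper mapper = new ObjectMapper();
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("dbserver1.public.service1_data"));  // hypothetical CDC topic
            while (true) {
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(1))) {
                    if (record.value() == null) continue;               // skip tombstone records
                    JsonNode envelope = mapper.readTree(record.value()).path("payload");
                    String op = envelope.path("op").asText();           // "c", "u", "d", or "r"
                    JsonNode after = envelope.path("after");            // row state after the change
                    System.out.printf("op=%s after=%s%n", op, after);   // apply to the local store here
                }
            }
        }
    }
}
```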
4-5-1. Other use case - strangler fig pattern
➢ Strangler Fig pattern: functions in a legacy system are gradually split out into services.
CDC-based Strangler Fig pattern example: separating the update and query processes of a monolith.
• Legacy monolith: Components 1-4 share one RDB and perform CRUD on it.
• After splitting: a proxy routes updates (CUD) to the legacy components and their RDB, while queries (R) go to a separate query-side data source; a Debezium connector and a sink connector synchronize the two through Kafka, so update and query become loosely coupled (the DBs are constructed following the CQRS pattern).
5. Conclusion
➢ Apache Kafka with Debezium is useful for a microservices architecture, not only in the development phase but also in the migration phase.
• Kafka characteristics suitable for a microservices message hub: 1. High scalability, 2. Fault tolerance, 3. Loosely coupled, 4. Connectivity (especially important here).
• Transactional Outbox pattern: (1) updating data and (2) requests to other services become consistent.
• Strangler Fig pattern: functions in a legacy system are gradually split out into services.
Trademark
• Apache Hadoop®, Hadoop, Apache Kafka®, Kafka, Apache ZooKeeper™, ZooKeeper, and associated open-source project names are trademarks of the Apache Software Foundation.
• The Debezium name and logo are trademarks of Red Hat.
• Elasticsearch is a trademark of Elasticsearch BV, registered in the U.S.
• MySQL and Oracle are registered trademarks of Oracle and/or its affiliates.
• Twitter is a registered trademark of Twitter, Inc.
• All other company names, product names, service names, and other proper nouns mentioned herein are trademarks or registered trademarks of their respective companies.
• TM and ® marks are not indicated in the text and figures in this presentation.