Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Apache Kafka for Microservices

Apache Kafka for Microservices

Apache Kafka is one of the most useful OSS for the message hub in micro services and can easily connect various data sources.
This feature can be applied to the realization of Transactional Outbox and Strangler Fig patterns.

Kiminori Kurihara

October 26, 2021
Tweet

Other Decks in Technology

Transcript

  1. © Hitachi, Ltd. 2021. All rights reserved.
    Apache Kafka for Microservices
    2021.10.29
    Hitachi Ltd., R&D Group
    Kiminori Kurihara
    OSS Tech Talk

    View Slide

  2. © Hitachi, Ltd. 2021. All rights reserved.
    Summary
    Transactional Outbox pattern:
    one of methods for achieving distributed transaction.
    Kafka Connectivity
    ➢ Apache Kafka can easily connect to various data sources.
    ➢ Consistency with Kafka & DB event is useful for microservices.
    Use in microservices
    Development
    Migration
    Strangler Fig pattern:
    synchronizing DB for update and DB for query (CQRS pattern).
    Change data Caption(CDC)
    Kafka and Debezium
    • Low code:
    Using provided image & config files,
    Kafka connector can be built.
    • Source & Sink Connector:
    Data into Kafka (Source) & from
    Kafka (Sink) connector for data
    integration.
    • Various Data source:
    Self-developed service, managed
    service, DB and so on.
    Kafka Connector
    Kafka can connect various data sources.

    View Slide

  3. © Hitachi, Ltd. 2021. All rights reserved.
    Agenda
    1. Background
    2. Apache Kafka
    1. Apache Kafka Overview
    2. High scalability
    3. Fault tolerance
    4. Loosely coupled
    5. Connectivity
    3. Distributed transaction in microservices
    1. Database per service
    2. Problem: Consistency
    3. Solution: Distributed transaction
    1. Saga pattern
    2. TCC pattern
    4. Saga vs. TCC
    4. Using Kafka in microservices
    1. Problem in Saga pattern
    2. Transactional Outbox pattern
    3. Debezium + Kafka
    4. Transactional Outbox pattern with Debezium
    1. Sample
    5. Other use case
    1. Strangler Fig pattern
    5. Conclusion

    View Slide

  4. © Hitachi, Ltd. 2021. All rights reserved.
    Agenda
    1. Background
    2. Apache Kafka
    1. Apache Kafka Overview
    2. High scalability
    3. Fault tolerance
    4. Loosely coupled
    5. Connectivity
    3. Distributed transaction in microservices
    1. Database per service
    2. Problem: Consistency
    3. Solution: Distributed transaction
    1. Saga pattern
    2. TCC pattern
    4. Saga vs. TCC
    4. Using Kafka in microservices
    1. Problem in Saga pattern
    2. Transactional Outbox pattern
    3. Debezium + Kafka
    4. Transactional Outbox pattern with Debezium
    1. Sample
    5. Other use case
    1. Strangler Fig pattern
    5. Conclusion

    View Slide

  5. © Hitachi, Ltd. 2021. All rights reserved.
    1. Background
    ➢ Why can Apache Kafka be adopted for the hub of microservices
    architecture?
    https://www.confluent.jp/what-is-apache-kafka/
    Microservices Architecture
    • Divide the system into smaller units called
    services and loosely couple them.
    Overview
    • Improve the agility of business speed by
    allowing systems to change flexibly.
    Merit
    Hub
    • There are many examples of microservices
    using Apache Kafka as a hub.
    Why?

    View Slide

  6. © Hitachi, Ltd. 2021. All rights reserved.
    Agenda
    1. Background
    2. Apache Kafka
    1. Apache Kafka Overview
    2. High scalability
    3. Fault tolerance
    4. Loosely coupled
    5. Connectivity
    3. Distributed transaction in microservices
    1. Database per service
    2. Problem: Consistency
    3. Solution: Distributed transaction
    1. Saga pattern
    2. TCC pattern
    4. Saga vs. TCC
    4. Using Kafka in microservices
    1. Problem in Saga pattern
    2. Transactional Outbox pattern
    3. Debezium + Kafka
    4. Transactional Outbox pattern with Debezium
    1. Sample
    5. Other use case
    1. Strangler Fig pattern
    5. Conclusion

    View Slide

  7. © Hitachi, Ltd. 2021. All rights reserved.
    2-1. Apache Kafka Overview
    ➢ Apache Kafka is one of the most famous OSSs for distributed
    event streaming platform.
    • Kafka can easily handle dozens of nodes and
    distributed processing by "partitions"
    1. High scalability
    1 2 3 4 5
    • One message is distributed into several partitions.
    • Messages can be sent by several semantics.
    2. Fault tolerance
    • Services are loosely coupled by using Kafka as
    message hub.
    • Kafka has a rebalancing function for service
    configuration changes.
    3. Loosely coupled
    • Kafka ecosystem provides low-code integration
    functions to connect to external components.
    4. Connectivity
    Used by thousands of companies for
    • high-performance data pipelines
    • streaming analytics
    • data integration
    • mission-critical applications.

    View Slide

  8. © Hitachi, Ltd. 2021. All rights reserved.
    2-2. High scalability
    1. Kafka has “Topic” consisted of several partitions.
    1 2 3 4 5
    ➢ Kafka distributes partitions across multiple brokers to hold
    messages for high scalability.
    Produces messages
    to Topic
    Subscribe messages
    from Topic
    2. One partition is distributed to several broker nodes.
    3. Messages are performed on a per-partition.
    -> Then the performance is scalable.
    Kafka messaging

    View Slide

  9. © Hitachi, Ltd. 2021. All rights reserved.
    2-3. Fault tolerance
    One message is replicated into several
    partitions against message lost.
    Message store
    1 2 3 4 5
    ➢ One message is distributed into several partitions.
    ➢ Messages can be sent by several semantics.
    Types of message semantics
    At most once
    Exactly once
    At least once
    A message is delivered to consumers at most once.
    -> the message can be lost.
    A message is delivered to consumers at least once.
    -> the message can be delivered many times.
    A message is delivered to consumers exactly once.
    Attention) this is guaranteed only inside Kafka.
    - A producer can produce the same message many times.
    - A consumer cannot send ACK to Kafka after processing
    the message (e.g. NW disconnection)
    Designing idempotent is necessary.

    View Slide

  10. © Hitachi, Ltd. 2021. All rights reserved.
    2-4. Loosely coupled 1 2 3 4 5
    ➢ Services are loosely coupled by using Kafka as message hub.
    ➢ Kafka has a rebalancing function for configuration changes.
    Service configuration changes don’t
    influence other services.
    New service
    If there is not a messaging hub,
    connected services must change
    after adding new service.
    Rebalancing Kafka automatically adapts the
    configuration changes inside Service.
    Loose Coupling between services. Loose Coupling between service and Kafka.

    View Slide

  11. © Hitachi, Ltd. 2021. All rights reserved.
    2-5. Apache Kafka - Connectivity
    Source Connector
    1 2 3 4 5
    ➢ Kafka ecosystem provides low-code integration functions
    to connect to external components (Kafka Connect).
    DB
    Service
    worker 1
    worker 2
    worker 3
    Sink Connector
    worker 1
    worker 2
    worker 3
    DB
    Service
    Managed
    service
    Managed
    service
    Kafka
    Need
    provided image &
    config file
    to build Image files are mainly provided by Confluent or Github.

    View Slide

  12. © Hitachi, Ltd. 2021. All rights reserved.
    Agenda
    1. Background
    2. Apache Kafka
    1. Apache Kafka Overview
    2. High scalability
    3. Fault tolerance
    4. Loosely coupled
    5. Connectivity
    3. Distributed transaction in microservices
    1. Database per service
    2. Problem: Consistency
    3. Solution: Distributed transaction
    1. Saga pattern
    2. TCC pattern
    4. Saga vs. TCC
    4. Using Kafka in microservices
    1. Problem in Saga pattern
    2. Transactional Outbox pattern
    3. Debezium + Kafka
    4. Transactional Outbox pattern with Debezium
    1. Sample
    5. Other use case
    1. Strangler Fig pattern
    5. Conclusion

    View Slide

  13. © Hitachi, Ltd. 2021. All rights reserved.
    3-1. Database per service
    Monolith
    1 2 3 4
    ➢ In microservices architecture, each service typically has its own database
    for loosely coupling.
    Microservices
    Component 1
    Component 2
    Component 3
    Component 4
    DB
    Service 1
    DB 1
    Service 2
    DB 2
    Service 3
    DB 3
    • There is one DB.
    • All components can access it.
    • Each service has its own DB.
    • Each DB can be accessed only by the relevant
    service.

    View Slide

  14. © Hitachi, Ltd. 2021. All rights reserved.
    3-2. Problem: consistency 1 2 3 4
    ➢ When multiple DBs are updated, consistent state of the data is not
    maintained.
    ⇒ Commit (at step #2)
    can’t be rollbacked
    Service 1 Service 2
    1. Request
    2. Update &
    Commit
    3. Request 5. Failure Occur
    6. Rollback
    4. Update
    DB 1 DB 2
    Problem Multiple DBs cannot be committed at the same time.

    View Slide

  15. © Hitachi, Ltd. 2021. All rights reserved.
    3-3. Solution: Distributed transaction
    Saga pattern
    ➢ Typical solution of this issue is distributed transaction,
    just like Saga / TCC pattern does
    TCC(try confirm-cancel) pattern
    • Update is committed in each service, respectively.
    • Compensation transaction in failure cancels the
    committed data.
    • 2 phase across services:
    (i) commit OK/NG check
    (ii) commit request after all services responding OK
    1 2 3 4

    View Slide

  16. © Hitachi, Ltd. 2021. All rights reserved.
    3-3-1. Saga pattern
    ➢ Solution 1. Saga pattern:
    Send compensating transaction in failure occurred.
    Service 1 Service 2
    1. Request
    2.Update &
    Commit
    3. Request
    5. Failure Occur
    6. Rollback
    7. Compensating
    Transaction Request
    8. Commit for
    Canceling
    step #2
    4. Update
    DB 1 DB 2
    Solution Send compensating transaction in failure cancels the committed data.
    1 2 3 4

    View Slide

  17. © Hitachi, Ltd. 2021. All rights reserved.
    3-3-2. TCC pattern – normal case
    ➢ Solution 2. TCC pattern:
    2 phase commit across services.
    DB 1 DB 2
    Solution 2 phase( (i) commit OK/NG check, (ii) confirm/cancel request) across services.
    Service 1 Service 2
    1. Request
    3. Request
    5. Commit OK
    6. Commit
    4. Commit
    Preparation
    8. Commit
    7. Commit Direction
    2. Commit
    Preparation
    1 2 3 4

    View Slide

  18. © Hitachi, Ltd. 2021. All rights reserved.
    3-3-2. TCC pattern – failure case
    ➢ Solution 2. TCC pattern:
    2 phase commit across services.
    DB 1 DB 2
    Solution 2 phase( (i) commit OK/NG check, (ii) confirm/cancel request) across services.
    Service 1 Service 2
    1. Request
    3. Request
    7. Commit NG
    8. Cancel Commit
    Preparation
    4. Commit
    Preparation
    6. Rollback
    2. Commit
    Preparation
    5. Failure occur
    1 2 3 4

    View Slide

  19. © Hitachi, Ltd. 2021. All rights reserved.
    3-4. Saga vs. TCC 1 2 3 4
    ➢ We evaluated that Saga is superior for large system from viewpoints
    about throughput and loose coupling.
    Saga TCC
    Process Outline Compensation transaction 2 phase commit
    Throughput
    Good
    1 phase process in normal
    Not Good
    Always 2 phase process
    Loose coupling
    Good
    Independent commit at respective services
    Not Good
    Commit after checks at all other services
    Consistency
    Not Good
    Inconsistent state can exist
    Good
    Only temporary data can exist
    Problem
    For early resolution of inconsistent state, (i)
    data commit and (ii) requests to other services
    in individual services must be done without
    conflict.
    The larger the system size, the greater the
    degree of performance problem.

    View Slide

  20. © Hitachi, Ltd. 2021. All rights reserved.
    Agenda
    1. Background
    2. Apache Kafka
    1. Apache Kafka Overview
    2. High scalability
    3. Fault tolerance
    4. Loosely coupled
    5. Connectivity
    3. Distributed transaction in microservices
    1. Database per service
    2. Problem: Consistency
    3. Solution: Distributed transaction
    1. Saga pattern
    2. TCC pattern
    4. Saga vs. TCC
    4. Using Kafka in microservices
    1. Problem in Saga pattern
    2. Transactional Outbox pattern
    3. Debezium + Kafka
    4. Transactional Outbox pattern with Debezium
    1. Sample
    5. Other use case
    1. Strangler Fig pattern
    5. Conclusion

    View Slide

  21. © Hitachi, Ltd. 2021. All rights reserved.
    4-1. Problem in Saga pattern
    ➢ (i) data commit and (ii) requests to other services in individual services
    can be temporarily inconsistent.
    Service 1
    1. Request
    2.Update &
    Commit
    3. Request
    DB 1
    Problem & Solution Process 2 & 3 must be done without conflict. => Kafka is suitable for the solution.
    After 2, the system state becomes temporarily inconsistent.
    Service 2
    2 & 3 must be processed without conflict.
    if 2 fails => 3 must not be processed.
    if 2 successes => 3 must be processed.
    Kafka
    1 2 3 4 5

    View Slide

  22. © Hitachi, Ltd. 2021. All rights reserved.
    4-2. Transactional outbox pattern
    ➢ One of the implementation method for Saga pattern is transactional
    outbox pattern.
    Message
    relay
    Updating DB & request to other services are processed without conflict
    using outbox table.
    3. Read
    RDB
    Transactional outbox pattern
    1. Service 1 updates business data table.
    2. Service 1 inserts into the outbox table and commits.
    3. Message relay reads the outbox table.
    4. The data of outbox table is sent to other services by the Message relay.
    Other
    services
    6 6
    business data outbox table
    1. CUD
    2. C
    Service 1
    Sequence
    4. Request
    Is there any tool?
    1 2 3 4 5

    View Slide

  23. © Hitachi, Ltd. 2021. All rights reserved.
    4-3. Debezium + Kafka
    ➢ z “ ” “ ”
    Debezium connector system
    https://access.redhat.com/documentation/ja-jp/red_hat_integration/2020-q2/html-single/debezium_user_guide/index
    • Debezium is OSS about distributed platform for Change Data Capture(CDC).
    • Debezium provides Kafka Connector -> “Debezium connector”.
    • Debezium connector converts “update of RDB” to “event to Kafka”.
    1 2 3 4 5

    View Slide

  24. © Hitachi, Ltd. 2021. All rights reserved.
    4-4. Transactional outbox pattern with Debezium
    ➢ Debezium provides the Kafka connector for the message relay of
    transactional outbox pattern and the library for services.
    Debezium
    Connector
    Kafka
    4. Produce
    RDB
    Transactional outbox pattern with Debezium connector
    1. Service 1 update its DB(business data).
    2. The library insert updating info. Into outbox table.
    3. 1&2 is committed at the same time.
    4. Debezium connector detects the change of
    outbox table.
    5. the data of outbox table is sent to Kafka by
    Debezium connector.
    Other
    services
    5. Consume
    6 6
    business data outbox table
    1. CUD
    2. C
    debezium-quarkus-outbox library
    Service 1
    Sequence
    1 2 3 4 5

    View Slide

  25. © Hitachi, Ltd. 2021. All rights reserved.
    4-4-1. Sample
    ➢ A sample of the transactional outbox pattern is provided in
    Github (E-commerce).
    https://github.com/debezium/debezium-examples/tree/master/saga
    1 2 3 4 5

    View Slide

  26. © Hitachi, Ltd. 2021. All rights reserved.
    4-4-1-1. Sample - outbox table
    ➢ The outbox table has metadata and payload about updating.
    ➢ One of metadata indicates the Kafka topic to be sent.
    # Column Content Example
    1 ID ID of outbox table f9a78388-883a-45d7-b4f6-7d47f8c89187
    2 AGGREGAT
    ETYPE
    Process name used for indicating Kafka topic
    (defined by business logic)
    credit-approval
    3 AGGREGATE
    ID
    ID of process event
    (defined by business logic)
    57974d15-a2f5-49a7-b18a-71c636193bdd
    4 TYPE Request type in process
    (defined by business logic)
    Request
    5 PAYLOAD updating information about business data. (omitted)
    6 TIMESTAMP timestamp of request 2021-08-16 07:12:42.605371
    7 TRACING
    SPAN
    CONTEXT
    ID for distributed tracing
    (Jaeger)
    uber-trace-
    id=76d506f0a064b52d¥:a7c822eda5e591
    86¥:76d506f0a064b52d¥:1
    1 2 3 4 5

    View Slide

  27. © Hitachi, Ltd. 2021. All rights reserved.
    4-4-1-2. Sample – compensation transaction
    ➢ When order service receives the error response from one
    service, it sends the compensating transaction to the other.
    Order
    service
    Kafka
    Customer
    service
    Payment
    service
    1. requests
    2. request
    3. request
    5. error
    4. success
    6. responses
    7. compensating
    transaction
    request
    8. compensating
    transaction
    request
    Communication for compensating transaction case
    The library sends compensating transactions to the other services
    except for the service where error occurred.
    1 2 3 4 5

    View Slide

  28. © Hitachi, Ltd. 2021. All rights reserved.
    4-5. Other use case
    ➢ Debezium connector is utilized in loosely coupling the monolithic system.
    6
    Data of
    Service 1
    Transaction
    Log
    1. CUD
    Debezium
    Connector
    Kafka
    2. Replica 3. Produce
    RDB
    ・・・
    Service 1
    Debezium Connector System
    Other
    services
    4. Consume
    Debezium Connector receive whole transaction logs from RDB,
    by setting it as replica of RDB.
    ’ in migration.
    1 2 3 4 5

    View Slide

  29. © Hitachi, Ltd. 2021. All rights reserved.
    4-5-1. Other use case – strangler fig pattern
    ➢ Strangler Fig Pattern:
    Functions in legacy system are gradually split into services.
    CRUD
    CDC-based Strangler Fig Pattern Example: separating update and query processes in monolith
    Component 1
    Component 2
    Component 3
    Component 4
    RDB
    CUD
    Component 1
    Component 2
    Component 3
    Component 4
    RDB
    Loosely coupled
    (DBs are constructed by CQRS pattern)
    Proxy
    R
    Data Source
    Kafka
    Debezium
    Connector Connector
    Update Query
    Legacy Monolith
    1 2 3 4 5

    View Slide

  30. © Hitachi, Ltd. 2021. All rights reserved.
    Agenda
    1. Background
    2. Apache Kafka
    1. Apache Kafka Overview
    2. High scalability
    3. Fault tolerance
    4. Loosely coupled
    5. Connectivity
    3. Distributed transaction in microservices
    1. Database per service
    2. Problem: Consistency
    3. Solution: Distributed transaction
    1. Saga pattern
    2. TCC pattern
    4. Saga vs. TCC
    4. Using Kafka in microservices
    1. Problem in Saga pattern
    2. Transactional Outbox pattern
    3. Debezium + Kafka
    4. Transactional Outbox pattern with Debezium
    1. Sample
    5. Other use case
    1. Strangler Fig pattern
    5. Conclusion

    View Slide

  31. © Hitachi, Ltd. 2021. All rights reserved.
    5. Conclusion
    suitable for microservices message hub
    Kafka Characteristics
    ➢ The Apache Kafka with Debezium is useful for microservices
    arch. on not only dev. phase but migration phase.
    (1) Updating data & (2)requests to other services become
    consistent
    Transactional Outbox pattern Strangler Fig pattern
    Functions in legacy system are gradually split into services
    z
    z
    z
    1. High scalability 2. Fault tolerance 3. Loosely coupled 4. Connectivity
    Especially important

    View Slide

  32. © Hitachi, Ltd. 2021. All rights reserved.
    Trademark
    • Apache Hadoop®, Hadoop, Apache Kafka®, Kafka, Apache ZooKeeper™,
    ZooKeeper, and associated open-source project names are trademarks of the
    Apache Software Foundation.
    • Debezium name and logo are trademarks of Red Hat.
    • Elasticsearch is a trademark of Elasticsearch BV, registered in the U.S.
    • MySQL, and Oracle are registered trademarks of Oracle and/or its affiliates.
    • Twitter is registered trademark of Twitter, Inc.
    • All other company names, product names, service names, and other proper
    nouns mentioned herein are trademarks or registered trademarks of their
    respective companies
    • TM and ® marks are not indicated in the text and figures in this presentation

    View Slide

  33. View Slide