
Event Sourcing & CQRS, Kafka, RabbitMQ


Building Cloud-Native App Series - Part 2 of 15
Microservices Architecture Series
Event Sourcing & CQRS,
Kafka, RabbitMQ
Kafka Topics
Kafka Connect
Kafka Streams
Case Studies
- E-Commerce App,
- Movie Streaming,
- Ticket Booking,
- Restaurant,
- Hospital Management

Araf Karsh Hamid

June 01, 2022

Transcript

  1. @arafkarsh arafkarsh
    8 Years
    Network &
    Security
    6+ Years
    Microservices
    Blockchain
    8 Years
    Cloud
    Computing
    8 Years
    Distributed
    Computing
    Architecting
    & Building Apps
    a tech presentorial
    Combination of
    presentation & tutorial
    ARAF KARSH HAMID
    Co-Founder / CTO
    MetaMagic Global Inc., NJ, USA
    @arafkarsh
    arafkarsh
    1
    Microservice
    Architecture Series
    Building Cloud Native Apps
    Messaging Systems
    Kafka Architecture
    Kafka Connect / Streams
    Event Storming
    Part 2 of 15


  2. @arafkarsh arafkarsh 2
    Slides are color-coded by topic.
    Messaging Systems
    & Event Streaming
    RabbitMQ
    1
    Kafka Topics
    High Availability
    Fault Tolerance
    2
    Kafka Connect
    Kafka Streams
    3
    Event Storming
    Case Studies
    4


  3. @arafkarsh arafkarsh
    Developer Journey
    Monolithic:
    • Waterfall (6/12 Months) or Agile Scrum (4-6 Weeks)
    • Domain Driven Design
    • Event Sourcing and CQRS
    • Design Patterns: Optional
    • Continuous Integration (CI)
    • Enterprise Service Bus
    • Relational Database [SQL] / NoSQL
    • Development | QA / QC | Ops
    3
    Microservices:
    • Scrum / Kanban (1-5 Days)
    • Domain Driven Design
    • Event Sourcing and CQRS
    • Design Patterns: Mandatory
    • Infrastructure Design Patterns
    • CI / CD, DevOps
    • Event Streaming / Replicated Logs
    • SQL & NoSQL
    • Container Orchestrator, Service Mesh


  4. @arafkarsh arafkarsh
    Messaging / Event Streaming
    • Problem Statement
    • RabbitMQ
    • Kafka
    4
    1


  5. @arafkarsh arafkarsh
    Problem Statement – Synchronous Calls
    5
    Check Out
    Order Inventory
    Notification Service
    eMail SMS
    Cart
    Issues
    1. Complex and error-prone
    2. Tightly coupled systems
    3. Performance issues
    4. Scalability issues


  6. @arafkarsh arafkarsh
    Problem Statement – Async Calls : Queue Based
    6
    Check Out
    Order Inventory
    Notification Service
    eMail SMS
    Cart
    Issues
    • Scalability issues
    • Multiple subscribers are not
    allowed (to the same queue)


  7. @arafkarsh arafkarsh
    RabbitMQ Messaging System
    • Fanout Exchange
    • Direct Exchange
    • Topic Exchange
    • Header Exchange
    7


  8. @arafkarsh arafkarsh
    Async Calls : Fanout Exchange
    8
    Check Out
    Order Inventory
    Notification Service
    eMail SMS
    Cart
    1. Loosely Coupled Systems
    2. Scalable
    Exchange duplicates the message &
    sends it to the respective queues
    via their binding keys.


  9. @arafkarsh arafkarsh
    Async Calls : Direct Exchange
    9
    Check Out
    Order Inventory
    Notification Service
    eMail SMS
    Cart
    1. Loosely Coupled Systems
    2. Scalable
    Exchange duplicates the message &
    sends it to the respective queues.
    The message contains a Routing Key,
    which must exactly match a queue's
    Binding Key.


  10. @arafkarsh arafkarsh
    Async Calls : Topic Exchange
    10
    Check Out
    Order Inventory
    Notification Service
    eMail SMS
    Cart
    1. Loosely Coupled Systems
    2. Scalable
    Exchange duplicates the message &
    sends it to the respective queues.
    Binding key: order.any
    A message whose Routing Key is
    order.phone can do a partial match
    with the binding key order.any.
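To make the exchange semantics concrete, here is a minimal sketch using the RabbitMQ Java client (exchange, queue, and key names are illustrative; RabbitMQ's actual wildcard tokens are * and #, so the slide's order.any corresponds to a binding pattern such as order.*):

    import com.rabbitmq.client.BuiltinExchangeType;
    import com.rabbitmq.client.Channel;
    import com.rabbitmq.client.Connection;
    import com.rabbitmq.client.ConnectionFactory;
    import java.nio.charset.StandardCharsets;

    public class TopicExchangeDemo {
        public static void main(String[] args) throws Exception {
            ConnectionFactory factory = new ConnectionFactory();
            factory.setHost("localhost");
            try (Connection conn = factory.newConnection();
                 Channel channel = conn.createChannel()) {
                // Topic exchange: routing keys are matched against binding patterns
                channel.exchangeDeclare("order-exchange", BuiltinExchangeType.TOPIC, true);
                channel.queueDeclare("que-ord", true, false, false, null);
                channel.queueBind("que-ord", "order-exchange", "order.*");
                // Routing key order.phone partially matches the binding pattern order.*
                channel.basicPublish("order-exchange", "order.phone", null,
                        "Phone order checked out".getBytes(StandardCharsets.UTF_8));
            }
        }
    }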


  11. @arafkarsh arafkarsh
    Async Calls : Header Exchange
    11
    Check Out
    Order Inventory
    Notification Service
    eMail SMS
    Cart
    1. Loosely Coupled Systems
    2. Scalable
    Exchange duplicates the message &
    sends it to the respective queues.
    Message routing happens based on
    the message Headers (binding keys
    are ignored).


  12. @arafkarsh arafkarsh
    Async Calls : Default Exchange
    12
    Check Out
    Order Inventory
    Notification Service
    eMail SMS
    Cart
    1. Loosely Coupled Systems
    2. Scalable
    Exchange moves the message forward
    when the Routing Key matches the
    Queue Name (e.g., que-ord, que-inv,
    que-notification).


  13. @arafkarsh arafkarsh
    Discovering Microservices Principles….
    13
    • Components via Services
    • Organized around Business Capabilities
    • Products NOT Projects
    • Smart Endpoints & Dumb Pipes
    • Decentralized Governance & Data Management
    • Infrastructure Automation
    • Design for Failure
    • Evolutionary Design
    How do the routing rules defy these Microservices principles?


  15. @arafkarsh arafkarsh
    How Kafka Solves the Problems
    • Kafka Topics
    • Kafka Consumer Group
    15


  16. @arafkarsh arafkarsh
    Traditional Queue / Pub-Sub Vs. Kafka
    16
    Queues
    [Diagram: records 0-9 in a queue; Consumers 1, 2, and 3 read records 7, 8,
    and 9 - each record is delivered to only one consumer.]
    Pros:
    Data can be partitioned for scalability, for parallel
    processing by the same type of consumers.
    Cons:
    1. Queues do NOT support multiple subscribers, compared to
    Pub-Sub.
    2. Once a Consumer reads the data, it's gone from
    the queue.
    3. Ordering of records will be lost in asynchronous
    parallel processing.
    Pub-Sub
    [Diagram: records 0-9 in a topic; Consumers 1, 2, and 3 each read record 9 -
    every record is delivered to every subscriber.]
    Pros:
    Multiple subscribers can get the same data.
    Cons:
    Scaling is difficult, as every message goes to every
    subscriber.


  17. @arafkarsh arafkarsh
    Async Calls : Kafka Solution
    17
    Check Out
    Cart
    1. Highly Scalable
    2. Multi Subscriber
    3. Loosely Coupled Systems
    4. Data Durability (Replication)
    5. Ordering Guarantee (Per Partition)
    Use Partition Key
    Kafka
    Producer API
    Kafka
    Consumer API
    eMail SMS
    1 2 3 4
    1 2 3 4
    1 2 Service
    Instances
    Order Topic (Total Partitions 6)
    Kafka Storage
    Replicated Logs
    Kafka Cluster
    5 6 7 8
    7 8
    What will happen to
    Inventory Instance 7 and 8?
    Order Consumer Group Inv Consumer Group
    Notification Consumer Multiple Subscriber
    As there are only 6 Partitions,
    Kafka can serve ONLY 6 consumers
    within a Consumer Group; extra
    consumers stay idle.


  18. @arafkarsh arafkarsh
    Async Calls : Kafka Solution
    18
    Check Out
    Cart
    4. Data Durability (Replication)
    5. Ordering Guarantee (Per
    Partition)
    Use Partition Key
    Kafka
    Producer API
    Kafka
    Consumer API
    1 2 3 4
    1 2 3 4
    3 4
    Service
    Instances
    Order Topic (Total Partitions 6)
    Kafka Storage
    Replicated Logs
    Kafka Cluster
    5 6 7 8
    7 8
    What will happen to
    Inventory Instance 7 and 8?
    Order Consumer Group Inv Consumer Group Multiple Subscriber
    As there are only 6 Partitions,
    Kafka can serve ONLY 6 consumers
    within a Consumer Group; Inventory
    Instances 7 and 8 stay idle.
    2 5
    1
    Broadcast Orders to the following
    Consumer Groups:
    All the above Consumer Groups will get the same
    orders available in the Order Topic.
    1. Highly Scalable
    2. Multi Subscriber
    3. Loosely Coupled
    Systems


  19. @arafkarsh arafkarsh
    2
    Kafka
    o Architecture
    o Topic Durability
    o High Availability
    o Fault Tolerance
    19


  20. @arafkarsh arafkarsh
    Kafka Architecture
    o Core Concepts & Kafka APIs
    o Producer / Consumer & Kafka Cluster with ZooKeeper
    o ZooKeeper / KRaft
    o Producer / Consumer & Kafka Cluster with KRaft
    20


  21. @arafkarsh arafkarsh
    Kafka Core Concepts
    21
    Publish & Subscribe
    Read and write streams of data
    like a messaging system
    Process
    Write scalable stream processing
    apps that react to events in real-time.
    Store
    Store streams of data safely in a
    distributed, replicated, fault-tolerant cluster.
    Producers
    Consumers
    Broker


  22. @arafkarsh arafkarsh
    Kafka APIs
    22
    Source : https://kafka.apache.org/documentation/#gettingStarted
    • The Producer API allows an application to publish a
    stream of records to one or more Kafka topics.
    • The Consumer API allows an application to subscribe
    to one or more topics and process the stream of
    records produced to them.
    • The Streams API allows an application to act as
    a stream processor, consuming an input stream from
    one or more topics and producing an output stream
    to one or more output topics, effectively transforming
    the input streams to output streams.
    • The Connector API allows building and running
    reusable producers or consumers that connect Kafka
    topics to existing applications or data systems. For
    example, a connector to a relational database might
    capture every change to a table.
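A minimal sketch of the first two APIs in Java, assuming a broker on localhost:9092 (topic and group names are illustrative):

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import java.time.Duration;
    import java.util.List;
    import java.util.Properties;

    public class ProducerConsumerDemo {
        public static void main(String[] args) {
            Properties p = new Properties();
            p.put("bootstrap.servers", "localhost:9092");
            p.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            p.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            // Producer API: publish a stream of records to a topic
            try (KafkaProducer<String, String> producer = new KafkaProducer<>(p)) {
                producer.send(new ProducerRecord<>("order-topic", "ORD-1001", "ORDER_CREATED"));
            }

            Properties c = new Properties();
            c.put("bootstrap.servers", "localhost:9092");
            c.put("group.id", "order-consumer-group");
            c.put("auto.offset.reset", "earliest");
            c.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            c.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            // Consumer API: subscribe to topics and process the stream of records
            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(c)) {
                consumer.subscribe(List.of("order-topic"));
                for (ConsumerRecord<String, String> r : consumer.poll(Duration.ofSeconds(5)))
                    System.out.printf("%s -> %s (offset %d)%n", r.key(), r.value(), r.offset());
            }
        }
    }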


  23. @arafkarsh arafkarsh
    Traditional Queue / Pub-Sub Vs. Kafka
    23
    Queues
    [Diagram: records 0-9 in a queue; Consumers 1, 2, and 3 read records 7, 8,
    and 9 - each record is delivered to only one consumer.]
    Pros:
    Data can be partitioned for scalability, for parallel
    processing by the same type of consumers.
    Cons:
    1. Queues do NOT support multiple subscribers, compared to
    Pub-Sub.
    2. Once a Consumer reads the data, it's gone from
    the queue.
    3. Ordering of records will be lost in asynchronous
    parallel processing.
    Pub-Sub
    [Diagram: records 0-9 in a topic; Consumers 1, 2, and 3 each read record 9 -
    every record is delivered to every subscriber.]
    Pros:
    Multiple subscribers can get the same data.
    Cons:
    Scaling is difficult, as every message goes to every
    subscriber.


  24. @arafkarsh arafkarsh
    Producers / Consumers / Kafka Cluster
    24
    Broker 1
    Broker 2
    Broker 3
    Broker n
    ZooKeeper 1 ZooKeeper 2 ZooKeeper 3
    Producer 1
    Producer 2
    Producer n
    Consumer 1
    Consumer 2
    Consumer 3
    …
    Consumer n


  25. @arafkarsh arafkarsh
    ZooKeeper
    25
    1. Broker Registration: Each broker gets registered in ZooKeeper, and
    ZooKeeper keeps track of the brokers that are functional in the Kafka
    cluster.
    2. Topic Configuration: ZooKeeper keeps track of all the topics, partitions,
    replicas, and their respective configuration in the Kafka cluster.
    3. Controller Election: The Kafka cluster elects one of the brokers as the
    controller, which is responsible for managing the states of partitions and
    replicas, and ZooKeeper helps in this controller election process.
    4. ACLs (Access Control Lists): ZooKeeper stores all the ACLs to secure the
    Kafka cluster from unauthorized access.
    5. Membership of Kafka Cluster: It maintains a list of all the active and dead
    brokers.


  26. @arafkarsh arafkarsh
    KRaft
    26
    1. Role and Features: The aim of KRaft mode is to absorb the functions ZooKeeper used to perform
    into Kafka itself, such as maintaining configuration, managing cluster membership, and handling
    partition leadership. This is achieved by using the Raft consensus algorithm for maintaining
    replicated logs of metadata across different brokers in the Kafka cluster.
    2. Benefits: The primary benefits of KRaft are simplification and improvement in scalability and
    performance. It simplifies the Kafka architecture by removing the need to manage an external
    ZooKeeper service. This also simplifies the configuration, deployment, and operational aspects
    of running Kafka. Performance and scalability can potentially be improved, as the Kafka
    community can now optimize the metadata layer in the way that best fits Kafka’s requirements.
    3. Difference from ZooKeeper: ZooKeeper is a general-purpose coordination service, whereas
    KRaft mode is a specific mode built into Kafka itself to meet its own needs for cluster
    coordination and metadata management. KRaft uses the Raft consensus algorithm, which is
    designed to be simple and easy to understand, while ZooKeeper uses a different consensus
    protocol called Zab.
    4. Separate Running of KRaft: Unlike ZooKeeper, which runs as a separate service, KRaft mode is
    integrated into Kafka itself. When you start a Kafka broker in KRaft mode, there's no need to
    start a separate ZooKeeper service.


  27. @arafkarsh arafkarsh
    Producers / Consumers / Kafka Cluster
    27
    Broker 1
    Broker 2
    Broker 3
    Broker n
    Producer 1
    Producer 2
    Producer n
    Consumer 1
    Consumer 2
    Consumer 3
    …
    Consumer n
    Based on KRaft (Raft consensus algorithm);
    KRaft is integrated into each Broker


  28. @arafkarsh arafkarsh
    Kafka Topic and Durability
    1. Anatomy of Topic
    2. Partition Log Segment
    3. Cluster – Topic and Partitions
    4. Record Commit Process
    5. Consumer Access & Retention Policy
    28


  29. @arafkarsh arafkarsh
    Topic / Partitions / Segments
    29
    Topic 1
    Topic 2
    Topic 3
    Partitions Segments
    Topics
    Topic 2 has
    4 Partitions
    Topic 2, Partition 2 has
    4 Segments
    Topic n
    [Diagram: each Partition is an ordered log of records 0-9.
    Topic 2, Partition 2 is split into Segment 0, Segment 3,
    Segment 6, and Segment 9 (Active).]


  30. @arafkarsh arafkarsh
    Anatomy of a Topic
    30
    Source : https://kafka.apache.org/intro
    • A Topic is a category or feed name to which
    records are published.
    • Topics in Kafka are always multi-subscriber.
    • Each Partition is an ordered, immutable
    sequence of records that is continually
    appended to—a structured commit log.
    • A Partition is nothing but a directory of Log
    Files
    • The records in the partitions are each assigned a sequential id number called
    the offset that uniquely identifies each record within the partition.


  31. @arafkarsh arafkarsh 31
    Partition Log Segment
    • Partition (Kafka’s Storage unit) is a Directory of
    Log Files.
    • A partition cannot be split across multiple
    brokers or even multiple disks
    • Partitions are split into Segments
    • Each Segment consists of two files: 000.log & 000.index.
    • Segments are named by their base offset.
    The base offset of a segment is an offset
    greater than offsets in previous segments and
    less than or equal to offsets in that segment.
    [Diagram: a Partition holding records 0-9, split into Segment 0,
    Segment 3, Segment 6, and Segment 9 (Active).]
    $ tree kafka-logs | head -n 6
    kafka-logs
    |──── SigmaHawk-2                      <- Topic / Partition
    | |──── 00000000006109871597.index    <- Segment 1
    | |──── 00000000006109871597.log
    | |──── 00000000007306321253.index    <- Segment 2
    | |──── 00000000007306321253.log
    (Each index entry is two 4-byte fields: relative offset and file position.)


  32. @arafkarsh arafkarsh 32
    Partition Log Segment
    • Indexes store offsets relative to its segments base offset
    • Indexes map each offset to their message position in the log and
    they are used to look up messages.
    • Purging of data is based on oldest segment and one segment at
    a time.
    0000.index                     0000.log
    Rel. Offset | Position         Offset | Position | Size | Payload
    0           | 0                0      | 0        | 7    | ABCDE67
    1           | 7                1      | 7        | 4    | ABC4
    2           | 11               2      | 11       | 9    | ABCDEF89
    3           | 20               3      | 20       | 3    | AB3
    (Each index entry is two 4-byte fields.)

    $ tree kafka-logs | head -n 6
    kafka-logs
    |──── SigmaHawk-2                      <- Topic / Partition
    | |──── 00000000006109871597.index    <- Segment 1
    | |──── 00000000006109871597.log
    | |──── 00000000007306321253.index    <- Segment 2
    | |──── 00000000007306321253.log


  33. @arafkarsh arafkarsh
    Consumer Access & Data Retention
    33
    Source : https://kafka.apache.org/intro
    • The Kafka cluster retains all published records—whether or not they
    have been consumed—using a configurable retention period.
    • For example, if the retention policy is set to 2 days, then for the two days
    after a record is published, it is available for consumption, after which it
    will be discarded to free up space.
    • Kafka's performance is effectively constant with respect to data
    size, so storing data for a long time is not a problem.
    • The only metadata retained on a per-consumer basis is the offset or position of that
    consumer in the log. This offset is controlled by the consumer: normally a consumer will
    advance its offset linearly as it reads records but, since the position is controlled by the
    consumer, it can consume records in any order it likes.
    [Diagram: the Producer writes records (offsets 777736-777743) to the log;
    three Consumers read at Offset=37, Offset=38, and Offset=41.]
    • Producers Push Data
    • Consumers Poll Data
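Because the position is controlled by the consumer, it can also rewind. A minimal Java sketch of seeking to an explicit offset, as in the diagram (topic, partition, and offset values are illustrative):

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;
    import java.time.Duration;
    import java.util.List;
    import java.util.Properties;

    public class OffsetControlDemo {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("group.id", "replay-group");
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                TopicPartition partition = new TopicPartition("order-topic", 0);
                consumer.assign(List.of(partition));   // manual assignment, no group rebalance
                consumer.seek(partition, 37L);         // jump back to offset 37 and re-read
                for (ConsumerRecord<String, String> r : consumer.poll(Duration.ofSeconds(1)))
                    System.out.printf("offset=%d value=%s%n", r.offset(), r.value());
            }
        }
    }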


  34. @arafkarsh arafkarsh
    Kafka High Availability
    o High Availability
    o Fault Tolerance
    34


  35. @arafkarsh arafkarsh
    Kafka Replication, HA & Load Balancing
    35
    Order Consumer Group
    1 2 3
    3 Instances of
    Order Service
    Server 2
    P1 P3 P6
    Server 1
    P2 P4 P5
    Order Topic Partitions 6 – Split into 2 Servers
    P1 P3 P6 P2 P4 P5
    Replication
    1. Partitions are replicated in both Server 1 and
    Server 2
    2. P1, P3, P6 are Leaders in Server 1 and followers
    in Server 2
    3. P2, P4, P5 are Leaders in Server 2 and followers
    in Server 1
    High Availability
    1. If Server 1 goes down then followers in Server 2
    will become the leader and vice versa.
    Load Balancing, Performance and Scalability
    1. Horizontal Scalability is achieved by adding more
    servers.
    2. Partitions from both servers are assigned to
    various consumers of Order Service.
    Order Consumer (2 Partitions each):
    • C1 = P1, P2
    • C2 = P3, P4
    • C3 = P6, P5
    Legend:
    Pn = Leader in Server 1, Follower in Server 2
    Pn = Leader in Server 2, Follower in Server 1


  36. @arafkarsh arafkarsh
    Kafka – Ordering Guarantees
    36
    1. Messages that require relative ordering need to be sent to
    the same Partition. Kafka takes care of this part.
    2. Supply the same key for all the messages that require relative
    ordering.
    3. For example, if a Customer Order requires relative ordering,
    then the Order ID will be the key. All the messages for the same
    Order ID will be sent to the same partition.
    4. To maintain a global ordering without a key, use a single-
    partition Topic. This will limit the scalability.
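A minimal producer sketch for points 2 and 3: using the Order ID as the record key routes every message for that order to the same partition, preserving their relative order (names are illustrative):

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import java.util.Properties;

    public class OrderedProducerDemo {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                String orderId = "ORD-1001"; // same key -> same partition -> relative ordering
                producer.send(new ProducerRecord<>("order-topic", orderId, "ORDER_CREATED"));
                producer.send(new ProducerRecord<>("order-topic", orderId, "PAYMENT_INITIATED"));
                producer.send(new ProducerRecord<>("order-topic", orderId, "ORDER_CONFIRMED"));
            }
        }
    }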


  37. @arafkarsh arafkarsh
    Traditional Queue, Pub Sub Vs. Kafka
    37
    Order Consumer Group Inv Consumer Group
    1 2 3 4
    1 2 3 Service
    Instances
    Order Topic Total Partitions 6 – Split into 2 Servers
    Server 1 Server 2
    P1 P3 P6 P2 P4 P5
    Queue Implementation
    1. Partitions replace the Queues; a Consumer
    (within a Group) will retrieve messages from
    1 or more Partitions.
    2. Each Consumer from within a Group will be
    assigned different partitions.
    3. Load Balancing (Assigning Partitions to
    Consumer) happens based on the number of
    Consumers in the Group.
    4. If a Consumer drops out, its partitions will be re-
    assigned to another consumer within the group.
    5. The number of Partitions must be greater than or
    equal to the number of Consumers within a Group.
    Pub Sub Implementation
    1. With Multiple Consumer Group (Ex., Order &
    Inventory) the Same Message (Event) is available
    to all the groups subscribed to the same Topic.
    • Order Consumer (2 Partitions each)
    • C1 = P1, P3
    • C2 = P6, P2
    • C3 = P4, P5
    • Inventory Consumer (1-2 Partitions each)
    • I1 = P1, P4
    • I2 = P3, P5
    • I3 = P6
    • I4 = P2


  38. @arafkarsh arafkarsh
    Async Calls : Kafka Solution
    38
    Check Out
    Order Consumer Group Inv Consumer Group
    Order Topic (Total Partitions 6)
    1 2 3 4
    1 2 3
    Kafka
    Producer API
    Kafka
    Consumer API
    Kafka Storage
    Replicated Logs
    Service
    Instances
    Kafka Cluster
    Server 1 Server 2
    P1 P3 P6 P2 P4 P5
    • Each Order Consumer
    has 2 Partitions each
    • C1 = P1, P3
    • C2 = P6, P2
    • C3 = P4, P5
    • Inventory Consumer has
    • I1 = P1, P4
    • I2 = P3, P5
    • I3 = P6
    • I4 = P2
    1. Highly Scalable
    2. Multi Subscriber
    3. Loosely Coupled
    Systems
    4. Data Durability
    (Replication)
    5. Ordering
    Guarantee (Per
    Partition)
    Use Partition Key


  39. @arafkarsh arafkarsh
    Kafka – Fault Tolerance
    Data Durability in Kafka
    39


  40. @arafkarsh arafkarsh
    Kafka Cluster
    40
    [Diagram: A, B, C are 3 servers in the Kafka Cluster.
    Normal state: Leader (A): m1, m2, m3; Follower (B): m1, m2; Follower (C): m1.
    Server A fails, and Server B becomes the new Leader:
    Leader (B): m1, m2; Follower (C): m1, m2.]


  41. @arafkarsh arafkarsh
    Kafka Cluster – Topics & Partitions
    • The partitions of the log are distributed over the servers in the Kafka cluster with each server handling
    data and requests for a share of the partitions.
    Source : https://kafka.apache.org/intro
    [Diagram - Topic ABC:
    Partition 0 (records m1, m2): Leader on Broker 1 (A); Followers on
    Broker 2 (B, with m1, m2) and Broker 4 (with m1).
    Partition 1 (records p1, p2): Leader on Broker 5 (A, with p1, p2); Followers
    on Broker 3 (C, with p1) and Broker 4 (B, C, with p1, p2).]
    • Each server acts as a leader for some of its partitions and a follower for others so load is well balanced
    within the cluster.
    • Each partition has one server which acts as the "leader" and zero or more servers which act as "followers".
    41


  42. @arafkarsh arafkarsh
    Record Commit Process
    [Flow: (1) the Producer sends a message to the Leader (Broker 1, Topic 1);
    (2) the Leader commits the record; (3) the Followers (Broker 2, Broker 3)
    replicate and ack; (4) the Consumer reads the message (offset 777743).]
    • Each partition is replicated across a configurable
    number of servers for fault tolerance.
    • The leader handles all read and write requests for
    the partition while the followers passively replicate
    the leader.
    • If the leader fails, one of the followers will
    automatically become the new leader.
    Data Durability (from Kafka v0.8.0 onwards)
    Producer Configuration:
    acks=0 (steps completed: 1)
    The producer will NOT wait for any acknowledgment from the
    server at all. The record will be immediately added to the socket
    buffer and considered sent. No guarantee can be made that the
    server has received the record in this case, and the retries
    configuration will not take effect (as the client won't generally
    know of any failures). The offset given back for each record will
    always be set to -1.
    acks=1 (steps completed: 1, 2)
    The leader will write the record to its local log but will respond
    without awaiting full acknowledgement from all followers. In this
    case, should the leader fail immediately after acknowledging the
    record but before the followers have replicated it, then the record
    will be lost.
    acks=all / -1 (steps completed: 1, 2, 3)
    The leader will wait for the full set of in-sync replicas to
    acknowledge the record. This guarantees that the record will not
    be lost as long as at least one in-sync replica remains alive. This is
    the strongest available guarantee. acks=all is equivalent to acks=-1.
    Source: https://kafka.apache.org/documentation/#topicconfigs
    42


  43. @arafkarsh arafkarsh
    Message Acknowledgements
    43
    m1
    Follower (B)
    m2 m3 m4
    m1
    Follower (C)
    m2 m3 m4
    m1
    Leader (A)
    m2 m3 m4
    Producer
    acks=0 m5
    Ack
    m1
    Follower (B)
    m2 m3 m4 m5
    m1
    Follower (C)
    m2 m3 m4 m5
    m1
    Leader (A)
    m2 m3 m4 m5
    Producer
    acks=all m5
    Ack
    m1
    Follower (B)
    m2 m3 m4
    m1
    Follower (C)
    m2 m3 m4
    m1
    Leader (A)
    m2 m3 m4 m5
    Producer
    acks=1 m5
    Ack
    The Producer gets the Ack
    before the message even
    reaches the Leader.
    The Producer gets the Ack
    after the Leader commits
    the message.
    The Producer gets the Ack
    after all the ISRs (In-Sync
    Replicas) confirm the
    commit.


  44. @arafkarsh arafkarsh
    Message Acknowledgements
    44
    m1
    Follower (B)
    m2 m3 m4 m5
    m1
    Follower (C)
    m2 m3 m4
    m1
    Leader (A)
    m2 m3 m4 m5
    Producer
    acks=all m5
    m1
    Follower (B)
    m2 m3 m4 m5
    m1
    Follower (C)
    m2 m3 m4
    m1
    Leader (A)
    m2 m3 m4 m5
    Producer
    acks=all
    min.insync.replicas=2
    m5
    Ack
    The Producer gets the Ack
    once the available ISRs >=
    min.insync.replicas.
    The Producer won't get the
    Ack, as not enough ISRs (In-
    Sync Replicas) are available.
    Why is the Ack not coming,
    even though
    min.insync.replicas = 2?
    Because all 3 ISRs (In-Sync Replicas) are alive, the Kafka broker will send the Ack back ONLY after
    receiving the ack from all three ISRs.
    m1
    Follower (B)
    m2 m3 m4 m5
    m1
    Follower (C)
    m2 m3 m4
    m1
    Leader (A)
    m2 m3 m4 m5
    Producer
    acks=all m5
    min.insync.replicas=2
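On the producer side, this durability trade-off is one configuration entry. A minimal sketch (broker address and topic are illustrative; note that min.insync.replicas is a broker/topic-level setting, not a producer setting):

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import java.util.Properties;

    public class DurableProducerDemo {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put(ProducerConfig.ACKS_CONFIG, "all"); // wait for all in-sync replicas
            props.put(ProducerConfig.RETRIES_CONFIG, 3);
            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                producer.send(new ProducerRecord<>("order-topic", "ORD-1001", "m5"),
                        (metadata, exception) -> {
                            if (exception != null)
                                // e.g. NotEnoughReplicasException if ISRs < min.insync.replicas
                                exception.printStackTrace();
                            else
                                System.out.println("Committed at offset " + metadata.offset());
                        });
            }
        }
    }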


  45. @arafkarsh arafkarsh
    Replication
    m1
    m2
    m3
    L(A)
    m1
    m2
    F(B)
    m1
    F(C)
    ISR = (A, B, C)
    Leader A commits Message
    m1. Message m2 & m3 not
    yet committed.
    1
    m1
    m2
    F(C)
    m1
    m2
    L(B)
    m1
    m2
    m3
    L(A)
    ISR = (B,C)
    A fails and B is the new
    Leader. B commits m2
    2
    m1
    m2
    m3
    L(A)
    m1
    m2
    L(B)
    m4
    m5
    m1
    m2
    F(C)
    m4
    m5
    ISR = (B,C)
    B commits new messages
    m4 and m5
    3
    m1
    m2
    L(B)
    m4
    m5
    m1
    m2
    F(C)
    m4
    m5
    m1
    F(A)
    ISR = (A, B,C)
    A comes back, restores to
    last commit and catches
    up to latest messages.
    4
    m1
    m2
    L(B)
    m4
    m5
    m1
    m2
    F(C)
    m4
    m5
    m1
    m2
    F(A)
    m4
    m5
    ISR – In-sync Replica
    • Instead of majority vote, Kafka
    dynamically maintains a set of in-sync
    replicas (ISR) that are caught-up to the
    leader.
    • Only members of this set are eligible for
    election as leader.
    • A write to a Kafka partition is not
    considered committed until all in-sync
    replicas have received the write.
    • This ISR set is persisted to ZooKeeper
    whenever it changes. Because of this, any
    replica in the ISR is eligible to be elected
    leader.
    45


  46. @arafkarsh arafkarsh
    Kafka Data Structures
    Mentoring Session
    • Kafka Record v1
    • Kafka Record v2
    • Kafka Record Batch
    46


  47. @arafkarsh arafkarsh
    Kafka Record / Message Structure
    47
    [v1 record layout - Header: CRC (int32), Magic (int8), Attr (int8),
    Timestamp (int64); Payload: Key (variable length), Value (variable length).]
    v1 (Supported since 0.10.0)
    Field Description
    CRC
    The CRC is the CRC32 of the remainder of the message bytes. This
    is used to check the integrity of the message on the broker and
    consumer.
    Magic Byte This is a version id used to allow backwards-compatible evolution
    of the message binary format. For the v1 format, the value is 1.
    Attributes
    Bits 0-2: Compression Codec
    (0 = No Compression, 1 = Gzip, 2 = Snappy)
    Bit 3: Timestamp Type (0 = Create Time Stamp,
    1 = Log Append Time Stamp)
    Bits 4+: Unused in v1. (The transactional and
    control-batch flags are part of the v2 Record
    Batch attributes, where 1 means transactional /
    control batch.)
    Timestamp
    This is the timestamp of the message. The timestamp type is
    indicated in the attributes. Unit is milliseconds since beginning of
    the epoch (midnight Jan 1, 1970 (UTC)).
    Key The key is an optional message key that was used for partition
    assignment. The key can be null.
    Value
    The value is the actual message contents as an opaque byte array.
    Kafka supports recursive messages in which case this may itself
    contain a message set. The message can be null.
    Source: https://kafka.apache.org/documentation/#messages


  48. @arafkarsh arafkarsh
    Kafka Record Structure
    48
    v2 (Supported since 0.11.0)
    Length (varint) Attr
    int8
    Timestamp Delta (varint)
    Offset Delta (varint) Key Length (varint)
    Key (varint) Value Length (varint)
    Value (varint) Headers (Header Array)
    Header Key Length (varint) Header Key (varint)
    Header Value Length (varint) Header Value (varint)
    Header
    Record
    • In Kafka 0.11, the structure of the 'Message
    Set' and 'Message' were significantly
    changed.
    • A 'Message Set' is now called a 'Record
    Batch', which contains one or more 'Records'
    (and not 'Messages').
    • The recursive nature of the previous versions
    of the message format was eliminated in
    favor of a flat structure.
    • When compression is enabled, the Record
    Batch header remains uncompressed, but the
    Records are compressed together.
    • Multiple fields in the 'Record' are varint
    encoded, which leads to significant space
    savings for larger batches.


  49. @arafkarsh arafkarsh
    Kafka Record Batch Structure
    49
    v2 (Supported since 0.11.0)
    Field Description
    First Offset Denotes the first offset in the Record Batch. The 'Offset Delta' of each
    Record in the batch is computed relative to this First Offset.
    Partition
    Leader Epoch
    This is set by the broker upon receipt of a produce request and is
    used to ensure no loss of data when there are leader changes
    with log truncation.
    Attributes
    The fifth lowest bit indicates whether the Record Batch is part of a
    transaction or not. 0 indicates that the Record Batch is not
    transactional, while 1 indicates that it is. (since 0.11.0.0)
    Last Offset
    Delta
    The offset of the last message in the Record Batch. This is used by
    the broker to ensure correct behavior even when Records within a
    batch are compacted out.
    First
    Timestamp
    The timestamp of the first Record in the batch. The timestamp of
    each Record in the Record Batch is its 'Timestamp Delta' + 'First
    Timestamp'.
    Max
    Timestamp
    The timestamp of the last Record in the batch. This is used by the
    broker to ensure the correct behavior even when Records within
    the batch are compacted out.
    Producer ID This is the broker assigned producer Id received by the 'Init
    Producer Id' request.
    Producer
    Epoch
    This is the broker assigned producer Epoch received by the 'Init
    Producer Id' request.
    First Sequence
    This is the producer assigned sequence number which is used by
    the broker to de-duplicate messages. The sequence number for
    each Record in the Record Batch is its Offset Delta + First
    Sequence.
    First Offset
    int64
    Length
    int32
    Partition Leader Epoch
    int32
    Magic
    int8
    CRC
    int32
    Attr
    int16
    Last offset Delta
    int32
    First Timestamp
    int64
    Max Timestamp
    int64
    Producer
    Epoch
    int16
    Producer ID
    int64
    First Sequence
    int32
    Records (Record Array)


  50. @arafkarsh arafkarsh
    Kafka Operations
    Kafka Setup
    Kafka Producer
    Kafka Consumer
    50


  51. @arafkarsh arafkarsh
    Kafka Quick Setup & Demo
    51
    1. Install the most recent version from the Kafka download page.
    2. Extract the binaries into a /…./Softwares/kafka folder. For the version used here, it's kafka_2.11-1.0.0.
    3. Change your current directory to point to the new folder.
    4. Start the Zookeeper server by executing the command:
    bin/zookeeper-server-start.sh config/zookeeper.properties.
    5. Start the Kafka server by executing the command:
    bin/kafka-server-start.sh config/server.properties.
    6. Create a Test topic that you can use for testing:
    bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic KafkaSigmaTest
    7. Start a simple console Consumer that can consume messages published to a given topic, such as KafkaSigmaTest:
    bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic KafkaSigmaTest --from-beginning.
    8. Start up a simple Producer console that can publish messages to the test topic:
    bin/kafka-console-producer.sh --broker-list localhost:9092 --topic KafkaSigmaTest
    9. Try typing one or two messages into the producer console. Your messages should show in the consumer console.


  52. @arafkarsh arafkarsh
    Kafka Setup
    52


  53. @arafkarsh arafkarsh
    Kafka
    Producer
    53


  54. @arafkarsh arafkarsh
    Kafka
    Consumer
    54


  55. @arafkarsh arafkarsh
    Kafka Consumer
    for REST APIs
    55


  56. @arafkarsh arafkarsh
    Kafka
    Producer
    Template
    56


  57. @arafkarsh arafkarsh
    Kafka Topic
    creator
    57
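The screenshot for this slide is not transcribed; as a stand-in, here is a minimal sketch of programmatic topic creation with Kafka's AdminClient (topic name, partition count, and replication factor are illustrative):

    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.NewTopic;
    import java.util.List;
    import java.util.Properties;

    public class TopicCreatorDemo {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            try (AdminClient admin = AdminClient.create(props)) {
                // 6 partitions (as in the Order Topic example), replication factor 2
                NewTopic orderTopic = new NewTopic("order-topic", 6, (short) 2);
                admin.createTopics(List.of(orderTopic)).all().get(); // block until created
            }
        }
    }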


  58. @arafkarsh arafkarsh
    Kafka
    Partition
    Manager
    58


  59. @arafkarsh arafkarsh
    Kafka
    Producer
    Controllers
    59


  60. @arafkarsh arafkarsh
    ProtoBuf v3.0
    60


  61. @arafkarsh arafkarsh
    ProtoBuf - Java
    61


  62. @arafkarsh arafkarsh
    ProtoBuf – Object to JSON & JSON to Object
    62
    Object
    to JSON
    JSON to
    Object
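A minimal runnable sketch of the conversion above, using protobuf-java-util's JsonFormat. It uses protobuf's built-in Struct type purely for illustration; the deck's example would use a message class generated from a .proto schema instead:

    import com.google.protobuf.Struct;
    import com.google.protobuf.Value;
    import com.google.protobuf.util.JsonFormat;

    public class ProtoJsonDemo {
        public static void main(String[] args) throws Exception {
            // Object to JSON
            Struct order = Struct.newBuilder()
                    .putFields("orderId", Value.newBuilder().setStringValue("ORD-1001").build())
                    .putFields("status", Value.newBuilder().setStringValue("CREATED").build())
                    .build();
            String json = JsonFormat.printer().print(order);
            System.out.println(json);

            // JSON to Object
            Struct.Builder builder = Struct.newBuilder();
            JsonFormat.parser().merge(json, builder);
            System.out.println(builder.build().getFieldsOrThrow("orderId").getStringValue());
        }
    }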


  63. @arafkarsh arafkarsh
    Kafka Performance
    • Kafka / Pulsar / RabbitMQ
    • LinkedIn Kafka Cluster
    • Uber Kafka Cluster
    63


  64. @arafkarsh arafkarsh
    Kafka Use Cases – High Volume Events
    64
    1. Social Media
    2. E-Commerce – especially on a Single Day Sale
    3. Location Sharing – Ride Sharing Apps
    4. Data Gathering
    1. Music Streaming Service
    2. Web Site Analytics


  65. @arafkarsh arafkarsh
    Kafka / RabbitMQ / Pulsar
    65
    Tests                     Kafka             Pulsar            RabbitMQ (Mirrored)
    Peak Throughput (MB/s)    605 MB/s          305 MB/s          38 MB/s
    p99 Latency (ms)          5 ms              25 ms             1 ms*
                              (200 MB/s load)   (200 MB/s load)   (reduced 30 MB/s load)
    Source: https://www.confluent.io/blog/kafka-fastest-messaging-system/


  66. @arafkarsh arafkarsh
    LinkedIn Kafka Cluster
    66
    Brokers: 60
    Partitions: 50K
    Messages/Second: 800K
    MB/Second Inbound: 300
    MB/Second Outbound: 1024
    The tuning looks fairly aggressive, but all of the brokers in that
    cluster have a 90% GC pause time of about 21 ms, and they're
    doing less than 1 young GC per second.


  67. @arafkarsh arafkarsh
    Uber Kafka Cluster
    67
    Topics: 10K+
    Events/Second: 11M
    Data: 1PB+


  68. @arafkarsh arafkarsh
    Tuning Kafka
    68
    1. Session Timeout (session.timeout.ms) and Heartbeat Interval (heartbeat.interval.ms): These
    parameters are related to the health check mechanism of Kafka consumers. If the consumer fails to send
    heartbeats to the Kafka broker within the session timeout, the consumer is considered dead, and a
    rebalance is triggered. To prevent unnecessary rebalances, you can increase the session timeout, but
    this will delay the detection of failed consumers. You can also decrease the heartbeat interval so that the
    consumer sends heartbeats more frequently, but this will increase the network traffic between the
    consumers and the Kafka brokers.
    2. Max Poll Interval (max.poll.interval.ms): This is the maximum amount of time a consumer can go
    without polling for messages before the consumer is considered dead, and a rebalance is triggered. For
    applications with a long processing time, increasing this parameter can prevent unnecessary rebalances.
    3. Max Poll Records (max.poll.records): This is the maximum number of records that a single call to poll()
    will return. Reducing this value can help to balance the load more evenly among the consumers but may
    cause more frequent calls to poll(), which may result in more CPU utilization.
    4. Partition Assignment Strategy (partition.assignment.strategy): This is the class name of the partition
    assignment strategy that the client will use to distribute partition ownership amongst consumer
    instances when group management is used. Depending on your workload and access patterns, different
    strategies might be beneficial. Examples include RoundRobinAssignor, RangeAssignor, and StickyAssignor.
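A consumer configuration sketch pulling these four knobs together (the values are illustrative starting points, not recommendations):

    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import java.util.Properties;

    public class TunedConsumerDemo {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "order-consumer-group");
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                    "org.apache.kafka.common.serialization.StringDeserializer");
            // 1. Failure detection: heartbeat at roughly 1/3 of the session timeout
            props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "45000");
            props.put(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG, "15000");
            // 2. Allow long per-record processing before a rebalance is triggered
            props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, "600000");
            // 3. Smaller batches per poll() to spread load more evenly
            props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "100");
            // 4. Minimize partition movement during rebalances
            props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG,
                    "org.apache.kafka.clients.consumer.StickyAssignor");
            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                // subscribe and poll as usual...
            }
        }
    }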


  69. @arafkarsh arafkarsh
    Partition Assignment Strategies: partition.assignment.strategy
    69
    1. RoundRobinAssignor (org.apache.kafka.clients.consumer.RoundRobinAssignor): This is the simplest partition
    assignment strategy. It assigns each partition to a consumer in a round-robin fashion, disregarding the current
    assignment. If you have 2 consumers, C1 and C2, and 4 partitions, P1, P2, P3, and P4, a possible assignment
    could be C1: P1, P3 and C2: P2, P4. This strategy can be effective when each message takes about the same
    amount of time to process, and you want to distribute the processing load evenly across all consumers.
    2. RangeAssignor (org.apache.kafka.clients.consumer.RangeAssignor): This strategy sorts both the consumers and
    partitions by their ids. It then divides the number of partitions by the total number of consumers to calculate
    the number of partitions to assign to each consumer, and assigns each partition to a consumer in this order.
    If there are 2 consumers, C1 and C2, and 4 partitions, P1, P2, P3, and P4, a possible assignment could be
    C1: P1, P2 and C2: P3, P4. This strategy can be effective when the order of message processing is important,
    because it ensures that all partitions assigned to a specific consumer are processed in order.
    3. StickyAssignor (org.apache.kafka.clients.consumer.StickyAssignor): This strategy tries to assign partitions to
    consumers so as to minimize the movement of partitions across consumers during a rebalance. When a consumer
    leaves or joins the group, the assignment algorithm tries to assign partitions to the remaining or new consumers
    in a way that minimizes the number of partitions that need to change assignment. This is useful when the startup
    cost of processing a partition (like establishing database connections, caching, etc.) is high, and you want to
    minimize the impact of rebalances.


  70. @arafkarsh arafkarsh
    Kafka Summary
    70
    1. Combined Best of Queues and Pub / Sub Model.
    2. Data Durability
    3. Very high throughput (fastest in the Confluent benchmark cited earlier)
    4. Streaming capabilities
    5. Replication


  71. @arafkarsh arafkarsh
    3
    Kafka Connect
    o Architecture
    o Features
    o Source Connectors
    o Sink Connectors
    71


  72. @arafkarsh arafkarsh
    Kafka Connect Architecture
    72
    Kafka Cluster
    Kafka Connect Kafka Connect
    Data Source Data Sink


  73. @arafkarsh arafkarsh
    Kafka Connect Features
    73
    Kafka Connect is a component of Apache Kafka, an open-source stream-processing software
    platform developed by LinkedIn and donated to the Apache Software Foundation. Kafka
    Connect aims to make it easy to move data in and out of Kafka by providing a scalable and
    fault-tolerant framework for connecting Kafka with other systems.
    1. Scalable and Distributed: Kafka Connect can distribute data streaming tasks across
    multiple worker nodes, enabling horizontal scalability.
    2. Fault-Tolerant: It offers fault tolerance, automatically recovering from failures and
    ensuring that data is not lost.
    3. Connector Plugins: A rich ecosystem of pre-built connectors for various databases, key-
    value stores, and APIs exists. You can also create custom connectors.
    4. Converters: It provides the ability to convert data formats on the fly. This is useful when
    integrating systems that use different data formats.
    5. Configuration-based Approach: Configuration settings allow you to specify how data
    should be read/written without writing code for it. This makes it easier to adapt to
    different systems.


  74. @arafkarsh arafkarsh
    Kafka Connect Features
    74
    6. REST API: It provides a REST API for easy integration and management, enabling
    automated setups, monitoring, and scaling.
    7. Schema Management: Kafka Connect can integrate with Kafka’s Schema Registry,
    allowing for structured data to be moved around in a type-safe manner.
    8. Reusability and Extensibility: Kafka Connect is designed to be reusable across multiple
    deployments and extensible to support a broad array of configurations, making it
    powerful for varied use-cases.
    9. Exactly-Once Semantics: Some connectors support exactly-once delivery semantics,
    which ensures that records are neither lost nor seen more than once.
    10. Batch and Stream Processing: Kafka Connect supports both batch and real-time data
    processing. You can set it up to stream data in real-time or schedule it for batch
    processing.
    11. Data Transformation: It supports Single Message Transformation (SMT) to apply basic
    transformations to messages as they flow through the system.


  75. @arafkarsh arafkarsh
    Kafka Connectors
    75
    1. File Source Connector: Used to load data from a file into Kafka.
    2. File Sink Connector: Used to dump data from Kafka to a file.
    3. Amazon S3 Source/Sink Connectors: Connectors for AWS S3 to either load data
    into Kafka from S3 (Source) or write data from Kafka to S3 (Sink).
    4. HDFS Sink Connector: Connects Kafka to Hadoop Distributed File System (HDFS)
    and allows storing data from Kafka to HDFS.
    5. JDBC Source/Sink Connectors: Allows the user to pull data (source) from a
    database into Kafka, or push data (sink) from Kafka into a database. It works with
    any JDBC-compliant database.
    6. Elasticsearch Sink Connector: Used to export data from Kafka to Elasticsearch.
    7. MQTT Source Connector: It's used to import data from MQTT into Kafka.


  76. @arafkarsh arafkarsh
    Kafka Connectors
    76
    8. Debezium Source Connectors: Debezium offers a variety of source connectors for
    different databases to capture and stream database changes into Kafka, e.g., MySQL,
    PostgreSQL, MongoDB, SQL Server, Oracle, etc.
    9. Confluent's Replicator Source Connector: A specialized Connector for replicating data
    across Kafka clusters.
    10. Apache Cassandra Sink Connector: Allows users to write data from Kafka to Apache
    Cassandra.
    11. Google Cloud Pub/Sub Source/Sink Connectors: Enable integration of Kafka with
    Google Cloud Pub/Sub.
    12. Amazon Kinesis Source Connector: Allows users to import data from Amazon Kinesis
    into Kafka.
    13. Apache Pulsar Source/Sink Connectors: Allow integration of Kafka with Apache Pulsar.


  77. @arafkarsh arafkarsh
    Use cases and Workflow
    77
    Common Use-Cases
    • Data Ingestion: Importing data into Kafka from various sources like databases,
    message queues, and file logs.
    • Data Export: Exporting data from Kafka to sink systems like databases and cloud
    storage.
    • Real-time Analytics: Streaming data between various real-time analytics
    applications.
    • Data Aggregation: Aggregating data from multiple sources, transforming it, and then
    sending it to a sink.
    Workflow
    1. Source Connector: Pulls data from an external system and pushes it to Kafka topics.
    2. Sink Connector: Pulls data from Kafka topics and pushes it to external systems.


  78. @arafkarsh arafkarsh
    Source Database Connectors
    78
    • connection.url, connection.user,
    and connection.password are
    used to connect to the MySQL
    database.
    • table.whitelist specifies the table
    from which to source data.
    • mode and
    incrementing.column.name set
    the connector to use an
    incrementing mode, pulling only
    new rows based on the "id"
    column.
    • topic.prefix specifies the prefix
    for the Kafka topic where the
    data will be stored.
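A sketch of registering such a JDBC source connector through the Kafka Connect REST API; the connector class, URL, and credentials are illustrative, while the config keys are the ones described above:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class RegisterJdbcSource {
        public static void main(String[] args) throws Exception {
            String config = """
                {
                  "name": "mysql-orders-source",
                  "config": {
                    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
                    "connection.url": "jdbc:mysql://localhost:3306/shop",
                    "connection.user": "kafka",
                    "connection.password": "secret",
                    "table.whitelist": "orders",
                    "mode": "incrementing",
                    "incrementing.column.name": "id",
                    "topic.prefix": "mysql-"
                  }
                }""";
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("http://localhost:8083/connectors")) // Connect worker REST port
                    .header("Content-Type", "application/json")
                    .POST(HttpRequest.BodyPublishers.ofString(config))
                    .build();
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode() + " " + response.body());
        }
    }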


  79. @arafkarsh arafkarsh
    Source File Connector
    79
    • connector.class specifies the class
    for the file source connector.
    • tasks.max specifies the maximum
    number of tasks to use for this
    connector.
    • file specifies the full path of the
    file to read data from.
    • topic specifies the Kafka topic
    where the data will be produced.


  80. @arafkarsh arafkarsh
    Sink Database Connector
    80
    • connection.url,
    connection.user, and
    connection.password are used
    to connect to the MySQL
    database.
    • topics specifies the Kafka topic
    from which to pull data.
    • auto.create allows the
    connector to create a table in
    the database if it doesn't exist.


  81. @arafkarsh arafkarsh
    Sink S3 Connector
    81
    • connector.class specifies the class for the S3
    sink connector.
    • tasks.max specifies the maximum number of
    tasks to use for this connector.
    • topics specifies the Kafka topic from which to
    pull data.
    • s3.region and s3.bucket.name specify the
    AWS S3 bucket and region.
    • s3.part.size specifies the part size for S3
    multi-part uploads.
    • flush.size specifies the number of records per
    partition to trigger flush.
    • storage.class and format.class specify the
    storage and format classes.
    • partitioner.class specifies how the data will
    be partitioned in the S3 bucket.


  82. @arafkarsh arafkarsh
    Setup
    Source
    Database
    Connector for
    PostgreSQL
    82


  83. @arafkarsh arafkarsh
    83
    Source: https://github.com/arafkarsh/ms-springboot-272-java-8


  84. @arafkarsh arafkarsh
    3
    Kafka Streams
    84


  85. @arafkarsh arafkarsh
    Kafka Streams
    85
    Kafka Streams is a library for building real-time, highly scalable, and fault-tolerant stream
    processing applications using Apache Kafka as the underlying data store. Kafka Streams is
    part of the Kafka project and natively integrates with Kafka clusters, providing a robust way
    to consume, process, and produce data within the Kafka ecosystem.
    Common Use Cases
    • Real-time Analytics: Processing logs, aggregating counts, and updating dashboards in
    real-time.
    • Data Enrichment: Joining disparate data streams to produce more informative output
    streams.
    • ETL: Real-time data transformation and enrichment before loading into data lakes or
    databases.
    • Anomaly Detection: Identifying outliers or unusual patterns in real-time data streams.
    • Complex Event Processing: Correlating across multiple streams to detect complex events
    or patterns.


  86. @arafkarsh arafkarsh
    Kafka Streams Features
    86
    1. Stream DSL: Offers a high-level domain-specific language (DSL) for
    transforming and processing data streams.
    2. Processor API: Provides a lower-level processor API for building
    custom stateful operations beyond what the DSL offers.
    3. Stateful and Stateless Processing: Supports both stateless
    operations like mapping, filtering, and stateful operations like
    joins, windows, and aggregations.
    4. Distributed Processing: Kafka Streams applications can run on
    multiple nodes, and work is distributed among them.
    5. Fault Tolerance: Built-in fault tolerance via state replication and
    fast failover.
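A minimal Kafka Streams sketch combining a stateless filter (point 3) with a stateful per-key count, written against the Streams DSL (topic names and the event format are illustrative):

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.Produced;
    import java.util.Properties;

    public class OrderStreamApp {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "order-stream-app");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
            props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

            StreamsBuilder builder = new StreamsBuilder();
            KStream<String, String> orders = builder.stream("order-topic");
            orders.filter((orderId, event) -> event.contains("ORDER_CREATED")) // stateless
                  .groupByKey()
                  .count()                                                     // stateful
                  .toStream()
                  .to("order-counts", Produced.with(Serdes.String(), Serdes.Long()));

            KafkaStreams streams = new KafkaStreams(builder.build(), props);
            streams.start();
            Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
        }
    }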


  87. @arafkarsh arafkarsh
    Kafka Streams Features
    87
    6. Event-Time Processing: Support for event-time and windowed
    operations, allowing complex temporal queries.
    7. Exactly-Once Semantics: Guarantees exactly-once processing semantics
    to ensure that records are neither lost nor seen more than once.
    8. Integration with Kafka: No separate processing cluster is required; a
    Kafka Streams application can run on any machine connecting to a
    Kafka cluster.
    9. Lightweight: Kafka Streams is a library rather than a framework, giving
    developers more control over deployment and operational aspects.
    10. Interactive Queries: Allows querying of application state on the fly,
    turning your stream processing application into a lightweight
    embedded database.


  88. @arafkarsh arafkarsh
    Kafka
    Stream Setup
    88


  89. @arafkarsh arafkarsh
    Kafka SSE
    For REST
    APIs
    89


  90. @arafkarsh arafkarsh
    Kafka
    Streams
    – Ex.1
    90


  91. @arafkarsh arafkarsh
    Kafka
    Streams
    - Ex. 2
    91


  92. @arafkarsh arafkarsh
    4
    Event Storming
    o Event Storming and CQRS
    o Distributed Transactions
    o Case Studies
    92


  93. @arafkarsh arafkarsh
    Mind Shift : From Object Modeling to Process Modeling
    93
    Developers with Strong Object Modeling experience
    will have trouble making Events a first-class citizen.
    • How do I start Event Sourcing?
    • Where do I Start on Event Sourcing / CQRS?
    The Key is:
    1. App User’s Journey
    2. Business Process
    3. Ubiquitous Language – DDD
    4. Capability Centric Design
    5. Outcome Oriented
    Event Storming is the best tool to define your process and its tasks.
    How do you define your End User’s Journey & Business Process?
    • Think It
    • Build It
    • Run It


  94. @arafkarsh arafkarsh 94
    Event Storming – Concept
    1. Process: Define your Business Processes. E.g., various aspects of Order
    Processing in an E-Commerce Site, Movie Ticket Booking, or a Patient
    visit in a Hospital.
    2. Commands: Define the Commands (End-User interaction with your App)
    to execute the Process. E.g., Add Item to Cart is a Command.
    3. Events: Commands generate the Events to be stored in the Event Store.
    E.g., Item Added Event (in the Shopping Cart). [Write Data]
    4. Event Sourced Aggregate: The current state of the Aggregate is always
    derived from the Event Store. E.g., Shopping Cart, Order, etc. This will be
    part of the Rich Domain Model (Bounded Context) of the Microservice.
    [Write Data]
    5. Projections: Projections focus on the View perspective of the Application.
    As the Read & Write are different Models, you can have different
    Projections based on your View perspective. [Read Data]


  95. @arafkarsh arafkarsh
    Event Sourcing Intro
    95
    Standard CRUD Operations – Customer Profile – Aggregate Root
    [Diagram: the profile record is mutated in place over time T1 -> T2 -> T3 -> T4:
    T1 Profile Created (Title, Address); T2 Title Updated (New Title);
    T3 New Address added; T4 Notes Removed.]
    Event Sourcing and Derived Aggregate Root
    Event Sourcing and Derived Aggregate Root
    Commands
    1. Create Profile
    2. Update Title
    3. Add Address
    4. Delete Notes
    2
    Events
    1. Profile Created Event
    2. Title Updated Event
    3. Address Added Event
    4. Notes Deleted Event
    3
    Profile
    Address
    New Title
    Current State of the
    Customer Profile
    4
    Snapshot
    Event store
    Single Source of Truth
    – Greg Young
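A minimal Java sketch of the idea: the aggregate's current state is always derived by replaying the events from the Event Store (the event and aggregate names are illustrative, not the deck's code):

    import java.util.List;

    sealed interface ProfileEvent permits ProfileCreated, TitleUpdated, AddressAdded, NotesDeleted {}
    record ProfileCreated(String title, String address) implements ProfileEvent {}
    record TitleUpdated(String newTitle) implements ProfileEvent {}
    record AddressAdded(String newAddress) implements ProfileEvent {}
    record NotesDeleted() implements ProfileEvent {}

    class CustomerProfile {
        String title, address, notes;

        // Replay the event stream to derive the current state of the Aggregate
        static CustomerProfile replay(List<ProfileEvent> events) {
            CustomerProfile p = new CustomerProfile();
            for (ProfileEvent e : events) {
                if (e instanceof ProfileCreated c) { p.title = c.title(); p.address = c.address(); }
                else if (e instanceof TitleUpdated t) { p.title = t.newTitle(); }
                else if (e instanceof AddressAdded a) { p.address = a.newAddress(); }
                else if (e instanceof NotesDeleted) { p.notes = null; }
            }
            return p;
        }
    }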


  96. @arafkarsh arafkarsh 96
    Event Sourcing & CQRS
    (Command and Query Responsibility Segregation) Presentation Tier
    Logic Tier
    • Request Validations
    • Commands / Domain Logic
    • Data Persistence
    Database Tier
    The same Schema
    is used for both
    Read and Write.
    Read Model
    Write Model
    Queries
    (DTOs)
    Presentation Tier
    Logic Tier
    • Request Validations
    • Commands / Domain Logic
    • Data Persistence
    Database Tier
    Read & Write Schema
    are different
    Separate Process
    Updates the Read DB
    Write
    Model
    Queries
    (DTOs)
    Read
    Model
    • In traditional data management systems, both
    commands (updates to the data) and queries
    (requests for data) are executed against the
    same entities in a single data repository.
    • CQRS is a pattern segregating the operations
    that read data (Queries) from the operations
    that update data (Commands) using separate
    interfaces.
    • CQRS should only be used on specific portions
    of a system in Bounded Context (in DDD).
    • CQRS should be used along with Event
    Sourcing.
    MSDN – Microsoft https://msdn.microsoft.com/en-us/library/dn568103.aspx |
    Martin Fowler : CQRS – http://martinfowler.com/bliki/CQRS.html
    Axon
    Framework
    For Java
    Java Axon Framework Resource : http://www.axonframework.org
    CQS: Bertrand Meyer
    Greg
    Young


  97. @arafkarsh arafkarsh
    Differences between Commands, Events & Queries
    97
    Behavior / State Change | Includes a Response
    Command: Requested to Happen | Maybe
    Event: Just Happened | Never
    Query: None | Always
    1. Events are Facts and Notifications.
    2. Events wear 2 hats: a Data hat (Fact) and a Notification hat.


  98. @arafkarsh arafkarsh
    Case Study: Shopping Site – Event Sourcing / CQRS
    98
    Catalogue Shopping Cart Order Payment
    • Search Products
    • Add Products
    • Update Products
    Commands
    • Add to Cart
    • Remove Item
    • Update Quantity
    Customer
    • Select Address
    • Select Delivery Mode
    • Process Order
    Events
    • Product Added
    • Product Updated
    • Product Discontinued
    • Item Added
    • Item Removed /
    Discontinued
    • Item Updated
    • Order Initiated
    • Address Selected
    • Delivery Mode Selected
    • Order Created
    • Confirm Order for
    Payment
    • Proceed for Payment
    • Cancel Order
    • Payment Initiated
    • Order Cancelled
    • Order Confirmed
    • OTP Sent
    • Payment Approved
    • Payment Declined
    Microservices
    • Customer
    • Shopping Cart
    • Order
    Customer Journey thru Shopping Process
    2
    Processes 1
    Customers will browse through the Product catalogue to find the products, their ratings and reviews. Once the product is narrowed
    down, the customer will add the product to the shopping cart. Once the customer is ready for the purchase, he/she will start the
    order processing by selecting the Delivery address, delivery method and payment option. Once the payment is done, the customer will
    get the order tracking details.
    ES Aggregate 4
    Core Domain
    Supporting Domain Supporting Domain Supporting Domain Generic Domain
    3
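    A few of the messages above, modeled as simple Java records (illustrative shapes, not from the deck): Commands are named in the imperative, Events in the past tense.

    // Commands: something is requested to happen
    record AddToCart(String cartId, String productId, int quantity) {}
    record ProcessOrder(String orderId) {}

    // Events: something has already happened (immutable facts)
    record ItemAdded(String cartId, String productId, int quantity) {}
    record OrderInitiated(String orderId, double totalValue) {}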


  99. @arafkarsh arafkarsh
    Distributed Tx
    o Saga Design Pattern
    o Features
    o Handling Invariants
    99
    o Forward recovery
    o Local Saga Feature
    o Distributed Saga


  100. @arafkarsh arafkarsh
    Distributed Tx: SAGA Design Pattern instead of 2PC
    100
    Long Lived Transactions (LLTs) hold on to DB resources for relatively long periods of time, significantly delaying
    the termination of shorter and more common transactions.
    Source: SAGAS (1987) Hector Garcia Molina / Kenneth Salem,
    Dept. of Computer Science, Princeton University, NJ, USA
    T1 T2 Tn
    Local Transactions
    C1 C2 Cn-1
    Compensating Transaction
    Divide long–lived, distributed transactions into quick local ones with compensating actions for
    recovery.
    Travel : Flight Ticket & Hotel Booking Example
    BASE (Basic Availability, Soft
    State, Eventual Consistency)
    Room Reserved
    T1
    Room Payment
    T2
    Seat Reserved
    T3
    Ticket Payment
    T4
    Cancelled Room Reservation
    C1
    Cancelled Room Payment
    C2
    Cancelled Ticket Reservation
    C3
    Error
    Error
    Error
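    A minimal sketch of the pattern (illustrative, no framework): each local transaction Ti registers its compensation Ci; if a later step fails, the registered compensations run in reverse order (backward recovery).

    import java.util.ArrayDeque;
    import java.util.Deque;

    class SagaStep {
        final Runnable transaction;   // Ti – quick local transaction
        final Runnable compensation;  // Ci – undoes Ti if a later step fails
        SagaStep(Runnable t, Runnable c) { transaction = t; compensation = c; }
    }

    class Saga {
        void run(SagaStep... steps) {
            Deque<Runnable> compensations = new ArrayDeque<>();
            for (SagaStep step : steps) {
                try {
                    step.transaction.run();
                    compensations.push(step.compensation);
                } catch (RuntimeException e) {
                    compensations.forEach(Runnable::run); // C(n-1)..C1 in reverse
                    throw e;
                }
            }
        }
    }

    For the travel example above, the steps would be wired as, e.g., new Saga().run(new SagaStep(this::reserveRoom, this::cancelRoomReservation), ...) with hypothetical booking methods.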


  101. @arafkarsh arafkarsh
    SAGA Design Pattern Features
    101
    1. Backward Recovery (Rollback)
    T1
    T2
    T3
    T4
    C3
    C2
    C1
    Examples:
    • Backward Recovery: Order Processing, Banking Transactions, Ticket Booking
    • Forward Recovery: Updating individual scores in a Team Game
    2. Forward Recovery with Save Points
    T1
    (sp) T2
    (sp) T3
    (sp)
    • To recover from Hardware Failures, SAGA needs to be persistent.
    • Save Points are available for both Forward and Backward Recovery.
    Source: SAGAS (1987) Hector Garcia Molina / Kenneth Salem, Dept. of Computer Science, Princeton University, NJ, USA
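    A sketch of the save-point idea (loadSavePoint / persistSavePoint are hypothetical helpers; a real Saga would store them durably): after a crash, the Saga resumes from the last completed step and retries forward instead of rolling back.

    import java.util.List;

    class ForwardRecoverySaga {
        int loadSavePoint() { return 0; }      // hypothetical durable read
        void persistSavePoint(int step) { }    // hypothetical durable write

        void run(List<Runnable> steps) {
            for (int i = loadSavePoint(); i < steps.size(); i++) {
                steps.get(i).run();            // a restart re-enters here
                persistSavePoint(i + 1);       // (sp) recorded after each Ti
            }
        }
    }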


  102. @arafkarsh arafkarsh
    Handling Invariants – Monolithic to Micro Services
    102
    In a typical Monolithic App, the
    Customer Credit Limit info and
    the order processing are part of
    the same App. The following is
    typical pseudocode.
    Order Created
    T1
    Order
    Microservice
    Credit Reserved
    T2
    Customer
    Microservice
    In the Microservices world with Event Sourcing, it's a
    distributed environment. The order is cancelled if
    the Credit is NOT available. If the Payment
    Processing fails, then the Credit Reserved is
    cancelled.
    Payment
    Microservice
    Payment Processed
    T3
    Order Cancelled
    C1
    Error
    Credit Cancelled due to
    payment failure
    C2
    Error
    Begin Transaction
        If Order Value <= Available Credit
            Process Order
            Process Payments
    End Transaction
    Monolithic 2 Phase Commit
    https://en.wikipedia.org/wiki/Invariant_(computer_science)
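    A sketch of the same invariant enforced with events across the two services (handler and event names are illustrative): the Customer service reserves credit when an order is created, and the Order service runs the compensating transaction when credit is rejected.

    record OrderCreated(String orderId, double value) {}
    record CreditReserved(String orderId) {}
    record CreditRejected(String orderId) {}
    record OrderCancelled(String orderId) {}

    class CustomerService {
        private double availableCredit = 1_000.0;

        Object on(OrderCreated event) {            // local transaction T2
            if (event.value() <= availableCredit) {
                availableCredit -= event.value();
                return new CreditReserved(event.orderId());
            }
            return new CreditRejected(event.orderId());
        }
    }

    class OrderService {
        OrderCancelled on(CreditRejected event) {  // compensating transaction C1
            return new OrderCancelled(event.orderId());
        }
    }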


  103. @arafkarsh arafkarsh 103
    Use Case : Restaurant – Forward Recovery
    Domain
    The example focuses on the concept of a Restaurant which tracks
    the visit of an individual or group to the Restaurant. When people
    arrive at the Restaurant and take a table, a table is opened. They
    may then order drinks and food. Drinks are served immediately by
    the table staff; however, food must be cooked by a chef. Once the
    chef has prepared the food, it can be served.
    Payment
    Billing
    Dining
    Source: http://cqrs.nu/tutorial/cs/01-design
    Soda Cancelled
    Table Opened
    Juice Ordered
    Soda Ordered
    Appetizer Ordered
    Soup Ordered
    Food Ordered
    Juice Served
    Food Prepared
    Food Served
    Appetizer Served
    Table Closed
    Aggregate Root : Dining Order
    Billed Order
    T1
    Payment CC
    T2
    Payment Cash
    T3
    T1
    (sp) T2
    (sp) T3
    (sp)
    Event Stream
    Aggregate Root : Food Bill
    The transaction doesn't roll back if one payment
    method fails. It moves forward to the
    NEXT one.
    sp
    Network
    Error
    C1 sp
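    A sketch of the forward recovery shown above (method names are illustrative): if one payment method fails, the billed order is not rolled back; the Saga moves forward to the NEXT payment method.

    import java.util.List;
    import java.util.function.Supplier;

    class PaymentSaga {
        // e.g., settle(List.of(this::payByCard, this::payByCash)) with hypothetical methods
        boolean settle(List<Supplier<Boolean>> paymentMethods) {
            for (Supplier<Boolean> method : paymentMethods) {
                try {
                    if (method.get()) return true;     // T2 or T3 succeeded
                } catch (RuntimeException networkError) {
                    // save-point semantics: log and move on to the next method
                }
            }
            return false;                              // all methods exhausted
        }
    }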


  104. @arafkarsh arafkarsh
    Choreography Vs. Orchestration
    104
    1. Choreography: In this model, each local transaction
    publishes domain events that trigger local transactions in
    other services. There is no central coordination. The logic
    is distributed among the participants, which means that
    it must be a collaborative process.
    2. Orchestration: In this model, an orchestrator (object)
    tells the participants what local transactions to execute.
    The logic for the distributed transaction is centralized in
    the orchestrator.
    Comparison of Event Choreography and Orchestration Techniques in Microservices Architecture
    Chaitanya K Rudrabhatla, Executive Director, Solution Architect, Media & Entertainment Domain, LA USA


  105. @arafkarsh arafkarsh
    Choreography: Local SAGA Features
    105
    1. Part of the Micro Services
    2. Local Transactions and Compensation
    Transactions
    3. SAGA State is persisted
    4. All the Local transactions are based on
    Single Phase Commit (1 PC)
    5. Developers need to ensure that
    appropriate compensating
    transactions are raised in the event of
    a failure.
    API Examples

    @StartSaga(name = "HotelBooking")
    public void reserveRoom(…) { }

    @EndSaga(name = "HotelBooking")
    public void payForTickets(…) { }

    @AbortSaga(name = "HotelBooking")
    public void cancelBooking(…) { }

    @CompensationTx()
    public void cancelReservation(…) { }


  106. @arafkarsh arafkarsh
    Choreography: Use Case: Travel Booking
    106
    Travel : Hotel Booking / Car Booking / Flight Booking Example
    Room Reserved
    T1
    Room Payment
    T2
    Car Reserved
    T3
    Car Payment
    T4
    Cancelled Room Reservation
    C1
    Cancelled Room Payment
    C2
    Cancelled Car Reservation
    C3
    Error
    Error
    Error
    Seat Reserved
    T5
    Ticket Payment
    T6
    Cancelled Seat Reservation
    C4
    Error
    Cancelled Ticket Payment
    C5
    Error
    Distributed Tx Done
    Hotel
    Microservice
    Rental
    Microservice
    Travel
    Microservice


  107. @arafkarsh arafkarsh
    Orchestration: SAGA Execution Container
    107
    1. SEC is a separate Process.
    2. Stateless in nature; the Saga state is stored in a
    messaging system (Kafka is a good choice).
    3. SEC process failure MUST not affect Saga Execution;
    on restart, the SEC must resume from where the Saga
    left off.
    4. SEC – No Single Point of Failure (Master/Slave Model).
    5. Distributed SAGA Rules are defined using a DSL.
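    A sketch of writing Saga state transitions to a Kafka-backed Saga Log (the topic name and markers are illustrative), so a restarted SEC can resume from where the Saga left off:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class SagaLog {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

            // Keying by Saga id keeps all transitions of one Saga in order
            try (KafkaProducer<String, String> log = new KafkaProducer<>(props)) {
                String sagaId = "booking-42";  // hypothetical Saga instance id
                log.send(new ProducerRecord<>("saga-log", sagaId, "START SAGA"));
                log.send(new ProducerRecord<>("saga-log", sagaId, "START HOTEL"));
                log.send(new ProducerRecord<>("saga-log", sagaId, "END HOTEL"));
            }
        }
    }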


  108. @arafkarsh arafkarsh
    Orchestration: Use Case : Travel Booking – Distributed Saga (SEC)
    108
    Hotel Booking
    Car Booking
    Flight Booking
    Saga
    Execution
    Container
    Start Saga
    {Booking Request}
    Payment
    End
    Saga
    Start
    Saga
    Start Hotel
    End Hotel
    Start Car
    End Car
    Start Flight
    End Flight
    Start Payment
    End Payment
    Saga Log
    End Saga
    {Booking Confirmed}
    SEC knows the structure of the distributed Saga and,
    for each Request, which Service needs to be called
    and what kind of Recovery mechanism needs to be
    followed.
    SEC can parallelize the calls to multiple services to
    improve the performance.
    The Rollback or Roll forward will depend on the
    business case.
    Source: Distributed Sagas By Caitie McCaffrey, June 6, 2017


  109. @arafkarsh arafkarsh
    Orchestration: Use Case : Travel Booking – Rollback
    109
    Hotel Booking
    Car Booking
    Flight Booking
    Saga
    Execution
    Container
    Start Saga
    {Booking Request}
    Payment
    Start
    Comp
    Saga
    End
    Comp
    Saga
    Start Hotel
    End Hotel
    Start Car
    Abort Car
    Cancel Hotel
    Cancel Flight
    Saga Log
    End Saga
    {Booking Cancelled}
    Kafka is a good choice to
    implement the SEC log.
    SEC is completely STATELESS in
    nature. A Master/Slave model
    can be implemented to avoid a
    Single Point of Failure.
    Source: Distributed Sagas By Caitie McCaffrey, June 6, 2017


  110. @arafkarsh arafkarsh
    Choreography Service: Use Case: Travel Booking
    110
    Travel : Hotel Booking / Car Booking / Flight Booking Example
    Room
    Reserved
    T2
    Room
    Payment
    T3
    Car
    Reserved
    T4
    Car
    Payment
    T5
    Cancelled Room
    Reservation
    C1
    Cancelled Room
    Payment
    C2
    Cancelled Car
    Reservation
    C3
    Error
    Error
    Error
    Seat
    Reserved
    T6
    Ticket
    Payment
    T7
    Cancelled Seat
    Reservation
    C4
    Error
    Cancelled Ticket
    Payment
    C5
    Error
    Distributed Tx Done
    Hotel
    Microservice
    Rental
    Microservice
    Travel
    Microservice
    Reservation
    Microservice
    Reservation
    Completed
    T8
    Reservation
    Incomplete
    T8
    Reservation
    Request
    T1
    S


  111. @arafkarsh arafkarsh
    Scalability Requirement in Cloud
    111
    1. Availability and Partition Tolerance are more important
    than immediate Consistency.
    2. Eventual Consistency is more suitable in a highly
    scalable Cloud Environment.
    3. Two Phase Commit has its limitations from a Scalability
    perspective, and it's a Single Point of Failure.
    4. Scalability examples from eBay, Amazon, Netflix, Uber,
    Airbnb, etc.


  112. @arafkarsh arafkarsh
    Summary:
    112
    1. 2 Phase Commit
    It doesn’t scale well in the cloud environment
    2. SAGA Design Pattern
    Raise compensating events when the local transaction fails.
    3. SAGA Supports Rollbacks & Roll Forwards
    Critical pattern to address distributed transactions.


  113. @arafkarsh arafkarsh
    Case Studies
    o Case Study: Shopping Portal
    o Case Study: Movie Streaming
    o Case Study: Patient Care
    o Case Study: Restaurant Dining
    o Case Study: Movie Ticket Booking
    113


  114. @arafkarsh arafkarsh
    Process
    • Define your Business Processes. Eg. Various aspects of Order
    Processing in an E-Commerce Site, Movie Ticket Booking,
    Patient visit in Hospital.
    1
    Commands • Define the Commands (End-User interaction with your App) to
    execute the Process. Eg. Add Item to Cart is a Command.
    2
    Event Sourced
    Aggregate
    • Current state of the Aggregate is always derived from the Event
    Store. Eg. Shopping Cart, Order, etc. This will be part of the Rich
    Domain Model (Bounded Context) of the Microservice.
    4
    Projections
    • Projections focus on the View perspective of the Application.
    As the Read & Write Models are different, you can have
    different Projections based on your View perspective.
    5
    Write
    Data
    Read
    Data
    Events • Commands generate the Events to be stored in the Event Store.
    Eg. Item Added Event (in the Shopping Cart).
    3
    Event Storming – Concept
    114


  115. @arafkarsh arafkarsh
    Shopping Portal Services – Code Packaging
    115
    Auth Products Cart Order
    Customer
    Domain Layer
    • Models
    • Repo
    • Services
    • Factories
    Adapters
    • Repo
    • Services
    • Web Services
    Domain Layer
    • Models
    • Repo
    • Services
    • Factories
    Adapters
    • Repo
    • Services
    • Web Services
    Domain Layer
    • Models
    • Repo
    • Services
    • Factories
    Adapters
    • Repo
    • Services
    • Web Services
    Packaging Structure – Bounded Context
    • Domain Models (Entities, Value Objects, DTOs)
    • Interfaces / Ports (Repositories, Business Services, Web Services)
    • Implementation / Adapters (Repositories, Business Services, Web Services)
    • Entity Factories
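    As a sketch, the packaging above could map to a package tree like this (package names are illustrative):

    com.shoppingportal.cart
    ├── domain
    │   ├── models     → Entities, Value Objects, DTOs
    │   ├── ports      → Repository / Service / Web Service interfaces
    │   └── factories  → Entity Factories
    └── adapters
        ├── repo       → Repository implementations
        ├── services   → Business Service implementations
        └── web        → Web Service (REST) implementations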


  116. @arafkarsh arafkarsh
    Case Study: Shopping Site – Event Sourcing / CQRS
    116
    Catalogue Shopping Cart Order Payment
    • Search Products
    • Add Products
    • Update Products
    Commands
    • Add to Cart
    • Remove Item
    • Update Quantity
    Customer
    • Select Address
    • Select Delivery Mode
    • Process Order
    Events
    • Product Added
    • Product Updated
    • Product Discontinued
    • Item Added
    • Item Removed /
    Discontinued
    • Item Updated
    • Order Initiated
    • Address Selected
    • Delivery Mode Selected
    • Order Created
    • Confirm Order for
    Payment
    • Proceed for Payment
    • Cancel Order
    • Payment Initiated
    • Order Cancelled
    • Order Confirmed
    • OTP Sent
    • Payment Approved
    • Payment Declined
    Microservices
    • Customer
    • Shopping Cart
    • Order
    Customer Journey thru Shopping Process
    2
    Processes 1
    Customers will browse through the Product catalogue to find the products, their ratings and reviews. Once the product is narrowed
    down, the customer will add the product to the shopping cart. Once the customer is ready for the purchase, he/she will start the
    order processing by selecting the Delivery address, delivery method and payment option. Once the payment is done, the customer will
    get the order tracking details.
    ES Aggregate 4
    Core Domain
    Sub Domain Sub Domain Sub Domain Generic Domain
    3

    View full-size slide

  117. @arafkarsh arafkarsh
    DDD: Use Case
    117
    Order Service
    Models
    Value Object
    • Currency
    • Item Value
    • Order Status
    • Payment Type
    • Record State
    • Audit Log
    Entity
    • Order (Aggregate Root)
    • Order Item
    • Shipping Address
    • Payment
    DTO
    • Order
    • Order Item
    • Shipping Address
    • Payment
    Domain Layer Adapters
    • Order Repository
    • Order Service
    • Order Web Service
    • Order Query Web Service
    • Shipping Address Web Service
    • Payment Web Service
    Adapters consist of the actual
    implementation of the Ports, like
    Database Access, Web Services
    API, etc.
    Converters are used to convert
    an Enum value to a proper
    Integer value in the Database.
    For example, Order Status
    Complete is mapped to the integer
    value 100 in the database.
    Services / Ports
    • Order Repository
    • Order Service
    • Order Web Service
    Utils
    • Order Factory
    • Order Status Converter
    • Record State Converter
    • Order Query Web Service
    • Shipping Address Web Service
    • Payment Web Service
    Shopping Portal
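    The converter idea described above could look like this JPA sketch (codes other than COMPLETE = 100 are illustrative):

    import jakarta.persistence.AttributeConverter;
    import jakarta.persistence.Converter;

    enum OrderStatus { INITIATED, PAID, COMPLETE }

    @Converter(autoApply = true)
    class OrderStatusConverter implements AttributeConverter<OrderStatus, Integer> {
        @Override
        public Integer convertToDatabaseColumn(OrderStatus status) {
            return switch (status) {
                case INITIATED -> 10;   // illustrative code
                case PAID      -> 50;   // illustrative code
                case COMPLETE  -> 100;  // from the slide: Complete -> 100
            };
        }

        @Override
        public OrderStatus convertToEntityAttribute(Integer code) {
            return switch (code) {
                case 10  -> OrderStatus.INITIATED;
                case 50  -> OrderStatus.PAID;
                case 100 -> OrderStatus.COMPLETE;
                default  -> throw new IllegalArgumentException("Unknown code: " + code);
            };
        }
    }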


  118. @arafkarsh arafkarsh
    Shopping Portal Design based on Hexagonal Architecture
    118
    Monolithic App Design using DDD
    Domain Driven Design helps you migrate your monolithic App to Microservices-based Apps


  119. @arafkarsh arafkarsh
    Movie Streaming Services – Code Packaging
    119
    Auth Streaming License Subscription
    Discovery
    Domain Layer
    • Models
    • Repo
    • Services
    • Factories
    Adapters
    • Repo
    • Services
    • Web Services
    Domain Layer
    • Models
    • Repo
    • Services
    • Factories
    Adapters
    • Repo
    • Services
    • Web Services
    Domain Layer
    • Models
    • Repo
    • Services
    • Factories
    Adapters
    • Repo
    • Services
    • Web Services
    Packaging Structure – Bounded Context
    • Domain Models (Entities, Value Objects, DTOs)
    • Interfaces / Ports (Repositories, Business Services, Web Services)
    • Implementation / Adapters (Repositories, Business Services, Web Services)
    • Entity Factories


  120. @arafkarsh arafkarsh
    Case Study: Movie Streaming – Event Sourcing / CQRS
    120
    Subscription Payment
    • Search Movies
    • Add Movies
    • Update Movies
    Commands
    • Request Streaming
    • Start Movie Streaming
    • Pause Movie Streaming
    • Validate Streaming
    License
    • Validate Download
    License
    Events
    • Movie Added
    • Movie Updated
    • Movie Discontinued
    • Streaming Requested
    • Streaming Started
    • Streaming Paused
    • Streaming Done
    • Streaming Request
    Accepted
    • Streaming Request
    Denied
    • Subscribe Monthly
    • Subscribe Annually
    • Monthly
    Subscription Added
    • Yearly Subscription
    Added
    • Payment Approved
    • Payment Declined
    Discovery
    Microservices
    The customer will search for a specific movie or pick a new episode of a TV Series from the watch list. Once the streaming
    request is authorized by the License service, video streaming will start. The customer can pause, fast-forward and restart the
    movie streaming. Movie streaming will be based on the Customer's subscription to the service.
    • Stream List
    • Favorite List
    Customer Journey thru Streaming Movie / TV Show
    The purpose of this example is to demonstrate the concept of ES / CQRS thru Event Storming principles.
    License
    Streaming
    Processes 1
    2 ES Aggregate 4
    Core Domain
    Sub Domain Sub Domain
    Sub Domain Generic Domain
    3


  121. @arafkarsh arafkarsh
    DDD: Use Case
    121
    Subscription Service
    Models
    Value Object
    • Currency
    • Subscription Value
    • Subscription Type
    • Subscription Status
    • Payment Type
    • Record State
    • Audit Log
    Entity
    • Subscription (Aggregate
    Root)
    • Customer
    • Payment
    DTO
    • Subscription
    • Payment
    Domain Layer Adapters
    • Subscription Repository
    • Subscription Service
    • Subscription Web Service
    • Subscription Query Web Service
    • Payment Web Service
    Adapters consist of the actual
    implementation of the Ports, like
    Database Access, Web Services
    API, etc.
    Converters are used to convert
    an Enum value to a proper
    Integer value in the Database.
    For example, a Status such as
    Complete is mapped to the integer
    value 100 in the database.
    Services / Ports
    • Subscription Repository
    • Subscription Service
    • Subscription Web Service
    Utils
    • Subscription Factory
    • Subscription Status Converter
    • Record State Converter
    • Subscription Query Web Service
    • Streaming Web Service
    • Payment Web Service
    Movie Streaming


  122. @arafkarsh arafkarsh
    Case Study: Patient Diagnosis and Treatment
    122
    Payment
    • Register
    Patient
    • Search Doctor
    Commands
    • Add Patient
    Info
    • Add Details
    • Add BP
    • Add Diagnosis
    • Add
    Prescription
    Events
    • Doctor
    Scheduled
    • Patient Added
    • Patient Info
    Added
    • Details Added
    • BP Added
    • Diagnosis
    Added
    • Prescription
    Added
    • Add
    Medicine
    • Add Bill
    • Medicine
    Added
    • Bill Prepared
    • Payment
    Approved
    • Payment Declined
    • Cash Paid
    The patient registers and takes an appointment with the doctor. Patient details and history are recorded. The doctor
    does the diagnosis and creates the prescription. The patient buys the medicine from the Pharmacy. If the patient
    needs to be admitted, then a ward appointment is scheduled and the patient is admitted to the ward. Once the treatment is
    over, the patient is discharged from the Hospital.
    Microservices
    • Diagnosis
    • Prescription
    • Hospital Bill
    • Discharge Summary
    Patient Journey thru Treatment Process
    Registration
    • Add Doctor
    • Add
    Appointment
    • Add Patient File
    • Doctor Added
    • Appointment
    Added
    • Patient File Added
    ES Aggregate
    2 4
    Processes 1
    Doctors Diagnosis Pharmacy
    Ward
    Patient
    • Add Checkup
    • Add Treatment
    • Add Food
    • Add Discharge
    • Checkup Added
    • Treatment
    Added
    • Food Added
    • Discharge Added
    Core Domain Sub Domain Sub Domain
    Sub Domain
    Sub Domain Generic Domain
    Sub Domain
    3


  123. @arafkarsh arafkarsh
    Case Study: Movie Booking – Event Sourcing / CQRS
    123
    Order Payment
    • Search Movies
    • Add Movies
    • Update Movies
    Commands
    • Select Movie
    • Select Theatre / Show
    • Select Seats
    • Process Order
    • Select Food
    • Remove Food
    • Skip Food
    • Process Order
    Events
    • Movie Added
    • Movie Updated
    • Movie Discontinued
    • Movie Added
    • Theatre / Show Added
    • Seats Added
    • Order Initiated
    • Popcorn Added
    • Drinks Added
    • Popcorn Removed
    • Order Finalized
    • Proceed for Payment
    • Confirm Order for
    Payment
    • Cancel Order
    • Payment Initiated
    • Order Cancelled
    • Order Confirmed
    • OTP Sent
    • Payment Approved
    • Payment Declined
    Movies Theatres Food
    Microservices
    Customers will search for the Movies after selecting the City. Once the movie is selected, they will identify a theatre and
    check the show times and then select the seats. Once the seats are selected, a choice is given to add Snacks; after
    that, the Customer will proceed to payment. Once the payment is done, the tickets are confirmed.
    • Theatre
    • Show
    • Order
    Customer Journey thru booking Movie Ticket
    The purpose of this example is to demonstrate the concept of ES / CQRS thru Event Storming principles.
    Processes 1
    2 ES Aggregate 4
    Core Domain
    Sub Domain Sub Domain
    Sub Domain Generic Domain
    3


  124. @arafkarsh arafkarsh
    Case Study: Restaurant Dining – Event Sourcing and CQRS
    124
    Order Payment
    • Add Drinks
    • Add Food
    • Update Food
    Commands • Open Table
    • Add Juice
    • Add Soda
    • Add Appetizer 1
    • Add Appetizer 2
    • Serve Drinks
    • Prepare Food
    • Serve Food
    Events
    • Drinks Added
    • Food Added
    • Food Updated
    • Food Discontinued
    • Table Opened
    • Juice Added
    • Soda Added
    • Appetizer 1 Added
    • Appetizer 2 Added
    • Juice Served
    • Soda Served
    • Appetizer Served
    • Food Prepared
    • Food Served
    • Prepare Bill
    • Process
    Payment
    • Bill Prepared
    • Payment Processed
    • Payment Approved
    • Payment Declined
    • Cash Paid
    When people arrive at the Restaurant and take a table, a Table is opened. They may then order drinks and
    food. Drinks are served immediately by the table staff; however, food must be cooked by a chef. Once the
    chef has prepared the food, it can be served. The Bill is prepared when the Table is closed.
    Microservices
    • Dining Order
    • Billable Order
    Customer Journey thru the Dining Process
    Food Menu Kitchen
    Dining
    • Remove Soda
    • Add Food 1
    • Add Food 2
    • Place Order
    • Close Table
    • Soda Removed
    • Food 1 Added
    • Food 2 Added
    • Order Placed
    • Table Closed
    ES Aggregate
    2 4
    Processes 1
    Core Domain
    Sub Domain Sub Domain
    Sub Domain Generic Domain
    3


  125. @arafkarsh arafkarsh
    Summary: User Journey / CCD / DDD / Event Sourcing & CQRS
    125
    User Journey
    Bounded
    Context
    1
    Bounded
    Context
    2
    Bounded
    Context
    3
    1. Bounded Contexts
    2. Entity
    3. Value Objects
    4. Aggregate Roots
    5. Domain Events
    6. Repository
    7. Service
    8. Factory
    Process
    1
    Commands
    2
    Projections
    5
    ES Aggregate
    4
    Events
    3
    Event Sourcing & CQRS
    Domain Expert Analyst Architect QA
    Design Docs Test Cases Code
    Developers
    Domain Driven Design
    Ubiquitous Language
    Core
    Domain
    Sub
    Domain
    Generic
    Domain
    Vertically sliced Product Team
    FE
    BE
    DB
    Business
    Capability 1
    QA Team
    PO
    FE
    BE
    DB
    Business
    Capability 2
    QA Team
    PO
    FE
    BE
    DB
    Business
    Capability n
    QA Team
    PO


  126. @arafkarsh arafkarsh 126
    100s Microservices
    1,000s Releases / Day
    10,000s Virtual Machines
    100K+ User actions / Second
    81 M Customers Globally
    1 B Time series Metrics
    10 B Hours of video streaming
    every quarter
    Source: NetFlix: : https://www.youtube.com/watch?v=UTKIT6STSVM
    10s OPs Engineers
    0 NOC
    0 Data Centers
    So what does NetFlix think about DevOps?
    No DevOps
    Don't do a lot of Processes / Procedures
    Freedom for Developers & Accountability
    Trust people you Hire
    No Controls / Silos / Walls / Fences
    Ownership – You Build it, You Run it.


  127. @arafkarsh arafkarsh 127
    Design Patterns are
    solutions to general
    problems that
    software developers
    face during software
    development.
    Design Patterns


  128. @arafkarsh arafkarsh 128
    DREAM | AUTOMATE | EMPOWER
    Araf Karsh Hamid :
    India: +91.999.545.8627
    http://www.slideshare.net/arafkarsh
    https://www.linkedin.com/in/arafkarsh/
    https://www.youtube.com/user/arafkarsh/playlists
    http://www.arafkarsh.com/
    @arafkarsh
    arafkarsh


  129. @arafkarsh arafkarsh 129
    Source Code: https://github.com/MetaArivu Web Site: https://metarivu.com/ https://pyxida.cloud/


  130. @arafkarsh arafkarsh 130
    http://www.slideshare.net/arafkarsh


  131. @arafkarsh arafkarsh
    References
    131
    1. July 15, 2015 – Agile is Dead : GoTo 2015 By Dave Thomas
    2. Apr 7, 2016 - Agile Project Management with Kanban | Eric Brechner | Talks at Google
    3. Sep 27, 2017 - Scrum vs Kanban - Two Agile Teams Go Head-to-Head
    4. Feb 17, 2019 - Lean vs Agile vs Design Thinking
    5. Dec 17, 2020 - Scrum vs Kanban | Differences & Similarities Between Scrum & Kanban
    6. Feb 24, 2021 - Agile Methodology Tutorial for Beginners | Jira Tutorial | Agile Methodology Explained.
    Agile Methodologies

    View full-size slide

  132. @arafkarsh arafkarsh
    References
    132
    1. Vmware: What is Cloud Architecture?
    2. Redhat: What is Cloud Architecture?
    3. Cloud Computing Architecture
    4. Cloud Adoption Essentials:
    5. Google: Hybrid and Multi Cloud
    6. IBM: Hybrid Cloud Architecture Intro
    7. IBM: Hybrid Cloud Architecture: Part 1
    8. IBM: Hybrid Cloud Architecture: Part 2
    9. Cloud Computing Basics: IaaS, PaaS, SaaS
    1. IBM: IaaS Explained
    2. IBM: PaaS Explained
    3. IBM: SaaS Explained
    4. IBM: FaaS Explained
    5. IBM: What is Hypervisor?
    Cloud Architecture


  133. @arafkarsh arafkarsh
    References
    133
    Microservices
    1. Microservices Definition by Martin Fowler
    2. When to use Microservices By Martin Fowler
    3. GoTo: Sep 3, 2020: When to use Microservices By Martin Fowler
    4. GoTo: Feb 26, 2020: Monolith Decomposition Pattern
    5. Thought Works: Microservices in a Nutshell
    6. Microservices Prerequisites
    7. What do you mean by Event Driven?
    8. Understanding Event Driven Design Patterns for Microservices


  134. @arafkarsh arafkarsh
    References – Microservices – Videos
    134
    1. Martin Fowler – Micro Services : https://www.youtube.com/watch?v=2yko4TbC8cI&feature=youtu.be&t=15m53s
    2. GOTO 2016 – Microservices at NetFlix Scale: Principles, Tradeoffs & Lessons Learned. By R Meshenberg
    3. Mastering Chaos – A NetFlix Guide to Microservices. By Josh Evans
    4. GOTO 2015 – Challenges Implementing Micro Services By Fred George
    5. GOTO 2016 – From Monolith to Microservices at Zalando. By Rodrigue Scaefer
    6. GOTO 2015 – Microservices @ Spotify. By Kevin Goldsmith
    7. Modelling Microservices @ Spotify : https://www.youtube.com/watch?v=7XDA044tl8k
    8. GOTO 2015 – DDD & Microservices: At last, Some Boundaries By Eric Evans
    9. GOTO 2016 – What I wish I had known before Scaling Uber to 1000 Services. By Matt Ranney
    10. DDD Europe – Tackling Complexity in the Heart of Software By Eric Evans, April 11, 2016
    11. AWS re:Invent 2016 – From Monolithic to Microservices: Evolving Architecture Patterns. By Emerson L, Gilt D. Chiles
    12. AWS 2017 – An overview of designing Microservices based Applications on AWS. By Peter Dalbhanjan
    13. GOTO Jun, 2017 – Effective Microservices in a Data Centric World. By Randy Shoup.
    14. GOTO July, 2017 – The Seven (more) Deadly Sins of Microservices. By Daniel Bryant
    15. Sept, 2017 – Airbnb, From Monolith to Microservices: How to scale your Architecture. By Melanie Cubula
    16. GOTO Sept, 2017 – Rethinking Microservices with Stateful Streams. By Ben Stopford.
    17. GOTO 2017 – Microservices without Servers. By Glynn Bird.


  135. @arafkarsh arafkarsh
    References
    135
    Domain Driven Design
    1. Oct 27, 2012 What I have learned about DDD Since the book. By Eric Evans
    2. Mar 19, 2013 Domain Driven Design By Eric Evans
    3. Jun 02, 2015 Applied DDD in Java EE 7 and Open Source World
    4. Aug 23, 2016 Domain Driven Design the Good Parts By Jimmy Bogard
    5. Sep 22, 2016 GOTO 2015 – DDD & REST Domain Driven API’s for the Web. By Oliver Gierke
    6. Jan 24, 2017 Spring Developer – Developing Micro Services with Aggregates. By Chris Richardson
    7. May 17. 2017 DEVOXX – The Art of Discovering Bounded Contexts. By Nick Tune
    8. Dec 21, 2019 What is DDD - Eric Evans - DDD Europe 2019. By Eric Evans
    9. Oct 2, 2020 - Bounded Contexts - Eric Evans - DDD Europe 2020. By. Eric Evans
    10. Oct 2, 2020 - DDD By Example - Paul Rayner - DDD Europe 2020. By Paul Rayner


  136. @arafkarsh arafkarsh
    References
    136
    Event Sourcing and CQRS
    1. IBM: Event Driven Architecture – Mar 21, 2021
    2. Martin Fowler: Event Driven Architecture – GOTO 2017
    3. Greg Young: A Decade of DDD, Event Sourcing & CQRS – April 11, 2016
    4. Nov 13, 2014 GOTO 2014 – Event Sourcing. By Greg Young
    5. Mar 22, 2016 Building Micro Services with Event Sourcing and CQRS
    6. Apr 15, 2016 YOW! Nights – Event Sourcing. By Martin Fowler
    7. May 08, 2017 When Micro Services Meet Event Sourcing. By Vinicius Gomes


  137. @arafkarsh arafkarsh
    References
    137
    Kafka
    1. Understanding Kafka
    2. Understanding RabbitMQ
    3. IBM: Apache Kafka – Sept 18, 2020
    4. Confluent: Apache Kafka Fundamentals – April 25, 2020
    5. Confluent: How Kafka Works – Aug 25, 2020
    6. Confluent: How to integrate Kafka into your environment – Aug 25, 2020
    7. Kafka Streams – Sept 4, 2021
    8. Kafka: Processing Streaming Data with KSQL – Jul 16, 2018
    9. Kafka: Processing Streaming Data with KSQL – Nov 28, 2019


  138. @arafkarsh arafkarsh
    References
    138
    Databases: Big Data / Cloud Databases
    1. Google: How to Choose the right database?
    2. AWS: Choosing the right Database
    3. IBM: NoSQL Vs. SQL
    4. A Guide to NoSQL Databases
    5. How does NoSQL Databases Work?
    6. What is Better? SQL or NoSQL?
    7. What is DBaaS?
    8. NoSQL Concepts
    9. Key Value Databases
    10. Document Databases
    11. Jun 29, 2012 – Google I/O 2012 - SQL vs NoSQL: Battle of the Backends
    12. Feb 19, 2013 - Introduction to NoSQL • Martin Fowler • GOTO 2012
    13. Jul 25, 2018 - SQL vs NoSQL or MySQL vs MongoDB
    14. Oct 30, 2020 - Column vs Row Oriented Databases Explained
    15. Dec 9, 2020 - How do NoSQL databases work? Simply Explained!
    1. Graph Databases
    2. Column Databases
    3. Row Vs. Column Oriented Databases
    4. Database Indexing Explained
    5. MongoDB Indexing
    6. AWS: DynamoDB Global Indexing
    7. AWS: DynamoDB Local Indexing
    8. Google Cloud Spanner
    9. AWS: DynamoDB Design Patterns
    10. Cloud Provider Database Comparisons
    11. CockroachDB: When to use a Cloud DB?


  139. @arafkarsh arafkarsh
    References
    139
    Docker / Kubernetes / Istio
    1. IBM: Virtual Machines and Containers
    2. IBM: What is a Hypervisor?
    3. IBM: Docker Vs. Kubernetes
    4. IBM: Containerization Explained
    5. IBM: Kubernetes Explained
    6. IBM: Kubernetes Ingress in 5 Minutes
    7. Microsoft: How Service Mesh works in Kubernetes
    8. IBM: Istio Service Mesh Explained
    9. IBM: Kubernetes and OpenShift
    10. IBM: Kubernetes Operators
    11. 10 Consideration for Kubernetes Deployments
    Istio – Metrics
    1. Istio – Metrics
    2. Monitoring Istio Mesh with Grafana
    3. Visualize your Istio Service Mesh
    4. Security and Monitoring with Istio
    5. Observing Services using Prometheus, Grafana, Kiali
    6. Istio Cookbook: Kiali Recipe
    7. Kubernetes: Open Telemetry
    8. Open Telemetry
    9. How Prometheus works
    10. IBM: Observability vs. Monitoring


  140. @arafkarsh arafkarsh
    References
    140
    1. Feb 6, 2020 – An introduction to TDD
    2. Aug 14, 2019 – Component Software Testing
    3. May 30, 2020 – What is Component Testing?
    4. Apr 23, 2013 – Component Test By Martin Fowler
    5. Jan 12, 2011 – Contract Testing By Martin Fowler
    6. Jan 16, 2018 – Integration Testing By Martin Fowler
    7. Testing Strategies in Microservices Architecture
    8. Practical Test Pyramid By Ham Vocke
    Testing – TDD / BDD

    View full-size slide

  141. @arafkarsh arafkarsh 141
    1. Simoorg : LinkedIn's own failure inducer framework. It was designed to be easy to extend, and
    most of the important components are pluggable.
    2. Pumba : A chaos testing and network emulation tool for Docker.
    3. Chaos Lemur : Self-hostable application to randomly destroy virtual machines in a BOSH-
    managed environment, as an aid to resilience testing of high-availability systems.
    4. Chaos Lambda : Randomly terminate AWS ASG instances during business hours.
    5. Blockade : Docker-based utility for testing network failures and partitions in distributed
    applications.
    6. Chaos-http-proxy : Introduces failures into HTTP requests via a proxy server.
    7. Monkey-ops : Monkey-Ops is a simple service implemented in Go, which is deployed into an
    OpenShift V3.X and generates some chaos within it. Monkey-Ops seeks some OpenShift
    components like Pods or Deployment Configs and randomly terminates them.
    8. Chaos Dingo : Chaos Dingo currently supports performing operations on Azure VMs and VMSS
    deployed to an Azure Resource Manager-based resource group.
    9. Tugbot : Testing in Production (TiP) framework for Docker.
    Testing tools


  142. @arafkarsh arafkarsh
    References
    142
    CI / CD
    1. What is Continuous Integration?
    2. What is Continuous Delivery?
    3. CI / CD Pipeline
    4. What is CI / CD Pipeline?
    5. CI / CD Explained
    6. CI / CD Pipeline using Java Example Part 1
    7. CI / CD Pipeline using Ansible Part 2
    8. Declarative Pipeline vs Scripted Pipeline
    9. Complete Jenkins Pipeline Tutorial
    10. Common Pipeline Mistakes
    11. CI / CD for a Docker Application


  143. @arafkarsh arafkarsh
    References
    143
    DevOps
    1. IBM: What is DevOps?
    2. IBM: Cloud Native DevOps Explained
    3. IBM: Application Transformation
    4. IBM: Virtualization Explained
    5. What is DevOps? Easy Way
    6. DevOps?! How to become a DevOps Engineer???
    7. Amazon: https://www.youtube.com/watch?v=mBU3AJ3j1rg
    8. NetFlix: https://www.youtube.com/watch?v=UTKIT6STSVM
    9. DevOps and SRE: https://www.youtube.com/watch?v=uTEL8Ff1Zvk
    10. SLI, SLO, SLA : https://www.youtube.com/watch?v=tEylFyxbDLE
    11. DevOps and SRE : Risks and Budgets : https://www.youtube.com/watch?v=y2ILKr8kCJU
    12. SRE @ Google: https://www.youtube.com/watch?v=d2wn_E1jxn4


  144. @arafkarsh arafkarsh
    References
    144
    1. Lewis, James, and Martin Fowler. “Microservices: A Definition of This New Architectural Term”, March 25, 2014.
    2. Miller, Matt. “Innovate or Die: The Rise of Microservices”. The Wall Street Journal, October 5, 2015.
    3. Newman, Sam. Building Microservices. O’Reilly Media, 2015.
    4. Alagarasan, Vijay. “Seven Microservices Anti-patterns”, August 24, 2015.
    5. Cockcroft, Adrian. “State of the Art in Microservices”, December 4, 2014.
    6. Fowler, Martin. “Microservice Prerequisites”, August 28, 2014.
    7. Fowler, Martin. “Microservice Tradeoffs”, July 1, 2015.
    8. Humble, Jez. “Four Principles of Low-Risk Software Release”, February 16, 2012.
    9. Zuul Edge Server, Ketan Gote, May 22, 2017
    10. Ribbon, Hysterix using Spring Feign, Ketan Gote, May 22, 2017
    11. Eureka Server with Spring Cloud, Ketan Gote, May 22, 2017
    12. Apache Kafka, A Distributed Streaming Platform, Ketan Gote, May 20, 2017
    13. Functional Reactive Programming, Araf Karsh Hamid, August 7, 2016
    14. Enterprise Software Architectures, Araf Karsh Hamid, July 30, 2016
    15. Docker and Linux Containers, Araf Karsh Hamid, April 28, 2015


  145. @arafkarsh arafkarsh
    References
    145
    16. MSDN – Microsoft https://msdn.microsoft.com/en-us/library/dn568103.aspx
    17. Martin Fowler : CQRS – http://martinfowler.com/bliki/CQRS.html
    18. Udi Dahan : CQRS – http://www.udidahan.com/2009/12/09/clarified-cqrs/
    19. Greg Young : CQRS - https://www.youtube.com/watch?v=JHGkaShoyNs
    20. Bertrand Meyer – CQS - http://en.wikipedia.org/wiki/Bertrand_Meyer
    21. CQS : http://en.wikipedia.org/wiki/Command–query_separation
    22. CAP Theorem : http://en.wikipedia.org/wiki/CAP_theorem
    23. CAP Theorem : http://www.julianbrowne.com/article/viewer/brewers-cap-theorem
    24. CAP 12 years how the rules have changed
    25. EBay Scalability Best Practices : http://www.infoq.com/articles/ebay-scalability-best-practices
    26. Pat Helland (Amazon) : Life beyond distributed transactions
    27. Stanford University: Rx https://www.youtube.com/watch?v=y9xudo3C1Cw
    28. Princeton University: SAGAS (1987) Hector Garcia Molina / Kenneth Salem
    29. Rx Observable : https://dzone.com/articles/using-rx-java-observable

    View full-size slide