Using Kafka to Discover Events Hidden in your Database

Anna McDonald
September 30, 2019

Is your database the center of everything? Do you have multiple applications, batch jobs, and direct consumers hitting your database? To cover every place an event occurs, would you need to update multiple legacy services? Moving to event-driven development can seem overwhelming when faced with this scenario; that’s where Kafka comes in!

Using Derivative Events, Change Data Capture (CDC) and Kafka Streams gives you access to every single event in your space, if you only know where to look.

In this talk we’ll discuss:
• The properties in a CDC message used in identifying events
• Techniques for defining high-level events and how to sniff them out using Kafka Streams
• How to define predicates for CDC events
• Recommendations on handling events that require more than one table

We’ll wrap up by discussing how to structure your event schema to handle a mix of data-derived and more traditionally produced events.

Transcript

  1. 2.

    ARE YOU IN THE RIGHT TALK?
    Does your environment look like...
    • Super Fancy
    • Half Fancy
    • help.
    @jbfletch_
  2. 6.

    Event Sourcing Fundamental Test: Do you have the ability to blow away the current application state and rebuild from an event store?
  3. 7.

    A Tale of Two Types of Events: Primary and Derivative
    CC image courtesy of Francoise on Flickr. CC image courtesy of OtterBox on Flickr.
  4. 8.

    Primary Events:
    • Services are Easy to Deploy
    • One Service per Function, No duplication
    • Code changes can be modeled as: {Easy to Make, Easy to Test, Easy to push}
    Derivative Events:
    • Requires 4 live chickens and a complicated interpretive dance to be granted safe passage to deploy
    • Shadow Clone Jutsu!
    • Code changes can be modeled as: {Complicated to Make, Brutal to Test, Bureaucratic to push}
  7. 13.

    “To find events one must know what to look for” (me talking to myself, on August 27th, 2019)
    • Know your System
    • Know the Events that matter
    • Capture Broad Categories of Events
  8. 14.
  9. 15.

    Observe
    1. Find a Durable Event Source: System Logs, DB, User Logs
    2. Define an Event Profile
  11. 19.

    CDC Overview (MySQL, Debezium)
    • inserts: op_type: I, after struct
    • updates: op_type: U, before, after struct
    • deletes: op_type: D, before struct, plus optional tombstone message
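
The three message shapes on this slide can be modeled as a small value type. What follows is a minimal sketch in plain Java with no Kafka or Debezium dependency. The `CdcMessage` record and its field names are hypothetical and follow the slide's `op_type` / `before` / `after` wording, not any connector's actual wire format (Debezium's own envelope, for instance, names the field `op` and uses values such as `c`, `u`, `d`). The optional tombstone message is not modeled here.

```java
import java.util.Map;

// Hypothetical in-memory model of one CDC message, using the slide's
// vocabulary (op_type, before, after). Not a real connector's wire format.
record CdcMessage(String opType, Map<String, Object> before, Map<String, Object> after) {

    // Inserts carry only the new row image.
    static CdcMessage insert(Map<String, Object> after) {
        return new CdcMessage("I", null, after);
    }

    // Updates carry both the old and the new row image.
    static CdcMessage update(Map<String, Object> before, Map<String, Object> after) {
        return new CdcMessage("U", before, after);
    }

    // Deletes carry only the old row image.
    static CdcMessage delete(Map<String, Object> before) {
        return new CdcMessage("D", before, null);
    }

    boolean isInsert() { return "I".equals(opType); }
    boolean isUpdate() { return "U".equals(opType); }
    boolean isDelete() { return "D".equals(opType); }
}
```

Keeping `before` and `after` as nullable row images mirrors the slide: which of the two is present depends entirely on the operation type.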
  13. 21.

    Becoming an event profiler
    • Trigger the event in the source system
    • Review the CDC Messages generated during the event
    • Find the event fingerprint that signifies completeness
  18. 26.

    Example Profiles signifying completeness
    Simple order placed event:
    • Op_type of insert, after has non-null order_number
    • Op_type of update, before has a null order_number, after has a non-null order_number
    • Op_type of update, before has a null crazy_random_field, after has a non-null crazy_random_field and a non-null order_number
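
Each of the example profiles above can be written as a predicate over a CDC message. This is a minimal sketch in plain Java, not Kafka Streams code. The `Change` record, the `OrderPlacedProfile` class and the constant names are hypothetical, and a missing field is treated the same as a null one, which is an assumption.

```java
import java.util.Map;
import java.util.function.Predicate;

class OrderPlacedProfile {
    // Hypothetical CDC message shape, restated so this block stands alone.
    record Change(String opType, Map<String, Object> before, Map<String, Object> after) {}

    // True when the row image exists and holds a non-null value for the field.
    // Assumption: an absent field counts as null.
    static boolean hasValue(Map<String, Object> row, String field) {
        return row != null && row.get(field) != null;
    }

    // Profile 1: insert whose after image already has an order_number.
    static final Predicate<Change> INSERT_WITH_ORDER_NUMBER =
        c -> "I".equals(c.opType()) && hasValue(c.after(), "order_number");

    // Profile 2: update that moves order_number from null to non-null.
    static final Predicate<Change> ORDER_NUMBER_ASSIGNED =
        c -> "U".equals(c.opType())
            && !hasValue(c.before(), "order_number")
            && hasValue(c.after(), "order_number");

    // Profile 3: update that fills crazy_random_field while order_number is set.
    static final Predicate<Change> RANDOM_FIELD_ASSIGNED =
        c -> "U".equals(c.opType())
            && !hasValue(c.before(), "crazy_random_field")
            && hasValue(c.after(), "crazy_random_field")
            && hasValue(c.after(), "order_number");
}
```

In a Kafka Streams topology, a predicate of this shape is the kind of check one would pass to a `filter` step. Which of the three profiles applies depends on how the source system actually writes its rows.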
  19. 30.

    Plan of attack
    We need both of these things to be true before we fire the event:
    • Order # 42
    • Total # Items 3
    Order Event Aggregate
  20. 32.

    What is the minimum information we need to know to be able to determine event completeness for items?
    Total Number of items per order
  21. 34.

    Basic Central Event Service Setup
    Filter using the event profile: op_type = “I” to create:
    1. KTable<OrderId, Order> orderTableKeyOrderId <- Orders that match our order created event profile
    2. KStream<ItemId,Items> itemsKeyedByItemIdStream <- Items that match our order created event profile
    3. KTable<OrderId,NumberOfItems> totalNumberofItemsTable <- Number of items in each order
  22. 35.

    groupBy + aggregate + join + filter
    KTable<OrderId, ArrayList<Items>> preItemsTable = itemsKeyedByItemIdStream(ItemId,Items)
        .groupBy(ORDER_ID)                                           <- (1) OrderId: 42
        .aggregate(ArrayList::new, add(Items), return null for TS)   <- (2)
        .join(totalNumberofItemsTable)                               <- (3) itemCount: 3
    Optional join allows for the propagation of total order items to each item
  25. 38.

    groupBy + aggregate + join + filter
    KTable<OrderId, ArrayList<Items>> fullItemsTable = preItemsTable
        .filter((k,v) -> v.size() == v.get(0).get("itemCount").asInt())   <- (4) This filter will block until all 3 item lines are in the array
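
The groupBy + aggregate + join + filter chain can be imitated with two in-memory maps to show the completeness check in isolation. This is a sketch in plain Java, not the Kafka Streams topology from the slides. The `ItemCompleteness` class, the `Item` record and the method names are hypothetical, and the maps stand in for the state stores a real topology would use.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

class ItemCompleteness {
    // Hypothetical item row: its own id, the order it belongs to, and a SKU.
    record Item(int itemId, int orderId, String sku) {}

    // Stand-in for totalNumberofItemsTable: orderId -> expected item count.
    private final Map<Integer, Integer> expectedCounts = new HashMap<>();
    // Stand-in for the aggregate's state: orderId -> items seen so far.
    private final Map<Integer, List<Item>> itemsByOrder = new HashMap<>();

    // An update on the "count" side of the join.
    Optional<List<Item>> onExpectedCount(int orderId, int itemCount) {
        expectedCounts.put(orderId, itemCount);
        return complete(orderId);
    }

    // groupBy(orderId) + aggregate(append), then re-check completeness.
    Optional<List<Item>> onItem(Item item) {
        itemsByOrder.computeIfAbsent(item.orderId(), k -> new ArrayList<>()).add(item);
        return complete(item.orderId());
    }

    // join + filter: emit only when both sides exist and the sizes match.
    private Optional<List<Item>> complete(int orderId) {
        List<Item> seen = itemsByOrder.get(orderId);
        Integer expected = expectedCounts.get(orderId);
        if (seen == null || expected == null || seen.size() != expected) {
            return Optional.empty();
        }
        return Optional.of(List.copyOf(seen));
    }
}
```

For order 42 with an expected count of 3, the first two calls to `onItem` return an empty `Optional` and the third returns all three items. The sketch does not de-duplicate redelivered rows, so a repeated CDC message would inflate the count.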
  26. 39.

    Part Deux
    How can we be sure the order message has arrived?
    • Order # 42
    • Total # Items 3
    Order Event Aggregate
  27. 40.

    Thou shall not pass!!! Using inner joins
    KTable fullOrderTable = orderTableKeyOrderId
        .join(fullItemsTable,    <- only contains orders with all items arrived
            (orderNode, itemNodes) -> { construct and return order placed event aggregate })
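
The gating effect of the inner join can be shown with ordinary maps: a key produces output only if it is present on both sides. This is a sketch in plain Java, not a Kafka Streams `KTable` join. The `OrderPlacedJoin` class, the `Order` and `OrderPlaced` records and their fields are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class OrderPlacedJoin {
    // Hypothetical order row and resulting event aggregate.
    record Order(int orderId, String customer) {}
    record OrderPlaced(Order order, List<String> itemSkus) {}

    // Inner join on orderId: an order without a complete item list is dropped,
    // and so is a complete item list whose order row has not arrived.
    static Map<Integer, OrderPlaced> innerJoin(Map<Integer, Order> orders,
                                               Map<Integer, List<String>> fullItems) {
        Map<Integer, OrderPlaced> joined = new HashMap<>();
        for (Map.Entry<Integer, Order> entry : orders.entrySet()) {
            List<String> items = fullItems.get(entry.getKey());
            if (items != null) {
                // Build the order placed aggregate from both sides.
                joined.put(entry.getKey(), new OrderPlaced(entry.getValue(), List.copyOf(items)));
            }
        }
        return joined;
    }
}
```

Because the item side only ever holds orders whose items have all arrived, a result for a given key implies both conditions from the plan of attack are true.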
  28. 42.

    Event Schemas
    What do I want to tell others and what information do I need to do it?
  33. 48.

    Example Event Schema Properties
    • Event Name - “order placed”
    • Event Production Time - 2019-10-16 17:37:20.000534
    • Database Operation Time - 2019-10-16 17:34:20.000534
    • Source System - “old evil application”
    • Event Creation System - “central order event service”
    • Details Block - Specific to event, area for event-carried state transfer, information needed for event store.
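
The schema properties listed above map naturally onto a single value type. This is a minimal sketch in plain Java. The `DomainEvent` record, its field names and the `productionLag` helper are hypothetical, and the timestamps are assumed to be UTC because the slide gives no time zone.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;

// Hypothetical event envelope carrying the properties from the slide.
record DomainEvent(
        String eventName,               // e.g. "order placed"
        Instant eventProductionTime,    // when the event was produced
        Instant databaseOperationTime,  // when the underlying row change happened
        String sourceSystem,            // system that made the database change
        String eventCreationSystem,     // service that derived the event
        Map<String, Object> details) {  // event-specific state-transfer block

    // Delay between the database change and the derived event.
    Duration productionLag() {
        return Duration.between(databaseOperationTime, eventProductionTime);
    }
}
```

With the slide's example values, the production time trails the database operation time by three minutes. Keeping both timestamps in the envelope makes that lag visible to consumers.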
  34. 52.

    Presenter Info
    Anna McDonald, @jbfletch_
    Principal Software Developer, SAS
    Straight up Math, Event Sourcing, Apache Kafka, Integration Architecture