Kafka and Debezium at trivago | Code.Talks 2017 edition

Mario Mueller
September 29, 2017

Video available: https://www.youtube.com/watch?v=cU0BCVl4bjo

Nowadays everyone solves the separation and modularisation of big monolithic legacy applications and databases with HTTP microservices. We at trivago take another approach using Apache Kafka, Debezium and stream processing. This talk shows how we do this, why we are convinced that this is a better way to enhance your architecture, and why it exceeds HTTP microservices and is more than 1 million times faster.

Transcript

  1. Apache Kafka and Debezium at

  2. Hello!
    René Kerner, Software-Architect Stream Platform Services, @rk3n3r
    Mario Müller, Team Lead Stream Platform Services, @xenji
    http://muppethub.com/wp-content/uploads/2014/02/Statler-and-Waldorf-1.png
  3. Who of you uses Apache Kafka in production? From: http://www.inwardconsulting.com/how-we-think/blog/taking-the-vote-to-the-office/

  4. Who of you uses Debezium in production? From: http://www.inwardconsulting.com/how-we-think/blog/taking-the-vote-to-the-office/

  5. A little bit of technological context
    • We make heavy use of Percona Server
    • We use Redis, Cassandra, Memcached, Elasticsearch, ...
    • Four on-premise datacenters (DUS, SFO, DCA, HGK)
    • Some projects in public cloud environments (AWS & GCP)
    • We use PHP, Java, Go, Python, Kotlin, JavaScript, ...
  6. Kafka at trivago
    • Applications
      • Clickstream and application log transport
      • Change set Streams, Transport + Storage
      • Stream Processing
    • Numbers
      • 28 production servers, spread over four datacenters
      • 9 TB permanent data
      • 200 Mbit/s peak
      • 240k msg/s peak
  7. Debezium at trivago
    • Debezium
      • Is a Red Hat sponsored OSS project
      • Supports MySQL, PostgreSQL, Oracle and MongoDB
      • Ingests the change events into Kafka topics
    • Use-cases for us
      • Cross-technology and routed replication
      • Decouple writer (producer) and reader (consumer)
      • Stream Processing
  8. History

  9. Where do we come from?
    • Company Structure
      • Matrix organisation
      • Horizontal value chain through multiple teams (e.g. Advertiser → Content → Tech → Marketing → User)
    • Tech
      • One big PHP app, two big Java apps
      • Central MySQL schema for all with an unknown number of readers and writers
      • ~120 employees (~20 tech) in 2011, now >1300 employees (>300 tech)
      • 1 dev location until 2012, 4 dev locations now
  10. “Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization's communication structure” - Conway’s law
  11. None
  12. https://fsmedia.imgix.net/45/3f/ed/06/84b7/4d42/b51a/40681c42eac3/picard-facepalmjpg.jpeg

  13. HTTP Microservices Are they really solving the problem?

  14. What is so hard about HTTP APIs?
    • RyW - Read your own writes
    • MR - Monotonic Reads
    • MW - Monotonic Writes
    • WfR - Write follows Read
  15. What is the result?
    • Invalid caches due to race conditions
    • How to solve?
      • Global Locks => Moves latency to the client
      • (Software) Transactions => Tons of complexity
    from: confluent.io
  16. https://pbs.twimg.com/profile_images/1927177125/image.jpg

  17. Apache Kafka Reactive Microservice Architecture

  18. Kafka Glossary
    • CDC = Change Data Capturing
    • Kafka Cluster, Broker, Producer, Consumer
    • Source = Datastore to Kafka Message producer
    • Sink = Kafka Message to Datastore writer
    • SP = Stream Processor (Kafka Streams API)
  19. Kafka Basic Internals
    • Stream = Topic (a stream of records)
    • Record = Message = Event
    • One record consists of a key, a value and a timestamp
    • 1 Topic can have “n” partitions (scaling)
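The record and partition model on this slide can be sketched in a few lines of Python. This is a toy illustration and not the Kafka client API: the names `Record`, `partition_for` and `append`, the key `hotel-42` and the CRC32 hash are all assumptions made for the example (Kafka's default partitioner uses a different hash function). The point it shows is that records with the same key land in the same partition, in append order.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One record: a key, a value and a timestamp (milliseconds)."""
    key: bytes
    value: bytes
    timestamp_ms: int


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition; equal keys always map to the same partition."""
    # CRC32 is a stand-in hash chosen for this sketch only.
    return zlib.crc32(key) % num_partitions


def append(topic: list, record: Record) -> int:
    """Append a record to its partition and return the partition index."""
    p = partition_for(record.key, len(topic))
    topic[p].append(record)
    return p


# A topic with n = 3 partitions, each modelled as an append-only list.
topic = [[] for _ in range(3)]
p1 = append(topic, Record(b"hotel-42", b"price=100", 1))  # hypothetical key
p2 = append(topic, Record(b"hotel-42", b"price=90", 2))
```

Because both records share a key, `p1 == p2` and a consumer of that partition sees `price=100` before `price=90`.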
  20. Kafka Characteristics
    • Distributed Streaming Platform: Publish & subscribe, process and store
    • Fault-tolerant and scalable (partitioning, replication)
    • Read: At-Least-Once-Delivery, Exactly-Once-Delivery (v0.11)
    • Tuneable write consistency (no messages/events get lost)
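With at-least-once delivery a consumer may see the same record again after a retry or restart, so the processing step has to tolerate duplicates. One common way to do that, sketched here under assumptions, is to remember the last applied offset per partition and skip anything at or below it. The function `consume`, the tuple layout `(partition, offset, amount)` and the running total are hypothetical and only serve the illustration; a real consumer would also have to persist `last_offset` together with its state.

```python
def consume(deliveries, state, last_offset):
    """Apply (partition, offset, amount) deliveries, skipping redeliveries."""
    for partition, offset, amount in deliveries:
        if offset <= last_offset.get(partition, -1):
            continue  # already applied once, ignore the duplicate
        state["total"] = state.get("total", 0) + amount
        last_offset[partition] = offset


state, last_offset = {}, {}
# Offsets (0, 1) and (0, 0) are delivered twice, as can happen after a retry.
consume([(0, 0, 5), (0, 1, 7), (0, 1, 7), (1, 0, 3), (0, 0, 5)], state, last_offset)
```

Each amount is counted exactly once even though two records were delivered twice.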
  21. Kafka Characteristics
    • Reactive: Change Events, Long-Polling consumers, Kafka Streams API for joining, filtering, aggregation
    • Unidirectional data flows => no race conditions (or at least easily manageable)
    • Strictly ordered per partition
    • Log compaction by: Time, Size or Key Compaction
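Key compaction can be pictured as keeping only the newest record per key while preserving the relative order of the survivors. The sketch below is a simplification under assumptions: `compact` is a hypothetical helper, the log is a plain list of `(key, value)` pairs, and a value of `None` stands for a delete marker (tombstone) that is dropped immediately, whereas a real broker keeps tombstones for a configurable time before removing them.

```python
def compact(log):
    """Return the log with only the latest record per key, in original order."""
    latest = {}
    for offset, (key, _value) in enumerate(log):
        latest[key] = offset  # later offsets overwrite earlier ones
    return [
        (key, value)
        for offset, (key, value) in enumerate(log)
        if latest[key] == offset and value is not None  # drop tombstones
    ]


log = [("a", "1"), ("b", "1"), ("a", "2"), ("b", None), ("c", "1")]
compacted = compact(log)
```

Here `compacted` keeps the newest value of `a` and the only value of `c`, while `b` disappears because its latest record is a delete marker.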
  22. Kafka as write API
    (diagram labels: MySQL, Kafka Topic, Debezium, Protobuf Converter, App 1, App 2, App 3, Source)
  23. Kafka as read API
    (diagram labels: Kafka Topic, MySQL, Redis, MySQL, Elasticsearch, Debezium, Protobuf Converter, App 1, App 2, App 3, Cassandra, Sink Adapter, Protobuf Converter, App 1, App 4, Sink)
  24. Our solution approach

  25. Change Data Capturing (CDC) with Debezium
    • Change Capture the big central MySQL database
    • Externalise the schema using Google Protocol Buffers
    • Debezium reads from MySQL replication binlog (row replication) => Make each table available as stream of changesets
    from: confluent.io
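A consumer of such a changeset stream can rebuild the table by applying the events in order. The sketch below assumes a heavily simplified form of Debezium's change event envelope, reduced to the `op`, `before` and `after` fields (`c` = create, `u` = update, `d` = delete, `r` = snapshot read). The function `apply_change`, the primary key column `id` and the sample rows are hypothetical, and the events are plain dicts rather than the Protobuf messages mentioned on the slide.

```python
def apply_change(table, event):
    """Apply one simplified change event to an in-memory copy of the table."""
    op = event["op"]
    if op in ("c", "r", "u"):
        row = event["after"]           # row state after the change
        table[row["id"]] = row
    elif op == "d":
        table.pop(event["before"]["id"], None)  # row state before the delete
    else:
        raise ValueError(f"unknown op: {op!r}")


events = [
    {"op": "c", "before": None, "after": {"id": 1, "name": "A"}},
    {"op": "c", "before": None, "after": {"id": 2, "name": "B"}},
    {"op": "u", "before": {"id": 1, "name": "A"}, "after": {"id": 1, "name": "A2"}},
    {"op": "d", "before": {"id": 2, "name": "B"}, "after": None},
]

table = {}
for event in events:
    apply_change(table, event)
```

After replaying the four events the copy contains only row 1 with its updated name, which is the same state the source table would be in.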
  26. Stream Processors
    • Generate efficient views
    • Join different streams
    • Filter, map and aggregate data
    • Optimize data + structure for your application needs
    • Create pre-calculated data before hitting the datastore/s
    • Performance benefit (unfair): >1 mio. times faster
    from: confluent.io
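The filter, aggregate and join steps can be illustrated without any Kafka dependency. The sketch below is plain Python and not the Kafka Streams API: `build_view`, the two input streams (hotel names keyed by id, prices keyed by hotel id) and the sample data are all hypothetical. It filters out invalid prices, aggregates the minimum price per hotel and joins the result with the hotel names to get a pre-calculated view an application could read directly.

```python
def build_view(hotel_events, price_events):
    """Build a name -> lowest price view from two keyed streams."""
    # Materialise the hotel stream as a lookup table (latest name per id).
    hotels = {}
    for hotel_id, name in hotel_events:
        hotels[hotel_id] = name

    # Filter invalid prices and aggregate the minimum per hotel id.
    best = {}
    for hotel_id, price in price_events:
        if price <= 0:
            continue
        if hotel_id not in best or price < best[hotel_id]:
            best[hotel_id] = price

    # Inner join: keep only prices whose hotel id is known.
    return {hotels[h]: p for h, p in best.items() if h in hotels}


hotel_events = [(1, "Hotel A"), (2, "Hotel B")]
price_events = [(1, 120), (1, 95), (2, 0), (2, 80), (3, 50)]
view = build_view(hotel_events, price_events)
```

The price for hotel id 3 is dropped by the join because no hotel record exists for it, and the zero price for hotel 2 is removed by the filter.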
  27. • Ownership of data
    • Focus on expertise for data inside the bounded context
    • Deliver best data quality
    • “Hunter not hunted”
    • Independence (data usage/ownership/scaling)
    • Focus on freely combining data
    • Gain new knowledge
    • Filter data
    • Join and aggregate
    • Focus on consumer / application needs
  28. http://i0.kym-cdn.com/photos/images/original/000/049/657/23icsb8.jpg

  29. Thank you!

  30. Q & A https://muppetmindset.files.wordpress.com/2016/05/question-mark.jpg

  31. Links and notes
    http://www.dbms2.com/2010/05/01/ryw-read-your-writes-consistency/
    http://www.allthingsdistributed.com/2007/12/eventually_consistent.html
    http://www.cs.cornell.edu/courses/cs734/2000FA/cached%20papers/SessionGuaranteesPDIS_1.html
    https://www.slideshare.net/ConfluentInc/capture-the-streams-of-database-changes
    http://www.melconway.com/Home/Conways_Law.html