Slide 1

Slide 1 text

messaging → logs @apachekafka Jorge Quilcate Otoya @jeqo89

Slide 2

Slide 2 text

About me Jorge Quilcate Otoya Back-end/Integration Developer at Sysco Middleware @jeqo89 | github.com/jeqo | jeqo.github.io

Slide 3

Slide 3 text

Context

Slide 4

Slide 4 text

Context: Messaging
“A technology that enables asynchronous communication… Channels, also known as queues, are the logical paths that connect programs and carry messages… The sender or producer is the program that sends messages by writing them to a channel. The receiver or consumer is the program that receives messages by reading (and deleting) them from the channel.”
Enterprise Integration Patterns - Gregor Hohpe and Bobby Woolf
http://www.enterpriseintegrationpatterns.com/patterns/messaging/Introduction.html

Slide 5

Slide 5 text

Message Channels: Point-to-Point, Pub/Sub
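
The two channel types named on this slide can be contrasted with a minimal in-memory sketch. The class names `PointToPointChannel` and `PubSubChannel` are hypothetical and chosen for illustration only; a real broker (JMS, AMQP) adds persistence, acknowledgments, and networking.

```python
from collections import deque


class PointToPointChannel:
    """One shared queue: each message goes to exactly one competing receiver."""

    def __init__(self):
        self._queue = deque()

    def send(self, message):
        self._queue.append(message)

    def receive(self):
        # Reading removes the message, so no other receiver will see it.
        return self._queue.popleft() if self._queue else None


class PubSubChannel:
    """One inbox per subscriber: every subscriber gets its own copy."""

    def __init__(self, subscribers):
        self._inboxes = {name: deque() for name in subscribers}

    def publish(self, message):
        for inbox in self._inboxes.values():
            inbox.append(message)

    def receive(self, subscriber):
        inbox = self._inboxes[subscriber]
        return inbox.popleft() if inbox else None


p2p = PointToPointChannel()
p2p.send("job-1")
worker_a = p2p.receive()  # this worker gets the job
worker_b = p2p.receive()  # nothing is left for the second worker

pubsub = PubSubChannel(["billing", "shipping"])
pubsub.publish("order-created")
billing_copy = pubsub.receive("billing")
shipping_copy = pubsub.receive("shipping")
```

In both cases a message disappears from the channel once it is read, which is the property the log-based model later in the deck gives up.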

Slide 6

Slide 6 text

Messaging use-case: Job Queues
Fire and Forget
Store and Forward (a.k.a. Push Model)
Broker in charge of reliable message delivery
Implementations: JMS/AMQP
Event sourcing and stream processing at scale - Martin Kleppmann
https://martin.kleppmann.com/2016/01/29/event-sourcing-stream-processing-at-ddd-europe.html

Slide 7

Slide 7 text

Messaging Challenges
Risk of out-of-order messages when delivery of a failed message is retried
Risk of inconsistency across different clients (producers and/or consumers)
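
One way the out-of-order risk can arise is sketched below, assuming a broker that puts a failed message back at the end of its queue before retrying it. The function `deliver_with_redelivery` and its parameters are hypothetical; real brokers differ in how they schedule redelivery.

```python
from collections import deque


def deliver_with_redelivery(messages, fails_once):
    """Push messages to a consumer; a message in `fails_once` fails on its
    first attempt and is requeued at the back for a later retry."""
    queue = deque(messages)
    already_failed = set()
    delivered = []
    while queue:
        message = queue.popleft()
        if message in fails_once and message not in already_failed:
            already_failed.add(message)
            queue.append(message)  # retry later, behind newer messages
            continue
        delivered.append(message)
    return delivered


order = deliver_with_redelivery(["m1", "m2", "m3"], fails_once={"m1"})
```

Here the consumer sees `m2` and `m3` before the retried `m1`, even though the producer sent `m1` first.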

Slide 8

Slide 8 text

Context: Logs
Records are appended to the end of the Log...
Each Record has a Key…
Records are ordered…
The Order defines the notion of “time”...
The Content is not important at this point; it could be anything…
Logs record what happened and when.
The Log: What every software engineer should know about real-time data's unifying abstraction - Jay Kreps
https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying
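
The properties listed on this slide fit in a few lines of code. This is a minimal sketch of an append-only log, not Kafka's storage format; the `Log` class and its method names are assumptions made for illustration.

```python
class Log:
    """Append-only sequence of (key, value) records addressed by offset."""

    def __init__(self):
        self._records = []

    def append(self, key, value):
        # The offset is the record's position; it only ever grows, which is
        # what gives the log its notion of "time".
        offset = len(self._records)
        self._records.append((key, value))
        return offset

    def read(self, offset, max_records=10):
        # Reading does not remove anything from the log.
        end = min(offset + max_records, len(self._records))
        return [(o, *self._records[o]) for o in range(offset, end)]


log = Log()
offsets = [log.append("k1", "a"), log.append("k2", "b"), log.append("k1", "c")]
tail = log.read(1)
```

The content of each record is opaque to the log; only the order of appends matters.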

Slide 9

Slide 9 text

Logs everywhere
How does your database store data on disk consistently? It uses a log.
How do database replicas synchronize with each other? They use a log.
How is activity data recorded in a system like Apache Kafka? It uses a log.
How will your application's infrastructure stay robust at scale? Guess how…
Using logs to build a solid data infrastructure (or why dual writes are a bad idea) - Martin Kleppmann
https://www.confluent.io/blog/using-logs-to-build-a-solid-data-infrastructure-or-why-dual-writes-are-a-bad-idea/
https://www.confluent.io/blog/turning-the-database-inside-out-with-apache-samza/

Slide 10

Slide 10 text

Log-Centric Architecture (a.k.a. Kappa)
“A system that assumes an external log is present allows the individual systems to relinquish a lot of their own complexity and rely on the shared log.”
The Log: What every software engineer should know about real-time data's unifying abstraction - Jay Kreps
https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying
http://milinda.pathirage.org/kappa-architecture.com/

Slide 11

Slide 11 text

Logs use-case: Event Log
Pull Model
Ordered stream of events
Consumers in charge of fetching messages (poll)
Implementations: Apache Kafka, Amazon Kinesis, Apache DistributedLog (incubating)
Event sourcing and stream processing at scale - Martin Kleppmann
https://martin.kleppmann.com/2016/01/29/event-sourcing-stream-processing-at-ddd-europe.html
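
In the pull model each consumer owns its position in the log, so consumers can read at different speeds and rewind to reprocess. The sketch below assumes a plain Python list as the log; `PollingConsumer`, `poll`, and `seek` are hypothetical names that only loosely mirror a real consumer API.

```python
log = ["e0", "e1", "e2", "e3"]


class PollingConsumer:
    """Reads from a shared log and tracks its own position."""

    def __init__(self, log, position=0):
        self.log = log
        self.position = position

    def poll(self, max_records=2):
        # The consumer asks for data; nothing is pushed to it.
        batch = self.log[self.position:self.position + max_records]
        self.position += len(batch)
        return batch

    def seek(self, position):
        # Moving the position back replays records in the same order.
        self.position = position


fast = PollingConsumer(log)
slow = PollingConsumer(log)

fast_batches = [fast.poll(), fast.poll()]
slow_batch = slow.poll(max_records=1)

fast.seek(0)
replayed = fast.poll(max_records=4)
```

Neither consumer affects what the other sees, and the replay returns the events in their original order.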

Slide 12

Slide 12 text

Solving Messaging Challenges with Logs: Ordering and Reprocessing

Slide 13

Slide 13 text

Apache Kafka A Distributed Streaming Platform

Slide 14

Slide 14 text

Apache Kafka: Facts
➔ Born from the need to solve the data pipeline problem at LinkedIn.
➔ First use-cases: collecting system metrics and monitoring user activity.
2010: Open-sourced
2011: Apache project
2012: Graduated from incubator in October
2014: Confluent Inc. founded
Kafka: The Definitive Guide - Neha Narkhede, Gwen Shapira & Todd Palino

Slide 15

Slide 15 text

Apache Kafka: Use-cases ➔ Activity Tracking ➔ Messaging ➔ Metrics/Logging ➔ Commit Log ➔ Stream Processing ➔ Cloud Adoption

Slide 16

Slide 16 text

Apache Kafka Tour (v0.10.2.0) Log Records Kafka Cluster Kafka Producer API Kafka Consumer API Kafka Streams API Kafka Connect API Kafka ++

Slide 17

Slide 17 text

Kafka Core

Slide 18

Slide 18 text

Log Record

Slide 19

Slide 19 text

from Topics to Partitions http://kafka.apache.org/documentation
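
How keyed records map to partitions can be sketched as a hash of the key modulo the partition count. The function `choose_partition` is hypothetical, and `zlib.crc32` is only a stand-in hash: Kafka's Java client uses murmur2 for keyed records, so real partition numbers will differ.

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    # Stand-in hash for illustration; the same key always maps to the
    # same partition as long as the partition count does not change.
    return zlib.crc32(key) % num_partitions


NUM_PARTITIONS = 3
topic = [[] for _ in range(NUM_PARTITIONS)]  # one append-only list per partition

events = [
    (b"user-1", "login"),
    (b"user-2", "login"),
    (b"user-1", "click"),
    (b"user-1", "logout"),
]
for key, value in events:
    topic[choose_partition(key, NUM_PARTITIONS)].append((key, value))

user1_partition = topic[choose_partition(b"user-1", NUM_PARTITIONS)]
user1_values = [value for key, value in user1_partition if key == b"user-1"]
```

All records for `user-1` land in one partition and keep their relative order, which is why ordering is a per-partition guarantee.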

Slide 20

Slide 20 text

from Partitions to Segments https://www.confluent.io/apache-kafka-talk-series/deep-dive-into-apache-kafka/ https://www.confluent.io/apache-kafka-talk-series/
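
A partition is stored as a series of segment files, each named after the offset of its first record. Finding the segment that holds a given offset is then a binary search over the base offsets. The helper names and the sample offsets below are assumptions; the real broker also keeps index files next to each segment.

```python
import bisect


def find_segment(base_offsets, offset):
    """Return the base offset of the segment that contains `offset`.

    `base_offsets` must be sorted in ascending order.
    """
    index = bisect.bisect_right(base_offsets, offset) - 1
    if index < 0:
        raise ValueError("offset is below the first retained segment")
    return base_offsets[index]


def segment_file_name(base_offset):
    # Segment files are named with the base offset zero-padded to 20 digits.
    return f"{base_offset:020d}.log"


segment_base_offsets = [0, 1000, 2000]  # hypothetical partition with 3 segments
holder = find_segment(segment_base_offsets, 1500)
file_name = segment_file_name(holder)
```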

Slide 21

Slide 21 text

from Segments to Records https://www.confluent.io/apache-kafka-talk-series/deep-dive-into-apache-kafka/ https://www.confluent.io/apache-kafka-talk-series/

Slide 22

Slide 22 text

Log unit: Record https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol

Slide 23

Slide 23 text

Lab: Log Record Record Structure: Key/Value Serialization/Deserialization Metadata: Offset/Timestamp

Slide 24

Slide 24 text

Schema Evolution: Why Avro?
The reader's schema and the writer's schema do not need to be the same
Forward/Backward compatibility
➔ Add/remove fields that have default values
➔ Explicit `null` type (no optional/required markers)
➔ Data types can be changed
➔ Names can be changed (via aliases)
Designing Data-Intensive Applications - Martin Kleppmann
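
The resolution rules on this slide can be sketched in plain Python. This is a simplified illustration and does not use the Avro library: the `resolve` function and the dictionary layout of `reader_fields` are assumptions, and type promotion is left out.

```python
def resolve(record, reader_fields):
    """Read a record written with an older schema using the reader's schema.

    `reader_fields` is a list of dicts with a "name" and, optionally,
    "aliases" (older names for the field) and "default".
    """
    result = {}
    for field in reader_fields:
        candidate_names = [field["name"], *field.get("aliases", [])]
        for name in candidate_names:
            if name in record:
                result[field["name"]] = record[name]
                break
        else:
            # The writer did not know this field: fall back to the default.
            if "default" not in field:
                raise ValueError(f"no value or default for {field['name']!r}")
            result[field["name"]] = field["default"]
    # Fields the writer produced but the reader does not declare are ignored.
    return result


old_record = {"id": 1, "name": "Ana", "legacy_flag": True}
reader_fields = [
    {"name": "id"},
    {"name": "full_name", "aliases": ["name"]},  # renamed field
    {"name": "email", "default": None},          # added field, nullable
]
resolved = resolve(old_record, reader_fields)
```

A field added without a default would raise here, which matches the slide's point that compatible changes rely on default values.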

Slide 25

Slide 25 text

Kafka Cluster

Slide 26

Slide 26 text

Kafka Topology: Why Zookeeper?
Centralized coordination service: consensus, group management, presence protocols, atomic broadcast
Kafka's internal “source of truth”
Used for:
➔ Leader replica election
➔ Replica synchronization (ISR)
➔ And more
Distributed Consensus Reloaded: Apache Zookeeper and Replication in Kafka - Flavio Junqueira
https://www.confluent.io/blog/distributed-consensus-reloaded-apache-zookeeper-and-replication-in-kafka/

Slide 27

Slide 27 text

Balance Availability and Consistency
Use case: Activity Tracking
➔ Retention: 3 days
➔ More partitions
➔ Lower replication factor
➔ Availability matters more
Use case: Inventory adjustments
➔ Retention: 6 months
➔ Fewer partitions
➔ Higher replication factor
➔ Consistency matters more
Streaming in Practice: Putting Kafka in Production - Roger Hoover
https://www.confluent.io/apache-kafka-talk-series/Streaming-in-Practice-Putting-Kafka-in-Production/

Slide 28

Slide 28 text

Lab: Kafka Cluster Scalability: Cluster and Brokers Topics: Partitions, Replication, ISR Cleaning up: Compaction and Retention
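
The two cleanup policies from this lab can be sketched as follows. Both functions are hypothetical simplifications: Kafka applies retention to whole segments rather than single records, and compaction also handles deletion markers, which are omitted here.

```python
def compact(records):
    """Keep only the latest record per key.

    `records` is a list of (offset, key, value) in offset order.
    """
    latest = {}
    for offset, key, value in records:
        latest[key] = (offset, key, value)
    # Surviving records keep their original offsets and order.
    return sorted(latest.values())


def apply_retention(records, now, retention_seconds):
    """Drop records older than the retention period.

    `records` is a list of (timestamp, key, value).
    """
    return [r for r in records if now - r[0] <= retention_seconds]


compacted = compact([(0, "a", 1), (1, "b", 2), (2, "a", 3)])
retained = apply_retention([(0, "a", 1), (90, "b", 2)], now=100, retention_seconds=50)
```

Compaction suits topics that hold the current state per key, while retention suits event streams where old data simply expires.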

Slide 29

Slide 29 text

Be careful with putting data in Containers https://twitter.com/waxzce/status/829420329177083904

Slide 30

Slide 30 text

Kafka Clients API

Slide 31

Slide 31 text

Kafka Clients survey https://www.confluent.io/blog/first-annual-state-apache-kafka-client-use-survey

Slide 32

Slide 32 text

Kafka Producer API

Slide 33

Slide 33 text

Batching and Compression
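
Why batching and compression work well together can be shown with the standard library alone. The sketch compresses 200 similar records one by one and then as a single batch; the record contents are made up, and gzip stands in for whichever codec the producer's `compression.type` setting selects.

```python
import gzip
import json

# Hypothetical activity events with a lot of repeated structure.
records = [
    {"user": f"user-{i % 5}", "event": "page_view", "page": "/home"}
    for i in range(200)
]


def encode(record):
    return json.dumps(record).encode("utf-8")


# Compressing each record on its own pays the codec overhead every time.
individually = sum(len(gzip.compress(encode(r))) for r in records)

# Compressing the whole batch lets the codec exploit repetition across records.
batched = len(gzip.compress(b"\n".join(encode(r) for r in records)))
```

In the Java producer, batch size and wait time are controlled by `batch.size` and `linger.ms`; larger batches trade a little latency for better throughput and compression.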

Slide 34

Slide 34 text

Acknowledgment: Latency vs Durability Ack=0 → No network delay → some data loss

Slide 35

Slide 35 text

Acknowledgment: Latency vs Durability Ack=1 → 1 network round-trip → little data loss

Slide 36

Slide 36 text

Acknowledgment: Latency vs Durability Ack=all (-1) → 2 network round-trips → no data loss (in combination with `min.insync.replicas`)
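
The three acknowledgment levels can be summarized as a decision function. This is a simplified model, not broker code: `producer_considers_sent` and its parameters are hypothetical, and `in_sync_replicas` counts the leader as one of the replicas.

```python
def producer_considers_sent(acks, leader_wrote, in_sync_replicas, min_insync_replicas=1):
    """Return True when the producer would treat the record as sent."""
    if acks == 0:
        # The producer does not wait for any response, so it cannot
        # notice a failed write.
        return True
    if acks == 1:
        # Only the leader's write is confirmed; the record is lost if the
        # leader fails before followers copy it.
        return leader_wrote
    if acks == "all":
        # The write succeeds only while enough replicas are in sync.
        return leader_wrote and in_sync_replicas >= min_insync_replicas
    raise ValueError(f"unsupported acks value: {acks!r}")


fire_and_forget = producer_considers_sent(0, leader_wrote=False, in_sync_replicas=0)
leader_only = producer_considers_sent(1, leader_wrote=True, in_sync_replicas=1)
too_few_replicas = producer_considers_sent("all", True, 1, min_insync_replicas=2)
fully_replicated = producer_considers_sent("all", True, 3, min_insync_replicas=2)
```

With `acks=all` and `min.insync.replicas=2`, a write is rejected when only the leader is in sync, which is what closes the data-loss window.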

Slide 37

Slide 37 text

Lab: Kafka Producer Batching and Compression Acknowledgements

Slide 38

Slide 38 text

Results
kafka_producer_ack_zero_latency_sum/kafka_producer_ack_zero_latency_count ack=0 => 0.05494 s
kafka_producer_ack_one_latency_sum/kafka_producer_ack_one_latency_count ack=1 => 0.06097 s
kafka_producer_ack_all_latency_sum/kafka_producer_ack_all_latency_count ack=all => 0.06375 s
Benchmarking Apache Kafka: 2 million writes per second on 3 cheap machines - Jay Kreps
https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines

Slide 39

Slide 39 text

Kafka Consumer API

Slide 40

Slide 40 text

Multiple Consumers
➔ Consumer Groups as logical subscribers
➔ Offsets committed per group and partition, by the group member that owns the partition
➔ Consumer Groups as the basis of parallelism, together with Partitions
➔ Ordering guaranteed within a partition (keyed records are normally enough)
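
How a consumer group spreads partitions over its members can be sketched with a round-robin assignment. The function `assign_round_robin` is hypothetical and much simpler than the assignors shipped with the Kafka client, but it shows the key constraint: each partition has exactly one owner within a group.

```python
def assign_round_robin(partitions, members):
    """Give each partition to exactly one member of the group."""
    ordered_members = sorted(members)
    assignment = {member: [] for member in ordered_members}
    for index, partition in enumerate(sorted(partitions)):
        owner = ordered_members[index % len(ordered_members)]
        assignment[owner].append(partition)
    return assignment


two_members = assign_round_robin([0, 1, 2, 3], ["c1", "c2"])
three_members = assign_round_robin([0, 1], ["c1", "c2", "c3"])
```

With more members than partitions, the extra member stays idle, so the partition count caps the parallelism of a group.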

Slide 41

Slide 41 text

At-Most-Once Delivery
➔ Scenario: The consumer process crashes after saving its position but before processing the message.
➔ Result: The process that takes over starts from the saved position, even though some earlier messages were never processed.

Slide 42

Slide 42 text

At-Least-Once Delivery
➔ Scenario: The consumer process crashes after processing the messages but before saving its position.
➔ Result: When the new process takes over, the first messages it receives may already have been processed.
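
The difference between the two guarantees comes down to the order of two steps: saving the position and processing the record. The sketch below simulates a crash between those steps; the `run` function and its parameters are hypothetical and model a single consumer on a single partition.

```python
def run(log, committed, commit_first, crash_at=None):
    """Consume `log` from the committed offset.

    Crashes between the two steps when handling offset `crash_at`.
    Returns (processed records, committed offset).
    """
    processed = []
    position = committed
    while position < len(log):
        if commit_first:
            committed = position + 1          # save position first
            if position == crash_at:
                return processed, committed   # crash before processing
            processed.append(log[position])
        else:
            processed.append(log[position])   # process first
            if position == crash_at:
                return processed, committed   # crash before saving position
            committed = position + 1
        position += 1
    return processed, committed


log = ["a", "b", "c"]

# At-most-once: commit, then process. The record at the crash is lost.
first, offset = run(log, 0, commit_first=True, crash_at=1)
second, _ = run(log, offset, commit_first=True)
at_most_once = first + second

# At-least-once: process, then commit. The record at the crash is repeated.
first, offset = run(log, 0, commit_first=False, crash_at=1)
second, _ = run(log, offset, commit_first=False)
at_least_once = first + second
```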

Slide 43

Slide 43 text

Exactly-Once Delivery
“Exactly-once delivery requires cooperation with the destination storage system…”
Coming soon (KIP-98):
● Idempotent Producer Guarantees
● Transactional Guarantees
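
The idea behind the idempotent producer in KIP-98 is that each producer gets an ID and numbers its records, so the broker can recognize a retried record. The sketch below shows only that deduplication step; the `Partition` class is hypothetical, and the actual proposal also rejects gaps in the sequence, which is omitted here.

```python
class Partition:
    """Appends records and drops retries it has already stored."""

    def __init__(self):
        self.records = []
        self._last_sequence = {}  # producer id -> highest sequence stored

    def append(self, producer_id, sequence, value):
        last = self._last_sequence.get(producer_id, -1)
        if sequence <= last:
            return False  # duplicate caused by a retry: acknowledge, do not store
        self._last_sequence[producer_id] = sequence
        self.records.append(value)
        return True


partition = Partition()
stored_first = partition.append("producer-1", 0, "payment-42")
# The acknowledgment was lost, so the producer sends the same record again.
stored_retry = partition.append("producer-1", 0, "payment-42")
stored_next = partition.append("producer-1", 1, "payment-43")
```

The retry is safe because the sequence number, not the payload, identifies the record.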

Slide 44

Slide 44 text

Lab: Kafka Consumer
Consumer Groups: Parallelism
Consumer Offsets: Control and reprocessing (https://jeqo.github.io/post/2017-01-31-kafka-rewind-consumers-offset/)

Slide 45

Slide 45 text

Kafka Streams API & Kafka Connect API

Slide 46

Slide 46 text

Kafka Streams API & Kafka Connect API
Unifying Stream Processing and Interactive Queries in Apache Kafka - Eno Thereska
https://www.confluent.io/blog/unifying-stream-processing-and-interactive-queries-in-apache-kafka/

Slide 47

Slide 47 text

Kafka Streams https://twitter.com/lcrsilveira/status/829615803133730816 https://twitter.com/jessetanderson/status/830113106277785600

Slide 48

Slide 48 text

Kafka Connect HDFS, JDBC, GoldenGate, Elasticsearch, Couchbase, DataStax, Cassandra, Attunity, Azure IoTHub, SAP Hana, VoltDb, FTP, JMS, JMX, MongoDB, Solr, Splunk, RethinkDB, SQS, S3, MQTT, Redis, InfluxDB, HBase, Hazelcast, Twitter, and more...

Slide 49

Slide 49 text

Lab: Kafka Streams & Kafka Connect
“Simplified Consumer”
Stream/Table Duality
Windows
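
Two of the lab topics, stream/table duality and windows, can be sketched without the Kafka Streams library. Both functions below are hypothetical: `to_table` replays a changelog stream into the latest value per key, and `tumbling_window_counts` groups events into fixed, non-overlapping time windows.

```python
from collections import defaultdict


def to_table(changelog):
    """Replay a stream of (key, value) updates into a table of latest values."""
    table = {}
    for key, value in changelog:
        table[key] = value
    return table


def tumbling_window_counts(events, window_size):
    """Count (timestamp, key) events per key in fixed-size windows."""
    counts = defaultdict(int)
    for timestamp, key in events:
        window_start = timestamp - (timestamp % window_size)
        counts[(window_start, key)] += 1
    return dict(counts)


table = to_table([("ana", 1), ("bob", 5), ("ana", 3)])
counts = tumbling_window_counts([(0, "a"), (4, "a"), (5, "a"), (11, "b")], window_size=5)
```

The table is a snapshot of the stream at a point in time, and every update to the table can be emitted again as a stream, which is the duality the lab refers to.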

Slide 50

Slide 50 text

Kafka++

Slide 51

Slide 51 text

Confluent Platform

Slide 52

Slide 52 text

Confluent Platform: Apache Kafka Enterprise Edition

Slide 53

Slide 53 text

Lab: Confluent Platform Confluent Platform: ➔ Schema Registry ➔ REST API

Slide 54

Slide 54 text

Integration with Apache Kafka

Slide 55

Slide 55 text

Lab: Integration with Kafka
Integration Platforms:
➔ Camel http://camel.apache.org/kafka.html
➔ Akka Streams http://doc.akka.io/docs/akka-stream-kafka/current/home.html
➔ Oracle Service Bus http://www.ateam-oracle.com/osb-transport-for-apache-kafka-part-1/

Slide 56

Slide 56 text

What’s in discussion and/or coming soon?
Exactly-once Delivery / Txn Messaging https://cwiki.apache.org/confluence/display/KAFKA/KIP-98+-+Exactly+Once+Delivery+and+Transactional+Messaging
Headers support (additional metadata) https://cwiki.apache.org/confluence/display/KAFKA/KIP-82+-+Add+Record+Headers
ZStandard Compression support https://cwiki.apache.org/confluence/display/KAFKA/KIP-110%3A+Add+Codec+for+ZStandard+Compression
Reset Offset tool https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals

Slide 57

Slide 57 text

How NOT to use Kafka
Top 5:
➔ No consideration of data on the inside vs outside
➔ Schema not externally defined
➔ Same config for every client/topic
➔ 128 partitions as default
➔ Running on 8 overloaded nodes
Kafka Summit 2016: 101 ways to config Kafka - Badly
https://www.confluent.io/kafka-summit-2016-101-ways-to-configure-kafka-badly
https://cwiki.apache.org/confluence/display/KAFKA/Operations

Slide 58

Slide 58 text

Further reading

Slide 59

Slide 59 text

Thanks!!! Twitter: @jeqo89 GitHub: /jeqo Blog: jeqo.github.io Code: github.com/jeqo/talk-kafka-messaging-logs