Everything You Wanted to Know About Apache Kafka But You Were Too Afraid to Ask!

Ricardo Ferreira

June 04, 2019

Transcript

  1. Everything you Wanted to
    Know about Apache Kafka
    But You Were Too Afraid to Ask
    Ricardo Ferreira
    Developer Advocate, Confluent

  2. Wakanda? Forever!
    Hulk? Smash!
    Apache Kafka?
    It's Like Messaging?

  3. About Us:
    ● Ricardo Ferreira
    ❑ Developer Advocate @ Confluent
    ❑ Ex-Oracle, Red Hat, IONA Tech
    ❑ Currently ~70% Dev, ~30% Ops
    ❑ https://riferrei.net
    ● Echo Dot (Alexa)
    ❑ The voice behind Amazon
    ❑ Ex-Raspberry Pi, Arduino
    @riferrei
    @alexa99

  4. @riferrei | @JNationConf | @confluentinc
    Question:
    "What is a Distributed
    Streaming Platform?"

  5. ? ? ?

  6. Let's do some
    time travel
    Shall we?

  7. SQL DBs 25 years ago…
    SQL DBs Today
    Dude, you're embarrassing
    me in front of the wizards…

  8. ETL/Batch

  10. What did it cost to extract data
    from the transactional DB?

  11. Latency

  12. Data
    Plumbing

  14. Solution for
    "Combining" Processing
    and Data: NoSQL
    Solution for Large
    Amounts of Data:
    Big Data!

  15. How about
    Messaging?

  19. Why Combine Real-time
    With Historical Context?
    Event-Driven App
    (Location Tracking)
    Where is my driver?
    Only Real-Time Events
    Messaging Queues and
    Event Streaming
    Platforms can do this
    Contextual
    Event-Driven App
    (ETA)
    When will my driver
    get here? (2 min)
    Real-Time combined
    with stored data
    Only Event Streaming
    Platforms can do this
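
As a rough illustration of the two apps on this slide, the sketch below answers "Where is my driver?" from the latest event alone, and "When will my driver get here?" by combining that event with stored data. Everything in it (the `LocationEvent` type, the `historical_speed_kmh` table, the numbers) is a hypothetical stand-in, not code from the talk and not a Kafka API.

```python
from dataclasses import dataclass


@dataclass
class LocationEvent:
    """A hypothetical real-time event emitted by the driver's phone."""
    driver_id: str
    road_segment: str
    km_to_rider: float


# Stored (historical) context: assumed average speeds per road segment.
historical_speed_kmh = {"downtown": 20.0, "highway": 80.0}


def where_is_my_driver(event: LocationEvent) -> str:
    # Location tracking needs only the real-time event.
    return f"{event.driver_id} is {event.km_to_rider} km away on {event.road_segment}"


def eta_minutes(event: LocationEvent) -> float:
    # The ETA needs the real-time event AND the stored data.
    speed = historical_speed_kmh[event.road_segment]
    return event.km_to_rider * 60 / speed


event = LocationEvent("driver-42", "downtown", 1.0)
print(where_is_my_driver(event))
print(eta_minutes(event))
```

The first function could be served by any system that delivers the latest event; the second only works if the historical table is available alongside the stream.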

  20. ETL/Data Integration
    (What happened in the world)
    Highly Scalable, Durable, Persistent, Ordered
    Batch, Expensive, Time Consuming
    Messaging
    (What is happening in the world)
    Fast (Low Latency)
    Difficult to Scale, No Persistence After Consumption, No Replay

  21. ETL/Data Integration
    (What happened in the world)
    Highly Scalable, Durable, Persistent, Maintains Order
    Batch, Expensive, Time Consuming
    Messaging
    (What is happening in the world)
    Fast (Low Latency)
    Difficult to Scale, No Persistence, Data Loss, No Replay
    Event Streaming Thinking
    Highly Scalable, Durable, Persistent, Ordered, Fast (Low Latency)
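
The "No Replay" versus "Persistent" contrast on this slide can be sketched with two toy in-memory structures. This is a simplified model under stated assumptions, not how any particular broker is implemented; the `Queue` and `Log` classes are hypothetical names made up for the example.

```python
from collections import deque


class Queue:
    """Classic messaging model: a message is gone once it is consumed."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        # Removing the message on delivery is what makes replay impossible.
        return self._messages.popleft() if self._messages else None


class Log:
    """Event streaming model: records are kept, readers pick their own offset."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1  # offset of the record just written

    def read(self, offset):
        # Reading does not remove anything, so any offset can be re-read.
        return self._records[offset:]


queue, log = Queue(), Log()
for record in ["a", "b", "c"]:
    queue.send(record)
    log.append(record)

first_pass = [queue.receive() for _ in range(3)]  # drains the queue
second_pass = queue.receive()                     # nothing left to replay
replay = log.read(0)                              # the full history, again
print(first_pass, second_pass, replay)
```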

  24. http://the-song-is-riferrei.s3-website-us-east-1.amazonaws.com/

  25. https://github.com/riferrei/the-song-is

  26. Have you Ever Heard
    About the #AskConfluent
    Initiative?

  27. There was an idea…

  28. To bring together a group
    of remarkable people…

  29. That could answer the
    questions…

  30. That we never could…

  31. @tlberglund @gwenshap

  32. https://www.youtube.com/playlist?list=PLa7VYi0yPIH0snucuYWkuUXwasMr-HR7Y
    Kafka is
    so Cool!

  33. Steps:
    1. Ask your question on Twitter
    using the hashtag #AskConfluent
    2. Wait for the next episode

  34. Question:
    "Is There Such a Thing as
    Oversubscribing to Kafka?"

  35. Question:
    "Is it More Costly to Up Convert or
    Down Convert Message Formats for
    Records Sent to Apache Kafka?"

  36. Question:
    "How Many Partitions
    Can a Topic Have?"

  37. Question:
    "What are the Pros and Cons
    of the Exactly-Once Feature?"

  38. Question:
    "'auto.offset.reset=earliest' was set
    but after restarting the app (it was 4
    days down) it started to process from
    the beginning of the topic…"
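
For context on this question: `auto.offset.reset` is only consulted when the consumer group has no valid committed offset. If the group's committed offsets expired while the app was down (the broker setting `offsets.retention.minutes` governs how long they are kept), a restart looks like a first start. The function below is a simplified model of that decision, not Kafka client code; `starting_position` and its parameters are hypothetical names chosen for the sketch.

```python
def starting_position(committed_offset, auto_offset_reset, log_start, log_end):
    """Model where a consumer resumes reading a partition.

    committed_offset is None when the group has no committed offset,
    for example because it expired on the broker.
    """
    if committed_offset is not None and log_start <= committed_offset <= log_end:
        # A valid committed offset always wins; the reset policy is ignored.
        return committed_offset
    if auto_offset_reset == "earliest":
        return log_start
    if auto_offset_reset == "latest":
        return log_end
    raise ValueError("no valid committed offset and auto.offset.reset=none")


# Short downtime: the committed offset is still there, so the app resumes.
print(starting_position(500, "earliest", 0, 1000))
# Long downtime: the committed offset is gone, so "earliest" applies.
print(starting_position(None, "earliest", 0, 1000))
```

Under this model, reprocessing from the beginning after a long outage is what `earliest` is expected to do once the committed offset is no longer available.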

  39. Question:
    "Is Not Using Keys to Produce a
    Record Considered a Bad
    Practice?"
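
What a key changes can be shown with a toy partitioner: records with the same key land on the same partition (so their relative order is kept), while records without a key are spread across partitions. This is only a sketch of the idea. It uses `zlib.crc32` as a stand-in hash, whereas Kafka's default partitioner uses a different hash function, and the round-robin fallback for keyless records is a simplifying assumption. `choose_partition` and `NUM_PARTITIONS` are hypothetical names.

```python
import itertools
import zlib

NUM_PARTITIONS = 3  # assumed partition count for the example topic
_round_robin = itertools.cycle(range(NUM_PARTITIONS))


def choose_partition(key):
    if key is None:
        # No key: spread records across partitions (simplified as round-robin).
        return next(_round_robin)
    # With a key: a deterministic hash maps equal keys to the same partition.
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


keyed = [choose_partition("driver-42") for _ in range(4)]
unkeyed = [choose_partition(None) for _ in range(4)]
print(keyed)    # four identical partition numbers
print(unkeyed)  # partitions rotate: [0, 1, 2, 0]
```

Whether omitting keys is a problem therefore depends on whether consumers rely on per-key ordering or co-location.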

  40. Free Books
