[OracleCode NYC-2018] Apache Kafka A Streaming Data Platform

Viktor Gamov

March 08, 2018

Transcript

  1. Apache Kafka: A Streaming Data Platform

  2. @gamussa @confluentinc Who am I? Solutions Architect, Developer Advocate. @gamussa in internetz. Hey you, yes, you, go follow me on Twitter

  3. A company is built on DATA FLOWS, but all we have is DATA STORES

  9. Streaming Platform: 1. Pub/Sub, 2. Store, 3. Process

  11. Core abstraction: DB - table; Hadoop - file; Messaging - ?

  12. LOGS
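
The log abstraction named on this slide can be illustrated with a minimal sketch. It assumes nothing about Kafka's real storage format; the `Log` class and its method names are hypothetical, chosen only to show that records are appended at the end and addressed by a monotonically increasing offset.

```python
class Log:
    """Toy append-only log (illustrative only, not Kafka's implementation)."""

    def __init__(self):
        self._records = []

    def append(self, record):
        # Appending at the tail is constant time; the offset is the position.
        self._records.append(record)
        return len(self._records) - 1

    def read(self, offset, max_records=10):
        # Readers track their own offset and can re-read from any position.
        return self._records[offset:offset + max_records]


log = Log()
offsets = [log.append(r) for r in ["a", "b", "c"]]
tail = log.read(1)
```

Because reading does not remove anything, two readers can consume the same log at different offsets without affecting each other.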

  13. Producing to Kafka (diagram; axis label: Time)

  14. Producing to Kafka (diagram; axis label: Time; labels: C, C, C)

  15. Producing to Kafka - With Key: hash(key) % numPartitions = N (diagram; axis label: Time; labels: A, B, C, D)

  16. Producing to Kafka - No Key: messages will be produced in a round-robin fashion (diagram; axis label: Time)

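
The two producing slides (with key, no key) can be sketched together as a toy partitioner. This is an illustration under stated assumptions, not Kafka's partitioner: `SketchPartitioner` is a hypothetical name, and `zlib.crc32` stands in for whatever hash function the real client applies to the key bytes.

```python
import itertools
import zlib


class SketchPartitioner:
    """Toy partitioner mirroring the slides (illustrative only)."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        # Endless 0, 1, ..., num_partitions - 1, 0, 1, ... for keyless messages.
        self._round_robin = itertools.cycle(range(num_partitions))

    def partition(self, key=None):
        if key is None:
            # No key: take the next partition in round-robin order.
            return next(self._round_robin)
        # With key: hash(key) % numPartitions = N (crc32 is an assumed stand-in).
        return zlib.crc32(key.encode("utf-8")) % self.num_partitions


partitioner = SketchPartitioner(num_partitions=4)
keyed = [partitioner.partition(k) for k in ["A", "B", "A", "A"]]
keyless = [partitioner.partition() for _ in range(6)]
```

The property that matters for keyed messages is that equal keys always land in the same partition, which is what preserves per-key ordering.
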
  17. Consuming From Kafka - Single Consumer (diagram; label: C)

  18. Consuming From Kafka - Grouped Consumers (diagram; labels: C, C, C1, C, C, C2)

  19. Consuming From Kafka - Grouped Consumers (diagram; labels: C, C, C, C)

  20. Consuming From Kafka - Grouped Consumers (diagram; labels: 0, 1, 2, 3)

  21. Consuming From Kafka - Grouped Consumers (diagram; labels: 0, 1, 2, 3)

  22. Consuming From Kafka - Grouped Consumers (diagram; labels: 0, 3; 1; 2; 3)

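
The grouped-consumer slides can be read as a partition-assignment exercise: each partition has exactly one owner inside a group, and ownership is redistributed when membership changes. The sketch below is a simplified assumption of how that could work; the `assign` function and the consumer names `c1` to `c4` are hypothetical, and Kafka's real assignment strategies are pluggable and differ in detail.

```python
def assign(partitions, consumers):
    """Spread partitions over group members, one owner per partition."""
    members = sorted(consumers)
    assignment = {member: [] for member in members}
    for index, partition in enumerate(sorted(partitions)):
        # Deal partitions out like cards, wrapping around the member list.
        assignment[members[index % len(members)]].append(partition)
    return assignment


partitions = [0, 1, 2, 3]

# Four consumers, four partitions: one partition each.
four_members = assign(partitions, ["c1", "c2", "c3", "c4"])

# One consumer leaves: its partition is handed to a surviving member.
three_members = assign(partitions, ["c1", "c2", "c3"])

# More consumers than partitions: the extra consumer stays idle.
idle_case = assign([0, 1], ["c1", "c2", "c3"])
```

With three members, the first consumer ends up owning partitions 0 and 3, which corresponds to the "0, 3" label on the last grouped-consumer slide.
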
  23. Producers, Consumers

  27. Kafka Connect does hard work so you don’t: 1. Scale out

  32. Streaming Platform: 1. Pub/Sub, 2. Store, 3. Process

  33. Why Store?

  34. Scalability of a filesystem: throughput of 100s of MB/s; TBs per server; commodity hardware; O(1) writes

  35. Guarantees of a database: persistence; strict ordering

  36. Distributed by Design: Replication, Fault Tolerance, Partitioning, Scale

  38. Partition Leadership and Replication (diagram: Topic1 partition1 to partition4, three replicas each, spread across Broker 1 to Broker 4; legend: Leader, Follower)

  39. Partition Leadership and Replication - node failure (diagram: Topic1 partition1 to partition4, three replicas each, spread across Broker 1 to Broker 4; legend: Leader, Follower)

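
The two replication slides can be sketched as a small failover exercise. The replica layout below is an assumption, since the transcript does not preserve which broker holds which replica, and `elect_leaders` is a hypothetical helper that simply promotes the first surviving replica. The real election is coordinated by the cluster and restricted to in-sync replicas.

```python
def elect_leaders(replicas, live_brokers):
    """Pick the first live replica of each partition as its leader."""
    leaders = {}
    for partition, brokers in replicas.items():
        alive = [broker for broker in brokers if broker in live_brokers]
        # A partition with no surviving replica has no leader (None).
        leaders[partition] = alive[0] if alive else None
    return leaders


# Assumed layout: four partitions, three replicas each, over four brokers.
replicas = {
    "Topic1-partition1": [1, 2, 3],
    "Topic1-partition2": [2, 3, 4],
    "Topic1-partition3": [3, 4, 1],
    "Topic1-partition4": [4, 1, 2],
}

before_failure = elect_leaders(replicas, live_brokers={1, 2, 3, 4})
after_failure = elect_leaders(replicas, live_brokers={1, 2, 3})  # broker 4 fails
```

In this layout only partition4 was led by the failed broker, so only its leadership moves; the other partitions keep their leaders and merely lose a follower.
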
  40. Streaming Platform: 1. Pub/Sub, 2. Store, 3. Process

  41. What is Stream Processing? A machine for combining streams of events

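
One way to picture "combining streams of events" is a sketch that merges two time-ordered streams and derives a third from them. It uses plain Python rather than any Kafka API; the event tuples, the `orders` and `payments` streams, and `match_payments` are all hypothetical names introduced for illustration.

```python
import heapq

# Each event is (timestamp, kind, user); both streams are already time-ordered.
orders = [(1, "order", "alice"), (4, "order", "bob")]
payments = [(2, "payment", "alice"), (5, "payment", "bob")]

# Combine the two streams into one, preserving timestamp order.
combined = list(heapq.merge(orders, payments, key=lambda event: event[0]))


def match_payments(events):
    """Emit a 'paid' event when a payment follows an open order for the same user."""
    open_orders = set()
    for timestamp, kind, user in events:
        if kind == "order":
            open_orders.add(user)
        elif kind == "payment" and user in open_orders:
            open_orders.discard(user)
            yield (timestamp, "paid", user)


paid = list(match_payments(combined))
```

The derived `paid` stream exists in neither input on its own, which is the sense in which the processor combines streams rather than just forwarding them.
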
  44. https://www.confluent.io/download/

  45. We are hiring! https://www.confluent.io/careers/

  46. One more thing…

  49. A Major New Paradigm

  50. Thanks! Questions? @gamussa, viktor@confluent.io. We are hiring! https://www.confluent.io/careers/