[DevNexus-2018] Apache Kafka A Streaming Data Platform

Viktor Gamov
February 22, 2018

Transcript

  1. @ Apache Kafka A Streaming Data Platform

  2. @ @gamussa @confluentinc Who am I?

  3. @ @gamussa @confluentinc Who am I? Solutions Architect

  4. @ @gamussa @confluentinc Who am I? Solutions Architect, Developer Advocate

  5. @ @gamussa @confluentinc Who am I? Solutions Architect, Developer Advocate, @gamussa in internetz

  6. @ @gamussa @confluentinc Who am I? Solutions Architect, Developer Advocate, @gamussa in internetz. Hey you, yes, you, go follow me on Twitter
  7. @ @gamussa @confluentinc

  8. @ @gamussa @confluentinc A company is built on

  9. @ @gamussa @confluentinc A company is built on DATA FLOWS, but all we have is DATA STORES
  10. @ @gamussa @confluentinc

  11. @ @gamussa @confluentinc

  12. @ @gamussa @confluentinc

  13. @ @gamussa @confluentinc

  14. @ @gamussa @confluentinc

  15. @ @gamussa @confluentinc

  16. @ @gamussa @confluentinc Streaming Platform: 1. Pub/Sub 2. Store 3. Process

  17. @ @gamussa @confluentinc Streaming Platform: 1. Pub/Sub 2. Store 3. Process
  18. @ @gamussa @confluentinc Core abstraction

  19. @ @gamussa @confluentinc Core abstraction: DB - table

  20. @ @gamussa @confluentinc Core abstraction: DB - table; Hadoop - file

  21. @ @gamussa @confluentinc Core abstraction: DB - table; Hadoop - file; Messaging - ?
  22. @ @gamussa @confluentinc LOGS
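
The log named on the slide above as the core abstraction can be pictured as an append-only sequence in which each record receives the next offset. The sketch below is a minimal in-memory illustration under that assumption, not Kafka's storage code; the `Log` class and its `append` and `read` methods are hypothetical names introduced here.

```python
class Log:
    """Append-only sequence of records addressed by offset (illustrative only)."""

    def __init__(self):
        self._records = []

    def append(self, record):
        """Add a record at the end of the log and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def read(self, offset, max_records=10):
        """Return up to max_records records starting at offset, in append order."""
        if offset < 0:
            raise ValueError("offset must be non-negative")
        return self._records[offset:offset + max_records]


log = Log()
first = log.append("order-created")
second = log.append("order-paid")
```

Readers only need to remember an offset to resume from where they stopped, which is why several independent readers can share one log.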

  23. @ @gamussa @confluentinc Producing to Kafka (diagram label: Time)

  24. @ @gamussa @confluentinc Producing to Kafka (diagram labels: Time; C C C)

  25. @ @gamussa @confluentinc Producing to Kafka - With Key: hash(key) % numPartitions = N (diagram labels: Time; A B C D)
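
The formula on the slide above, hash(key) % numPartitions = N, can be sketched in a few lines. This is an illustration only: `partition_for` is a hypothetical helper, and `zlib.crc32` is a stand-in hash chosen because it is deterministic and in the standard library (Kafka's Java client hashes key bytes with murmur2).

```python
import zlib


def partition_for(key: bytes, num_partitions: int) -> int:
    """Pick a partition for a keyed message: hash(key) % numPartitions."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    # Stand-in hash (assumption): any deterministic hash shows the idea.
    return zlib.crc32(key) % num_partitions


# The same key always maps to the same partition, so per-key order is kept.
keys = [b"A", b"B", b"C", b"D"]
assignments = {key: partition_for(key, 4) for key in keys}
```

Note that the mapping depends on the partition count, so changing the number of partitions can move a key to a different partition.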
  26. @ @gamussa @confluentinc Producing to Kafka - No Key: messages will be produced in a round-robin fashion (diagram label: Time)

  27. @ @gamussa @confluentinc Producing to Kafka - No Key: messages will be produced in a round-robin fashion (diagram label: Time)

  28. @ @gamussa @confluentinc Producing to Kafka - No Key: messages will be produced in a round-robin fashion (diagram label: Time)

  29. @ @gamussa @confluentinc Producing to Kafka - No Key: messages will be produced in a round-robin fashion (diagram label: Time)
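
The round-robin behaviour described on the slides above for messages without a key can be sketched as a counter that cycles through the partitions. `RoundRobinPartitioner` and `next_partition` are hypothetical names for illustration, not part of any Kafka client.

```python
from itertools import count


class RoundRobinPartitioner:
    """Cycle through partitions 0..n-1 for messages that carry no key."""

    def __init__(self, num_partitions: int):
        if num_partitions <= 0:
            raise ValueError("num_partitions must be positive")
        self._num_partitions = num_partitions
        self._sent = count()  # number of messages routed so far

    def next_partition(self) -> int:
        """Return the partition for the next keyless message."""
        return next(self._sent) % self._num_partitions


partitioner = RoundRobinPartitioner(3)
targets = [partitioner.next_partition() for _ in range(6)]
```

Spreading keyless messages this way balances load across partitions, at the cost of any ordering guarantee between related messages.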
  30. @ @gamussa @confluentinc Consuming From Kafka - Single Consumer (diagram label: C)

  31. @ @gamussa @confluentinc Consuming From Kafka - Grouped Consumers (diagram labels: C C C1 C C C2)

  32. @ @gamussa @confluentinc Consuming From Kafka - Grouped Consumers (diagram labels: C C C C)

  33. @ @gamussa @confluentinc Consuming From Kafka - Grouped Consumers (diagram labels: 0 1 2 3)

  34. @ @gamussa @confluentinc Consuming From Kafka - Grouped Consumers (diagram labels: 0 1 2 3)

  35. @ @gamussa @confluentinc Consuming From Kafka - Grouped Consumers (diagram labels: 0, 3 1 2 3)
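
The grouped-consumer slides above suggest that each partition is read by exactly one member of a group, and that partitions are redistributed when membership changes. A minimal sketch of that idea follows, assuming a simple round-robin style assignment; `assign` and the consumer names `c0` to `c3` are hypothetical, and real Kafka clients choose among pluggable assignment strategies.

```python
def assign(partitions, consumers):
    """Spread partitions over group members so each partition has one owner."""
    members = sorted(consumers)
    assignment = {member: [] for member in members}
    if not members:
        return assignment
    for index, partition in enumerate(sorted(partitions)):
        # Deal partitions out to members in turn.
        assignment[members[index % len(members)]].append(partition)
    return assignment


# Four partitions, four consumers: one partition each.
before = assign(range(4), ["c0", "c1", "c2", "c3"])

# One consumer leaves: its partition is handed to a surviving member.
after = assign(range(4), ["c0", "c1", "c2"])
```

With more consumers than partitions, the extra members in this sketch simply receive an empty list, which mirrors the point that a partition is never shared inside one group.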
  36. @ @gamussa @confluentinc Producers Consumers

  37. @ @gamussa @confluentinc

  38. @ @gamussa @confluentinc

  39. @ @gamussa @confluentinc

  40. @ @gamussa @confluentinc Kafka Connect does hard work so you don’t

  41. @ @gamussa @confluentinc Kafka Connect does hard work so you don’t: 1. Scale out

  42. @ @gamussa @confluentinc Kafka Connect does hard work so you don’t: 1. Scale out

  43. @ @gamussa @confluentinc Kafka Connect does hard work so you don’t: 1. Scale out

  44. @ @gamussa @confluentinc Kafka Connect does hard work so you don’t: 1. Scale out
  45. @ @gamussa @confluentinc

  46. @ @gamussa @confluentinc

  47. @ @gamussa @confluentinc

  48. @ @gamussa @confluentinc

  49. @ @gamussa @confluentinc Streaming Platform: 1. Pub/Sub 2. Store 3. Process
  50. @ @gamussa @confluentinc Why Store?

  51. @ @gamussa @confluentinc Scalability of a filesystem

  52. @ @gamussa @confluentinc Scalability of a filesystem: throughput of 100s of MB/s

  53. @ @gamussa @confluentinc Scalability of a filesystem: throughput of 100s of MB/s, TBs per server

  54. @ @gamussa @confluentinc Scalability of a filesystem: throughput of 100s of MB/s, TBs per server, commodity hardware

  55. @ @gamussa @confluentinc Scalability of a filesystem: throughput of 100s of MB/s, TBs per server, commodity hardware, O(1) writes
  56. @ @gamussa @confluentinc Guarantees of a database

  57. @ @gamussa @confluentinc Guarantees of a database: persistence

  58. @ @gamussa @confluentinc Guarantees of a database: persistence, strict ordering

  59. @ @gamussa @confluentinc Distributed by Design

  60. @ @gamussa @confluentinc Distributed by Design: Replication

  61. @ @gamussa @confluentinc Distributed by Design: Replication, Fault Tolerance

  62. @ @gamussa @confluentinc Distributed by Design: Replication, Fault Tolerance, Partitioning

  63. @ @gamussa @confluentinc Distributed by Design: Replication, Fault Tolerance, Partitioning, Scale
  64. @ @gamussa @confluentinc

  65. @ @gamussa @confluentinc Partition Leadership and Replication (diagram: Brokers 1-4 hosting Topic1 partitions 1-4, each partition shown three times, with Leader and Follower replicas)

  66. @ @gamussa @confluentinc Partition Leadership and Replication - node failure (diagram: Brokers 1-4 hosting Topic1 partitions 1-4, each partition shown three times, with Leader and Follower replicas)
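
The node-failure slide above can be read as: when the broker holding a partition's leader replica dies, one of the followers takes over. The sketch below illustrates that under simplifying assumptions. The replica placement is invented for the example (the slide's exact layout is not recoverable from the transcript), `elect_leaders` is a hypothetical helper, and the real system also tracks which followers are in sync before promoting one.

```python
def elect_leaders(replicas, live_brokers):
    """Pick, per partition, the first replica that sits on a live broker."""
    leaders = {}
    for partition, brokers in replicas.items():
        # None means no replica of this partition survived.
        leaders[partition] = next(
            (broker for broker in brokers if broker in live_brokers), None
        )
    return leaders


# Assumed placement: 4 partitions, 3 replicas each, spread over brokers 1-4.
replicas = {
    "partition1": [1, 2, 3],
    "partition2": [2, 3, 4],
    "partition3": [3, 4, 1],
    "partition4": [4, 1, 2],
}

healthy = elect_leaders(replicas, live_brokers={1, 2, 3, 4})
after_failure = elect_leaders(replicas, live_brokers={2, 3, 4})
```

In this example only partition1 changes leader when broker 1 fails; the other partitions keep their leaders and merely lose one follower.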
  67. @ @gamussa @confluentinc Streaming Platform: 1. Pub/Sub 2. Store 3. Process

  68. @ @gamussa @confluentinc What is Stream Processing? A machine for combining streams of events
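
One very small instance of "combining streams of events", as the slide above puts it, is merging two time-ordered streams into a single time-ordered stream. The sketch below uses only the Python standard library; the event tuples and stream names are made up for illustration, and it says nothing about how Kafka's own stream processing is implemented.

```python
import heapq

# Each event is (timestamp, source, payload); both inputs are sorted by time.
clicks = [(1, "clicks", "home"), (4, "clicks", "cart"), (9, "clicks", "pay")]
views = [(2, "views", "banner"), (3, "views", "promo"), (8, "views", "footer")]

# heapq.merge lazily interleaves already-sorted inputs without loading
# everything into memory, which suits unbounded streams.
combined = list(heapq.merge(clicks, views, key=lambda event: event[0]))

timestamps = [event[0] for event in combined]
```

Joins, aggregations, and windowing build on the same idea of consuming several ordered inputs and emitting a new ordered output.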
  69. @ @gamussa @confluentinc

  70. @ @gamussa @confluentinc

  71. @ @gamussa @confluentinc https://www.confluent.io/download/

  72. @ @gamussa @confluentinc We are hiring! https://www.confluent.io/careers/

  73. @ @gamussa @confluentinc One more thing…

  74. @ @gamussa @confluentinc

  75. @ @gamussa @confluentinc

  76. @ @gamussa @confluentinc

  77. @ @gamussa @confluentinc

  78. @ @gamussa @confluentinc

  79. @ @gamussa @confluentinc A Major New Paradigm

  80. @ @gamussa @confluentinc Thanks! Questions? @gamussa viktor@confluent.io We are hiring! https://www.confluent.io/careers/