Streaming Data on the Shoulders of Giants @ MongoDB.local London 2019

* Abstract:

Life doesn’t happen in batch mode, which is why application engineers and data architects need to cooperate closely to get the best out of streaming platforms like Apache Kafka and operational NoSQL data stores such as MongoDB. This session explores ways to integrate both worlds in a streaming fashion.

* Description:

Without doubt, stream processing is a big deal these days, and oftentimes we find Apache Kafka as the central nervous system of company-wide data architectures. However, many real-world use cases simply need an operational data store that is flexible, robust and scalable enough to live up to diverse application-related requirements and challenges. This session discusses different options for building solid data integration pipelines between MongoDB and Apache Kafka. The focus lies on configuration-based data-in-motion scenarios that leverage the Kafka Connect framework to lay out streaming ETL pipeline examples without writing a single line of code.

* Video Recording: https://youtu.be/1gVS9WNaZTE
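
The configuration-based approach mentioned in the description can be illustrated with a minimal sketch. It assumes a Kafka Connect worker reachable at `http://localhost:8083` and a local MongoDB instance; the connector name, topic names, database and collection are hypothetical placeholders, and the script only builds and prints the request body rather than sending it.

```python
import json

# Assumed default REST endpoint of a local Kafka Connect worker.
CONNECT_URL = "http://localhost:8083/connectors"


def build_mongo_sink_config(name, topics, connection_uri, database, collection):
    """Build the JSON body for creating a MongoDB sink connector via the Connect REST API."""
    return {
        "name": name,
        "config": {
            "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
            "tasks.max": "1",
            # Kafka Connect expects a comma-separated list of topic names.
            "topics": ",".join(topics),
            "connection.uri": connection_uri,
            "database": database,
            "collection": collection,
            # Plain string keys and schemaless JSON values (an assumption about the topic data).
            "key.converter": "org.apache.kafka.connect.storage.StringConverter",
            "value.converter": "org.apache.kafka.connect.json.JsonConverter",
            "value.converter.schemas.enable": "false",
        },
    }


# All argument values below are hypothetical and chosen for illustration only.
payload = build_mongo_sink_config(
    name="mongo-sink-demo",
    topics=["sensor-readings", "device-events"],
    connection_uri="mongodb://localhost:27017",
    database="demo",
    collection="readings",
)

body = json.dumps(payload, indent=2)
print(f"POST {CONNECT_URL}")
print(body)
```

Sending this body to the worker with the header `Content-Type: application/json` would register the connector. The pipeline itself is described entirely by configuration, which is the point the description makes.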

Hans-Peter Grahsl

September 25, 2019

Transcript

  1. Streaming Data on the Shoulders of Giants • Hans-Peter Grahsl, NETCONOMY • @hpgrahsl • #MDBlocal LONDON
  2. #MDBLocal > whoami ✓ working & living in Graz ✓ technical trainer at NETCONOMY ✓ independent engineer / consultant ✓ associate lecturer ✓ occasional conference speaker mongodb.com/local/london
  3. Speed & Agility Among Top Tech Risks: For businesses to stay relevant they must deliver value at a breakneck pace and be constantly seeking new sources of value…
  4. Managing, Processing & Analyzing Data: We all use data to unlock insights and drive value
  5. Diminishing Value of Data [chart: VALUE of data to decision-making over time (near real time, seconds, minutes, hours, days, months); labels: time critical decisions, traditional business intelligence; preventive / predictive, actionable, reactive, historical] Source: Perishable Insights, Mike Gualtieri, Forrester
  6. Historic ETL can be rather painful • batch-driven • brittle & error-prone • (too) late answers. Speed & Agility Antipattern
  7. Alleviate some ETL pain with streaming • event-centric & stream-oriented • quick insights & fast answers. Speed & Agility Enabler
  8. Architecture of a Modern Data Platform? [diagram: Data Sources, Streaming Data Fabric, Stream Processors, Connected Apps, Data Sinks]
  9. On the Shoulders of Giants: Kafka® MongoDB®
  10. Modern Database
  11. Modern Database: Document Model • Run Anywhere • Distributed & Scalable • Resilient & Performant
  12. Apache Kafka®: Minimum Viable Introduction
  13. Streaming Platform
  14. Streaming Platform ✓ distributed ✓ horizontally scalable ✓ highly fault-tolerant
  15. What is Streaming? “… a type of data processing that is designed with infinite data sets in mind …” – Tyler Akidau
  16. "…everything that happens in a company – every customer interaction, every API request, every database change – can be represented as a real-time stream that anything else can tap into, process or react to."
  17. "…Kafka and the whole category of stream processing represents a fundamental paradigm shift in how the digital part of a company is built, how data is used, and how applications are built. This is actually a pretty rare thing…" – Jay Kreps
  18. [diagram: Data Sources and Data Sinks attached via the Connect API; Apps attached via the Producer API and Consumer API; KStreams App via the Streams API; KSQL App via KSQL]
  19. Kafka APIs in a Nutshell… • Producer & Consumer API → publish-subscribe scenarios • Connect API → streaming data integration scenarios • Streams API & KSQL → code or SQL-based streaming scenarios
  20. Kafka® Connect: What’s it about?
  21. Kafka Connect Basics [diagram: ANY source → Connect → Connect → ANY sink] ANY → e.g. file systems, data stores, REST endpoints, …
  22. Kafka Connect Basics, often about data stores [diagram: SOURCE DB → Connect → Connect → SINK DB]
  23. Kafka Connect Basics, or more concretely [diagram: Connect, Connect] https://hub.confluent.io → many many more
  24. Kafka Connect Basics, or more concretely [diagram: Connect, Connect] https://hub.confluent.io → many many more
  25. Kafka® Connect: How do connectors operate?
  26. Kafka Source Connectors [diagram: Source Connector → SMT 1 … N → Converter (Serialize)] Single Message Transforms (SMT) for basic in-flight manipulations
  27. Kafka Sink Connectors [diagram: Converter (Deserialize) → SMT 1 … N → Sink Connector] Single Message Transforms (SMT) for basic in-flight manipulations
  28. Official Connector Announced at #MDBW19
  29. MongoDB Connector for Apache Kafka • Map and persist events from Kafka topics directly to MongoDB • Publish data changes from MongoDB into Kafka topics
  30. MongoDB Connector for Apache Kafka ✓ developed open-source ✓ officially supported by MongoDB ✓ Verified Gold certified by Confluent. Available for download on the Confluent Hub: https://www.confluent.io/hub/mongodb/kafka-connect-mongodb
  31. Use Cases: MongoDB Connector for Apache Kafka
  32. Single Customer View for eCommerce [diagram: Source Connectors, MongoDB Sinks, Single Source of Truth]
  33. Data Synchronization between Microservices [diagram: Service 1 … Service N, MongoDB Sinks]
  34. Recommendation Engine for Opinion Mining [diagram: Surveys & Polls Data, MongoDB Source, Change Streams, User Recommendation Engine]
  35. Demo Scenario: Let’s see it in action!
  36. Demo Scenario [diagram: Producer API, data generation, Stream Processor, data serving, REST, Change Streams, device management, SSE]
  37. Demo Scenario [diagram: Producer API, data generation, Stream Processor, MongoDB Sink Connector, MongoDB Source Connector, data serving, REST, Change Streams, device management, SSE]
  38. THANK YOU!
  39. Streaming Data on the Shoulders of Giants [NETCONOMY] Hans-Peter Grahsl https://www.surveymonkey.com/r/KXC6DB6
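
The source connector flow on slide 26 (connector output, then Single Message Transforms 1 … N, then a converter that serializes) can be modeled in a few lines. This is a conceptual sketch in plain Python, not Kafka Connect's Java API: the helper names `mask_field`, `insert_field`, `json_converter` and `run_source_pipeline` are hypothetical, and the two transforms are only loosely inspired by the kind of in-flight manipulation SMTs perform.

```python
import json
from typing import Callable, Dict, List

Record = Dict[str, object]
Transform = Callable[[Record], Record]


def mask_field(field: str) -> Transform:
    """Return a transform that replaces the value of `field` with a fixed mask."""
    def apply(record: Record) -> Record:
        masked = dict(record)  # copy so the input record is left untouched
        if field in masked:
            masked[field] = "****"
        return masked
    return apply


def insert_field(field: str, value: object) -> Transform:
    """Return a transform that adds a static field to every record."""
    def apply(record: Record) -> Record:
        extended = dict(record)
        extended[field] = value
        return extended
    return apply


def json_converter(record: Record) -> bytes:
    """Serialize a record to UTF-8 JSON bytes, standing in for a Connect converter."""
    return json.dumps(record, sort_keys=True).encode("utf-8")


def run_source_pipeline(
    record: Record,
    transforms: List[Transform],
    converter: Callable[[Record], bytes],
) -> bytes:
    """Apply transforms 1..N in order, then serialize the result."""
    for transform in transforms:
        record = transform(record)
    return converter(record)


# Hypothetical record as a source connector might emit it.
raw = {"device": "sensor-1", "owner": "alice", "temp": 21.5}
serialized = run_source_pipeline(
    raw,
    [mask_field("owner"), insert_field("origin", "demo")],
    json_converter,
)
print(serialized.decode("utf-8"))
```

A sink connector (slide 27) runs the same chain in the opposite direction: the converter deserializes first, then the transforms are applied before the record reaches the connector.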