Streaming Data on the Shoulders of Giants @ MongoDB.local London 2019

* Abstract:

Life doesn’t happen in batch mode, which is why application engineers and data architects need to cooperate closely to get the best out of streaming platforms like Apache Kafka and operational NoSQL data stores such as MongoDB. This session explores ways and means to integrate both worlds in a streaming fashion.

* Description:

Without doubt, stream processing is a big deal these days, and we often find Apache Kafka as the central nervous system of company-wide data architectures. However, many real-world use cases simply need an operational data store that is flexible, robust and scalable enough to live up to diverse application-related requirements and challenges. This session discusses different options for building solid data integration pipelines between MongoDB and Apache Kafka. The focus lies on configuration-based data-in-motion scenarios that leverage the Kafka Connect framework to lay out streaming ETL pipeline examples without writing a single line of code.

* Video Recording: https://youtu.be/1gVS9WNaZTE
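
The configuration-based approach mentioned in the description can be sketched as follows. This is a minimal illustration, not the setup used in the talk: the connector name, topic, database, collection and connection URI are placeholder assumptions. The Python code only assembles the JSON payload that would typically be submitted to the Kafka Connect REST API; it does not contact any cluster.

```python
import json


def mongo_sink_config(name, topics, uri, database, collection):
    """Build a Kafka Connect payload for a MongoDB sink connector.

    All argument values are supplied by the caller; nothing here is
    taken from the talk itself.
    """
    return {
        "name": name,
        "config": {
            # Sink connector class shipped with the MongoDB Connector for Apache Kafka
            "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
            # Kafka topics to read from, as a comma-separated list
            "topics": ",".join(topics),
            # Target MongoDB deployment, database and collection
            "connection.uri": uri,
            "database": database,
            "collection": collection,
            # Converters: plain string keys, schemaless JSON values (an assumption)
            "key.converter": "org.apache.kafka.connect.storage.StringConverter",
            "value.converter": "org.apache.kafka.connect.json.JsonConverter",
            "value.converter.schemas.enable": "false",
        },
    }


# Hypothetical placeholder values for illustration only
payload = mongo_sink_config(
    name="polls-sink",
    topics=["poll-votes"],
    uri="mongodb://localhost:27017",
    database="surveys",
    collection="votes",
)
print(json.dumps(payload, indent=2))
```

The printed JSON is the whole "pipeline definition" in this style of integration: the behavior lives in properties rather than in application code.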

Hans-Peter Grahsl

September 25, 2019

Transcript

  1. #MDBLocal > whoami ✓ working & living in Graz ✓ technical trainer at NETCONOMY ✓ independent engineer / consultant ✓ associate lecturer ✓ occasional conference speaker mongodb.com/local/london
  2. Speed & Agility Among Top Tech Risks: For businesses to stay relevant they must deliver value at a breakneck pace and be constantly seeking new sources of value…
  3. Managing, Processing & Analyzing Data: We all use Data to unlock insights and drive value
  4. Diminishing Value of Data [chart: VALUE of data to decision-making over time (near real time, seconds, minutes, hours, days, months); labels: time critical decisions, traditional business intelligence, preventive / predictive, actionable, reactive, historical] Source: Perishable Insights, Mike Gualtieri, Forrester
  5. Historic ETL can be rather painful (Speed & Agility Antipattern) • batch-driven • brittle & error prone • (too) late answers
  6. Alleviate some ETL pain with streaming (Speed & Agility Enabler) • event-centric & stream-oriented • quick insights & fast answers
  7. Architecture of a Modern Data Platform? [diagram labels: Data Sources, Streaming Data Fabric, Stream Processors, Connected Apps, Data Sinks]
  8. What is Streaming? “… a type of data processing that is designed with infinite data sets in mind …” – Tyler Akidau
  9. "…everything that happens in a company – every customer interaction, every API request, every database change – can be represented as a real-time stream that anything else can tap into, process or react to."
  10. "…Kafka and the whole category of stream processing represents a fundamental paradigm shift in how the digital part of a company is built, how data is used, and how applications are built. This is actually a pretty rare thing…" – Jay Kreps
  11. [diagram labels: Data Sources, Data Sinks, Connect API (twice), Apps, Producer API, Consumer API, KStreams App, Streams API, KSQL App, KSQL]
  12. Kafka APIs in a Nutshell… • Producer & Consumer API → publish-subscribe scenarios • Connect API → streaming data integration scenarios • Streams API & KSQL → code or SQL-based streaming scenarios
  13. Kafka Connect Basics [diagram labels: ANY source, Connect, Connect, ANY sink] ANY → e.g. file systems, data stores, REST endpoints, …
  14. Kafka Source Connectors [diagram labels: Source Connector, SMT 1 … SMT N, Converter (Serialize)] Single Message Transforms (SMTs) for basic in-flight manipulations
  15. Kafka Sink Connectors [diagram labels: Converter (Deserialize), SMT 1 … SMT N, Sink Connector] Single Message Transforms (SMTs) for basic in-flight manipulations
  16. MongoDB Connector for Apache Kafka • Map and persist events from Kafka topics directly to MongoDB • Publish data changes from MongoDB into Kafka topics
  17. MongoDB Connector for Apache Kafka ✓ developed open-source ✓ officially supported by MongoDB ✓ Verified Gold certified by Confluent. Available for download on the Confluent Hub: https://www.confluent.io/hub/mongodb/kafka-connect-mongodb
  18. Recommendation Engine for Opinion Mining [diagram labels: Surveys & Polls Data, MongoDB Source, Change Streams, Change Streams, User Recommendation Engine]
  19. Demo Scenario [diagram labels: Producer API, data generation, Stream Processor, data serving, REST, Change Streams, device management, SSE]
  20. Demo Scenario [diagram labels: Producer API, data generation, Stream Processor, MongoDB Sink Connector, MongoDB Source Connector, data serving, REST, Change Streams, device management, SSE]
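
The slides on source connectors, Single Message Transforms and publishing MongoDB changes into Kafka topics can be illustrated with a small configuration sketch. It is not the demo configuration from the talk: the connector name, URI, database, collection, topic prefix and routing pattern are hypothetical placeholders, and the `with_transforms` helper is an illustrative function, not part of any library. The code only builds the JSON payload for a Kafka Connect worker.

```python
import json


def mongo_source_config(name, uri, database, collection, topic_prefix):
    """Build a Kafka Connect payload for a MongoDB source connector."""
    return {
        "name": name,
        "config": {
            "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
            "connection.uri": uri,
            "database": database,
            "collection": collection,
            # Published topic names are derived as <prefix>.<database>.<collection>
            "topic.prefix": topic_prefix,
            # Change stream aggregation pipeline: forward insert events only (an assumption)
            "pipeline": json.dumps([{"$match": {"operationType": "insert"}}]),
        },
    }


def with_transforms(payload, transforms):
    """Return a copy of the payload with an SMT chain (1 ... N) attached.

    `transforms` is a list of (alias, smt_class, settings) tuples; the
    transforms are applied by Kafka Connect in the order listed.
    """
    config = dict(payload["config"])
    config["transforms"] = ",".join(alias for alias, _, _ in transforms)
    for alias, smt_class, settings in transforms:
        config[f"transforms.{alias}.type"] = smt_class
        for key, value in settings.items():
            config[f"transforms.{alias}.{key}"] = value
    return {"name": payload["name"], "config": config}


# Hypothetical placeholder values for illustration only
source = mongo_source_config(
    name="polls-source",
    uri="mongodb://localhost:27017",
    database="surveys",
    collection="responses",
    topic_prefix="mdb",
)

# One SMT in the chain: RegexRouter rewrites the topic name in flight,
# here intended to map mdb.surveys.<collection> to polls-<collection>.
routed = with_transforms(
    source,
    [(
        "route",
        "org.apache.kafka.connect.transforms.RegexRouter",
        {"regex": "mdb\\.surveys\\.(.*)", "replacement": "polls-$1"},
    )],
)
print(json.dumps(routed, indent=2))
```

RegexRouter is used here because it only touches the topic name, so it does not depend on the structure of the record value that the source connector emits.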