Apache Flink

This presentation gives an overview of the Apache Flink project. It explains Flink in terms of its architecture, use cases and the manner in which it works.

Links for further information and connecting

http://www.amazon.com/Michael-Frampton/e/B00NIQDOOM/

https://nz.linkedin.com/pub/mike-frampton/20/630/385

https://open-source-systems.blogspot.com/

Mike Frampton

May 23, 2020
Transcript

  1. What Is Apache Flink?
    • A stream processing framework
    • Open source / Apache 2.0 license
    • Written in Java and Scala
    • For batch and stream processing
    • For high volume, low latency
    • Develop in Java, Scala, Python, SQL
    • Automatic compilation/optimization into data flows
  2. How Does Flink Work?
    • Processes unbounded and bounded data
    • Uses file systems to consume and persistently store data, e.g.
      – local, Hadoop-compatible, Amazon S3, MapR FS, OpenStack Swift FS, Aliyun OSS and Azure Blob Storage
    • Leverages in-memory performance
    • Provides a rich function set for handling
      – Streams, state and time
      – When building applications
    • Provides layered APIs which provide a balance between
      – Conciseness and expressiveness
      – See next slide
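The "streams, state and time" bullet can be illustrated with a minimal sketch in plain Python. This is not Flink's API: the function names, the one-minute window size and the `(key, timestamp_ms)` event shape are all assumptions made for illustration. It keeps a count per key and per event-time window, and works the same way on a finite list (bounded) or an endless generator (unbounded).

```python
from collections import defaultdict

WINDOW_SIZE_MS = 60_000  # hypothetical one-minute tumbling window


def window_start(timestamp_ms, size_ms=WINDOW_SIZE_MS):
    """Return the start of the tumbling window that contains the timestamp."""
    return timestamp_ms - (timestamp_ms % size_ms)


def running_counts(events, size_ms=WINDOW_SIZE_MS):
    """Yield (key, window_start, count) after each event.

    `events` is any iterable of (key, timestamp_ms) pairs. Because results
    are yielded one event at a time, the input may be bounded or unbounded.
    """
    state = defaultdict(int)  # (key, window_start) -> count seen so far
    for key, timestamp_ms in events:
        window = window_start(timestamp_ms, size_ms)
        state[(key, window)] += 1
        yield key, window, state[(key, window)]


sample_events = [("a", 1_000), ("a", 59_999), ("b", 30_000), ("a", 60_000)]
results = list(running_counts(sample_events))
```

The windows are assigned from the timestamp carried by each event rather than from the wall clock, which is the idea behind event-time processing.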
  3. Flink APIs
    • SQL & Table API
    • DataStream API
    • ProcessFunctions – event processing
    • Flink also has libraries for common data processing
      – Complex Event Processing (CEP)
      – DataSet API
      – Gelly - library for scalable graph processing/analysis
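The trade-off between conciseness and expressiveness across the API layers can be sketched in plain Python. None of the names below are Flink classes: `count_by_key` stands in for a concise, declarative layer, while the hypothetical `ThresholdHandler` stands in for a lower-level, per-event layer in which the author manages state explicitly and can add custom logic.

```python
from collections import Counter


def count_by_key(events):
    """Concise, declarative style: say what to compute, not how."""
    return dict(Counter(key for key, _value in events))


class ThresholdHandler:
    """Expressive, per-event style: explicit state and custom logic.

    Returns an alert string the moment a key reaches `threshold` events,
    something the declarative one-liner above cannot express directly.
    """

    def __init__(self, threshold):
        self.threshold = threshold
        self.counts = {}  # key -> number of events seen so far

    def process_element(self, key, value):
        count = self.counts.get(key, 0) + 1
        self.counts[key] = count
        if count == self.threshold:
            return f"key {key} reached {count} events"
        return None


sample_events = [("a", 1), ("b", 2), ("a", 3)]
totals = count_by_key(sample_events)

handler = ThresholdHandler(threshold=2)
alerts = [handler.process_element(k, v) for k, v in sample_events]
```

Both styles arrive at the same counts. The lower-level one needs more code but can react to each individual event.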
  4. Flink Deployment
    • Deploy Flink to use the following cluster managers
      – YARN
      – Mesos
      – Kubernetes
      – Standalone
    • All application control communications via REST calls
    • Deploy at any scale
      – multiple trillions of events per day
      – multiple terabytes of state
      – thousands of cores
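As a rough illustration of control via REST calls, the sketch below builds URLs for a Flink REST endpoint and fetches JSON using only the Python standard library. The host name and helper function names are assumptions; port 8081 is the usual default REST port, and `/overview` and `/jobs` are monitoring paths that should be checked against the REST API documentation for the Flink version in use.

```python
import json
from urllib.request import urlopen

DEFAULT_REST_PORT = 8081  # usual default; configurable per cluster


def rest_url(host, path, port=DEFAULT_REST_PORT):
    """Build the URL for a REST path such as "/jobs" or "/overview"."""
    return f"http://{host}:{port}/{path.lstrip('/')}"


def fetch_json(url, timeout_s=5):
    """Issue a GET request and decode the JSON response body."""
    with urlopen(url, timeout=timeout_s) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical host name, used only to show the resulting URLs.
jobs_url = rest_url("jobmanager.example.com", "/jobs")
overview_url = rest_url("jobmanager.example.com", "overview")

# Against a running cluster one might then call:
#   cluster_overview = fetch_json(overview_url)
```

The network call is left in a comment so that the sketch runs without a cluster.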
  5. Flink Stateful Functions
    • Simplifies building distributed stateful applications
    • Provides a runtime built for serverless architectures
    • Key Benefits
      – Dynamic Messaging
      – Consistent State
      – Multi-language Support
      – No Database Required
      – Cloud Native
      – "Stateless" Operation
  6. Flink Use Cases
    • Event-driven Applications, e.g.
      – Fraud detection
      – Anomaly detection
    • Data Analytics Applications
      – Quality monitoring of Telco networks
      – Analysis of product updates & experiment evaluation in mobile applications
    • Data Pipeline Applications
      – Real-time search index building in e-commerce
      – Continuous ETL in e-commerce
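The fraud detection use case can be sketched as an event-driven, keyed-state rule in plain Python. This is an illustrative assumption rather than Flink code: the rule (a small transaction followed immediately by a large one on the same account within one minute), the thresholds and the `FraudDetector` class are all hypothetical.

```python
SMALL_AMOUNT = 1.00    # hypothetical threshold for a "test" transaction
LARGE_AMOUNT = 500.00  # hypothetical threshold for a suspicious amount
MAX_GAP_MS = 60_000    # hypothetical limit between the two transactions


class FraudDetector:
    """Flags a large transaction that directly follows a small one."""

    def __init__(self):
        # Keyed state: account id -> timestamp of a preceding small transaction.
        self.last_small_ts = {}

    def process(self, account_id, amount, timestamp_ms):
        """Return True if this transaction should raise an alert."""
        # The flag only applies to the very next transaction, so clear it now.
        small_ts = self.last_small_ts.pop(account_id, None)
        alert = (
            small_ts is not None
            and amount > LARGE_AMOUNT
            and timestamp_ms - small_ts <= MAX_GAP_MS
        )
        if amount < SMALL_AMOUNT:
            self.last_small_ts[account_id] = timestamp_ms
        return alert


detector = FraudDetector()
outcomes = [
    detector.process("acct-1", 0.50, 0),         # small: remember it
    detector.process("acct-1", 600.00, 30_000),  # large within a minute
    detector.process("acct-1", 600.00, 40_000),  # flag already cleared
    detector.process("acct-2", 0.50, 0),         # small on another account
    detector.process("acct-2", 600.00, 120_000), # large, but too late
]
```

State is kept per account, so activity on one account never triggers an alert on another.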
  7. Available Books
    • See “Big Data Made Easy” – Apress Jan 2015
    • See “Mastering Apache Spark” – Packt Oct 2015
    • See “Complete Guide to Open Source Big Data Stack” – Apress Jan 2018
    • Find the author on Amazon
      – www.amazon.com/Michael-Frampton/e/B00NIQDOOM/
    • Connect on LinkedIn
      – www.linkedin.com/in/mike-frampton-38563020
  8. Connect
    • Feel free to connect on LinkedIn
      – www.linkedin.com/in/mike-frampton-38563020
    • See my open source blog at
      – open-source-systems.blogspot.com/
    • I am always interested in
      – New technology
      – Opportunities
      – Technology based issues
      – Big data integration