Apache Flink

This presentation gives an overview of the Apache Flink project, covering its architecture, its use cases and how it works.

Links for further information and connecting

http://www.amazon.com/Michael-Frampton/e/B00NIQDOOM/

https://nz.linkedin.com/pub/mike-frampton/20/630/385

https://open-source-systems.blogspot.com/

Mike Frampton

May 23, 2020

Transcript

  1. What Is Apache Flink?
    • A stream processing framework
    • Open source / Apache 2.0 license
    • Written in Java and Scala
    • For batch and stream processing
    • For high volume, low latency
    • Develop in Java, Scala, Python, SQL
    • Automatic compilation/optimization into data flows
  2. How Does Flink Work?
    • Processes unbounded and bounded data
    • Uses file systems to consume and persistently store data, e.g.
      – local, Hadoop-compatible, Amazon S3, MapR FS, OpenStack Swift FS, Aliyun OSS and Azure Blob Storage
    • Leverages in-memory performance
    • Provides a rich function set for handling streams, state and time when building applications
    • Provides layered APIs, which offer a balance between
      – Conciseness and expressiveness
      – See next slide
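
To make the "unbounded and bounded data" and "streams, state and time" points concrete, here is a minimal sketch in plain Python. It is not Flink's API. It only illustrates the idea that one windowed, stateful function can serve both a finite data set and an endless stream. The event shape `(key, event_time_ms)`, the function name `tumbling_counts` and the assumption that events arrive in event-time order are all hypothetical choices made for this illustration.

```python
from collections import defaultdict
from itertools import count, islice
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical event shape for this sketch: (key, event_time_ms).
Event = Tuple[str, int]


def tumbling_counts(
    events: Iterable[Event], size_ms: int
) -> Iterator[Tuple[int, Dict[str, int]]]:
    """Yield (window_start_ms, counts_per_key) each time a window closes.

    Assumes events arrive in event-time order, so a window can be closed
    as soon as the first event of a later window is seen.
    """
    current_start = None
    counts: Dict[str, int] = defaultdict(int)  # per-key state of the open window
    for key, ts in events:
        start = ts - ts % size_ms  # start of the window this event falls into
        if current_start is not None and start != current_start:
            yield current_start, dict(counts)
            counts.clear()
        current_start = start
        counts[key] += 1
    if current_start is not None:
        # Only reachable for bounded input: flush the last open window.
        yield current_start, dict(counts)


# Bounded input: a finite list, processed to completion.
bounded = [("a", 0), ("b", 400), ("a", 900), ("a", 1100), ("b", 2500)]
bounded_result = list(tumbling_counts(bounded, 1000))


def endless() -> Iterator[Event]:
    """An unbounded source: one event every 300 ms, alternating keys."""
    for i in count():
        yield ("a" if i % 2 == 0 else "b", i * 300)


# Unbounded input: the same function, but we only take the first two windows.
unbounded_result = list(islice(tumbling_counts(endless(), 1000), 2))
```

The same function body handles both cases. The only difference is that the bounded input eventually ends and flushes its final window, while the unbounded input keeps emitting results for as long as it is consumed.
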
  3. How Does Flink Work? Flink layered APIs

  4. Flink APIs
    • SQL & Table API
    • DataStream API
    • ProcessFunctions – event processing
    • Flink also has libraries for common data processing
      – Complex Event Processing (CEP)
      – DataSet API
      – Gelly: library for scalable graph processing/analysis
  5. Flink Used By

  6. Flink Deployment
    • Deploy Flink to use the following cluster managers
      – YARN
      – Mesos
      – Kubernetes
      – Standalone
    • All application control communication is via REST calls
    • Deploy at any scale
      – multiple trillions of events per day
      – multiple terabytes of state
      – thousands of cores
  7. Flink Architecture

  8. Flink Stateful Functions
    • Simplifies building distributed stateful applications
    • Provides a runtime built for serverless architectures
    • Key Benefits
      – Dynamic Messaging
      – Consistent State
      – Multi-language Support
      – No Database Required
      – Cloud Native
      – "Stateless" Operation
  9. Flink Stateful Functions

  10. Flink Use Cases
    • Event-driven Applications, e.g.
      – Fraud detection
      – Anomaly detection
    • Data Analytics Applications
      – Quality monitoring of Telco networks
      – Analysis of product updates & experiment evaluation in mobile applications
    • Data Pipeline Applications
      – Real-time search index building in e-commerce
      – Continuous ETL in e-commerce
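
As an illustration of the event-driven fraud detection use case, the sketch below keeps a small piece of state per account and raises an alert when a very small transaction is followed by a large one shortly afterwards. It is written in plain Python and does not use Flink's API. The rule itself, the thresholds (`SMALL_AMOUNT`, `LARGE_AMOUNT`, `MAX_GAP_MS`), the `Transaction` fields and the function name `detect_fraud` are assumptions chosen for this example, not something taken from the slides.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator

# Hypothetical thresholds for this sketch.
SMALL_AMOUNT = 1.00    # a "probe" transaction is anything below this
LARGE_AMOUNT = 500.00  # a suspicious follow-up is anything above this
MAX_GAP_MS = 60_000    # the follow-up must occur within one minute


@dataclass(frozen=True)
class Transaction:
    account: str
    timestamp_ms: int
    amount: float


def detect_fraud(transactions: Iterable[Transaction]) -> Iterator[Transaction]:
    """Yield each large transaction that directly follows a small one
    on the same account within MAX_GAP_MS."""
    # Keyed state: account -> timestamp of a small transaction, kept only
    # until the next transaction on that account is seen.
    last_small_at: Dict[str, int] = {}
    for tx in transactions:
        small_at = last_small_at.pop(tx.account, None)
        if (
            small_at is not None
            and tx.amount > LARGE_AMOUNT
            and tx.timestamp_ms - small_at <= MAX_GAP_MS
        ):
            yield tx
        if tx.amount < SMALL_AMOUNT:
            last_small_at[tx.account] = tx.timestamp_ms


stream = [
    Transaction("acct-1", 0, 0.50),
    Transaction("acct-2", 1_000, 0.20),
    Transaction("acct-1", 5_000, 900.00),    # small then large within a minute
    Transaction("acct-2", 120_000, 700.00),  # large, but too long after the small one
    Transaction("acct-3", 130_000, 800.00),  # large, with no small one before it
]
alerts = list(detect_fraud(stream))
```

Only the third transaction produces an alert. In a real stream processor the per-account dictionary would be replaced by managed, fault-tolerant keyed state, and the timeout would typically be driven by timers rather than by comparing timestamps on the next event.
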
  11. Flink Use Cases

  12. Flink Use Cases

  13. Available Books
    • See “Big Data Made Easy” – Apress Jan 2015
    • See “Mastering Apache Spark” – Packt Oct 2015
    • See “Complete Guide to Open Source Big Data Stack” – Apress Jan 2018
    • Find the author on Amazon
      – www.amazon.com/Michael-Frampton/e/B00NIQDOOM/
    • Connect on LinkedIn
      – www.linkedin.com/in/mike-frampton-38563020
  14. Connect
    • Feel free to connect on LinkedIn
      – www.linkedin.com/in/mike-frampton-38563020
    • See my open source blog at
      – open-source-systems.blogspot.com/
    • I am always interested in
      – New technology
      – Opportunities
      – Technology based issues
      – Big data integration