Apache Flink

What Is Apache Flink?
• A stream processing framework
• Open source / Apache 2.0 license
• Written in Java and Scala
• For batch and stream processing
• For high volume, low latency
• Develop in Java, Scala, Python, SQL
• Automatic compilation/optimization into data flows

How Does Flink Work?
• Processes unbounded and bounded data
• Uses file systems to consume and persistently store data, e.g.
  – local, Hadoop-compatible, Amazon S3, MapR FS, OpenStack Swift FS, Aliyun OSS and Azure Blob Storage
• Leverages in-memory performance
• Provides a rich function set for handling
  – Streams, state and time
  – When building applications
• Provides layered APIs which balance
  – Conciseness and expressiveness
  – See next slide
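
To make the "streams, state and time" bullet concrete, here is a minimal plain-Python sketch. It is not Flink code and uses no Flink API. The event layout `(key, event_time, value)` and the function name `tumbling_window_sums` are assumptions made for illustration, and the sketch assumes each key's events arrive in timestamp order (no out-of-order handling).

```python
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical event layout: (key, event_time_in_seconds, value)
Event = Tuple[str, int, float]


def tumbling_window_sums(
    events: Iterable[Event], window_size: int
) -> Iterator[Tuple[str, int, float]]:
    """Yield (key, window_start, total) each time a key's window closes."""
    # Per-key state: (start of the currently open window, running sum)
    state: Dict[str, Tuple[int, float]] = {}
    for key, ts, value in events:
        window_start = ts - ts % window_size
        if key not in state:
            state[key] = (window_start, value)
        elif state[key][0] == window_start:
            state[key] = (window_start, state[key][1] + value)
        else:
            # Event time moved past the open window: emit it, start a new one
            old_start, total = state[key]
            yield key, old_start, total
            state[key] = (window_start, value)
    # Reached only when the input is bounded: flush windows still open
    for key, (start, total) in state.items():
        yield key, start, total


if __name__ == "__main__":
    bounded = [("a", 1, 1.0), ("a", 4, 2.0), ("b", 3, 5.0), ("a", 11, 3.0)]
    for result in tumbling_window_sums(bounded, window_size=10):
        print(result)
```

Because the function is a generator, the same logic handles a bounded list and an endless iterator: results for closed windows are emitted as events arrive, and the final flush only happens if the input ever ends.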

How Does Flink Work? Flink layered APIs

Flink APIs
• SQL & Table API
• DataStream API
• ProcessFunctions
  – Event processing
• Flink also has libraries for common data processing
  – Complex Event Processing (CEP)
  – DataSet API
  – Gelly: library for scalable graph processing/analysis
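
The ProcessFunction layer gives per-event access to keyed state and timers. The sketch below imitates that idea in plain Python; it is not the Flink API. The class `InactivityAlert`, the toy driver `run`, and the `(key, timestamp)` event layout are all hypothetical names chosen for illustration, and events are assumed to be sorted by timestamp.

```python
from typing import Callable, Dict, Iterable, List, Tuple


class InactivityAlert:
    """Per-key handler: alert when a key sees no event for `timeout` seconds."""

    def __init__(self, timeout: int) -> None:
        self.timeout = timeout
        self.last_seen: Dict[str, int] = {}  # keyed state: last event time

    def process_element(
        self, key: str, ts: int, register_timer: Callable[[str, int], None]
    ) -> None:
        # Remember the event time and ask to be called back after the timeout
        self.last_seen[key] = ts
        register_timer(key, ts + self.timeout)

    def on_timer(self, key: str, timer_ts: int, out: List[str]) -> None:
        # Alert only if no newer event arrived after this timer was registered
        if self.last_seen.get(key) == timer_ts - self.timeout:
            out.append(f"{key} inactive since {self.last_seen[key]}")


def run(events: Iterable[Tuple[str, int]], handler: InactivityAlert) -> List[str]:
    """Toy driver: time advances with each event; due timers fire first."""
    out: List[str] = []
    timers: List[Tuple[int, str]] = []  # pending (fire_time, key) pairs

    def register(key: str, fire_at: int) -> None:
        timers.append((fire_at, key))

    for key, ts in events:
        for timer in sorted(t for t in timers if t[0] <= ts):
            timers.remove(timer)
            handler.on_timer(timer[1], timer[0], out)
        handler.process_element(key, ts, register)
    return out


if __name__ == "__main__":
    stream = [("a", 0), ("b", 1), ("a", 3), ("b", 20)]
    print(run(stream, InactivityAlert(timeout=5)))
```

In this toy driver, timers still pending when the input ends never fire; a real stream processor advances time independently of the handler.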

Flink Used By

Flink Deployment
• Deploy Flink to use the following cluster managers
  – YARN
  – Mesos
  – Kubernetes
  – Standalone
• All application control communication is via REST calls
• Deploy at any scale
  – multiple trillions of events per day
  – multiple terabytes of state
  – thousands of cores
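
As a small illustration of the REST-based control path, the sketch below builds the URL for Flink's job overview endpoint and summarises job states from a response body. It assumes the REST endpoint is reachable on the default port 8081 and that the response has a top-level `jobs` list whose entries carry a `state` field; the helper names and the sample payload are hypothetical, and no request is actually sent here.

```python
import json
from collections import Counter
from typing import Dict


def jobs_overview_url(host: str = "localhost", port: int = 8081) -> str:
    """Build the URL of the job overview endpoint (default port assumed)."""
    return f"http://{host}:{port}/jobs/overview"


def count_job_states(response_body: str) -> Dict[str, int]:
    """Count jobs per state in a JSON body shaped like {"jobs": [{"state": ...}]}."""
    jobs = json.loads(response_body).get("jobs", [])
    return dict(Counter(job["state"] for job in jobs))


if __name__ == "__main__":
    # Hand-written sample payload for illustration, not captured from a cluster
    sample = json.dumps(
        {
            "jobs": [
                {"jid": "job-1", "name": "etl", "state": "RUNNING"},
                {"jid": "job-2", "name": "report", "state": "FINISHED"},
                {"jid": "job-3", "name": "alerts", "state": "RUNNING"},
            ]
        }
    )
    print(jobs_overview_url())
    print(count_job_states(sample))
```

Against a live cluster, the body could be fetched with `urllib.request.urlopen(jobs_overview_url()).read()` and passed to `count_job_states`.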

Flink Architecture

Flink Stateful Functions
• Simplifies building distributed stateful applications
• Provides a runtime built for serverless architectures
• Key Benefits
  – Dynamic Messaging
  – Consistent State
  – Multi-language Support
  – No Database Required
  – Cloud Native
  – "Stateless" Operation

Flink Stateful Functions

Flink Use Cases
• Event-driven Applications, e.g.
  – Fraud detection
  – Anomaly detection
• Data Analytics Applications
  – Quality monitoring of Telco networks
  – Analysis of product updates & experiment evaluation in mobile applications
• Data Pipeline Applications
  – Real-time search index building in e-commerce
  – Continuous ETL in e-commerce
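
The fraud detection use case can be sketched as an event-driven rule over per-account state. The plain-Python example below is not Flink code: the rule (a very small transaction immediately followed by a large one on the same account), the thresholds, and the function name `detect_fraud` are assumptions chosen for illustration only.

```python
from typing import Dict, Iterable, List, Tuple

# Illustrative thresholds (assumptions, not taken from the slides)
SMALL_AMOUNT = 1.00
LARGE_AMOUNT = 500.00


def detect_fraud(transactions: Iterable[Tuple[str, float]]) -> List[str]:
    """Return the account id for every small-then-large transaction pair."""
    # Per-account state: was the previous transaction a small one?
    saw_small: Dict[str, bool] = {}
    alerts: List[str] = []
    for account, amount in transactions:
        if saw_small.get(account, False) and amount > LARGE_AMOUNT:
            alerts.append(account)
        # The flag only survives for one following transaction
        saw_small[account] = amount < SMALL_AMOUNT
    return alerts


if __name__ == "__main__":
    txns = [
        ("acc1", 0.50),
        ("acc2", 700.00),
        ("acc1", 900.00),
        ("acc2", 0.20),
        ("acc2", 30.00),
        ("acc2", 800.00),
    ]
    print(detect_fraud(txns))
```

The state kept per account is a single boolean, which is the kind of small keyed state an event-driven application updates on every incoming event.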

Flink Use Cases

Available Books
• See “Big Data Made Easy” – Apress Jan 2015
• See “Mastering Apache Spark” – Packt Oct 2015
• See “Complete Guide to Open Source Big Data Stack” – Apress Jan 2018
• Find the author on Amazon
  – www.amazon.com/Michael-Frampton/e/B00NIQDOOM/
• Connect on LinkedIn
  – www.linkedin.com/in/mike-frampton-38563020

Connect
• Feel free to connect on LinkedIn
  – www.linkedin.com/in/mike-frampton-38563020
• See my open source blog at
  – open-source-systems.blogspot.com/
• I am always interested in
  – New technology
  – Opportunities
  – Technology based issues
  – Big data integration

Mike Frampton
