Lock in $30 Savings on PRO—Offer Ends Soon! ⏳
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
[NYJavaSig] Riding The Distributed Streams
Search
Viktor Gamov
February 03, 2017
Technology
1
200
[NYJavaSig] Riding The Distributed Streams
Presentation on Hazelcast and Distributed Streams.
Presented on NYJavaSig
Viktor Gamov
February 03, 2017
Tweet
Share
More Decks by Viktor Gamov
See All by Viktor Gamov
Processing Streaming Data with KSQL
vikgamov
4
400
[VirtualJUG] Apache Kafka — A Streaming Data Platform
vikgamov
3
390
[SF JUG] Apache Kafka — A Streaming Data Platform
vikgamov
4
90
[OracleCode NYC-2018] Apache Kafka A Streaming Data Platform
vikgamov
1
180
[OracleCode NYC-2018] Rethinking Stream Processing with KStreams and KSQL
vikgamov
2
240
[JBreak-2018] Это кто там твитить про #jbreak?
vikgamov
0
220
[DevNexus-2018] Apache Kafka A Streaming Data Platform
vikgamov
2
300
[DataSciCon] Divide, Distribute and Conquer: Stream v. Batch
vikgamov
0
110
[Philly JUG] Divide, Distribute and Conquer: Stream v. Batch
vikgamov
0
480
Other Decks in Technology
See All in Technology
MLflowダイエット大作戦
lycorptech_jp
PRO
1
140
Reinforcement Fine-tuning 基礎〜実践まで
ch6noota
0
200
チーリンについて
hirotomotaguchi
6
2.1k
2025年 開発生産「可能」性向上報告 サイロ解消からチームが能動性を獲得するまで/ 20251216 Naoki Takahashi
shift_evolve
PRO
2
210
寫了幾年 Code,然後呢?軟體工程師必須重新認識的 DevOps
cheng_wei_chen
1
1.5k
Connection-based OAuthから学ぶOAuth for AI Agents
flatt_security
0
100
生成AIを利用するだけでなく、投資できる組織へ / Becoming an Organization That Invests in GenAI
kaminashi
0
110
まだ間に合う! Agentic AI on AWSの現在地をやさしく一挙おさらい
minorun365
14
1.5k
生成AI活用の型ハンズオン〜顧客課題起点で設計する7つのステップ
yushin_n
0
250
ウェルネス SaaS × AI、1,000万ユーザーを支える 業界特化 AI プロダクト開発への道のり
hacomono
PRO
0
250
生成AI時代におけるグローバル戦略思考
taka_aki
0
210
.NET 10の概要
tomokusaba
0
120
Featured
See All Featured
Designing for Performance
lara
610
69k
How to Build an AI Search Optimization Roadmap - Criteria and Steps to Take #SEOIRL
aleyda
1
1.8k
Chasing Engaging Ingredients in Design
codingconduct
0
75
Pawsitive SEO: Lessons from My Dog (and Many Mistakes) on Thriving as a Consultant in the Age of AI
davidcarrasco
0
32
Balancing Empowerment & Direction
lara
5
810
Navigating the moral maze — ethical principles for Al-driven product design
skipperchong
1
200
Measuring & Analyzing Core Web Vitals
bluesmoon
9
710
From Legacy to Launchpad: Building Startup-Ready Communities
dugsong
0
110
Unlocking the hidden potential of vector embeddings in international SEO
frankvandijk
0
120
Why Our Code Smells
bkeepers
PRO
340
57k
Rebuilding a faster, lazier Slack
samanthasiow
85
9.3k
Put a Button on it: Removing Barriers to Going Fast.
kastner
60
4.1k
Transcript
None
> whoami • Solutions Architect @Hazelcast • Hang out with
awesome people • @gamussa in internetz Please, follow me in Twitter I’m very interesting ©
Agenda • Refreshing knowledge on Java 8 Streams • Distribute
and Conquer • Distributed Data • Distributed Streams • How we did all this
Java 8 Streams
Java 8 Streams… • An abstraction represents a sequence of
elements • Is not a data structure • Convey elements from a source through a pipeline of operations • Operation doesn’t modify a source
Why I should care about Stream API? • You’re Java
developer
What does regular Java developer think about Scala? advanced
Why I should care about Stream API? • You’re Java
developer • Many Java developers know Java • It’s all about data processing
java.util.stream operations • map(), flatMap(), filter() • reduce(), collect() •
sorted()
None
None
None
Problem • One does not simply put all Big Data
in one machine
Problem • Data doesn’t fit just one machine
Problem • One does not simply put all Big Data
in one machine • Data is too important to have it only one machine
None
CACHES
Replication on Sharding? http://book.mixu.net/distsys/single-page.html
Solution • Use Distributed Map aka IMap
What’s Hazelcast IMDG? • In-memory Data Grid • Apache v2
Licensed • Distributed • Caches (IMap, JCache) • Java Collections (IList, ISet, IQueue) • Messaging (Topic, RingBuffer) • Computation (ExecutorService, M-R)
None
None
None
Green Primary Green Backup Green Shard
None
Problem • Lambda serialization 26
27
Solution • serializable version of the interfaces • Introducing DistributedStream
28
29
None
31 Jet Streams
None
What’s Hazelcast Jet? • General purpose distributed data processing framework
• Based on Direct Acyclic Graph to model data flow • Built on top of Hazelcast IMDG • Comparable to Apache Spark or Apache Flink 33
None
DAG 35
Job Execution 36
None
Future (It’s bright!) • Memory module for processing big data
• Higher level streaming and batching APIs • Reactive Streams • Distributed Classloading • Integrations (HDFS/Yarn/Mesos)
Your fuel, our Jet Engine • Public release – Feb
7th. • Developer Preview today - yay! • http://hazelcast.org/jet-signup • Send me a note
[email protected]
• Follow @hazelcast and @gamussa (duh!!) • Your questions #hazelcast #hazelcastjet
Conclusion • Java Stream API provides very white range of
data processing tools • War And Piece – is a Big (a lot of data) Book! • Now we’re pretty sure that Andrew and Pierre are the main characters
None