Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
[NYJavaSig] Riding The Distributed Streams
Search
Sponsored
·
SiteGround - Reliable hosting with speed, security, and support you can count on.
→
Viktor Gamov
February 03, 2017
Technology
1
210
[NYJavaSig] Riding The Distributed Streams
Presentation on Hazelcast and Distributed Streams.
Presented on NYJavaSig
Viktor Gamov
February 03, 2017
Tweet
Share
More Decks by Viktor Gamov
See All by Viktor Gamov
Processing Streaming Data with KSQL
vikgamov
4
410
[VirtualJUG] Apache Kafka — A Streaming Data Platform
vikgamov
3
410
[SF JUG] Apache Kafka — A Streaming Data Platform
vikgamov
4
96
[OracleCode NYC-2018] Apache Kafka A Streaming Data Platform
vikgamov
1
180
[OracleCode NYC-2018] Rethinking Stream Processing with KStreams and KSQL
vikgamov
2
250
[JBreak-2018] Это кто там твитить про #jbreak?
vikgamov
0
230
[DevNexus-2018] Apache Kafka A Streaming Data Platform
vikgamov
2
310
[DataSciCon] Divide, Distribute and Conquer: Stream v. Batch
vikgamov
0
110
[Philly JUG] Divide, Distribute and Conquer: Stream v. Batch
vikgamov
0
490
Other Decks in Technology
See All in Technology
Oracle AI Database移行・アップグレード勉強会 - RAT活用編
oracle4engineer
PRO
0
100
生成AIを活用した音声文字起こしシステムの2つの構築パターンについて
miu_crescent
PRO
3
210
Red Hat OpenStack Services on OpenShift
tamemiya
0
120
usermode linux without MMU - fosdem2026 kernel devroom
thehajime
0
240
会社紹介資料 / Sansan Company Profile
sansan33
PRO
15
400k
茨城の思い出を振り返る ~CDKのセキュリティを添えて~ / 20260201 Mitsutoshi Matsuo
shift_evolve
PRO
1
340
学生・新卒・ジュニアから目指すSRE
hiroyaonoe
2
650
GitHub Issue Templates + Coding Agentで簡単みんなでIaC/Easy IaC for Everyone with GitHub Issue Templates + Coding Agent
aeonpeople
1
250
15 years with Rails and DDD (AI Edition)
andrzejkrzywda
0
200
Data Hubグループ 紹介資料
sansan33
PRO
0
2.7k
20260204_Midosuji_Tech
takuyay0ne
1
160
Agile Leadership Summit Keynote 2026
m_seki
1
650
Featured
See All Featured
RailsConf & Balkan Ruby 2019: The Past, Present, and Future of Rails at GitHub
eileencodes
141
34k
Navigating Team Friction
lara
192
16k
Dealing with People You Can't Stand - Big Design 2015
cassininazir
367
27k
The SEO identity crisis: Don't let AI make you average
varn
0
290
Practical Orchestrator
shlominoach
191
11k
The State of eCommerce SEO: How to Win in Today's Products SERPs - #SEOweek
aleyda
2
9.6k
First, design no harm
axbom
PRO
2
1.1k
Exploring the relationship between traditional SERPs and Gen AI search
raygrieselhuber
PRO
2
3.6k
DevOps and Value Stream Thinking: Enabling flow, efficiency and business value
helenjbeal
1
100
The Hidden Cost of Media on the Web [PixelPalooza 2025]
tammyeverts
2
190
SEO Brein meetup: CTRL+C is not how to scale international SEO
lindahogenes
0
2.3k
エンジニアに許された特別な時間の終わり
watany
106
230k
Transcript
None
> whoami • Solutions Architect @Hazelcast • Hang out with
awesome people • @gamussa in internetz Please, follow me in Twitter I’m very interesting ©
Agenda • Refreshing knowledge on Java 8 Streams • Distribute
and Conquer • Distributed Data • Distributed Streams • How we did all this
Java 8 Streams
Java 8 Streams… • An abstraction represents a sequence of
elements • Is not a data structure • Convey elements from a source through a pipeline of operations • Operation doesn’t modify a source
Why I should care about Stream API? • You’re Java
developer
What does regular Java developer think about Scala? advanced
Why I should care about Stream API? • You’re Java
developer • Many Java developers know Java • It’s all about data processing
java.util.stream operations • map(), flatMap(), filter() • reduce(), collect() •
sorted()
None
None
None
Problem • One does not simply put all Big Data
in one machine
Problem • Data doesn’t fit just one machine
Problem • One does not simply put all Big Data
in one machine • Data is too important to have it only one machine
None
CACHES
Replication on Sharding? http://book.mixu.net/distsys/single-page.html
Solution • Use Distributed Map aka IMap
What’s Hazelcast IMDG? • In-memory Data Grid • Apache v2
Licensed • Distributed • Caches (IMap, JCache) • Java Collections (IList, ISet, IQueue) • Messaging (Topic, RingBuffer) • Computation (ExecutorService, M-R)
None
None
None
Green Primary Green Backup Green Shard
None
Problem • Lambda serialization 26
27
Solution • serializable version of the interfaces • Introducing DistributedStream
28
29
None
31 Jet Streams
None
What’s Hazelcast Jet? • General purpose distributed data processing framework
• Based on Direct Acyclic Graph to model data flow • Built on top of Hazelcast IMDG • Comparable to Apache Spark or Apache Flink 33
None
DAG 35
Job Execution 36
None
Future (It’s bright!) • Memory module for processing big data
• Higher level streaming and batching APIs • Reactive Streams • Distributed Classloading • Integrations (HDFS/Yarn/Mesos)
Your fuel, our Jet Engine • Public release – Feb
7th. • Developer Preview today - yay! • http://hazelcast.org/jet-signup • Send me a note
[email protected]
• Follow @hazelcast and @gamussa (duh!!) • Your questions #hazelcast #hazelcastjet
Conclusion • Java Stream API provides very white range of
data processing tools • War And Piece – is a Big (a lot of data) Book! • Now we’re pretty sure that Andrew and Pierre are the main characters
None