Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
[NYJavaSig] Riding The Distributed Streams
Search
Viktor Gamov
February 03, 2017
Technology
1
200
[NYJavaSig] Riding The Distributed Streams
Presentation on Hazelcast and Distributed Streams.
Presented on NYJavaSig
Viktor Gamov
February 03, 2017
Tweet
Share
More Decks by Viktor Gamov
See All by Viktor Gamov
Processing Streaming Data with KSQL
vikgamov
4
400
[VirtualJUG] Apache Kafka — A Streaming Data Platform
vikgamov
3
400
[SF JUG] Apache Kafka — A Streaming Data Platform
vikgamov
4
92
[OracleCode NYC-2018] Apache Kafka A Streaming Data Platform
vikgamov
1
180
[OracleCode NYC-2018] Rethinking Stream Processing with KStreams and KSQL
vikgamov
2
250
[JBreak-2018] Это кто там твитить про #jbreak?
vikgamov
0
230
[DevNexus-2018] Apache Kafka A Streaming Data Platform
vikgamov
2
300
[DataSciCon] Divide, Distribute and Conquer: Stream v. Batch
vikgamov
0
110
[Philly JUG] Divide, Distribute and Conquer: Stream v. Batch
vikgamov
0
490
Other Decks in Technology
See All in Technology
新米スクラムマスターの4ヶ月 -「スクラムイベントを回しているのに手応えがない」からの脱出 / Four Months as a New Scrum Master — When Scrum Events Were Running, but Nothing Felt Right
owata
0
170
ALB「証明書上限問題」からの脱却
nishiokashinji
0
210
「アウトプット脳からユーザー価値脳へ」がそんなに簡単にできたら苦労しない #RSGT2026
aki_iinuma
11
5.5k
20260114_データ横丁 新年LT大会:2026年の抱負
taromatsui_cccmkhd
0
300
Java 25に至る道
skrb
3
230
モノタロウ x クリエーションラインで実現する チームトポロジーにおける プラットフォームチーム・ ストリームアラインドチームの 効果的なコラボレーション
creationline
0
950
さくらのクラウドでのシークレット管理を考える/tamachi.sre#2
fujiwara3
1
190
旬のブリと旬の技術で楽しむ AI エージェント設計開発レシピ
chack411
1
290
Git Training GitHub
yuhattor
1
130
Introduction to Sansan, inc / Sansan Global Development Center, Inc.
sansan33
PRO
0
2.9k
たかがボタン、されどボタン ~button要素から深ぼるボタンUIの定義について~ / BuriKaigi 2026
yamanoku
1
280
チームで安全にClaude Codeを利用するためのプラクティス / team-claude-code-practices
tomoki10
7
3.4k
Featured
See All Featured
The Invisible Side of Design
smashingmag
302
51k
Google's AI Overviews - The New Search
badams
0
890
[RailsConf 2023] Rails as a piece of cake
palkan
58
6.2k
Reflections from 52 weeks, 52 projects
jeffersonlam
355
21k
brightonSEO & MeasureFest 2025 - Christian Goodrich - Winning strategies for Black Friday CRO & PPC
cargoodrich
2
81
Automating Front-end Workflow
addyosmani
1371
200k
Conquering PDFs: document understanding beyond plain text
inesmontani
PRO
4
2.2k
How to Ace a Technical Interview
jacobian
281
24k
A Modern Web Designer's Workflow
chriscoyier
698
190k
GraphQLの誤解/rethinking-graphql
sonatard
74
11k
Why You Should Never Use an ORM
jnunemaker
PRO
61
9.7k
Stop Working from a Prison Cell
hatefulcrawdad
273
21k
Transcript
None
> whoami • Solutions Architect @Hazelcast • Hang out with
awesome people • @gamussa in internetz Please, follow me in Twitter I’m very interesting ©
Agenda • Refreshing knowledge on Java 8 Streams • Distribute
and Conquer • Distributed Data • Distributed Streams • How we did all this
Java 8 Streams
Java 8 Streams… • An abstraction represents a sequence of
elements • Is not a data structure • Convey elements from a source through a pipeline of operations • Operation doesn’t modify a source
Why I should care about Stream API? • You’re Java
developer
What does regular Java developer think about Scala? advanced
Why I should care about Stream API? • You’re Java
developer • Many Java developers know Java • It’s all about data processing
java.util.stream operations • map(), flatMap(), filter() • reduce(), collect() •
sorted()
None
None
None
Problem • One does not simply put all Big Data
in one machine
Problem • Data doesn’t fit just one machine
Problem • One does not simply put all Big Data
in one machine • Data is too important to have it only one machine
None
CACHES
Replication on Sharding? http://book.mixu.net/distsys/single-page.html
Solution • Use Distributed Map aka IMap
What’s Hazelcast IMDG? • In-memory Data Grid • Apache v2
Licensed • Distributed • Caches (IMap, JCache) • Java Collections (IList, ISet, IQueue) • Messaging (Topic, RingBuffer) • Computation (ExecutorService, M-R)
None
None
None
Green Primary Green Backup Green Shard
None
Problem • Lambda serialization 26
27
Solution • serializable version of the interfaces • Introducing DistributedStream
28
29
None
31 Jet Streams
None
What’s Hazelcast Jet? • General purpose distributed data processing framework
• Based on Direct Acyclic Graph to model data flow • Built on top of Hazelcast IMDG • Comparable to Apache Spark or Apache Flink 33
None
DAG 35
Job Execution 36
None
Future (It’s bright!) • Memory module for processing big data
• Higher level streaming and batching APIs • Reactive Streams • Distributed Classloading • Integrations (HDFS/Yarn/Mesos)
Your fuel, our Jet Engine • Public release – Feb
7th. • Developer Preview today - yay! • http://hazelcast.org/jet-signup • Send me a note
[email protected]
• Follow @hazelcast and @gamussa (duh!!) • Your questions #hazelcast #hazelcastjet
Conclusion • Java Stream API provides very white range of
data processing tools • War And Piece – is a Big (a lot of data) Book! • Now we’re pretty sure that Andrew and Pierre are the main characters
None