$30 off During Our Annual Pro Sale. View Details »
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
[NYJavaSig] Riding The Distributed Streams
Search
Viktor Gamov
February 03, 2017
Technology
1
200
[NYJavaSig] Riding The Distributed Streams
Presentation on Hazelcast and Distributed Streams.
Presented on NYJavaSig
Viktor Gamov
February 03, 2017
Tweet
Share
More Decks by Viktor Gamov
See All by Viktor Gamov
Processing Streaming Data with KSQL
vikgamov
4
400
[VirtualJUG] Apache Kafka — A Streaming Data Platform
vikgamov
3
390
[SF JUG] Apache Kafka — A Streaming Data Platform
vikgamov
4
89
[OracleCode NYC-2018] Apache Kafka A Streaming Data Platform
vikgamov
1
180
[OracleCode NYC-2018] Rethinking Stream Processing with KStreams and KSQL
vikgamov
2
230
[JBreak-2018] Это кто там твитить про #jbreak?
vikgamov
0
220
[DevNexus-2018] Apache Kafka A Streaming Data Platform
vikgamov
2
300
[DataSciCon] Divide, Distribute and Conquer: Stream v. Batch
vikgamov
0
110
[Philly JUG] Divide, Distribute and Conquer: Stream v. Batch
vikgamov
0
480
Other Decks in Technology
See All in Technology
モダンデータスタック (MDS) の話とデータ分析が起こすビジネス変革
sutotakeshi
0
450
MapKitとオープンデータで実現する地図情報の拡張と可視化
zozotech
PRO
1
130
「Managed Instances」と「durable functions」で広がるAWS Lambdaのユースケース
lamaglama39
0
290
評価駆動開発で不確実性を制御する - MLflow 3が支えるエージェント開発
databricksjapan
1
110
AWSセキュリティアップデートとAWSを育てる話
cmusudakeisuke
0
200
30分であなたをOmniのファンにしてみせます~分析画面のクリック操作をそのままコード化できるAI-ReadyなBIツール~
sagara
0
110
生成AI時代の自動E2Eテスト運用とPlaywright実践知_引持力哉
legalontechnologies
PRO
0
210
[JAWS-UG 横浜支部 #91]DevOps Agent vs CloudWatch Investigations -比較と実践-
sh_fk2
1
250
Kiro Autonomous AgentとKiro Powers の紹介 / kiro-autonomous-agent-and-powers
tomoki10
0
360
pmconf2025 - 他社事例を"自社仕様化"する技術_iRAFT法
daichi_yamashita
0
800
Karate+Database RiderによるAPI自動テスト導入工数をCline+GitLab MCPを使って2割削減を目指す! / 20251206 Kazuki Takahashi
shift_evolve
PRO
1
660
regrowth_tokyo_2025_securityagent
hiashisan
0
200
Featured
See All Featured
Measuring & Analyzing Core Web Vitals
bluesmoon
9
700
[RailsConf 2023] Rails as a piece of cake
palkan
58
6.1k
Site-Speed That Sticks
csswizardry
13
990
The Success of Rails: Ensuring Growth for the Next 100 Years
eileencodes
47
7.9k
Code Reviewing Like a Champion
maltzj
527
40k
Embracing the Ebb and Flow
colly
88
4.9k
Fashionably flexible responsive web design (full day workshop)
malarkey
407
66k
Designing for Performance
lara
610
69k
Statistics for Hackers
jakevdp
799
230k
I Don’t Have Time: Getting Over the Fear to Launch Your Podcast
jcasabona
34
2.5k
Automating Front-end Workflow
addyosmani
1371
200k
Practical Orchestrator
shlominoach
190
11k
Transcript
None
> whoami • Solutions Architect @Hazelcast • Hang out with
awesome people • @gamussa in internetz Please, follow me in Twitter I’m very interesting ©
Agenda • Refreshing knowledge on Java 8 Streams • Distribute
and Conquer • Distributed Data • Distributed Streams • How we did all this
Java 8 Streams
Java 8 Streams… • An abstraction represents a sequence of
elements • Is not a data structure • Convey elements from a source through a pipeline of operations • Operation doesn’t modify a source
Why I should care about Stream API? • You’re Java
developer
What does regular Java developer think about Scala? advanced
Why I should care about Stream API? • You’re Java
developer • Many Java developers know Java • It’s all about data processing
java.util.stream operations • map(), flatMap(), filter() • reduce(), collect() •
sorted()
None
None
None
Problem • One does not simply put all Big Data
in one machine
Problem • Data doesn’t fit just one machine
Problem • One does not simply put all Big Data
in one machine • Data is too important to have it only one machine
None
CACHES
Replication on Sharding? http://book.mixu.net/distsys/single-page.html
Solution • Use Distributed Map aka IMap
What’s Hazelcast IMDG? • In-memory Data Grid • Apache v2
Licensed • Distributed • Caches (IMap, JCache) • Java Collections (IList, ISet, IQueue) • Messaging (Topic, RingBuffer) • Computation (ExecutorService, M-R)
None
None
None
Green Primary Green Backup Green Shard
None
Problem • Lambda serialization 26
27
Solution • serializable version of the interfaces • Introducing DistributedStream
28
29
None
31 Jet Streams
None
What’s Hazelcast Jet? • General purpose distributed data processing framework
• Based on Direct Acyclic Graph to model data flow • Built on top of Hazelcast IMDG • Comparable to Apache Spark or Apache Flink 33
None
DAG 35
Job Execution 36
None
Future (It’s bright!) • Memory module for processing big data
• Higher level streaming and batching APIs • Reactive Streams • Distributed Classloading • Integrations (HDFS/Yarn/Mesos)
Your fuel, our Jet Engine • Public release – Feb
7th. • Developer Preview today - yay! • http://hazelcast.org/jet-signup • Send me a note
[email protected]
• Follow @hazelcast and @gamussa (duh!!) • Your questions #hazelcast #hazelcastjet
Conclusion • Java Stream API provides very white range of
data processing tools • War And Piece – is a Big (a lot of data) Book! • Now we’re pretty sure that Andrew and Pierre are the main characters
None