Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
[NYJavaSig] Riding The Distributed Streams
Search
Viktor Gamov
February 03, 2017
Technology
1
190
[NYJavaSig] Riding The Distributed Streams
Presentation on Hazelcast and Distributed Streams.
Presented on NYJavaSig
Viktor Gamov
February 03, 2017
Tweet
Share
More Decks by Viktor Gamov
See All by Viktor Gamov
Processing Streaming Data with KSQL
vikgamov
4
370
[VirtualJUG] Apache Kafka — A Streaming Data Platform
vikgamov
3
370
[SF JUG] Apache Kafka — A Streaming Data Platform
vikgamov
4
80
[OracleCode NYC-2018] Apache Kafka A Streaming Data Platform
vikgamov
1
170
[OracleCode NYC-2018] Rethinking Stream Processing with KStreams and KSQL
vikgamov
2
220
[JBreak-2018] Это кто там твитить про #jbreak?
vikgamov
0
200
[DevNexus-2018] Apache Kafka A Streaming Data Platform
vikgamov
2
270
[DataSciCon] Divide, Distribute and Conquer: Stream v. Batch
vikgamov
0
100
[Philly JUG] Divide, Distribute and Conquer: Stream v. Batch
vikgamov
0
470
Other Decks in Technology
See All in Technology
監視のこれまでとこれから/sakura monitoring seminar 2025
fujiwara3
11
4k
Github Copilot エージェントモードで試してみた
ochtum
0
110
Welcome to the LLM Club
koic
0
190
GitHub Copilot の概要
tomokusaba
1
140
AIとともに進化するエンジニアリング / Engineering-Evolving-with-AI_final.pdf
lycorptech_jp
PRO
0
110
データプラットフォーム技術におけるメダリオンアーキテクチャという考え方/DataPlatformWithMedallionArchitecture
smdmts
5
650
25分で解説する「最小権限の原則」を実現するための AWS「ポリシー」大全 / 20250625-aws-summit-aws-policy
opelab
9
1.2k
Oracle Audit Vault and Database Firewall 20 概要
oracle4engineer
PRO
3
1.7k
【PHPカンファレンス 2025】PHPを愛するひとに伝えたい PHPとキャリアの話
tenshoku_draft
0
120
Amazon ECS & AWS Fargate 運用アーキテクチャ2025 / Amazon ECS and AWS Fargate Ops Architecture 2025
iselegant
17
5.7k
急成長を支える基盤作り〜地道な改善からコツコツと〜 #cre_meetup
stefafafan
0
130
【5分でわかる】セーフィー エンジニア向け会社紹介
safie_recruit
0
26k
Featured
See All Featured
Code Reviewing Like a Champion
maltzj
524
40k
Measuring & Analyzing Core Web Vitals
bluesmoon
7
490
Making the Leap to Tech Lead
cromwellryan
134
9.4k
Refactoring Trust on Your Teams (GOTO; Chicago 2020)
rmw
34
3k
Fight the Zombie Pattern Library - RWD Summit 2016
marcelosomers
233
17k
個人開発の失敗を避けるイケてる考え方 / tips for indie hackers
panda_program
107
19k
10 Git Anti Patterns You Should be Aware of
lemiorhan
PRO
657
60k
RailsConf & Balkan Ruby 2019: The Past, Present, and Future of Rails at GitHub
eileencodes
138
34k
Raft: Consensus for Rubyists
vanstee
140
7k
[Rails World 2023 - Day 1 Closing Keynote] - The Magic of Rails
eileencodes
35
2.4k
Fantastic passwords and where to find them - at NoRuKo
philnash
51
3.3k
Evolution of real-time – Irina Nazarova, EuRuKo, 2024
irinanazarova
8
800
Transcript
None
> whoami • Solutions Architect @Hazelcast • Hang out with
awesome people • @gamussa in internetz Please, follow me in Twitter I’m very interesting ©
Agenda • Refreshing knowledge on Java 8 Streams • Distribute
and Conquer • Distributed Data • Distributed Streams • How we did all this
Java 8 Streams
Java 8 Streams… • An abstraction represents a sequence of
elements • Is not a data structure • Convey elements from a source through a pipeline of operations • Operation doesn’t modify a source
Why I should care about Stream API? • You’re Java
developer
What does regular Java developer think about Scala? advanced
Why I should care about Stream API? • You’re Java
developer • Many Java developers know Java • It’s all about data processing
java.util.stream operations • map(), flatMap(), filter() • reduce(), collect() •
sorted()
None
None
None
Problem • One does not simply put all Big Data
in one machine
Problem • Data doesn’t fit just one machine
Problem • One does not simply put all Big Data
in one machine • Data is too important to have it only one machine
None
CACHES
Replication on Sharding? http://book.mixu.net/distsys/single-page.html
Solution • Use Distributed Map aka IMap
What’s Hazelcast IMDG? • In-memory Data Grid • Apache v2
Licensed • Distributed • Caches (IMap, JCache) • Java Collections (IList, ISet, IQueue) • Messaging (Topic, RingBuffer) • Computation (ExecutorService, M-R)
None
None
None
Green Primary Green Backup Green Shard
None
Problem • Lambda serialization 26
27
Solution • serializable version of the interfaces • Introducing DistributedStream
28
29
None
31 Jet Streams
None
What’s Hazelcast Jet? • General purpose distributed data processing framework
• Based on Direct Acyclic Graph to model data flow • Built on top of Hazelcast IMDG • Comparable to Apache Spark or Apache Flink 33
None
DAG 35
Job Execution 36
None
Future (It’s bright!) • Memory module for processing big data
• Higher level streaming and batching APIs • Reactive Streams • Distributed Classloading • Integrations (HDFS/Yarn/Mesos)
Your fuel, our Jet Engine • Public release – Feb
7th. • Developer Preview today - yay! • http://hazelcast.org/jet-signup • Send me a note
[email protected]
• Follow @hazelcast and @gamussa (duh!!) • Your questions #hazelcast #hazelcastjet
Conclusion • Java Stream API provides very white range of
data processing tools • War And Piece – is a Big (a lot of data) Book! • Now we’re pretty sure that Andrew and Pierre are the main characters
None