Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
[NYJavaSig] Riding The Distributed Streams
Search
Sponsored
·
Ship Features Fearlessly
Turn features on and off without deploys. Used by thousands of Ruby developers.
→
Viktor Gamov
February 03, 2017
Technology
1
210
[NYJavaSig] Riding The Distributed Streams
Presentation on Hazelcast and Distributed Streams.
Presented on NYJavaSig
Viktor Gamov
February 03, 2017
Tweet
Share
More Decks by Viktor Gamov
See All by Viktor Gamov
Processing Streaming Data with KSQL
vikgamov
4
430
[VirtualJUG] Apache Kafka — A Streaming Data Platform
vikgamov
3
410
[SF JUG] Apache Kafka — A Streaming Data Platform
vikgamov
4
97
[OracleCode NYC-2018] Apache Kafka A Streaming Data Platform
vikgamov
1
180
[OracleCode NYC-2018] Rethinking Stream Processing with KStreams and KSQL
vikgamov
2
250
[JBreak-2018] Это кто там твитить про #jbreak?
vikgamov
0
230
[DevNexus-2018] Apache Kafka A Streaming Data Platform
vikgamov
2
320
[DataSciCon] Divide, Distribute and Conquer: Stream v. Batch
vikgamov
0
110
[Philly JUG] Divide, Distribute and Conquer: Stream v. Batch
vikgamov
0
490
Other Decks in Technology
See All in Technology
生成AI活用でQAエンジニアにどのような仕事が生まれるか/Support Required of QA Engineers for Generative AI
goyoki
1
370
めちゃくちゃ開発するQAエンジニアになって感じたメリットとこれからの課題感
ryuhei0000yamamoto
0
270
A Casual Introduction to RISC-V
omasanori
0
550
開発チームとQAエンジニアの新しい協業モデル -年末調整開発チームで実践する【QAリード施策】-
qa
0
160
The Rise of Browser Automation: AI-Powered Web Interaction in 2026
marcthompson_seo
0
300
スケールアップ企業でQA組織が機能し続けるための組織設計と仕組み〜ボトムアップとトップダウンを両輪としたアプローチ〜
qa
0
160
脳が溶けた話 / Melted Brain
keisuke69
1
830
DMBOKを使ってレバレジーズのデータマネジメントを評価した
leveragestech
0
180
品質を経営にどう語るか #jassttokyo / Communicating the Strategic Value of Quality to Executive Leadership
kyonmm
PRO
3
1.2k
モジュラモノリス導入から4年間の総括:アーキテクチャと組織の相互作用について / Architecture and Organizational Interaction
nazonohito51
3
1.6k
AI時代のIssue駆動開発のススメ
moongift
PRO
0
130
SSoT(Single Source of Truth)で「壊して再生」する設計
kawauso
2
310
Featured
See All Featured
What's in a price? How to price your products and services
michaelherold
247
13k
Building a A Zero-Code AI SEO Workflow
portentint
PRO
0
410
The innovator’s Mindset - Leading Through an Era of Exponential Change - McGill University 2025
jdejongh
PRO
1
130
Templates, Plugins, & Blocks: Oh My! Creating the theme that thinks of everything
marktimemedia
31
2.7k
Avoiding the “Bad Training, Faster” Trap in the Age of AI
tmiket
0
110
Beyond borders and beyond the search box: How to win the global "messy middle" with AI-driven SEO
davidcarrasco
3
82
What does AI have to do with Human Rights?
axbom
PRO
1
2k
The Cost Of JavaScript in 2023
addyosmani
55
9.8k
How To Speak Unicorn (iThemes Webinar)
marktimemedia
1
410
How to Ace a Technical Interview
jacobian
281
24k
Mind Mapping
helmedeiros
PRO
1
130
Accessibility Awareness
sabderemane
0
84
Transcript
None
> whoami • Solutions Architect @Hazelcast • Hang out with
awesome people • @gamussa in internetz Please, follow me in Twitter I’m very interesting ©
Agenda • Refreshing knowledge on Java 8 Streams • Distribute
and Conquer • Distributed Data • Distributed Streams • How we did all this
Java 8 Streams
Java 8 Streams… • An abstraction represents a sequence of
elements • Is not a data structure • Convey elements from a source through a pipeline of operations • Operation doesn’t modify a source
Why I should care about Stream API? • You’re Java
developer
What does regular Java developer think about Scala? advanced
Why I should care about Stream API? • You’re Java
developer • Many Java developers know Java • It’s all about data processing
java.util.stream operations • map(), flatMap(), filter() • reduce(), collect() •
sorted()
None
None
None
Problem • One does not simply put all Big Data
in one machine
Problem • Data doesn’t fit just one machine
Problem • One does not simply put all Big Data
in one machine • Data is too important to have it only one machine
None
CACHES
Replication on Sharding? http://book.mixu.net/distsys/single-page.html
Solution • Use Distributed Map aka IMap
What’s Hazelcast IMDG? • In-memory Data Grid • Apache v2
Licensed • Distributed • Caches (IMap, JCache) • Java Collections (IList, ISet, IQueue) • Messaging (Topic, RingBuffer) • Computation (ExecutorService, M-R)
None
None
None
Green Primary Green Backup Green Shard
None
Problem • Lambda serialization 26
27
Solution • serializable version of the interfaces • Introducing DistributedStream
28
29
None
31 Jet Streams
None
What’s Hazelcast Jet? • General purpose distributed data processing framework
• Based on Direct Acyclic Graph to model data flow • Built on top of Hazelcast IMDG • Comparable to Apache Spark or Apache Flink 33
None
DAG 35
Job Execution 36
None
Future (It’s bright!) • Memory module for processing big data
• Higher level streaming and batching APIs • Reactive Streams • Distributed Classloading • Integrations (HDFS/Yarn/Mesos)
Your fuel, our Jet Engine • Public release – Feb
7th. • Developer Preview today - yay! • http://hazelcast.org/jet-signup • Send me a note
[email protected]
• Follow @hazelcast and @gamussa (duh!!) • Your questions #hazelcast #hazelcastjet
Conclusion • Java Stream API provides very white range of
data processing tools • War And Piece – is a Big (a lot of data) Book! • Now we’re pretty sure that Andrew and Pierre are the main characters
None