Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
[NYJavaSig] Riding The Distributed Streams
Search
Viktor Gamov
February 03, 2017
Technology
220
1
Share
[NYJavaSig] Riding The Distributed Streams
Presentation on Hazelcast and Distributed Streams.
Presented on NYJavaSig
Viktor Gamov
February 03, 2017
More Decks by Viktor Gamov
See All by Viktor Gamov
Processing Streaming Data with KSQL
vikgamov
4
440
[VirtualJUG] Apache Kafka — A Streaming Data Platform
vikgamov
3
430
[SF JUG] Apache Kafka — A Streaming Data Platform
vikgamov
4
100
[OracleCode NYC-2018] Apache Kafka A Streaming Data Platform
vikgamov
1
180
[OracleCode NYC-2018] Rethinking Stream Processing with KStreams and KSQL
vikgamov
2
260
[JBreak-2018] Это кто там твитить про #jbreak?
vikgamov
0
240
[DevNexus-2018] Apache Kafka A Streaming Data Platform
vikgamov
2
330
[DataSciCon] Divide, Distribute and Conquer: Stream v. Batch
vikgamov
0
120
[Philly JUG] Divide, Distribute and Conquer: Stream v. Batch
vikgamov
0
500
Other Decks in Technology
See All in Technology
情シスがMCP環境導入時に打ちのめされる認可の崖
oidfj
0
630
layerx-fde-practices
cipepser
6
2.8k
Oracle Cloud Infrastructure:2026年5月度サービス・アップデート
oracle4engineer
PRO
1
180
JJUG CCC 2026 Spring AI時代の開発こそ標準化を武器に! ― 方式・プロセス・プラットフォームの標準化
s27watanabe
2
420
DI コンテナ自動生成ツールを実装してみた / intro-autodi
uhzz
0
870
データ基盤構築・運用の現場から 〜 Snowflake Intelligence 導入で変わった、データ活用の未来 〜
wonohe
0
200
大規模環境でどのように監視を実現する?
yuobayashi
1
150
Gradle×GitHub_ActionsでCI時間を約50%短縮 ジョブ分割の設計と落とし穴 / Cutting CI Time by ~50% with Gradle and GitHub Actions: Job-Splitting Design and Pitfalls
takatty
0
390
Kaggle未経験社員をメダリストに育てる「AIドラゴン桜」
lycorptech_jp
PRO
0
630
TypeScriptはどのようにどこまで推論できるのか ─ とにかく as は禁止で
ypresto
3
450
AIAgentと取り組むKaggle
508shuto
2
610
Claude Codeですべての日常業務を爆速化しよう!
minorun365
PRO
16
14k
Featured
See All Featured
B2B Lead Gen: Tactics, Traps & Triumph
marketingsoph
0
120
How to optimise 3,500 product descriptions for ecommerce in one day using ChatGPT
katarinadahlin
PRO
1
3.6k
Digital Projects Gone Horribly Wrong (And the UX Pros Who Still Save the Day) - Dean Schuster
uxyall
0
1.5k
From Legacy to Launchpad: Building Startup-Ready Communities
dugsong
0
210
Designing Experiences People Love
moore
143
24k
Stewardship and Sustainability of Urban and Community Forests
pwiseman
0
210
Amusing Abliteration
ianozsvald
1
180
XXLCSS - How to scale CSS and keep your sanity
sugarenia
250
1.3M
Navigating the Design Leadership Dip - Product Design Week Design Leaders+ Conference 2024
apolaine
1
330
Writing Fast Ruby
sferik
630
63k
Large-scale JavaScript Application Architecture
addyosmani
515
110k
Exploring the relationship between traditional SERPs and Gen AI search
raygrieselhuber
PRO
2
4k
Transcript
None
> whoami • Solutions Architect @Hazelcast • Hang out with
awesome people • @gamussa in internetz Please, follow me in Twitter I’m very interesting ©
Agenda • Refreshing knowledge on Java 8 Streams • Distribute
and Conquer • Distributed Data • Distributed Streams • How we did all this
Java 8 Streams
Java 8 Streams… • An abstraction represents a sequence of
elements • Is not a data structure • Convey elements from a source through a pipeline of operations • Operation doesn’t modify a source
Why I should care about Stream API? • You’re Java
developer
What does regular Java developer think about Scala? advanced
Why I should care about Stream API? • You’re Java
developer • Many Java developers know Java • It’s all about data processing
java.util.stream operations • map(), flatMap(), filter() • reduce(), collect() •
sorted()
None
None
None
Problem • One does not simply put all Big Data
in one machine
Problem • Data doesn’t fit just one machine
Problem • One does not simply put all Big Data
in one machine • Data is too important to have it only one machine
None
CACHES
Replication on Sharding? http://book.mixu.net/distsys/single-page.html
Solution • Use Distributed Map aka IMap
What’s Hazelcast IMDG? • In-memory Data Grid • Apache v2
Licensed • Distributed • Caches (IMap, JCache) • Java Collections (IList, ISet, IQueue) • Messaging (Topic, RingBuffer) • Computation (ExecutorService, M-R)
None
None
None
Green Primary Green Backup Green Shard
None
Problem • Lambda serialization 26
27
Solution • serializable version of the interfaces • Introducing DistributedStream
28
29
None
31 Jet Streams
None
What’s Hazelcast Jet? • General purpose distributed data processing framework
• Based on Direct Acyclic Graph to model data flow • Built on top of Hazelcast IMDG • Comparable to Apache Spark or Apache Flink 33
None
DAG 35
Job Execution 36
None
Future (It’s bright!) • Memory module for processing big data
• Higher level streaming and batching APIs • Reactive Streams • Distributed Classloading • Integrations (HDFS/Yarn/Mesos)
Your fuel, our Jet Engine • Public release – Feb
7th. • Developer Preview today - yay! • http://hazelcast.org/jet-signup • Send me a note
[email protected]
• Follow @hazelcast and @gamussa (duh!!) • Your questions #hazelcast #hazelcastjet
Conclusion • Java Stream API provides very white range of
data processing tools • War And Piece – is a Big (a lot of data) Book! • Now we’re pretty sure that Andrew and Pierre are the main characters
None