[NYJavaSig] Riding The Distributed Streams
Viktor Gamov
February 03, 2017
Presentation on Hazelcast and Distributed Streams.
Presented at NYJavaSig.
Transcript
> whoami
• Solutions Architect @Hazelcast
• Hang out with awesome people
• @gamussa in internetz
Please follow me on Twitter. I’m very interesting ©
Agenda
• Refreshing knowledge on Java 8 Streams
• Distribute and Conquer
• Distributed Data
• Distributed Streams
• How we did all this
Java 8 Streams
Java 8 Streams…
• An abstraction that represents a sequence of elements
• Not a data structure
• Conveys elements from a source through a pipeline of operations
• Operations don’t modify the source
Why should I care about the Stream API?
• You’re a Java developer
What does a regular Java developer think about Scala? advanced
Why should I care about the Stream API?
• You’re a Java developer
• Many Java developers know Java
• It’s all about data processing
java.util.stream operations
• map(), flatMap(), filter()
• reduce(), collect()
• sorted()
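The operations listed on this slide can be combined into one pipeline. The following is a minimal sketch using only the JDK; the class name `StreamOpsSketch`, its methods, and the sample input are made up for illustration and are not from the talk.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class StreamOpsSketch {

    // flatMap() splits each line into words, filter() drops short words,
    // map() lower-cases them, sorted() orders them, collect() builds a new list.
    static List<String> longWords(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.split("\\s+")))
                .filter(word -> word.length() > 3)
                .map(String::toLowerCase)
                .sorted()
                .collect(Collectors.toList());
    }

    // reduce() folds the word lengths into a single total.
    static int totalLength(List<String> words) {
        return words.stream()
                .map(String::length)
                .reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        // Hypothetical sample input.
        List<String> lines = Arrays.asList("Riding the distributed", "Streams with Java");
        List<String> words = longWords(lines);
        System.out.println(words);
        System.out.println(totalLength(words));
        // The pipeline produced new values; the source list is left untouched.
        System.out.println(lines);
    }
}
```

Nothing runs until the terminal operation (`collect()` or `reduce()`) is called, and the source list is never modified, which matches the "operations don’t modify the source" point made earlier in the deck.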
Problem
• One does not simply put all Big Data in one machine
Problem
• Data doesn’t fit on just one machine
Problem
• One does not simply put all Big Data in one machine
• Data is too important to keep on only one machine
CACHES
Replication or Sharding? http://book.mixu.net/distsys/single-page.html
Solution • Use Distributed Map aka IMap
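To make the sharding idea concrete, here is a rough plain-Java sketch of how a key could be mapped to a partition, and a partition to a primary and a backup member. It is a simplification and not Hazelcast's actual algorithm: the class `ShardingSketch`, its method names, and the round-robin member assignment are assumptions for illustration. The partition count of 271 mirrors Hazelcast's default, but any positive number works here.

```java
class ShardingSketch {

    // Assumption: the partition is derived from hashCode(). A real data grid
    // typically hashes the serialized key instead.
    static int partitionOf(Object key, int partitionCount) {
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(key.hashCode(), partitionCount);
    }

    // Hypothetical round-robin assignment of partitions to cluster members.
    static int primaryOf(int partitionId, int memberCount) {
        return partitionId % memberCount;
    }

    // The backup copy lives on the next member, so losing one member loses no data
    // (as long as the cluster has at least two members).
    static int backupOf(int partitionId, int memberCount) {
        return (primaryOf(partitionId, memberCount) + 1) % memberCount;
    }

    public static void main(String[] args) {
        int partitionCount = 271;
        int memberCount = 3;
        for (String key : new String[] {"andrew", "pierre", "natasha"}) {
            int partition = partitionOf(key, partitionCount);
            System.out.println(key + " -> partition " + partition
                    + ", primary member " + primaryOf(partition, memberCount)
                    + ", backup member " + backupOf(partition, memberCount));
        }
    }
}
```

Each member holds only its share of the data (sharding), and every partition also exists on a second member (replication), so the two approaches are combined.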
What’s Hazelcast IMDG?
• In-memory Data Grid
• Apache v2 Licensed
• Distributed
• Caches (IMap, JCache)
• Java Collections (IList, ISet, IQueue)
• Messaging (Topic, RingBuffer)
• Computation (ExecutorService, M-R)
Green Primary Green Backup Green Shard
Problem
• Lambda serialization
Solution
• Serializable versions of the interfaces
• Introducing DistributedStream
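The lambda serialization problem and the "serializable versions of the interfaces" fix can be reproduced with the JDK alone. The sketch below is illustrative and does not show the actual interfaces from the talk: `SerializableFunction` and the helper methods are hypothetical names. It relies on the standard Java rule that a lambda is serializable only when its target type extends `java.io.Serializable`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Function;

class LambdaSerializationSketch {

    // Hypothetical serializable twin of java.util.function.Function.
    interface SerializableFunction<T, R> extends Function<T, R>, Serializable { }

    // Same lambda body, two different target types.
    static Function<String, Integer> plainLength() {
        return s -> s.length();
    }

    static SerializableFunction<String, Integer> serializableLength() {
        return s -> s.length();
    }

    static byte[] serialize(Object value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    // True if Java serialization accepts the object
    // (NotSerializableException is a subclass of IOException).
    static boolean canSerialize(Object value) {
        try {
            serialize(value);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    // Serializes and deserializes the value, as would happen when shipping
    // a lambda to another cluster member.
    @SuppressWarnings("unchecked")
    static <T> T roundTrip(T value) {
        try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(serialize(value)))) {
            return (T) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("plain lambda serializable: " + canSerialize(plainLength()));
        System.out.println("serializable lambda: " + canSerialize(serializableLength()));
        System.out.println(roundTrip(serializableLength()).apply("jet"));
    }
}
```

A distributed stream API can declare its method parameters with such serializable interface types, so that ordinary lambda syntax at the call site yields objects that can be sent over the network.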
Jet Streams
What’s Hazelcast Jet?
• General purpose distributed data processing framework
• Based on a Directed Acyclic Graph (DAG) to model data flow
• Built on top of Hazelcast IMDG
• Comparable to Apache Spark or Apache Flink
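To illustrate what modeling a data flow as a DAG means, here is a small plain-Java sketch. It does not use the Hazelcast Jet API: the class `DagSketch`, its methods, and the vertex names are hypothetical. Vertices stand for processing steps, edges for data flowing between them, and a topological sort (Kahn's algorithm) yields an order in which every step comes after the steps that feed it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DagSketch {

    // Adjacency list: vertex name -> names of downstream vertices.
    private final Map<String, List<String>> edges = new LinkedHashMap<>();

    DagSketch vertex(String name) {
        edges.putIfAbsent(name, new ArrayList<>());
        return this;
    }

    DagSketch edge(String from, String to) {
        vertex(from);
        vertex(to);
        edges.get(from).add(to);
        return this;
    }

    // Kahn's algorithm: repeatedly emit a vertex with no unprocessed inputs.
    List<String> executionOrder() {
        Map<String, Integer> inDegree = new LinkedHashMap<>();
        for (String v : edges.keySet()) {
            inDegree.put(v, 0);
        }
        for (List<String> targets : edges.values()) {
            for (String t : targets) {
                inDegree.merge(t, 1, Integer::sum);
            }
        }
        Deque<String> ready = new ArrayDeque<>();
        inDegree.forEach((v, d) -> {
            if (d == 0) {
                ready.add(v);
            }
        });
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String v = ready.poll();
            order.add(v);
            for (String t : edges.get(v)) {
                if (inDegree.merge(t, -1, Integer::sum) == 0) {
                    ready.add(t);
                }
            }
        }
        if (order.size() != edges.size()) {
            throw new IllegalStateException("graph contains a cycle");
        }
        return order;
    }

    public static void main(String[] args) {
        // Hypothetical word-count style data flow.
        DagSketch dag = new DagSketch()
                .edge("source", "tokenize")
                .edge("tokenize", "count")
                .edge("count", "sink");
        System.out.println(dag.executionOrder());
    }
}
```

The acyclic requirement is what makes such an ordering possible; a cycle would leave some vertices permanently waiting on each other, which the sketch reports as an error.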
DAG
Job Execution
Future (It’s bright!)
• Memory module for processing big data
• Higher level streaming and batching APIs
• Reactive Streams
• Distributed Classloading
• Integrations (HDFS/Yarn/Mesos)
Your fuel, our Jet Engine
• Public release: Feb 7th.
• Developer Preview today - yay!
• http://hazelcast.org/jet-signup
• Send me a note [email protected]
• Follow @hazelcast and @gamussa (duh!!)
• Your questions #hazelcast #hazelcastjet
Conclusion
• The Java Stream API provides a very wide range of data processing tools
• War and Peace is a big (a lot of data) book!
• Now we’re pretty sure that Andrew and Pierre are the main characters
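The conclusion hints at a word-frequency analysis of the book. A minimal single-JVM sketch of such a count with `java.util.stream` might look like the following; the class `WordCountSketch`, its methods, and the sample sentence are assumptions for illustration, and the distributed version shown in the talk is not reproduced here.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class WordCountSketch {

    // Splits the text on non-word characters and counts each lower-cased word.
    static Map<String, Long> wordCounts(String text) {
        return Arrays.stream(text.toLowerCase().split("\\W+"))
                .filter(word -> !word.isEmpty())
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    // Returns the word with the highest count, or an empty string for empty input.
    static String mostFrequent(String text) {
        return wordCounts(text).entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .orElse("");
    }

    public static void main(String[] args) {
        // Hypothetical sample text standing in for the full book.
        String text = "Pierre met Andrew. Pierre smiled.";
        System.out.println(wordCounts(text));
        System.out.println(mostFrequent(text));
    }
}
```

The same pipeline shape (split, group, count) is what a distributed stream would run in parallel across partitions of the data.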