How It Works - Spark
Yuri Ostapchuk
September 13, 2021
Programming
A series of talks on data engineering
Transcript
HOW IT WORKS: SPARK
PLAN
- Hadoop weak points
- Spark core ideas & concepts
- Applications & ecosystem
- Demo
RECAP: HADOOP & MAPREDUCE
PROBLEM: HADOOP WEAK POINTS
- slow: intermediate results are saved to disk
- complex: imperative style, overly verbose APIs, not accessible to regular humans
IDEA
- let's keep all data being processed in memory
- let's treat the whole dataset simply as a collection
- let's build a functional API for processing
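
The contrast between the Hadoop weak points and the idea above can be sketched in plain Python. This is a toy illustration only, not Hadoop or Spark code: the function names `disk_style` and `in_memory_style` and the sample `records` are made up for this sketch. The first function writes every intermediate result to a file before the next step reads it back; the second treats the data as a collection and chains functional operations in memory.

```python
import json
import os
import tempfile
from functools import reduce

# Hypothetical sample data, chosen only for this sketch.
records = [3, 8, 15, 4, 23, 42]


def disk_style(data):
    """Every step saves its output to disk and the next step reloads it."""
    with tempfile.TemporaryDirectory() as tmp:
        step1 = os.path.join(tmp, "step1.json")
        with open(step1, "w") as f:
            json.dump([x for x in data if x % 2 == 0], f)  # step 1: keep evens
        with open(step1) as f:
            evens = json.load(f)

        step2 = os.path.join(tmp, "step2.json")
        with open(step2, "w") as f:
            json.dump([x * x for x in evens], f)           # step 2: square
        with open(step2) as f:
            squares = json.load(f)

        return sum(squares)                                # step 3: aggregate


def in_memory_style(data):
    """The same pipeline as a chain of functional operations on a collection."""
    evens = filter(lambda x: x % 2 == 0, data)
    squares = map(lambda x: x * x, evens)
    return reduce(lambda a, b: a + b, squares, 0)


disk_total = disk_style(records)
memory_total = in_memory_style(records)
```

Both functions return the same total; the second one never touches the disk between steps and reads as a description of what to compute rather than how to shuffle files around.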
SPARK CORE CONCEPTS
RDD
Resilient Distributed Dataset
RDD FEATURES
- immutable
- lazy
- partitioned, location-aware & location-transparent
- persistence
- distributed, scalable
- in-memory
- fault-tolerant, lineage: child knows its parents
- functional API: declarative, typed
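
Several of the features listed above (immutable, lazy, partitioned, lineage) can be illustrated with a small toy model. The `ToyRDD` class below is hypothetical and written only for this sketch; it is not Spark's RDD implementation or API, and it runs in a single process. Each transformation returns a new object that remembers its parent, nothing is computed until `collect()` is called, and any single partition can be recomputed from the lineage.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class ToyRDD:
    """Toy stand-in for an RDD (hypothetical, not part of Spark)."""
    partitions: Tuple[tuple, ...] = ()           # source data, set only on the root
    parent: Optional["ToyRDD"] = None            # lineage: child knows its parent
    op: str = "source"                           # name of the transformation
    fn: Optional[Callable[[list], list]] = None  # applied to one partition at a time

    def map(self, f):
        # Lazy: returns a new dataset description, computes nothing.
        return ToyRDD(parent=self, op="map",
                      fn=lambda part: [f(x) for x in part])

    def filter(self, pred):
        return ToyRDD(parent=self, op="filter",
                      fn=lambda part: [x for x in part if pred(x)])

    def lineage(self):
        # Walk from this dataset back to the source.
        node, ops = self, []
        while node is not None:
            ops.append(node.op)
            node = node.parent
        return ops[::-1]

    def root(self):
        node = self
        while node.parent is not None:
            node = node.parent
        return node

    def compute_partition(self, i):
        # Recompute one partition from the source by replaying the lineage.
        if self.parent is None:
            return list(self.partitions[i])
        return self.fn(self.parent.compute_partition(i))

    def collect(self):
        # Action: only here is any work actually done.
        n = len(self.root().partitions)
        return [x for i in range(n) for x in self.compute_partition(i)]


calls = []  # records every element that `double` actually processes


def double(x):
    calls.append(x)
    return x * 2


base = ToyRDD(partitions=((1, 2, 3), (4, 5, 6)))     # two partitions
derived = base.map(double).filter(lambda x: x > 6)   # no work done yet
calls_before_action = list(calls)                    # still empty here
result = derived.collect()                           # [8, 10, 12]
```

If a partition were lost, calling `derived.compute_partition(1)` would rebuild just that partition from the source data, which is the intuition behind lineage-based fault tolerance.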
DAG
Directed Acyclic Graph
EXECUTION MODEL
DEPLOYMENT
API
COMPONENTS
SPARK SQL & DATAFRAME
- SQL API, functional API, typed/untyped
- interactive, analytical interface, unified programming model
- distributed, scalable
- code generation, out-of-the-box optimizations = Catalyst engine
- memory & binary & compute optimizations = Tungsten engine
- integration: multiple data sources, single representation, Hive metastore
ECOSYSTEM & USE CASES
DEMO
- spark-shell
- text file (rdd)
- load into memory
- filter, map, group by
- reduce
- save
- show UI
- show plan, explain
- caching
- rdd -> dataframe
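
The deck does not include the demo code itself. As a rough stand-in, the data flow of the first demo steps (load a text file, filter, map, group by, reduce, save) can be sketched in plain Python as a word count. Everything here is an assumption made for illustration: the sample text, the file names, and the `word_count` helper. It runs on one machine without Spark, so it shows only the shape of the pipeline, not distributed execution, the UI, query plans, or caching.

```python
import os
import tempfile
from collections import defaultdict
from functools import reduce


def word_count(lines):
    """Hypothetical helper: filter -> map -> group by -> reduce."""
    non_empty = [ln for ln in lines if ln.strip()]                # filter
    words = [w.lower() for ln in non_empty for w in ln.split()]   # map to words
    groups = defaultdict(list)
    for w in words:                                               # group by word
        groups[w].append(1)
    return {w: reduce(lambda a, b: a + b, ones)                   # reduce per key
            for w, ones in groups.items()}


with tempfile.TemporaryDirectory() as tmp:
    # Create a small sample text file (made-up content).
    src = os.path.join(tmp, "input.txt")
    with open(src, "w") as f:
        f.write("spark is fast\n\nhadoop is slow\nspark is lazy\n")

    # Load the text file into memory.
    with open(src) as f:
        lines = f.read().splitlines()

    counts = word_count(lines)

    # Save the result, one "word<TAB>count" line per word, sorted by word.
    dst = os.path.join(tmp, "counts.tsv")
    with open(dst, "w") as f:
        for word, n in sorted(counts.items()):
            f.write(f"{word}\t{n}\n")

    with open(dst) as f:
        saved = f.read()
```

In spark-shell the corresponding steps would typically go through RDD operations such as `textFile`, `filter`, `map`, `reduceByKey` and `saveAsTextFile`, with `cache()` for caching and `toDF()` for the RDD-to-DataFrame step; the exact calls used in the talk's demo are not shown in the transcript.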
PLACE OF SPARK IN BIG DATA ECOSYSTEM
CALL TO ACTION
- High Performance Spark - Holden Karau
- install spark, run spark-shell, load text file, play with it
- http://learn.mapr.com/dev-360-apache-spark-essentials