Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Rayon (Rust Belt Rust)
Search
nikomatsakis
October 28, 2016
Programming
7
1.1k
Rayon (Rust Belt Rust)
A talk about Rayon from the Rust Belt Rust conference
nikomatsakis
October 28, 2016
Tweet
Share
More Decks by nikomatsakis
See All by nikomatsakis
Hereditary Harrop Formulas (Papers We Love Boston)
nikomatsakis
2
480
Rust: Systems Programming for All!
nikomatsakis
0
180
CppNow 2017
nikomatsakis
0
200
Rust at Mozilla (part of Mozilla Onboarding)
nikomatsakis
0
170
Guaranteeing Memory Safety and Data-Race Freedom in Rust
nikomatsakis
0
240
Other Decks in Programming
See All in Programming
CloudflareのChat Agent Starter Kitで簡単!AIチャットボット構築
syumai
2
500
JSONataを使ってみよう Step Functionsが楽しくなる実践テクニック #devio2025
dafujii
1
590
AI Coding Agentのセキュリティリスク:PRの自己承認とメルカリの対策
s3h
0
230
Amazon RDS 向けに提供されている MCP Server と仕組みを調べてみた/jawsug-okayama-2025-aurora-mcp
takahashiikki
1
110
アプリの "かわいい" を支えるアニメーションツールRiveについて
uetyo
0
270
HTMLの品質ってなんだっけ? “HTMLクライテリア”の設計と実践
unachang113
4
2.9k
Zendeskのチケットを Amazon Bedrockで 解析した
ryokosuge
3
310
アルテニア コンサル/ITエンジニア向け 採用ピッチ資料
altenir
0
110
rage against annotate_predecessor
junk0612
0
170
Android端末で実現するオンデバイスLLM 2025
masayukisuda
1
170
パッケージ設計の黒魔術/Kyoto.go#63
lufia
3
440
実用的なGOCACHEPROG実装をするために / golang.tokyo #40
mazrean
1
290
Featured
See All Featured
Statistics for Hackers
jakevdp
799
220k
Principles of Awesome APIs and How to Build Them.
keavy
126
17k
Balancing Empowerment & Direction
lara
3
620
Six Lessons from altMBA
skipperchong
28
4k
[Rails World 2023 - Day 1 Closing Keynote] - The Magic of Rails
eileencodes
36
2.5k
GraphQLとの向き合い方2022年版
quramy
49
14k
XXLCSS - How to scale CSS and keep your sanity
sugarenia
248
1.3M
How to train your dragon (web standard)
notwaldorf
96
6.2k
Scaling GitHub
holman
463
140k
Thoughts on Productivity
jonyablonski
70
4.8k
Large-scale JavaScript Application Architecture
addyosmani
513
110k
個人開発の失敗を避けるイケてる考え方 / tips for indie hackers
panda_program
113
20k
Transcript
Rayon Data Parallelism for Fun and Profit Nicholas Matsakis (nmatsakis
on IRC)
Want to make parallelization easy 2 fn load_images(paths: &[PathBuf]) ->
Vec<Image> { paths.iter() .map(|path| Image::load(path)) .collect() } fn load_images(paths: &[PathBuf]) -> Vec<Image> { paths.par_iter() .map(|path| Image::load(path)) .collect() } For each path… …load an image… …create and return a vector.
Want to make parallelization safe 3 fn load_images(paths: &[PathBuf]) ->
Vec<Image> { let mut pngs = 0; paths.par_iter() .map(|path| { if path.ends_with(“png”) { pngs += 1; } Image::load(path) }) .collect() } Data-race Will not compile
4 http://blog.faraday.io/saved-by-the-compiler-parallelizing-a-loop-with-rust-and-rayon/
5 Parallel Iterators join() threadpool Basically all safe Safe interface
Unsafe impl Unsafe
6 fn load_images(paths: &[PathBuf]) -> Vec<Image> { paths.iter() .map(|path| Image::load(path))
.collect() }
7 fn load_images(paths: &[PathBuf]) -> Vec<Image> { paths.par_iter() .map(|path| Image::load(path))
.collect() }
Not quite that simple… 8 (but almost!) 1. No mutating
shared state (except for atomics, locks). 2. Some combinators are inherently sequential. 3. Some things aren’t implemented yet.
9 fn load_images(paths: &[PathBuf]) -> Vec<Image> { let mut pngs
= 0; paths.par_iter() .map(|path| { if path.ends_with(“png”) { pngs += 1; } Image::load(path) }) .collect() } Data-race Will not compile
10 `c` not shared between iterations! fn increment_all(counts: &mut [u32])
{ for c in counts.iter_mut() { *c += 1; } } fn increment_all(counts: &mut [u32]) { paths.par_iter_mut() .for_each(|c| *c += 1); }
fn load_images(paths: &[PathBuf]) -> Vec<Image> { let pngs = paths.par_iter()
.filter(|p| p.ends_with(“png”)) .map(|_| 1) .sum(); paths.par_iter() .map(|p| Image::load(p)) .collect() } 11
12 But beware: atomics introduce nondeterminism! use std::sync::atomic::{AtomicUsize, Ordering}; fn
load_images(paths: &[PathBuf]) -> Vec<Image> { let pngs = AtomicUsize::new(0); paths.par_iter() .map(|path| { if path.ends_with(“png”) { pngs.fetch_add(1, Ordering::SeqCst); } Image::load(path) }) .collect() }
13 3 2 1 12 0 4 5 1 2
1 3 2 1 0 1 3 4 0 3 6 7 8 vec1 vec2 6 2 6 * sum 8 82 fn dot_product(vec1: &[i32], vec2: &[i32]) -> i32 { vec1.iter() .zip(vec2) .map(|(e1, e2)| e1 * e2) .fold(0, |a, b| a + b) // aka .sum() }
14 fn dot_product(vec1: &[i32], vec2: &[i32]) -> i32 { vec1.par_iter()
.zip(vec2) .map(|(e1, e2)| e1 * e2) .reduce(|| 0, |a, b| a + b) // aka .sum() } 3 2 1 12 0 4 5 1 2 1 3 2 1 0 1 3 4 0 3 6 7 8 vec1 vec2 sum 20 19 43 39 82
15 Parallel iterators: Mostly like normal iterators, but: • closures
cannot mutate shared state • some operations are different For the most part, Rust protects you from surprises.
16 Parallel Iterators join() threadpool
The primitive: join() 17 rayon::join(|| do_something(…), || do_something_else(…)); Meaning: maybe
execute two closures in parallel. Idea: - add `join` wherever parallelism is possible - let the library decide when it is profitable
18 fn load_images(paths: &[PathBuf]) -> Vec<Image> { paths.par_iter() .map(|path| Image::load(path))
.collect() } Image::load(paths[0]) Image::load(paths[1])
Work stealing 19 Cilk: http://supertech.lcs.mit.edu/cilk/ (0..22) Thread A Thread B
(0..15) (15..22) (1..15) (queue) (queue) (0..1) (15..22) (15..18) (18..22) (15..16) (16..18) “stolen” (18..22) “stolen”
20
21 Parallel Iterators join() threadpool Rayon: • Parallelize for fun
and profit • Variety of APIs available • Future directions: • more iterators • integrate SIMD, array ops • integrate persistent trees • factor out threadpool
22 Parallel Iterators join() scope() threadpool
23 the scope `s` task `t1` task `t2` rayon::scope(|s| {
… s.spawn(move |s| { // task t1 }); s.spawn(move |s| { // task t2 }); … });
rayon::scope(|s| { … s.spawn(move |s| { // task t1 s.spawn(move
|s| { // task t2 … }); … }); … }); 24 the scope task t1 task t2
`not_ok` is freed here 25 the scope task t1 let
ok: &[u32]s = &[…]; rayon::scope(|scope| { … let not_ok: &[u32] = &[…]; … scope.spawn(move |scope| { // which variables can t1 use? }); });
26 fn join<A,B>(a: A, b: B) where A: FnOnce() +
Send, B: FnOnce() + Send, { rayon::scope(|scope| { scope.spawn(move |_| a()); scope.spawn(move |_| b()); }); } (Real join avoids heap allocation)
27 struct Tree<T> { value: T, children: Vec<Tree<T>>, } impl<T>
Tree<T> { fn process_all(&mut self) { process_value(&mut self.value); for child in &mut self.children { child.process_all(); } } }
28 impl<T> Tree<T> { fn process_all(&mut self) where T: Send
{ rayon::scope(|scope| { for child in &mut self.children { scope.spawn(move |_| child.process_all()); } process_value(&mut self.value); }); } }
29 impl<T> Tree<T> { fn process_all(&mut self) where T: Send
{ rayon::scope(|scope| { let children = &mut self.children; scope.spawn(move |scope| { for child in &mut children { scope.spawn(move |_| child.process_all()); } }); process_value(&mut self.value); }); } }
30 impl<T: Send> Tree<T> { fn process_all(&mut self) { rayon::scope(|s|
self.process_in(s)); } fn process_in<‘s>(&’s mut self, scope: &Scope<‘s>) { let children = &mut self.children; scope.spawn(move |scope| { for child in &mut children { scope.spawn(move |scope| child.process_in(scope)); } }); process_value(&mut self.value); } }