Rayon (Rust Belt Rust)
nikomatsakis
October 28, 2016
A talk about Rayon from the Rust Belt Rust conference
Transcript
Slide 1: Rayon: Data Parallelism for Fun and Profit
Nicholas Matsakis (nmatsakis on IRC)
Slide 2: Want to make parallelization easy

Sequential:

    fn load_images(paths: &[PathBuf]) -> Vec<Image> {
        paths.iter()
             .map(|path| Image::load(path))
             .collect()
    }

Parallel:

    fn load_images(paths: &[PathBuf]) -> Vec<Image> {
        paths.par_iter()                      // For each path…
             .map(|path| Image::load(path))   // …load an image…
             .collect()                       // …create and return a vector.
    }
Slide 3: Want to make parallelization safe

    fn load_images(paths: &[PathBuf]) -> Vec<Image> {
        let mut pngs = 0;
        paths.par_iter()
             .map(|path| {
                 if path.ends_with("png") {
                     pngs += 1; // data race: will not compile
                 }
                 Image::load(path)
             })
             .collect()
    }
Slide 4: http://blog.faraday.io/saved-by-the-compiler-parallelizing-a-loop-with-rust-and-rayon/
Slide 5 (layer diagram):
• Parallel Iterators: basically all safe
• join(): safe interface, unsafe impl
• threadpool: unsafe
Slide 6:

    fn load_images(paths: &[PathBuf]) -> Vec<Image> {
        paths.iter()
             .map(|path| Image::load(path))
             .collect()
    }
Slide 7:

    fn load_images(paths: &[PathBuf]) -> Vec<Image> {
        paths.par_iter()
             .map(|path| Image::load(path))
             .collect()
    }
Slide 8: Not quite that simple… (but almost!)
1. No mutating shared state (except for atomics, locks).
2. Some combinators are inherently sequential.
3. Some things aren't implemented yet.
Slide 9:

    fn load_images(paths: &[PathBuf]) -> Vec<Image> {
        let mut pngs = 0;
        paths.par_iter()
             .map(|path| {
                 if path.ends_with("png") {
                     pngs += 1; // data race: will not compile
                 }
                 Image::load(path)
             })
             .collect()
    }
Slide 10: `c` not shared between iterations!

Sequential:

    fn increment_all(counts: &mut [u32]) {
        for c in counts.iter_mut() {
            *c += 1;
        }
    }

Parallel:

    fn increment_all(counts: &mut [u32]) {
        counts.par_iter_mut()
              .for_each(|c| *c += 1);
    }
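The point that each `c` belongs to exactly one iteration can be tried without Rayon. The following is a minimal sketch using only the standard library: `std::thread::scope` and `chunks_mut` stand in for `par_iter_mut`, and the helper name `increment_all_chunked` and its `chunk_len` parameter are assumptions made for illustration, not part of Rayon.

```rust
use std::thread;

/// Increment every element, giving each thread its own disjoint chunk.
/// `chunk_len` is an illustrative parameter, not part of any Rayon API.
fn increment_all_chunked(counts: &mut [u32], chunk_len: usize) {
    assert!(chunk_len > 0, "chunk_len must be non-zero");
    thread::scope(|s| {
        // `chunks_mut` yields non-overlapping `&mut [u32]` slices, so no two
        // threads can ever hold a reference to the same element.
        for chunk in counts.chunks_mut(chunk_len) {
            s.spawn(move || {
                for c in chunk {
                    *c += 1;
                }
            });
        }
    }); // every spawned thread has been joined once `scope` returns
}

fn main() {
    let mut counts = vec![0, 1, 2, 3, 4, 5, 6];
    increment_all_chunked(&mut counts, 3);
    println!("{:?}", counts);
}
```

Unlike Rayon, this sketch starts one OS thread per chunk, so it shows the borrowing argument rather than an efficient schedule.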
Slide 11:

    fn load_images(paths: &[PathBuf]) -> Vec<Image> {
        // The result type of `sum` must be stated explicitly.
        let pngs: usize = paths.par_iter()
                               .filter(|p| p.ends_with("png"))
                               .map(|_| 1)
                               .sum();
        paths.par_iter()
             .map(|p| Image::load(p))
             .collect()
    }
Slide 12: But beware: atomics introduce nondeterminism!

    use std::sync::atomic::{AtomicUsize, Ordering};

    fn load_images(paths: &[PathBuf]) -> Vec<Image> {
        let pngs = AtomicUsize::new(0);
        paths.par_iter()
             .map(|path| {
                 if path.ends_with("png") {
                     pngs.fetch_add(1, Ordering::SeqCst);
                 }
                 Image::load(path)
             })
             .collect()
    }
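To see where the nondeterminism comes from, here is a small sketch that uses only the standard library, with scoped threads standing in for Rayon. The function name `count_pngs` and the idea of returning each thread's "ticket" (the counter value it observed) are assumptions for illustration. It takes plain `&str` names, so `ends_with(".png")` is a string-suffix test; `Path::ends_with`, by contrast, compares whole path components.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Count names ending in ".png", one thread per name.
/// Returns the final count plus, for each name, the counter value that
/// thread observed when it incremented (None for non-PNG names).
fn count_pngs(names: &[&str]) -> (usize, Vec<Option<usize>>) {
    let pngs = AtomicUsize::new(0);
    let tickets = thread::scope(|s| {
        let mut handles = Vec::new();
        for name in names {
            let pngs = &pngs; // share the atomic by reference
            handles.push(s.spawn(move || {
                if name.ends_with(".png") {
                    // `fetch_add` returns the previous value.
                    Some(pngs.fetch_add(1, Ordering::SeqCst))
                } else {
                    None
                }
            }));
        }
        handles
            .into_iter()
            .map(|h| h.join().unwrap())
            .collect::<Vec<_>>()
    });
    (pngs.load(Ordering::SeqCst), tickets)
}

fn main() {
    let (total, tickets) = count_pngs(&["a.png", "b.jpg", "c.png", "d.png"]);
    // `total` is always 3; which PNG drew ticket 0, 1, or 2 can vary per run.
    println!("total = {total}, tickets = {tickets:?}");
}
```

The final count is the same on every run, but any value read from the counter along the way depends on thread timing.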
Slide 13:

[Diagram: element-wise products accumulated left to right]
vec1: 3 2 1 12 0 4 5 1 2 1 3
vec2: 2 1 0 1 3 4 0 3 6 7 8
sum:  82

    fn dot_product(vec1: &[i32], vec2: &[i32]) -> i32 {
        vec1.iter()
            .zip(vec2)
            .map(|(e1, e2)| e1 * e2)
            .fold(0, |a, b| a + b) // aka .sum()
    }
Slide 14:

    fn dot_product(vec1: &[i32], vec2: &[i32]) -> i32 {
        vec1.par_iter()
            .zip(vec2)
            .map(|(e1, e2)| e1 * e2)
            .reduce(|| 0, |a, b| a + b) // aka .sum()
    }

[Diagram: partial sums combined as a tree]
vec1: 3 2 1 12 0 4 5 1 2 1 3
vec2: 2 1 0 1 3 4 0 3 6 7 8
sum:  partial sums 20, 19 and 43; 20 + 19 = 39; 39 + 43 = 82
Slide 15: Parallel iterators
Mostly like normal iterators, but:
• closures cannot mutate shared state
• some operations are different
For the most part, Rust protects you from surprises.
Slide 16 (layer diagram): Parallel Iterators, join(), threadpool
Slide 17: The primitive: join()

    rayon::join(|| do_something(…),
                || do_something_else(…));

Meaning: maybe execute two closures in parallel.
Idea:
- add `join` wherever parallelism is possible
- let the library decide when it is profitable
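As a way to experiment with the "add `join` wherever parallelism is possible" style without pulling in Rayon, here is a std-only stand-in. It is an assumption-laden sketch: this `join` always spawns a thread for its first closure, whereas the slide describes a primitive that only maybe runs the closures in parallel. The names `parallel_sum` and `SEQUENTIAL_CUTOFF` are hypothetical.

```rust
use std::thread;

/// Stand-in for a `join` primitive: runs `a` on a new scoped thread and `b`
/// on the current thread, then returns both results.
fn join<A, B, RA, RB>(a: A, b: B) -> (RA, RB)
where
    A: FnOnce() -> RA + Send,
    B: FnOnce() -> RB,
    RA: Send,
{
    thread::scope(|s| {
        let handle = s.spawn(a);
        let rb = b();
        (handle.join().unwrap(), rb)
    })
}

/// Sum a slice by recursive halving; each `join` marks a point where the two
/// halves could run in parallel.
fn parallel_sum(data: &[u64]) -> u64 {
    // Below this size, splitting costs more than it saves (arbitrary value).
    const SEQUENTIAL_CUTOFF: usize = 1024;
    if data.len() <= SEQUENTIAL_CUTOFF {
        return data.iter().sum();
    }
    let (left, right) = data.split_at(data.len() / 2);
    let (l, r) = join(|| parallel_sum(left), || parallel_sum(right));
    l + r
}

fn main() {
    let data: Vec<u64> = (1..=10_000).collect();
    println!("{}", parallel_sum(&data)); // prints 50005000
}
```

Because this stand-in cannot decide when parallelism is profitable, the cutoff has to be chosen by hand; leaving that decision to the library is the design point the slide makes.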
Slide 18:

    fn load_images(paths: &[PathBuf]) -> Vec<Image> {
        paths.par_iter()
             .map(|path| Image::load(path))
             .collect()
    }

[Diagram: separate tasks Image::load(paths[0]) and Image::load(paths[1])]
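Slide 19: Work stealing
Cilk: http://supertech.lcs.mit.edu/cilk/

[Diagram: Thread A and Thread B, each with a queue]
Thread A: (0..22) splits into (0..15) and (15..22); (0..15) splits into (0..1), with (1..15) left in the queue.
Thread B: (15..22) "stolen"; splits into (15..18) and (18..22); (15..18) splits into (15..16) and (16..18).
(18..22) "stolen"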
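The queue discipline in the diagram can be modeled in a few lines. This is a single-threaded toy model under stated assumptions, not Rayon's or Cilk's implementation: the `Worker` type and its methods are hypothetical, and real work-stealing deques are concurrent. It captures one idea: the owner pushes and pops at the back of its queue, while a thief takes from the front, which holds the oldest and therefore largest piece of work.

```rust
use std::collections::VecDeque;
use std::ops::Range;

/// Toy model of one worker's queue of pending ranges.
struct Worker {
    queue: VecDeque<Range<usize>>,
}

impl Worker {
    fn new() -> Self {
        Worker { queue: VecDeque::new() }
    }

    /// Split `range` at `mid`: queue the right part, keep working on the left.
    fn split(&mut self, range: Range<usize>, mid: usize) -> Range<usize> {
        assert!(range.start <= mid && mid <= range.end, "mid out of range");
        self.queue.push_back(mid..range.end);
        range.start..mid
    }

    /// The owner resumes with the most recently queued range.
    fn pop(&mut self) -> Option<Range<usize>> {
        self.queue.pop_back()
    }

    /// A thief takes the oldest queued range.
    fn steal(&mut self) -> Option<Range<usize>> {
        self.queue.pop_front()
    }
}

fn main() {
    // Replay the first steps of the diagram for Thread A.
    let mut a = Worker::new();
    let current = a.split(0..22, 15); // queue: [15..22]
    let current = a.split(current, 1); // queue: [15..22, 1..15]
    println!("A works on {:?}", current);
    println!("B steals {:?}", a.steal());
    println!("A continues with {:?}", a.pop());
}
```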
Slide 20: (no text in transcript)
Slide 21 (layer diagram): Parallel Iterators, join(), threadpool
Rayon:
• Parallelize for fun and profit
• Variety of APIs available
• Future directions:
  • more iterators
  • integrate SIMD, array ops
  • integrate persistent trees
  • factor out threadpool
Slide 22 (layer diagram): Parallel Iterators, join(), scope(), threadpool
Slide 23:

[Diagram: the scope `s` containing task `t1` and task `t2`]

    rayon::scope(|s| {
        …
        s.spawn(move |s| {
            // task t1
        });
        s.spawn(move |s| {
            // task t2
        });
        …
    });
Slide 24:

    rayon::scope(|s| {
        …
        s.spawn(move |s| {
            // task t1
            s.spawn(move |s| {
                // task t2
                …
            });
            …
        });
        …
    });

[Diagram: the scope containing task t1, which spawns task t2]
Slide 25:

    let ok: &[u32] = &[…];
    rayon::scope(|scope| {
        …
        let not_ok: &[u32] = &[…];
        …
        scope.spawn(move |scope| {
            // task t1: which variables can t1 use?
        });
    }); // `not_ok` is freed here

[Diagram: the scope containing task t1]
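The same question can be explored with the standard library's `std::thread::scope`, which is a separate API from `rayon::scope` but enforces a similar rule: a spawned task may borrow data that outlives the scope, and may not borrow data created inside the scope's closure. The sketch below is an analogy under that assumption; the function name `sum_in_scope` is hypothetical.

```rust
use std::thread;

/// Sum `ok` on a scoped thread. `ok` is borrowed from the caller, so it is
/// guaranteed to outlive every task spawned in the scope.
fn sum_in_scope(ok: &[u32]) -> u32 {
    thread::scope(|s| {
        // A value created here lives only until the closure returns, which
        // is before the scope has finished joining its threads:
        //
        //     let not_ok = vec![1, 2, 3];
        //     s.spawn(|| not_ok.iter().sum::<u32>());
        //
        // The compiler rejects that borrow because `not_ok` does not live
        // long enough. Moving the value into the task would be fine.

        let handle = s.spawn(move || ok.iter().sum::<u32>());
        handle.join().unwrap()
    })
}

fn main() {
    let data = [1, 2, 3, 4];
    println!("{}", sum_in_scope(&data)); // prints 10
}
```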
Slide 26:

    fn join<A, B>(a: A, b: B)
        where A: FnOnce() + Send,
              B: FnOnce() + Send,
    {
        rayon::scope(|scope| {
            scope.spawn(move |_| a());
            scope.spawn(move |_| b());
        });
    }

(Real join avoids heap allocation)
Slide 27:

    struct Tree<T> {
        value: T,
        children: Vec<Tree<T>>,
    }

    impl<T> Tree<T> {
        fn process_all(&mut self) {
            process_value(&mut self.value);
            for child in &mut self.children {
                child.process_all();
            }
        }
    }
Slide 28:

    impl<T> Tree<T> {
        fn process_all(&mut self) where T: Send {
            rayon::scope(|scope| {
                for child in &mut self.children {
                    scope.spawn(move |_| child.process_all());
                }
                process_value(&mut self.value);
            });
        }
    }
Slide 29:

    impl<T> Tree<T> {
        fn process_all(&mut self) where T: Send {
            rayon::scope(|scope| {
                let children = &mut self.children;
                scope.spawn(move |scope| {
                    // `children` is already a `&mut Vec`, so iterate it directly.
                    for child in children {
                        scope.spawn(move |_| child.process_all());
                    }
                });
                process_value(&mut self.value);
            });
        }
    }
Slide 30:

    impl<T: Send> Tree<T> {
        fn process_all(&mut self) {
            rayon::scope(|s| self.process_in(s));
        }

        fn process_in<'s>(&'s mut self, scope: &Scope<'s>) {
            let children = &mut self.children;
            scope.spawn(move |scope| {
                // `children` is already a `&mut Vec`, so iterate it directly.
                for child in children {
                    scope.spawn(move |scope| child.process_in(scope));
                }
            });
            process_value(&mut self.value);
        }
    }
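For readers who want to run the tree traversal without Rayon, the sketch below ports the one-scope-per-node shape to `std::thread::scope`. Several things here are assumptions for illustration: the tree holds `u32` values, doubling the value stands in for `process_value` (which the slides do not define), and `total` is a hypothetical helper for inspecting the result. It opens a new scope and an OS thread per child, so it only suits small trees.

```rust
use std::thread;

struct Tree {
    value: u32,
    children: Vec<Tree>,
}

impl Tree {
    /// Process every node, handling the children of each node in parallel.
    fn process_all(&mut self) {
        thread::scope(|s| {
            // Each child is a distinct `&mut Tree`, so the tasks never alias.
            for child in &mut self.children {
                s.spawn(move || child.process_all());
            }
            // Hypothetical stand-in for `process_value`: double the value.
            // `self.value` is a different field from `self.children`, so it
            // can be mutated while the child tasks are still running.
            self.value *= 2;
        }); // all child tasks are joined before `process_all` returns
    }

    /// Sum of all values in the tree (helper for checking the result).
    fn total(&self) -> u32 {
        self.value + self.children.iter().map(Tree::total).sum::<u32>()
    }
}

fn main() {
    let mut tree = Tree {
        value: 1,
        children: vec![
            Tree { value: 2, children: vec![] },
            Tree {
                value: 3,
                children: vec![Tree { value: 4, children: vec![] }],
            },
        ],
    };
    tree.process_all();
    println!("{}", tree.total()); // prints 20
}
```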