Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Rayon (Rust Belt Rust)
Search
Sponsored
·
SiteGround - Reliable hosting with speed, security, and support you can count on.
→
nikomatsakis
October 28, 2016
Programming
1.1k
7
Share
Rayon (Rust Belt Rust)
A talk about Rayon from the Rust Belt Rust conference
nikomatsakis
October 28, 2016
More Decks by nikomatsakis
See All by nikomatsakis
Hereditary Harrop Formulas (Papers We Love Boston)
nikomatsakis
2
520
Rust: Systems Programming for All!
nikomatsakis
0
210
CppNow 2017
nikomatsakis
0
240
Rust at Mozilla (part of Mozilla Onboarding)
nikomatsakis
0
200
Guaranteeing Memory Safety and Data-Race Freedom in Rust
nikomatsakis
0
280
Other Decks in Programming
See All in Programming
Moments When Things Go Wrong
aurimas
3
110
横断組織出身のQAEがインプロセスQAEでつまずいたこと・活かせたこと
ty89
0
180
密結合なバックエンドから TypeScript のコードを生成する
kemuridama
1
340
誰も頼んでない機能を出荷した話
zekutax
0
130
次世代リンターで探る、tsgo 時代における型認識カスタムルールの現実解
ytakahashii
1
970
技術記事、AIに書かせるか、自分で書くか? 〜それでも私が自分の手で書く理由〜 / #QiitaConference
jnchito
2
420
Migrations : C'est une question d'hygiène !
vinceamstoutz
0
1.8k
AI駆動開発勉強会 広島支部 第一回勉強会 AI駆動開発概要とワークショップ
hayatoshimiu
0
370
柔軟なPDFレイアウトエディタを支える型システム設計 — Discriminated UnionとConditional Typeの実践
minako__ph
4
580
生成AI時代にこそ効くGo | Why Go Works in the Age of Generative AI
mom0tomo
5
1.3k
CLIであることを活かしたGitHub Copilot CLI活用術 / GitHub Copilot CLI Pro Tips & Tricks
nao_mk2
1
990
ビジネスモデルから紐解く、AI+型駆動開発
hirokiomote
2
2.1k
Featured
See All Featured
Automating Front-end Workflow
addyosmani
1370
210k
Organizational Design Perspectives: An Ontology of Organizational Design Elements
kimpetersen
PRO
1
700
The Hidden Cost of Media on the Web [PixelPalooza 2025]
tammyeverts
2
310
Getting science done with accelerated Python computing platforms
jacobtomlinson
2
210
BBQ
matthewcrist
89
10k
The SEO identity crisis: Don't let AI make you average
varn
0
470
What's in a price? How to price your products and services
michaelherold
247
13k
Conquering PDFs: document understanding beyond plain text
inesmontani
PRO
4
2.7k
4 Signs Your Business is Dying
shpigford
187
22k
Skip the Path - Find Your Career Trail
mkilby
1
130
Joys of Absence: A Defence of Solitary Play
codingconduct
1
370
Principles of Awesome APIs and How to Build Them.
keavy
128
17k
Transcript
Rayon Data Parallelism for Fun and Profit Nicholas Matsakis (nmatsakis
on IRC)
Want to make parallelization easy 2 fn load_images(paths: &[PathBuf]) ->
Vec<Image> { paths.iter() .map(|path| Image::load(path)) .collect() } fn load_images(paths: &[PathBuf]) -> Vec<Image> { paths.par_iter() .map(|path| Image::load(path)) .collect() } For each path… …load an image… …create and return a vector.
Want to make parallelization safe 3 fn load_images(paths: &[PathBuf]) ->
Vec<Image> { let mut pngs = 0; paths.par_iter() .map(|path| { if path.ends_with(“png”) { pngs += 1; } Image::load(path) }) .collect() } Data-race Will not compile
4 http://blog.faraday.io/saved-by-the-compiler-parallelizing-a-loop-with-rust-and-rayon/
5 Parallel Iterators join() threadpool Basically all safe Safe interface
Unsafe impl Unsafe
6 fn load_images(paths: &[PathBuf]) -> Vec<Image> { paths.iter() .map(|path| Image::load(path))
.collect() }
7 fn load_images(paths: &[PathBuf]) -> Vec<Image> { paths.par_iter() .map(|path| Image::load(path))
.collect() }
Not quite that simple… 8 (but almost!) 1. No mutating
shared state (except for atomics, locks). 2. Some combinators are inherently sequential. 3. Some things aren’t implemented yet.
9 fn load_images(paths: &[PathBuf]) -> Vec<Image> { let mut pngs
= 0; paths.par_iter() .map(|path| { if path.ends_with(“png”) { pngs += 1; } Image::load(path) }) .collect() } Data-race Will not compile
10 `c` not shared between iterations! fn increment_all(counts: &mut [u32])
{ for c in counts.iter_mut() { *c += 1; } } fn increment_all(counts: &mut [u32]) { paths.par_iter_mut() .for_each(|c| *c += 1); }
fn load_images(paths: &[PathBuf]) -> Vec<Image> { let pngs = paths.par_iter()
.filter(|p| p.ends_with(“png”)) .map(|_| 1) .sum(); paths.par_iter() .map(|p| Image::load(p)) .collect() } 11
12 But beware: atomics introduce nondeterminism! use std::sync::atomic::{AtomicUsize, Ordering}; fn
load_images(paths: &[PathBuf]) -> Vec<Image> { let pngs = AtomicUsize::new(0); paths.par_iter() .map(|path| { if path.ends_with(“png”) { pngs.fetch_add(1, Ordering::SeqCst); } Image::load(path) }) .collect() }
13 3 2 1 12 0 4 5 1 2
1 3 2 1 0 1 3 4 0 3 6 7 8 vec1 vec2 6 2 6 * sum 8 82 fn dot_product(vec1: &[i32], vec2: &[i32]) -> i32 { vec1.iter() .zip(vec2) .map(|(e1, e2)| e1 * e2) .fold(0, |a, b| a + b) // aka .sum() }
14 fn dot_product(vec1: &[i32], vec2: &[i32]) -> i32 { vec1.par_iter()
.zip(vec2) .map(|(e1, e2)| e1 * e2) .reduce(|| 0, |a, b| a + b) // aka .sum() } 3 2 1 12 0 4 5 1 2 1 3 2 1 0 1 3 4 0 3 6 7 8 vec1 vec2 sum 20 19 43 39 82
15 Parallel iterators: Mostly like normal iterators, but: • closures
cannot mutate shared state • some operations are different For the most part, Rust protects you from surprises.
16 Parallel Iterators join() threadpool
The primitive: join() 17 rayon::join(|| do_something(…), || do_something_else(…)); Meaning: maybe
execute two closures in parallel. Idea: - add `join` wherever parallelism is possible - let the library decide when it is profitable
18 fn load_images(paths: &[PathBuf]) -> Vec<Image> { paths.par_iter() .map(|path| Image::load(path))
.collect() } Image::load(paths[0]) Image::load(paths[1])
Work stealing 19 Cilk: http://supertech.lcs.mit.edu/cilk/ (0..22) Thread A Thread B
(0..15) (15..22) (1..15) (queue) (queue) (0..1) (15..22) (15..18) (18..22) (15..16) (16..18) “stolen” (18..22) “stolen”
20
21 Parallel Iterators join() threadpool Rayon: • Parallelize for fun
and profit • Variety of APIs available • Future directions: • more iterators • integrate SIMD, array ops • integrate persistent trees • factor out threadpool
22 Parallel Iterators join() scope() threadpool
23 the scope `s` task `t1` task `t2` rayon::scope(|s| {
… s.spawn(move |s| { // task t1 }); s.spawn(move |s| { // task t2 }); … });
rayon::scope(|s| { … s.spawn(move |s| { // task t1 s.spawn(move
|s| { // task t2 … }); … }); … }); 24 the scope task t1 task t2
`not_ok` is freed here 25 the scope task t1 let
ok: &[u32]s = &[…]; rayon::scope(|scope| { … let not_ok: &[u32] = &[…]; … scope.spawn(move |scope| { // which variables can t1 use? }); });
26 fn join<A,B>(a: A, b: B) where A: FnOnce() +
Send, B: FnOnce() + Send, { rayon::scope(|scope| { scope.spawn(move |_| a()); scope.spawn(move |_| b()); }); } (Real join avoids heap allocation)
27 struct Tree<T> { value: T, children: Vec<Tree<T>>, } impl<T>
Tree<T> { fn process_all(&mut self) { process_value(&mut self.value); for child in &mut self.children { child.process_all(); } } }
28 impl<T> Tree<T> { fn process_all(&mut self) where T: Send
{ rayon::scope(|scope| { for child in &mut self.children { scope.spawn(move |_| child.process_all()); } process_value(&mut self.value); }); } }
29 impl<T> Tree<T> { fn process_all(&mut self) where T: Send
{ rayon::scope(|scope| { let children = &mut self.children; scope.spawn(move |scope| { for child in &mut children { scope.spawn(move |_| child.process_all()); } }); process_value(&mut self.value); }); } }
30 impl<T: Send> Tree<T> { fn process_all(&mut self) { rayon::scope(|s|
self.process_in(s)); } fn process_in<‘s>(&’s mut self, scope: &Scope<‘s>) { let children = &mut self.children; scope.spawn(move |scope| { for child in &mut children { scope.spawn(move |scope| child.process_in(scope)); } }); process_value(&mut self.value); } }