Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Rayon (Rust Belt Rust)
Search
Sponsored
·
Your Podcast. Everywhere. Effortlessly.
Share. Educate. Inspire. Entertain. You do you. We'll handle the rest.
→
nikomatsakis
October 28, 2016
Programming
7
1.1k
Rayon (Rust Belt Rust)
A talk about Rayon from the Rust Belt Rust conference
nikomatsakis
October 28, 2016
Tweet
Share
More Decks by nikomatsakis
See All by nikomatsakis
Hereditary Harrop Formulas (Papers We Love Boston)
nikomatsakis
2
500
Rust: Systems Programming for All!
nikomatsakis
0
200
CppNow 2017
nikomatsakis
0
220
Rust at Mozilla (part of Mozilla Onboarding)
nikomatsakis
0
190
Guaranteeing Memory Safety and Data-Race Freedom in Rust
nikomatsakis
0
260
Other Decks in Programming
See All in Programming
コントリビューターによるDenoのすゝめ / Deno Recommendations by a Contributor
petamoriken
0
200
副作用をどこに置くか問題:オブジェクト指向で整理する設計判断ツリー
koxya
1
610
Honoを使ったリモートMCPサーバでAIツールとの連携を加速させる!
tosuri13
1
180
MDN Web Docs に日本語翻訳でコントリビュート
ohmori_yusuke
0
650
今こそ知るべき耐量子計算機暗号(PQC)入門 / PQC: What You Need to Know Now
mackey0225
3
380
AI によるインシデント初動調査の自動化を行う AI インシデントコマンダーを作った話
azukiazusa1
1
730
[KNOTS 2026登壇資料]AIで拡張‧交差する プロダクト開発のプロセス および携わるメンバーの役割
hisatake
0
280
AIによる高速開発をどう制御するか? ガードレール設置で開発速度と品質を両立させたチームの事例
tonkotsuboy_com
7
2.3k
Smart Handoff/Pickup ガイド - Claude Code セッション管理
yukiigarashi
0
140
高速開発のためのコード整理術
sutetotanuki
1
400
「ブロックテーマでは再現できない」は本当か?
inc2734
0
1k
AI巻き込み型コードレビューのススメ
nealle
1
280
Featured
See All Featured
コードの90%をAIが書く世界で何が待っているのか / What awaits us in a world where 90% of the code is written by AI
rkaga
60
42k
Kristin Tynski - Automating Marketing Tasks With AI
techseoconnect
PRO
0
140
Music & Morning Musume
bryan
47
7.1k
Unlocking the hidden potential of vector embeddings in international SEO
frankvandijk
0
170
My Coaching Mixtape
mlcsv
0
48
Code Reviewing Like a Champion
maltzj
527
40k
Information Architects: The Missing Link in Design Systems
soysaucechin
0
770
Digital Projects Gone Horribly Wrong (And the UX Pros Who Still Save the Day) - Dean Schuster
uxyall
0
350
30 Presentation Tips
portentint
PRO
1
220
Raft: Consensus for Rubyists
vanstee
141
7.3k
Believing is Seeing
oripsolob
1
55
The Spectacular Lies of Maps
axbom
PRO
1
520
Transcript
Rayon Data Parallelism for Fun and Profit Nicholas Matsakis (nmatsakis
on IRC)
Want to make parallelization easy 2 fn load_images(paths: &[PathBuf]) ->
Vec<Image> { paths.iter() .map(|path| Image::load(path)) .collect() } fn load_images(paths: &[PathBuf]) -> Vec<Image> { paths.par_iter() .map(|path| Image::load(path)) .collect() } For each path… …load an image… …create and return a vector.
Want to make parallelization safe 3 fn load_images(paths: &[PathBuf]) ->
Vec<Image> { let mut pngs = 0; paths.par_iter() .map(|path| { if path.ends_with(“png”) { pngs += 1; } Image::load(path) }) .collect() } Data-race Will not compile
4 http://blog.faraday.io/saved-by-the-compiler-parallelizing-a-loop-with-rust-and-rayon/
5 Parallel Iterators join() threadpool Basically all safe Safe interface
Unsafe impl Unsafe
6 fn load_images(paths: &[PathBuf]) -> Vec<Image> { paths.iter() .map(|path| Image::load(path))
.collect() }
7 fn load_images(paths: &[PathBuf]) -> Vec<Image> { paths.par_iter() .map(|path| Image::load(path))
.collect() }
Not quite that simple… 8 (but almost!) 1. No mutating
shared state (except for atomics, locks). 2. Some combinators are inherently sequential. 3. Some things aren’t implemented yet.
9 fn load_images(paths: &[PathBuf]) -> Vec<Image> { let mut pngs
= 0; paths.par_iter() .map(|path| { if path.ends_with(“png”) { pngs += 1; } Image::load(path) }) .collect() } Data-race Will not compile
10 `c` not shared between iterations! fn increment_all(counts: &mut [u32])
{ for c in counts.iter_mut() { *c += 1; } } fn increment_all(counts: &mut [u32]) { paths.par_iter_mut() .for_each(|c| *c += 1); }
fn load_images(paths: &[PathBuf]) -> Vec<Image> { let pngs = paths.par_iter()
.filter(|p| p.ends_with(“png”)) .map(|_| 1) .sum(); paths.par_iter() .map(|p| Image::load(p)) .collect() } 11
12 But beware: atomics introduce nondeterminism! use std::sync::atomic::{AtomicUsize, Ordering}; fn
load_images(paths: &[PathBuf]) -> Vec<Image> { let pngs = AtomicUsize::new(0); paths.par_iter() .map(|path| { if path.ends_with(“png”) { pngs.fetch_add(1, Ordering::SeqCst); } Image::load(path) }) .collect() }
13 3 2 1 12 0 4 5 1 2
1 3 2 1 0 1 3 4 0 3 6 7 8 vec1 vec2 6 2 6 * sum 8 82 fn dot_product(vec1: &[i32], vec2: &[i32]) -> i32 { vec1.iter() .zip(vec2) .map(|(e1, e2)| e1 * e2) .fold(0, |a, b| a + b) // aka .sum() }
14 fn dot_product(vec1: &[i32], vec2: &[i32]) -> i32 { vec1.par_iter()
.zip(vec2) .map(|(e1, e2)| e1 * e2) .reduce(|| 0, |a, b| a + b) // aka .sum() } 3 2 1 12 0 4 5 1 2 1 3 2 1 0 1 3 4 0 3 6 7 8 vec1 vec2 sum 20 19 43 39 82
15 Parallel iterators: Mostly like normal iterators, but: • closures
cannot mutate shared state • some operations are different For the most part, Rust protects you from surprises.
16 Parallel Iterators join() threadpool
The primitive: join() 17 rayon::join(|| do_something(…), || do_something_else(…)); Meaning: maybe
execute two closures in parallel. Idea: - add `join` wherever parallelism is possible - let the library decide when it is profitable
18 fn load_images(paths: &[PathBuf]) -> Vec<Image> { paths.par_iter() .map(|path| Image::load(path))
.collect() } Image::load(paths[0]) Image::load(paths[1])
Work stealing 19 Cilk: http://supertech.lcs.mit.edu/cilk/ (0..22) Thread A Thread B
(0..15) (15..22) (1..15) (queue) (queue) (0..1) (15..22) (15..18) (18..22) (15..16) (16..18) “stolen” (18..22) “stolen”
20
21 Parallel Iterators join() threadpool Rayon: • Parallelize for fun
and profit • Variety of APIs available • Future directions: • more iterators • integrate SIMD, array ops • integrate persistent trees • factor out threadpool
22 Parallel Iterators join() scope() threadpool
23 the scope `s` task `t1` task `t2` rayon::scope(|s| {
… s.spawn(move |s| { // task t1 }); s.spawn(move |s| { // task t2 }); … });
rayon::scope(|s| { … s.spawn(move |s| { // task t1 s.spawn(move
|s| { // task t2 … }); … }); … }); 24 the scope task t1 task t2
`not_ok` is freed here 25 the scope task t1 let
ok: &[u32]s = &[…]; rayon::scope(|scope| { … let not_ok: &[u32] = &[…]; … scope.spawn(move |scope| { // which variables can t1 use? }); });
26 fn join<A,B>(a: A, b: B) where A: FnOnce() +
Send, B: FnOnce() + Send, { rayon::scope(|scope| { scope.spawn(move |_| a()); scope.spawn(move |_| b()); }); } (Real join avoids heap allocation)
27 struct Tree<T> { value: T, children: Vec<Tree<T>>, } impl<T>
Tree<T> { fn process_all(&mut self) { process_value(&mut self.value); for child in &mut self.children { child.process_all(); } } }
28 impl<T> Tree<T> { fn process_all(&mut self) where T: Send
{ rayon::scope(|scope| { for child in &mut self.children { scope.spawn(move |_| child.process_all()); } process_value(&mut self.value); }); } }
29 impl<T> Tree<T> { fn process_all(&mut self) where T: Send
{ rayon::scope(|scope| { let children = &mut self.children; scope.spawn(move |scope| { for child in &mut children { scope.spawn(move |_| child.process_all()); } }); process_value(&mut self.value); }); } }
30 impl<T: Send> Tree<T> { fn process_all(&mut self) { rayon::scope(|s|
self.process_in(s)); } fn process_in<‘s>(&’s mut self, scope: &Scope<‘s>) { let children = &mut self.children; scope.spawn(move |scope| { for child in &mut children { scope.spawn(move |scope| child.process_in(scope)); } }); process_value(&mut self.value); } }