Slide 1

Slide 1 text

RUST FOR SAFE & EFFICIENT PROGRAMS
DEVELOPER WORK DISCUSSION | 02 • 10 • 2017

Slide 2

Slide 2 text

Systems programming language
⇒ C-level abstraction
⇒ C/C++/Java-like syntax
⇒ statically typed, functional + imperative

Slide 3

Slide 3 text

Systems programming language
⇒ C-level abstraction
⇒ C/C++/Java-like syntax
⇒ statically typed, functional + imperative

Open source + Mozilla-sponsored
⇒ first release: 2010
⇒ latest stable release: August 2017
⇒ Servo: a browser rendering engine

Slide 4

Slide 4 text

Systems programming language
⇒ C-level abstraction
⇒ C/C++/Java-like syntax
⇒ statically typed, functional + imperative

Open source + Mozilla-sponsored
⇒ first release: 2010
⇒ latest stable release: August 2017
⇒ Servo: a browser rendering engine

Performance + memory safety
⇒ without using garbage collectors
⇒ without explicit free or delete
⇒ with ownership, borrowing, & lifetimes
⇒ enables safe concurrent programming (not discussed in this talk)

Slide 5

Slide 5 text

Performance + memory safety
⇒ ownership, borrowing, and lifetimes
⇒ safe, default context

Trait-based polymorphism
⇒ composition instead of inheritance

Enum types (tagged unions or sum types)
⇒ Option: None or Some(5) instead of NULLs
⇒ Result: Ok(T) or Err(E) instead of exceptions

Modular
⇒ crates + modules instead of includes
⇒ private by default

Tooling
⇒ rustup for bootstrapping
⇒ cargo for building, testing, benchmarking, publishing, etc.

Many more!
⇒ macros, pattern matching, etc.
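The Option and Result bullets above can be sketched as follows. The function names `first_even`, `parse_rating`, and `describe_rating` are hypothetical and exist only for this illustration; `Option`, `Result`, and `str::parse` come from the standard library.

```rust
use std::num::ParseIntError;

/// Hypothetical helper: returns the first even number, or `None` if there is none.
fn first_even(values: &[i32]) -> Option<i32> {
    values.iter().copied().find(|v| v % 2 == 0)
}

/// Hypothetical helper: parses a rating and reports failure as an `Err`
/// value instead of throwing an exception.
fn parse_rating(text: &str) -> Result<u32, ParseIntError> {
    text.trim().parse::<u32>()
}

/// Callers have to handle both cases explicitly, here with `match`.
fn describe_rating(text: &str) -> String {
    match parse_rating(text) {
        Ok(rating) => format!("rating {}", rating),
        Err(err) => format!("invalid rating: {}", err),
    }
}
```

Calling `first_even(&[1, 3])` yields `None` rather than a null pointer, and `parse_rating("five")` yields an `Err` that the caller cannot silently ignore.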

Slide 6

Slide 6 text

Process memory layout

stack
⇒ allocation via pointer moves
⇒ deallocation via pointer moves (LIFO)
⇒ grows and shrinks contiguously
⇒ has a maximum size ($ ulimit -s)

heap
⇒ allocation via malloc/new
⇒ deallocation via free/delete
⇒ for values that may outlive their scope
⇒ may become fragmented
⇒ practically limited by remaining vmem

[Diagram: 32-bit address space on Linux (not to scale). From the top: kernel vmem space (1 GB), then 3 GB of user space containing the stack (%esp →), the mmap region with shared libs, the heap (brk →), static variables, and program text (address 0x08048000 →), down to address 0.]

Slide 7

Slide 7 text

Memory unsafety vs memory leaks

memory unsafety
⇒ corruption of the stack/heap
⇒ undefined behavior
⇒ what Rust (the language) prevents

unsafe examples
⇒ dereferencing an invalid pointer
⇒ dereferencing a null pointer
⇒ multiple / invalid free calls

memory leaks
⇒ not releasing unused memory
⇒ impairs performance but behavior is defined
⇒ in some cases, possible in Rust

leak examples
⇒ missing free calls
⇒ circular references
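As a hedged illustration of the "circular references" leak case, here is a minimal sketch using the standard library's `Rc` and `RefCell`. The `Node` type and the `make_cycle` function are made up for this example.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical node type whose `next` link can be set after construction.
struct Node {
    next: RefCell<Option<Rc<Node>>>,
}

/// Builds two nodes that point at each other and returns the strong
/// reference count of the first node while both local handles are alive.
fn make_cycle() -> usize {
    let a = Rc::new(Node { next: RefCell::new(None) });
    let b = Rc::new(Node { next: RefCell::new(Some(Rc::clone(&a))) });
    // Close the cycle: a -> b -> a.
    *a.next.borrow_mut() = Some(Rc::clone(&b));
    // `a` is kept alive by the local binding and by `b.next`.
    Rc::strong_count(&a)
    // When `a` and `b` go out of scope, each count only drops to 1,
    // so neither node is freed: a leak, but no undefined behavior.
}
```

Replacing one of the two links with `std::rc::Weak` is the usual way to break such a cycle.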

Slide 8

Slide 8 text

Rust memory layout

// main entry point has no return values
fn main() {
    // declaration & definition combined
    let num1: u32 = 7;
    let num2: f32 = 3.5;

    // explicit typing is often optional
    // here Rust infers `nums` to be Vec<i32>
    let nums = vec![1, 4, 6, 4, 1];
}

[Diagram: the stack holds `num1` (7), `num2` (3.5), and `nums` (pointer, capacity 8, length 5); the pointer refers to the heap buffer 1 4 6 4 1.]

Slide 9

Slide 9 text

Rust memory layout

// main entry point has no return values
fn main() {
    // declaration & definition combined
    let num1: Box<u32> = Box::new(7);
    let num2: f32 = 3.5;

    // explicit typing is often optional
    // here Rust infers `nums` to be Vec<i32>
    let nums = vec![1, 4, 6, 4, 1];
}

[Diagram: the stack holds `num1` (a pointer), `num2` (3.5), and `nums` (pointer, capacity 8, length 5); the heap holds the boxed 7 and the buffer 1 4 6 4 1.]

Slide 10

Slide 10 text

Memory management

garbage collectors
✔ simplifies application code
✘ possible runtime performance penalty
✘ behavior is difficult to predict

manual calls
✔ precise control over (de)allocations
✘ prone to human errors
✘ complicates application code

Slide 11

Slide 11 text

Memory management

garbage collectors
✔ simplifies application code
✘ possible runtime performance penalty
✘ behavior is difficult to predict

manual calls
✔ precise control over memory use
✘ prone to human errors
✘ complicates application code

in Rust
⇒ ownership model
⇒ enforced by the compiler (borrow checker)
⇒ no runtime overhead

Slide 12

Slide 12 text

Ownership rules

1. Each value in Rust has a variable called its owner
2. There can only be one owner at a time
3. When an owner goes out of scope, all its values will be dropped

fn main() {
    // `nums` owns vector
    // vector owns each integer
    let nums = vec![1, 4, 6, 4, 1];
}
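Rule 3 can be observed with the standard `Drop` trait. The sketch below is only an illustration: the `Tracked` type and the `drop_order` function are hypothetical names, and the shared log is just a way to record when each value is dropped.

```rust
use std::cell::RefCell;

/// Hypothetical type that records its name in a shared log when dropped.
struct Tracked<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl<'a> Drop for Tracked<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Returns the order in which events happened.
fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _outer = Tracked { name: "outer", log: &log };
        {
            let _inner = Tracked { name: "inner", log: &log };
        } // `_inner` goes out of scope here and is dropped
        log.borrow_mut().push("between");
    } // `_outer` goes out of scope here and is dropped
    log.into_inner()
}
```

Each value is dropped exactly when its owner's scope ends, with no explicit free call in the code.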

Slide 13

Slide 13 text

Ownership rules

1. Each value in Rust has a variable called its owner
2. There can only be one owner at a time
3. When an owner goes out of scope, all its values will be dropped

struct Song {
    title: String, // heap-allocated string
    rating: u32,   // unsigned 32-bit integer
}

fn main() {
    // `playlist` owns the vector
    // vector owns each `Song`
    // each `Song` owns a u32 and a String
    let playlist = vec![
        Song { title: "Macarena".to_string(), rating: 5 },
        Song { title: String::from("Smooth"), rating: 4 },
    ];
}

Slide 14

Slide 14 text

Ownership rules

struct Song {
    title: String, // heap-allocated string
    rating: u32,   // unsigned 32-bit integer
}

fn main() {
    // `playlist` owns the vector
    // vector owns each `Song`
    // each `Song` owns a u32 and a String
    let playlist = vec![
        Song { title: "Macarena".to_string(), rating: 5 },
        Song { title: String::from("Smooth"), rating: 4 },
    ];
}

[Diagram: `playlist` on the stack (pointer, capacity, length) points to a heap buffer holding two `Song` values; each `Song` holds its `rating` and a `title` (pointer, capacity, length) that points to a separate heap buffer of characters, e.g. M a c a r e n a.]

Slide 15

Slide 15 text

Ownership forms a tree

struct Song {
    title: String, // heap-allocated string
    rating: u32,   // unsigned 32-bit integer
}

fn main() {
    // `playlist` owns the vector
    // vector owns each `Song`
    // each `Song` owns a u32 and a String
    let playlist = vec![
        Song { title: "Macarena".to_string(), rating: 5 },
        Song { title: String::from("Smooth"), rating: 4 },
    ];
}

[Diagram: ownership tree. The binding `playlist` owns the vector; the vector owns two `Song` nodes; each `Song` owns a `title` (which owns its unicode chars) and a `rating`.]

Slide 16

Slide 16 text

Rust has move semantics

fn main() {
    // vector owned by `nums`
    let nums = vec![1, 2, 1];

    // vector moved to `pascal`
    // `pascal` now owns the vector
    let pascal = nums;

    // `nums` cannot be used from this point
    // onwards - this is enforced by the
    // compiler
}

[Diagram: bindings `nums` and `pascal`; the vector 1 2 1 is now owned by `pascal`.]

Slide 17

Slide 17 text

Rust has move semantics

// moving from indexed content is not allowed
fn main() {
    let vars = vec!["foo".to_string(),
                    "bar".to_string(),
                    "baz".to_string()];

    // this will cause compilation to fail
    let last = vars[2];
}

[Diagram: bindings `vars` and `last`; `vars` owns a vector of three `String` values, each owning its unicode chars.]

Slide 18

Slide 18 text

Rust has move semantics

// moving from indexed content is not allowed
fn main() {
    // `.remove` mutates the vector, so the binding must be `mut`
    let mut vars = vec!["foo".to_string(),
                        "bar".to_string(),
                        "baz".to_string()];

    // we could use `.remove` to make it work
    let last = vars.remove(2);

    // but this has O(n) complexity
    // because it shifts all remaining
    // elements to the left and we end up
    // changing the vector
}

[Diagram: bindings `vars` and `last`; `vars` owns a vector of `String` values, each owning its unicode chars, and `last` owns the removed `String`.]
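When element order does not matter, or only the last element is needed, the standard library offers O(1) alternatives to `.remove`: `Vec::pop` and `Vec::swap_remove`. The sketch below assumes those trade-offs are acceptable; the wrapper functions `take_last` and `take_at` are hypothetical names used only for illustration.

```rust
/// Moves the last element out of the vector in O(1).
/// Returns `None` if the vector is empty.
fn take_last(mut vars: Vec<String>) -> (Option<String>, Vec<String>) {
    let last = vars.pop();
    (last, vars)
}

/// Moves the element at `index` out in O(1) by swapping the last
/// element into its place. This does NOT preserve element order,
/// and it panics if `index` is out of bounds.
fn take_at(mut vars: Vec<String>, index: usize) -> (String, Vec<String>) {
    let taken = vars.swap_remove(index);
    (taken, vars)
}
```

Both calls still change the vector, as `.remove` does; they only avoid shifting the remaining elements.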

Slide 19

Slide 19 text

Rust has move semantics

// moving from indexed content is not allowed
fn main() {
    let vars = vec!["foo".to_string(),
                    "bar".to_string(),
                    "baz".to_string()];

    // or we could use `.clone`
    let last = vars[2].clone();

    // this creates a clone of the stack and
    // heap values, but is not always what
    // we want

    // another way is to use borrows
    // we'll get to that shortly
}

[Diagram: `vars` owns a vector of three `String` values, each owning its unicode chars; `last` owns a separate cloned `String` with its own unicode chars.]

Slide 20

Slide 20 text

Rust has move semantics - except for Copy types

`Copy` types implement the `Copy` trait and can be copied bit-by-bit

fn main() {
    let x = true;
    let y = x;
    assert!(x);

    let nums = vec![6, 28, 496];
    let last = nums[2];
    assert_eq!(last, 496);
    println!("we have {} items", nums.len());
    // stdout: we have 3 items
}

[Diagram: `x` and `y` each hold their own `true`; `nums` owns the vector 6 28 496 and `last` holds its own copy of 496.]
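User-defined types can opt into the same behavior by deriving `Copy`, provided all their fields are `Copy`. A minimal sketch follows; the `Point` type and both functions are hypothetical and only contrast a copy with a move.

```rust
/// Hypothetical type: all fields are `Copy`, so the struct can be too.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

/// Assignment copies the bits; both bindings stay usable.
fn copy_then_use() -> (Point, Point) {
    let a = Point { x: 1, y: 2 };
    let b = a; // copy, not a move
    (a, b)
}

/// `String` owns heap data and is not `Copy`, so assignment moves it.
fn move_then_use() -> (usize, String) {
    let s = String::from("not Copy");
    let len = s.len(); // read from `s` before it is moved
    let t = s;         // move: `s` cannot be used after this line
    (len, t)
}
```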

Slide 21

Slide 21 text

Hello, cargo!

// function calls obey move semantics
fn sum(input: Vec<i32>) -> i32 {
    let mut result = 0;
    // loops also obey move semantics
    for n in input {
        result += n;
    }
    result
}

fn main() {
    let nums = vec![1, 2, 3];
    let result = sum(nums);
    println!("sum is {}", result);
    // can't use `nums` anymore
    println!("from {} items", nums.len());
}

$ cargo run↲

Slide 22

Slide 22 text

Hello, cargo!

// function calls obey move semantics
fn sum(input: Vec<i32>) -> i32 {
    let mut result = 0;
    // loops also obey move semantics
    for n in input {
        result += n;
    }
    result
}

fn main() {
    let nums = vec![1, 2, 3];
    let result = sum(nums);
    println!("sum is {}", result);
    // can't use `nums` anymore
    println!("from {} items", nums.len());
}

$ cargo run↲
   Compiling app v0.1.0 (file:///path/to/app)
error[E0382]: use of moved value: `nums`
  --> src/main.rs:13:34
   |
13 |     let result = sum(nums);
   |                      ---- value moved here
...
16 |     println!("from {} items", nums.len());
   |                               ^^^^ value used here after move
   |
   = note: move occurs because `nums` has type `std::vec::Vec<i32>`, which does not implement the `Copy` trait

error: aborting due to previous error(s)

error: Could not compile `app`.

To learn more, run the command again with --verbose.
$ _

Slide 23

Slide 23 text

Hello, cargo!

// function calls obey move semantics
fn sum(input: &Vec<i32>) -> i32 {
    let mut result = 0;
    // loops also obey move semantics
    for n in input {
        result += *n;
    }
    result
}

fn main() {
    let nums = vec![1, 2, 3];
    let result = sum(&nums);
    println!("sum is {}", result);
    // `nums` is accessible now
    println!("from {} items", nums.len());
}

$ cargo run↲
   Compiling app v0.1.0 (file:///path/to/app)
    Finished dev [unoptimized + debuginfo] target(s) in 0.46 secs
     Running `target/debug/app`
sum is 6
from 3 items
$ _
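A variation on the borrowed `sum` above, offered as a sketch rather than as the talk's own code: taking a slice (`&[i32]`) instead of `&Vec<i32>` lets the same function accept vectors and arrays, and the standard iterator method `sum` replaces the explicit loop. The name `sum_slice` is hypothetical.

```rust
/// Borrows any contiguous run of `i32` values and adds them up.
/// The caller keeps ownership of the underlying collection.
fn sum_slice(input: &[i32]) -> i32 {
    input.iter().sum()
}
```

With this signature, `sum_slice(&nums)` works for a `Vec<i32>` through automatic deref coercion, and `nums` remains usable afterwards.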

Slide 24

Slide 24 text

Ownership rules & borrowing rules

Ownership rules
1. Each value in Rust has a variable called its owner
2. There can only be one owner at a time
3. When an owner goes out of scope, all its values will be dropped

Borrowing rules
1. Borrows cannot outlive their referent
2. At any given time, you can only have either:
   a. one mutable reference
   b. any number of shared references

Slide 25

Slide 25 text

Two kinds of borrows

shared borrow
⇒ read-only reference
⇒ created from any binding
⇒ referent can be borrowed multiple times
⇒ freezes paths to and from referent during the borrow
⇒ for type T, its shared borrow is type &T (ref T)

mutable borrow
⇒ reference that can update its referent
⇒ created from mutable bindings
⇒ exclusive access to referent
⇒ makes paths to and from referent inaccessible during the borrow
⇒ for type T, its mutable borrow is type &mut T (ref mut T)
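The two kinds of borrows show up most often as function parameters. The following is a minimal sketch reusing the `Song` struct from the earlier slides; the functions `describe`, `upvote`, and `demo` are hypothetical names introduced for illustration.

```rust
struct Song {
    title: String,
    rating: u32,
}

/// Shared borrow (`&Song`): may read the referent but not change it.
fn describe(song: &Song) -> String {
    format!("{} ({})", song.title, song.rating)
}

/// Mutable borrow (`&mut Song`): exclusive access, may update the referent.
fn upvote(song: &mut Song) {
    song.rating += 1;
}

fn demo() -> String {
    // The binding must be `mut` for a mutable borrow to be created from it.
    let mut song = Song { title: String::from("Smooth"), rating: 4 };
    let before = describe(&song); // shared borrow ends when the call returns
    upvote(&mut song);            // so a mutable borrow is allowed here
    let after = describe(&song);
    format!("{} -> {}", before, after)
}
```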

Slide 26

Slide 26 text

A taste of the borrow checker

// multiple shared borrows are ok
fn main() {
    let x = 10;
    let s1 = &x;
    let s2 = &x;
}

$ cargo build↲
   Compiling app v0.1.0 (file:///path/to/app)
    Finished dev [unoptimized + debuginfo] target(s) in 0.32 secs

// shared borrows freeze their referent
fn main() {
    let mut x = 10;
    let s1 = &x;
    let s2 = &x;
    x += 3;
}

$ cargo build↲
   Compiling app v0.1.0 (file:///path/to/app)
error[E0506]: cannot assign to `x` because it is borrowed
 --> src/main.rs:5:5
  |
3 |     let s1 = &x;
  |               - borrow of `x` occurs here
4 |     let s2 = &x;
5 |     x += 3;
  |     ^^^^^^^ assignment to borrowed `x` occurs here

error: aborting due to previous error(s)

error: Could not compile `rust-sandbox`.

To learn more, run the command again with --verbose.

Slide 27

Slide 27 text

A taste of the borrow checker

// mutable borrow gives exclusive access
fn main() {
    let mut x = 10;
    let m1 = &mut x;
    let m2 = &mut x;
}

$ cargo build↲
   Compiling app v0.1.0 (file:///path/to/app)
error[E0499]: cannot borrow `x` as mutable more than once at a time
 --> src/main.rs:4:19
  |
3 |     let m1 = &mut x;
  |                   - first mutable borrow occurs here
4 |     let m2 = &mut x;
  |                   ^ second mutable borrow occurs here
5 | }
  | - first borrow ends here

error: aborting due to previous error(s)

error: Could not compile `rust-sandbox`.

// mutable borrows can modify the referent
fn main() {
    let mut x = 10;
    let m1 = &mut x;
    *m1 = 7;
}

$ cargo build↲
   Compiling app v0.1.0 (file:///path/to/app)
    Finished dev [unoptimized + debuginfo] target(s) in 0.66 secs
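The exclusivity rule only forbids mutable borrows that overlap in time. One way to satisfy it, sketched here under the assumption that the two updates are independent, is to give each mutable borrow its own block so that the first one has ended before the second begins. The function name `sequential_mut_borrows` is hypothetical.

```rust
/// Two mutable borrows of the same variable, one after the other.
fn sequential_mut_borrows() -> i32 {
    let mut x = 10;
    {
        let m1 = &mut x;
        *m1 += 1; // x is now 11
    } // `m1` goes out of scope: the first mutable borrow ends here
    {
        let m2 = &mut x;
        *m2 *= 2; // x is now 22
    } // the second mutable borrow ends here
    x // no borrows are live, so `x` can be read again
}
```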

Slide 28

Slide 28 text

Shared borrow freezes ancestors & descendants

struct Song { title: String, rating: u32 }
struct Party { host: String, theme: Song }

fn main() {
    let mut party = Party {
        host: "me".to_string(),
        theme: Song {
            title: "Macarena".to_string(),
            rating: 5
        }
    };
    let karaoke = &party.theme.title;
}

[Diagram: ownership tree. The binding `party` owns a `Party` with `host` (a `String` owning its unicode chars) and `theme` (a `Song` with `title`, owning its unicode chars, and `rating`); the binding `karaoke` is a shared borrow (&) of the title.]

Slide 29

Slide 29 text

Shared borrow freezes ancestors & descendants

struct Song { title: String, rating: u32 }
struct Party { host: String, theme: Song }

fn main() {
    let mut party = Party {
        host: "me".to_string(),
        theme: Song {
            title: "Macarena".to_string(),
            rating: 5
        }
    };
    let karaoke = &party.theme.title;

    // error: cannot borrow `party` as mutable
    // because `party.theme.title` is also
    // borrowed as immutable
    let new_party = &mut party;
}

[Diagram: ownership tree. The binding `party` owns a `Party` with `host` (a `String` owning its unicode chars) and `theme` (a `Song` with `title`, owning its unicode chars, and `rating`); the binding `karaoke` is a shared borrow (&) of the title.]

Slide 30

Slide 30 text

Shared borrow freezes ancestors & descendants

struct Song { title: String, rating: u32 }
struct Party { host: String, theme: Song }

fn main() {
    let mut party = Party {
        host: "me".to_string(),
        theme: Song {
            title: "Macarena".to_string(),
            rating: 5
        }
    };
    let karaoke = &party.theme.title;

    // error: cannot move out of `party.theme` because
    // it is borrowed
    let fav_song = party.theme;
}

[Diagram: ownership tree. The binding `party` owns a `Party` with `host` (a `String` owning its unicode chars) and `theme` (a `Song` with `title`, owning its unicode chars, and `rating`); the binding `karaoke` is a shared borrow (&) of the title.]

Slide 31

Slide 31 text

Shared borrow freezes ancestors & descendants

struct Song { title: String, rating: u32 }
struct Party { host: String, theme: Song }

fn main() {
    let mut party = Party {
        host: "me".to_string(),
        theme: Song {
            title: "Macarena".to_string(),
            rating: 5
        }
    };
    let karaoke = &party.theme.title;

    let mut next_host = party.host; // move -- ok
}

[Diagram: ownership tree. The binding `party` owns a `Party` with `host` (a `String` owning its unicode chars) and `theme` (a `Song` with `title`, owning its unicode chars, and `rating`); the binding `karaoke` is a shared borrow (&) of the title.]

Slide 32

Slide 32 text

Mutable borrow shuts down access to ancestors & descendants

struct Song { title: String, rating: u32 }
struct Party { host: String, theme: Song }

fn main() {
    let mut party = Party {
        host: "me".to_string(),
        theme: Song {
            title: "Macarena".to_string(),
            rating: 5
        }
    };
    let karaoke = &mut party.theme.title;
}

[Diagram: ownership tree. The binding `party` owns a `Party` with `host` (a `String` owning its unicode chars) and `theme` (a `Song` with `title`, owning its unicode chars, and `rating`); the binding `karaoke` is a mutable borrow (&mut) of the title.]

Slide 33

Slide 33 text

Mutable borrow shuts down access to ancestors & descendants

struct Song { title: String, rating: u32 }
struct Party { host: String, theme: Song }

fn main() {
    let mut party = Party {
        host: "me".to_string(),
        theme: Song {
            title: "Macarena".to_string(),
            rating: 5
        }
    };
    let karaoke = &mut party.theme.title;

    // this would be ok if `karaoke` was a shared borrow
    // error: cannot borrow `party` as immutable
    // because `party.theme.title` is also
    // borrowed as mutable
    let current_party = &party;
}

[Diagram: ownership tree. The binding `party` owns a `Party` with `host` (a `String` owning its unicode chars) and `theme` (a `Song` with `title`, owning its unicode chars, and `rating`); the binding `karaoke` is a mutable borrow (&mut) of the title.]

Slide 34

Slide 34 text

Mutable borrow shuts down access to ancestors & descendants

struct Song { title: String, rating: u32 }
struct Party { host: String, theme: Song }

fn main() {
    let mut party = Party {
        host: "me".to_string(),
        theme: Song {
            title: "Macarena".to_string(),
            rating: 5
        }
    };
    let karaoke = &mut party.theme.title;

    // this is still ok
    let mut next_host = party.host;
    next_host.push('?');
}

[Diagram: ownership tree. The binding `party` owns a `Party` with `host` (a `String` owning its unicode chars) and `theme` (a `Song` with `title`, owning its unicode chars, and `rating`); the binding `karaoke` is a mutable borrow (&mut) of the title.]

Slide 35

Slide 35 text

Lifetimes for borrow scope validation

{
    let x = 5;                // -----+-- 'a
                              //      |
    let r = &x;               // --+--|-- 'b
                              //   |  |
    println!("ref is {}", r); //   |  |
                              // --+  |
}                             // -----+

$ cargo run↲
   Compiling app v0.1.0 (file:///path/to/app)
    Finished dev [unoptimized + debuginfo] target(s) in 0.43 secs
     Running `target/debug/app`
ref is 5

Slide 36

Slide 36 text

Lifetimes for borrow scope validation

{
    let r;                    // ----+-- 'b
                              //     |
    {                         //     |
        let x = 5;            // -+--|-- 'a
        r = &x;               //  |  |
    }                         // -+  |
                              //     |
    println!("ref is {}", r); //     |
                              //     |
                              // ----+
}

$ cargo run↲
   Compiling app v0.1.0 (file:///path/to/app)
error[E0597]: `x` does not live long enough
  --> src/main.rs:8:9
   |
7  |             r = &x;               //  |  |
   |                  - borrow occurs here
8  |         }                         // -+  |
   |         ^ `x` dropped here while still borrowed
...
13 | }
   | - borrowed value needs to live until here

error: aborting due to previous error(s)

error: Could not compile `app`.

To learn more, run the command again with --verbose.

{
    let x = 5;                // -----+-- 'a
                              //      |
    let r = &x;               // --+--|-- 'b
                              //   |  |
    println!("ref is {}", r); //   |  |
                              // --+  |
}                             // -----+

$ cargo run↲
   Compiling app v0.1.0 (file:///path/to/app)
    Finished dev [unoptimized + debuginfo] target(s) in 0.43 secs
     Running `target/debug/app`
ref is 5
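Lifetimes also appear in function signatures when a returned reference depends on the arguments. The sketch below follows the well-known `longest` example from The Rust Programming Language book; the `longest_title` function is a hypothetical caller added to show how the result can be kept past the inner scope.

```rust
/// The returned reference is valid only as long as BOTH inputs are:
/// `'a` is the overlap of the two argument lifetimes.
fn longest<'a>(left: &'a str, right: &'a str) -> &'a str {
    if left.len() >= right.len() { left } else { right }
}

fn longest_title() -> String {
    let first = String::from("Macarena");
    let result;
    {
        let second = String::from("Smooth");
        // Storing the `&str` itself in `result` would be rejected, because
        // `second` is dropped at the end of this block. Copying it into an
        // owned `String` lets the value outlive the block.
        result = longest(&first, &second).to_string();
    }
    result
}
```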

Slide 37

Slide 37 text

unsafe Rust

unsafe Rust allows you to:
1. dereference a raw pointer
2. call unsafe functions
3. access and modify mutable static variables
4. implement an unsafe trait

Why? Because the borrow checker favors correctness

// splitting a mutable slice into two mutable slices
// impossible in safe Rust - because of two mutable borrows
use std::slice;

fn split_at_mut(
    slice: &mut [i32],
    mid: usize
) -> (&mut [i32], &mut [i32]) {
    let len = slice.len();
    let ptr = slice.as_mut_ptr();
    assert!(mid <= len);

    let offset = mid as isize;
    let end = len - mid;
    unsafe {
        (slice::from_raw_parts_mut(ptr, mid),
         slice::from_raw_parts_mut(ptr.offset(offset), end))
    }
}
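Callers rarely need to write this `unsafe` block themselves: the standard library exposes the same operation as the safe method `split_at_mut` on slices. A minimal usage sketch follows; the function `zero_left_double_right` is a hypothetical name chosen for this example.

```rust
/// Mutates both halves of a slice at once through two non-overlapping
/// mutable borrows obtained from the safe standard-library method.
/// Panics if `mid` is greater than the slice length.
fn zero_left_double_right(values: &mut [i32], mid: usize) {
    let (left, right) = values.split_at_mut(mid);
    for v in left.iter_mut() {
        *v = 0;
    }
    for v in right.iter_mut() {
        *v *= 2;
    }
}
```

The unsafe code stays behind a safe interface, so the borrow checker can still verify every call site.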

Slide 38

Slide 38 text

Thanks!

Credits
⇒ Programming Rust book by Jason Orendorff and Jim Blandy
⇒ The Rust Programming Language book by Steve Klabnik and Carol Nichols
⇒ Memory leaks are memory safe blog post by Huon Wilson
⇒ Front photo: Marta Pawlik
⇒ This photo: Jeremy Bishop

Personal experiments
⇒ https://github.com/bow/gtetools: CLI tool for gene annotation formats
⇒ https://git.lumc.nl/hem/fidus: FLT3 ITD detection tool

Other projects of note:
⇒ https://github.com/rust-bio/rust-bio: bioinformatics library
⇒ https://github.com/rust-bio/rust-htslib: wrapper for the htslib C library