Intro to Rust (with speaker notes)

Slide 1

Slide 1 text

1 Intro to Rust Danilo Bargen TEC Lunch & Learn, 2015-09-29

Slide 2

Slide 2 text

2 Agenda 1. What is Rust? 2. What’s Type Safety? 3. Reading Rust 4. Memory Safety in Rust 5. Multithreaded Programming 6. Further Reading 7. Questions

Slide 3

Slide 3 text

3 1. What is Rust?

Slide 4

Slide 4 text

4 «Rust is a systems programming language that runs blazingly fast, prevents nearly all segfaults, and guarantees thread safety.» www.rust-lang.org

Slide 5

Slide 5 text

5 What’s wrong with systems languages? - It’s difficult to write secure code. - It’s very difficult to write multithreaded code. These are the problems Rust was made to address. 1 - Systems languages have come a long way in the last 50 years - First virus based on a buffer overflow appeared in 1988 - According to Open Source Vulnerability Database, still 10-15% of reported vulns during last 8 years are buffer overflows 2 - Multithreading is becoming more needed with multicore CPUs - C++ like threading is incredibly hard, even experienced programmers write hard to reproduce bugs

Slide 6

Slide 6 text

6 Quick Facts about Rust (As of September 2015) - Started by Mozilla employee Graydon Hoare - First announced by Mozilla in 2010 - Community driven development - First stable release: 1.0 in May 2015 - Latest stable release: 1.3 - 46'484 commits on Github - Largest project written in Rust: Servo

Slide 7

Slide 7 text

7 Features - Zero-cost abstractions - Move semantics - Guaranteed memory safety - Threads without data races - Trait based generics - Pattern matching - Type inference - Minimal runtime, no GC - Efficient C bindings

Slide 8

Slide 8 text

8 2. What is Type Safety? Sounds good, what are we being kept safe from?

Slide 9

Slide 9 text

9 A C Program int main(int argc, char **argv) { unsigned long a[1]; a[3] = 0x7ffff7b36cebUL; return 0; } According to C99, undefined behavior. Output: undef: Error: .netrc file is readable by others. undef: Remove password or make file unreadable by others. 1 - We’re overwriting the return address on the stack. - Jumping right into libc 2 - The user is responsible for safety - We’re not good at that

Slide 10

Slide 10 text

10 Definitions - If a program has been written so that no possible execution can exhibit undefined behavior, we say that program is well defined. - If a language’s type system ensures that every program is well defined, we say that language is type safe.

Slide 11

Slide 11 text

11 Type Safe Languages - C and C++ are not type safe. - Python is type safe: >>> a = [0] >>> a[3] = 0x7ffff7b36ceb Traceback (most recent call last): File "<stdin>", line 1, in <module> IndexError: list assignment index out of range >>> - Java, JavaScript, Ruby, and Haskell are also type safe. 1 - Our sample program had no type errors, yet exhibits undefined behavior. 2 - An exception is not undefined behavior. 3 - Every program that type safe languages accept is well defined.

Slide 12

Slide 12 text

12 It’s ironic. C and C++ are not type safe. Yet they are being used to implement the foundations of a system. Rust tries to resolve that tension. - Rust also allows unsafe code. But the great majority of programs do not need unsafe code.

Slide 13

Slide 13 text

13 3. Reading Rust

Slide 14

Slide 14 text

14 Example 1 fn gcd(mut n: u64, mut m: u64) -> u64 { assert!(n != 0 && m != 0); while m != 0 { if m < n { let t = m; m = n; n = t; } m = m % n; } n } - Program that calculates the greatest common divisor

Slide 15

Slide 15 text

15 Example 1 fn gcd(mut n: u64, mut m: u64) -> u64 { assert!(n != 0 && m != 0); while m != 0 { if m < n { let t = m; m = n; n = t; } m = m % n; } n } The fn keyword introduces a function definition. Arrow denotes return value.

Slide 16

Slide 16 text

16 Example 1 fn gcd(mut n: u64, mut m: u64) -> u64 { assert!(n != 0 && m != 0); while m != 0 { if m < n { let t = m; m = n; n = t; } m = m % n; } n } Variables are immutable by default

Slide 17

Slide 17 text

17 Example 1 fn gcd(mut n: u64, mut m: u64) -> u64 { assert!(n != 0 && m != 0); while m != 0 { if m < n { let t = m; m = n; n = t; } m = m % n; } n } - Rust variable declarations: Name followed by type - Numeric types: [ui](8|16|32|64), f(32|64), and usize / isize

Slide 18

Slide 18 text

18 Example 1 fn gcd(mut n: u64, mut m: u64) -> u64 { assert!(n != 0 && m != 0); while m != 0 { if m < n { let t = m; m = n; n = t; } m = m % n; } n } - Macro invocations with exclamation mark

Slide 19

Slide 19 text

19 Example 1 fn gcd(mut n: u64, mut m: u64) -> u64 { assert!(n != 0 && m != 0); while m != 0 { if m < n { let t = m; m = n; n = t; } m = m % n; } n } - The type of an integer literal is inferred from context or suffix. Otherwise it defaults to i32.

Slide 20

Slide 20 text

20 Example 1 fn gcd(mut n: u64, mut m: u64) -> u64 { assert!(n != 0 && m != 0); while m != 0 { if m < n { let t = m; m = n; n = t; } m = m % n; } n } - let introduces a local variable - Type is inferred

Slide 21

Slide 21 text

21 Example 1 fn gcd(mut n: u64, mut m: u64) -> u64 { assert!(n != 0 && m != 0); while m != 0 { if m < n { let t = m; m = n; n = t; } m = m % n; } n } - Loops and conditions don’t need parentheses, only braces around body

Slide 22

Slide 22 text

22 Example 1 fn gcd(mut n: u64, mut m: u64) -> u64 { assert!(n != 0 && m != 0); while m != 0 { if m < n { let t = m; m = n; n = t; } m = m % n; } n } - Blocks are expressions - return statement optional
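The annotated function from Example 1 can be tried as a complete program. This is a minimal sketch: the `gcd` body is the one from the slides, while `main` and its sample inputs are assumptions added for illustration.

```rust
/// Greatest common divisor, as shown on the slides (Euclid's algorithm).
fn gcd(mut n: u64, mut m: u64) -> u64 {
    // assert! is a macro: it panics if the condition is false.
    assert!(n != 0 && m != 0);
    while m != 0 {
        if m < n {
            // `t` is an immutable local; its type (u64) is inferred.
            let t = m;
            m = n;
            n = t;
        }
        m = m % n;
    }
    n // the last expression, without a semicolon, is the return value
}

fn main() {
    println!("gcd(12, 18) = {}", gcd(12, 18));
    println!("gcd(14, 15) = {}", gcd(14, 15));
}
```

Calling `gcd(0, 5)` would trip the assertion and panic instead of looping or returning a meaningless value.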

Slide 23

Slide 23 text

23 Example 2: Generics fn min<T: Ord>(a: T, b: T) -> T { if a <= b { a } else { b } } - Program that returns the smaller of two values

Slide 24

Slide 24 text

24 Example 2: Generics fn min<T: Ord>(a: T, b: T) -> T { if a <= b { a } else { b } } - <T: Ord> marks it as a generic function - Defined for any type T where T is Ord - If a type is Ord, it supports a comparison - Ord is a Trait, will be handled later

Slide 25

Slide 25 text

25 Example 2: Generics fn min<T: Ord>(a: T, b: T) -> T { if a <= b { a } else { b } } ... min(10i8, 20) == 10; // T is i8 min(10, 20u32) == 10; // T is u32 min("abc", "xyz") == "abc"; // Strings are Ord min(10i32, "xyz"); // error: mismatched types.
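The generic function and its calls can be put together as follows. A small sketch, assuming only the `main` wrapper; the function itself is the one from the slides, with the `<T: Ord>` bound written out.

```rust
/// Returns the smaller of two values of any totally ordered type.
fn min<T: Ord>(a: T, b: T) -> T {
    if a <= b { a } else { b }
}

fn main() {
    println!("{}", min(10i8, 20)); // T is i8
    println!("{}", min(10, 20u32)); // T is u32
    println!("{}", min("abc", "xyz")); // string slices are Ord
    // min(10i32, "xyz"); // would not compile: mismatched types
}
```

Floating point types only implement `PartialOrd`, not `Ord`, so `min(1.0, 2.0)` would be rejected by this signature.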

Slide 26

Slide 26 text

26 Example 3: Generic Types struct Range<Idx> { start: Idx, end: Idx, } ... Range { start: 200, end: 800 } // OK Range { start: 1.3, end: 4.7 } // Also OK - A struct holds data - Can be generic

Slide 27

Slide 27 text

27 Example 4: Enumerations enum Option<T> { Some(T), None } - Algebraic Datatypes - For any type T, an Option<T> may be either: - None, which carries no value - Some(v) which carries the value v of type T - Resemble unions in C / C++, but remember their value type

Slide 28

Slide 28 text

28 Example 5: Application of Option fn safe_div(n: i32, d: i32) -> Option<i32> { if d == 0 { return None; } Some(n / d) } - Function that returns a division result, or None if divisor is 0 - (Could also be written as a single expression)

Slide 29

Slide 29 text

29 Example 6: Matching an Option match safe_div(num, denom) { None => println!("No quotient."), Some(v) => println!("Quotient is {}.", v) } - Similar to a switch statement, but more powerful - Rust offers combinator methods for simplification, more on this later
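Examples 5 and 6 combine into a short runnable program. In this sketch the `describe` helper is hypothetical (it is not on the slides) and returns the message as a `String` so that the result of the `match` can be inspected.

```rust
/// Returns the quotient, or None if the divisor is 0.
fn safe_div(n: i32, d: i32) -> Option<i32> {
    if d == 0 {
        return None;
    }
    Some(n / d)
}

/// Hypothetical helper: turns the Option into a message.
/// `match` must cover every variant, so the None case cannot be forgotten.
fn describe(num: i32, denom: i32) -> String {
    match safe_div(num, denom) {
        None => String::from("No quotient."),
        Some(v) => format!("Quotient is {}.", v),
    }
}

fn main() {
    println!("{}", describe(7, 2));
    println!("{}", describe(7, 0));
}
```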

Slide 30

Slide 30 text

30 Example 7: Traits trait HasArea { fn area(&self) -> f64; } - Any type implementing HasArea must provide a method called “area” that takes no parameters other than &self and returns an f64

Slide 31

Slide 31 text

31 Example 8: Trait Implementation struct Circle { x: f64, y: f64, radius: f64, } impl HasArea for Circle { fn area(&self) -> f64 { std::f64::consts::PI * (self.radius * self.radius) } } - Traits are implemented for structs - Other way of looking at it: Methods (behavior) are attached to data
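Examples 7 and 8 can be exercised together. A sketch under stated assumptions: `Square` and `describe_area` are hypothetical additions that show one function working with any type that implements `HasArea`; only `HasArea` and `Circle` come from the slides.

```rust
use std::f64::consts::PI;

trait HasArea {
    fn area(&self) -> f64;
}

#[allow(dead_code)] // x and y are kept from the slides but unused here
struct Circle {
    x: f64,
    y: f64,
    radius: f64,
}

impl HasArea for Circle {
    fn area(&self) -> f64 {
        PI * (self.radius * self.radius)
    }
}

// Hypothetical second type, not on the slides.
struct Square {
    side: f64,
}

impl HasArea for Square {
    fn area(&self) -> f64 {
        self.side * self.side
    }
}

// Generic over any type that implements HasArea.
fn describe_area<T: HasArea>(shape: &T) -> String {
    format!("area = {:.2}", shape.area())
}

fn main() {
    let c = Circle { x: 0.0, y: 0.0, radius: 1.0 };
    let s = Square { side: 2.0 };
    println!("{}", describe_area(&c));
    println!("{}", describe_area(&s));
}
```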

Slide 32

Slide 32 text

32 Example 9: Default Methods trait Validatable { fn is_valid(&self) -> bool; fn is_invalid(&self) -> bool { !self.is_valid() } } - A trait can provide default implementations - Can be overridden
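A default method in use might look like this. The `Username` type and its validity rule are hypothetical, chosen only to show that an implementor has to supply `is_valid` and gets `is_invalid` for free.

```rust
trait Validatable {
    fn is_valid(&self) -> bool;

    // Default implementation; implementors may override it.
    fn is_invalid(&self) -> bool {
        !self.is_valid()
    }
}

// Hypothetical type for illustration.
struct Username(String);

impl Validatable for Username {
    // Assumed rule: a username is valid if it is not empty.
    fn is_valid(&self) -> bool {
        !self.0.is_empty()
    }
    // is_invalid is inherited from the trait's default.
}

fn main() {
    let good = Username("danilo".to_string());
    let bad = Username(String::new());
    println!("good valid: {}", good.is_valid());
    println!("bad invalid: {}", bad.is_invalid());
}
```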

Slide 33

Slide 33 text

33 Example 10: Trait Composition trait Foo { fn foo(&self); } trait FooBar : Foo { fn foobar(&self); } - Any type implementing FooBar must also implement Foo - Example: A trait Number that requires implementing Add, Sub, Mul, Div
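A possible illustration of trait composition follows. It deviates from the slide in one labeled way: the methods return a `String` so that their effect is visible. The `Baz` type and `call_both` function are hypothetical.

```rust
trait Foo {
    fn foo(&self) -> String;
}

// FooBar requires Foo: every FooBar implementor must also implement Foo.
trait FooBar: Foo {
    fn foobar(&self) -> String;
}

// Hypothetical type for illustration.
struct Baz;

impl Foo for Baz {
    fn foo(&self) -> String {
        "foo".to_string()
    }
}

impl FooBar for Baz {
    fn foobar(&self) -> String {
        // Methods of the required trait are available here.
        format!("{}bar", self.foo())
    }
}

// A FooBar bound is enough to call methods of both traits.
fn call_both<T: FooBar>(x: &T) -> (String, String) {
    (x.foo(), x.foobar())
}

fn main() {
    let (a, b) = call_both(&Baz);
    println!("{} {}", a, b);
}
```

Removing the `impl Foo for Baz` block would make the `impl FooBar for Baz` block fail to compile.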

Slide 34

Slide 34 text

34 4. Memory Safety in Rust

Slide 35

Slide 35 text

35 Three Key Promises - No null pointer dereferences - No dangling pointers - No buffer overruns 1 - Your program will not crash because you tried to dereference a null pointer 2 - Every value will live as long as it must 3 - Your program will never access elements outside of an array All ensured at compile time. What do they mean?

Slide 36

Slide 36 text

36 P1: No null pointer dereferences - Null pointers are useful - They can indicate the absence of optional information - They can indicate failures - But they can introduce severe bugs - Rust separates the concept of a pointer from the concept of an optional or error value - Optional values are handled by Option - Error values are handled by Result - Many helpful tools to do error handling

Slide 37

Slide 37 text

37 You already saw Option fn safe_div(n: i32, d: i32) -> Option<i32> { if d == 0 { return None; } Some(n / d) } - But what if you want to return an error, not just None?

Slide 38

Slide 38 text

38 There’s also Result enum Result<T, E> { Ok(T), Err(E) } - E can be any type, even String

Slide 39

Slide 39 text

39 How to use Results enum Error { DivisionByZero, } fn safe_div(n: i32, d: i32) -> Result<i32, Error> { if d == 0 { return Err(Error::DivisionByZero); } Ok(n / d) } - Good practice to define your own error types instead of using strings
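The `Result` version of `safe_div` can be run like this. A minimal sketch: the `derive` line is an assumption added so that errors can be printed and compared, and `main` is illustrative.

```rust
// Debug and PartialEq are derived here (an addition to the slide)
// so that the error can be printed and compared.
#[derive(Debug, PartialEq)]
enum Error {
    DivisionByZero,
}

fn safe_div(n: i32, d: i32) -> Result<i32, Error> {
    if d == 0 {
        return Err(Error::DivisionByZero);
    }
    Ok(n / d)
}

fn main() {
    for d in [2, 0] {
        match safe_div(10, d) {
            Ok(v) => println!("Quotient is {}.", v),
            Err(e) => println!("Failed: {:?}", e),
        }
    }
}
```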

Slide 40

Slide 40 text

40 Tedious Results fn do_calc() -> Result { let a = match do_subcalc1() { Ok(val) => val, Err(msg) => return Err(msg), }; let b = match do_subcalc2() { Ok(val) => val, Err(msg) => return Err(msg), }; Ok(a + b) } - Calling a lot of functions returning a result can become tedious

Slide 41

Slide 41 text

41 The try! Macro fn do_calc() -> Result { let a = try!(do_subcalc1()); let b = try!(do_subcalc2()); Ok(a + b) } - The try! macro does the same thing, unwrap or early return - Error signature must match! - What if the errors don’t match?

Slide 42

Slide 42 text

42 Mapping Errors fn do_subcalc() -> Result { … } fn do_calc() -> Result { let res = do_subcalc(); let mapped = res.map_err(|msg| { println!("Error: {}", msg); Error::CalcFailed }); let val = try!(mapped); Ok(val + 1) } - Convert them with helper methods - map_err passes through a successful result while handling an error - Explanation on next slide

Slide 43

Slide 43 text

43 Mapping Errors let mapped = res.map_err(|msg| Error::CalcFailed); is the same as let mapped = match res { Ok(val) => Ok(val), Err(msg) => Err(Error::CalcFailed), }; - Many other helper methods like this
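The error mapping pattern can be sketched as a complete program. Two things here are assumptions rather than slide content: the concrete types `Result<i32, String>` and `Result<i32, Error>`, and the body and parameter of `do_subcalc`. The sketch also uses the `?` operator, which later replaced the `try!` macro in Rust and performs the same unwrap-or-early-return.

```rust
#[derive(Debug, PartialEq)]
enum Error {
    CalcFailed,
}

// Hypothetical sub-calculation that reports failures as strings.
fn do_subcalc(input: i32) -> Result<i32, String> {
    if input < 0 {
        Err(format!("negative input: {}", input))
    } else {
        Ok(input * 2)
    }
}

fn do_calc(input: i32) -> Result<i32, Error> {
    // map_err passes an Ok through unchanged and converts the String
    // error into our own type; `?` then unwraps Ok or returns the Err.
    let val = do_subcalc(input).map_err(|msg| {
        eprintln!("Error: {}", msg);
        Error::CalcFailed
    })?;
    Ok(val + 1)
}

fn main() {
    println!("{:?}", do_calc(20));
    println!("{:?}", do_calc(-1));
}
```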

Slide 44

Slide 44 text

44 Other Combinator Methods (1) Get the value from an option. Option.unwrap(self) -> T Option.unwrap_or(self, def: T) -> T Option.unwrap_or_else(self, f: F) -> T where F: FnOnce() -> T

Slide 45

Slide 45 text

45 Other Combinator Methods (2) Map an Option<T> to Option<U> or U. Option.map(self, f: F) -> Option<U> where F: FnOnce(T) -> U Option.map_or(self, default: U, f: F) -> U where F: FnOnce(T) -> U Option.map_or_else(self, default: D, f: F) -> U where F: FnOnce(T) -> U, D: FnOnce() -> U

Slide 46

Slide 46 text

46 Other Combinator Methods (3) Convert an option to a result, mapping Some(v) to Ok(v) and None to Err(err). Option.ok_or(self, err: E) -> Result<T, E> Option.ok_or_else(self, err: F) -> Result<T, E> where F: FnOnce() -> E - This is only a small selection. - There are similar methods on Result and others. - They all make it easier to work without having null pointers.
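A few of these combinators applied to the earlier `safe_div` example, as a rough sketch. The inputs and the error string are arbitrary illustration values; `safe_div` is written here as a single expression, as slide 28 suggests is possible.

```rust
fn safe_div(n: i32, d: i32) -> Option<i32> {
    if d == 0 { None } else { Some(n / d) }
}

fn main() {
    // unwrap_or: fall back to a default value.
    println!("{}", safe_div(8, 0).unwrap_or(0));
    // unwrap_or_else: compute the fallback lazily with a closure.
    println!("{}", safe_div(8, 0).unwrap_or_else(|| -1));
    // map: transform the value inside Some, keep None as None.
    println!("{:?}", safe_div(8, 2).map(|v| v * 10));
    // map_or: transform or use a default, yielding a plain value.
    println!("{}", safe_div(8, 0).map_or(0, |v| v * 10));
    // ok_or: turn the Option into a Result with the given error.
    println!("{:?}", safe_div(8, 0).ok_or("division by zero"));
}
```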

Slide 47

Slide 47 text

47 P2: No dangling pointers - Rust programs never try to access a heap-allocated value after it has been freed. - No garbage collection or reference counting involved! - Everything is enforced at compile time. 1 - Not an unusual promise, all type safe languages do this 2 3 - How is this done?

Slide 48

Slide 48 text

48 Three Rules - Rule 1: Every value has a single owner at any given time. - Rule 2: You can borrow a reference to a value, so long as the reference doesn’t outlive the value. - Rule 3: You can only modify a value when you have exclusive access to it. - 1: You can move a value from one owner to another, but when a value’s owner goes away the value is freed along with it. - 2: Borrowed references are temporary pointers; they allow you to operate on values you don’t own.

Slide 49

Slide 49 text

49 Ownership - Variables own their values - A struct owns its fields - An enum owns its values - Every heap-allocated value has a single pointer that owns it - All values are dropped when their owner is dropped

Slide 50

Slide 50 text

50 Ownership: Scoping { let s = "Chuchichästli".to_string(); } // s goes out of scope, text is freed - Values are freed when the variable that owns them goes out of scope

Slide 51

Slide 51 text

51 Ownership: Move Semantics { let s = "Chuchichästli".to_string(); // t1 takes ownership from s let t1 = s; // compile-time error: use of moved value s let t2 = s; } - Assigning to variables moves values (most of the time)

Slide 52

Slide 52 text

52 Ownership: Copy Trait { let pi = 3.1415926f32; let foo = pi; let bar = pi; // This is fine! } - Types that implement the “Copy” trait (usually primitive types) are copied implicitly - Examples: char, bool, numeric types

Slide 53

Slide 53 text

53 Ownership: Clone Trait { let s = "Chuchichästli".to_string(); let t1 = s.clone(); let t2 = s.clone(); } - Other types can implement Clone trait for explicit cloning - Three independent String objects - Each is owned by the variable binding

Slide 54

Slide 54 text

54 Ownership: Deriving Copy / Clone #[derive(Copy, Clone)] struct Color { r: u8, g: u8, b: u8 } - Implementing Copy and Clone is trivial for most types - So it can be auto-generated by the compiler - All fields must be Copy / Clone too
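The difference between implicit copies and explicit clones can be seen in a short program. This is a sketch with assumptions: the `brighten` function is hypothetical, and `Debug` and `PartialEq` are derived in addition to the slide's `Copy` and `Clone` so that values can be printed and compared.

```rust
#[derive(Copy, Clone, Debug, PartialEq)]
struct Color {
    r: u8,
    g: u8,
    b: u8,
}

// Takes the color by value. Because Color is Copy, the caller
// keeps its own copy instead of losing ownership.
fn brighten(c: Color) -> Color {
    Color { r: c.r.saturating_add(10), ..c }
}

fn main() {
    let red = Color { r: 250, g: 0, b: 0 };
    let brighter = brighten(red); // implicit copy
    println!("{:?} -> {:?}", red, brighter); // red is still usable

    let s = "Chuchichästli".to_string();
    let t = s.clone(); // String is not Copy, so cloning is explicit
    println!("{} {}", s, t);
}
```

Without the `Copy` derive, the `println!` that uses `red` after the call would be rejected as a use of a moved value.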

Slide 55

Slide 55 text

55 But what about this? let s = "Hello, world".to_string(); print_with_umpff(s); println!("{}", s); error: use of moved value: `s` println!("{}", s); ^ note: `s` moved here because it has type `collections::string:: String`, which is non-copyable print_with_umpff(s); ^ - Now you know move semantics - Can cause problems though. - Does someone see the problem? - Ownership is moved into function and freed when function returns

Slide 56

Slide 56 text

56 Borrowing let s = "Hello, world".to_string(); print_with_umpff(&s); println!("Original value was {}", s); - The function can borrow the value - Many functions can borrow at the same time, because they cannot modify

Slide 57

Slide 57 text

57 Mutable Borrowing let mut s = "Hello, world".to_string(); add_umpff(&mut s); println!("New value is {}", s); - A mutable borrow grants exclusive access - Only one mutable borrow possible at a time - While you borrow a mutable reference to a value, that reference is the only way to access that value at all.
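Both kinds of borrow can be combined in one small program. The slides only show the calls, so the bodies of `print_with_umpff` and `add_umpff` below are hypothetical guesses at what such functions might do.

```rust
// Shared borrow: the function may read the text but not modify it.
fn print_with_umpff(s: &str) {
    println!("{}!!!", s);
}

// Mutable borrow: exclusive access, so the function may modify the text.
fn add_umpff(s: &mut String) {
    s.push_str("!!!");
}

fn main() {
    let mut s = "Hello, world".to_string();

    print_with_umpff(&s); // s is only lent out, not moved
    println!("Original value was {}", s);

    add_umpff(&mut s); // requires `s` to be declared `mut`
    println!("New value is {}", s);
}
```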

Slide 58

Slide 58 text

58 Borrowing prevents moving let x = String::new(); let borrow = &x; let y = x; // error: cannot move out of `x` because // it is borrowed - While borrowed, a move must be prevented - Otherwise you might end up with a dangling pointer

Slide 59

Slide 59 text

59 Lifetimes let borrow; let x = String::new(); borrow = &x; // error: `x` does not live // long enough - What is the problem here? - Lifetime of borrow is longer than lifetime of x - This can also be visualized differently:

Slide 60

Slide 60 text

60 Lifetimes { let borrow; { let x = String::new(); borrow = &x; // error: `x` does not live // long enough } } - Now it should be obvious. - Using lifetime checking, the compiler guarantees that there are no dangling pointers.

Slide 61

Slide 61 text

61 Lifetimes - Sometimes the compiler cannot infer the right lifetimes on its own - It needs more information - Parameters and return values can be annotated with explicit lifetimes - Won’t be covered here :)
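For readers curious what an explicit lifetime annotation looks like, here is one small illustrative sketch. The `longest` function is hypothetical and not part of the talk; the annotation `'a` tells the compiler that the returned reference is valid only as long as both inputs are.

```rust
/// Returns whichever of the two string slices has more bytes.
/// The compiler cannot know which argument is returned, so the
/// shared lifetime 'a ties the result to both of them.
fn longest<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() { a } else { b }
}

fn main() {
    let x = String::from("Rust");
    let y = String::from("C");
    println!("{}", longest(&x, &y));
}
```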

Slide 62

Slide 62 text

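62 P3: No buffer overruns - There’s no pointer arithmetic in safe Rust - Arrays in Rust are not just pointers, they know their length - Bounds checks: at compile time where the index is statically known, otherwise at run time (an out-of-range access panics instead of corrupting memory); iterators usually need no check at all (zero cost abstractions)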
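Revisiting the C program from slide 9, the same out-of-range index in Rust might be handled as below. A sketch only: `read_at` is a hypothetical helper built on the standard slice method `get`, which returns an `Option` instead of touching memory outside the slice.

```rust
/// Hypothetical helper: checked read that returns None when out of range.
fn read_at(values: &[u64], index: usize) -> Option<u64> {
    values.get(index).copied()
}

fn main() {
    let a = [0u64; 1];
    println!("{:?}", read_at(&a, 0));
    println!("{:?}", read_at(&a, 3));
    // Plain indexing such as `a[i]` with i == 3 at run time would panic
    // with an index-out-of-bounds message rather than overwrite the stack.
}
```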

Slide 63

Slide 63 text

63 5. Multithreaded Programming

Slide 64

Slide 64 text

64 We’ll make this short - The Rust compiler does not know about concurrency - Everything works based on the three rules - I’ll only show a few examples

Slide 65

Slide 65 text

65 Threads let t1 = std::thread::spawn(|| { return 23; }); let t2 = std::thread::spawn(|| { return 19; }); let v1 = try!(t1.join()); let v2 = try!(t2.join()); println!("{} + {} = {}", v1, v2, v1 + v2); - Simple example - No shared data
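The thread example as a self-contained program might look like this. It is a sketch with one labeled change: it uses `unwrap()` on the join results instead of `try!`, so it does not need an enclosing function that returns a `Result`. The `sum_from_threads` wrapper is hypothetical.

```rust
use std::thread;

// Hypothetical wrapper around the slide's code.
fn sum_from_threads() -> i32 {
    // Each closure runs on its own thread and returns a value.
    let t1 = thread::spawn(|| 23);
    let t2 = thread::spawn(|| 19);

    // join() waits for the thread; it returns Err if the thread panicked.
    let v1 = t1.join().unwrap();
    let v2 = t2.join().unwrap();
    v1 + v2
}

fn main() {
    println!("23 + 19 = {}", sum_from_threads());
}
```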

Slide 66

Slide 66 text

66 Mutexes / Arcs (1) let data = Arc::new(Mutex::new(0)); let data1 = data.clone(); let t1 = thread::spawn(move || { let mut guard = data1.lock().unwrap(); *guard += 19; }); let data2 = data.clone(); let t2 = thread::spawn(move || { let mut guard = data2.lock().unwrap(); *guard += 23; }); - Arcs allow multiple references to the same data. (Safe pointers.) Arcs can be cloned. - Value of an Arc gets dropped when references are 0. - Mutexes own their values. Using lock() acquires mutex. - Locking returns a MutexGuard as proxy. When guard is dropped, lock is released. - Not possible to forget about releasing. - Arc pointer moved into closures

Slide 67

Slide 67 text

67 Mutexes / Arcs (2) t1.join().unwrap(); t2.join().unwrap(); let guard = data.lock().unwrap(); assert_eq!(*guard, 42); - Threads need to be joined, otherwise result might not yet be ready. - Again, we need to acquire a Mutex lock. - We don’t need another Arc reference though.
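Both halves of the mutex example fit together as follows. A minimal sketch: the `shared_counter` wrapper and the `use` lines are additions, the rest follows the two slides.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical wrapper around the code from the two slides.
fn shared_counter() -> i32 {
    let data = Arc::new(Mutex::new(0));

    // Each thread gets its own Arc handle to the same Mutex.
    let data1 = data.clone();
    let t1 = thread::spawn(move || {
        let mut guard = data1.lock().unwrap();
        *guard += 19;
    }); // guard is dropped at the end of the closure, releasing the lock

    let data2 = data.clone();
    let t2 = thread::spawn(move || {
        let mut guard = data2.lock().unwrap();
        *guard += 23;
    });

    // Wait for both threads before reading the result.
    t1.join().unwrap();
    t2.join().unwrap();

    // Lock once more to read; the temporary guard is released
    // at the end of this statement.
    let total = *data.lock().unwrap();
    total
}

fn main() {
    println!("total = {}", shared_counter());
}
```

The order in which the two threads take the lock is not fixed, but the sum is the same either way.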

Slide 68

Slide 68 text

68 Channels use std::sync::mpsc::channel; Signature: fn channel<T>() -> (Sender<T>, Receiver<T>) - mpsc: Multiple producers, single consumer - Unidirectional channel - Returns two ends: sender and receiver - Sender can be cloned, but not receiver
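A channel with several producers and one consumer could be used like this. The slide only gives the signature, so the whole `collect_from_workers` function, including the values it sends, is a hypothetical example.

```rust
use std::sync::mpsc::channel;
use std::thread;

// Hypothetical example: each worker thread sends one value,
// the single receiver collects them all.
fn collect_from_workers(workers: u32) -> Vec<u32> {
    let (tx, rx) = channel();

    for id in 0..workers {
        // The sender can be cloned: one handle per producer.
        let tx = tx.clone();
        thread::spawn(move || {
            tx.send(id * 10).unwrap();
        });
    }

    // Drop the original sender so that the receiver's iterator
    // ends once every cloned sender has been dropped as well.
    drop(tx);

    let mut received: Vec<u32> = rx.iter().collect();
    received.sort(); // arrival order is not deterministic
    received
}

fn main() {
    println!("{:?}", collect_from_workers(3));
}
```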

Slide 69

Slide 69 text

69 6. Further Reading

Slide 70

Slide 70 text

70 «Why Rust?» Free e-book by O’Reilly, ~50 pages. Highly recommended! This presentation is actually based on that book. http://www.oreilly.com/programming/free/why-rust.csp

Slide 71

Slide 71 text

71 «Rust Book» Not actually a book. Official guide to learning Rust. Great resource. https://doc.rust-lang.org/book/

Slide 72

Slide 72 text

72 7. Questions

Slide 73

Slide 73 text

73 Let’s make it happen. Thank you. Contact: +41 44 542 90 60