
Intro to Rust (with speaker notes)

dbrgn
September 29, 2015

This is the same presentation as here https://speakerdeck.com/dbrgn/intro-to-rust, but with speaker notes included.

Transcript

  1. 2 Agenda 1. What is Rust? 2. What’s Type Safety?

    3. Reading Rust 4. Memory Safety in Rust 5. Multithreaded Programming 6. Further Reading 7. Questions
  2. 4 «Rust is a systems programming language that runs blazingly

    fast, prevents nearly all segfaults, and guarantees thread safety.» www.rust-lang.org
  3. 5 What’s wrong with systems languages? - It’s difficult to

    write secure code. - It’s very difficult to write multithreaded code. These are the problems Rust was made to address. 1 - Systems languages have come a long way in the last 50 years - The first virus based on a buffer overflow appeared in 1988 - According to the Open Source Vulnerability Database, buffer overflows still account for 10-15% of the vulnerabilities reported during the last 8 years 2 - Multithreading is becoming more important with multicore CPUs - C++-style threading is incredibly hard; even experienced programmers write hard-to-reproduce bugs
  4. 6 Quick Facts about Rust (As of September 2015) -

    Started by Mozilla employee Graydon Hoare - First announced by Mozilla in 2010 - Community-driven development - First stable release: 1.0 in May 2015 - Latest stable release: 1.3 - 46'484 commits on GitHub - Largest project written in Rust: Servo
  5. 7 Features - Zero-cost abstractions - Move semantics - Guaranteed

    memory safety - Threads without data races - Trait based generics - Pattern matching - Type inference - Minimal runtime, no GC - Efficient C bindings
  6. 9 A C Program int main(int argc, char **argv) {

    unsigned long a[1]; a[3] = 0x7ffff7b36cebUL; return 0; } According to C99, undefined behavior. Output: undef: Error: .netrc file is readable by others. undef: Remove password or make file unreadable by others. 1 - We’re overwriting the return address on the stack. - Jumping right into libc 2 - The user is responsible for safety - We’re not good at that
  7. 10 Definitions - If a program has been written so

    that no possible execution can exhibit undefined behavior, we say that program is well defined. - If a language’s type system ensures that every program is well defined, we say that language is type safe
  8. 11 Type Safe Languages - C and C++ are not

    type safe. - Python is type safe: >>> a = [0] >>> a[3] = 0x7ffff7b36ceb Traceback (most recent call last): File "<stdin>", line 1, in <module> IndexError: list assignment index out of range >>> - Java, JavaScript, Ruby, and Haskell are also type safe. 1 - Our sample program had no type errors, yet exhibits undefined behavior. 2 - An exception is not undefined behavior. 3 - Every program that a type safe language accepts is well defined.
  9. 12 It’s ironic. C and C++ are not type safe.

    Yet they are being used to implement the foundations of a system. Rust tries to resolve that tension. - Rust also allows unsafe code. But the great majority of programs do not need unsafe code.
  10. 14 Example 1 fn gcd(mut n: u64, mut m: u64)

    -> u64 { assert!(n != 0 && m != 0); while m != 0 { if m < n { let t = m; m = n; n = t; } m = m % n; } n } - Program that calculates the greatest common divisor
  11. 15 Example 1 fn gcd(mut n: u64, mut m: u64)

    -> u64 { assert!(n != 0 && m != 0); while m != 0 { if m < n { let t = m; m = n; n = t; } m = m % n; } n } The fn keyword introduces a function definition. Arrow denotes return value.
  12. 16 Example 1 fn gcd(mut n: u64, mut m: u64)

    -> u64 { assert!(n != 0 && m != 0); while m != 0 { if m < n { let t = m; m = n; n = t; } m = m % n; } n } Variables are immutable by default
  13. 17 Example 1 fn gcd(mut n: u64, mut m: u64)

    -> u64 { assert!(n != 0 && m != 0); while m != 0 { if m < n { let t = m; m = n; n = t; } m = m % n; } n } - Rust variable declarations: Name followed by type - [uif](8|16|32|64) and usize / isize
  14. 18 Example 1 fn gcd(mut n: u64, mut m: u64)

    -> u64 { assert!(n != 0 && m != 0); while m != 0 { if m < n { let t = m; m = n; n = t; } m = m % n; } n } - Macro invocations with exclamation mark
  15. 19 Example 1 fn gcd(mut n: u64, mut m: u64)

    -> u64 { assert!(n != 0 && m != 0); while m != 0 { if m < n { let t = m; m = n; n = t; } m = m % n; } n } - Type is inferred from context or suffix. Otherwise i32.
  16. 20 Example 1 fn gcd(mut n: u64, mut m: u64)

    -> u64 { assert!(n != 0 && m != 0); while m != 0 { if m < n { let t = m; m = n; n = t; } m = m % n; } n } - let introduces a local variable - Type is inferred
  17. 21 Example 1 fn gcd(mut n: u64, mut m: u64)

    -> u64 { assert!(n != 0 && m != 0); while m != 0 { if m < n { let t = m; m = n; n = t; } m = m % n; } n } - Loops and conditions don’t need parentheses, only braces around body
  18. 22 Example 1 fn gcd(mut n: u64, mut m: u64)

    -> u64 { assert!(n != 0 && m != 0); while m != 0 { if m < n { let t = m; m = n; n = t; } m = m % n; } n } - Blocks are expressions - return statement optional
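The points made across the Example 1 slides can be tried out in one place. The following is a minimal sketch: `gcd` is the function from the slides, only reformatted, while `relation` is a hypothetical helper added here to show that `if` is an expression whose value can be bound with `let`.

```rust
/// Greatest common divisor, as shown on the slides.
fn gcd(mut n: u64, mut m: u64) -> u64 {
    assert!(n != 0 && m != 0);
    while m != 0 {
        if m < n {
            // `t` is immutable and its type (u64) is inferred.
            let t = m;
            m = n;
            n = t;
        }
        m = m % n;
    }
    n // no `return` needed: the last expression is the function's value
}

/// Hypothetical helper (not from the slides): `if` is an expression,
/// so its result can be bound directly to a variable.
fn relation(a: u64, b: u64) -> &'static str {
    let label = if gcd(a, b) == 1 { "coprime" } else { "share a factor" };
    label
}
```

Calling `gcd(12, 18)` yields 6, and `relation(14, 15)` yields "coprime".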
  19. 23 Example 2: Generics fn min<T: Ord>(a: T, b: T)

    -> T { if a <= b { a } else { b } } - Program that returns the smaller of two values
  20. 24 Example 2: Generics fn min<T: Ord>(a: T, b: T)

    -> T { if a <= b { a } else { b } } - The <T: Ord> after the function name marks it as a generic function - Defined for any type T where T is Ord - If a type is Ord, it supports a comparison - Ord is a Trait, will be handled later
  21. 25 Example 2: Generics fn min<T: Ord>(a: T, b: T)

    -> T { if a <= b { a } else { b } } ... min(10i8, 20) == 10; // T is i8 min(10, 20u32) == 10; // T is u32 min("abc", "xyz") == "abc"; // Strings are Ord min(10i32, "xyz"); // error: mismatched types.
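The generic `min` also works for user-defined types once they implement `Ord`. A small sketch, assuming a hypothetical `Version` struct that is not part of the original slides:

```rust
// `min` as defined on the slide.
fn min<T: Ord>(a: T, b: T) -> T {
    if a <= b { a } else { b }
}

// Hypothetical type for illustration. Deriving `Ord` (together with the
// traits it requires) makes `Version` usable with `min`. The derived
// ordering compares fields in declaration order: `major` first, then `minor`.
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord)]
struct Version {
    major: u32,
    minor: u32,
}
```

Floating point types such as `f64` implement only `PartialOrd`, not `Ord`, so `min(1.0, 2.0)` is rejected by this definition.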
  22. 26 Example 3: Generic Types struct Range<Idx> { start: Idx,

    end: Idx, } ... Range { start: 200, end: 800 } // OK Range { start: 1.3, end: 4.7 } // Also OK - A struct holds data - Can be generic
  23. 27 Example 4: Enumerations enum Option<T> { Some(T), None }

    - Algebraic Datatypes - For any type T, an Option<T> may be either: - None, which carries no value - Some(v) which carries the value v of type T - Resemble unions in C / C++, but remember their value type
  24. 28 Example 5: Application of Option<T> fn safe_div(n: i32, d:

    i32) -> Option<i32> { if d == 0 { return None; } Some(n / d) } - Function that returns a division result, or None if divisor is 0 - (Could also be written as a single expression)
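The single-expression form mentioned in the note could look like the sketch below. Because `if` is an expression, no `return` is needed.

```rust
// Same behavior as the slide's version, written as one expression.
fn safe_div(n: i32, d: i32) -> Option<i32> {
    if d == 0 { None } else { Some(n / d) }
}
```

The standard library offers `i32::checked_div`, which also returns `None` for a zero divisor (and additionally for overflow).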
  25. 29 Example 6: Matching an Option match safe_div(num, denom) {

    None => println!("No quotient."), Some(v) => println!("Quotient is {}.", v) } - Similar to a switch statement, but more powerful - Rust offers combinator methods for simplification, more on this later
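To illustrate what "more powerful" can mean, here is a sketch that matches on literal values inside the `Option` and uses a guard. `describe` is a hypothetical helper, and it returns the message instead of printing it so the result is easy to inspect.

```rust
fn safe_div(n: i32, d: i32) -> Option<i32> {
    if d == 0 { None } else { Some(n / d) }
}

// Hypothetical helper (not from the slides). Arms are tried top to bottom.
fn describe(num: i32, denom: i32) -> String {
    match safe_div(num, denom) {
        None => "No quotient.".to_string(),
        // A pattern can match a specific value inside the Option.
        Some(0) => "Quotient is zero.".to_string(),
        // A guard (`if ...`) adds a condition to a pattern.
        Some(v) if v < 0 => format!("Quotient is negative: {}.", v),
        Some(v) => format!("Quotient is {}.", v),
    }
}
```

The compiler checks that the arms cover every possible value, so forgetting the `None` case is a compile-time error.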
  26. 30 Example 7: Traits trait HasArea { fn area(&self) ->

    f64; } - Any type implementing HasArea must provide a method called “area” that takes no parameters and returns an f64
  27. 31 Example 8: Trait Implementation struct Circle { x: f64,

    y: f64, radius: f64, } impl HasArea for Circle { fn area(&self) -> f64 { consts::PI * (self.radius * self.radius) } } - Traits are implemented for structs - Other way of looking at it: Methods (behavior) are attached to data
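A complete version of this example needs the `consts` import, and a second implementor shows why the trait is useful. In this sketch, `Square` and `total_area` are hypothetical additions that do not appear on the slides.

```rust
use std::f64::consts;

trait HasArea {
    fn area(&self) -> f64;
}

struct Circle {
    x: f64,
    y: f64,
    radius: f64,
}

impl HasArea for Circle {
    fn area(&self) -> f64 {
        consts::PI * (self.radius * self.radius)
    }
}

// Hypothetical second type implementing the same trait.
struct Square {
    side: f64,
}

impl HasArea for Square {
    fn area(&self) -> f64 {
        self.side * self.side
    }
}

// Hypothetical generic function: accepts any two types that implement HasArea.
fn total_area<A: HasArea, B: HasArea>(a: &A, b: &B) -> f64 {
    a.area() + b.area()
}
```

`total_area` is written once but works for any combination of shapes, because the only thing it relies on is the `area` method promised by the trait.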
  28. 32 Example 9: Default Methods trait Validatable { fn is_valid(&self)

    -> bool; fn is_invalid(&self) -> bool { !self.is_valid() } } - A trait can provide default implementations - Can be overridden
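An implementor only has to supply the methods without a default. A minimal sketch, assuming a hypothetical `Username` type that counts as valid when it is non-empty:

```rust
trait Validatable {
    fn is_valid(&self) -> bool;

    // Default implementation, expressed in terms of `is_valid`.
    fn is_invalid(&self) -> bool {
        !self.is_valid()
    }
}

// Hypothetical type for illustration (not from the slides).
struct Username(String);

impl Validatable for Username {
    fn is_valid(&self) -> bool {
        !self.0.is_empty()
    }
    // `is_invalid` is inherited from the trait's default implementation.
}
```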
  29. 33 Example 10: Trait Composition trait Foo { fn foo(&self);

    } trait FooBar : Foo { fn foobar(&self); } - Any type implementing FooBar must also implement Foo - Example: A trait Number that requires implementing Add, Sub, Mul, Div
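The effect of the `FooBar: Foo` bound can be seen in a sketch like the following. Note the assumptions: the methods return `String` here (the slide's versions return nothing) so the result can be inspected, and `Baz` and `call_both` are hypothetical names.

```rust
trait Foo {
    fn foo(&self) -> String;
}

// `FooBar: Foo` makes `Foo` a supertrait of `FooBar`.
trait FooBar: Foo {
    fn foobar(&self) -> String;
}

// Hypothetical type for illustration.
struct Baz;

impl Foo for Baz {
    fn foo(&self) -> String {
        "foo".to_string()
    }
}

// This impl compiles only because `Baz` also implements `Foo`.
impl FooBar for Baz {
    fn foobar(&self) -> String {
        // Methods of the supertrait are available here.
        format!("{}bar", self.foo())
    }
}

// A `FooBar` bound is enough to call methods of both traits.
fn call_both<T: FooBar>(value: &T) -> (String, String) {
    (value.foo(), value.foobar())
}
```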
  30. 35 Three Key Promises - No null pointer dereferences -

    No dangling pointers - No buffer overruns 1 - Your program will not crash because you tried to dereference a null pointer 2 - Every value will live as long as it must 3 - Your program will never access elements outside of an array All ensured at compile time. What do they mean?
  31. 36 P1: No null pointer dereferences - Null pointers are

    useful - They can indicate the absence of optional information - They can indicate failures - But they can introduce severe bugs - Rust separates the concept of a pointer from the concept of an optional or error value - Optional values are handled by Option<T> - Error values are handled by Result<T, E> - Many helpful tools to do error handling
  32. 37 You already saw Option<T> fn safe_div(n: i32, d: i32)

    -> Option<i32> { if d == 0 { return None; } Some(n / d) } - But what if you want to return an error, not just None?
  33. 38 There’s also Result<T, E> enum Result<T, E> { Ok(T),

    Err(E) } - E can be any type, even String
  34. 39 How to use Results enum Error { DivisionByZero, }

    fn safe_div(n: i32, d: i32) -> Result<i32, Error> { if d == 0 { return Err(Error::DivisionByZero); } Ok(n / d) } - Good practice to define your own error types instead of using strings
  35. 40 Tedious Results fn do_calc() -> Result<i32, String> { let

    a = match do_subcalc1() { Ok(val) => val, Err(msg) => return Err(msg), }; let b = match do_subcalc2() { Ok(val) => val, Err(msg) => return Err(msg), }; Ok(a + b) } - Calling a lot of functions that return a Result can become tedious
  36. 41 The try! Macro fn do_calc() -> Result<i32, String> {

    let a = try!(do_subcalc1()); let b = try!(do_subcalc2()); Ok(a + b) } - The try! macro does the same thing, unwrap or early return - Error signature must match! - What if the errors don’t match?
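Later Rust releases added the `?` operator, which performs the same "unwrap or return early" step as `try!`. The sketch below uses `?` so that it compiles with current toolchains. The bodies of the sub-calculations are hypothetical, since the slides do not show them.

```rust
// Hypothetical sub-calculations (their bodies are not shown on the slides).
fn do_subcalc1() -> Result<i32, String> {
    Ok(19)
}

fn do_subcalc2() -> Result<i32, String> {
    Ok(23)
}

fn failing_subcalc() -> Result<i32, String> {
    Err("subcalc failed".to_string())
}

fn do_calc() -> Result<i32, String> {
    let a = do_subcalc1()?; // unwraps the Ok value
    let b = do_subcalc2()?;
    Ok(a + b)
}

fn do_failing_calc() -> Result<i32, String> {
    let a = do_subcalc1()?;
    let b = failing_subcalc()?; // returns the Err to the caller right here
    Ok(a + b)
}
```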
  37. 42 Mapping Errors fn do_subcalc() -> Result<i32, String> { …

    } fn do_calc() -> Result<i32, Error> { let res = do_subcalc(); let mapped = res.map_err(|msg| { println!("Error: {}", msg); Error::CalcFailed }); let val = try!(mapped); Ok(val + 1) } - Convert them with helper methods - map_err passes through a successful result while handling an error - Explanation on next slide
  38. 43 Mapping Errors let mapped = res.map_err(|msg| Error::CalcFailed); is the

    same as let mapped = match res { Ok(val) => Ok(val), Err(msg) => Err(Error::CalcFailed), }; - There are many other helper methods like this
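Put together, mapping a `String` error into a custom error type might look like this sketch. The `input` parameter and the body of `do_subcalc` are assumptions made for illustration, and `?` stands in for `try!` so the code compiles with current toolchains.

```rust
#[derive(Debug, PartialEq)]
enum Error {
    CalcFailed,
}

// Hypothetical sub-calculation: fails for negative input.
fn do_subcalc(input: i32) -> Result<i32, String> {
    if input < 0 {
        Err("negative input".to_string())
    } else {
        Ok(input * 2)
    }
}

fn do_calc(input: i32) -> Result<i32, Error> {
    // map_err leaves Ok values untouched and converts the String error
    // into Error::CalcFailed, so the error types line up.
    let val = do_subcalc(input).map_err(|_msg| Error::CalcFailed)?;
    Ok(val + 1)
}
```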
  39. 44 Other Combinator Methods (1) Get the value from an

    option. Option.unwrap(self) -> T Option.unwrap_or(self, def: T) -> T Option.unwrap_or_else<F>(self, f: F) -> T where F: FnOnce() -> T
  40. 45 Other Combinator Methods (2) Map an Option<T> to Option<U>

    or U. Option.map<U, F>(self, f: F) -> Option<U> where F: FnOnce(T) -> U Option.map_or<U, F>(self, default: U, f: F) -> U where F: FnOnce(T) -> U Option.map_or_else<U, D, F>(self, default: D, f: F) -> U where F: FnOnce(T) -> U, D: FnOnce() -> U
  41. 46 Other Combinator Methods (3) Convert an option to a

    result, mapping Some(v) to Ok(v) and None to Err(err). Option.ok_or<E>(self, err: E) -> Result<T, E> Option.ok_or_else<E, F>(self, err: F) -> Result<T, E> where F: FnOnce() -> E - This is only a small selection. - There are similar methods on Result and others. - They all make it easier to work without having null pointers.
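A few of these combinators in use, as a rough sketch. The `port_for` lookup and the helper functions around it are hypothetical and exist only to give the methods something to operate on.

```rust
// Hypothetical lookup used only for illustration.
fn port_for(service: &str) -> Option<u16> {
    match service {
        "http" => Some(80),
        "https" => Some(443),
        _ => None,
    }
}

// unwrap_or: take the value, or fall back to a default.
fn port_or_default(service: &str) -> u16 {
    port_for(service).unwrap_or(8080)
}

// map_or: transform the value if present, otherwise use the default.
fn port_label(service: &str) -> String {
    port_for(service).map_or("unknown".to_string(), |p| format!("port {}", p))
}

// ok_or_else: turn None into an Err built by the closure.
fn port_or_err(service: &str) -> Result<u16, String> {
    port_for(service).ok_or_else(|| format!("no port for {}", service))
}
```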
  42. 47 P2: No dangling pointers - Rust programs never try

    to access a heap-allocated value after it has been freed. - No garbage collection or reference counting involved! - Everything is enforced at compile time. 1 - Not an unusual promise, all type safe languages do this 2 3 - How is this done?
  43. 48 Three Rules - Rule 1: Every value has a

    single owner at any given time. - Rule 2: You can borrow a reference to a value, so long as the reference doesn’t outlive the value. - Rule 3: You can only modify a value when you have exclusive access to it. - 1: You can move a value from one owner to another, but when a value’s owner goes away the value is freed along with it. - 2: Borrowed references are temporary pointers; they allow you to operate on values you don’t own.
  44. 49 Ownership - Variables own their values - A struct

    owns its fields - An enum owns its values - Every heap-allocated value has a single pointer that owns it - All values are dropped when their owner is dropped
  45. 50 Ownership: Scoping { let s = "Chuchichästli".to_string(); } //

    s goes out of scope, text is freed - Variables that go out of scope are freed
  46. 51 Ownership: Move Semantics { let s = "Chuchichästli".to_string(); //

    t1 takes ownership from s let t1 = s; // compile-time error: use of moved value s let t2 = s; } - Assigning to variables moves values (most of the time)
  47. 52 Ownership: Copy Trait { let pi = 3.1415926f32; let

    foo = pi; let bar = pi; // This is fine! } - Types that implement the “Copy” trait (usually primitive types) are copied implicitly - Examples: char, bool, numeric types
  48. 53 Ownership: Clone Trait { let s = "Chuchichästli".to_string(); let

    t1 = s.clone(); let t2 = s.clone(); } - Other types can implement Clone trait for explicit cloning - Three independent String objects - Each is owned by the variable binding
  49. 54 Ownership: Deriving Copy / Clone #[derive(Copy, Clone)] struct Color

    { r: u8, g: u8, b: u8 } - Implementing Copy and Clone is trivial for most types - So it can be auto-generated by the compiler - All fields must be Copy / Clone too
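The difference between the two derives shows up as soon as a struct contains a `String`. In this sketch, `Label` and `copy_and_clone_demo` are hypothetical, and the extra `Debug` and `PartialEq` derives are there only so the values can be compared.

```rust
#[derive(Copy, Clone, Debug, PartialEq)]
struct Color {
    r: u8,
    g: u8,
    b: u8,
}

// Hypothetical type: `String` is Clone but not Copy, so `Label` can derive
// Clone but adding Copy here would be a compile-time error.
#[derive(Clone, Debug, PartialEq)]
struct Label {
    text: String,
    color: Color,
}

fn copy_and_clone_demo() -> (Color, Color, Label, Label) {
    let red = Color { r: 255, g: 0, b: 0 };
    let a = red; // implicit copy: `red` stays usable
    let b = red; // fine, still not moved
    let original = Label { text: "Chuchichästli".to_string(), color: red };
    let duplicate = original.clone(); // explicit clone: a second, independent String
    (a, b, original, duplicate)
}
```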
  50. 55 But what about this? let s = "Hello, world".to_string();

    print_with_umpff(s); println!("{}", s); error: use of moved value: `s` println!("{}", s); ^ note: `s` moved here because it has type `collections::string::String`, which is non-copyable print_with_umpff(s); ^ - Now you know move semantics - Can cause problems though. - Does anyone see the problem? - Ownership is moved into the function, and the value is freed when the function returns
  51. 56 Borrowing let s = "Hello, world".to_string(); print_with_umpff(&s); println!("Original value

    was {}", s); - The function can borrow the value - Many functions can borrow at the same time, because they cannot modify the value
  52. 57 Mutable Borrowing let mut s = "Hello, world".to_string(); add_umpff(&mut

    s); println!("New value is {}", s); - A mutable borrow grants exclusive access - Only one mutable borrow possible at a time - While you borrow a mutable reference to a value, that reference is the only way to access that value at all.
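The slides only show the calls to `print_with_umpff` and `add_umpff`. The function bodies below are hypothetical, written to make the two kinds of borrow visible in their signatures.

```rust
// Hypothetical implementation: a shared borrow (`&str`) can read but not modify.
fn print_with_umpff(s: &str) {
    println!("{}!!!", s);
}

// Hypothetical implementation: an exclusive borrow (`&mut String`) can modify.
fn add_umpff(s: &mut String) {
    s.push_str("!!!");
}

fn borrow_demo() -> String {
    let mut s = "Hello, world".to_string();
    print_with_umpff(&s); // shared borrow ends when the call returns
    add_umpff(&mut s); // exclusive borrow; nothing else may access `s` meanwhile
    println!("New value is {}", s);
    s // the caller becomes the new owner
}
```

Because both functions only borrow, `s` is still owned by `borrow_demo` after the calls and can be used or returned.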
  53. 58 Borrowing prevents moving let x = String::new(); let borrow

    = &x; let y = x; // error: cannot move out of `x` because // it is borrowed - While borrowed, a move must be prevented - Otherwise you might end up with a dangling pointer
  54. 59 Lifetimes let borrow; let x = String::new(); borrow =

    &x; // error: `x` does not live // long enough - What is the problem here? - Lifetime of borrow is longer than lifetime of x - This can also be visualized differently:
  55. 60 Lifetimes { let borrow; { let x = String::new();

    borrow = &x; // error: `x` does not live // long enough } } - Now it should be obvious. - Using lifetime checking, the compiler guarantees that there are no dangling pointers.
  56. 61 Lifetimes - Sometimes the compiler cannot infer lifetimes

    automatically - It needs more information - Parameters and return values can be annotated with explicit lifetimes - Won’t be covered here :)
  57. 62 P3: No buffer overruns - There’s no pointer arithmetic

    in Rust - Arrays in Rust are not just pointers - Accesses are bounds checked; the compiler removes checks it can prove unnecessary (zero cost abstractions)
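As a counterpart to the C program from the beginning, here is a sketch of an out-of-range access in Rust. `element_at` is a hypothetical helper built on the standard slice method `get`, which returns an `Option` instead of touching memory outside the slice.

```rust
// Hypothetical helper: returns None when `index` is outside the slice.
fn element_at(values: &[u64], index: usize) -> Option<u64> {
    values.get(index).copied()
}
```

Indexing with `values[index]` is checked as well. An out-of-range index causes a panic, which is well-defined behavior, rather than overwriting a return address on the stack.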
  58. 64 We’ll make this short - The Rust compiler does

    not know about concurrency - Everything works based on the three rules - I’ll only show a few examples
  59. 65 Threads let t1 = std::thread::spawn(|| { return 23; });

    let t2 = std::thread::spawn(|| { return 19; }); let v1 = try!(t1.join()); let v2 = try!(t2.join()); println!("{} + {} = {}", v1, v2, v1 + v2); - Simple example - No shared data
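The snippet on the slide has to live inside a function that returns a `Result`, because `join` reports a panicked thread as an `Err`. A minimal sketch of such a function, with the hypothetical name `sum_from_threads` and with `?` in place of `try!` so it compiles with current toolchains:

```rust
use std::thread;

// `thread::Result<T>` is the type returned by `JoinHandle::join`.
fn sum_from_threads() -> thread::Result<i32> {
    let t1 = thread::spawn(|| 23);
    let t2 = thread::spawn(|| 19);
    // join() blocks until the thread finishes and yields its return value,
    // or an Err if the thread panicked.
    let v1 = t1.join()?;
    let v2 = t2.join()?;
    Ok(v1 + v2)
}
```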
  60. 66 Mutexes / Arcs (1) let data = Arc::new(Mutex::new(0)); let

    data1 = data.clone(); let t1 = thread::spawn(move || { let mut guard = data1.lock().unwrap(); *guard += 19; }); let data2 = data.clone(); let t2 = thread::spawn(move || { let mut guard = data2.lock().unwrap(); *guard += 23; }); - Arcs allow multiple references to the same data. (Safe pointers.) Arcs can be cloned. - The value in an Arc gets dropped when the reference count reaches 0. - Mutexes own their values. Calling lock() acquires the mutex. - Locking returns a MutexGuard as a proxy. When the guard is dropped, the lock is released. - Not possible to forget to release the lock. - Arc pointers are moved into the closures
  61. 67 Mutexes / Arcs (2) t1.join().unwrap(); t2.join().unwrap(); let guard =

    data.lock().unwrap(); assert_eq!(*guard, 42); - Threads need to be joined, otherwise result might not yet be ready. - Again, we need to acquire a Mutex lock. - We don’t need another Arc reference though.
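Both halves of the Mutex / Arc example, combined into one function with the imports it needs. This is a sketch: the function name `shared_counter` and the use of a loop over the two amounts are choices made here, not part of the slides.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn shared_counter() -> i32 {
    let data = Arc::new(Mutex::new(0));

    let handles: Vec<_> = [19, 23]
        .iter()
        .map(|&amount| {
            // Each thread gets its own Arc pointing at the same Mutex.
            let data = Arc::clone(&data);
            thread::spawn(move || {
                let mut guard = data.lock().unwrap();
                *guard += amount;
                // The guard is dropped here, which releases the lock.
            })
        })
        .collect();

    // Wait for both threads before reading the result.
    for handle in handles {
        handle.join().unwrap();
    }

    let total = *data.lock().unwrap();
    total
}
```

The order in which the two threads acquire the lock is not fixed, but the final value is the same either way.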
  62. 68 Channels use std::sync::mpsc::channel; Signature: fn channel<T>() -> (Sender<T>, Receiver<T>)

    - mpsc: Multiple producers, single consumer - Unidirectional channel - Returns two ends: sender and receiver - Sender can be cloned, but not receiver
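A short sketch of several producers feeding one consumer. The worker count, the values sent, and the name `collect_from_workers` are assumptions made for illustration.

```rust
use std::sync::mpsc::channel;
use std::thread;

fn collect_from_workers() -> Vec<i32> {
    let (tx, rx) = channel();
    let mut handles = Vec::new();

    for id in 0..3 {
        // Each producer gets its own clone of the sender.
        let tx = tx.clone();
        handles.push(thread::spawn(move || {
            tx.send(id * 10).unwrap();
        }));
    }

    // Drop the original sender so the receiver sees the channel close
    // once all workers are done.
    drop(tx);

    for handle in handles {
        handle.join().unwrap();
    }

    // Iterating the receiver yields messages until every sender is gone.
    let mut received: Vec<i32> = rx.iter().collect();
    received.sort(); // arrival order depends on thread scheduling
    received
}
```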
  63. 70 «Why Rust?» Free e-book by O’Reilly, ~50 pages. Highly

    recommended! This presentation is actually based on that book. http://www.oreilly.com/programming/free/why-rust.csp
  64. 71 «Rust Book» Not actually a book. Official guide to

    learning Rust. Great resource. https://doc.rust-lang.org/book/