
Multithreading, concurrency and async programming in Rust

Valeryi Savich

April 07, 2020


Transcript

  1. What this talk is about: Multithreading & Concurrency - Basics & thread

    API - Communication between threads - Useful crates for multithreading & concurrency. Asynchronous programming - A brief overview of the async/await syntax - Approaches for writing async applications - Usage of blocking calls in asynchronous code - Good practices in async programming - Useful crates for async stuff
  2. General information Multithreaded & concurrent programming is cool because: -

    It gives an opportunity to run multiple tasks simultaneously - It uses all available resources when done right - And therefore gives better performance!* *but it depends on the task: in some cases single-threaded processing can be faster
  3. General information Multithreaded and concurrent programming is hard and

    can lead to problems, such as: - Race conditions, where threads are accessing data or resources in an inconsistent order* - Deadlocks, where two threads are waiting for each other to finish using a resource the other thread has, preventing both threads from continuing - Introducing new kinds of bugs that happen only under certain conditions and are hard to reproduce and fix properly *and therefore you need to use synchronization primitives or lock-free structures
  4. The threading model - Each thread has its own stack

    and local state - The thread name and the stack size can be configured - Communication can be done through channels, thread synchronization primitives or shared-memory data structures In addition, the Rust compiler also helps with: - Ownership rules applied across multiple threads - Checks that any resource has either a single mutable reference OR many immutable references
  5. A simple example - The compiler notices that we're trying to

    re-use data from the main thread - Adding the move keyword will fix the issue
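    The code from this slide is not in the transcript; below is a minimal sketch of the situation it describes, assuming the classic vector-printing example: without move the closure only borrows data from the main thread, which the compiler rejects.

    use std::thread;

    fn main() {
        let data = vec![1, 2, 3];

        // Without `move` the closure would borrow `data` and the compiler would
        // complain that the spawned thread may outlive the borrowed value.
        let handle = thread::spawn(move || {
            println!("from the spawned thread: {:?}", data);
        });

        handle.join().unwrap();
    }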
  6. Thread API Spawning a thread can be done with the

    thread::spawn function: The parent thread can wait on the completion of the child thread. Any call to spawn produces a JoinHandle, which provides a join method for waiting:
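    The slide's code is not in the transcript; a small sketch of thread::spawn and JoinHandle::join (the computed value here is an assumption for illustration):

    use std::thread;

    fn main() {
        // spawn returns a JoinHandle for the child thread.
        let handle = thread::spawn(|| {
            // The closure's return value becomes the thread's result.
            (1..=10).sum::<u32>()
        });

        // join blocks the parent thread until the child finishes.
        let sum = handle.join().expect("the spawned thread panicked");
        assert_eq!(sum, 55);
    }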
  7. Thread API For configuring threads use the thread::Builder type: The

    default stack size for spawned threads is 2 mebibytes (MiB). However, developers have two ways to change the stack size for spawned threads: - Pass the desired stack size to the Builder::stack_size method - Set the RUST_MIN_STACK environment variable to an integer representing the desired stack size (in bytes). The stack size of the main thread is not determined by Rust.
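    A sketch of configuring a thread through thread::Builder (the concrete names and sizes are assumptions, not the slide's own example):

    use std::thread;

    fn main() {
        let builder = thread::Builder::new()
            .name("worker-1".into())
            .stack_size(4 * 1024 * 1024); // 4 MiB instead of the default 2 MiB

        // Unlike thread::spawn, Builder::spawn returns an io::Result because
        // creating the OS thread itself can fail.
        let handle = builder
            .spawn(|| {
                println!("running in {:?}", thread::current().name());
            })
            .expect("failed to spawn the thread");

        handle.join().unwrap();
    }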
  8. Thread API Developers also have a way to declare variables

    in a thread-local context. They are instantiated with the thread_local! macro, and the primary way to access them is the with method:
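    A sketch of a thread-local variable declared with thread_local! and accessed through with (an assumed example, since the slide's code is not in the transcript):

    use std::cell::Cell;
    use std::thread;

    thread_local! {
        // Each thread gets its own independent copy of COUNTER.
        static COUNTER: Cell<u32> = Cell::new(0);
    }

    fn main() {
        COUNTER.with(|counter| counter.set(counter.get() + 1));

        let handle = thread::spawn(|| {
            // The spawned thread sees its own COUNTER, still at the initial value.
            COUNTER.with(|counter| assert_eq!(counter.get(), 0));
        });
        handle.join().unwrap();

        // The main thread's COUNTER keeps the value it set earlier.
        COUNTER.with(|counter| assert_eq!(counter.get(), 1));
    }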
  9. Communication between threads The Rust standard library provides a lot

    of useful primitives: - Cell<T>, RefCell<T>, Rc<T>* - Message-based communication over channels (std::sync::mpsc) - Arc<T>, Barrier, Mutex<T>, RwLock<T> - Atomics for lockless concurrent programming *but keep in mind these types have certain restrictions
  10. Cell<T> and RefCell<T> - Allow mutability in a controlled manner,

    even in the presence of aliasing - The value can be accessed through the set / get methods - The Cell<T> works only with Copy types (e.g. primitive types or user types deriving the Copy trait) - The RefCell<T> gives similar functionality to Cell<T>, but operates over references, either mutable or immutable* *use it with caution, because violating the borrow rules at runtime triggers a panic More info: Rust docs - Cell and RefCell
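    A minimal sketch of the difference between Cell and RefCell (an assumed example):

    use std::cell::{Cell, RefCell};

    fn main() {
        // Cell: works by copying values in and out (get requires a Copy type).
        let counter = Cell::new(0);
        counter.set(counter.get() + 1);
        assert_eq!(counter.get(), 1);

        // RefCell: hands out references and checks the borrow rules at runtime.
        let names = RefCell::new(vec!["alice".to_string()]);
        names.borrow_mut().push("bob".to_string());
        assert_eq!(names.borrow().len(), 2);

        // Holding a shared borrow while asking for a mutable one would panic:
        // let guard = names.borrow();
        // names.borrow_mut(); // -> panics with a BorrowMutError at runtime
    }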
  11. Rc<T> - A smart pointer that gives an opportunity to

    share data by reference - Increases the reference counter on cloning and decreases it when clones go out of scope - Mutation is disallowed by default: pair it with the Cell / RefCell types for mutability - Cannot be sent between threads More info: Rust docs - Rc
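    A sketch of sharing data with Rc, paired with RefCell for mutation (an assumed example):

    use std::cell::RefCell;
    use std::rc::Rc;

    fn main() {
        let shared = Rc::new(RefCell::new(vec![1, 2, 3]));

        // clone only bumps the reference counter; the Vec itself is not copied.
        let another_owner = Rc::clone(&shared);
        assert_eq!(Rc::strong_count(&shared), 2);

        another_owner.borrow_mut().push(4);
        assert_eq!(shared.borrow().len(), 4);
    }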
  12. Channels - Used for sharing data between threads - mpsc

    stands for “multiple producer, single consumer” - The channel() call returns two parts, a transmitter and a receiver - Each thread is passed a clone of the transmitter and calls send - The receiver uses recv for reading values More info: Rust docs - mpsc
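    A sketch of an mpsc channel with several producers and a single consumer (an assumed example):

    use std::sync::mpsc;
    use std::thread;

    fn main() {
        let (tx, rx) = mpsc::channel();

        for id in 0..3 {
            // Every producer thread gets its own clone of the transmitter.
            let tx = tx.clone();
            thread::spawn(move || {
                tx.send(id).expect("the receiver was dropped");
            });
        }
        // Drop the original transmitter so the channel closes when producers finish.
        drop(tx);

        // The iterator blocks on recv until a value arrives or the channel closes.
        let received: Vec<i32> = rx.iter().collect();
        assert_eq!(received.len(), 3);
    }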
  13. Arc<T> - Arc stands for “Atomic Reference Counting” - Identical

    to Rc<T>, but designed for thread-safe cases - A good choice for cases with read-only access More info: Rust docs - Arc
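    A sketch of read-only sharing through Arc across threads (an assumed example):

    use std::sync::Arc;
    use std::thread;

    fn main() {
        let config = Arc::new(vec!["host=localhost".to_string(), "port=8080".to_string()]);

        let handles: Vec<_> = (0..4)
            .map(|_| {
                // Each thread receives its own clone of the Arc handle.
                let config = Arc::clone(&config);
                thread::spawn(move || config.len())
            })
            .collect();

        for handle in handles {
            assert_eq!(handle.join().unwrap(), 2);
        }
    }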
  14. Barrier - Enables multiple threads to synchronize the beginning of

    some computation - A good choice when execution needs to be aligned at certain points in the code More info: Rust docs - Barrier
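    A sketch of aligning several threads at a common point with Barrier (an assumed example):

    use std::sync::{Arc, Barrier};
    use std::thread;

    fn main() {
        let barrier = Arc::new(Barrier::new(4));

        let handles: Vec<_> = (0..4)
            .map(|id| {
                let barrier = Arc::clone(&barrier);
                thread::spawn(move || {
                    println!("thread {} is ready", id);
                    // Every thread blocks here until all four have arrived.
                    barrier.wait();
                    println!("thread {} continues", id);
                })
            })
            .collect();

        for handle in handles {
            handle.join().unwrap();
        }
    }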
  15. Mutex<T> - A mutual exclusion primitive for protecting shared data

    across threads - The data can only be accessed through the RAII guard returned from the lock and the try_lock methods, which guarantees that the data is only ever accessed when the mutex is locked More info: Rust docs - Mutex
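    A sketch of protecting shared state with a Mutex behind an Arc (an assumed example):

    use std::sync::{Arc, Mutex};
    use std::thread;

    fn main() {
        let counter = Arc::new(Mutex::new(0u32));

        let handles: Vec<_> = (0..8)
            .map(|_| {
                let counter = Arc::clone(&counter);
                thread::spawn(move || {
                    // lock returns a RAII guard; the mutex unlocks when it drops.
                    let mut value = counter.lock().unwrap();
                    *value += 1;
                })
            })
            .collect();

        for handle in handles {
            handle.join().unwrap();
        }
        assert_eq!(*counter.lock().unwrap(), 8);
    }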
  16. RwLock<T> - Allows a number of readers or at most

    one writer at any point in time - The read or try_read methods give only read-only access - The write or try_write methods allow modification of the underlying data with exclusive access More info: Rust docs - RwLock
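    A sketch of RwLock with many readers and one writer (an assumed example):

    use std::sync::RwLock;

    fn main() {
        let settings = RwLock::new(vec!["debug".to_string()]);

        {
            // Multiple read guards may coexist at the same time.
            let first = settings.read().unwrap();
            let second = settings.read().unwrap();
            assert_eq!(first.len(), second.len());
        } // both read guards are dropped here

        // write gives exclusive access to the underlying data.
        settings.write().unwrap().push("verbose".to_string());
        assert_eq!(settings.read().unwrap().len(), 2);
    }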
  17. Atomics - Inherits the memory model for atomics from C++20

    - Each method takes an Ordering argument which represents the strength of the memory barrier for that operation: - Ordering::Relaxed: no ordering constraints, only atomic operations - Ordering::Release: all previous operations become ordered before any load of this value with Acquire (or stronger) ordering. This ordering is only applicable for operations that can perform a store - Ordering::Acquire: when coupled with a load, if the loaded value was written by a store operation with Release (or stronger) ordering, then all subsequent operations become ordered after that store. This ordering is only applicable for operations that can perform a load - Ordering::AcqRel: has the effects of both Acquire and Release together. This ordering is only applicable for operations that combine both loads and stores - Ordering::SeqCst: like Acquire/Release/AcqRel (for load, store, and load-with-store operations, respectively) with the additional guarantee that all threads see all sequentially consistent operations in the same order
  18. Atomics The usage is quite straightforward: - Acquire a location

    of memory to begin the critical section - And release that location to end it. For instance, a simple spinlock could be implemented as in the sketch below. More info: The Rustonomicon - Atomics
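    The slide's spinlock code is not in the transcript; below is a sketch in the spirit of the Rustonomicon example, using compare_exchange instead of the now-deprecated compare_and_swap:

    use std::sync::atomic::{AtomicBool, Ordering};
    use std::sync::Arc;
    use std::thread;

    fn main() {
        let lock = Arc::new(AtomicBool::new(false));

        let worker = {
            let lock = Arc::clone(&lock);
            thread::spawn(move || {
                // Acquire: spin until we manage to swap false -> true.
                while lock
                    .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
                    .is_err()
                {
                    std::hint::spin_loop();
                }
                // ... the critical section goes here ...
                // Release: publish our writes and unlock.
                lock.store(false, Ordering::Release);
            })
        };

        worker.join().unwrap();
    }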
  19. Send and Sync traits - Special traits which

    dictate whether it is safe to transfer ownership or pass a reference into another thread - Automatically implemented for most standard types in Rust - Designed as “markers”, so they have no methods and don't provide any built-in functionality - Implementing these traits manually is unsafe, because the developer has to ensure the invariants are upheld* The definitions are quite simple: - A type is Send if it is safe to send it to another thread - A type is Sync if it is safe to share between threads (&T is Send) *instead, compose your own types from these primitives (or a combination of them) More info: Rust book
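    A small sketch of how the Send bound shows up in practice (an assumed example): thread::spawn requires the closure and its captures to be Send, so Rc cannot cross the thread boundary while Arc can.

    use std::rc::Rc;
    use std::sync::Arc;
    use std::thread;

    fn main() {
        let shared = Arc::new(5);
        let shared_clone = Arc::clone(&shared);
        // Arc<i32> is Send + Sync, so it may move into the spawned thread.
        thread::spawn(move || println!("{}", shared_clone)).join().unwrap();

        let local = Rc::new(5);
        // The next line would not compile: Rc<i32> is neither Send nor Sync.
        // thread::spawn(move || println!("{}", local));
        println!("{}", local);
    }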
  20. Summary Rust helps developers to avoid common pitfalls in concurrency:

    - A channel transfers ownership of the messages sent along it, so you can send a pointer from one thread to another without fear of the threads later racing for access through that pointer. Rust's channels enforce thread isolation - A lock knows what data it protects, and Rust guarantees that the data can only be accessed when the lock is held. State is never accidentally shared. "Lock data, not code" is enforced in Rust - Every data type knows whether it can safely be sent between or accessed by multiple threads, and Rust enforces this safe usage; there are no data races, even for lock-free data structures. Thread safety isn't just documentation; it's law - You can even share stack frames between threads, and Rust will statically ensure that the frames remain active while other threads are using them. Even the most daring forms of sharing are guaranteed safe in Rust Source: Fearless concurrency in Rust
  21. Summary

    Type                Clone  Send  Sync
    Cell                ✅     ✅    ❌
    RefCell             ✅     ✅    ❌
    Rc                  ✅     ❌    ❌
    Receiver (channel)  ❌     ✅    ❌
    Sender (channel)    ✅     ✅    ❌
    Arc                 ✅     ✅    ✅
    Barrier             ❌     ✅    ✅
    Mutex               ❌     ✅    ✅
    RwLock              ❌     ✅    ✅
    Atomics             ❌     ✅    ✅

    - Cell, RefCell, Rc are designed to be used in the context of a single thread - Channels, Arc, Barrier, Mutex, RwLock, Atomics are designed for multithreading
  22. Rayon - A lightweight library for data parallelism - Guarantees

    data-race freedom - Dynamically adapts for maximum performance - For the most part, parallel iterators in particular are guaranteed to produce the same results as their sequential counterparts - Provides an API for implementing your own parallel tasks and tweaking thread pools
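    A minimal sketch of Rayon's parallel iterators (an assumed example; the rayon crate is the only dependency):

    use rayon::prelude::*;

    fn main() {
        let numbers: Vec<u64> = (1..=1_000).collect();

        // par_iter splits the work across the global thread pool; the result is
        // guaranteed to match what the sequential iterator would produce.
        let sum: u64 = numbers.par_iter().sum();
        assert_eq!(sum, 500_500);
    }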
  23. Crossbeam - A universal library for concurrent programming in Rust

    - Aims to provide a rich set of tools for concurrency akin to java.util.concurrent and to outdo Go channels in features and performance - Provides a lot of additional primitives. For example: - AtomicCell<T>, which works just like Cell<T>, except it can also be shared among threads - CachePadded, for padding and aligning a value to the length of a cache line - WaitGroup, which allows threads to synchronize the beginning or the end of some computation. Inspired by Go's WaitGroup from its standard library
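    A sketch of the WaitGroup primitive mentioned above, assuming the crossbeam crate's documented usage pattern:

    use crossbeam::sync::WaitGroup;
    use std::thread;

    fn main() {
        let wg = WaitGroup::new();

        for id in 0..4 {
            // Each worker gets its own clone of the WaitGroup.
            let wg = wg.clone();
            thread::spawn(move || {
                println!("worker {} finished", id);
                // Dropping the clone signals that this worker is done.
                drop(wg);
            });
        }

        // Blocks until every clone has been dropped.
        wg.wait();
    }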
  24. Loom - A tool for concurrency permutation testing - Runs

    tests multiple times, permuting the possible concurrent executions for each test under the C++11 memory model - Uses state reduction techniques to avoid combinatorial explosion
  25. Basic information. Async syntax and blockers: - async/await syntax stabilized

    in 1.39 [in stable] / RFC 2394 / #50547 - std::task and std::future stabilized in 1.36 [in stable] / RFC 2592 / #59113 - Pin APIs stabilized in 1.33 [in stable] / RFC 2349 / #49150 - Pin as a method receiver stabilized in 1.33 [in stable] / RFC 2362 / #55786 - 2018 edition stabilized in 1.31 [in stable] - async as a keyword in 2018 edition stabilized in 1.28 [in stable] - impl Trait in return position stabilized in 1.26 [in stable] / RFC 1522 / #34511 Source: areweasyncyet.rs Future extensions: - async fn in trait method not stabilized yet ◦ Workaround is available as an attribute macro: async-trait ◦ Generic associated types (GAT) not stabilized yet / RFC 1598 / #44265 ◦ Named existentials and impl Trait variable declarations not stabilized yet / RFC 2071 / #63066 - Async iterators or stream unresolved - Async closures not stabilized yet / RFC 2394
  26. Futures & combinators - The idea behind the “futures” concept

    is to express a value that is not ready yet - At a high level, consider futures a building block for composing many concurrent operations over time - All futures in Rust are poll-based - Futures describe the steps of the execution and are polled only once they have been spawned on an executor - The compiler therefore generates an internal state machine for the given future - The implementation of futures is independent of any existing event loop - Futures can be declared in an imperative or in a functional way
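    A tiny sketch of the “lazy, poll-based” point above (an assumed example, executed with the futures crate's block_on):

    use futures::executor::block_on;

    async fn answer() -> u32 {
        42
    }

    fn main() {
        // Creating the future does nothing yet; it only describes the work.
        let future = answer();
        // It runs only when an executor polls it to completion.
        assert_eq!(block_on(future), 42);
    }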
  27. Futures & combinators The most used future combinators (documentation: FutureExt):

    - and_then: executes another future after this one resolves successfully; the success value is passed to a closure to create this subsequent future
    - then: chains on a computation for when a future finishes, passing the result of the future to the provided closure
    - join: joins the results of two futures, waiting for them both to complete
    - map: maps this future's success value to a different value
    - map_err: maps this future's error value to a different value
    - or_else: executes another future if this one resolves to an error; the error value is passed to a closure to create this subsequent future
    - select: waits for either one of two differently-typed futures to complete
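    A sketch of several of these combinators together, assuming the futures 0.3 crate (the concrete values are only for illustration):

    use futures::executor::block_on;
    use futures::future::{self, FutureExt, TryFutureExt};

    fn main() {
        // map: transform the output of a future.
        let doubled = future::ready(21).map(|value| value * 2);

        // then: chain another future once the first one finishes.
        let chained = future::ready(1).then(|value| future::ready(value + 1));

        // join: wait for both futures and get a tuple of their results.
        let (a, b) = block_on(future::join(doubled, chained));
        assert_eq!((a, b), (42, 2));

        // and_then / map_err: combinators for futures that resolve to a Result.
        let ok: future::Ready<Result<i32, String>> = future::ready(Ok(10));
        let result = block_on(
            ok.and_then(|value| future::ready(Ok(value + 1)))
                .map_err(|err: String| format!("failed: {}", err)),
        );
        assert_eq!(result, Ok(11));
    }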
  28. Futures & combinators Documentation: TryFutureExt::and_then and_then - Executes another future

    after this one resolves successfully. The success value is passed to a closure to create this subsequent future - The provided closure f will only be called if this future is resolved to an Ok - Calling and_then on an errored future has no effect
  29. Futures & combinators Documentation: FutureExt::then then - Chain on a

    computation for when a future finishes, passing the result of the future to the provided closure f - The returned value of the closure must implement the Future trait and can represent some more work to be done before the composed future is finished - The closure f runs once the self future completes (unlike and_then, it does not require a successful result)
  30. Futures & combinators Documentation: FutureExt::join join - Joins the result

    of two futures, waiting for them both to complete - This function will return a new future which awaits both futures to complete. The returned future will finish with a tuple of both results
  31. Futures & combinators Documentation: FutureExt::map map - Map this future's

    output to a different type, returning a new future of the resulting type - This function is similar to the Option::map or Iterator::map where it will change the type of the underlying future. This is useful to chain along a computation once a future has been resolved
  32. Futures & combinators Documentation: TryFutureExt::map_err map_err - Maps this future's

    error value to a different value - This method can be used to change the Error type of the future into a different type - Calling map_err on a successful future has no effect
  33. Futures & combinators Documentation: TryFutureExt::or_else or_else - Executes another future

    if this one resolves to an error. The error value is passed to a closure to create this subsequent future - If this future resolves to an Ok, panics, or is dropped, then the provided closure will never be invoked. The Ok type of this future and the future returned by f have to match.
  34. Futures & combinators Documentation: select function select - Waits for

    either one of two differently-typed futures to complete - This function will return a new future which awaits for either one of both futures to complete. The returned future will finish with both the value resolved and a future representing the completion of the other work
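    A minimal sketch of select, assuming the futures 0.3 crate's future::select function and its Either result type:

    use futures::executor::block_on;
    use futures::future::{self, Either};

    fn main() {
        let first = future::ready("first");
        let second = future::pending::<&str>(); // never completes

        // select resolves with whichever future finishes first, plus the other one.
        match block_on(future::select(first, second)) {
            Either::Left((value, _unfinished)) => assert_eq!(value, "first"),
            Either::Right((value, _unfinished)) => unreachable!("{}", value),
        }
    }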
  35. Pin<T> and Unpin trait - Is usually talked about in

    the context of async/await (but is not limited to it) - The pinning concept was added because of issues that async/await ran into. To get an idea of why we need pinning in Rust, consider a structure holding a buffer together with self-references into that buffer [diagram on the slide: the initial buffer and the self-references into it]: if such a structure moves, all of its self-references would somehow have to be updated. Most types don't have a problem being moved. These types implement a trait called Unpin. Pointers to Unpin types can be freely placed into or taken out of Pin. For example, u8 is Unpin, so Pin<&mut u8> behaves just like a normal &mut u8. The Pin type wraps pointer types, guaranteeing that the values behind the pointer won't be moved. For example, Pin<&mut T>, Pin<&T> or Pin<Box<T>> all guarantee that T won't be moved. More info: Async book - Pinning
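    A small sketch of the Pin / Unpin point above (an assumed example):

    use std::pin::Pin;

    fn main() {
        // u8 is Unpin, so pinning a reference to it changes nothing in practice.
        let mut value: u8 = 7;
        let mut pinned: Pin<&mut u8> = Pin::new(&mut value);
        *pinned = 8;
        assert_eq!(value, 8);

        // Box::pin gives a Pin<Box<T>>; for a !Unpin T (such as the state machine
        // generated for an async block) the pointee can then never be moved out
        // again without unsafe code.
        let boxed: Pin<Box<String>> = Box::pin(String::from("pinned"));
        assert_eq!(boxed.len(), 6);
    }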
  36. Before async/await syntax In the beginning it was non-trivial: - The API

    of the Tokio crate had breaking changes - There was a lack of crates covering common async use cases - Explicit calls to the executor for spawning futures and entry points - Hard to write and to maintain, especially for beginners But at the same time: - Quite efficient and fast - Stable and predictable behaviour - The debugging process wasn't hard - A lot of experimental crates were invented to use
  37. Old approach Take a guess at what the issue was here

    … * *spoiler: the error wasn’t here at all, although it pointed to the right context
  38. With async / await syntax After the 1.39 release we

    finally can: - Use the async/await keywords in function signatures and calls - Propagate errors much more easily with the help of the ? operator - Let the compiler convert Result<T, E> instances into compatible impl Future<Item=T, Error=E> types However, keep in mind that: - In some cases developers still need to use the old approach - Futures still need to be Send + 'static to be spawned on a multithreaded executor - Most of the things that we did with the old approach are now handled implicitly
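    A small sketch of async/await with ?-based error propagation; using tokio 1.x with its full feature set as the executor is an assumption:

    use std::io;
    use tokio::fs;

    async fn read_config(path: &str) -> io::Result<String> {
        // `?` propagates the io::Error exactly like in synchronous code.
        let contents = fs::read_to_string(path).await?;
        Ok(contents.trim().to_string())
    }

    #[tokio::main]
    async fn main() -> io::Result<()> {
        // "Cargo.toml" is just an arbitrary file likely to exist in a Rust project.
        let config = read_config("Cargo.toml").await?;
        println!("read {} bytes of config", config.len());
        Ok(())
    }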
  39. Simple example with the event loop Full source code: echo

    example Currently Rust has many different implementations of executors, which can be found in the tokio, async-std or bastion crates. But to keep it simple, let's take a look at an example with tokio (the slide annotates: an entry point, creates a new future in the background, async read, async write, a connection handler).
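    The echo code itself is not in the transcript; a sketch of the same idea, assuming tokio 1.x (the original slides predate it):

    use tokio::io::{AsyncReadExt, AsyncWriteExt};
    use tokio::net::TcpListener;

    #[tokio::main]
    async fn main() -> std::io::Result<()> {
        // The entry point: bind a listener inside the tokio runtime.
        let listener = TcpListener::bind("127.0.0.1:8080").await?;

        loop {
            let (mut socket, _addr) = listener.accept().await?;

            // The connection handler: spawn a new future in the background.
            tokio::spawn(async move {
                let mut buf = [0u8; 1024];
                loop {
                    // Async read from the socket.
                    match socket.read(&mut buf).await {
                        Ok(0) => return, // connection closed
                        Ok(n) => {
                            // Async write: echo the bytes back.
                            if socket.write_all(&buf[..n]).await.is_err() {
                                return;
                            }
                        }
                        Err(_) => return,
                    }
                }
            });
        }
    }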
  40. Usage of blocking calls in async Existing event loops presume that

    tasks will only run for short periods at a time before yielding back to the thread pool. But sometimes developers need to make blocking calls, for example: - To perform synchronous operations - When there are no async clients that cover our needs, but blocking implementations exist For executing blocking calls a developer can: - Use the appropriate method depending on the crate: - the tokio_threadpool::blocking method in the tokio crate - the bastion::blocking macro in the bastion crate - the async_std::task::block_on function in the async-std crate - Do a regular async_std::spawn call, because it measures the time required to perform the operation and then spawns the task in the background when needed
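    The slide lists the crate-specific APIs that existed in early 2020; as a sketch of the same idea on a current runtime, tokio 1.x offers tokio::task::spawn_blocking (using it here is an assumption about the runtime in use):

    use std::time::Duration;

    #[tokio::main]
    async fn main() {
        // A synchronous, blocking operation wrapped for the async runtime:
        let result = tokio::task::spawn_blocking(|| {
            // Anything here may block freely; it runs on a dedicated thread pool.
            std::thread::sleep(Duration::from_millis(100));
            "expensive result".to_string()
        })
        .await
        .expect("the blocking task panicked");

        println!("{}", result);
    }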
  41. Good practices in async programming From my personal experience I

    recommend that you: - Minimize the size of chained futures, because we want to be productive and keep overall compilation time small - Enable a logger (e.g. use the log crate) and add appropriate calls - When debugging: use the dbg! / println! macros when necessary; set RUST_BACKTRACE=1 in the environment to get stack traces - Do load testing of the resulting program, because it helps to find issues and bugs at earlier stages. For example you can try: Gatling (Scala), MZBench (Erlang), Locust (Python), Artillery (JavaScript)
  42. Way #1: Using the .clone() calls Pros: - One of

    the easiest ways to fix ownership issues - A good fit for cases where the value is needed once and doesn't have to be passed through the whole chain of futures Cons: - Can create a copy of the object if it isn't a reference type - The more clone calls you do, the harder the code will be to maintain
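    A sketch of “Way #1” (an assumed example): clone the data before moving it into the spawned task so the original binding stays usable afterwards. The tokio runtime is an assumption here.

    #[tokio::main]
    async fn main() {
        let greeting = String::from("hello");

        let task = {
            // The task owns its own copy; the original stays with main.
            let greeting = greeting.clone();
            tokio::spawn(async move {
                println!("from the task: {}", greeting);
            })
        };

        task.await.unwrap();
        // Still available, because only the clone was moved into the task.
        println!("from main: {}", greeting);
    }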
  43. Way #2: Forwarding arguments Pros: - Eliminates a lot of

    potential issues with the ownership model - An ideal solution when the data needs to be used everywhere - A codebase that is easier to maintain Cons: - Requires refactoring the code (sometimes significantly) - The wrappers / structures used must be thread-safe
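    A sketch of “Way #2” (an assumed example): wrap the shared state in a thread-safe container and forward it explicitly into every handler. The tokio runtime and the AppState type are assumptions for illustration.

    use std::sync::Arc;

    struct AppState {
        service_name: String,
    }

    async fn handle_request(state: Arc<AppState>, request_id: u32) {
        println!("[{}] handled by {}", request_id, state.service_name);
    }

    #[tokio::main]
    async fn main() {
        let state = Arc::new(AppState {
            service_name: "echo-service".to_string(),
        });

        let tasks: Vec<_> = (0..3)
            .map(|id| tokio::spawn(handle_request(Arc::clone(&state), id)))
            .collect();

        for task in tasks {
            task.await.unwrap();
        }
    }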
  44. Tokio Provides an event-driven, non-blocking I/O platform for writing asynchronous

    applications with the Rust programming language. Features: - A multithreaded, work-stealing based task scheduler - A reactor backed by the operating system's event queue (epoll, kqueue, IOCP, etc.) - Asynchronous TCP and UDP sockets - Cancellation and backpressure features out-of-the-box - Scales well without adding overhead to the application, allowing it to thrive in resource constrained environments
  45. Async-std A mature and stable port of the Rust standard

    library, designed to make async programming easy, efficient, worry- and error-free. Features: - Asynchronous primitives for filesystem and network I/O - Exposes a task model similar to the thread module in std - async/await compatible versions of primitives (e.g. Arc, Mutex) - Provides an API similar to the Rust standard library's, but for async purposes - Has a built-in blocking-call detection mechanism - Compatible with other popular crates (e.g. Tokio, Bastion)
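    A tiny sketch of async-std's task model (an assumed example; the async-std crate with default features is the only dependency):

    use async_std::task;

    fn main() {
        task::block_on(async {
            // task::spawn works much like thread::spawn, but for futures.
            let handle = task::spawn(async { 1 + 1 });
            assert_eq!(handle.await, 2);
        });
    }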
  46. Actix Actor framework for concurrent applications in Rust Features: -

    Built on top of Tokio - Based on futures for asynchronous message handling - Actor communication in a local / thread context - Lightweight supervision support out-of-the-box - Typed messages (no Any type). Generic messages are allowed. - Built for extensible and fast applications Written at Microsoft by Nikolay Kim during his work on Azure IoT services. Open sourced in 2017. Battle-tested in production.
  47. Bastion A project that focuses on building highly-available, fault-tolerant runtime systems with

    a dynamic, dispatch-oriented, lightweight process model Features: - Actor model out-of-the-box - Built for utilizing all system resources efficiently - Heavily inspired by Erlang OTP / the Akka framework - Message-based communication between actors - Completely asynchronous runtime with a NUMA-aware and cache-affine SMP executor* - Supervisor support for managing an actor's lifecycle, and more…** *currently available with the “unstable” feature flag **ping me for more info, I am one of the core developers of the project ;)
  48. Books & links - [Chapter] The Rust Programming Language: Fearless

    concurrency - [Chapter] A Gentle Introduction To Rust: Threads, Networking and Sharing - Tokio.rs documentation: Getting started - Async programming in Rust
  49. Q&A