Slide 1

Slide 1 text

Compile-Time Social Coordination

Slide 2

Slide 2 text

A biologist,

Slide 3

Slide 3 text

a physicist,

Slide 4

Slide 4 text

and a Rust programmer

Slide 5

Slide 5 text

are observing a house.

Slide 6

Slide 6 text

Two people go into the house.

Slide 7

Slide 7 text

A while later,

Slide 8

Slide 8 text

three people exit the house. The biologist says…

Slide 9

Slide 9 text

"They reproduced."

Slide 10

Slide 10 text

The physicist says…

Slide 11

Slide 11 text

"There was an error in the measurement."

Slide 12

Slide 12 text

The Rust programmer says, "There are now

Slide 13

Slide 13 text

-1 -1 persons in the house." Three people, three different sets of assumptions about how the world works.

Slide 14

Slide 14 text

I've been programming for a long time. More than 20 years now. Over that period, I've held several jobs across various unrelated industries, with diverse teams, and used many programming languages. Each scenario came with its own unique problems. But, in every environment, the same issue has come up over and over again.

Slide 15

Slide 15 text

That issue was maintaining consistency in code written by different people and at different times.

Slide 16

Slide 16 text

A well-architected, well-implemented program is internally consistent. There are patterns and design choices that are adhered to in remote locations across the code. When done well, it's a beautiful thing. But if you change or add code, how do you keep consistency with the design choices already made?

Slide 17

Slide 17 text

Let me give you an example of what can go wrong when consistency is not maintained.

Slide 18

Slide 18 text

klass Poison(entity) onTick() entity.health -= 1 Here's a fictional game component in a made-up, maximally terse, and permissive language called HazardLang.

Slide 19

Slide 19 text

klass Poison(entity) onTick() entity.health -= 1 We have a Poison class, which encapsulates game component logic.

Slide 20

Slide 20 text

klass Poison(entity) onTick() entity.health -= 1 And an event that damages the entity with each tick of the game loop by subtracting 1 from its health.

Slide 21

Slide 21 text

klass Poison(entity) onTick() entity.health -= 1 Can you spot the bug? It's a trick question. The bug can't be seen when looking at this code in isolation. The bug only manifests when this code is used within a system of components that make different sets of assumptions. Let's look at another component that, in practice, would be in another file.

Slide 22

Slide 22 text

klass Missile onCollide(entity) lock(entity) entity.health -= 5 klass Poison(entity) onTick() entity.health -= 1 Being in another file may make it less likely that you ever see the code side by side.

Slide 23

Slide 23 text

klass Missile onCollide(entity) lock(entity) entity.health -= 5 klass Poison(entity) onTick() entity.health -= 1 This missile class

Slide 24

Slide 24 text

klass Missile onCollide(entity) lock(entity) entity.health -= 5 klass Poison(entity) onTick() entity.health -= 1 has an event, onCollide

Slide 25

Slide 25 text

klass Missile onCollide(entity) lock(entity) entity.health -= 5 klass Poison(entity) onTick() entity.health -= 1 that damages any entity it collides with.

Slide 26

Slide 26 text

klass Missile onCollide(entity) lock(entity) entity.health -= 5 klass Poison(entity) onTick() entity.health -= 1 The difference between this component and the previous

Slide 27

Slide 27 text

klass Missile onCollide(entity) lock(entity) entity.health -= 5 klass Poison(entity) onTick() entity.health -= 1 is that it locks the entity before modifying its state.

Slide 28

Slide 28 text

klass Missile onCollide(entity) lock(entity) entity.health -= 5 klass Poison(entity) onTick() entity.health -= 1 There is no obvious problem with either the Poison or Missile component taken individually. Only by looking at both simultaneously can we see that each makes different assumptions about the system they belong to. One locks the entity before modifying its state, and the other does not.

Slide 29

Slide 29 text

klass Missile onCollide(entity) lock(entity) entity.health -= 5 klass Poison(entity) onTick() entity.health -= 1 Which one is correct? This is another trick question. No component by itself can be either correct or incorrect. The whole program is only correct if every component agrees with the same set of assumptions as every other component. The Poison component may assume that the system has two copies of the state, one copy for updating, while another thread looks at a read-only copy for rendering. The Missile component may assume the system allows concurrent entity updates.

Slide 30

Slide 30 text

The larger the program, the more chances there are for inconsistencies. Since any line could create an inconsistency with any other line, the number of potential inconsistencies grows at least quadratically with the program's size. This is why large-scale development is difficult. Over the last 20 years, much of my attention has been devoted to cultivating practices for scaling codebases without creating inconsistency. What did I learn?

Slide 31

Slide 31 text

Cease development, read all 27 million lines of the program and its dependencies. After a month, when you fully understand how each line interacts with every other line, make a small change and hope nobody else touched anything over that time. Repeat until you go out of business. I'm being sarcastic. What are some practical things we do?

Slide 32

Slide 32 text

We write comments! Comments help, but not enough. The architecture is spread out all over the code. To explain how every line conforms with the overall design would be redundant, so people don't do that. Worse, if you are writing new code, any relevant comments would be, by definition, somewhere else where you can't see them. You can't have new code agree with the comments if you don't know the comments exist.

Slide 33

Slide 33 text

We gain experience and share tribal knowledge! It helps, but not enough. One way we disseminate lessons learned over time is by communicating "best practices." An example of a best practice is "No global variables." But, nobody agrees on what the best practices should be. The reason nobody agrees is that everything is situational. If you are writing embedded software, global variables may indeed be the best answer to a problem.

Slide 34

Slide 34 text

So, best practices are more like "ideas to consider because I got into trouble a few times." As such, best practices are not a solution to figuring out what constraints apply to the code you are writing right now. The neglected company wiki is not the answer, nor is code review. I could go on, but that's not the point.

Slide 35

Slide 35 text

Here is the point. The more I consider the problem of maintaining consistency, the more I am convinced that the best return on our investment as a discipline is in the compiler and compiler-aided solutions. Unlike communication or mentorship, the compiler scales to any team size, giving personalized advice to every contributor exactly when they need it. Unlike a company wiki, the compiler cannot be ignored or out of date. Unlike some random blog post on the internet, the compiler has complete knowledge of your project's source.

Slide 36

Slide 36 text

For the remainder of the time, I want to tell you how Rust's compiler and standard API work together to create a pit of success for compile-time social coordination, starting with the locked entity example. We'll also look at how you can leverage the same features in your libraries to create pits of success of your own.

Slide 37

Slide 37 text

let mutex = Mutex::new(entity); Let's look at how a Mutex is used in Rust, preventing the problem we saw with the game components. The first line moves the entity into the Mutex.

Slide 38

Slide 38 text

let mutex = Mutex::new(entity); entity.health -= 1; Now, if we try to access the entity data without locking, the whole program fails to compile. The compiler will tell us that the entity cannot be accessed because it’s been moved into the Mutex. So, we roll that back.

Slide 39

Slide 39 text

let mutex = Mutex :: new(entity); [drink]

Slide 40

Slide 40 text

let mutex = Mutex::new(entity); mutex.lock().unwrap().health -= 1; Only by locking the Mutex can we modify the underlying data. Past the second line, the lock is no longer held, and once again the data in the entity is not accessible. For some of you, this is not new information.
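The two slide lines can be expanded into a small program. This is a minimal sketch, assuming a hypothetical `Entity` struct with a single `health` field (the talk never shows its definition) and a hypothetical helper named `poison_tick`. Only `Mutex::new` and `lock` come from the standard library.

```rust
use std::sync::Mutex;

// Hypothetical entity type; the talk does not show its definition.
struct Entity {
    health: i32,
}

// Applies one tick of poison damage and returns the remaining health.
fn poison_tick(entity: &Mutex<Entity>) -> i32 {
    // The guard returned by lock() is the only path to the data.
    let mut guard = entity.lock().unwrap();
    guard.health -= 1;
    guard.health
    // The guard is dropped at the end of the function, releasing the lock.
}

fn main() {
    let entity = Entity { health: 10 };
    let mutex = Mutex::new(entity); // `entity` is moved into the Mutex here

    // Uncommenting the next line stops the program from compiling,
    // because `entity` no longer owns the data:
    // entity.health -= 1;

    println!("health after one tick: {}", poison_tick(&mutex));
}
```

Because the lock guard is the only handle to the `Entity`, a component that forgets to lock has nothing to modify.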

Slide 41

Slide 41 text

Of course, a Mutex would own its data in Rust. Of course, you cannot modify data while the lock is not held. But I want you to rediscover the joy and quiet brilliance of this API. Some people new to Rust may be hearing it for the first time. I am confident they didn't hear it before Rust because this API is impossible in C++, C#, Python, JavaScript, Java, PHP, Haskell, Go, Swift, or any other mainstream language.

Slide 42

Slide 42 text

People coming from these languages may have recently spent time debugging a forgotten call to lock. Or, they may have recently stored a reference to data protected by the lock to use later on the UI thread. Or maybe they locked the wrong lock, or forgot to release the lock. None of these bugs are possible in Rust.

Slide 43

Slide 43 text

We can go even further. The consistency guarantee afforded by the Mutex API allows us to make some wild optimizations that would never get past code review in any other language.

Slide 44

Slide 44 text

There is an innovative API on Mutex, get_mut, which takes a unique reference to self and returns a wrapper over a unique reference to our data. The comment for get_mut says “Since this call borrows the Mutex mutably NO ACTUAL LOCKING needs to take place - the mutable borrow statically guarantees no locks exist.”

Slide 45

Slide 45 text

So there is a way to get access to the data in the entity without locking, but it’s checked by the compiler! Whoa. Imagine committing some code in another language with a comment stating a lock does not have to be acquired because we know nothing else has access to the data at this point.
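A short sketch of that lock-free access, again assuming a hypothetical `Entity` struct and a hypothetical helper named `heal_exclusive`. The only standard-library call it relies on is `Mutex::get_mut`, which takes `&mut self` and returns a `LockResult` wrapping `&mut T`.

```rust
use std::sync::Mutex;

// Hypothetical entity type, as in the earlier slides.
struct Entity {
    health: i32,
}

// Takes the Mutex by unique reference. While this borrow is alive, the
// compiler guarantees that no other reference to the Mutex exists, so no
// other thread can be holding or waiting on the lock.
fn heal_exclusive(entity: &mut Mutex<Entity>, amount: i32) -> i32 {
    // get_mut performs no locking; it hands back the inner data directly.
    let inner = entity.get_mut().unwrap();
    inner.health += amount;
    inner.health
}

fn main() {
    let mut mutex = Mutex::new(Entity { health: 1 });
    println!("health: {}", heal_exclusive(&mut mutex, 4));
    // Ordinary locked access still works afterwards.
    println!("health via lock: {}", mutex.lock().unwrap().health);
}
```

If a later change shares the Mutex with another thread, the `&mut` borrow can no longer be taken and the call stops compiling, so the assertion cannot silently go stale.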

Slide 46

Slide 46 text

First, you would not be trusted to make this assertion. Even if your claim were verified, the change would likely be rejected in review because we do not know if the assertion will hold *in the future*. This is again the same social coordination problem playing out across time.

Slide 47

Slide 47 text

Because the social coordination problem is so hard, we've become accustomed to the habit of engineering sub-optimal software to avoid making mistakes. We've even invented entire architectures that are elaborate kid gloves hiding the important part of the problem we are actually trying to solve - transforming the data.

Slide 48

Slide 48 text

What at first glance appears to be a restriction imposed by the compiler actually grants us the freedom to write better software without being constrained by the limitations of communication and shared knowledge between its authors.

Slide 49

Slide 49 text

funk serialize(names, values, file) props = zip(names, values) foreach props => prop file.write(tag(prop.name, names)) file.write(prop.value) Let's look at another example in HazardLang. This time, of serialization.

Slide 50

Slide 50 text

funk serialize(names, values, file) props = zip(names, values) foreach props => prop file.write(tag(prop.name, names)) file.write(prop.value) Here we have a generic serialization function in HazardLang.

Slide 51

Slide 51 text

funk serialize(names, values, file) props = zip(names, values) foreach props => prop file.write(tag(prop.name, names)) file.write(prop.value) It takes a collection of names and values to serialize

Slide 52

Slide 52 text

funk serialize(names, values, file) props = zip(names, values) foreach props => prop file.write(tag(prop.name, names)) file.write(prop.value) zips them together to form a property collection

Slide 53

Slide 53 text

funk serialize(names, values, file) props = zip(names, values) foreach props => prop file.write(tag(prop.name, names)) file.write(prop.value) loops over each property,

Slide 54

Slide 54 text

funk serialize(names, values, file) props = zip(names, values) foreach props => prop file.write(tag(prop.name, names)) file.write(prop.value) then writes a tag for each name

Slide 55

Slide 55 text

funk serialize(names, values, file) props = zip(names, values) foreach props => prop file.write(tag(prop.name, names)) file.write(prop.value) followed by a write of the corresponding value to the file.

Slide 56

Slide 56 text

funk serialize(names, values, file) props = zip(names, values) foreach props => prop file.write(tag(prop.name, names)) file.write(prop.value) Where's the bug? The theme of this talk is bugs that come about through poor social coordination. Bugs that don't fit on one screen. Let's take a look at another file.

Slide 57

Slide 57 text

// Require: Consistent ordering // of names for schema funk tag(names, name) names.indexOf(name) funk serialize(names, values, file) props = zip(names, values) foreach props => prop file.write(tag(prop.name, names)) file.write(prop.value) The scheme used here by HazardSerializer is to have the index in the schema serve as the tag. There is a comment stating that the schema needs to be consistent. Ok. No bug yet, but our story is not over. Remember this part about the schema needing to be consistent.

Slide 58

Slide 58 text

funk writeCustomer(customer, file) schema = [“name”, “date”] file.write(tag(“name”, schema)) file.write(customer.name) file.write(tag(“date”, schema)) file.write(customer.date) One day the original developer leaves and another developer is hired to replace them. The new developer writes this usage of the HazardSerializer. Maybe they didn’t notice the utility function the original developer wrote, so they do some things by hand.

Slide 59

Slide 59 text

funk writeCustomer(customer, file) schema = [“name”, “date”] file.write(tag(“name”, schema)) file.write(customer.name) file.write(tag(“date”, schema)) file.write(customer.date) They set up their schema with a predetermined consistent order - just like the docs said to.

Slide 60

Slide 60 text

funk writeCustomer(customer, file) schema = [“name”, “date”] file.write(tag(“name”, schema)) file.write(customer.name) file.write(tag(“date”, schema)) file.write(customer.date) And they write tagged property names and values, just like they are supposed to!

Slide 61

Slide 61 text

funk writeCustomer(customer, file) schema = [“name”, “date”] file.write(tag(“name”, schema)) file.write(customer.name) file.write(tag(“date”, schema)) file.write(customer.date) So the new dev tries their usage of HazardSerializer and writes a file. But when they take a look in the app, the data is all wrong. They ask around and learn that, because there was so much trouble maintaining consistency in the schema between the server and the client, the convention among the team is to use alphabetical ordering in the schemas.

Slide 62

Slide 62 text

funk writeCustomer(customer, file) schema = [“name”, “date”] file.write(tag(“name”, schema)) file.write(customer.name) file.write(tag(“date”, schema)) file.write(customer.date) Here it’s using “name”, “date” for the writer, but the reader (which is implemented somewhere else) uses “date”, “name”. Ugh. The new dev, being the proactive sort and ready to make a good impression, decides to fix this once and for all.

Slide 63

Slide 63 text

// Require: Consistent ordering // of names for schema funk tag(names, name) names.indexOf(name) funk tag(names, name) names.sort() names.indexOf(name) The dev makes this commit. The old version is on the top, and the new version is on the bottom. The dev added a call to names.sort(), which ensures that there is a consistent ordering that adheres to the internal convention of having alphabetical names. Their code now works. There is an added benefit that there are fewer requirements for calling this function, so this bug won’t be hit in the future! And we ship. Yay! Except…

Slide 64

Slide 64 text

funk tag(names, name) names.sort() names.indexOf(name) funk serialize(names, values, file) props = zip(names, values) foreach props => prop file.write(tag(prop.name, names)) file.write(prop.value) Except… nobody looked at these two pieces of code at the same time. The old serialize function that I showed you at the start, with the new “fixed” tag function. Do you know how many things a person can keep in their head at once? Three to four. So even in the few minutes I distracted you with the story about the new dev, we may have forgotten to consider how these would interact.

Slide 65

Slide 65 text

funk tag(names, name) names.sort() names.indexOf(name) funk serialize(names, values, file) props = zip(names, values) foreach props => prop file.write(tag(prop.name, names)) file.write(prop.value) So, what's the problem? Well, if you look at both sections at the same time, you can see that we are

Slide 66

Slide 66 text

funk tag(names, name) names.sort() names.indexOf(name) funk serialize(names, values, file) props = zip(names, values) foreach props => prop file.write(tag(prop.name, names)) file.write(prop.value) iterating over a list

Slide 67

Slide 67 text

funk tag(names, name) names.sort() names.indexOf(name) funk serialize(names, values, file) props = zip(names, values) foreach props => prop file.write(tag(prop.name, names)) file.write(prop.value) while modifying it. Oops. The program probably won't crash. Instead, it will write garbage data, which is arguably worse. At no point was this serialize function on the same screen as the new tag function. People changed different bits, fixing the problems they were aware of, creating local consistency but global inconsistency.

Slide 68

Slide 68 text

Remember that these examples are simplified. Real codebases are composed of huge directed graphs of function calls being mutated concurrently by multiple people. Even if two inconsistent nodes in that graph were just a few hops away, there could be hundreds of nodes reachable within the same distance. Finding the inconsistency is much like finding the needle in a haystack if you don't know a priori where to look. Our example may seem contrived, but the issue is common enough that a Google search for the phrase "don't iterate list while modifying" comes up with over 50 million results.

Slide 69

Slide 69 text

This time-lapse animation shows only the files and directory structure of a project that I worked on at The Graph to index blockchain data. If the nodes for structs and functions were included here, there would be more than 3500 more nodes on the screen and uncountably more connections. Yet, this is a modestly sized workspace maintained by a relatively small team. [drink]

Slide 70

Slide 70 text

Avoiding iterating over a list while modifying it in HazardLang requires social coordination. But it's a mistake that I've never made in Rust - even when working with other people! The compiler ensures for me that all the connected nodes in our call graph are consistent. In Rust, there are three ways to pass values.

Slide 71

Slide 71 text

&T You can use a shared reference to T. A shared reference is immutable, unless you implement special protections to guard against the problems that come with shared mutability. We call that “interior mutability”. But, the typical shared reference is read-only.
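As one illustration of interior mutability, the standard library's `Cell` allows a value to be updated through a shared reference in single-threaded code. The sketch below assumes a hypothetical `Entity` with a `reads` counter and a hypothetical function `read_health`; neither appears in the talk.

```rust
use std::cell::Cell;

// Hypothetical entity: `health` is plain data, while `reads` opts in to
// interior mutability so it can change behind a shared reference.
struct Entity {
    health: i32,
    reads: Cell<u32>,
}

// Receives only a shared reference, so `health` is read-only here.
fn read_health(entity: &Entity) -> i32 {
    // Allowed: Cell guards the mutation (it is not Sync, so the
    // compiler keeps it on a single thread).
    entity.reads.set(entity.reads.get() + 1);
    // entity.health -= 1; // would not compile: `entity` is a `&` reference
    entity.health
}

fn main() {
    let entity = Entity { health: 7, reads: Cell::new(0) };
    read_health(&entity);
    read_health(&entity);
    println!("health {} read {} times", entity.health, entity.reads.get());
}
```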

Slide 72

Slide 72 text

&T &mut T There is also &mut T (unique reference to T), which enables the permission to write to the value.

Slide 73

Slide 73 text

&T &mut T T And finally, T - which transfers ownership of the value. Using these three types, we make explicit what are hidden contracts in HazardLang. The Rust compiler can verify these contracts. Let's see how it does that.
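The three forms can be seen side by side in function signatures. This is a small sketch with hypothetical function names (`total_len`, `sort_names`, `into_line`) chosen for illustration; each signature states the contract that a comment would have to carry in HazardLang.

```rust
// Shared reference: the function may read the names but not change them.
fn total_len(names: &[String]) -> usize {
    names.iter().map(|n| n.len()).sum()
}

// Unique reference: the function may modify the names, and the caller
// cannot have any other reference to them during the call.
fn sort_names(names: &mut [String]) {
    names.sort();
}

// By value: ownership moves into the function; the caller gives the list up.
fn into_line(names: Vec<String>) -> String {
    names.join(",")
}

fn main() {
    let mut names = vec![String::from("name"), String::from("date")];
    println!("total length: {}", total_len(&names));
    sort_names(&mut names);
    let line = into_line(names);
    // `names` can no longer be used here; it was moved into `into_line`.
    println!("{line}");
}
```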

Slide 74

Slide 74 text

Our serializer bug boiled down to iterating over a list while modifying it. If we try the same in Rust, we get a compiler error:

Slide 75

Slide 75 text

for _ in names.iter() { names.sort(); } This is the distilled version of the bug.

Slide 76

Slide 76 text

for _ in names.iter() { names.sort(); } Here at names.iter, a temporary is constructed that maintains the state of the iterator. The iterator holds a shared reference to names.

Slide 77

Slide 77 text

for _ in names.iter() { names.sort(); } But on the following line, the call to sort takes names by unique reference. It is a contradiction for a reference to be both unique and shared at the same time! Contradictions are bugs. What's neat here is that Rust will detect this contradiction through any number of layers and structs and function calls so that even if the code is not simple like in the example, it will not compile with the inconsistency. The bug we introduced in the serializer cannot occur in Rust because the whole program would fail to compile.
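The sketch below keeps the rejected loop as a comment and shows one way to restructure it so that the unique borrow ends before any shared borrow begins. The function name `sorted_tags` and its return type are assumptions made for illustration; they are not the serializer from the talk.

```rust
// Sorts the schema once, then assigns each name its index as a tag.
fn sorted_tags(names: &mut Vec<String>) -> Vec<(String, usize)> {
    // The distilled bug is rejected by the borrow checker (error E0502):
    // the iterator holds a shared borrow while sort() needs a unique one.
    //
    // for _ in names.iter() {
    //     names.sort();
    // }

    // Sorting first is accepted: the unique borrow ends with this statement.
    names.sort();

    // Only shared access from here on.
    names
        .iter()
        .enumerate()
        .map(|(index, name)| (name.clone(), index))
        .collect()
}

fn main() {
    let mut schema = vec![String::from("name"), String::from("date")];
    for (name, tag) in sorted_tags(&mut schema) {
        println!("{name} -> {tag}");
    }
}
```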

Slide 78

Slide 78 text

We’re almost done. I promised to show you how to use the compiler to enforce your own rules that would otherwise require social coordination. We will use the type system and compile errors to teach new developers architectural decisions. This is going to be a bit of a doozy. So please prepare yourself by enjoying this picture.

Slide 79

Slide 79 text

Ok. Let’s go.

Slide 80

Slide 80 text

First, let's set up the situation. At Edge & Node, we write multi-threaded web servers that each serve thousands of requests every second. These servers read data from a database. The database has a connection limit, and connections take time to set up, so to avoid going over the limit or incurring the setup cost on every request, we use a connection pool.

Slide 81

Slide 81 text

One day we notice that a server instance stops serving requests. There is no warning. Everything stops, CPU usage flatlines, no disk usage, no queries served. If we restart, everything is ok again, until the next time it happens. Why? Note that at this point, you're looking at an issue that's going to be difficult to debug. It's a real server with lots of code. Customers are panicked. And the issue only occurs once every few days under vast amounts of load. There is no stack trace in the logs or a smoking gun of any kind.

Slide 82

Slide 82 text

let connection = get_connection(); connection.write_data(&data); emit_event(&Event :: SavedData); Here’s the bug we found.

Slide 83

Slide 83 text

let connection = get_connection(); connection.write_data(&data); emit_event(&Event :: SavedData); First, get a connection from the pool.

Slide 84

Slide 84 text

let connection = get_connection(); connection.write_data(&data); emit_event(&Event :: SavedData); Then save the data using the connection.

Slide 85

Slide 85 text

let connection = get_connection(); connection.write_data(&data); emit_event(&Event :: SavedData); And, lastly, emit an event to notify any listeners that there is new data available.

Slide 86

Slide 86 text

let connection = get_connection(); connection.write_data(&data); emit_event(&Event::SavedData); Where's the bug? It's not here. It's in the interaction of this code with other code. Let's look at some more code from a file far, far away.

Slide 87

Slide 87 text

let connection = get_connection(); connection.write_data(&data); emit_event(&Event :: SavedData); fn emit_event(kind: &Event) { let connection = get_connection(); connection.write_event(kind); } The code in emit_event also appears straightforward. Our PubSub goes through the database,

Slide 88

Slide 88 text

let connection = get_connection(); connection.write_data(&data); emit_event(&Event :: SavedData); fn emit_event(kind: &Event) { let connection = get_connection(); connection.write_event(kind); } so we grab a connection

Slide 89

Slide 89 text

let connection = get_connection(); connection.write_data(&data); emit_event(&Event :: SavedData); fn emit_event(kind: &Event) { let connection = get_connection(); connection.write_event(kind); } and write an event to it.

Slide 90

Slide 90 text

let connection = get_connection(); connection.write_data(&data); emit_event(&Event :: SavedData); fn emit_event(kind: &Event) { let connection = get_connection(); connection.write_event(kind); } There is no obvious bug in this code either!

Slide 91

Slide 91 text

The problem is that when emit_event is called, we are already holding a connection from the connection pool. The connection held is not returned to the pool until it is dropped after emit_event returns. In normal circumstances, this is ok. The second connection is acquired, and then both are released. First, the connection in emit_event is released, then the outer connection is released. But, rarely, if 50 requests hit emit_event simultaneously, they are already holding the limit of 50 connections, so the call to get_connection within emit_event never returns because the connection pool is empty.
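The starvation can be modeled deterministically in a few lines. This is a single-threaded sketch, not the server's real pool: `Pool`, `try_get`, and `save_and_emit` are hypothetical names, and `try_get` returns `None` at the point where a blocking pool would wait forever.

```rust
use std::cell::Cell;

// Toy pool that only counts how many connections are still available.
struct Pool {
    available: Cell<usize>,
}

// A checked-out connection; it returns itself to the pool on drop.
struct Connection<'p> {
    pool: &'p Pool,
}

impl Pool {
    fn new(limit: usize) -> Self {
        Pool { available: Cell::new(limit) }
    }

    // Returns None when the pool is exhausted. A blocking pool would
    // wait here instead, which is where the real server hung.
    fn try_get(&self) -> Option<Connection<'_>> {
        let free = self.available.get();
        if free == 0 {
            return None;
        }
        self.available.set(free - 1);
        Some(Connection { pool: self })
    }
}

impl Drop for Connection<'_> {
    fn drop(&mut self) {
        self.pool.available.set(self.pool.available.get() + 1);
    }
}

// Models emit_event: it needs a connection of its own.
fn emit_event(pool: &Pool) -> bool {
    pool.try_get().is_some()
}

// Models the request handler: holds one connection across emit_event.
fn save_and_emit(pool: &Pool) -> bool {
    let _outer = pool.try_get().expect("pool has at least one connection");
    emit_event(pool)
}

fn main() {
    // With spare capacity the nested acquisition succeeds.
    println!("limit 2: {}", save_and_emit(&Pool::new(2)));
    // When every slot is held by an outer caller, it cannot succeed.
    println!("limit 1: {}", save_and_emit(&Pool::new(1)));
}
```

A limit of 1 stands in for the moment when all 50 slots are held by outer callers.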

Slide 92

Slide 92 text

Since emit_event never returns, none of the outer connections are returned to the pool, and the whole request pipeline is deadlocked across all threads. In order to understand this bug, you have to be aware of very specific architectural details. You have to know that connections are pooled. You have to know that events go through the database. You have to know that connections return to the pool on drop.

Slide 93

Slide 93 text

You have to know everything that those disparate details imply. And you have to know for any function you are writing that no caller of your function holds a connection if any call you make might try to acquire one. So you have to know all the code up and down the stack at all points and keep all this in your head on top of thinking about whatever problem you are actually trying to solve. Since interacting with the database happens in many places, there is a large surface area of code susceptible to this bug.

Slide 94

Slide 94 text

At this moment of discovery, experience and tribal knowledge are formed. The developer working on the bug says, "Aha! It is incorrect, in the general case, to hold on to a pooled resource and ask for another from the same pool. Doing so will always eventually deadlock." They share this information with the team, write a blog post, add a comment to get_connection, and go on a quest to stamp out every instance of this bug they can find. But this bug is really subtle. It's easy to miss even when you know what to look for because you have to analyze regions of a directed graph of function calls.

Slide 95

Slide 95 text

let connection = get_connection(); connection.write_data(&data); emit_event(&Event :: SavedData); fn emit_event(kind: &Event) { let connection = get_connection(); connection.write_event(kind); } ] You have to look at the scope of the liveness of the connection. In this case, the scope is between get_connection past the end of emit_event. You have to see what function calls overlap with that scope.

Slide 96

Slide 96 text

let connection = get_connection(); connection.write_data(&data); emit_event(&Event :: SavedData); fn emit_event(kind: &Event) { let connection = get_connection(); connection.write_event(kind); } ] Then, traverse the directed graph of calls until you visit every reachable node and verify that none of those nodes attempts to get a connection.

Slide 97

Slide 97 text

let connection = get_connection(); connection.write_data(&data); emit_event(&Event :: SavedData); fn emit_event(kind: &Event) { let connection = get_connection(); connection.write_event(kind); } ] ?

Slide 98

Slide 98 text

let connection = get_connection(); connection.write_data(&data); emit_event(&Event :: SavedData); fn emit_event(kind: &Event) { let connection = get_connection(); connection.write_event(kind); } ] ? ?

Slide 99

Slide 99 text

let connection = get_connection(); connection.write_data(&data); emit_event(&Event :: SavedData); fn emit_event(kind: &Event) { let connection = get_connection(); connection.write_event(kind); } ] ? ? ?

Slide 100

Slide 100 text

let connection = get_connection(); connection.write_data(&data); emit_event(&Event :: SavedData); fn emit_event(kind: &Event) { let connection = get_connection(); connection.write_event(kind); } ] ? ? ? Even if you get it right, someone can make a minor edit in the middle of the graph in the future. This edit may completely change the set of reachable nodes. They may be unaware that upstream a connection is held while downstream a new connection is obtained because from where the edit is made, they may see neither.

Slide 101

Slide 101 text

So here is this bug. It’s hard to detect, easy to create, is not fixable via architecture, and hurts users in production.

Slide 102

Slide 102 text

It's time for Compile-Time Social-Coordination! What we want to do as leaders is to take our learnings about the resource starvation hazard inherent to pooled resources and encode those learnings in the type system so that the compiler can then teach the rest of the team at scale and at the appropriate time.

Slide 103

Slide 103 text

The rule that we want to enforce to prevent this bug from coming up again is that each request may hold up to one connection, but never more during the scope of the request.

Slide 104

Slide 104 text

struct Token { ... } The solution is to create a Token representing the permission to obtain a connection from the pool. We can ensure this permission is granted once per request by making the Token constructor private.

Slide 105

Slide 105 text

struct Token { ... } fn get_connection<'a>(token: &'a mut Token) -> Connection<'a> { ... } get_connection is then amended

Slide 106

Slide 106 text

struct Token { ... } fn get_connection<'a>(token: &'a mut Token) -> Connection<'a> { ... } to take a unique reference to that token. What that does is tie the unique loan of the token to the loan of the connection from the pool.

Slide 107

Slide 107 text

struct Token { ... } fn get_connection<'a>(token: &'a mut Token) -> Connection<'a> { ... } The connection is returned to the pool on drop, so only by returning the connection to the pool can we release our loan of the token.

Slide 108

Slide 108 text

struct Token { ... } fn get_connection<'a>(token: &'a mut Token) -> Connection<'a> { ... } What is the effect of this change?

Slide 109

Slide 109 text

let connection = get_connection(); connection.write_data(&data); emit_event(&Event :: SavedData); Returning to the callsite of emit_event

Slide 110

Slide 110 text

let connection = get_connection(&mut token); connection.write_data(&data); emit_event(&Event::SavedData); we now need to pass our token into get_connection.

Slide 111

Slide 111 text

let connection = get_connection(&mut token); connection.write_data(&data); emit_event(&Event::SavedData, &mut token); And we need to pass our token into emit_event for it to be able to obtain a connection. But now, this won’t compile because the connection is still alive when we try to borrow the token the second time.

Slide 112

Slide 112 text

let connection = get_connection(&mut token); connection.write_data(&data); drop(connection); emit_event(&Event::SavedData, &mut token); The compiler forces us to add this line to return the connection to the pool before calling emit_event. This fixes the bug, not just here, but everywhere it might exist in the source now and in the future.

Slide 113

Slide 113 text

That's it. A dozen lines of code to set up the rules and the graph traversal search for inconsistency is now mechanically executed by the compiler, removing the error-prone and easily forgotten work from the developer. As a bonus, the Token is removed at compile time. There is no heap allocation or any runtime cost at all.
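Putting the slides together, one possible shape for the whole pattern is sketched below. The slides elide the bodies of `Token` and `get_connection`, so everything inside them here is an assumption: the `pool` module, the `with_request` entry point, the `LIVE` counter standing in for a real pool, and the `write` method are all hypothetical.

```rust
mod pool {
    use std::marker::PhantomData;
    use std::sync::atomic::{AtomicUsize, Ordering};

    // Stand-in for the real pool: counts connections currently checked out.
    static LIVE: AtomicUsize = AtomicUsize::new(0);

    // Permission to hold one connection. The private field means code
    // outside this module cannot construct a Token.
    pub struct Token {
        _private: (),
    }

    // A connection that keeps the Token uniquely borrowed while it lives.
    pub struct Connection<'a> {
        _loan: PhantomData<&'a mut Token>,
    }

    // Hypothetical request entry point: grants exactly one Token per request.
    pub fn with_request<R>(handler: impl FnOnce(&mut Token) -> R) -> R {
        let mut token = Token { _private: () };
        handler(&mut token)
    }

    pub fn get_connection<'a>(_token: &'a mut Token) -> Connection<'a> {
        LIVE.fetch_add(1, Ordering::SeqCst);
        Connection { _loan: PhantomData }
    }

    impl Connection<'_> {
        // Pretends to write; returns how many connections are checked out.
        pub fn write(&self, _payload: &str) -> usize {
            LIVE.load(Ordering::SeqCst)
        }
    }

    // Dropping the connection returns it to the pool and ends the loan.
    impl Drop for Connection<'_> {
        fn drop(&mut self) {
            LIVE.fetch_sub(1, Ordering::SeqCst);
        }
    }
}

use pool::{get_connection, with_request, Token};

fn emit_event(kind: &str, token: &mut Token) -> usize {
    let connection = get_connection(token);
    connection.write(kind)
}

fn save_data(data: &str, token: &mut Token) -> (usize, usize) {
    let connection = get_connection(token);
    let live_during_write = connection.write(data);
    // Removing this drop makes the next line fail to compile, because
    // `token` would still be uniquely borrowed by `connection`.
    drop(connection);
    let live_during_event = emit_event("SavedData", token);
    (live_during_write, live_during_event)
}

fn main() {
    let live = with_request(|token| save_data("payload", token));
    println!("connections held during write and event: {:?}", live);
}
```

In this sketch both `Token` and the `PhantomData` marker are zero-sized, and the `Drop` implementation is what keeps the loan of the token alive until the connection is actually released.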

Slide 114

Slide 114 text

All of these problems: locking entity data, modifying a list while iterating over it, and resource starvation have two things in common. One is that they happen all the time. The second is that they are all a part of a broader class of problems fixed as a natural consequence of

Slide 115

Slide 115 text

Rust's superpower: the borrow checker. In fact, solutions to many other common social coordination problems, like memory management (if I pass a pointer to your library, whose responsibility is it to free that memory?), safe global variables, high-performance non-defensive code, security, and even WASM support, are all underpinned by the borrow checker.

Slide 116

Slide 116 text

The borrow checker is the beating heart of Rust and is why I use Rust. You could say I use Rust because of its safety, performance, WebAssembly support, productivity, excellent tooling, supportive community, its empowerment, etc. All true. But... none of these I consider differentiators. They are important, but are literally the minimum bar. I have no use for any language where I cannot write programs with excellent runtime performance or which does not compile to the platforms I care about, for example. Among the small set of languages that meet the minimum standard, I ask what sets them apart. For me, it is lifetimes and the borrow checker.

Slide 117

Slide 117 text

Lifetimes have a reputation for being hard to learn. These fears aren't entirely misguided. Learning Rust is hard! It took me longer to learn Rust than any other language. But, there is a narrative being perpetuated in our community about the borrow checker being difficult, that people “fight the borrow checker”. While true from a certain lens, I believe this narrative to be misguided and counterproductive. Do you want to know what was harder than learning lifetimes?

Slide 118

Slide 118 text

Learning the same lessons through 20 years of making preventable mistakes.

Slide 119

Slide 119 text

The whole result is refreshing because there is a single unifying concept that provides a benefit across almost all APIs. The accumulation of many small wins adds up. If you want to know in a sentence what’s so important here, it’s that there is finally a language that both has a string concatenation method, and I’m not afraid to use it.

Slide 120

Slide 120 text

At the risk of being hyperbolic, I believe that the borrow checker has rendered obsolete much of the knowledge that I've gained over the past 20 years. And I think we haven't even seen yet how far this experiment will go. Suppose the future of programming can shed defensive architectural patterns, endless debugging, passing on best practices and tribal knowledge manually, and learn to love one concept - that of lifetimes. In that case, we will see farther and accomplish more than our predecessors.

Slide 121

Slide 121 text

If you are not yet using Rust, that is the tradeoff that I present to you. The choice is now yours.

Slide 122

Slide 122 text

[email protected] edgeandnode.com If you liked the idea of solving social coordination at compile time, you may also enjoy solving social coordination through incentive systems - which is one of the things I work on at Edge & Node while using Rust. If that sounds appealing to you, we are hiring Rust developers. You can contact me at [email protected] for any questions about this talk or what we're building. Thanks.