Slide 1

Slide 1 text

PWL NYC IRONFLEET (An extremely opinionated introduction)

Slide 2

Slide 2 text

About Me: Ines Sombra, @randommood

Slide 3

Slide 3 text

Our Journey TODAY: System verification, My IronFleet Pitch, Parting Thoughts

Slide 4

Slide 4 text

15”

Slide 5

Slide 5 text

The GOAL

Slide 6

Slide 6 text

Formal Methods “Scholarly” Testing The WAYS

Slide 7

Slide 7 text

Formal Methods
Correctness is tied to a specification (that you provide), & then you implement it.
FMs, when applied correctly, tend to result in systems with the highest integrity.
Usually targeted at smaller components.
High investment + high rewards.

Slide 8

Slide 8 text

Tedious, slow, and difficult to get right... (and then you still have to implement it)

Slide 9

Slide 9 text

Safety: Something bad never happens. Formal Methods find invariant violations to show if safety is not provided.
Liveness: Something good eventually happens. Ensuring liveness is critical, since a liveness bug may render the entire system unavailable.

Slide 10

Slide 10 text

Safety: Only need to reason about two system states at a time. A behavior is safe if each step between states preserves the system’s invariants.
Liveness: Reasoning about infinite series of system states. Challenging for automated theorem provers (timeouts are very likely).
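The "two states at a time" point about safety can be sketched in Python. This is a toy illustration, not anything from IronFleet: the lock system, `next_states`, and `invariant` are all hypothetical names invented for the example. The check looks only at initial states and at pairs of adjacent states, which is why safety proofs stay local.

```python
from itertools import product

# Hypothetical toy system: two hosts competing for one lock.
# A state is a pair (a_holds, b_holds) of booleans.

def init_states():
    return [(False, False)]

def next_states(state):
    """All states reachable from `state` in one step."""
    a, b = state
    steps = []
    if not a and not b:
        steps += [(True, False), (False, True)]  # either host acquires
    if a:
        steps.append((False, b))  # host A releases
    if b:
        steps.append((a, False))  # host B releases
    return steps

def invariant(state):
    """Safety property: both hosts never hold the lock at once."""
    a, b = state
    return not (a and b)

def is_inductive(all_states):
    """Base case on initial states, then one step at a time."""
    base = all(invariant(s) for s in init_states())
    step = all(
        invariant(t)
        for s in all_states if invariant(s)
        for t in next_states(s)
    )
    return base and step

ALL_STATES = list(product([False, True], repeat=2))
print(is_inductive(ALL_STATES))
```

Nothing comparable works for liveness: "eventually" cannot be confirmed or refuted by inspecting a single pair of states, so the argument has to cover entire infinite behaviors.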

Slide 11

Slide 11 text

Safety: Only need to reason about two system states at a time. A behavior is safe if each step between states preserves the system’s invariants.
Liveness: Reasoning about infinite series of system states. Challenging for automated theorem provers (timeouts are very likely). HARD!

Slide 12

Slide 12 text

I’m tired!

Slide 13

Slide 13 text

2015

Slide 14

Slide 14 text

Practical formal verification of (non-trivial) distributed systems is a long time away

Slide 15

Slide 15 text

Practical formal verification of (non-trivial) distributed systems is a long time away. Wrong?

Slide 16

Slide 16 text

IronFleet introduces A methodology that slices a system into specific layers to make verification of practical distributed system implementations feasible

Slide 17

Slide 17 text

High-level spec, Distributed protocol, Implementation. Plus Refinements!
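A rough Python sketch of what "refinement" between two layers means. Everything here is an assumption made for illustration: the abstract counter spec, the two-host protocol state `(count_a, count_b, msgs_sent)`, and the function names are hypothetical and do not come from the IronFleet code base. The idea is that a refinement function maps each lower-layer state to a higher-layer state, and every lower-layer step must correspond to either a legal higher-layer step or no visible change (a stutter).

```python
# Spec layer (hypothetical): one abstract counter that only increments by 1.
def spec_next(s, t):
    return t == s + 1

# Protocol layer (hypothetical): two hosts with local counts plus a message tally.
def protocol_steps(p):
    a, b, sent = p
    return [
        (a + 1, b, sent),   # host A increments
        (a, b + 1, sent),   # host B increments
        (a, b, sent + 1),   # a message is sent; invisible to the spec
    ]

def refine(p):
    """Refinement function: the abstract counter is the sum of local counts."""
    a, b, _ = p
    return a + b

def step_refines(p, q):
    """A protocol step must map to a spec step or to a stutter."""
    s, t = refine(p), refine(q)
    return spec_next(s, t) or s == t

def check_refinement(start, depth):
    """Bounded exploration only; a real proof covers all behaviors."""
    frontier = [start]
    for _ in range(depth):
        following = []
        for p in frontier:
            for q in protocol_steps(p):
                if not step_refines(p, q):
                    return False
                following.append(q)
        frontier = following
    return True

print(check_refinement((0, 0, 0), depth=4))
```

The sketch only explores a bounded number of steps, so it is a test rather than a proof. The same shape of argument applies again one layer down, between the implementation and the protocol.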

Slide 18

Slide 18 text

IronFleet, you want to read it: First system to mechanically verify liveness properties of a practical protocol & its implementation.

Slide 19

Slide 19 text

IronFleet, really, go read it: Proofs that reason all the way down to the bytes of the UDP packets sent on the network, guaranteeing correctness despite packet drops, reorderings, or duplications.

Slide 20

Slide 20 text

Two Distributed Systems
IRONRSL: Paxos-based replicated-state-machine library. Distribution for reliability. 18,200 requests/second.
IRONKV: Sharded key-value store. Distribution for improved throughput by moving “hot” keys to dedicated machines. 28,800 requests/second.
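To make "moving hot keys to dedicated machines" concrete, here is a minimal single-process Python sketch of key delegation. It is an assumption-laden toy: `Host`, `ShardedKV`, and `delegate` are hypothetical names, and the real IronKV hosts exchange network messages with reliable delivery rather than sharing memory.

```python
class Host:
    """Hypothetical host holding a slice of the key space."""
    def __init__(self, name):
        self.name = name
        self.store = {}

class ShardedKV:
    """Toy sharded store: keys live on a default host unless delegated."""
    def __init__(self, hosts):
        self.hosts = {h.name: h for h in hosts}
        self.default = hosts[0].name
        self.owner = {}  # key -> name of the host it was delegated to

    def _host_for(self, key):
        return self.hosts[self.owner.get(key, self.default)]

    def set(self, key, value):
        self._host_for(key).store[key] = value

    def get(self, key):
        return self._host_for(key).store.get(key)

    def delegate(self, key, new_owner):
        """Move a key, and its value if present, to another host."""
        old = self._host_for(key)
        new = self.hosts[new_owner]
        if key in old.store:
            new.store[key] = old.store.pop(key)
        self.owner[key] = new_owner

kv = ShardedKV([Host("h1"), Host("h2")])
kv.set("hot", "value")
kv.delegate("hot", "h2")
print(kv.get("hot"), kv.hosts["h1"].store)
```

The property worth noticing is that a client sees the same value before and after delegation, which is the kind of behavior a high-level key-value spec would pin down.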

Slide 21

Slide 21 text

Two Distributed Systems
IRONRSL GUARANTEES: Proved complete functional correctness & its key liveness property: if the network is eventually synchronous for a live quorum of replicas, then a client repeatedly submitting a request eventually receives a reply.
IRONKV GUARANTEES: Proved complete functional correctness & an important liveness property: if the network is fair, then the reliable-transmission component eventually delivers each message.

Slide 22

Slide 22 text

VERIFICATION OVERVIEW

Slide 23

Slide 23 text

High-level specification: the spec for IronRSL is 85 lines & the spec for IronKV is 34 lines!

Slide 24

Slide 24 text

Abstract distributed protocol
Introduces the concept of individual hosts that communicate only via network messages.
Prove that the distributed-protocol layer is a refinement of the top-level specification, using TLA-style techniques embedded in the Dafny language.

Slide 25

Slide 25 text

Implementation
Write single-threaded imperative code, in Dafny, to run on each host.
Prove that the host implementation refines the host state machine in the distributed protocol layer.
Show that a distributed system comprising N host implementations refines the distributed protocol of N hosts.

Slide 26

Slide 26 text

No content

Slide 27

Slide 27 text

No content

Slide 28

Slide 28 text

No content

Slide 29

Slide 29 text

The developer UX

Slide 30

Slide 30 text

The developer UX “Dafny provides near-real-time IDE-integrated feedback. As the developer writes a given method or proof, she typically sees feedback in 1–10 seconds indicating whether the verifier is satisfied...”

Slide 31

Slide 31 text

The developer UX “Our build system tracks dependencies across files and outsources, in parallel, each file’s verification to a cloud virtual machine. Thus, while a full integration build done serially requires approximately six hours, in practice, the developer rarely waits more than 6–8 minutes”
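A hedged Python sketch of the idea in that quote: track which files depend on which, then verify the affected files in parallel. The quote does not describe the mechanism, so treat all of this as assumption. `verify_file` is a stand-in for shipping a file to a remote verifier, and the `.dfy` file names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def dependents_of(changed, deps):
    """Files to re-verify after `changed` is edited (transitive dependents)."""
    dirty = {changed}
    grew = True
    while grew:
        grew = False
        for path, needs in deps.items():
            if path not in dirty and dirty & set(needs):
                dirty.add(path)
                grew = True
    return dirty

def verify_file(path):
    """Stand-in for a remote verification job; always succeeds here."""
    return path, True

def reverify(changed, deps, max_workers=8):
    """Verify every affected file concurrently and collect the results."""
    dirty = sorted(dependents_of(changed, deps))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(verify_file, dirty))

# deps maps each file to the files it depends on (hypothetical layout).
deps = {
    "spec.dfy": set(),
    "protocol.dfy": {"spec.dfy"},
    "impl.dfy": {"protocol.dfy"},
    "util.dfy": set(),
}
print(reverify("protocol.dfy", deps))
```

With this layout, editing the protocol file triggers work only for that file and the implementation that depends on it, which is how a six-hour serial build can shrink to minutes of waiting.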

Slide 32

Slide 32 text

“IronRSL (including replication, view changes, log truncation, batching, etc.) & IronKV (including delegation and reliable delivery) worked the first time we ran them.”

Slide 33

Slide 33 text

Want!

Slide 34

Slide 34 text

More IronFleet tricks: Used a TLA embedding to build a library of fundamental TLA proof rules verified from first principles… which is a useful artifact for proving liveness properties.

Slide 35

Slide 35 text

A few questions
What will our tests look like going forward?
Libraries of TLA+ methods for liveness?
Tricks for verifying imperative code, weird?

Slide 36

Slide 36 text

THANK YOU! github.com/Randommood/PWLNYC2016