Mutation Control for Concurrency

Colin Gordon (University of Washington), on work with Matt Parkinson (MSR), Jared Parsons (MS), Aleks Bromfield (MS), Joe Duffy (MS), Dan Grossman (UW), Mike Ernst (UW)
LaME 2013

Mutation Control
• Goal: Allow controlled mutation
• Fine-grained restrictions on permitted mutation
  – Includes immutability
• Part of language design
  – Primitives exploit mutation control
• Examples
  – Reference immutability, typestate, regions & effects, …

Reasoning About Mutation
• What can be updated?
• When can it be updated?
• What can it be updated to? How can it change?
• Which aliases / modules / threads can update?
Key problem: Answers to these questions are entirely implicit in most languages with mutation.

Design Space for Mutation Control
• Which properties can we constrain?
  – Who/What/When/Where
• How do we specify constraints?
• How precise are constraints?
• How much specification effort is required?
• Level of guarantees:
  – Non-interference vs. invariants vs. functional correctness

Secondary Desirable Properties
• Modest annotation burden
• Incremental guarantees
  – “Ordinary” code “just works”
• Local checks for rich properties
• Single toolset for sequential & concurrent
  – Avoid extra concurrency-only concepts
• Closely match mental models

Open Questions for Mutation Control
• Granularity of control
• Method of integration
• Specification burden
• Spec expressiveness
• Spectrum: Non-interference -> invariants -> functional correctness

Mutation Control for Concurrency
• Design space for mutation control
• Simple & Practical
  – Uniqueness and Reference Immutability for Safe Parallelism (OOPSLA’12)
• Desirable properties of reference immutability
• Expressive Foundations
  – Rely-Guarantee References (PLDI’13 + WIP)
• Open Questions

A Prototype Extension to C#
• Extend C# for safe parallelism
• Statically enforce data-race freedom
• Safe task & data parallelism
  – No locks
  – No mutable statics/globals
• Real: millions of LOC
  – Experience report later

Data-Race-Free, Parallel, Imperative OOP
• Prove soundness for
  – Data-race freedom
  – Reference immutability
  – External uniqueness
  – Borrowing (new technique)
  – Precise generics over permissions

Safe Parallelism Essentials
• Separate read and write effects
  – As in DPJ, Habanero Java, etc.
• Invariant: At all times, for all objects, either possible write access XOR read-only access
[Figure: diagram contrasting possible write access with read-only access]

Reference Immutability
• writable: "normal" reference
• readable: deeply read-only
  o Cannot get writable reference through a readable
  o x:readable ⊢ x.f : t ⇒ t is never writable
• immutable: permanently immutable
• As in Tschantz et al. ‘05, Zibin et al. ‘07, Zibin et al. ‘10
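
The rule that no writable reference can be obtained through a readable one can be approximated at run time. The following is only a minimal sketch, not the C# extension's mechanism: `Node`, `Readable`, and `ReadOnlyError` are hypothetical names, and the check happens dynamically rather than in a type system.

```python
class ReadOnlyError(Exception):
    """Raised when code tries to write through a readable view."""


class Node:
    """Ordinary mutable object, standing in for a writable reference."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


class Readable:
    """Hypothetical deeply read-only view over a graph of Node objects."""

    __slots__ = ("_target",)

    def __init__(self, target):
        # Bypass our own __setattr__, which rejects every write.
        object.__setattr__(self, "_target", target)

    def __getattr__(self, name):
        # Field reads succeed, but Node-valued results are wrapped again,
        # so a writable reference never escapes through a readable one.
        value = getattr(object.__getattribute__(self, "_target"), name)
        if isinstance(value, Node):
            return Readable(value)
        return value

    def __setattr__(self, name, value):
        raise ReadOnlyError(f"cannot write field {name!r} through a readable view")
```

A writable alias to the same nodes can still mutate them, and the view observes the change. That is the difference between readable and immutable on the slide.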

External Uniqueness
Unique reference: only reference to an object
Externally-Unique reference: only external reference into a group of objects
Isolated and Immutable can both reach some objects

Reference Immutability + Uniqueness
• writable: "normal" reference
• readable: deeply read-only
  o Cannot get writable reference through a readable
  o x:readable ⊢ x.f : t ⇒ t is never writable
• immutable: permanently immutable
• isolated: externally-unique reference

Symmetric Parallelism
List<X> pmap(List<X> l, Func1<X,X> f) {
    List<X> head = null;
    head = new List<X>;
    head.elem = f(l.elem);
    List<X> rest = null;
    if (l.next != null)
        rest = pmap(l.next, f);
    head.next = rest;
    return head;
}
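
As a rough illustration of the symmetric pattern, here is a sketch in Python that forks the recursive call into a thread while the current thread maps the head element. It assumes, as the slide does, that both branches only read the input list. Python checks none of this, and `Cell`, `from_list`, and `to_list` are hypothetical helpers rather than names from the talk.

```python
import threading
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Cell:
    elem: int
    next: Optional["Cell"] = None


def pmap(l: Cell, f: Callable[[int], int]) -> Cell:
    """Map f over a linked list, running the recursive call in its own thread."""
    rest_holder: List[Optional[Cell]] = [None]

    def map_rest() -> None:
        # Only reads the input list; all writes go to freshly allocated cells.
        rest_holder[0] = pmap(l.next, f)

    worker = None
    if l.next is not None:
        worker = threading.Thread(target=map_rest)
        worker.start()
    head = Cell(elem=f(l.elem))  # runs alongside the recursive call
    if worker is not None:
        worker.join()
    head.next = rest_holder[0]
    return head


def from_list(items: List[int]) -> Optional[Cell]:
    """Build a linked list from a Python list."""
    head = None
    for item in reversed(items):
        head = Cell(item, head)
    return head


def to_list(cell: Optional[Cell]) -> List[int]:
    """Flatten a linked list back into a Python list."""
    out = []
    while cell is not None:
        out.append(cell.elem)
        cell = cell.next
    return out
```

The sketch spawns one thread per element, which is only reasonable for short lists. It shows the shape of the computation, not a performance technique.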

What About Writable + Parallelism?
Hide writable references

Asymmetric Parallelism
Integer i = ...;
isolated IntegerList lst = new IntegerList();
... // populate list
isolated SortFunc f = new SortFunc();
// Sort in parallel with other work
f.apply(lst); || i.val = 3;
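
The asymmetric pattern can be loosely imitated at run time with a box that hands out its payload exactly once. This is a sketch under stated assumptions: `Isolated`, `ConsumedError`, `IntBox`, and `demo` are hypothetical names, and nothing stops a caller from keeping a raw alias to the list before boxing it. Ruling that out is what the static isolated permission is for.

```python
import threading


class ConsumedError(Exception):
    """Raised when an isolated payload is requested a second time."""


class Isolated:
    """Hypothetical run-time stand-in for an externally-unique reference."""

    def __init__(self, value):
        self._value = value
        self._consumed = False
        self._lock = threading.Lock()

    def consume(self):
        """Hand the payload over exactly once; later calls fail."""
        with self._lock:
            if self._consumed:
                raise ConsumedError("isolated reference was already consumed")
            self._consumed = True
            value, self._value = self._value, None
            return value


class IntBox:
    """Plain mutable integer holder, like the slide's Integer i."""

    def __init__(self, val):
        self.val = val


def demo():
    i = IntBox(0)
    lst = Isolated([3, 1, 2])
    out = {}

    def worker(box):
        data = box.consume()            # the worker becomes the sole owner
        data.sort()                     # mutate freely, nobody else can see it
        out["sorted"] = Isolated(data)  # hand ownership back in a fresh box

    t = threading.Thread(target=worker, args=(lst,))
    t.start()
    i.val = 3  # the main thread keeps writing to unrelated writable state
    t.join()
    return i, lst, out["sorted"]
```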

It Lives! A Prototype in Use
Models a C# extension in active use at Microsoft.
• Some differences from formal system
  o e.g. first-class tasks, unstrict blocks
• Millions of lines of code
• Web server, MPEG decoder, …
• Nearly all parallelism checked
  o Exceptions in runtime system
• Anecdotally: More RI finds more bugs
• Largest industrial use of such a system

More in OOPSLA’12
• Design evolution
• Rough spots in practice
• More on uniqueness & borrowing
• Generics for permissions
• Proof by embedding into a program logic
  – The Views Framework (POPL’13; See Matt Parkinson’s keynote on Thursday!)

Advantages of RI for Parallelism
• Modest annotation burden
• Works (data race freedom!)
• Multi-purpose
  – Even unsound checking helps:
    • Found data races even without parallelism checks
  – Single-threaded benefits:
    • Lightweight caller-callee contract
    • Found extra (expensive!) defensive copying

High-Level Sources of Benefit
• “Unverified” code has a type
  – Incremental refinement
• Local interference checks
  – Safe concurrency from local type environment properties
  – Local summaries of global behavior
    • T in readable T is irrelevant
• Single model for sequential and concurrent
  – Single tool behaves the same in both cases
  – Concurrency as a modest extension
    • Concurrency primitives introduce no new concepts
• Effective as a mental model

Rely-Guarantee References
• Fine-grained mutation control
  – Generalize per-reference mutability permissions to arbitrary relations
    • E.g. monotonically increasing counter
  – Generalize reference immutability
  – Express how state changes, not just whether
• Exploit similarity between threads and aliases
• Preserve data structure invariants

A Duality: Threads & Aliases
• Mutation by aliases ≈ thread interference
  – Analyses for one can inspire analyses for the other
• Actions through aliases can be seen as concurrent
• Rely-Guarantee reasoning is good for threads
  – Summarizes possible interference
  – Good match for concurrent data structures

Rely-Guarantee for Threads
• Characterize thread interference:
  1. Rely summarizes expected interference
  2. Guarantee bounds thread actions
  3. Stable assertions are preserved by interference
  4. Compatible threads: each rely assumes at least the other’s guarantee

Rely-Guarantee for References
• Characterize alias interference:
  1. Rely summarizes alias interference
  2. Guarantee bounds actions through this alias
  3. Stable predicates preserved by interference
  4. Compatible aliases: if x == y, then x.G ⊆ y.R && y.G ⊆ x.R
• Subsumes ML references! (OCaml, SML, etc.)
  – (Incremental)
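
Stability and compatibility can be sanity-checked by brute force when relations are restricted to a small finite domain. This is only a sketch: `DOMAIN`, `pairs`, `stable`, `compatible`, and the dictionary layout with keys P, R, and G are assumptions made for illustration, and a finite check is a test, not a proof over all naturals.

```python
from itertools import product

DOMAIN = range(0, 6)  # small finite slice of the naturals, for checking only


def pairs(relation):
    """Enumerate a binary relation over DOMAIN as a set of (old, new) pairs."""
    return {(a, b) for a, b in product(DOMAIN, repeat=2) if relation(a, b)}


def stable(predicate, rely):
    """P is stable if P(old) and rely(old, new) together imply P(new)."""
    return all(predicate(b) for a, b in pairs(rely) if predicate(a))


def compatible(x, y):
    """Aliases are compatible if each guarantee fits inside the other's rely."""
    return pairs(x["G"]) <= pairs(y["R"]) and pairs(y["G"]) <= pairs(x["R"])


def unchanged(a, b):
    return a == b


def increasing(a, b):
    return a <= b


def decreasing(a, b):
    return a >= b


def positive(n):
    return n > 0


# Three hypothetical aliases of one natural-number cell.
x = {"P": positive, "R": unchanged, "G": increasing}
y = {"P": positive, "R": increasing, "G": unchanged}
z = {"P": positive, "R": unchanged, "G": decreasing}
```

Here `compatible(x, y)` holds, while `compatible(z, y)` fails because a decreasing write through `z` is not covered by `y`'s rely. Likewise the predicate `positive` is stable under `increasing` but not under `decreasing`.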

Rely-Guarantee Reference Type
ref{τ|P}[R,G]
• ref, τ: standard reference
• P: Predicate (e.g. >0)
• R: Rely (e.g. ≈)
• G: Guarantee (e.g. ≤)

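Rely-Guarantee for Alias Interference
x : ref{ℕ|>0}[==,≤]    y : ref{ℕ|>0}[≤,==]    (two aliases of one cell; each type lists [R,G])
x := !x + 1;    (the cell goes from 2 to 3)
❶ Rely: y's rely ≤ admits the step from 2 to 3
❷ Guarantee: the write through x satisfies x's guarantee, ≤(2,3)
❸ Stable: 2 > 0 ∧ ≤(2,3) ⇒ 3 > 0
❹ Compatible: ≤(2,3) ⇒ ≤(2,3), i.e. x's guarantee implies y's rely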
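
The alias example can be replayed with a dynamically checked stand-in for a rely-guarantee reference. This is a sketch under assumptions: `RGRef`, `Cell`, and `GuaranteeViolation` are hypothetical names, and the actual RGref work checks these conditions statically rather than on each write.

```python
class GuaranteeViolation(Exception):
    """Raised when a write through an alias breaks that alias's guarantee."""


class Cell:
    """The shared heap cell that several aliases point to."""

    def __init__(self, value):
        self.value = value


class RGRef:
    """Hypothetical run-time stand-in for ref{T|P}[R,G]."""

    def __init__(self, cell, predicate, rely, guarantee):
        if not predicate(cell.value):
            raise ValueError("predicate must hold when the reference is created")
        self.cell = cell
        self.predicate = predicate
        self.rely = rely
        self.guarantee = guarantee

    def read(self):
        """Dereference, written !x on the slide."""
        return self.cell.value

    def write(self, new):
        """Assignment, written x := new on the slide; checks the guarantee."""
        old = self.cell.value
        if not self.guarantee(old, new):
            raise GuaranteeViolation(f"{old} -> {new} breaks the guarantee")
        self.cell.value = new


def unchanged(a, b):
    return a == b


def increasing(a, b):
    return a <= b


def positive(n):
    return n > 0


shared = Cell(2)
x = RGRef(shared, positive, rely=unchanged, guarantee=increasing)
y = RGRef(shared, positive, rely=increasing, guarantee=unchanged)

x.write(x.read() + 1)  # the slide's x := !x + 1, taking the cell from 2 to 3
```

After the write, `y` reads 3 and its predicate still holds. That is the stability condition at work, since the change stays within `y`'s rely.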

Reference Immutability via Rely-Guarantee References
• writable T ≝ ref{T|any}[havoc,havoc]
• readable T ≝ ref{T|any}[havoc,≈]
• immutable T ≝ ref{T|any}[≈,≈]
• Suggests a spectrum: ML refs ⊆ RI ⊆ ... ⊆ RGref
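
One way to see that this encoding recovers the usual aliasing rules is to run the compatibility condition over the three permissions. The sketch below makes two assumptions that are not from the slides: values range over a tiny finite set, and ≈ is modelled as plain equality of old and new values. `PERMISSIONS` and `may_alias` are hypothetical names.

```python
from itertools import product

VALUES = range(4)  # tiny value domain, enough to tell the relations apart

HAVOC = set(product(VALUES, repeat=2))  # any change at all is allowed
SAME = {(v, v) for v in VALUES}         # stands in for the slide's ≈

# Each permission is a (rely, guarantee) pair, following ref{T|any}[R,G].
PERMISSIONS = {
    "writable": (HAVOC, HAVOC),
    "readable": (HAVOC, SAME),
    "immutable": (SAME, SAME),
}


def may_alias(p, q):
    """Compatibility: each guarantee must be contained in the other's rely."""
    rely_p, guar_p = PERMISSIONS[p]
    rely_q, guar_q = PERMISSIONS[q]
    return guar_p <= rely_q and guar_q <= rely_p
```

Under these assumptions every pairing is compatible except writable with immutable, which matches the expectation that an immutable object has no writable aliases.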

Things I’m Skipping Over
• References add complexity vs. threads
  – References nest; threads don’t
  – Requires additional constraints for pointer structures
  – Making new aliases is subtle

RGrefs for Concurrency
• Naïve RGrefs are unsound for concurrency
  – Does x := !x + 1 really increment?
• But RGrefs subsume RI
  – Hijack ideas from RI for safe concurrency
• General rely-guarantee works for concurrency
  – Restrict reasoning for exprs with !
  – Exploit conversion:
    • (convert r) has same properties as r with weaker type
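
One plausible reading of the increment question is the classic lost update: the value obtained by `!x` may be stale by the time the write happens. The sketch below replays one such interleaving deterministically, without real threads. `Cell` and `interleaved_increments` are hypothetical names and this is an illustration of the concern, not an example taken from the talk.

```python
class Cell:
    """A shared mutable cell."""

    def __init__(self, value):
        self.value = value


def interleaved_increments(cell):
    """Replay two `x := !x + 1` steps where both reads precede either write."""
    history = [cell.value]
    seen_by_a = cell.value      # thread A evaluates !x
    seen_by_b = cell.value      # thread B evaluates !x before A has written
    cell.value = seen_by_a + 1  # A writes
    history.append(cell.value)
    cell.value = seen_by_b + 1  # B writes the same value again
    history.append(cell.value)
    return history
```

Starting from 2, the cell passes through 2, 3, 3. Every individual step respects a ≤ guarantee, yet the second assignment did not increment anything, so a claim that each such statement increases the value by one would be unsound.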

Applying RGrefs to Concurrency
• Borrow ideas from RI work
• RG is a good match for fine-grained structures:
  – All easy to specify as rely-guarantee references
  – O’Hearn et al. identify 18 invariants and step restrictions for lock-free set correctness [1]
  – Correctness for Michael-Scott Queue, Treiber stack has similar restrictions
  – Some similarity to Turon’s CaReSL [2]
[1] PODC 2010
[2] POPL’13, ICFP’13

Are RGrefs Too General?
• Maybe
• Lays out a design space
  – Basic mutation ⊆ RI ⊆ … ⊆ RGref
• Can build higher-level abstractions above it
  – E.g. deterministic execution:
    • Strict non-interference ⇒ determinism
    • Monotonicity is halfway to determinism
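
A small experiment hints at why monotonicity helps with determinism. If the only permitted update raises a stored maximum, the final value does not depend on the order of updates, whereas plain overwriting does. `MaxRegister`, `final_value`, and `final_value_overwrite` are hypothetical names and the sketch is sequential, enumerating orders instead of running threads.

```python
from itertools import permutations


class MaxRegister:
    """Cell whose only update is monotone: the value never decreases."""

    def __init__(self, value=0):
        self.value = value

    def raise_to(self, candidate):
        self.value = max(self.value, candidate)


def final_value(updates):
    """Apply monotone updates in the given order and return the result."""
    reg = MaxRegister()
    for u in updates:
        reg.raise_to(u)
    return reg.value


def final_value_overwrite(updates):
    """Apply unrestricted overwrites in the given order and return the result."""
    value = 0
    for u in updates:
        value = u
    return value


monotone_results = {final_value(p) for p in permutations([3, 1, 2])}
overwrite_results = {final_value_overwrite(p) for p in permutations([3, 1, 2])}
```

All orders of the monotone updates agree on the final value, while the overwriting version ends on whichever update ran last. Intermediate reads of the register can still differ between orders, which is one way to read "halfway".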

Open Questions for Mutation Control
• Granularity of control
• Method of integration
• Specification burden
• Spec expressiveness
• Spectrum: Non-interference -> invariants -> functional correctness

What Granularity of Control?
• Threads seem too coarse
• Modules also too coarse, but a good match for abstractions (CAP, Explicit Stabilization)
• Individual objects too fine
• “Reachability slices” (RI, RGref) seem better, but both too coarse and too fine
• Regions? (DPJ, Local Rely-Guarantee)
• Other granularities? Missing abstractions?

How Do We Integrate Mutation Control?
• Attached to references
  – RI, RGref, Ownership/Universe Types
  – Specification matches point of mutation
• Attached to method specifications
  – Sep. Logic, Chalice, others
  – Formalizes current best practices
• Attached to regions
  – Local Rely-Guarantee, DPJ
• Where else? Thread init/end?

Reducing Specification Burden
• Lightweight
  – RI, universe types, (some) ownership types
• Heavyweight
  – RGref, RGSep, LRG, CAP, most program logics
• Inference possible, pressing
  – Lots of separation logic work
  – Vafeiadis’s RGSep inference
  – Huang & Milanova’s RI inference
• How does inference change the acceptable specification complexity?

Specification Expressiveness
• RI or DPJ are slightly inflexible
• RGref/SL is possibly too expressive
• How does inference interact with expressiveness?

Non-Interference, Invariants, and Functional Correctness
• Mutation control is a spectrum:
  – Non-interference
    • RI, RCC/Java, DPJ
  – Invariants (and hybrids)
    • RGref, RGSep, Conc. Sep. Logic
  – Correctness
    • CAP, CaReSL
• I hypothesize a pay-as-you-go model is a prerequisite for adoption

Conclusions
• Mutation control is the right approach to shared memory concurrency
  – Single toolset with sequential & concurrent benefits
  – “Normal” code has a type in a more expressive system
• Many questions remain
  – Granularity, integration, expressiveness, specification burden?
  – Non-interference / invariants / full correctness?
