Your Flying Car is Ready: Amazing Programming From the Future, Today!

Craig Stuntz
September 29, 2014

What if simply writing tests was enough to produce a program which makes them pass? What if your compiler could guarantee that your “Heartbleed-free” OpenSSL replacement follows the TLS specification to the letter, and even finds inconsistencies in the specification itself? What if you could write a test which showed that your code had no unintentional behavior, ever? Microsoft Research is well known for its contributions to Kinect, F#, the Entity Framework, and more, but it's also the home of a number of programming tools which do things which many programmers would consider surprising, if not impossible. But they work, and in this session you'll see them in action.

Like the idea of code contracts, but concerned about runtime performance and errors? The Dafny language can check contracts at compile time. The Z3 theorem prover can solve problems from specifications alone, and is used to make Hyper-V and Windows Azure memory safe. The F7 specification language for F# was used by its authors not only to produce a TLS implementation which provably follows the spec, but also to identify a dangerous hole in the TLS specification itself. You'll learn how Amazon uses the TLA+ specification language to prove that there are no edge cases in its internal protocols. Far from being research toys, these tools are in daily use in cases where stability, security, and reliability of code matter most. Can they help with your hardest problems? You might be surprised!

Transcript

  1. CRAIG STUNTZ
    2014.09.29
    YOUR FLYING CAR IS READY
    AMAZING PROGRAMMING TOOLS FROM THE FUTURE, TODAY!
    https://www.flickr.com/photos/ellenm1/7847402208

  2. SLIDES https://speakerdeck.com/craigstuntz/your-flying-car-is-ready
    Slides are already online.

  3. “THE FUTURE IS
    ALREADY HERE —
    IT’S JUST NOT
    VERY EVENLY
    DISTRIBUTED.”
    WILLIAM GIBSON
    Spoiler alert! Here’s the whole talk.
    1. Software is broken; solves the wrong problem incorrectly.
    2. Software is broken for a reason: Inessential complexity.
    3. It’s possible to produce software which solves harder problems and isn’t broken.
    4. The way to do better exists and is used to produce software you use every day.

  4. “A LANGUAGE THAT
    DOESN'T AFFECT THE
    WAY YOU THINK ABOUT
    PROGRAMMING, IS
    NOT WORTH
    KNOWING.”
    ALAN PERLIS
    EPIGRAMS IN
    PROGRAMMING
    https://www.flickr.com/photos/randyread/2385812579
    Will talk about tools most people don’t know exist, and when it does and doesn’t make sense to use them. “All languages are the same, just syntax.” <- Wrong!

  5. “SOMETIMES WE
    DON’T PROGRAM
    TO SHIP; WE
    PROGRAM TO
    UNDERSTAND
    PROGRAMMING.”
    NADA AMIN
    PROGRAMMING
    SHOULD EAT ITSELF
    Do programming languages exist to produce programs? You can create a program without a PL, though it’s harder. We program not for its own sake
    (mostly) but to solve business problems. PLs and compilers produce exe code, yes, but they also find syntax errors and semantic errors, and are “tools of
    thought.” (And much more common!)
    My real goal here: To expand the set of problems you think you can solve with programming. To do that, you need new ways of approaching a language,
    not just tooling.

  6. What Is the Upper Limit
    of Software Quality?
    function three() {
      return 1 + 2;
    }
    Want to show you a function I wrote. I’ll apologize in advance for this slide since it involves math. What’s interesting? It’s perfect! This is the only
    defect-free JS I’ll be showing you today. It also composes well.
    The notion of “perfect” code is controversial. But it’s clearly possible!
    How much quality are we willing to pay for? Does it depend on the application?

  7. “LIFE WAS
    SIMPLE BEFORE
    WORLD WAR II.
    AFTER THAT, WE
    HAD SYSTEMS.”
    GRACE HOPPER
    Perfect code is trivial.
    Perfect programs, systems harder. Why is composition harder in some cases? This is essential!
    There are always external factors. That’s fine.

  8. “BEWARE OF BUGS
    IN THE ABOVE
    CODE; I HAVE ONLY
    PROVED IT
    CORRECT, NOT
    TRIED IT.”
    DONALD KNUTH
    NOTES ON THE VAN EMDE BOAS
    CONSTRUCTION OF PRIORITY DEQUES: AN
    INSTRUCTIVE USE OF RECURSION
    https://www.flickr.com/photos/gem66/38298868
    Dangerous ideas! I'll be showing a lot of languages which are “still in the lab.”
    You may find some of this useful in your work tomorrow, but not all experiments succeed.

  9. In particular the lab is Microsoft Research. Many people know Kinect, WorldWide Telescope, F#, Entity Framework, Pex.

  10. “IF YOU’RE GOING
    TO USE CUTTING
    EDGE TECHNOLOGY,
    DON’T EXPECT NICE
    BLOG POSTS THAT
    TELL YOU IT’S EASY.”
    JOE ARMSTRONG
    CHICAGO ERLANG PRESENTATION
    https://www.flickr.com/photos/vanchett/3180276972
    I have a rule of thumb for application architecture. Consider the tech you want to be using in 5 years, because… Never specify tech the Hacker News
    Hipsters tell you that you should be using today. Try to see into the future. This is hard!
    However, every tool I discuss is real and is used in production software, including software you might use every day.

  11. “YOUNG MAN, IN
    MATHEMATICS YOU
    DON'T UNDERSTAND
    THINGS. YOU JUST
    GET USED TO THEM.”
    JOHN VON NEUMANN
    LETTER TO FELIX T. SMITH
    https://www.flickr.com/photos/36621927@N00/8378574271
    These languages operate very differently than those you probably use in your day to day work. Don't worry if you don't follow every bit of syntax. To be
    quite honest, I don't fully understand all of this stuff myself. The important thing is to know what is available, and to think about problems in new ways.

  12. Some Specialized
    Languages
    Assembly

    SQL

    C

    F#

    C#?
    I’ll be talking about specialized tools.
    We’re biased towards general purpose languages.
    But we happily use SQL when needed.
    We grow domain-specific languages into general-purpose ones when necessary.

  13. “IF DEBUGGING IS THE
    PROCESS OF
    REMOVING SOFTWARE
    BUGS, THEN
    PROGRAMMING MUST
    BE THE PROCESS OF
    PUTTING THEM IN.”
    ATTRIBUTED TO
    EDSGER DIJKSTRA
    Sounds like Dijkstra!
    Let’s talk about bugs. Broken code should be obvious.
    Cognitive overhead from inessential complexity turns out to be surprisingly high.
    Let’s examine some buggy code. Been a hell of a week; the Internet keeps giving!

  14. JavaScript
    function add(a, b) {
      return
        a + b;
    }
    In a study of buggy code, it makes sense to start with JavaScript.
    In contrast to the earlier example, this is completely broken. No error. Anyone know what it returns? (undefined: a semicolon is inserted automatically after the bare return.)
    This is JS, so we never specified the return type. A type checker would find this.
    A test might find the bug.
    This is not one of the good parts.

  15. http://blog.erratasec.com/2014/09/bash-bug-as-big-as-heartbleed.html#.VCN_7StdWwE
    PLs are usually specified in EBNF. Machine-verifiable specs are easy (not always true!); bash doesn’t use them and has evolved to the point where it can’t be
    specified.
    So there are edge cases… Syntax which should not be allowed.
    Even when (maybe!) not wrong: Also Ruby. parse.y is impenetrable. Half the size of all of Lua for the parser alone. Other implementations probably differ
    from MRI. Formal PL grammars keep parsers maintainable.
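    A minimal sketch of the point about formal grammars, not taken from the talk: a three-rule EBNF grammar for integer arithmetic and a recursive-descent parser whose functions mirror the rules one for one. The grammar and every name here (tokenize, parse_expr, and so on) are hypothetical, chosen only for illustration.

```python
import re

# Hypothetical EBNF grammar (an assumption for illustration, not from the talk):
#   expr   = term , { ("+" | "-") , term } ;
#   term   = factor , { "*" , factor } ;
#   factor = NUMBER | "(" , expr , ")" ;

TOKEN = re.compile(r"\s*(\d+|[-+*()])")


def tokenize(text):
    """Split the input into number and operator tokens, rejecting anything else."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse_expr(tokens):
    # expr = term , { ("+" | "-") , term } ;
    value, tokens = parse_term(tokens)
    while tokens and tokens[0] in ("+", "-"):
        op, tokens = tokens[0], tokens[1:]
        right, tokens = parse_term(tokens)
        value = value + right if op == "+" else value - right
    return value, tokens


def parse_term(tokens):
    # term = factor , { "*" , factor } ;
    value, tokens = parse_factor(tokens)
    while tokens and tokens[0] == "*":
        right, tokens = parse_factor(tokens[1:])
        value *= right
    return value, tokens


def parse_factor(tokens):
    # factor = NUMBER | "(" , expr , ")" ;
    if not tokens:
        raise SyntaxError("unexpected end of input")
    head, rest = tokens[0], tokens[1:]
    if head.isdigit():
        return int(head), rest
    if head == "(":
        value, rest = parse_expr(rest)
        if not rest or rest[0] != ")":
            raise SyntaxError("expected ')'")
        return value, rest[1:]
    raise SyntaxError(f"unexpected token {head!r}")


def parse(text):
    """Evaluate text according to the grammar; leftover tokens are an error."""
    value, rest = parse_expr(tokenize(text))
    if rest:
        raise SyntaxError(f"unexpected token {rest[0]!r}")
    return value
```

    Because each function corresponds to exactly one grammar rule, input outside the grammar is rejected rather than silently given some meaning.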

  16. Goto Fail
    static OSStatus
    SSLVerifySignedServerKeyExchange(SSLContext *ctx, bool isRsa, SSLBuffer signedParams,
                                     uint8_t *signature, UInt16 signatureLen)
    {
        OSStatus err;
        ...

        if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) != 0)
            goto fail;
        if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
            goto fail;
            goto fail;
        if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0)
            goto fail;
        ...

    fail:
        SSLFreeBuffer(&signedHashes);
        SSLFreeBuffer(&hashCtx);
        return err;
    }
    Besides the obvious…
    $AAPL lost $20B when iOS update couldn’t dial the phone.
    “fail” is not always fail, and “err” is not always err.
    Crappy code, maybe, but real production code, and not the worst I’ve ever seen.

  17. "Quality software costs money — Heartbleed was free." Paul-Henning Kamp DOI:10.1145/2631095
    Not a TLS spec bug (though those can happen). Possibly creeping featuritis in spec.
    Also, strings are kind of broken. That’s true for most languages, not just C.

  18. C#
    static Type GetType<T>() where T : new()
    {
        T t = new T();
        return t.GetType(); // *
    }

    static void Main(string[] args)
    {
        Console.WriteLine(GetType<int?>().ToString());
    }
    Appears contrived, but useful, because type system is broken and it fits on a slide.
    The line marked * throws.
    Deeply weird that you can new up something and can’t ask for its type.
    Type systems help, but not 100% safe.

  19. F#
    let average (someList: int list) =
        (List.sum someList)
        / (List.length someList)
    Type system is restricted. There is no type for a non-empty list. Could check in code, but requires code + test. Better type system could do it for us.
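    A rough sketch of the "non-empty list" idea, in Python rather than F#. The class name NonEmptyInts and its fields are hypothetical. Python only enforces this when the object is constructed at runtime, whereas a stronger type system would reject the empty case at compile time, so treat this as an approximation of the idea and not as what such a type system does.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class NonEmptyInts:
    """A list of ints with at least one element (hypothetical name).

    The first element is a required field, so an empty value cannot be built.
    """
    head: int
    tail: Tuple[int, ...] = ()

    def __len__(self):
        return 1 + len(self.tail)

    def total(self):
        return self.head + sum(self.tail)


def average(values: NonEmptyInts) -> float:
    # No emptiness check needed here: len(values) is at least 1 by construction,
    # so the division cannot fail.
    return values.total() / len(values)
```

    Usage: average(NonEmptyInts(1, (2, 3))) works, while NonEmptyInts() fails immediately at the call site instead of later inside average.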

  20. CAN WE DO
    BETTER?
    OR MUST WE DO BETTER?
    https://www.flickr.com/photos/jurvetson/5872448596
    How?

  21. “Attempting to prove any nontrivial theorem
    about your program will expose lots of bugs.

    “The particular choice of theorem makes little
    difference!

    “Typechecking is good because it proves lots and
    lots of little theorems about your program.”
    –Benjamin C. Pierce
    http://www.cis.upenn.edu/~bcpierce/papers/harmful-mfps.pdf
    Use strong types! There is a deep relationship between programs and mathematical proofs. Talk to me after, but types good!
    Strong types (especially really strong types) are awesome for refactoring. Slash + burn.
    Don’t fix the bug. Change the data types to make the state which caused it impossible.
    C# types may not be strong enough to succinctly allow only states which are correct by construction.
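    A small illustration of "change the data types to make the state impossible," using a made-up example that is not from the talk. Suppose a fetch result had been modeled as one record with optional body and error fields; then "both set" and "neither set" are representable bugs. Splitting it into two types removes those states. All names below are hypothetical, and Python's hints are only checked by an external static checker such as mypy, not by the interpreter.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Loaded:
    """A successful fetch always carries a body and never an error."""
    body: str


@dataclass(frozen=True)
class Failed:
    """A failed fetch always carries a reason and never a body."""
    reason: str


# The only two shapes a result can take; "body and error" cannot be written down.
FetchResult = Union[Loaded, Failed]


def describe(result: FetchResult) -> str:
    if isinstance(result, Loaded):
        return f"ok: {len(result.body)} chars"
    return f"error: {result.reason}"
```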

  22. “PROGRAM TESTING
    CAN BE USED TO
    SHOW THE
    PRESENCE OF BUGS,
    BUT NEVER THEIR
    ABSENCE.”
    EDSGER DIJKSTRA
    STRUCTURED PROGRAMMING
    Tests are ∃, strong types are ∀.
    Testing is great; property-based testing (QuickCheck, etc.) is even better.
    Testing is evidence, not a proof.
    But the 80/20 rule may hold.
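    To make the ∃ versus ∀ distinction concrete, here is a hand-rolled sketch of a property-based test using only the standard library. The function dedupe and the properties are invented for illustration. Real QuickCheck-style libraries add features such as shrinking failing inputs, which this sketch does not attempt.

```python
import random


def dedupe(items):
    """Remove duplicates while keeping first occurrences in order."""
    seen, out = set(), []
    for item in items:
        if item not in seen:
            seen.add(item)
            out.append(item)
    return out


def example_test():
    # Example-based test: one input, one expected output (an "exists" claim).
    assert dedupe([1, 2, 1]) == [1, 2]


def property_test(trials=500, seed=0):
    # Property-based test: the same claims checked on many random inputs.
    # Still evidence, not proof: only the sampled inputs are covered.
    rng = random.Random(seed)
    for _ in range(trials):
        items = [rng.randint(-5, 5) for _ in range(rng.randint(0, 12))]
        result = dedupe(items)
        assert len(result) == len(set(result))  # no duplicates remain
        assert set(result) == set(items)        # nothing lost or invented
        assert dedupe(result) == result         # running it again changes nothing
    return trials
```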

  23. “WHAT’S TRUE OF
    EVERY BUG FOUND
    IN THE FIELD? IT HAS
    PASSED THE TYPE
    CHECKER… AND ALL
    OF THE TESTS.”
    RICH HICKEY
    SIMPLE MADE EASY
    https://www.flickr.com/photos/grouperkun/5351080866
    Simplicity, conceptual clarity, is the silver bullet, not languages.
    “Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart
    enough to debug it.” Brian Kernighan
    When you diverge from essential complexity, you’re creating a maintenance problem.
    Microkernels!

  24. Hoare Verification
    {P} C {Q}

    Partial vs. total correctness

    Considered to be high-effort; 100/100
    Tony Hoare, Algol (who has heard of Algol?)
    Anyone ever seen a null reference error?
    One can totally specify software.
    Precondition->Command->Postcondition
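    A sketch of a Hoare triple written as runtime assertions, using a made-up example (division by repeated subtraction). Partial correctness says: if P holds and C terminates, then Q holds. Total correctness adds that C does terminate, which here follows from the remainder strictly decreasing. Python only checks these conditions on the inputs it actually runs, so this illustrates the notation and is not a proof.

```python
def divide(dividend, divisor):
    """Integer division by repeated subtraction, annotated as {P} C {Q}."""
    # P (precondition). Without divisor > 0 the loop below would never end.
    assert dividend >= 0 and divisor > 0

    quotient, remainder = 0, dividend
    while remainder >= divisor:
        # Loop invariant: holds before every iteration.
        assert dividend == quotient * divisor + remainder and remainder >= 0
        previous = remainder
        remainder -= divisor
        quotient += 1
        # Variant: remainder shrinks but stays non-negative, so the loop
        # must stop. This is the extra argument total correctness needs.
        assert 0 <= remainder < previous

    # Q (postcondition).
    assert dividend == quotient * divisor + remainder
    assert 0 <= remainder < divisor
    return quotient, remainder
```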

  25. Z3
    http://rise4fun.com/Z3
    http://z3.codeplex.com/
    https://www.flickr.com/photos/laughingsquid/102654032
    So here’s your flying car! Z3 is a theorem prover. Sounds like math, but stay with me. When you hear “theorem prover,” think really strong types. Equivalent.
    Specification = always true about a system. “Formal specification” = verifiable by a machine.
    "Type errors are not just red flags: in a sufficiently well-specified theory, all errors are type errors." Evan Jenkins
    SMT solver. Take some specs, simplify them algebraically, and efficiently prove the spec satisfiable or not, with examples.

  26. A Simple Problem
    Write a Ruby program that determines the
    smallest three digit number such that
    when said number is divided by the sum
    of its digits the answer is 20.
    Example: Number = 123. Sum of digits = 6
    123/6 = 20.5, so not a solution.
    Hate it when speakers read slides out loud, but…
    Picked this because it’s a very simple problem.
    Z3 is ASM. Usually not used directly. Will see examples of systems which use it.
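    The problem on this slide can also be worked by hand, which is a useful check on any program that claims to solve it. Writing the number with hundreds digit a, tens digit b, and units digit c (symbols introduced here, not on the slide):

```latex
\begin{align*}
n &= 100a + 10b + c, \qquad a \in \{1,\dots,9\},\; b, c \in \{0,\dots,9\} \\
\frac{n}{a+b+c} = 20 &\iff 100a + 10b + c = 20(a + b + c) \\
&\iff 80a = 10b + 19c \\
&\implies 10 \mid 19c \implies c = 0 \\
&\implies b = 8a \implies a = 1,\ b = 8 \quad (\text{since } b \le 9) \\
n &= 180, \qquad \frac{180}{1 + 8 + 0} = 20
\end{align*}
```

    So 180 is not just the smallest three-digit solution; it is the only one.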

  27. Ruby
    Most people would try something like… Seems reasonable, but it’s a brute-force solution. Have we fully understood the problem?
    Also, this is wrong. You probably have to be fairly good with Ruby to figure out why. Ruby is complicated; just try to parse it! Cognitive overhead for
    even a simple problem is very high!
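    The Ruby code from this slide is not in the transcript. As a stand-in, here is a brute-force sketch in Python that shows one plausible way such code goes wrong. This is an assumption and not necessarily the bug on the slide: in Ruby, / on two integers truncates, and Python's // behaves the same way for positive numbers. The function names are hypothetical.

```python
def digit_sum(n):
    """Sum of the decimal digits of a non-negative integer."""
    return sum(int(ch) for ch in str(n))


def smallest_truncating():
    # Pitfall: floor division discards the remainder, so 104 // 5 == 20
    # even though 104 / 5 is 20.8. The search stops at a wrong answer.
    for n in range(100, 1000):
        if n // digit_sum(n) == 20:
            return n
    return None


def smallest_exact():
    # Comparing n against 20 * digit_sum(n) avoids division entirely.
    for n in range(100, 1000):
        if n == 20 * digit_sum(n):
            return n
    return None
```

    The two functions differ by a single operator and both "look right," which is the cognitive-overhead point of the slide.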

  28. Ruby
    Looks efficient! Is this right? Remember, I like to put buggy code on my slides! Bestowed? Is it the best solution? Do you want this in the code you
    maintain?

  29. SMT-LIB
    I know, (). SMT-LIB language used to compare/benchmark solvers. You don’t typically use this for production. Minimal interface to Z3.
    Did live in rehearsal. Awkward! See me in person for demos.
    “Formal” spec for most of problem. Machine checkable. Omitted one part.

  30. SMT-LIB
    Variable?

  31. SMT-LIB
    Test-only programming.
    Does forall make sense?

  32. SMT-LIB

  33. http://rise4fun.com/Z3/7VZh
    Had to write digit-sum

  34. Note that the model is valid SMT-LIB code. Optimized! Really important.
    Complex definitions tend to be wrong when first written out. They can also be complete nonsense!

  35. Add one clause

  36. Not “can’t find.” Doesn’t exist.
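    The added clause is not in the transcript, so the sketch below assumes, purely for illustration, that it asks for a solution smaller than 180. On a finite domain, plain enumeration can give the same two kinds of answer a solver gives: a concrete model, or certainty that none exists. Z3 gets there by reasoning about the constraints rather than by enumerating, and is not limited to small finite domains. The function solve and its return values are hypothetical and do not use Z3's API.

```python
def digit_sum(n):
    """Sum of the decimal digits of a non-negative integer."""
    return sum(int(ch) for ch in str(n))


def solve(*constraints, domain=range(100, 1000)):
    """Return ("sat", model) for the first value meeting every constraint,
    or ("unsat", None) once the whole finite domain has been ruled out."""
    for n in domain:
        if all(constraint(n) for constraint in constraints):
            return "sat", n
    return "unsat", None


def spec(n):
    # The problem statement: n divided by its digit sum is exactly 20.
    return n == 20 * digit_sum(n)


def extra(n):
    # Hypothetical extra clause (an assumption, not from the slides).
    return n < 180
```

    solve(spec) yields a model, while solve(spec, extra) reports unsat only after every three-digit candidate has been checked, which is the difference between "can't find" and "doesn't exist."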

  37. Who uses this? Hyper-V hypervisor. If this is wrong the world ends. 100000 lines C, 5000 lines x64 ASM
    Complex implementation, fairly simple spec.
    Around 1.5 person-years, including learning VCC. 18 hours execution. Xen flaw to be disclosed Wednesday.
    Also Dafny, F*, etc.

  38. DAFNY
    http://rise4fun.com/Dafny
    http://research.microsoft.com/dafny
    https://www.flickr.com/photos/marcusjb/440973101/

  39. Dafny
    Useful for education. Correctness more important than executability. Looks like code contracts but proven at compile time! Imperative code. What must
    go right?
    Solver proves that mathematical and imperative definitions equivalent. Important, especially for optimization. Similar to 180 example.
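    The Dafny source from this slide is not in the transcript. The sketch below mimics the shape of the idea in Python, using digit sum as a made-up example: a recursive "mathematical" definition serves as the spec, and an imperative loop is checked against it with requires, invariant, and ensures written as assertions. The key difference is that Python checks these only at runtime on the inputs you happen to run, while Dafny discharges them at compile time.

```python
def digit_sum_spec(n):
    """Mathematical definition: recursive and easy to read, not tuned for speed."""
    return 0 if n == 0 else n % 10 + digit_sum_spec(n // 10)


def digit_sum_loop(n):
    """Imperative version with contract-style checks (runtime only in Python)."""
    assert n >= 0  # requires
    total, rest = 0, n
    while rest > 0:
        # invariant: digits summed so far plus digits still to go equal the spec
        assert total + digit_sum_spec(rest) == digit_sum_spec(n)
        total += rest % 10
        rest //= 10
    assert total == digit_sum_spec(n)  # ensures
    return total
```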

  40. Who uses Dafny? Rice University “Reasoning about algorithms”

  41. https://www.flickr.com/photos/ayman/21226117
    F* based on F#. Subsumes F7 and other MSR projects.

  42. Append function for length-indexed list.
    Heavy effort, heavy return.
    Remember C# variance annotations: Useful, even if you don’t write them!

  44. F7 source for miTLS

  45. Compiles to…

  46. Funny thing about formally verifying specs.
    Sounds awesome that the code meets spec.

  47. TLA+
    http://research.microsoft.com/en-us/um/people/lamport/tla/tla.html
    "Writing is nature’s way of letting you know how sloppy your thinking is." Richard Guindon
    "Mathematics is nature’s way of letting you know how sloppy your writing is.... Formal mathematics is nature’s way of letting you know how sloppy your mathematics is."
    Leslie Lamport
    "Specification is not an end in itself; it is just a tool that an engineer should be able to use when appropriate." p. 76
    "TLA+ is particularly effective at revealing concurrency errors—ones that arise through the interaction of asynchronous components." TLA book, p. 76.

  48. http://lorinhochstein.wordpress.com/2014/06/04/crossing-the-river-with-tla/
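    The TLA+ spec from the linked post is not in the transcript. As a rough picture of what an explicit-state model checker does, here is a breadth-first search over a river-crossing puzzle in Python. It assumes the classic farmer, wolf, goat, and cabbage rules, which may differ from the post. The trick is to state the goal as an invariant you expect to be violated ("the left bank is never empty") and read the counterexample trace as the solution. All names are hypothetical.

```python
from collections import deque

ITEMS = ("farmer", "wolf", "goat", "cabbage")


def safe(left):
    """A bank without the farmer must not hold wolf+goat or goat+cabbage."""
    for bank in (left, frozenset(ITEMS) - left):
        if "farmer" not in bank:
            if {"wolf", "goat"} <= bank or {"goat", "cabbage"} <= bank:
                return False
    return True


def next_states(left):
    """The farmer crosses alone or with one item from his own bank."""
    here = left if "farmer" in left else frozenset(ITEMS) - left
    for cargo in [None] + sorted(here - {"farmer"}):
        moving = {"farmer"} | ({cargo} if cargo else set())
        new_left = left - moving if "farmer" in left else left | moving
        if safe(new_left):
            yield frozenset(new_left)


def find_violation(invariant):
    """Breadth-first search of all reachable states. Returns the shortest
    trace to a state that breaks the invariant, or None if it always holds."""
    start = frozenset(ITEMS)
    queue, seen = deque([[start]]), {start}
    while queue:
        trace = queue.popleft()
        if not invariant(trace[-1]):
            return trace
        for state in next_states(trace[-1]):
            if state not in seen:
                seen.add(state)
                queue.append(trace + [state])
    return None


# Claim "the left bank is never empty" and let the search refute it.
trace = find_violation(lambda left: len(left) > 0)
```

    Each entry in trace is the set of things on the left bank after a crossing. When the search returns None instead, every reachable state has been visited and the invariant holds for all of them.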

  50. http://somethingofthatilk.com/index.php?id=135

  54. Who Uses TLA+?
    http://research.microsoft.com/en-us/um/people/lamport/tla/formal-methods-amazon.pdf
    Anything on AWS: Netflix, Heroku

  55. WHAT HAVE WE LEARNED?
    Thinking about specification, and formal specs keep you honest! Force you to consider whole problem.
    Make a spec which is internally consistent. Double entry check vs. code.
    Useful when problem domain too large (AWS) or too complex (Ruby) to test.
    Proves optimized code equivalent to readable code.

  56. ARE FLYING
    CARS A BAD
    IDEA?
    https://www.flickr.com/photos/bobjagendorf/4934950194/
    Tooling is an issue. Proving production code matches spec can be challenging. __agl verify ECC C code
    Impractical for complex systems. Good when it makes you simplify!
    Exhaustive testing, when possible, can give you similar return for less effort. Not always possible.

  57. Gratitude
    The people of Microsoft Research

    Others I’ve learned from

    SMT-LIB: Laurentiu Nicola (blog comment)

    Dafny: Swarat Chaudhuri’s articles

    TLA+: Chris Newcombe, Tim Rath, Fan Zhang, Bogdan
    Munteanu, Marc Brooker, and Michael Deardeuff;
    Lorin Hochstein’s blog

    Photographers (credited on each slide where used)

    My family, employer, and coworkers, for putting up with
    me spending time on this stuff

  58. CRAIG STUNTZ
    @CraigStuntz
    http://blogs.teamb.com/craigstuntz
    http://www.meetup.com/Papers-We-Love-Columbus/
    Least interesting part, but….
    Questions?
