Deliver Your Software in an Envelope by Augie Fackler and Nathaniel Manista

PyCon 2014
April 13, 2014

Transcript

  1. Augie: Mercurial, Python libraries. Nathaniel: Tech Lead of Melange, has contributed to Pylint.
    Transition: you test it, you document it, but how often have you gotten grief for regressions that didn't break tests?
    Deliver Your Software In An Envelope
    Augie Fackler & Nathaniel Manista
    Google, Inc.
    13 April, 2014
    Deliver Your Software in an Envelope http://localhost:8080/print/
    1 of 59 4/13/14, 12:57 PM


  2. Have you ever changed the behavior of your software in ways that don't break any promises but still gotten grief from your
    clients? Image source: http://commons.wikimedia.org/wiki/File:Angry_mob_of_four.jpg

  3. Behavioral Envelopes

  4. Lockheed F-104A Starfighter
    Aviation uses a notion of “performance envelopes” for describing the valid uses of an aircraft.

  5. We believe that you can; let's explore what that might look like.
    Can we do this for software?

  6. Axes not even yet known. Vague sense that for software, like other systems, there are right ways and wrong ways to use it.

  7. Examples

  8. math.sqrt

  9. Acceptable inputs are numbers greater than or equal to zero. DO NOT BOTHER TALKING ABOUT imaginary numbers.
    Transition to "but at another level of abstraction..."
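A minimal sketch of probing this input envelope, assuming nothing beyond the standard `math` module. The helper name `in_sqrt_envelope` is hypothetical and exists only for illustration:

```python
import math


def in_sqrt_envelope(value):
    """Return True if math.sqrt accepts `value` (hypothetical helper)."""
    try:
        math.sqrt(value)
    except (TypeError, ValueError):
        # TypeError: not a real number. ValueError: a negative number.
        return False
    return True


print(in_sqrt_envelope(4))       # inside the envelope
print(in_sqrt_envelope(-1))      # outside: negative
print(in_sqrt_envelope("four"))  # outside: not a number
```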

  10. Think of a video game. One of my favorites will run with 4G of RAM, but you'd better close your browser and other RAM
    hogs before you do, or it'll be a somewhat swappy experience. I've got a low end GPU, so I have to turn the texture quality
    down.
    "System Requirements"

  11. Works at multiple levels of abstraction - an "HTTP server" can be an object in a language interpreter, a process on a
    machine, or a machine itself.
    HTTP server
    (multiple levels of abstraction)

  12. Server:
    class HttpServer(object):
        def __call__(self, request):
            """HttpRequest -> HttpResponse."""
    Forbidden input:
    "the quick brown fox jumps over the lazy dog"

  13. Large space: universe of all values; small space: all values of type HttpRequest.
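One way the "small space" could be enforced at the object level is sketched below. `HttpRequest` and `HttpResponse` here are stand-in classes invented for this example, not classes from any real library, and the `isinstance` check is just one possible choice:

```python
class HttpRequest(object):
    """Stand-in request type (hypothetical, for illustration only)."""

    def __init__(self, method, path):
        self.method = method
        self.path = path


class HttpResponse(object):
    """Stand-in response type (hypothetical, for illustration only)."""

    def __init__(self, status):
        self.status = status


class HttpServer(object):
    def __call__(self, request):
        """HttpRequest -> HttpResponse."""
        # Anything outside the set of HttpRequest values is rejected up front.
        if not isinstance(request, HttpRequest):
            raise TypeError("expected an HttpRequest")
        return HttpResponse(200)


server = HttpServer()
print(server(HttpRequest("GET", "/")).status)
```

Passing the forbidden input from the previous slide, `"the quick brown fox jumps over the lazy dog"`, raises `TypeError` in this sketch.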

  14. Server:
    httpserver.exe
    Forbidden input:
    /dev/urandom

  15. The plane is all bytestreams. It is drawn as a field of dots because, without types to separate valid input from invalid
    input, the differentiation has to be done value by value.

  16. [Slide with no transcribed text.]

  17. [Slide with no transcribed text.]

  18. [Slide with no transcribed text.]

  19. So far we've introduced the idea of defining an envelope for a system's domain (its inputs). It turns out that it is helpful to
    define one for its outputs as well, --TRANSITION-- though not so much for systems that have exact (deterministic) outputs.

  20. The utility of an expression like this is that it produces the same exact value every time.
    bool(None)

  21. This too. --TRANSITION-- But great for systems that feature deliberately inexact outputs
    3 * 4

  22. random.random()
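A short sketch of what an output envelope means for `random.random()`: the exact value is never promised, only the documented half-open range `[0.0, 1.0)`. The sample size of 1000 is an arbitrary choice for this example:

```python
import random

samples = [random.random() for _ in range(1000)]

# No individual value is guaranteed, but every value must fall in [0.0, 1.0).
assert all(0.0 <= x < 1.0 for x in samples)
print("all samples inside the documented output envelope")
```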

  23. [Slide with no transcribed text.]

  24. Systems that knowingly return fast but inaccurate answers

  25. Systems that anticipate being attacked by an adversary. * Proprietary commercial systems exposing only an API * Game
    systems for which implementation details would give cheaters an advantage --TRANSITION-- So what does this add up to
    generally?

  26. A wormhole between universes! Input dimensions: parameter values, assertions about the state of the world. Output
    dimensions: return values, raised exceptions, side effects. We will say "behavioral envelope" to mean the sum of "input
    envelope" and "output envelope".

  27. You likely already are in some ways. Function input domains. Output ranges. Blocking and non-blocking semantics. Thread
    safety. The pressure we're putting on you here is that if you state your envelope, you're guaranteed to have thought about it
    first. We see a lot of code that was written without such thinking. This is how you hold off the angry mob.
    Weak Thesis: You should state your system's behavioral envelope.
    Strong Thesis: Your statement of your system's behavioral envelope should be your only statement of your system's
    behavior.

  28. How?
    1. Type system.
    2. Machine-checked constraints.
    3. Documentation.
    public static double sqrt(double n) {
        // code goes here
    }
    def sqrt(n):
        """Returns the square root of N.
        >>> sqrt(5)
        2.23606797749979
        >>> sqrt(-1)
        ValueError
        """
        # code goes here

  29. Same function in both languages, but Haskell's type system is another mechanism of expression available to both machines
    and programmers.
    map
    map: (a0 -> b0) -> [a0] -> [b0]

  30. Also note how the input envelope for map is interesting in the way it demands a certain kind of agreement between the types
    of the inputs.
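The agreement between input types can be approximated in Python with type annotations, as sketched below. `typed_map` is a hypothetical name for this example. Python does not enforce the annotations at run time, so a static checker such as mypy would be needed to check them:

```python
from typing import Callable, Iterable, List, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def typed_map(function: Callable[[A], B], items: Iterable[A]) -> List[B]:
    """Apply `function` to each item; mirrors (a -> b) -> [a] -> [b]."""
    # The element type of `items` must agree with the argument type of
    # `function`; that agreement is the interesting part of the envelope.
    return [function(item) for item in items]


lengths = typed_map(len, ["envelope", "map"])
print(lengths)
```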

  31. Tests are machine-checked constraints of your system's envelope. Now maybe this doesn't look like something you'd want to
    share with your clients as part of your system's definition... --TRANSITION-- Pyccuracy test in something more like natural
    language.
    def testPageLoads(self):
        """Tests that page loads properly."""
        user = profile_utils.seedNDBUser(
            host_for=[self.program])
        profile_utils.loginNDB(user)
        response = self.get(_getOrgAppShowUrl(self.org))
        self.assertResponseOK(response)

  32. This is test code that you could ship to clients of your system as persuasive evidence that your system is suitable for some
    purpose.
    Given
        I go to GSoC Home Page
    When
        I click "Apply_Org" link and wait
    Then
        I see "test_org" title

  33. If you need to, you can always just say "that input is not allowed". Or "this function will never...", or "overriding subclasses
    must...". --TRANSITION-- How do these systems interact? Are they redundant? Complementary?
    DO NOT COVER INTEGRATION WITH TYPE SYSTEM YET

  34. Story of a space being defined by a continuous barrier on one side and a single point on the other side. Transition into
    exploring this as software, rather than metaphor.

  35. Consider again an HTTP server, this time accepting and returning strings.
    class HttpServer(object):
        def __call__(self, client_string):
            """string -> string"""
            # code goes here

  36. In the space of all values we can separate out and only allow strings. The edge of the envelope is smooth because we're
    using a type system.

  37. Within the space of all strings, some parse as HTTP requests and some don't. We represent those as points because type
    systems can't make this discrimination within the string type.

  38. The values we choose as test data in our system tests serve to illustrate the edge of our envelope that cannot be described by
    the type system. --TRANSITION-- to javadoc showing how these types and documentation can work together.
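The idea of test data marking points on the envelope's edge might look like the sketch below. The regular expression is a deliberately simplified assumption, not a full HTTP grammar, and `looks_like_http_request` is a hypothetical name:

```python
import re

# Simplified request-line pattern; an assumption for illustration only.
_REQUEST_LINE = re.compile(r"^(GET|HEAD|POST) \S+ HTTP/1\.[01]\r\n")


def looks_like_http_request(client_string):
    """Return True if the string starts with a plausible HTTP request line."""
    return _REQUEST_LINE.match(client_string) is not None


# Every value below is a string, so the type system cannot tell them apart.
# The test data itself marks points inside and outside the envelope.
INSIDE = [
    "GET / HTTP/1.0\r\n\r\n",
    "POST /form HTTP/1.1\r\nHost: example.com\r\n\r\n",
]
OUTSIDE = [
    "the quick brown fox jumps over the lazy dog",
    "",
    "GET /\r\n",
]

for value in INSIDE:
    assert looks_like_http_request(value)
for value in OUTSIDE:
    assert not looks_like_http_request(value)
```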

  39. No tech writer or engineer put those types in that documentation - in this case the type system and documentation system are
    working together.

  40. Who recognizes this text? Despite the legal disclaimer of fitness, the engineering reality is that we would like our clients to
    have some assurance that our software achieves something. Live, running code in the context of a demo is a great way to
    persuade that our software does something useful. What conducts a demonstration of live, running code? A test. So ship
    your tests to your clients.
    THE SOFTWARE IS PROVIDED "AS
    IS", WITHOUT WARRANTY OF ANY
    KIND, EXPRESS OR IMPLIED,
    INCLUDING BUT NOT LIMITED TO
    THE WARRANTIES OF
    MERCHANTABILITY, FITNESS FOR A
    PARTICULAR PURPOSE AND
    NONINFRINGEMENT.

  41. Machine-checkable documentation wins. Good checkable-documentation formats include
    [doctest](http://docs.python.org/2/library/doctest.html), [Sphinx](http://sphinx-doc.org/), Go’s
    [example test functions](http://golang.org/pkg/testing/), and [cram](https://pypi.python.org/pypi/cram) for shell tools.
    Docs are good, but explaining is losing. Bonus points if your docs are automatically tested.
    def sqrt(n):
        """Returns the square root of N.
        >>> sqrt(5)
        2.23606797749979
        >>> sqrt(-1)
        ValueError
        """
        # code goes here
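The slide abbreviates the doctest. A version that the `doctest` module can actually check might look like the sketch below. The function body and the exact error message are assumptions added for this example, since the slide leaves the implementation out:

```python
import doctest
import math


def sqrt(n):
    """Returns the square root of N.

    >>> sqrt(5)
    2.23606797749979
    >>> sqrt(-1)
    Traceback (most recent call last):
        ...
    ValueError: sqrt requires n >= 0
    """
    # Raising our own error keeps the documented message under our control.
    if n < 0:
        raise ValueError("sqrt requires n >= 0")
    return math.sqrt(n)


if __name__ == "__main__":
    # Runs every docstring example in this module and reports mismatches.
    doctest.testmod()
```

doctest expects the `Traceback` header followed by the exception type and message, which is why the bare `ValueError` from the slide is expanded here.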

  42. Fewer promises make everyone happier - it's easier to understand the library's functionality as a user, and it's less to maintain
    for an author. Aim for the least-committed envelope possible without crippling your clients or hampering their performance.
    Narrow the input side but widen the output side. Transition to examples. Good/bad envelope definition examples. Difference
    between what must be and what happens to be.
    Less (Guarantees) Is More (Freedom To
    Keep Those Guarantees)
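As a sketch of the difference between what must be and what happens to be, the hypothetical function below promises only "an iterable, in no particular order", even though it currently happens to return a list in input order. The function name and the record layout are assumptions made for this example:

```python
def active_user_names(users):
    """Return an iterable of the names of active users, in no particular order.

    Only "an iterable" is promised. The implementation could later switch to
    a generator or a set without breaking any caller that stayed within the
    stated envelope.
    """
    # Returning a list in input order happens to be true today, not promised.
    return [user["name"] for user in users if user.get("active")]


users = [
    {"name": "ada", "active": True},
    {"name": "bob", "active": False},
    {"name": "cy", "active": True},
]

# A well-behaved caller relies only on iteration, not on list-ness or order.
print(sorted(active_user_names(users)))
```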

  43. Examples of Doing It
    Wrong

  44. Provides a (decent-ish) http client library, right in the standard library. This is actually pretty great.
    Python's httplib

  45. This state machine is taken from the docs. It doesn't fit on the slide, but that's not really what's important here.
    (null)
      |
      | HTTPConnection()
      v
    Idle
      |
      | putrequest()
      v
    Request-started
      |
      | ( putheader() )*  endheaders()
      v
    Request-sent
      |
      | response = getresponse()
      v
    Unread-response   [Response-headers-read]
      |\____________________
      |                     |
      | response.read()     | putrequest()
      v                     v
    Idle                  Req-started-unread-response
                     ______/|
                   /        |

  46. This means it's impossible to make any substantial changes to the internals of this package, because you've told clients how
    to abuse your internals. This would have been a great thing to change in Python 3, but it
    [didn't happen](http://hg.python.org/cpython/file/b466fd273625/Lib/http/client.py). Python 4, I guess?
    "HTTPResponse class does not enforce
    this state machine, which implies
    sophisticated clients may accelerate the
    request/response pipeline."

  47. Dimensions of the input envelope are "all possible methods that could be exposed on Object" and the input envelope was
    built too widely.
    Java's Object methods
    Object.clone
    Object.finalize

  48. Good Examples

  49. HTTP (the protocol)

  50. Only 15 status codes in HTTP/1.0. In HTTP/1.1, many more (38) are defined.
    Status-Code = "200" ; OK
                | "201" ; Created
                | "202" ; Accepted
                | "204" ; No Content
                | "301" ; Moved Permanently
                | "302" ; Moved Temporarily
                | "304" ; Not Modified
                | "400" ; Bad Request
                | "401" ; Unauthorized
                | "403" ; Forbidden
                | "404" ; Not Found
                | "500" ; Internal Server Error
                | "501" ; Not Implemented
                | "502" ; Bad Gateway
                | "503" ; Service Unavailable

  51. RFC 1945 is curious because it defines significantly fewer status codes than RFC 2616 (HTTP/1.1), which came 3 years
    later. 402 (Payment Required), 206 (Partial Content), etc. were all introduced later, and are safe for existing clients only
    because of the defensive specification quoted on this slide.
    RFC 1945:
    HTTP status codes are extensible[...].
    HTTP applications are not required to
    understand the meaning of all registered
    status codes[...]. However, applications
    must understand the class of any status
    code, as indicated by the first digit, and
    treat any unrecognized response as
    being equivalent to the x00 status code
    of that class, with the exception that an
    unrecognized response must not be
    cached.
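A client that follows this rule could be sketched as below. The set of known codes is the HTTP/1.0 list from the previous slide, `effective_status` is a hypothetical name, and the "must not be cached" part of the rule is left out of this sketch:

```python
# The 15 status codes listed for HTTP/1.0 on the previous slide.
KNOWN_STATUS_CODES = frozenset([
    200, 201, 202, 204,
    301, 302, 304,
    400, 401, 403, 404,
    500, 501, 502, 503,
])


def effective_status(status_code, known=KNOWN_STATUS_CODES):
    """Map an unrecognized code to the x00 code of its class."""
    if status_code in known:
        return status_code
    # The first digit identifies the class; fall back to that class's x00.
    return (status_code // 100) * 100


print(effective_status(404))  # recognized, returned unchanged
print(effective_status(402))  # unrecognized, treated as 400
print(effective_status(206))  # unrecognized, treated as 200
```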

  52. Has a backport to 2.x, and it's pretty great.
    The concurrent.futures package in Python 3.

  53. Comes with two executor classes out of the box. One lets you work around the GIL, but has a limitation in that it has to
    pickle arguments to send them to other processes. Easy to envision something that'd use Stackless's microthreads. Maybe
    even one that used any PEP 3156 event loop.
    ThreadPoolExecutor
    ProcessPoolExecutor
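A sketch of code written against the executor envelope rather than a specific pool: `squares` below uses only `Executor.map`, so it should work with either class. The function names are hypothetical, and only the thread-based pool is exercised here:

```python
from concurrent.futures import ThreadPoolExecutor


def square(n):
    return n * n


def squares(executor, numbers):
    """Rely only on the Executor interface, not on a particular pool class."""
    # Executor.map yields results in the same order as the inputs.
    return list(executor.map(square, numbers))


with ThreadPoolExecutor(max_workers=2) as pool:
    results = squares(pool, [1, 2, 3])

print(results)
```

Swapping in `ProcessPoolExecutor` would additionally require `square` and its arguments to be picklable, which is the limitation mentioned in the notes.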

  54. Is reassigning the value associated with a key a thread-safe operation in Python? This once came up in a Google-internal
    debate. It turns out that it is in the current version of Python - but as an implementation detail. There's no formal support
    for it. Our position in the debate was that you should program to the envelope of behavior that Python's documentation
    guarantees rather than some specific behavior that it implements.
    my_dict[my_key] = new_value
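One way to stay inside the documented envelope is to make the synchronization explicit, as in the sketch below. `LockedDict` is a hypothetical wrapper invented for this example. It is one possible approach, not a recommendation from the talk:

```python
import threading


class LockedDict(object):
    """Hypothetical wrapper that states its own thread-safety promise."""

    def __init__(self):
        self._lock = threading.Lock()
        self._values = {}

    def set(self, key, value):
        # The lock, not an interpreter implementation detail, provides safety.
        with self._lock:
            self._values[key] = value

    def get(self, key, default=None):
        with self._lock:
            return self._values.get(key, default)


shared = LockedDict()
workers = [threading.Thread(target=shared.set, args=("my_key", i))
           for i in range(8)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

# Which write landed last is unspecified; only membership in 0..7 is certain.
assert shared.get("my_key") in range(8)
```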

  55. --- TRANSITION --- This is all important because when we started programming, it felt like a bunch of discrete circuits.
    Source Code Is The Best Documentation!

  56. When we're learning to code, coding feels more like wiring: connecting exact bits of behavior into larger, still very precise
    systems.

  57. After you start thinking and reasoning about software this way it feels more like plumbing a variable-flow system. You're
    still connecting outputs to downstream inputs, but when you do you're ensuring simple compatibility between them rather
    than making an exact match. Mature software creation is about assembling, transforming, and convolving behavioral
    envelopes.

  58. Summary
    Provide small, well-defined envelopes.
    Only rely on the stated envelope others provide.

  59. Thanks!