Static Types in Python

Python is a great language for writing software because it’s expressive, concise, and clear. But as a codebase grows large, Python can get harder to understand than comparable Java, Go, or Haskell code. The author of almost every piece of code has expectations about what types the values involved will have, and a reader has to form the same expectations. In Python, though, these expectations are usually left implicit, and readers are left to fend for themselves.

I’ll describe recent developments that enable Python users to make these type expectations (aka “static types”) explicit. A new standard, PEP 484, provides notation for writing down types, and a type-checker, Mypy, analyzes the program to confirm that it really abides by the expectations the authors have written down. Dropbox is supporting the work, and its engineers are eagerly beginning to adopt it in the company’s 3-million-line Python codebase. Everything is 100% open source and applies to both Python 3 and Python 2. More users, bug reports, and patches are all welcome.

(Given at the Recurse Center on 2016-05-09.)

Greg Price

May 09, 2016

Transcript

  1. Good news

    Python has optional static types (in beta). You can use them in your programs. This talk: why and how.
  2. Joint work

    The tech is not all, or even mainly, my own work!
    Core Mypy team: Jukka Lehtosalo, Guido van Rossum, me, David Fisher, Reid Barton.
    All supported by Dropbox.
  3. Why static types?

    Lots of things static types can do … even lots of things “static types” can mean.
  4. Understanding code

    Understanding what some code does.
    Humans: 40% of eng time (Dropbox survey, 2015).
    Computers too (navigate, refactor, find bugs).
  5. Understanding code

    What does this code do?
        for entry in entries:
            entry.data.validate()
    Want to read what validate does.
  6. Understanding code

        for entry in entries:
            entry.data.validate()
    Want to read what validate does.
    git grep "def validate" -> 45 results
  7. Understanding code: types

        for entry in entries:
            entry.data.validate()
    git grep def\ validate -> 45 results; which?
    Have to know the type.
  8. Understanding code: types

        for entry in entries:
            entry.data.validate()
    What's the type of entry.data? Can find out:
    – entries a parameter? Grep for call sites
    – … a return value? Find that method
  9. Understanding code: types

        for entry in entries:
            entry.data.validate()
    What's the type of entry.data?
    Can find out, but a lot of work.
  10. Understanding code: types

    What's the type of (some expression)?
    Can find out, but a lot of work
    … and maybe the type varies! Subclasses, duck typing, generic containers
    … and undecidable in general!
  11. Understanding code: types

    What's the type of (some expression)?
    Lots of excuses, but: the author had some kind of answer.
    “int”; “list of basestring”; “T and list of T, for any T”
    That would be enough.
  12. Understanding code: static types

    What's the type of (some expression)?
    The author had some kind of answer.
    static type, n.: the expectation the author had of the (runtime) type of an expression's value
    (usage note: many competing definitions in academia)
  13. Static types: writing them down

    static type, n.: the expectation the author had of the (runtime) type of an expression's value
    Explicit is better than implicit.
  14. Static types: writing them down

    Explicit is better than implicit.
        def something(self, entries):
            '''entries: a list of LogEntry'''
            for entry in entries:
                entry.data.validate()
  15. Static types: writing them down

    Explicit is better than implicit.
    Checked is better than unchecked.
    If we carry out those principles, we can better understand our code.
  16. Static types: writing them down

    Explicit is better than implicit.
    → Standard type notation: PEP 484 (Python 2 and Python 3 compatible!)
    Checked is better than unchecked.
    → Type-checker: Mypy (no runtime effect)
  17. Questions on why?

    Next up: how
    – PEP 484 type notation
    – Type system
    – Mypy type-checker, workflow
  18. Notation

    Python 3: function annotations (PEP 3107)
        def gcd(a: int, b: int) -> int:
            ...
    Python 2 and 3: # type: comments
        def gcd(a, b):
            # type: (int, int) -> int
            ...
  19. Notation

    Python 3: function annotations (PEP 3107)
        def gcd(a: int, b: int) -> int:
            ...
    Annotations are Python expressions.
    Design constraint affected notation.
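The two notations on these slides can be tried side by side. The sketch below fills in the elided gcd body with Euclid's algorithm, which is an assumption, since the slides leave the body out. The name gcd_py2 is hypothetical and exists only so that both versions fit in one file.

```python
def gcd(a: int, b: int) -> int:
    """Python 3 style: types written as function annotations (PEP 3107)."""
    # Body is an assumption (Euclid's algorithm); the slide elides it.
    while b:
        a, b = b, a % b
    return a


def gcd_py2(a, b):
    # type: (int, int) -> int
    """Python 2 and 3 style: the same signature in a type comment."""
    while b:
        a, b = b, a % b
    return a


print(gcd(12, 18))              # 6
print(gcd_py2(12, 18))          # 6
# Annotations are ordinary expressions, available on the function object.
print(gcd.__annotations__)
# The comment form leaves nothing behind at runtime.
print(gcd_py2.__annotations__)  # {}
```

Neither form is checked when the program runs; the types only matter to a checker such as Mypy and to human readers.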
  20. Parametric polymorphism

    Key example: generic containers
        from typing import List, Dict, Set, Iterable
        def sum(a):
            # type: (List[int]) -> int
            ...
        def toposort(data):
            # type: (Dict[str, Set[str]]) -> Iterable[Set[str]]
            ...
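Generic container types are what the earlier validate example needs. Here is a minimal sketch of that loop with its types written down in comment notation. LogEntry comes from the docstring on the earlier slide; the Payload class, its ok flag, and the function name validate_all are hypothetical stand-ins, since the talk never says what type entry.data has.

```python
from typing import List


class Payload(object):
    """Hypothetical stand-in for whatever type entry.data has."""

    def __init__(self, ok):
        # type: (bool) -> None
        self.ok = ok

    def validate(self):
        # type: () -> None
        if not self.ok:
            raise ValueError('invalid payload')


class LogEntry(object):
    def __init__(self, data):
        # type: (Payload) -> None
        self.data = data


def validate_all(entries):
    # type: (List[LogEntry]) -> None
    # With the parameter type written down, a reader (or a tool) can go from
    # entries to LogEntry to Payload.validate without grepping.
    for entry in entries:
        entry.data.validate()


validate_all([LogEntry(Payload(True)), LogEntry(Payload(True))])
```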
  21. Parametric polymorphism

        from typing import TypeVar, Generic
        T = TypeVar('T')
        class MyList(Generic[T]):
            def append(self, item):
                # type: (T) -> None
                ...
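To see how the type variable ties methods together, here is one way the MyList outline might be completed. The _items attribute, the first method, and the names variable are assumptions added for illustration; only the class header and append's signature come from the slide.

```python
from typing import Generic, List, TypeVar

T = TypeVar('T')


class MyList(Generic[T]):
    """A tiny list wrapper; T stands for the element type."""

    def __init__(self):
        # type: () -> None
        self._items = []  # type: List[T]

    def append(self, item):
        # type: (T) -> None
        self._items.append(item)

    def first(self):
        # type: () -> T
        # Same T as in append: what goes in is what comes out.
        return self._items[0]


names = MyList()  # type: MyList[str]
names.append('entry')
print(names.first().upper())  # ENTRY

# A checker such as Mypy should reject the next line, because 3 is not a str.
# Nothing stops it at runtime, so it stays commented out in this sketch.
# names.append(3)
```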
  22. Gradual typing

    Typed code coexists with untyped code.
    Still want to type-check the typed code.
    At the boundary: type Any
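The boundary between typed and untyped code might look like the following sketch. Both functions and the record layout are hypothetical; the point is only that the result of an unannotated function is treated as Any, and Any is accepted wherever a typed value is expected.

```python
from typing import Any, Dict


def legacy_load(path):
    # No type comment or annotations, so a checker treats the parameter and
    # the return value as Any. (Hypothetical untyped helper.)
    return {'path': path, 'size': 0}


def describe(record):
    # type: (Dict[str, Any]) -> str
    # This function is typed, so its body is checked.
    return '{} ({} bytes)'.format(record['path'], record['size'])


# The Any coming out of legacy_load is accepted by the typed describe, so
# typed and untyped code can call each other.
print(describe(legacy_load('app.log')))  # app.log (0 bytes)
```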
  23. Notation: other people's code

    The libraries you use may lack written types.
    Solution: write them down, in separate files
    “stub files”, extension .pyi
    Collaborative “typeshed” repo has stdlib, more
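As a rough idea of what a stub file contains, here is a sketch for a hypothetical untyped module named logparse. The module, its LogEntry class, its parse function, and the file name logparse.pyi are all invented for illustration. A stub keeps only signatures and replaces bodies and default values with `...`.

```python
# Contents of a hypothetical stub file, logparse.pyi.
# It describes the types of an untyped module without changing that module.
from typing import List, Optional


class LogEntry:
    def validate(self) -> None: ...


def parse(path: str, limit: Optional[int] = ...) -> List[LogEntry]: ...
```

The stub is syntactically ordinary Python, but it is meant to be read by the type-checker rather than imported by the program.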
  24. Tools

    Designed for gradual adoption.
    Run on just the files you've given types to:
        $ mypy --silent-imports file.py dir/
    Other modules become all Any.
  25. Status at Dropbox

    • ~15 eager early adopters
    • 50kLOC now has explicit types
    • People love it: survey
      – 100% agree “easier to read and understand”
      – 100% agree “I am more productive”
      – 45% “adding to existing code is a lot of work”
        … but 90% glad to have done it
  26. Dropbox: people love having types

    “Refactors are already so much easier with the typing that's been added. Game changing for the sync engine!”
    “This makes it sooo much easier for me to read code and figure out what parameters are supposed to be!”
    “It has addressed my personal biggest pain point with python. It’s really increased the scope of things I’m excited to consider python as great for.”
  27. Try it!

    • Annotate your favorite Python codebase
      – Maybe Zulip? http://zulip.readthedocs.io/en/latest/mypy.html
    • Report any issues on GitHub
      – We respond fast and love hearing things
    • Patches and code reviews also welcome