Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Idiomatic APIs with the Python Data Model

Idiomatic APIs with the Python Data Model

The key to writing Pythonic classes, APIs and frameworks is leveraging the Python Data Model: a set of interfaces which, when implemented in your classes, enables them to leverage fundamental language features such as iteration, context managers, infix operators, attribute access control etc. This talk shows how, through a diverse selection of examples.

Luciano Ramalho

July 24, 2014
Tweet

More Decks by Luciano Ramalho

Other Decks in Technology

Transcript

  1. 2 Demo: Pingo API • Pingo means “pin go!” •

    Uniform API for controlling GPIO pins on small computers and microcontrollers • Currently supports Raspberry Pi, Arduino, pcDuino and UDOO http://pingo.io http://pingo.io
  2. 4 Example 1a: basic tests >>> from frenchdeck import FrenchDeck

    >>> deck = FrenchDeck() >>> len(deck) 52 >>> deck[:3] [Card(rank='2', suit='spades'), Card(rank='3', suit='spades'), Card(rank='4', suit='spades')] >>> deck[12::13] [Card(rank='A', suit='spades'), Card(rank='A', suit='diamonds'), Card(rank='A', suit='clubs'), Card(rank='A', suit='hearts')]
  3. 5 Example 1b: the in operator >>> from frenchdeck import

    Card >>> Card('Q', 'hearts') in deck True >>> Card('Z', 'clubs') in deck False
  4. 6 Example 1c: iteration >>> for card in deck: ...

    print(card) ... Card(rank='2', suit='spades') Card(rank='3', suit='spades') Card(rank='4', suit='spades') Card(rank='5', suit='spades') Card(rank='6', suit='spades') Card(rank='7', suit='spades') Card(rank='8', suit='spades') Card(rank='9', suit='spades') Card(rank='10', suit='spades') Card(rank='J', suit='spades') Card(rank='Q', suit='spades')
  5. 7 Example 1: implementation • Sequence protocol: – __len__ –

    __getitem__ import collections Card = collections.namedtuple('Card', ['rank', 'suit']) class FrenchDeck: ranks = [str(n) for n in range(2, 11)] + list('JQKA') suits = 'spades diamonds clubs hearts'.split() def __init__(self): self._cards = [Card(rank, suit) for suit in self.suits for rank in self.ranks] def __len__(self): return len(self._cards) def __getitem__(self, position): return self._cards[position] import collections Card = collections.namedtuple('Card', ['rank', 'suit']) class FrenchDeck: ranks = [str(n) for n in range(2, 11)] + list('JQKA') suits = 'spades diamonds clubs hearts'.split() def __init__(self): self._cards = [Card(rank, suit) for suit in self.suits for rank in self.ranks] def __len__(self): return len(self._cards) def __getitem__(self, position): return self._cards[position] An immutable sequence An immutable sequence
  6. 8 Example 1: Shuffle the deck • Python has a

    random.shuffle function ready to use • Should we implement a shuffle method in FrenchDeck? • Pythonic alternative: make FrenchDeck a mutable sequence >>> from random import shuffle >>> l = list(range(10)) >>> l [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] >>> shuffle(l) >>> l [3, 9, 1, 0, 5, 7, 6, 8, 4, 2]
  7. 9 Mutable Sequence protocol • The complete Mutable sequence protocol:

    – __setitem__ – __delitem__ – insert import collections Card = collections.namedtuple('Card', ['rank', 'suit']) class FrenchDeck: ranks = [str(n) for n in range(2, 11)] + list('JQKA') suits = 'spades diamonds clubs hearts'.split() def __init__(self): self._cards = [Card(rank, suit) for suit in self.suits for rank in self.ranks] def __len__(self): return len(self._cards) def __getitem__(self, position): return self._cards[position] def __setitem__(self, position, card): self._cards[position] = card import collections Card = collections.namedtuple('Card', ['rank', 'suit']) class FrenchDeck: ranks = [str(n) for n in range(2, 11)] + list('JQKA') suits = 'spades diamonds clubs hearts'.split() def __init__(self): self._cards = [Card(rank, suit) for suit in self.suits for rank in self.ranks] def __len__(self): return len(self._cards) def __getitem__(self, position): return self._cards[position] def __setitem__(self, position, card): self._cards[position] = card A mutable sequence A mutable sequence Partial implementation is OK for now Partial implementation is OK for now
  8. 10 Formal Interfaces x Protocols • Formal interface: – defined

    via ABCs - Abstract Base Classes • since Python 2.6/3.0 – inherit from ABC, implement all methods marked @abstractmethod – checked upon object instantiation • Protocol: – informal interface, defined by documentation – implementation: just methods • no need to inherit from ABC – may be partially implemented • if you don't need all the functionality of the full protocol
  9. 11 Duck Typing • It's all about protocols • The

    DNA of the bird is irrelevant • All that matters is what it does – the methods it implements duck typing, an intro (photo by flickr user cskk) license: cc-by-nc-nd source: https://flic.kr/p/bmQssJ
  10. 12 collections.abc • Formal interfaces defined as abstract classes: –

    Container, Sized, Iterable – Sequence – MutableSequence – Mapping, MutableMapping – Set, MutableSet – etc.
  11. 13 Sequence ABC import collections from collections import abc Card

    = collections.namedtuple('Card', ['rank', 'suit']) class FrenchDeck(abc.MutableSequence): ranks = [str(n) for n in range(2, 11)] + list('JQKA') suits = 'spades diamonds clubs hearts'.split() def __init__(self): self._cards = [Card(rank, suit) for suit in self.suits for rank in self.ranks] def __len__(self): return len(self._cards) def __getitem__(self, position): return self._cards[position] def __setitem__(self, position, card): self._cards[position] = card def __delitem__(self, position): del self._cards[position] def insert(self, position, card): self._cards.insert(position, card) import collections from collections import abc Card = collections.namedtuple('Card', ['rank', 'suit']) class FrenchDeck(abc.MutableSequence): ranks = [str(n) for n in range(2, 11)] + list('JQKA') suits = 'spades diamonds clubs hearts'.split() def __init__(self): self._cards = [Card(rank, suit) for suit in self.suits for rank in self.ranks] def __len__(self): return len(self._cards) def __getitem__(self, position): return self._cards[position] def __setitem__(self, position, card): self._cards[position] = card def __delitem__(self, position): del self._cards[position] def insert(self, position, card): self._cards.insert(position, card) A mutable sequence A mutable sequence • Sequence ABC: – __len__ – __getitem__ • MutableSequence ABC: – __setitem__ – __delitem__ – insert Must also implement Must also implement
  12. 14 The Python Data Model • Pragmatic • Fixed extension

    points: special methods (e.g. __new__) • Syntactic support: – keywords: for, yield, with, class, def, etc. – operators: almost all, including arithmetic and o.a, f(), d[i] Not magic! Not magic!
  13. 15 Special methods • “Dunder” syntax: __iter__ – “dunder” is

    shortcut for “double-underscore-prefix- and-suffix” • Documented in the Python Language Reference: – 3.2 - The standard type hierarchy – 3.3 - Special method names
  14. 16 From the Python Language Reference: 3.3. Special method names

    […] When implementing a class that emulates any built-in type, it is important that the emulation only be implemented to the degree that it makes sense for the object being modelled. For example, some sequences may work well with retrieval of individual elements, but extracting a slice may not make sense.
  15. 17 Insight: why is len a function? • Sequences are

    a family of standard types unrelated by inheritance but united by common protocols – len(s), e in s – s[i], s[a:b:c] – for e in s, reversed(s) – s1 + s2, s * n • Just as numeric types are united by common arithmetic operators and functions Nobody complains that abs(x) should be x.abs() or claims that x.add(y) is better than x + y Nobody complains that abs(x) should be x.abs() or claims that x.add(y) is better than x + y
  16. 18 Insight: why len is a function • Pragmatics: –

    len(s) is more readable than s.len() – Guido's opinion – len(s) is shorter than s.len() – len(s) is faster than s.len() • Why len(s) is faster than s.len() – for built-in types, the CPython implementation of len(s) simply gets the value of the ob_size field in a PyVarObject C struct – no attribute lookup and no method call
  17. 19 Insight: __len__ increases consistency • The __len__ special method

    is a fallback called by the implementation of len for user defined types • With this, len(s) is fast for built-ins but also works for any sequence s – no need to remember whether to write s.length or s.length() depending on the type of s Readability counts. Special cases aren't special enough to break the rules. Although practicality beats purity. (from The Zen of Python) Readability counts. Special cases aren't special enough to break the rules. Although practicality beats purity. (from The Zen of Python)
  18. 20 Recap: special methods • Emulate built-in collections • Overload

    arithmetic, comparison and bitwise operators – __add__, __lt__, __xor__... – + * / - == != <= >= & ^ | << >> • Iteration: – __iter__, __next__ • Attribute and item access operators – __getattr__, __setitem__... – a.x b[i] k in c • Call operator: __call__ – f(x) # same as f.__call__(x) • Context managers: – __enter__, __exit__
  19. 21 How special methods are used • Your code does

    not usually calls the special methods – they are mostly designed to be called by the Python interpreter • The call is usually implicit • Avoid explicit calls to special methods if possible • Example of implicit calls: – for i in x: → iter(x) → x.__iter__() or → x.__getitem__(0)
  20. 23 Example: 2d vector • Fields: x, y • Operations:

    – v1 + v2 # vector addition – v1 * k # scalar multiplication – abs(v) # modulus – bool(x) # convert to boolean
  21. 24 Vector in action >>> v1 = Vector(2, 4) >>>

    v2 = Vector(2, 1) >>> v1 + v2 Vector(4, 5) >>> v = Vector(3, 4) >>> abs(v) 5.0 >>> v * 3 Vector(9, 12) >>> abs(v * 3) 15.0 >>> bool(v) True >>> bool(Vector(0,0)) False
  22. 25 Vector 0.1 Row 1 Row 2 Row 3 Row

    4 0 2 4 6 8 10 12 Column 1 Column 2 Column 3 from math import hypot class Vector: def __init__(self, x=0, y=0): self.x = x self.y = y def __repr__(self): return 'Vector(%r, %r)' % (self.x, self.y) def __abs__(self): return hypot(self.x, self.y) def __bool__(self): return bool(abs(self)) def __add__(self, other): x = self.x + other.x y = self.y + other.y return Vector(x, y) def __mul__(self, scalar): return Vector(self.x * scalar, self.y * scalar) from math import hypot class Vector: def __init__(self, x=0, y=0): self.x = x self.y = y def __repr__(self): return 'Vector(%r, %r)' % (self.x, self.y) def __abs__(self): return hypot(self.x, self.y) def __bool__(self): return bool(abs(self)) def __add__(self, other): x = self.x + other.x y = self.y + other.y return Vector(x, y) def __mul__(self, scalar): return Vector(self.x * scalar, self.y * scalar) The built-in complex type can be used to represent 2D vectors, but this class can be extended to represent n-dimensional vectors. The built-in complex type can be used to represent 2D vectors, but this class can be extended to represent n-dimensional vectors.
  23. 27 A problem with * • Initial implementation of Vector

    does not honor the commutative property of multiplication >>> v * 3 Vector(9, 12) >>> 3 * v Traceback (most recent call last): ... TypeError: unsupported operand type(s) for *: 'int' and 'Vector'
  24. 28 How Python computes x * y • x.__mul__(y) is

    tried • if x.__mul__ does not exist, or returns NotImplemented: – y.__rmul__(x) is tried – if y.__rmul__ does not exist, or returns NotImplemented: • raise TypeError This is a variant of the double-dispatch pattern: http://bit.ly/st-double-dispatch This is a variant of the double-dispatch pattern: http://bit.ly/st-double-dispatch
  25. 29 Vector 0.2 Row 1 Row 2 Row 3 Row

    4 0 2 4 6 8 10 12 Column 1 Column 2 Column 3 from math import hypot class Vector: def __init__(self, x=0, y=0): self.x = x self.y = y def __repr__(self): return 'Vector(%r, %r)' % (self.x, self.y) def __abs__(self): return hypot(self.x, self.y) def __bool__(self): return bool(abs(self)) def __add__(self, other): x = self.x + other.x y = self.y + other.y return Vector(x, y) def __mul__(self, scalar): return Vector(self.x * scalar, self.y * scalar) def __rmul__(self, scalar): return self * scalar from math import hypot class Vector: def __init__(self, x=0, y=0): self.x = x self.y = y def __repr__(self): return 'Vector(%r, %r)' % (self.x, self.y) def __abs__(self): return hypot(self.x, self.y) def __bool__(self): return bool(abs(self)) def __add__(self, other): x = self.x + other.x y = self.y + other.y return Vector(x, y) def __mul__(self, scalar): return Vector(self.x * scalar, self.y * scalar) def __rmul__(self, scalar): return self * scalar __rmul__ delegates to __mul__ __rmul__ delegates to __mul__
  26. 31 Another problem with * • Our * is designed

    to do only scalar multiplication and not vector multiplication >>> Vector(1, 2) * Vector(3, 4) Vector(Vector(3, 4), Vector(6, 8)) This makes no sense! This makes no sense!
  27. 32 Vector 0.3 Row 1 Row 2 Row 3 Row

    4 0 2 4 6 8 10 12 Column 1 Column 2 Column 3 from math import hypot from numbers import Real class Vector: def __init__(self, x=0, y=0): ... def __repr__(self): ... def __abs__(self): ... def __bool__(self): ... def __add__(self, other): x = self.x + other.x y = self.y + other.y return Vector(x, y) def __mul__(self, scalar): if isinstance(scalar, Real): return Vector(self.x * scalar, self.y * scalar) else: return NotImplemented def __rmul__(self, scalar): return self * scalar from math import hypot from numbers import Real class Vector: def __init__(self, x=0, y=0): ... def __repr__(self): ... def __abs__(self): ... def __bool__(self): ... def __add__(self, other): x = self.x + other.x y = self.y + other.y return Vector(x, y) def __mul__(self, scalar): if isinstance(scalar, Real): return Vector(self.x * scalar, self.y * scalar) else: return NotImplemented def __rmul__(self, scalar): return self * scalar Use numbers.Real to check for a scalar value Use numbers.Real to check for a scalar value
  28. 33 Double-dispatch recap • Implement __op__ and __rop__ to suport

    operations with other types – eg. • str * int • Vector * float • Return NotImplemented if you don't recognize the other type – note that NotImplemented is not an exception, but a special value • Returning NotImplemented in your __op__ lets Python try __rop__ with the other operand
  29. 35 Default behavior of augmented assignment • Unless overridden, the

    default implementation of augmented assignment is the same as the simple operator in a plain assignment – eg. by default, a += b is the same as a = a + b >>> v1 = Vector(1, 2) >>> v1 Vector(1, 1) >>> id(v1) 4329163240 >>> v1 *= 3 >>> v1 Vector(3, 6) >>> id(v1) 4329163408 v1 now refers to a new object v1 now refers to a new object
  30. 36 Vector 0.4 Row 1 Row 2 Row 3 Row

    4 0 2 4 6 8 10 12 Column 1 Column 2 Column 3 ... class Vector: def __init__(self, x=0, y=0): ... def __repr__(self): ... def __abs__(self): ... def __bool__(self): ... def __add__(self, other): ... def __mul__(self, scalar): if isinstance(scalar, Real): return Vector(self.x * scalar, self.y * scalar) else: return NotImplemented def __rmul__(self, scalar): return self * scalar def __imul__(self, scalar): self.x *= scalar self.y *= scalar return self ... class Vector: def __init__(self, x=0, y=0): ... def __repr__(self): ... def __abs__(self): ... def __bool__(self): ... def __add__(self, other): ... def __mul__(self, scalar): if isinstance(scalar, Real): return Vector(self.x * scalar, self.y * scalar) else: return NotImplemented def __rmul__(self, scalar): return self * scalar def __imul__(self, scalar): self.x *= scalar self.y *= scalar return self Change attributes of instance and return it Change attributes of instance and return it
  31. 37 In-place object modification • Implementing __iop__ allows changes to

    the target instance (the left operand) • This is probably not a good idea for math vectors – example chosen for brevity >>> v1 = Vector(1, 2) >>> v1 Vector(1, 1) >>> id(v1) 4330211872 >>> v1 *= 3 >>> v1 Vector(3, 6) >>> id(v1) 4330211872 v1 refers to the same object v1 refers to the same object
  32. 38 Augmented assignment recap • Augmented assignment operators may or

    may not change its target (the left operand) • In contrast, standard infix operators should create now objects and never change the operands • Only implement __iop__ if you intend to change the target in-place – default implementation works by creating a new object • __iop__ method should return self
  33. 40 Raymond Hettinger's analogy • Subroutines allow abstracting a commonly

    used sandwich filling • Context managers allow abstracting a commonly used bread
  34. 41 Context manager • The most trivial use: file handling

    with open('atext.txt', encoding='utf-8') as fp: for line in fp: line = line.rstrip() if line: print(linha) with open('atext.txt', encoding='utf-8') as fp: for line in fp: line = line.rstrip() if line: print(linha) The file object fp returned by open is the context manager here The file object fp returned by open is the context manager here
  35. 42 Context manager with open('atext.txt', encoding='utf-8') as fp: for line

    in fp: line = line.rstrip() if line: print(linha) with open('atext.txt', encoding='utf-8') as fp: for line in fp: line = line.rstrip() if line: print(linha) try: fp = open('atext.txt', encoding='utf-8') for line in fp: line = line.rstrip() if line: print(linha) finally: fp.close() try: fp = open('atext.txt', encoding='utf-8') for line in fp: line = line.rstrip() if line: print(linha) finally: fp.close() ...is a replacement for: ...is a replacement for:
  36. 43 Context manager: __enter__ • context.__enter__(): called at the top

    of the block – whatever it returns is bound to the target variable (eg. fp) – __enter__ often returns self, but not necessarily with open('atext.txt', encoding='utf-8') as fp: for line in fp: line = line.rstrip() if line: print(linha) with open('atext.txt', encoding='utf-8') as fp: for line in fp: line = line.rstrip() if line: print(linha) __enter__ returns the file object itself __enter__ returns the file object itself
  37. 44 Context manager: __exit__ • Called when control flow exits

    the block: – context.__enter__(exc_type, exc_value, traceback) • If the arguments are None, None, None, nothing special happened • Otherwise, exc_type is the exception class, exc_value the exception instance and traceback the stack trace object • To suppress the exception, __enter__ must return True; otherwise the exception propagates
  38. 45 Context manager example class Room: def __init__(self, silent=False): self.silent

    = silent def __enter__(self): print('entering the room') return self def __exit__(self, exc_type, exc_value, traceback): if exc_type is None: print('exiting the room with no incident.') else: print('exiting with {}'.format( (exc_type, exc_value, traceback))) if self.silent: return True # suppress exception propagation class Room: def __init__(self, silent=False): self.silent = silent def __enter__(self): print('entering the room') return self def __exit__(self, exc_type, exc_value, traceback): if exc_type is None: print('exiting the room with no incident.') else: print('exiting with {}'.format( (exc_type, exc_value, traceback))) if self.silent: return True # suppress exception propagation >>> with Room() as r: ... print(r) ... entering the room <__main__.Room object at 0x1005ff160> exiting the room with no incident.
  39. 46 Context manager example >>> with Room() as r: ...

    r.flavor() ... entering the room exiting with (<class 'AttributeError'>, AttributeError("'Room' object has no attribute 'flavor'",), <traceback object at 0x1005fe648>) Traceback (most recent call last): File "<stdin>", line 2, in <module> AttributeError: 'Room' object has no attribute 'flavor' >>> >>> with Room(silent=True) as r: ... r.flavor() ... entering the room exiting with (<class 'AttributeError'>, AttributeError("'Room' object has no attribute 'flavor'",), <traceback object at 0x1005fe748>)
  40. 47 Uses of context managers • Tools and examples in

    the contextlib module – contextmanager – closing... • threading: – Lock – Semaphore – Condition – ... • unittest: – TestCase.subTest – TestCase.assertXXX – mock.patch • a growing number of examples... with self.assertRaises(SomeException) as cm: do_something() the_exception = cm.exception self.assertEqual(the_exception.error_code, 3) with self.assertRaises(SomeException) as cm: do_something() the_exception = cm.exception self.assertEqual(the_exception.error_code, 3)
  41. 48 @contextmanager example from contextlib import contextmanager @contextmanager def room(silent=False):

    print('entering the room') try: yield room except Exception as exc: traceback = exc.__traceback__ print('exiting with {}'.format((type(exc), exc, traceback))) if not silent: raise else: print('exiting the room with no incident.') from contextlib import contextmanager @contextmanager def room(silent=False): print('entering the room') try: yield room except Exception as exc: traceback = exc.__traceback__ print('exiting with {}'.format((type(exc), exc, traceback))) if not silent: raise else: print('exiting the room with no incident.') >>> with room() as r: ... print(r) ... entering the room <function room at 0x102a41400> exiting the room with no incident.
  42. 49 @contextmanager example >>> with room() as r: ... r.flavor()

    ... entering the room exiting with (<class 'AttributeError'>, AttributeError("'function' object has no attribute 'flavor'",), <traceback object at 0x1006feb08>) Traceback (most recent call last): File "<stdin>", line 2, in <module> AttributeError: 'function' object has no attribute 'flavor' >>> with room(silent=True) as r: ... r.flavor() ... entering the room exiting with (<class 'AttributeError'>, AttributeError("'function' object has no attribute 'flavor'",), <traceback object at 0x10210b4c8>)
  43. 50 Idiomatic Python APIs • Make proper use of the

    special methods • Let the user apply previous knowledge of the standard types and operations • Make it easy to leverage existing libraries • Come with “batteries included” • Use duck typing for enhanced interoperation with user-defined types • Provide ready to use objects • Don't require subclassing for basic usage • Leverage standard language objects: containers, functions, classes, modules • Objects all the way!
  44. 51 Thanks, keep in touch! • Luciano Ramalho – [email protected]

    – @ramalhoorg don't forget to rate this talk! thanks! don't forget to rate this talk! thanks!