Slide 1

Slide 1 text

Idiomatic APIs With the Python Data Model Luciano Ramalho [email protected] Twitter: @ramalhoorg

Slide 2

Slide 2 text

2 Demo: Pingo API ● Pingo means “pin go!” ● Uniform API for controlling GPIO pins on small computers and microcontrollers ● Currently supports Raspberry Pi, Arduino, pcDuino and UDOO http://pingo.io http://pingo.io

Slide 3

Slide 3 text

3 Example 1: Pythonic card deck

Slide 4

Slide 4 text

4 Example 1a: basic tests >>> from frenchdeck import FrenchDeck >>> deck = FrenchDeck() >>> len(deck) 52 >>> deck[:3] [Card(rank='2', suit='spades'), Card(rank='3', suit='spades'), Card(rank='4', suit='spades')] >>> deck[12::13] [Card(rank='A', suit='spades'), Card(rank='A', suit='diamonds'), Card(rank='A', suit='clubs'), Card(rank='A', suit='hearts')]

Slide 5

Slide 5 text

5 Example 1b: the in operator >>> from frenchdeck import Card >>> Card('Q', 'hearts') in deck True >>> Card('Z', 'clubs') in deck False

Slide 6

Slide 6 text

6 Example 1c: iteration >>> for card in deck: ... print(card) ... Card(rank='2', suit='spades') Card(rank='3', suit='spades') Card(rank='4', suit='spades') Card(rank='5', suit='spades') Card(rank='6', suit='spades') Card(rank='7', suit='spades') Card(rank='8', suit='spades') Card(rank='9', suit='spades') Card(rank='10', suit='spades') Card(rank='J', suit='spades') Card(rank='Q', suit='spades')

Slide 7

Slide 7 text

7 Example 1: implementation ● Sequence protocol: – __len__ – __getitem__ import collections Card = collections.namedtuple('Card', ['rank', 'suit']) class FrenchDeck: ranks = [str(n) for n in range(2, 11)] + list('JQKA') suits = 'spades diamonds clubs hearts'.split() def __init__(self): self._cards = [Card(rank, suit) for suit in self.suits for rank in self.ranks] def __len__(self): return len(self._cards) def __getitem__(self, position): return self._cards[position] import collections Card = collections.namedtuple('Card', ['rank', 'suit']) class FrenchDeck: ranks = [str(n) for n in range(2, 11)] + list('JQKA') suits = 'spades diamonds clubs hearts'.split() def __init__(self): self._cards = [Card(rank, suit) for suit in self.suits for rank in self.ranks] def __len__(self): return len(self._cards) def __getitem__(self, position): return self._cards[position] An immutable sequence An immutable sequence

Slide 8

Slide 8 text

8 Example 1: Shuffle the deck ● Python has a random.shuffle function ready to use ● Should we implement a shuffle method in FrenchDeck? ● Pythonic alternative: make FrenchDeck a mutable sequence >>> from random import shuffle >>> l = list(range(10)) >>> l [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] >>> shuffle(l) >>> l [3, 9, 1, 0, 5, 7, 6, 8, 4, 2]

Slide 9

Slide 9 text

9 Mutable Sequence protocol ● The complete Mutable sequence protocol: – __setitem__ – __delitem__ – insert import collections Card = collections.namedtuple('Card', ['rank', 'suit']) class FrenchDeck: ranks = [str(n) for n in range(2, 11)] + list('JQKA') suits = 'spades diamonds clubs hearts'.split() def __init__(self): self._cards = [Card(rank, suit) for suit in self.suits for rank in self.ranks] def __len__(self): return len(self._cards) def __getitem__(self, position): return self._cards[position] def __setitem__(self, position, card): self._cards[position] = card import collections Card = collections.namedtuple('Card', ['rank', 'suit']) class FrenchDeck: ranks = [str(n) for n in range(2, 11)] + list('JQKA') suits = 'spades diamonds clubs hearts'.split() def __init__(self): self._cards = [Card(rank, suit) for suit in self.suits for rank in self.ranks] def __len__(self): return len(self._cards) def __getitem__(self, position): return self._cards[position] def __setitem__(self, position, card): self._cards[position] = card A mutable sequence A mutable sequence Partial implementation is OK for now Partial implementation is OK for now

Slide 10

Slide 10 text

10 Formal Interfaces x Protocols ● Formal interface: – defined via ABCs - Abstract Base Classes ● since Python 2.6/3.0 – inherit from ABC, implement all methods marked @abstractmethod – checked upon object instantiation ● Protocol: – informal interface, defined by documentation – implementation: just methods ● no need to inherit from ABC – may be partially implemented ● if you don't need all the functionality of the full protocol

Slide 11

Slide 11 text

11 Duck Typing ● It's all about protocols ● The DNA of the bird is irrelevant ● All that matters is what it does – the methods it implements duck typing, an intro (photo by flickr user cskk) license: cc-by-nc-nd source: https://flic.kr/p/bmQssJ

Slide 12

Slide 12 text

12 collections.abc ● Formal interfaces defined as abstract classes: – Container, Sized, Iterable – Sequence – MutableSequence – Mapping, MutableMapping – Set, MutableSet – etc.

Slide 13

Slide 13 text

13 Sequence ABC import collections from collections import abc Card = collections.namedtuple('Card', ['rank', 'suit']) class FrenchDeck(abc.MutableSequence): ranks = [str(n) for n in range(2, 11)] + list('JQKA') suits = 'spades diamonds clubs hearts'.split() def __init__(self): self._cards = [Card(rank, suit) for suit in self.suits for rank in self.ranks] def __len__(self): return len(self._cards) def __getitem__(self, position): return self._cards[position] def __setitem__(self, position, card): self._cards[position] = card def __delitem__(self, position): del self._cards[position] def insert(self, position, card): self._cards.insert(position, card) import collections from collections import abc Card = collections.namedtuple('Card', ['rank', 'suit']) class FrenchDeck(abc.MutableSequence): ranks = [str(n) for n in range(2, 11)] + list('JQKA') suits = 'spades diamonds clubs hearts'.split() def __init__(self): self._cards = [Card(rank, suit) for suit in self.suits for rank in self.ranks] def __len__(self): return len(self._cards) def __getitem__(self, position): return self._cards[position] def __setitem__(self, position, card): self._cards[position] = card def __delitem__(self, position): del self._cards[position] def insert(self, position, card): self._cards.insert(position, card) A mutable sequence A mutable sequence ● Sequence ABC: – __len__ – __getitem__ ● MutableSequence ABC: – __setitem__ – __delitem__ – insert Must also implement Must also implement

Slide 14

Slide 14 text

14 The Python Data Model ● Pragmatic ● Fixed extension points: special methods (e.g. __new__) ● Syntactic support: – keywords: for, yield, with, class, def, etc. – operators: almost all, including arithmetic and o.a, f(), d[i] Not magic! Not magic!

Slide 15

Slide 15 text

15 Special methods ● “Dunder” syntax: __iter__ – “dunder” is shortcut for “double-underscore-prefix- and-suffix” ● Documented in the Python Language Reference: – 3.2 - The standard type hierarchy – 3.3 - Special method names

Slide 16

Slide 16 text

16 From the Python Language Reference: 3.3. Special method names […] When implementing a class that emulates any built-in type, it is important that the emulation only be implemented to the degree that it makes sense for the object being modelled. For example, some sequences may work well with retrieval of individual elements, but extracting a slice may not make sense.

Slide 17

Slide 17 text

17 Insight: why is len a function? ● Sequences are a family of standard types unrelated by inheritance but united by common protocols – len(s), e in s – s[i], s[a:b:c] – for e in s, reversed(s) – s1 + s2, s * n ● Just as numeric types are united by common arithmetic operators and functions Nobody complains that abs(x) should be x.abs() or claims that x.add(y) is better than x + y Nobody complains that abs(x) should be x.abs() or claims that x.add(y) is better than x + y

Slide 18

Slide 18 text

18 Insight: why len is a function ● Pragmatics: – len(s) is more readable than s.len() – Guido's opinion – len(s) is shorter than s.len() – len(s) is faster than s.len() ● Why len(s) is faster than s.len() – for built-in types, the CPython implementation of len(s) simply gets the value of the ob_size field in a PyVarObject C struct – no attribute lookup and no method call

Slide 19

Slide 19 text

19 Insight: __len__ increases consistency ● The __len__ special method is a fallback called by the implementation of len for user defined types ● With this, len(s) is fast for built-ins but also works for any sequence s – no need to remember whether to write s.length or s.length() depending on the type of s Readability counts. Special cases aren't special enough to break the rules. Although practicality beats purity. (from The Zen of Python) Readability counts. Special cases aren't special enough to break the rules. Although practicality beats purity. (from The Zen of Python)

Slide 20

Slide 20 text

20 Recap: special methods ● Emulate built-in collections ● Overload arithmetic, comparison and bitwise operators – __add__, __lt__, __xor__... – + * / - == != <= >= & ^ | << >> ● Iteration: – __iter__, __next__ ● Attribute and item access operators – __getattr__, __setitem__... – a.x b[i] k in c ● Call operator: __call__ – f(x) # same as f.__call__(x) ● Context managers: – __enter__, __exit__

Slide 21

Slide 21 text

21 How special methods are used ● Your code does not usually calls the special methods – they are mostly designed to be called by the Python interpreter ● The call is usually implicit ● Avoid explicit calls to special methods if possible ● Example of implicit calls: – for i in x: → iter(x) → x.__iter__() or → x.__getitem__(0)

Slide 22

Slide 22 text

22 Operator overloading Doing it right

Slide 23

Slide 23 text

23 Example: 2d vector ● Fields: x, y ● Operations: – v1 + v2 # vector addition – v1 * k # scalar multiplication – abs(v) # modulus – bool(x) # convert to boolean

Slide 24

Slide 24 text

24 Vector in action >>> v1 = Vector(2, 4) >>> v2 = Vector(2, 1) >>> v1 + v2 Vector(4, 5) >>> v = Vector(3, 4) >>> abs(v) 5.0 >>> v * 3 Vector(9, 12) >>> abs(v * 3) 15.0 >>> bool(v) True >>> bool(Vector(0,0)) False

Slide 25

Slide 25 text

25 Vector 0.1 Row 1 Row 2 Row 3 Row 4 0 2 4 6 8 10 12 Column 1 Column 2 Column 3 from math import hypot class Vector: def __init__(self, x=0, y=0): self.x = x self.y = y def __repr__(self): return 'Vector(%r, %r)' % (self.x, self.y) def __abs__(self): return hypot(self.x, self.y) def __bool__(self): return bool(abs(self)) def __add__(self, other): x = self.x + other.x y = self.y + other.y return Vector(x, y) def __mul__(self, scalar): return Vector(self.x * scalar, self.y * scalar) from math import hypot class Vector: def __init__(self, x=0, y=0): self.x = x self.y = y def __repr__(self): return 'Vector(%r, %r)' % (self.x, self.y) def __abs__(self): return hypot(self.x, self.y) def __bool__(self): return bool(abs(self)) def __add__(self, other): x = self.x + other.x y = self.y + other.y return Vector(x, y) def __mul__(self, scalar): return Vector(self.x * scalar, self.y * scalar) The built-in complex type can be used to represent 2D vectors, but this class can be extended to represent n-dimensional vectors. The built-in complex type can be used to represent 2D vectors, but this class can be extended to represent n-dimensional vectors.

Slide 26

Slide 26 text

26 Reverse operators Honoring the commutative property (when it makes sense)

Slide 27

Slide 27 text

27 A problem with * ● Initial implementation of Vector does not honor the commutative property of multiplication >>> v * 3 Vector(9, 12) >>> 3 * v Traceback (most recent call last): ... TypeError: unsupported operand type(s) for *: 'int' and 'Vector'

Slide 28

Slide 28 text

28 How Python computes x * y ● x.__mul__(y) is tried ● if x.__mul__ does not exist, or returns NotImplemented: – y.__rmul__(x) is tried – if y.__rmul__ does not exist, or returns NotImplemented: ● raise TypeError This is a variant of the double-dispatch pattern: http://bit.ly/st-double-dispatch This is a variant of the double-dispatch pattern: http://bit.ly/st-double-dispatch

Slide 29

Slide 29 text

29 Vector 0.2 Row 1 Row 2 Row 3 Row 4 0 2 4 6 8 10 12 Column 1 Column 2 Column 3 from math import hypot class Vector: def __init__(self, x=0, y=0): self.x = x self.y = y def __repr__(self): return 'Vector(%r, %r)' % (self.x, self.y) def __abs__(self): return hypot(self.x, self.y) def __bool__(self): return bool(abs(self)) def __add__(self, other): x = self.x + other.x y = self.y + other.y return Vector(x, y) def __mul__(self, scalar): return Vector(self.x * scalar, self.y * scalar) def __rmul__(self, scalar): return self * scalar from math import hypot class Vector: def __init__(self, x=0, y=0): self.x = x self.y = y def __repr__(self): return 'Vector(%r, %r)' % (self.x, self.y) def __abs__(self): return hypot(self.x, self.y) def __bool__(self): return bool(abs(self)) def __add__(self, other): x = self.x + other.x y = self.y + other.y return Vector(x, y) def __mul__(self, scalar): return Vector(self.x * scalar, self.y * scalar) def __rmul__(self, scalar): return self * scalar __rmul__ delegates to __mul__ __rmul__ delegates to __mul__

Slide 30

Slide 30 text

30 Giving up Returning NotImplemented to let the interpreter take further action

Slide 31

Slide 31 text

31 Another problem with * ● Our * is designed to do only scalar multiplication and not vector multiplication >>> Vector(1, 2) * Vector(3, 4) Vector(Vector(3, 4), Vector(6, 8)) This makes no sense! This makes no sense!

Slide 32

Slide 32 text

32 Vector 0.3 Row 1 Row 2 Row 3 Row 4 0 2 4 6 8 10 12 Column 1 Column 2 Column 3 from math import hypot from numbers import Real class Vector: def __init__(self, x=0, y=0): ... def __repr__(self): ... def __abs__(self): ... def __bool__(self): ... def __add__(self, other): x = self.x + other.x y = self.y + other.y return Vector(x, y) def __mul__(self, scalar): if isinstance(scalar, Real): return Vector(self.x * scalar, self.y * scalar) else: return NotImplemented def __rmul__(self, scalar): return self * scalar from math import hypot from numbers import Real class Vector: def __init__(self, x=0, y=0): ... def __repr__(self): ... def __abs__(self): ... def __bool__(self): ... def __add__(self, other): x = self.x + other.x y = self.y + other.y return Vector(x, y) def __mul__(self, scalar): if isinstance(scalar, Real): return Vector(self.x * scalar, self.y * scalar) else: return NotImplemented def __rmul__(self, scalar): return self * scalar Use numbers.Real to check for a scalar value Use numbers.Real to check for a scalar value

Slide 33

Slide 33 text

33 Double-dispatch recap ● Implement __op__ and __rop__ to suport operations with other types – eg. ● str * int ● Vector * float ● Return NotImplemented if you don't recognize the other type – note that NotImplemented is not an exception, but a special value ● Returning NotImplemented in your __op__ lets Python try __rop__ with the other operand

Slide 34

Slide 34 text

34 Augmented assignment In-place operations (when they make sense)

Slide 35

Slide 35 text

35 Default behavior of augmented assignment ● Unless overridden, the default implementation of augmented assignment is the same as the simple operator in a plain assignment – eg. by default, a += b is the same as a = a + b >>> v1 = Vector(1, 2) >>> v1 Vector(1, 1) >>> id(v1) 4329163240 >>> v1 *= 3 >>> v1 Vector(3, 6) >>> id(v1) 4329163408 v1 now refers to a new object v1 now refers to a new object

Slide 36

Slide 36 text

36 Vector 0.4 Row 1 Row 2 Row 3 Row 4 0 2 4 6 8 10 12 Column 1 Column 2 Column 3 ... class Vector: def __init__(self, x=0, y=0): ... def __repr__(self): ... def __abs__(self): ... def __bool__(self): ... def __add__(self, other): ... def __mul__(self, scalar): if isinstance(scalar, Real): return Vector(self.x * scalar, self.y * scalar) else: return NotImplemented def __rmul__(self, scalar): return self * scalar def __imul__(self, scalar): self.x *= scalar self.y *= scalar return self ... class Vector: def __init__(self, x=0, y=0): ... def __repr__(self): ... def __abs__(self): ... def __bool__(self): ... def __add__(self, other): ... def __mul__(self, scalar): if isinstance(scalar, Real): return Vector(self.x * scalar, self.y * scalar) else: return NotImplemented def __rmul__(self, scalar): return self * scalar def __imul__(self, scalar): self.x *= scalar self.y *= scalar return self Change attributes of instance and return it Change attributes of instance and return it

Slide 37

Slide 37 text

37 In-place object modification ● Implementing __iop__ allows changes to the target instance (the left operand) ● This is probably not a good idea for math vectors – example chosen for brevity >>> v1 = Vector(1, 2) >>> v1 Vector(1, 1) >>> id(v1) 4330211872 >>> v1 *= 3 >>> v1 Vector(3, 6) >>> id(v1) 4330211872 v1 refers to the same object v1 refers to the same object

Slide 38

Slide 38 text

38 Augmented assignment recap ● Augmented assignment operators may or may not change its target (the left operand) ● In contrast, standard infix operators should create now objects and never change the operands ● Only implement __iop__ if you intend to change the target in-place – default implementation works by creating a new object ● __iop__ method should return self

Slide 39

Slide 39 text

39 Context managers Abstracting “the bread”

Slide 40

Slide 40 text

40 Raymond Hettinger's analogy ● Subroutines allow abstracting a commonly used sandwich filling ● Context managers allow abstracting a commonly used bread

Slide 41

Slide 41 text

41 Context manager ● The most trivial use: file handling with open('atext.txt', encoding='utf-8') as fp: for line in fp: line = line.rstrip() if line: print(linha) with open('atext.txt', encoding='utf-8') as fp: for line in fp: line = line.rstrip() if line: print(linha) The file object fp returned by open is the context manager here The file object fp returned by open is the context manager here

Slide 42

Slide 42 text

42 Context manager with open('atext.txt', encoding='utf-8') as fp: for line in fp: line = line.rstrip() if line: print(linha) with open('atext.txt', encoding='utf-8') as fp: for line in fp: line = line.rstrip() if line: print(linha) try: fp = open('atext.txt', encoding='utf-8') for line in fp: line = line.rstrip() if line: print(linha) finally: fp.close() try: fp = open('atext.txt', encoding='utf-8') for line in fp: line = line.rstrip() if line: print(linha) finally: fp.close() ...is a replacement for: ...is a replacement for:

Slide 43

Slide 43 text

43 Context manager: __enter__ ● context.__enter__(): called at the top of the block – whatever it returns is bound to the target variable (eg. fp) – __enter__ often returns self, but not necessarily with open('atext.txt', encoding='utf-8') as fp: for line in fp: line = line.rstrip() if line: print(linha) with open('atext.txt', encoding='utf-8') as fp: for line in fp: line = line.rstrip() if line: print(linha) __enter__ returns the file object itself __enter__ returns the file object itself

Slide 44

Slide 44 text

44 Context manager: __exit__ ● Called when control flow exits the block: – context.__enter__(exc_type, exc_value, traceback) ● If the arguments are None, None, None, nothing special happened ● Otherwise, exc_type is the exception class, exc_value the exception instance and traceback the stack trace object ● To suppress the exception, __enter__ must return True; otherwise the exception propagates

Slide 45

Slide 45 text

45 Context manager example class Room: def __init__(self, silent=False): self.silent = silent def __enter__(self): print('entering the room') return self def __exit__(self, exc_type, exc_value, traceback): if exc_type is None: print('exiting the room with no incident.') else: print('exiting with {}'.format( (exc_type, exc_value, traceback))) if self.silent: return True # suppress exception propagation class Room: def __init__(self, silent=False): self.silent = silent def __enter__(self): print('entering the room') return self def __exit__(self, exc_type, exc_value, traceback): if exc_type is None: print('exiting the room with no incident.') else: print('exiting with {}'.format( (exc_type, exc_value, traceback))) if self.silent: return True # suppress exception propagation >>> with Room() as r: ... print(r) ... entering the room <__main__.Room object at 0x1005ff160> exiting the room with no incident.

Slide 46

Slide 46 text

46 Context manager example >>> with Room() as r: ... r.flavor() ... entering the room exiting with (, AttributeError("'Room' object has no attribute 'flavor'",), ) Traceback (most recent call last): File "", line 2, in AttributeError: 'Room' object has no attribute 'flavor' >>> >>> with Room(silent=True) as r: ... r.flavor() ... entering the room exiting with (, AttributeError("'Room' object has no attribute 'flavor'",), )

Slide 47

Slide 47 text

47 Uses of context managers ● Tools and examples in the contextlib module – contextmanager – closing... ● threading: – Lock – Semaphore – Condition – ... ● unittest: – TestCase.subTest – TestCase.assertXXX – mock.patch ● a growing number of examples... with self.assertRaises(SomeException) as cm: do_something() the_exception = cm.exception self.assertEqual(the_exception.error_code, 3) with self.assertRaises(SomeException) as cm: do_something() the_exception = cm.exception self.assertEqual(the_exception.error_code, 3)

Slide 48

Slide 48 text

48 @contextmanager example from contextlib import contextmanager @contextmanager def room(silent=False): print('entering the room') try: yield room except Exception as exc: traceback = exc.__traceback__ print('exiting with {}'.format((type(exc), exc, traceback))) if not silent: raise else: print('exiting the room with no incident.') from contextlib import contextmanager @contextmanager def room(silent=False): print('entering the room') try: yield room except Exception as exc: traceback = exc.__traceback__ print('exiting with {}'.format((type(exc), exc, traceback))) if not silent: raise else: print('exiting the room with no incident.') >>> with room() as r: ... print(r) ... entering the room exiting the room with no incident.

Slide 49

Slide 49 text

49 @contextmanager example >>> with room() as r: ... r.flavor() ... entering the room exiting with (, AttributeError("'function' object has no attribute 'flavor'",), ) Traceback (most recent call last): File "", line 2, in AttributeError: 'function' object has no attribute 'flavor' >>> with room(silent=True) as r: ... r.flavor() ... entering the room exiting with (, AttributeError("'function' object has no attribute 'flavor'",), )

Slide 50

Slide 50 text

50 Idiomatic Python APIs ● Make proper use of the special methods ● Let the user apply previous knowledge of the standard types and operations ● Make it easy to leverage existing libraries ● Come with “batteries included” ● Use duck typing for enhanced interoperation with user-defined types ● Provide ready to use objects ● Don't require subclassing for basic usage ● Leverage standard language objects: containers, functions, classes, modules ● Objects all the way!

Slide 51

Slide 51 text

51 Thanks, keep in touch! ● Luciano Ramalho – [email protected] – @ramalhoorg don't forget to rate this talk! thanks! don't forget to rate this talk! thanks!