Slide 1

Slide 1 text

Write More Robust Code with Weak References
Jim Baker
jim.baker@{python.org, rackspace.com}

Slide 2

Slide 2 text

Some possible questions
Questions you might have in coming to this talk:
What exactly are weak references?

Slide 3

Slide 3 text

Some possible questions
Questions you might have in coming to this talk:
What exactly are weak references?
How do they differ from strong references?

Slide 4

Slide 4 text

Some possible questions
Questions you might have in coming to this talk:
What exactly are weak references?
How do they differ from strong references?
When would I use them anyway?

Slide 5

Slide 5 text

About me
Core developer of Jython

Slide 6

Slide 6 text

About me
Core developer of Jython
Co-author of Definitive Guide to Jython from Apress

Slide 7

Slide 7 text

About me
Core developer of Jython
Co-author of Definitive Guide to Jython from Apress
Software developer at Rackspace

Slide 8

Slide 8 text

About me
Core developer of Jython
Co-author of Definitive Guide to Jython from Apress
Software developer at Rackspace
Lecturer in CS at Univ of Colorado at Boulder

Slide 9

Slide 9 text

Defining a weak reference
A weak reference to an object is not enough to keep the object alive: when the only remaining references to a referent are weak references, garbage collection is free to destroy the referent and reuse its memory for something else. However, until the object is actually destroyed the weak reference may return the object even if there are no strong references to it.
(https://docs.python.org/3/library/weakref.html)
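
A minimal sketch of the definition above, using weakref.ref from the standard library. The Resource class is a hypothetical stand-in, and the immediate clearing after del reflects CPython's reference counting; other implementations may need the gc.collect() call (or more than one) before the referent goes away.

```python
import gc
import weakref


class Resource(object):
    """Hypothetical class used only for this illustration."""


obj = Resource()
ref = weakref.ref(obj)   # weak reference: does not keep obj alive

assert ref() is obj      # referent is still alive, so calling ref returns it

del obj                  # drop the only strong reference
gc.collect()             # not needed on CPython, harmless elsewhere

referent = ref()         # None once the referent has been destroyed
print(referent)
```

Calling the weak reference is how you ask for the referent; code using it has to be ready for None.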

Slide 10

Slide 10 text

Weak references
Initially proposed in PEP 205

Slide 11

Slide 11 text

Weak references
Initially proposed in PEP 205
Implemented in Python 2.1 (released April 2001)

Slide 12

Slide 12 text

Weak references
Initially proposed in PEP 205
Implemented in Python 2.1 (released April 2001)
Released 14 years ago!

Slide 13

Slide 13 text

Example: WeakSet
First, let’s import WeakSet. Many uses of weak references are with respect to the collections provided by the weakref module:
    from weakref import WeakSet

Slide 14

Slide 14 text

Weak referenceable classes
Define a class X like so:
    class X(object):
        pass
NB: str and certain other classes are not weak referenceable in CPython, but their subclasses can be
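
The NB above can be checked directly. This is a small sketch of CPython behaviour; WeakStr and is_weak_referenceable are hypothetical names introduced for the illustration, and implementations such as Jython may answer differently for plain str.

```python
import weakref


class WeakStr(str):
    """Hypothetical str subclass; its instances can be weakly referenced."""


def is_weak_referenceable(obj):
    """Return True if weakref.ref() accepts obj, False on TypeError."""
    try:
        weakref.ref(obj)
    except TypeError:
        return False
    return True


plain = is_weak_referenceable("abc")          # plain str in CPython
sub = is_weak_referenceable(WeakStr("abc"))   # instance of the subclass
print(plain, sub)
```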

Slide 15

Slide 15 text

Construction
Construct a weak set and add an element to it. We then list the set:
    s = WeakSet()
    s.add(X())
    list(s)

Slide 16

Slide 16 text

Conclusions
s is (eventually) empty - with list(s), we get []

Slide 17

Slide 17 text

Conclusions
s is (eventually) empty - with list(s), we get []
May require a round of garbage collection with gc.collect()
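
Putting the WeakSet steps together as one runnable sketch. The second half, which keeps a strong reference in a variable named kept, is an addition for contrast and is not from the slides.

```python
import gc
from weakref import WeakSet


class X(object):
    pass


s = WeakSet()
s.add(X())           # the temporary X() has no strong reference afterwards
gc.collect()         # may be needed on implementations without refcounting
empty = list(s)      # the element is gone

kept = X()           # this time we hold a strong reference
s.add(kept)
gc.collect()
still_there = list(s)

print(empty, still_there)
```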

Slide 18

Slide 18 text

Some possible questions
Questions you might have in coming to this talk:
What exactly are weak references?
How do they differ from strong references?
When would I use them anyway?
To prevent memory and resource leaks.

Slide 19

Slide 19 text

Resource leaks
Often you can write code like this, without explicitly calling f.close():
    f = open("foo.txt")
    ...
But not always...

Slide 20

Slide 20 text

Garbage collection is not magical
GC works by determining that some set of objects is unreachable:
Doesn’t matter if it’s reference counting

Slide 21

Slide 21 text

Garbage collection is not magical
GC works by determining that some set of objects is unreachable:
Doesn’t matter if it’s reference counting
Or a variant of mark-and-sweep

Slide 22

Slide 22 text

Garbage collection is not magical
GC works by determining that some set of objects is unreachable:
Doesn’t matter if it’s reference counting
Or a variant of mark-and-sweep
Or the combination used by CPython, to account for reference cycles

Slide 23

Slide 23 text

Takeaway
It cannot read your mind, developer though you may be!

Slide 24

Slide 24 text

Takeaway
It cannot read your mind, developer though you may be!
GC is not sufficient to manage the lifecycle of resources

Slide 25

Slide 25 text

Manual clearance
Clean up resources - setting to None, calling close(), ...

Slide 26

Slide 26 text

Manual clearance
Clean up resources - setting to None, calling close(), ...
Use try/finally

Slide 27

Slide 27 text

try/finally
    f = open("foo.txt")  # open before try, so f is always bound in finally
    try:
        ...
    finally:
        f.close()

Slide 28

Slide 28 text

Manual clearance
Clean up resources - setting to None, calling close(), ...
Use try/finally
Apply deeper knowledge of your code

Slide 29

Slide 29 text

Manual clearance
Clean up resources - setting to None, calling close(), ...
Use try/finally
Apply deeper knowledge of your code
Or do cleanup by some other scheme

Slide 30

Slide 30 text

Finalizers with __del__
May use finalizers because of explicit external resource management

Slide 31

Slide 31 text

Finalizers with __del__
May use finalizers because of explicit external resource management
Especially in conjunction with some explicit ref counting

Slide 32

Slide 32 text

socket.makefile
    socket.makefile([mode[, bufsize]])
Return a file object associated with the socket. (File objects are described in File Objects.) The file object does not close the socket explicitly when its close() method is called, but only removes its reference to the socket object, so that the socket will be closed if it is not referenced from anywhere else.

Slide 33

Slide 33 text

errno.EMFILE?
Otherwise we may see an IOError raised with errno.EMFILE (“Too many open files”)

Slide 34

Slide 34 text

socket.makefile
    socket.makefile([mode[, bufsize]])
Return a file object associated with the socket. (File objects are described in File Objects.) The file object does not close the socket explicitly when its close() method is called, but only removes its reference to the socket object, so that the socket will be closed if it is not referenced from anywhere else.
Implementation is done through a separate ref counting scheme

Slide 35

Slide 35 text

_fileobject
Prevent resource leaks (of underlying sockets) in the socket module:
    class _fileobject(object):
        ...
        def __del__(self):
            try:
                self.close()
            except:
                # close() may fail if __init__ didn't complete
                pass
NB: changed in Python 3.x, above is 2.7 implementation

Slide 36

Slide 36 text

with statement for ARM
You are already using automatic resource management, right?
    with open("foo.txt") as f:
        ...

Slide 37

Slide 37 text

So far, so good
No weak references yet

Slide 38

Slide 38 text

So far, so good
No weak references yet
Keeping it simple!

Slide 39

Slide 39 text

So far, so good
No weak references yet
Keeping it simple!
No need to be in this talk, right?

Slide 40

Slide 40 text

What if...
An object is a child in a parent-child relationship?

Slide 41

Slide 41 text

What if...
An object is a child in a parent-child relationship?
And needs to track its parent?

Slide 42

Slide 42 text

What if...
An object is a child in a parent-child relationship?
And needs to track its parent?
And the parent wants to track the child?

Slide 43

Slide 43 text

What if...
An object is a child in a parent-child relationship?
And needs to track its parent?
And the parent wants to track the child?
Example: xml.sax.expatreader

Slide 44

Slide 44 text

Make it even simpler
Let’s implement a doubly-linked list - next and previous references

Slide 45

Slide 45 text

Make it even simpler
Let’s implement a doubly-linked list - next and previous references
But also add __del__ to clean up resources

Slide 46

Slide 46 text

OrderedDict
Dict that preserves the order of insertion, for iteration and indexed access

Slide 47

Slide 47 text

OrderedDict
Dict that preserves the order of insertion, for iteration and indexed access
Asymptotic performance (big-O running time) same as regular dicts

Slide 48

Slide 48 text

OrderedDict
Dict that preserves the order of insertion, for iteration and indexed access
Asymptotic performance (big-O running time) same as regular dicts
Uses a doubly-linked list to preserve insertion order

Slide 49

Slide 49 text

Avoiding reference cycles
Why is avoiding strong reference cycles important?

Slide 50

Slide 50 text

Avoiding reference cycles
Why is avoiding strong reference cycles important?
CPython’s GC usually does reference counting

Slide 51

Slide 51 text

Avoiding reference cycles
Why is avoiding strong reference cycles important?
CPython’s GC usually does reference counting
But the refcounts of objects in a cycle cannot go to zero
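
A small sketch of why a strong cycle defeats reference counting. Node is a hypothetical class, and the cyclic collector is switched off for the duration of the demo only so that it cannot run at an inconvenient moment; the observed values describe CPython.

```python
import gc
import weakref


class Node(object):
    def __init__(self):
        self.other = None


gc.disable()                      # demo only: no automatic cycle collection

a, b = Node(), Node()
a.other, b.other = b, a           # strong reference cycle: a <-> b
probe = weakref.ref(a)            # lets us observe a without keeping it alive

del a, b                          # refcounts stay above zero due to the cycle
alive_after_del = probe() is not None

gc.collect()                      # explicit run of the cycle collector
alive_after_collect = probe() is not None

gc.enable()
print(alive_after_del, alive_after_collect)
```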

Slide 52

Slide 52 text

Under the hood
CPython’s weak reference scheme stores a list of containers to be cleared out, including proxies

Slide 53

Slide 53 text

Under the hood
CPython’s weak reference scheme stores a list of containers to be cleared out, including proxies
Performed when the referred object is deallocated

Slide 54

Slide 54 text

Under the hood
CPython’s weak reference scheme stores a list of containers to be cleared out, including proxies
Performed when the referred object is deallocated
Which occurs when the refcount goes to zero

Slide 55

Slide 55 text

Under the hood
CPython’s weak reference scheme stores a list of containers to be cleared out, including proxies
Performed when the referred object is deallocated
Which occurs when the refcount goes to zero
No waiting on the garbage collector!
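
The "no waiting" point can be observed with a weak reference callback. This sketch assumes CPython, where the callback runs as soon as the last strong reference is dropped; Payload, events and on_finalize are hypothetical names.

```python
import weakref


class Payload(object):
    pass


events = []


def on_finalize(ref):
    # Called with the (now dead) weak reference object.
    events.append("referent gone")


obj = Payload()
ref = weakref.ref(obj, on_finalize)   # keep ref alive so the callback can fire

events.append("before del")
del obj                               # refcount hits zero: callback runs here
events.append("after del")

print(events)
```

On implementations without reference counting the callback would instead run at some later collection, so the ordering shown is not portable.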

Slide 56

Slide 56 text

Example: set
From setobject.c in CPython 3.5
    static void set_dealloc(PySetObject *so) {
        setentry *entry;
        Py_ssize_t fill = so->fill;
        PyObject_GC_UnTrack(so);
        Py_TRASHCAN_SAFE_BEGIN(so)
        if (so->weakreflist != NULL)
            PyObject_ClearWeakRefs((PyObject *) so);
        ...
Also explains why many lightweight objects in CPython are not weak referenceable - this avoids the extra overhead of the weakreflist

Slide 57

Slide 57 text

Ref cycles using GC in CPython
Strong reference cycles have to wait for mark-and-sweep GC

Slide 58

Slide 58 text

Ref cycles using GC in CPython
Strong reference cycles have to wait for mark-and-sweep GC
CPython’s GC is stop-the-world

Slide 59

Slide 59 text

Ref cycles using GC in CPython
Strong reference cycles have to wait for mark-and-sweep GC
CPython’s GC is stop-the-world
Runs only per the decision criteria set with gc.set_threshold, which is now generational

Slide 60

Slide 60 text

Ref cycles using GC in CPython
Strong reference cycles have to wait for mark-and-sweep GC
CPython’s GC is stop-the-world
Runs only per the decision criteria set with gc.set_threshold, which is now generational
Doesn’t necessarily occur when you need it to close that file, or deal with some other issue
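
For reference, the collector's decision criteria can be inspected from Python with the gc module. This sketch only reads the current settings and forces one collection; the actual threshold values vary between Python versions, so none are assumed here.

```python
import gc

thresholds = gc.get_threshold()   # tuple of three per-generation thresholds
counts = gc.get_count()           # current allocation counts per generation
unreachable = gc.collect()        # force a full collection right now

print("thresholds:", thresholds)
print("counts before collect:", counts)
print("unreachable objects found:", unreachable)
```

gc.set_threshold takes the same kind of values if you need to tune how often collection runs, but that only changes when cycles are found, not whether a file gets closed on time.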

Slide 61

Slide 61 text

Useful points to consider
My experience with garbage collectors is that they work well, except when they don’t

Slide 62

Slide 62 text

Useful points to consider
My experience with garbage collectors is that they work well, except when they don’t
Especially around a small object pointing to an expensive resource

Slide 63

Slide 63 text

Useful points to consider
My experience with garbage collectors is that they work well, except when they don’t
Especially around a small object pointing to an expensive resource
Which you might see with resources that have limits

Slide 64

Slide 64 text

Bug!
http://bugs.python.org/issue9825
For 2.7, removed __del__ in r84725

Slide 65

Slide 65 text

Bug!
http://bugs.python.org/issue9825
For 2.7, removed __del__ in r84725
For 3.2, replaced __del__ with weakrefs in r84727

Slide 66

Slide 66 text

Bug!
http://bugs.python.org/issue9825
For 2.7, removed __del__ in r84725
For 3.2, replaced __del__ with weakrefs in r84727
For 3.4, using __del__ no longer means ref cycles are uncollectable garbage

Slide 67

Slide 67 text

Python 2.7 solution
Issue #9825: removed __del__ from the definition of collections.OrderedDict. This prevents user-created self-referencing ordered dictionaries from becoming permanently uncollectable GC garbage. The downside is that removing __del__ means that the internal doubly-linked list has to wait for GC collection rather than freeing memory immediately when the refcnt drops to zero.
So this is an important fix - don’t want uncollectable garbage!

Slide 68

Slide 68 text

Bug!
http://bugs.python.org/issue9825
For 2.7, removed __del__ in r84725
For 3.2, replaced __del__ with weakrefs in r84727

Slide 69

Slide 69 text

Bug!
http://bugs.python.org/issue9825
For 2.7, removed __del__ in r84725
For 3.2, replaced __del__ with weakrefs in r84727
For 3.4, using __del__ no longer means ref cycles are uncollectable garbage
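
The 3.4 behaviour can be sketched as follows, assuming Python 3.4 or later. Noisy is a hypothetical class whose __del__ just records that it ran; on older versions the same cycle would have ended up in gc.garbage instead.

```python
import gc

finalized = []


class Noisy(object):
    def __init__(self, name):
        self.name = name
        self.partner = None

    def __del__(self):
        finalized.append(self.name)


a, b = Noisy("a"), Noisy("b")
a.partner, b.partner = b, a       # cycle between two objects with __del__

del a, b
gc.collect()                      # the cycle is found and both are finalized

leftover = [o for o in gc.garbage if isinstance(o, Noisy)]
print(sorted(finalized), leftover)
```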

Slide 70

Slide 70 text

Weak references to the rescue!
See implementation of collections.OrderedDict

Slide 71

Slide 71 text

Crux of the code
Use __slots__ to minimize overhead - no need for a dict per object here
    __slots__ = 'prev', 'next', 'key', '__weakref__'

Slide 72

Slide 72 text

Crux of the code
Use __slots__ to minimize overhead - no need for a dict per object here
__weakref__ means that a slots-built class should be weak referenceable
    __slots__ = 'prev', 'next', 'key', '__weakref__'

Slide 73

Slide 73 text

Crux of the code
Use __slots__ to minimize overhead - no need for a dict per object here
__weakref__ means that a slots-built class should be weak referenceable
NB: no-op in implementations like Jython
    __slots__ = 'prev', 'next', 'key', '__weakref__'

Slide 74

Slide 74 text

Crux of the code (2)
    root.prev = proxy(link)
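
To show how the two fragments fit together, here is a simplified sketch of a doubly-linked list of keys in the style of the pure-Python OrderedDict: forward (next) links are strong, backward (prev) links are weak proxies, so no strong cycle is formed. LinkedKeys and its attribute names are hypothetical; this is not the standard library code itself.

```python
import weakref
from weakref import proxy


class Link(object):
    # No per-instance dict; '__weakref__' keeps instances weak referenceable.
    __slots__ = 'prev', 'next', 'key', '__weakref__'


class LinkedKeys(object):
    """Insertion-ordered keys on a circular list with a sentinel root."""

    def __init__(self):
        self._hardroot = Link()                  # only strong ref to sentinel
        self._root = root = proxy(self._hardroot)
        root.prev = root.next = root             # empty list points at itself
        self._map = {}                           # key -> link

    def append(self, key):
        link = Link()
        self._map[key] = link
        root = self._root
        last = root.prev
        link.prev, link.next, link.key = last, root, key
        last.next = link                         # strong forward link
        root.prev = proxy(link)                  # weak backward link

    def __iter__(self):
        root = self._root
        curr = root.next
        while curr is not root:
            yield curr.key
            curr = curr.next


keys = LinkedKeys()
for k in ("a", "b", "c"):
    keys.append(k)
order = list(keys)

probe = weakref.ref(keys._hardroot)
del keys                         # no strong cycles: freed at once on CPython
freed = probe() is None
print(order, freed)
```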

Slide 75

Slide 75 text

Lookup tables
Want to provide more information about a given object

Slide 76

Slide 76 text

Lookup tables
Want to provide more information about a given object
Without extending/monkeypatching it

Slide 77

Slide 77 text

Lookup tables
Want to provide more information about a given object
Without extending/monkeypatching it
(So no use of __dict__ for extra properties)

Slide 78

Slide 78 text

Using a dict
Could use the object as a key

Slide 79

Slide 79 text

Using a dict
Could use the object as a key
But need to manually clean up the dict when the object is no longer needed

Slide 80

Slide 80 text

Using a dict
Could use the object as a key
But need to manually clean up the dict when the object is no longer needed
Maybe you know, maybe you don’t. Especially useful for libraries

Slide 81

Slide 81 text

WeakKeyDictionary
Insert the object as the key

Slide 82

Slide 82 text

WeakKeyDictionary
Insert the object as the key
Associate anything you want as a value - list of properties, another object, etc

Slide 83

Slide 83 text

WeakKeyDictionary
Insert the object as the key
Associate anything you want as a value - list of properties, another object, etc
When the object used as key goes away, the value is also cleared out (if nothing else is holding onto it)
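
A minimal sketch of such a lookup table. Widget and extra_info are hypothetical names; the point is that the side table never has to be cleaned up by hand.

```python
import gc
from weakref import WeakKeyDictionary


class Widget(object):
    """Hypothetical class we do not want to extend or monkeypatch."""


extra_info = WeakKeyDictionary()   # object -> extra information about it

w = Widget()
extra_info[w] = {"label": "main window"}
label = extra_info[w]["label"]
size_while_alive = len(extra_info)

del w                              # last strong reference to the key
gc.collect()                       # for implementations without refcounting
size_after = len(extra_info)       # entry (and its value) is gone

print(label, size_while_alive, size_after)
```

Note that the value must not hold a strong reference back to the key, otherwise the entry keeps itself alive.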

Slide 84

Slide 84 text

Example: Django signals
Django uses weak references in the implementation of its signal mechanism:
Django includes a “signal dispatcher” which helps allow decoupled applications get notified when actions occur elsewhere in the framework. In a nutshell, signals allow certain senders to notify a set of receivers that some action has taken place. They’re especially useful when many pieces of code may be interested in the same events.

Slide 85

Slide 85 text

WeakKeyDictionary
Avoid computing the senders-receivers coupling on the fly, the easy way:
    self.sender_receivers_cache = weakref.WeakKeyDicti if use_caching else {}

Slide 86

Slide 86 text

WeakValueDictionary
Why?

Slide 87

Slide 87 text

WeakValueDictionary
Why?
Used by multiprocessing (track processes), logging (track handlers), symtable...

Slide 88

Slide 88 text

WeakValueDictionary
Why?
Used by multiprocessing (track processes), logging (track handlers), symtable...
Useful for when you want to track the object by some id, and there should only be one, but once the object is no longer needed, you can let it go

Slide 89

Slide 89 text

Object lifecycle independence
One side may depend on the other, but not vice versa

Slide 90

Slide 90 text

Object lifecycle independence
One side may depend on the other, but not vice versa
Use weak references for the independent side - process is terminated, can remove the lookup by process id

Slide 91

Slide 91 text

Object lifecycle independence
One side may depend on the other, but not vice versa
Use weak references for the independent side - process is terminated, can remove the lookup by process id
-> WeakValueDictionary
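
The "track by id, only one, then let it go" idea can be sketched as a small registry. Worker, _workers and get_worker are hypothetical names; this is not how multiprocessing or logging are actually written.

```python
import gc
from weakref import WeakValueDictionary


class Worker(object):
    def __init__(self, worker_id):
        self.worker_id = worker_id


_workers = WeakValueDictionary()   # worker_id -> Worker, held weakly


def get_worker(worker_id):
    """Return the single live Worker for this id, creating it if needed."""
    worker = _workers.get(worker_id)
    if worker is None:
        worker = Worker(worker_id)
        _workers[worker_id] = worker
    return worker


first = get_worker(42)
same = get_worker(42) is first     # the registry hands back the same object

del first                          # nobody needs the worker any more
gc.collect()                       # for implementations without refcounting
tracked_after = 42 in _workers     # the lookup entry went away with it

print(same, tracked_after)
```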

Slide 92

Slide 92 text

Combining both weak keys and weak values?
Yes, it does make sense. Both sides are independent.

Slide 93

Slide 93 text

Example: Mapping Java classes to Python wrappers
Jython implements this variant of the Highlander pattern:
Map the Java class to Python wrappers (strong ref from using Java code)

Slide 94

Slide 94 text

Example: Mapping Java classes to Python wrappers
Jython implements this variant of the Highlander pattern:
Map the Java class to Python wrappers (strong ref from using Java code)
Python classes to any using Java class (strong ref from using Python code)

Slide 95

Slide 95 text

Example: Mapping Java classes to Python wrappers
Jython implements this variant of the Highlander pattern:
Map the Java class to Python wrappers (strong ref from using Java code)
Python classes to any using Java class (strong ref from using Python code)
AND there can only be one mapping (or at least should be)

Slide 96

Slide 96 text

Either might go away
Why?

Slide 97

Slide 97 text

Either might go away
Why?
Java classes will be garbage collected if no ClassLoader (the parent of the class effectively) or objects of that class exist;

Slide 98

Slide 98 text

Either might go away
Why?
Java classes will be garbage collected if no ClassLoader (the parent of the class effectively) or objects of that class exist;
But Python usage of this class will be GCed if no usage on the Python side - no subclasses in Python, etc

Slide 99

Slide 99 text

Implementations
Pure Python recipe available (http://code.activestate.com/recipes/528879-weak-key-and-value-dictionary/), but I haven’t evaluated it

Slide 100

Slide 100 text

Implementations
Pure Python recipe available (http://code.activestate.com/recipes/528879-weak-key-and-value-dictionary/), but I haven’t evaluated it
Easy Jython version, because of JVM ecosystem

Slide 101

Slide 101 text

Jython version
    from jythonlib import MapMaker, dict_builder

    class WeakKeyValueDictionary(dict):
        def __new__(cls, *args, **kw):
            return WeakKeyValueDictionaryBuilder(*args, **
        # also add itervaluerefs, valuerefs,
        # iterkeyrefs, keyrefs

Slide 102

Slide 102 text

Hook into Google Guava Collections
    WeakKeyValueDictionaryBuilder = dict_builder(
        MapMaker().weakKeys().weakValues().makeMap,
        WeakKeyValueDictionary)