Nina Zakharenko - The Basics of Memory Management in Python - North Bay Python 2018

https://2018.northbaypython.org/schedule/presentation/19/

As a new Python developer, do you find memory management in Python confusing? Come to this talk to learn the basics of how memory management works in Python. We'll cover the concepts of reference counting, garbage collection, weak references, __slots__, and the Global Interpreter Lock.

The documentation immediately jumps into difficult-to-follow concepts, especially if you don't have a background in Computer Science.

I'll provide a simple, easy-to-follow overview of the concepts that a developer needs to be familiar with in order to scratch the surface of how memory management and garbage collection work in Python.

Nina Zakharenko

November 04, 2018

Transcript

  1. A name is just a label for an object. In Python, each object can have lots of names, like 'x' and 'y'. @nnja
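To make the idea of names as labels concrete, here is a minimal sketch. The variable names are illustrative only and are not taken from the talk.

```python
x = [1, 2, 3]   # one list object, labelled x
y = x           # y is a second label for the same object, not a copy

same_object = x is y   # identity check: both names refer to one object
y.append(4)            # a change made through y ...
print(same_object, x)  # ... is visible through x as well
```

Because both names point at the same list, appending through `y` changes what `x` shows.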
  2. Different Types of Objects

    - Simple: numbers, strings
    - Container: dict, list, user-defined classes

    Container objects can contain simple objects, or other container objects.
  3. What does del do?

    The del statement doesn't delete objects. It:
    - removes that name as a reference to that object
    - reduces the ref count by 1
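The effect of del on the reference count can be observed with `sys.getrefcount`. This is a sketch that assumes CPython. The absolute numbers it prints include a temporary reference held by the `getrefcount` call itself, so only the difference is meaningful.

```python
import sys

data = ["a", "b"]   # the list object
alias = data        # a second name for the same list

before = sys.getrefcount(data)
del alias           # removes the name, not the object
after = sys.getrefcount(data)

print(before - after)  # the count dropped by one
print(data)            # the object is still reachable through data
```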
  4. When there are no more references, the object can be safely removed from memory.
  5. Local vs. global namespace

    If refcounts decrease when an object goes out of scope, what happens to objects in the global namespace?
    - They never go out of scope!
    - Their refcount never reaches 0.
    - Avoid putting large or complex objects in the global namespace.
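The difference between local and global lifetimes can be sketched with weak references, which observe an object without keeping it alive. `Payload`, `work`, and `GLOBAL_OBJ` are hypothetical names, and the immediate cleanup shown here is CPython behaviour.

```python
import weakref

class Payload:
    """Hypothetical stand-in for a large object."""

def work():
    local_obj = Payload()
    # Return only a weak reference; it does not keep local_obj alive.
    return weakref.ref(local_obj)

GLOBAL_OBJ = Payload()               # lives in the global namespace
global_ref = weakref.ref(GLOBAL_OBJ)

local_ref = work()
local_freed = local_ref() is None        # the local went out of scope
global_alive = global_ref() is not None  # the global name still holds it
print(local_freed, global_alive)
```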
  6. Every Python object holds 3 things

    - Its type
    - A reference count
    - Its value
  7. REPL session:

        >>> def mem_test():
        ...     x = 300
        ...     y = 300
        ...     print(id(x))
        ...     print(id(y))
        ...     print(x is y)
        ...
        >>> mem_test()
        4504654160
        4504654160
        True

    ℹ Note: run this from a function in the REPL, or from a file.
  8. What is Garbage Collection?

    A way for a program to automatically release memory when the object taking up that space is no longer in use.
  9. How does reference counting garbage collection work?

    1. Add and remove references.
    2. When the refcount reaches 0, remove the object.
    3. Cascading effect: decrease the ref count of any object the deleted object was pointing to.
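The cascading effect in step 3 can be sketched as follows. `Node` is a hypothetical class, and the immediate cleanup is what CPython's reference counting does; other interpreters may behave differently.

```python
import weakref

class Node:
    """Hypothetical container that points at one child object."""
    def __init__(self, child=None):
        self.child = child

child = Node()
parent = Node(child)
child_ref = weakref.ref(child)   # lets us observe child without owning it

del child                        # parent.child still refers to the object
alive_before = child_ref() is not None

del parent                       # parent is freed, so child's count drops to 0
alive_after = child_ref() is not None
print(alive_before, alive_after)
```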
  10. Reference Counting Garbage Collection: The Good

    - Easy to implement
    - When refcount is 0, objects are immediately deleted.
  11. Reference Counting: The Bad

    - Space overhead: a reference count is stored for every object.
    - Execution overhead: the reference count is changed on every assignment.
  12. Python maintains a list of every object created as a program is run. Actually, it makes 3:

    - generation 0
    - generation 1
    - generation 2

    Newly created objects are stored in generation 0.
  13. Only container objects with a refcount greater than 0 will be stored in a generation list.
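Whether an object is tracked by the generational collector can be checked with `gc.is_tracked`. This is a small sketch assuming CPython; the sample values are illustrative.

```python
import gc

# Simple objects cannot form reference cycles, so they are not tracked.
int_tracked = gc.is_tracked(42)
str_tracked = gc.is_tracked("hello")

# A list is a container and could take part in a cycle, so it is tracked.
list_tracked = gc.is_tracked([])

print(int_tracked, str_tracked, list_tracked)
```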
  14. When the number of objects in a generation reaches a threshold, Python runs a garbage collection algorithm on that generation, and any generations younger than it.
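The generations and their thresholds can be inspected through the `gc` module. The values printed depend on the Python version and on what the program has allocated so far, so none are shown here.

```python
import gc

thresholds = gc.get_threshold()  # one threshold per generation
counts = gc.get_count()          # current counters per generation

for generation, (count, threshold) in enumerate(zip(counts, thresholds)):
    print(f"generation {generation}: count={count}, threshold={threshold}")
```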
  15. What happens during a generational garbage collection cycle?

    1. Python makes a list for objects to discard.
    2. It runs an algorithm to detect reference cycles.
    3. If an object has no outside references, add it to the discard list.
    4. When the cycle is done, free up the objects on the discard list.
  16. After a garbage collection cycle, objects that survived will be promoted to the next generation. Objects in the last generation (2) stay there as the program executes.
  17. When the ref count reaches 0, you get immediate cleanup. If you have a cycle, you need to wait for garbage collection to run.
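The difference between immediate cleanup and waiting for the collector can be sketched with a two-object cycle. `Node` is a hypothetical class. Automatic collection is disabled during the demo only so that the timing is predictable; this assumes CPython.

```python
import gc
import weakref

class Node:
    """Hypothetical object that can refer to another Node."""

gc.disable()                 # keep automatic collection out of the demo

a = Node()
b = Node()
a.other = b                  # a -> b
b.other = a                  # b -> a, forming a reference cycle
a_ref = weakref.ref(a)

del a, b                     # the names are gone, but refcounts stay above 0
alive_after_del = a_ref() is not None

gc.collect()                 # the cycle detector finds and frees the pair
alive_after_collect = a_ref() is not None

gc.enable()
print(alive_after_del, alive_after_collect)
```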
  18. After garbage collection, the size of the Python program likely won’t shrink.

    - The freed memory is fragmented, i.e. it's not freed in one continuous block.
    - When we say memory is freed during garbage collection, it’s released back to Python to use for other objects, not necessarily to the system.
  19. Python instances have a dict of values

        class Dog(object):
            pass

        buddy = Dog()
        buddy.name = 'Buddy'
        print(buddy.__dict__)
        # {'name': 'Buddy'}
  20. AttributeError

        'Hello'.name = 'Fred'

        AttributeError Traceback (most recent call last)
        ----> 1 'Hello'.name = 'Fred'
        AttributeError: 'str' object has no attribute 'name'
  21. __slots__

        class Point(object):
            __slots__ = ('x', 'y')

        point = Point()
        point.x = 5
        point.y = 7
        point.name = "Fred"

        Traceback (most recent call last):
          File "point.py", line 8, in <module>
            point.name = "Fred"
        AttributeError: 'Point' object has no attribute 'name'
  22. Size of dict vs. size of tuple

        import sys
        sys.getsizeof(dict())
        sys.getsizeof(tuple())

    sizeof dict: 232 bytes
    sizeof tuple: 40 bytes
    (Exact sizes vary by Python version and platform.)
  23. When to use slots?

    - You are creating many instances of a class.
    - You know in advance what properties the class should have.

    Link: Saving 9 GB of RAM with __slots__
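A rough way to see where the savings come from is to compare an ordinary class with a slotted one. `PointDict` and `PointSlots` are hypothetical names. The byte counts from `sys.getsizeof` are shallow and version-dependent, so treat them as an indication rather than a measurement.

```python
import sys

class PointDict:
    """Ordinary class: each instance carries its own __dict__."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

class PointSlots:
    """Slotted class: attributes live in fixed slots, no per-instance dict."""
    __slots__ = ("x", "y")
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = PointDict(5, 7)
q = PointSlots(5, 7)

plain_has_dict = hasattr(p, "__dict__")
slots_has_dict = hasattr(q, "__dict__")

# Shallow sizes only; the plain instance also pays for its attribute dict.
plain_size = sys.getsizeof(p) + sys.getsizeof(p.__dict__)
slots_size = sys.getsizeof(q)
print(plain_has_dict, slots_has_dict, plain_size, slots_size)
```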
  24. weakref

    - A weakref to an object is not enough to keep it alive.
    - When the only remaining references are weak references, the object can be garbage collected.
    - Useful for implementing caches or mappings holding large objects.

    Link: python3 weakref docs
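A cache of the kind mentioned above can be sketched with `weakref.WeakValueDictionary` from the standard library. `Image` and the cache key are hypothetical, and the immediate removal of the entry relies on CPython's reference counting.

```python
import weakref

class Image:
    """Hypothetical stand-in for a large object worth caching."""
    def __init__(self, name):
        self.name = name

cache = weakref.WeakValueDictionary()

img = Image("logo.png")
cache["logo"] = img                       # the cache holds only a weak reference

hit_while_alive = cache.get("logo") is img

del img                                   # last strong reference is gone
entry_after_del = cache.get("logo")       # the entry disappears with the object
print(hit_while_alive, entry_after_del)
```

The cache never prevents an image from being freed; it only offers a shortcut while something else still holds the object.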
  25. Advantages / Disadvantages of a GIL

    Upside: Reference counting is fast and easy to implement.
    Downside: In a Python program, no matter how many threads exist, only one thread will be executed at a time.
  26. Want to take advantage of multiple cores?

    - Use multi-processing instead of multi-threading.
    - Each process will have its own GIL; it’s up to the developer to figure out a way to share information between processes.
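One way to spread CPU-bound work over several processes is `concurrent.futures.ProcessPoolExecutor` from the standard library. This is a minimal sketch, not the approach shown in the talk. `count_down` and `run_in_processes` are hypothetical names, and results travel back to the parent by pickling, which is one of the ways processes can share information.

```python
from concurrent.futures import ProcessPoolExecutor

def count_down(n):
    """CPU-bound busy work (hypothetical stand-in for real computation)."""
    while n > 0:
        n -= 1
    return n

def run_in_processes(jobs):
    # Each worker is a separate process with its own interpreter and GIL.
    with ProcessPoolExecutor(max_workers=2) as pool:
        return list(pool.map(count_down, jobs))

if __name__ == "__main__":
    # The guard keeps child processes from re-running this block on import.
    print(run_in_processes([1_000_000, 1_000_000]))
```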
  27. Additional Reading

    - Great explanation of generational garbage collection and Python’s reference detection algorithm
    - Weak Reference Documentation
    - Python Module of the Week - gc
    - PyPy STM - GIL-less Python Interpreter
    - Saving 9GB of RAM with Python’s __slots__
  28. Getting in-depth with the GIL

    - Dave Beazley - Guide on how the GIL Operates
    - Dave Beazley - New GIL in Python 3.2
    - Dave Beazley - Inside Look at Infamous GIL Patch
  29. Why can’t we use the REPL to follow along at home?

    - Because it doesn’t behave like a typical Python program that’s being executed.
    - Further reading
  30. Python pre-loads objects

    - Many objects are loaded by Python as the interpreter starts.
    - Often grouped with peephole optimization, though the pre-loading itself is object caching (interning).
    - Numbers: -5 to 256
    - Single-letter strings
    - Common exceptions
    - Further reading
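The pre-loaded range of small integers can be observed by computing values at run time and comparing identities. This relies on a CPython implementation detail, so it is a sketch for exploration and not something to depend on in real code. The variable names are illustrative.

```python
base = 200

# 256 is inside the pre-loaded range: both results are the same cached object.
a = base + 56
b = base + 56
small_shared = a is b

# 257 is outside the range: each computation builds a new int object.
c = base + 57
d = base + 57
large_shared = c is d

print(small_shared, large_shared)
```

Values are computed from a variable here so that the compiler cannot merge identical literals, which is what makes `x is y` come out True inside a single function.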
  31. Attempting to remove the GIL - A Gilectomy

    - Larry Hastings - Removing Python's GIL - The Gilectomy
    - Larry Hastings - The Gilectomy, How it's going
    - Gilectomy on GitHub
    - A Gilectomy Update