PyBay
August 20, 2016

2016 - Wesley Chun - Python 103: Memory Model & Best Practices

Description
There's a growing crowd of Python users who don't consider themselves beginners anymore. However, some users at this stage discover odd behavior that's hard to explain. Why doesn't code behave like it should? Why doesn't "correct" code execute correctly? We'll focus on Python's object & memory model, addressing these issues directly. Let's empower attendees to not create these bugs to begin with!

Abstract
In "Python 101," you learned basic Python syntax. In "Python 102" (or equivalent in experience), you went further, exploring Python more deeply -- creating/using classes, decorators, files, other standard library or 3rd-party modules/packages -- and graduated from being purely a beginner. Because Python has been around the block for quite a while now, there is a continuously growing number of "Python 103" programmers out there. Many are no longer new to the language; however, they have run into various issues, bugs, or odd behavior in their code that is difficult to explain. It's time to take a closer look. This is an interactive best practices talk, focusing on how Python objects, references, and the memory model work, as well as thinking about performance. Knowing more about how the interpreter works under the covers, including the relationship between data objects and memory management, will make you a much more effective Python programmer. The main goal of this talk is to empower developers to not (inadvertently) create certain classes of bugs in their code to begin with! All you need to bring is the desire to learn more about the interpreter to take your Python skills to the next level.

Bio
Wesley J Chun is the author of the bestselling Core Python titles and the Python Fundamentals Live Lessons companion video. He is coauthor of Python Web Development with Django (withdjango.com), and has written for Linux Journal, CNET, and InformIT. Wesley is an architect and Developer Advocate at Google.

https://youtu.be/SiXyyOA6RZg

Transcript

  1. Python 103: Understanding Python's Memory Model & Mutability

    +Wesley Chun, Principal, CyberWeb Consulting
    @wescpy :: cyberwebconsulting.com
    Aug 2016, San Francisco
  2. OR

    "An ode to Python best practices in nine parts"
  3. About this talk & YOU

    •Agenda
      •Review of objects and references
      •Judicious use of memory
      •Being (performance) time conscious
      •Mutability & memory referencing gotchas
    •About you
      •Some Python experience… not (necessarily) a beginner
      •Still don't understand "weird" behavior
      •Bugs where you swear the code is correct
      •Want to learn more internals
  4. Stack Utilization During Execution

    •Stack starts at bottom, grows upward
    •Function call made == stack frame pushed
    •Function returns == frame popped; stack shrinks
    •Overhead incurred each function call
    •Add up over time so minimize!
    [Diagram: stack holding the "main()" params, ret addr, and locals, with the foo() params, ret addr, and locals pushed on top; arrows labeled "grows" and "shrinks"]
  5. The Stack and Execution Speed

    •Which of the two loops is "faster?"
        x = 'blah-blah ...heckuva long string...'

        # loop 1
        i = 0
        while i < len(x):
            print x[i],
            i += 1

        # loop 2
        i = 0
        strlen = len(x)
        while i < strlen:
            print x[i],
            i += 1
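
One way to check the question on this slide is to time both loops. The following is only a sketch in Python 3, not the speaker's benchmark: the names `loop1` and `loop2` are hypothetical wrappers, the string length and repeat count are arbitrary, and `out.append()` stands in for `print` so the timing is not dominated by console output.

```python
import timeit

x = "blah-blah " * 1000  # stand-in for the "heckuva long string"

def loop1():
    # Calls len(x) on every pass through the loop.
    out = []
    i = 0
    while i < len(x):
        out.append(x[i])
        i += 1
    return out

def loop2():
    # Calls len(x) once and reuses the result.
    out = []
    i = 0
    strlen = len(x)
    while i < strlen:
        out.append(x[i])
        i += 1
    return out

t1 = timeit.timeit(loop1, number=20)
t2 = timeit.timeit(loop2, number=20)
print("loop 1: %.4f s, loop 2: %.4f s" % (t1, t2))
```

The absolute numbers depend on the machine and interpreter version, so it is the relative difference between the two that is worth looking at.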
  6. map() Built-in Function

    •map(func, seq1[, ...]) – applies func to each element of seq1 and returns an iterable with the results; if any of … is given, applies function (which takes N args) to a tuple containing seq1[i] to seqN[i] for each index i (None filled in for shorter lists)
        >>> x = range(4)   # x = [0, 1, 2, 3]
        >>> def times2(n): return n * 2
        >>> map(times2, x)
        [0, 2, 4, 6]
  7. filter() Built-in Function

    •filter(func, seq) – returns an iterable whose values are those for which func returned True for each value in seq
        >>> x = range(4)   # x = [0, 1, 2, 3]
        >>> def odd(n): return n % 2
        >>> filter(odd, x)
        [1, 3]
  8. Are listcomps faster?

    •map()
        >>> def times2(n): return n * 2
        >>> map(times2, range(5))
        [0, 2, 4, 6, 8]
        >>> [times2(i) for i in range(5)]
        [0, 2, 4, 6, 8]
    •filter()
        >>> def odd(n): return n % 2 != 0
        >>> filter(odd, range(10))
        [1, 3, 5, 7, 9]
        >>> [i for i in range(10) if i % 2 != 0]
        [1, 3, 5, 7, 9]
    •"It depends."
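
Because the slide's answer is "It depends," a measurement is more useful than a rule. Here is a minimal Python 3 sketch that times three equivalent ways of doubling a range; the input size and repeat count are arbitrary assumptions, and `list()` is wrapped around `map()` because in 3.x `map()` no longer returns a list.

```python
import timeit

def times2(n):
    return n * 2

data = range(1000)

# Three ways to build the same list.
via_map = list(map(times2, data))
via_listcomp = [times2(i) for i in data]
via_inline = [i * 2 for i in data]  # no function call per element

timings = {
    "map + function": timeit.timeit(lambda: list(map(times2, data)), number=200),
    "listcomp + function": timeit.timeit(lambda: [times2(i) for i in data], number=200),
    "listcomp, inline expr": timeit.timeit(lambda: [i * 2 for i in data], number=200),
}
for label, seconds in timings.items():
    print("%-22s %.4f s" % (label, seconds))
```

The third variant avoids a function call per element, which ties back to the earlier point about call overhead; which variant wins in practice can differ by interpreter version and workload.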
  9. Lists are great!

    •Flexible, powerful data structure
    •2nd most used behind dictionaries
    •Ordered sequences of arbitrary objects
    •Mutable, resizable arrays
    •Small collection of useful methods
    •Highly-optimized under the covers
    •"List comprehensions" allow for quick logical construction
    •A favorite of many Python users
  10. Lists are horrible!

    •Take up unneeded amount of memory
    •List comprehensions most highly-occurring violators
    •People make lists when lists aren't needed
    •Trend to move away from lists in favor of iterators
      •2.x: zip(), range(), map(), filter(), dict.{key,value,item}s() return lists
      •3.x: all return iterators
    •Instead of list comprehensions, use generator expressions
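
To make the memory argument concrete, the sketch below compares a list comprehension with the equivalent generator expression in Python 3. It assumes CPython, where `sys.getsizeof()` reports the size of the container object itself (not the elements it refers to); the element count is arbitrary.

```python
import sys

n = 100000

squares_list = [i * i for i in range(n)]  # builds all n results up front
squares_gen = (i * i for i in range(n))   # produces results one at a time

list_size = sys.getsizeof(squares_list)   # grows with n
gen_size = sys.getsizeof(squares_gen)     # small and independent of n

# Both can feed a consumer such as sum(); the generator is used up afterwards.
total_from_gen = sum(squares_gen)
total_from_list = sum(squares_list)

print("list object:      %d bytes" % list_size)
print("generator object: %d bytes" % gen_size)
```

The trade-off is that a generator can be iterated only once and does not support indexing or `len()`, so it fits best when the values are consumed a single time.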
  11. What's wrong with this?

        f = open('data.txt', 'r')
        uids = [line.strip() \
            for line in f.readlines()]
        f.close()
        uid_csv = ','.join([x for x in uids])
  12. Don't waste memory!

        f = open('data.txt', 'r')
        uids = (line.strip() for line in f)
        f.close()
        uid_csv = ','.join(uids)
    •Wait! There's one more bug…
  13. Don't waste memory!

        f = open('data.txt', 'r')
        uids = (line.strip() for line in f)
        uid_csv = ','.join(uids)
        f.close()
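
The bug on the previous slide comes from the generator being lazy: nothing is read until `join()` consumes it, so the file must still be open at that point. A `with` block is one common way to keep the ordering right. This is a Python 3 sketch rather than the speaker's code; the temporary file and its contents are made up so the example can run on its own.

```python
import os
import tempfile

# Create a small stand-in for data.txt (hypothetical contents).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("alice\n bob \ncarol\n")
    path = tmp.name

# The generator is consumed by join() while the file is still open;
# the file is closed automatically when the with block ends.
with open(path, "r") as f:
    uid_csv = ",".join(line.strip() for line in f)

os.remove(path)
print(uid_csv)  # alice,bob,carol
```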
  14. Part IV: Know how the refcount changes
  15. Objects and References

    •Objects allocated on assignment
    •All objects passed by reference
    •References are also called aliases
    •Reference count used to track total number
    •Count in/decrements based upon usage
    •Objects garbage-collected when count goes to 0
  16. More on References

    •Variables don't "hold" data per se (not memory)
    •Variables just point to objects (aliases)
    •Additional aliases to an object can be created
    •Objects reclaimed when "refcount" goes to 0
    •Be aware of cyclic references
        x = 1
        y = x
    [Diagram: x and y both point to the same object, 1]
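
The warning about cyclic references can be illustrated with a short Python 3 sketch. It assumes CPython's combination of reference counting and a cyclic garbage collector; the `Node` class is hypothetical, and automatic collection is switched off briefly only so the "before" check is deterministic.

```python
import gc
import weakref

class Node:
    """Hypothetical object that can refer to another Node."""
    def __init__(self, name):
        self.name = name
        self.other = None

was_enabled = gc.isenabled()
gc.disable()  # keep the cyclic collector from running on its own

a = Node("a")
b = Node("b")
a.other = b   # a refers to b ...
b.other = a   # ... and b refers back to a: a cycle

probe = weakref.ref(a)  # lets us observe the object without keeping it alive
del a, b                # both names are gone, but each refcount is still 1

alive_before = probe() is not None  # refcounting alone cannot reclaim the cycle
gc.collect()                        # the cyclic collector finds and frees it
alive_after = probe() is not None

if was_enabled:
    gc.enable()

print(alive_before, alive_after)
```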
  17. Reference Count Increased

    •Examples of refcount increment:
      •It (the object) is created (and assigned)
        foo = 'Python is cool!'
      •Additional aliases for it are created
        bar = foo
      •It is passed to a function (new local reference)
        spam(foo)
      •It becomes part of a container object
        lotsaFoos = [123, foo, 'xyz']
  18. Reference Count Decreased

    •Examples of refcount decrement:
      •A local reference goes out-of-scope
        i.e., when spam() ends
      •Aliases for that object are explicitly destroyed
        del bar    # or del foo
      •An alias is reassigned a different object
        bar = 42
      •It is removed from a container object
        lotsaFoos.remove(foo)
      •The container itself is deallocated
        del lotsaFoos    # or out-of-scope
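
The increments and decrements listed on these two slides can be observed with `sys.getrefcount()`. The Python 3 sketch below assumes CPython and reports changes relative to a baseline, because the absolute number includes temporary references and varies between versions. The object is a list rather than the slide's string literal so that no compiler-held references muddy the count.

```python
import sys

foo = ["Python is cool!"]       # a fresh object with one name bound to it
base = sys.getrefcount(foo)     # baseline; the absolute value is not important

bar = foo                       # additional alias
after_alias = sys.getrefcount(foo) - base

lotsa_foos = [123, foo, "xyz"]  # becomes part of a container
after_container = sys.getrefcount(foo) - base

del bar                         # alias explicitly destroyed
lotsa_foos.remove(foo)          # removed from the container
after_cleanup = sys.getrefcount(foo) - base

print(after_alias, after_container, after_cleanup)  # 1 2 0
```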
  19. Part V: Categorize standard types for interrelationships
  20. Categorizing Standard Types

    Why?
    •To make you learn them faster
    •To make you understand them better
    •To know how to view them internally
    •To encourage more proficient programming
    Three Models
    •Storage
    •Update
    •Access
  21. Storage Model

    Model Category    Python Type
    literal/scalar    numbers (all), strings
    container         lists, tuples, dicts, sets

    •How data is stored in an object
    •Can it hold single or multiple objects?
  22. Update Model

    Model Category    Python Type
    mutable           lists, dicts, sets
    immutable         numbers, strings, tuples, frozensets

    •Can an object's value be updated?
    •Mutable == yes and immutable == no
    •There is one of each set type
    •bytearray type is mutable (3.x)
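
A quick way to see the update model in action is to try changing a value in place. The Python 3 sketch below uses made-up values; it also shows a case that trips people up, where an immutable tuple holds a reference to a mutable list.

```python
# Mutable: the list is changed in place and keeps its identity.
nums = [1, 2, 3]
before = id(nums)
nums.append(4)
same_object = id(nums) == before

# Immutable: item assignment on a string raises TypeError.
s = "abc"
try:
    s[0] = "X"
    string_error = None
except TypeError as exc:
    string_error = exc

# "One of each set type": set has add(), frozenset does not.
set_can_add = hasattr(set(), "add")
frozenset_can_add = hasattr(frozenset(), "add")

# A tuple cannot be rebound slot by slot, but a list inside it can still change.
t = (1, [2, 3])
t[1].append(4)

print(same_object, type(string_error).__name__, set_can_add, frozenset_can_add, t)
```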
  23. Access Model

    Model Category    Python Type
    direct            numbers, sets
    sequence          strings, lists, tuples
    mapping           dicts

    •How data is accessed in an object
    •Directly, via index, or by key
    •Primary model for type differentiation
  24. Type Categorization Summary

    Data Type       Storage Model     Update Model    Access Model
    numbers         literal/scalar    immutable       direct
    strings         literal/scalar    immutable       sequence
    lists           container         mutable         sequence
    tuples          container         immutable       sequence
    dictionaries    container         mutable         mapping
    sets            container         im/mutable      direct
  25. Objects & References Quiz

    •What is the output of the code below? WHY?
    Example 1
        x = 42
        y = x
        x = x + 1
        print x
        print y
    Example 2
        x = [ 1, 2, 3 ]
        y = x
        x[0] = 4
        print x
        print y
  26. Quiz Analysis

    •Example 1
        x = 42
        y = x
        x += 1
    [Diagram: x and y both point to 42]
    •Example 2
        x = [1,2,3]
        y = x
        x[0] = 4
    [Diagram: x and y both point to [1, 2, 3]]
  27. Quiz Answers

    Example 1
        >>> x = 42
        >>> y = x
        >>> x += 1
        >>> print x
        43
        >>> print y
        42
    Example 2
        >>> x = [ 1, 2, 3 ]
        >>> y = x
        >>> x[0] = 4
        >>> print x
        [4, 2, 3]
        >>> print y
        [4, 2, 3]
  28. Quiz Epilogue

    •Example 1
        x = 42
        y = x
        x += 1
    [Diagram: y still points to 42; x now points to a new object, 43]
    •Example 2
        x = [1,2,3]
        y = x
        x[0] = 4
    [Diagram: x and y both point to the same list, now [4, 2, 3]]
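
The two quiz examples can be re-run in Python 3 with `is` checks that confirm what the diagrams show. This is an illustrative sketch; the third case (`+=` on a list) is an addition, included because the same operator rebinds an immutable int but mutates a list in place.

```python
# Example 1: ints are immutable, so += rebinds x to a new object.
x = 42
y = x
x += 1
example1 = (x, y, x is y)   # (43, 42, False)

# Example 2: lists are mutable, so the change is visible through both names.
a = [1, 2, 3]
b = a
a[0] = 4
example2 = (a, b, a is b)   # ([4, 2, 3], [4, 2, 3], True)

# Same operator, different behavior: += on a list extends it in place.
c = [1, 2]
d = c
c += [3]
example3 = (d, c is d)      # ([1, 2, 3], True)

print(example1)
print(example2)
print(example3)
```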
  29. Digging deeper into Python

    •We know what this does…
        x = 4
        y = x
    [Diagram: x and y both point to one object, 4]
    •What about this?
        x = 4
        y = 4
    [Diagram: do x and y share one 4, or point to two separate 4 objects?]
  30. Interning of Objects

    •Exception to the general rule
    •Some strings and integers are "interned"
    •Integers in range(-5, 257) [currently]
    •Oft-used, single-character, and empty strings
    •Primarily for performance reasons only
        x = 4
        y = 4
    [Diagram: x and y both point to one object, 4]
        x = 4.3
        y = 4.3
    [Diagram: x and y point to two separate 4.3 objects]
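
Interning can be probed with `is`, with the caveat that most of what this Python 3 sketch prints is a CPython implementation detail and may differ on other interpreters or versions. Values are built at runtime so the compiler cannot merge equal literals. The only documented guarantee used here is `sys.intern()`, which the slide does not mention and is included as an assumption-free way to request interning explicitly.

```python
import sys

# Build values at runtime so equal literals are not merged by the compiler.
a, b = int("4"), int("4")
c, d = int("100000"), int("100000")
f1, f2 = float("4.3"), float("4.3")

small_ints_shared = a is b   # CPython caches small ints, so usually True
large_ints_shared = c is d   # usually False: two separate objects
floats_shared = f1 is f2     # usually False: floats are not cached

s1 = "".join(["memory", " ", "model"])
s2 = "".join(["memory", " ", "model"])
auto_shared = s1 is s2                                 # implementation detail
explicitly_shared = sys.intern(s1) is sys.intern(s2)   # guaranteed by sys.intern()

print(small_ints_shared, large_ints_shared, floats_shared)
print(auto_shared, explicitly_shared)
```

Whatever the identity checks print, the values still compare equal with `==`, which is why identity should not be used to compare numbers or strings.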
  31. What is it with is?

    Ever see any of these?
        if error is True:
            :
        if data is not None:
            :
    Why not these?
        if error == True:
            :
        if data != None:
            :
  32. Why dereference if optional?

        $ python -m timeit -s 'x = None' 'x == None'
        10000000 loops, best of 3: 0.0522 usec per loop
        $ python -m timeit -s 'x = None' 'x == None'
        10000000 loops, best of 3: 0.0526 usec per loop
        $ python -m timeit -s 'x = None' 'x == None'
        10000000 loops, best of 3: 0.0516 usec per loop
        $ python -m timeit -s 'x = None' 'x is None'
        10000000 loops, best of 3: 0.0317 usec per loop
        $ python -m timeit -s 'x = None' 'x is None'
        10000000 loops, best of 3: 0.0311 usec per loop
        $ python -m timeit -s 'x = None' 'x is None'
        10000000 loops, best of 3: 0.0314 usec per loop
        $ python3 -m timeit -s 'x = None' 'x == None'
        10000000 loops, best of 3: 0.0261 usec per loop
        $ python3 -m timeit -s 'x = None' 'x == None'
        10000000 loops, best of 3: 0.0273 usec per loop
        $ python3 -m timeit -s 'x = None' 'x == None'
        10000000 loops, best of 3: 0.0256 usec per loop
        $ python3 -m timeit -s 'x = None' 'x is None'
        10000000 loops, best of 3: 0.0202 usec per loop
        $ python3 -m timeit -s 'x = None' 'x is None'
        10000000 loops, best of 3: 0.0202 usec per loop
        $ python3 -m timeit -s 'x = None' 'x is None'
        10000000 loops, best of 3: 0.0199 usec per loop
    In C parlance…
        int *x, *y;
        :
        if (x == y) {…
    vs.
        if (*x == *y) {…
    But please not…
        if x is 42:
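
Speed is one reason to prefer `is None`; another is that `==` can be redefined by the object being tested, while `is` cannot. The Python 3 sketch below uses a deliberately misbehaving, hypothetical class to show the difference.

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""
    def __eq__(self, other):
        return True

obj = AlwaysEqual()

eq_says_none = (obj == None)  # True: __eq__ is consulted and answers yes
is_says_none = (obj is None)  # False: identity check, nothing to override

print(eq_says_none, is_says_none)  # True False
```

In the slide's C analogy, `is` compares the pointers and `==` compares what they point to, which is also why `if x is 42:` is the wrong tool for comparing values.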
  33. Objects and References Quiz 2

    •Copy objects using their factory function(s)
    •Can use improper slice with sequences
    •What is the output here (and WHY)?
        x = ['foo', [1,2,3], 10.4]
        y = list(x)    # or x[:]
        y[1][0] = 4
        print x
        print y
  34. Quiz analysis

    •Copying a list
        x = ['foo', [1, 2, 3], 10.4]
        y = list(x)    # or x[:]
        y[1][0] = 4
    [Diagram: x points to ['foo', [1,2,3], 10.4]; y points to a separate ['foo', [1,2,3], 10.4]]
  35. Quiz 2 Answer

        >>> x = ['foo', [1,2,3], 10.4]
        >>> y = list(x)    # or x[:]
        >>> y[1][0] = 4
        >>> print x
        ['foo', [4, 2, 3], 10.4]
        >>> print y
        ['foo', [4, 2, 3], 10.4]
  36. Correct analysis

    •Copying a list means copying references, not…
        x = ['foo', [1, 2, 3], 10.4]
        y = list(x)    # or x[:]
        y[1][0] = 4
    [Diagram: x and y are two separate lists [ , , ] whose slots point to the same three objects: 'foo', [4, 2, 3], and 10.4]
  37. Copying Objects

    •Trickier with mutable objects
    •Let's say you have a list a you wish to copy to b
    •Creating an alias not a copy
        b = a    # a == b and a is b [id(a) == id(b)]
    •Creating a shallow copy (all objects inside are aliases!)
        b = a[:]    # a == b but a is not b
    •Creating a deep copy (all objects inside are copies)
      •Use the deepcopy() function in the copy module
        b = copy.deepcopy(a)
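
The three kinds of "copy" on this slide can be compared side by side. This Python 3 sketch reuses the list from Quiz 2; the names `alias`, `shallow`, and `deep` are chosen here for illustration.

```python
import copy

a = ["foo", [1, 2, 3], 10.4]

alias = a                 # same object under another name
shallow = a[:]            # new outer list, same inner objects
deep = copy.deepcopy(a)   # new outer list and new inner list

a[1][0] = 4               # mutate the inner list through a
a[0] = "bar"              # rebind a slot of the outer list

print(alias)    # ['bar', [4, 2, 3], 10.4]
print(shallow)  # ['foo', [4, 2, 3], 10.4]
print(deep)     # ['foo', [1, 2, 3], 10.4]
```

The shallow copy is insulated from changes to the outer list's slots but not from changes inside the shared inner list; only the deep copy is fully independent.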
  38. Summary

    1. Function calls have overhead
    2. Do you really need a list?
    3. Know how references and aliases work
    4. Know standard data types better (BTW sets are awesome!)
    5. Don't be afraid to diagram… may shed light
    6. Pythons are interesting under the covers (interning & is)
    7. Mutability is like the butler… they probably did it
    8. Knowledge is power
    9. Avoid entire class of bugs you'll never write… thank me later
    10. More in Core Python Programming amzn.com/0132269937
  39. THANK YOU!

    +Wesley Chun :: @wescpy
    corepython.com
    cyberwebconsulting.com
    Upcoming: "Exploring Google APIs with Python"
    Thu, 2016 Aug 25 in Mountain View, CA
    See baypiggies.net for Meetup link