Slide 1

Slide 1 text

No content

Slide 2

Slide 2 text

Migrating Pinterest from Python 2 to Python 3
Jordan Adler, Software Engineer
Joe Gordon, Site Reliability Engineer
© 2019 Pinterest. All rights reserved.

Slide 3

Slide 3 text


Slide 4

Slide 4 text

Our mission: to bring everyone the inspiration to create a life they love.

Slide 5

Slide 5 text


Slide 6

Slide 6 text

Agenda
1. Approach
2. The Good
3. The Bad
4. The Ugly
Q&A

Slide 7

Slide 7 text

Approach

Slide 8

Slide 8 text

Python at Pinterest
● Started with
● Python now used to serve over 250 million monthly active users
● Python’s speed and flexibility enable quick experimentation

Slide 9

Slide 9 text

Problem Statement: Large-Scale Python Codebase Migration
● Large: 2.6 million LOC (excluding dependencies)
● Aged: 10-year-old codebase with over 1,000 authors over its lifetime
● Dynamic: ~3,500 changes monthly from over 450 developers
● Multi-Stakeholder
● Tightly Coupled

Slide 10

Slide 10 text

Engineering Principles
● Start Simple
    ○ We strive to build the simplest viable solution.
    ○ We learn, iterate and scale quickly.
    ○ We value progress over perfection.
● Build for Impact
    ○ We look beyond our team’s goals for the highest impact opportunities.
    ○ We are metrics-informed, not metrics-driven.
    ○ We prioritize Pinners over technology in our decisions.
● Own It!
    ○ We are responsible for driving our work forward.
    ○ When there is ambiguity, we take initiative.
    ○ We take pride in improving our work, keeping it high-quality and performant.

Slide 11

Slide 11 text

Gradual Py3 Rollout
1. Make Py3 available
2. Upgrade requirements
3. Futurize codebase
4. Test under Py2 and Py3
5. Migrate production environments to Py3
6. Drop support for Py2
7. Add Py3-only features

Slide 12

Slide 12 text

Upgrade requirements
● Start at bottom of dependency graph
● caniusepython3 (checks trove version classifiers)
● Unmaintained dependencies
● >8 years of changes in some cases
● CI test of requirements.txt as backstop
● Environment markers: python_version < '3'

Slide 13

Slide 13 text

Modernize vs futurize

Slide 14

Slide 14 text

Python Future
http://python-future.org/overview.html

Slide 15

Slide 15 text

Fixers

Stage 1:
lib2to3.fixes.fix_apply
lib2to3.fixes.fix_except
lib2to3.fixes.fix_exitfunc
lib2to3.fixes.fix_funcattrs
lib2to3.fixes.fix_has_key
lib2to3.fixes.fix_idioms
lib2to3.fixes.fix_intern
lib2to3.fixes.fix_isinstance
lib2to3.fixes.fix_methodattrs
lib2to3.fixes.fix_ne
lib2to3.fixes.fix_numliterals
lib2to3.fixes.fix_paren
lib2to3.fixes.fix_reduce
lib2to3.fixes.fix_renames
lib2to3.fixes.fix_repr
lib2to3.fixes.fix_standarderror
lib2to3.fixes.fix_sys_exc
lib2to3.fixes.fix_throw
lib2to3.fixes.fix_tuple_params
lib2to3.fixes.fix_types
lib2to3.fixes.fix_ws_comma
lib2to3.fixes.fix_xreadlines
libfuturize.fixes.fix_absolute_import
libfuturize.fixes.fix_next_call
libfuturize.fixes.fix_print_with_import
libfuturize.fixes.fix_raise

Stage 2:
lib2to3.fixes.fix_basestring
lib2to3.fixes.fix_dict
lib2to3.fixes.fix_exec
lib2to3.fixes.fix_getcwdu
lib2to3.fixes.fix_input
lib2to3.fixes.fix_itertools
lib2to3.fixes.fix_itertools_imports
lib2to3.fixes.fix_filter
lib2to3.fixes.fix_long
lib2to3.fixes.fix_map
lib2to3.fixes.fix_nonzero
lib2to3.fixes.fix_operator
lib2to3.fixes.fix_raw_input
lib2to3.fixes.fix_zip
libfuturize.fixes.fix_cmp
libfuturize.fixes.fix_division
libfuturize.fixes.fix_execfile
libfuturize.fixes.fix_future_builtins
libfuturize.fixes.fix_future_standard_library
libfuturize.fixes.fix_future_standard_library_urllib
libfuturize.fixes.fix_metaclass
libpasteurize.fixes.fix_newstyle
libfuturize.fixes.fix_object
libfuturize.fixes.fix_unicode_keep_u
libfuturize.fixes.fix_xrange_with_import

Slide 16

Slide 16 text

Code Transformations

Slide 17

Slide 17 text

Futurize Codebase
● Linters and CI to prevent regressions
● Apply each fix individually
● Run unit tests under both Python 2 and 3
● Upgrading to Py3 is an exercise in discovering what code is tested

Slide 18

Slide 18 text

Dependency Graph
● Introspect internal dependency graph
    ○ Test-driven migration
● Monkey patch __import__()
● Build a list of modules that run under Py3
● Find test modules that have the fewest dependencies on Py2-only modules

Slide 19

Slide 19 text

Test under Py2 and Py3
● Test runner and base class use hundreds of modules
● Bootstrapping problem
● Add smaller test runner and base class

Slide 20

Slide 20 text

Fail Paths
● Detectable by Flake8
    ○ Syntax Errors
    ○ Scope
● Detectable at Import Time
    ○ Bad dependencies
    ○ Code evaluated on import
● Detected at Runtime
    ○ Via unit tests
    ○ In production
    ○ Everything else
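The import-time class of failures can be probed with a simple smoke test. This is a minimal sketch, not the tooling described in the talk; find_import_failures and the module name no_such_module_py3_demo are made up for illustration.

```python
import importlib


def find_import_failures(module_names):
    """Try to import each module; return {name: exception class name} for failures."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:  # SyntaxError, ImportError, or code that fails on import
            failures[name] = type(exc).__name__
    return failures


# "json" imports cleanly; the second name is deliberately nonexistent.
failures = find_import_failures(["json", "no_such_module_py3_demo"])
print(failures)
```

Modules that import cleanly under Py3 are candidates for the "runs under Py3" list; the rest point at bad dependencies or code evaluated on import.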

Slide 21

Slide 21 text

Potential Complications
● Large and complex code base
● test coverage
● Business logic is complex (int/float, str/bytes)
● Limitations in code transformation
    ○ old_div_safe
    ○ fix_dict
    ○ str/bytes

Slide 22

Slide 22 text

The Good

Slide 23

Slide 23 text

The Good: Futurize & lib2to3

Slide 24

Slide 24 text

The Good: libfuturize.fixes.fix_print_with_import

Slide 25

Slide 25 text

The Good: lib2to3.fixes.fix_except
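For reference, fix_except rewrites the Py2-only "except E, e:" spelling to "except E as e:", which is valid on Python 2.6+ and 3. The function below is a hypothetical illustration of the rewritten form, not code from the talk.

```python
def parse_port(raw):
    """Return raw as an int, or None if it is not a valid integer."""
    try:
        return int(raw)
    # Py2-only spelling, a SyntaxError on Py3:
    #     except ValueError, err:
    except ValueError as err:
        print("bad port: %s" % err)
        return None
```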

Slide 26

Slide 26 text

The Good: libfuturize.fixes.fix_metaclass

Slide 27

Slide 27 text

The Good: libfuturize.fixes.fix_absolute_import

Slide 28

Slide 28 text

The Bad

Slide 29

Slide 29 text

Code: Numbers
    1 / 2
Py2: 0
Py3: 0.5
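A short sketch of how to make the intent explicit on Py3: use // where the Py2 integer result was wanted and / where a true quotient is wanted. The variable names are illustrative only.

```python
half_true = 1 / 2      # true division: always a float on Py3
half_floor = 1 // 2    # floor division: same integer result on Py2 and Py3

# Floor division rounds toward negative infinity, not toward zero.
negative_floor = -1 // 2

print(half_true, half_floor, negative_floor)
```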

Slide 30

Slide 30 text

Code: Numbers
    round(9, 1)
Py2: 9.0
Py3: 9

Slide 31

Slide 31 text

Code: Numbers
    round(2.5)
Py2: 3.0
Py3: 2
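Py3 rounds ties to the nearest even number, while Py2 rounded ties away from zero. Where business logic depends on the old behaviour, one option is the decimal module. This is a sketch; round_half_up is a hypothetical helper name, and going through str() assumes the float's short repr is the value you mean.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_half_up(value):
    """Round to the nearest integer with ties going away from zero (Py2 style)."""
    quantized = Decimal(str(value)).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    return float(quantized)


print(round(2.5), round_half_up(2.5))
```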

Slide 32

Slide 32 text

Code: Numbers
    x = None
    x > 1
Py2: False
Py3: TypeError: '>' not supported between instances of 'NoneType' and 'int'
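Code that leaned on Py2's "None sorts below everything" ordering needs an explicit None check on Py3. A minimal sketch follows; exceeds is a hypothetical helper, not something from the talk.

```python
def exceeds(value, threshold):
    """Return False for None, mirroring what `None > 1` evaluated to on Py2."""
    return value is not None and value > threshold


print(exceeds(None, 1), exceeds(2, 1))
```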

Slide 33

Slide 33 text

Code: Numbers
    month = 01
    print(month)
Py2: 1
Py3: SyntaxError: invalid token

Slide 34

Slide 34 text

PEP 3141: A Type Hierarchy for Numbers
Numeric Tower of ABCs: Number, Complex, Real, Rational
Numeric Primitives: Int, Bool, Float, Complex

Slide 35

Slide 35 text

Code: Bytes
    user_id_int = 2345434565434565434
    user_id_string = bytes(user_id_int)
Py2: '2345434565434565434'
Py3: Python(36431,0x7fff894fc380) malloc: ...
     MemoryError
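On Py3, bytes(n) with an integer argument allocates n zero bytes, which is why a large ID exhausts memory. A sketch of the explicit conversion, reusing the ID from the slide (the other variable names are illustrative):

```python
user_id_int = 2345434565434565434

# bytes(int) means "this many zero bytes" on Py3.
three_zero_bytes = bytes(3)

# Spell out the text conversion and the encoding instead.
user_id_bytes = str(user_id_int).encode("ascii")

print(three_zero_bytes, user_id_bytes)
```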

Slide 36

Slide 36 text

Code: Bytes
    for i in b'123':
        print(i)
Py2: 1 2 3
Py3: 49 50 51
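Iterating or indexing bytes on Py3 yields integers. If one-byte pieces or characters are wanted, slicing or decoding makes that explicit. A small sketch with illustrative names:

```python
data = b"123"

as_ints = list(data)                                   # Py3 iteration yields ints
as_bytes = [data[i:i + 1] for i in range(len(data))]   # slices stay bytes
as_text = list(data.decode("ascii"))                   # decode first for characters

print(as_ints, as_bytes, as_text)
```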

Slide 37

Slide 37 text

Code: Strings
    import string
    print(string.ascii_letters)
    print(string.letters)
Py2:
    abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ
    abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ
Py3:
    abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ
    AttributeError: module 'string' has no attribute 'letters'

Slide 38

Slide 38 text

Code: Bytes/Strings
    # -*- encoding: utf-8 -*-
    try:
        raise Exception(u' ')
    except Exception as e:
        print(e)
Py2: UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)
Py3: (prints the message text)

Slide 39

Slide 39 text

Sequence Sanity
Container: list, tuple, unicode, str
Contains: any, unicode, bytes
Container: list, tuple, str, bytes
Contains: any, encoded bytes, bytes

Slide 40

Slide 40 text

Code: Scopes
    [i for i in range(10)]
    print(i)
Py2: 9
Py3: NameError: name 'i' is not defined

Slide 41

Slide 41 text

Code: Scopes
    class Foo(object):
        A = [1]
        B = [i for i in [1, 2] if i not in A]
    print(B)
Py2: [2]
Py3: NameError: name 'A' is not defined
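One possible workaround for the class-body case is to pass the class attribute into a function, since arguments are evaluated in class scope. This is a sketch; _without is a hypothetical helper, and B is read as Foo.B here.

```python
def _without(values, banned):
    """Return the values that do not appear in banned."""
    return [value for value in values if value not in banned]


class Foo(object):
    A = [1]
    # A is evaluated here, in class scope, and handed to the helper.
    # The Py2-style spelling raises NameError on Py3:
    #     B = [i for i in [1, 2] if i not in A]
    B = _without([1, 2], A)


print(Foo.B)
```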

Slide 42

Slide 42 text

“This is because list comprehensions are now implemented with their own function object like generator expressions have always been.”
https://bugs.python.org/issue21161

Slide 43

Slide 43 text

List Comprehensions: Scope

Slide 44

Slide 44 text

Code: Scopes
    class Foo(object):
        def __init__(self):
            print(list(locals().keys()))
            super(Foo, self).__init__()
    Foo()
Py2: ['self']
Py3: ['self', '__class__']

Slide 45

Slide 45 text

Code: Dictionaries
    def foo(**kwargs):
        print(kwargs)
    foo(b=1, a=2)
Py2: {'a': 2, 'b': 1}
Py3: {'b': 1, 'a': 2} or {'a': 2, 'b': 1}
Py3.6+: {'b': 1, 'a': 2}

Slide 46

Slide 46 text

“Hash randomization is intended to provide protection against a denial-of-service … O(n^2) complexity. See http://www.ocert.org/advisories/ocert-2011-003.html for details.”
https://docs.python.org/3.3/using/cmdline.html#cmdoption-R

Slide 47

Slide 47 text

“The order-preserving aspect of this new implementation is considered an implementation detail and should not be relied upon”
https://docs.python.org/3/whatsnew/3.6.html#whatsnew36-compactdict
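When output must be stable across interpreters (logs, cache keys, test expectations), sorting explicitly sidesteps dict ordering altogether. A minimal sketch; describe is a hypothetical function name.

```python
def describe(**kwargs):
    """Render keyword arguments in sorted-key order, independent of dict ordering."""
    return ", ".join("%s=%r" % item for item in sorted(kwargs.items()))


print(describe(b=1, a=2))
```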

Slide 48

Slide 48 text

Code: Dictionaries
    mydict = {1: 'z', 2: 'a'}
    print(mydict.keys() + [3])
Py2: [1, 2, 3]
Py3: TypeError: unsupported operand type(s) for +: 'dict_keys' and 'list'
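On Py3, keys() returns a view rather than a list. Wrapping it in list() restores list behaviour, and the view itself supports set operations. A short sketch with illustrative variable names:

```python
mydict = {1: "z", 2: "a"}

# Materialize the keys before concatenating with a list.
keys_plus = sorted(list(mydict) + [3])

# Key views behave like sets, so intersections need no conversion.
shared = mydict.keys() & {2, 3}

print(keys_plus, shared)
```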

Slide 49

Slide 49 text

Code: Exceptions
    import sys
    try:
        raise Exception()
    except Exception:
        print(sys.exc_info()[0])
    print(sys.exc_info()[0])
Py2:
    <type 'exceptions.Exception'>
    <type 'exceptions.Exception'>
Py3:
    <class 'Exception'>
    None
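On Py3 the exception state is cleared once the except block ends, so anything needed later has to be captured inside the block. A sketch of that pattern; capture_exc_type is a hypothetical name.

```python
import sys


def capture_exc_type():
    """Return (type seen inside the handler, type seen after the handler)."""
    try:
        raise Exception()
    except Exception:
        inside = sys.exc_info()[0]  # still set while the handler runs
    after = sys.exc_info()[0]       # cleared on Py3 once the handler exits
    return inside, after


print(capture_exc_type())
```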

Slide 50

Slide 50 text

Exceptions

Slide 51

Slide 51 text

Code: Exceptions
    from exceptions import KeyError
Py2: (imports successfully)
Py3: ModuleNotFoundError: No module named 'exceptions'

Slide 52

Slide 52 text

The Ugly: Gotchas

Slide 53

Slide 53 text

Code: Hashability
    class MyString(str):
        def __eq__(self, other):
            return super(MyString, self).__eq__(other)
    print(hash(MyString('pycon')))
Py2: 5778351363512243486
Py3: TypeError: unhashable type: 'MyString'
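On Py3, a class that defines __eq__ without __hash__ gets __hash__ set to None. Restoring the parent's hash explicitly is one fix, sketched below; BrokenString is a hypothetical name for the slide's original class.

```python
class BrokenString(str):
    """Defines __eq__ only, so Py3 sets __hash__ to None."""

    def __eq__(self, other):
        return super(BrokenString, self).__eq__(other)


class MyString(str):
    """Same __eq__, but explicitly keeps str's hash."""

    def __eq__(self, other):
        return super(MyString, self).__eq__(other)

    __hash__ = str.__hash__


print(hash(MyString("pycon")) == hash("pycon"))
```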

Slide 54

Slide 54 text

Code: Unicode
    # -*- encoding: utf-8 -*-
    s = u"a 💩"
    for char in s:
        print(hex(ord(char)))
Py2: 0x61 0x20 0xd83d 0xdca9
Py2 Wide: 0x61 0x20 0x1f4a9
Py3: 0x61 0x20 0x1f4a9
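The narrow-build Py2 output is the UTF-16 surrogate pair for the same character. The sketch below assumes the string is "a", a space, and U+1F4A9, as the printed code points indicate, and reproduces both views on Py3.

```python
s = u"a \U0001F4A9"

# Py3 (and wide Py2 builds) iterate by code point.
code_points = [hex(ord(ch)) for ch in s]

# Narrow Py2 builds iterated by UTF-16 code unit; encoding shows the same units.
utf16 = s.encode("utf-16-be")
code_units = [
    hex(int.from_bytes(utf16[i:i + 2], "big")) for i in range(0, len(utf16), 2)
]

print(code_points, code_units)
```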

Slide 55

Slide 55 text

Code: StringIO
    from cStringIO import StringIO
    StringIO(b'imagedata')
    StringIO(u'text')
Py2: (both calls succeed)
Py3: ModuleNotFoundError: No module named 'cStringIO'

Slide 56

Slide 56 text

Code: StringIO
    from io import StringIO
    StringIO(b'sdf')
    StringIO(u'sdf')
Py2: TypeError: initial_value must be unicode or None, not str
Py3: TypeError: initial_value must be str or None, not bytes
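The io module separates the two cases: BytesIO for binary data and StringIO for text. A minimal sketch using the values from the slides (the variable names are illustrative):

```python
import io

# Binary payloads such as image data go in BytesIO.
image_buffer = io.BytesIO(b"imagedata")

# Text goes in StringIO.
text_buffer = io.StringIO(u"text")

print(image_buffer.read(), text_buffer.read())
```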

Slide 57

Slide 57 text

Code: Numbers - JavaScript Edition!
    > 2**55+10
    > 2**55+9
Output:
    Welcome to Node.js v12.1.0.
    > 2**55+10
    36028797018963976
    > 2**55+9
    36028797018963976
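The same effect can be reproduced in Python by forcing the values through a double, which is what a JavaScript number is. Python integers stay exact, so large IDs only lose precision when they cross into a float or a JSON consumer that uses doubles. The variable names are illustrative.

```python
exact_ten = 2**55 + 10    # Python ints are arbitrary precision
exact_nine = 2**55 + 9

# Doubles near 2**55 are spaced 8 apart, so both values collapse together.
as_double_ten = int(float(exact_ten))
as_double_nine = int(float(exact_nine))

print(exact_ten, exact_nine, as_double_ten, as_double_nine)
```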

Slide 58

Slide 58 text

Code: Mock
    import mock
    with mock.patch('__builtin__.open'):
        with open('notafile'):
            pass
Py2: (runs without error)
Py3:
    Traceback (most recent call last):
    ...
    ModuleNotFoundError: No module named '__builtin__'
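On Py3 the module is named builtins, so the patch target becomes 'builtins.open'. The sketch below uses the standard library's unittest.mock rather than the third-party mock package shown on the slide, and mock_open with read_data is added only to make the example observable.

```python
from unittest import mock

# Patch the Py3 name of the builtins module.
with mock.patch("builtins.open", mock.mock_open(read_data="hello")) as fake_open:
    with open("notafile") as handle:
        contents = handle.read()

print(contents)
```

Code that must run on both versions during a migration would need to pick the target name per interpreter.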

Slide 59

Slide 59 text
