Developer naïveté
A few rookie mistakes:
● “Packaging is just the last step”
● “We’ll just fudge something at the end and take the B”
● “If I can build it, someone will ship it”
The Rule
“The first 90% of the code accounts for the first 90% of development time. The last 10% of the code accounts for the other 90% of development time.”
Tom Cargill, Bell Labs (probably talking about packaging)
Standalone modules
The smallest unit of Python
● A .py file is a module
● Standalone: only imports from the standard library
● Examples: schema, ashes, boltons, bottle.py
● Targets Python: easy to distribute and integrate
  ○ “vendoring”
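One way to make "standalone" concrete is to scan a module's imports and flag anything outside the standard library. The sketch below is illustrative only: the helper names are made up for this example, it relies on `sys.stdlib_module_names` (Python 3.10+), and the small fallback set for older interpreters is an assumption, not a complete list.

```python
import ast
import sys

# sys.stdlib_module_names exists on Python 3.10+. The fallback set is a
# deliberately tiny, illustrative assumption for older interpreters.
STDLIB_NAMES = getattr(
    sys, "stdlib_module_names", frozenset({"os", "sys", "json", "re"})
)


def third_party_imports(source):
    """Return sorted top-level module names imported from outside the stdlib."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                found.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 skips relative imports, which stay inside the project.
            found.add(node.module.split(".")[0])
    return sorted(name for name in found if name not in STDLIB_NAMES)


def is_standalone(source):
    """A module is standalone here if it imports only from the stdlib."""
    return not third_party_imports(source)


standalone_src = "import os\nfrom json import dumps\n"
dependent_src = "import os\nimport requests\n"

print(is_standalone(standalone_src))
print(third_party_imports(dependent_src))
```

A module that passes this kind of check can be vendored by copying the single file into another project.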
Pure-Python package
The molecule to the module’s atom
● A directory full of .py files is a “package”
● Generally includes an __init__.py
● Examples: Django, requests, hyperlink, face
● Easy to install with pip
  ○ pip installs packages, after all, right?
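To illustrate the directory-plus-`__init__.py` structure, here is a minimal sketch that writes a throwaway package to a temporary directory and imports it. The package name `demo_pkg` and its contents are hypothetical, chosen only for the example.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_demo_package(root):
    """Create a two-file package: an __init__.py plus one submodule."""
    pkg = Path(root) / "demo_pkg"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("from .greet import hello\n")
    (pkg / "greet.py").write_text(
        "def hello():\n    return 'hello from demo_pkg'\n"
    )
    return pkg


with tempfile.TemporaryDirectory() as root:
    make_demo_package(root)
    sys.path.insert(0, root)  # make the parent directory importable
    importlib.invalidate_caches()
    try:
        demo_pkg = importlib.import_module("demo_pkg")
        greeting = demo_pkg.hello()
        # Packages, unlike plain modules, carry a __path__ attribute.
        is_package = hasattr(demo_pkg, "__path__")
    finally:
        sys.path.remove(root)

print(greeting, is_package)
```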
Pardon my dist
Packages are not proper packages.
● Proper packages need to be single redistributable archives
● A distribution is an archive of zero or more packages
● Motivational case studies: PIL & Pillow. PyCrypto(dome).
● Built by setuptools through setup.py
  ○ Example: sdist
  ○ Simple, source-only .tar.gz
The full package
Python is more than just .py files.
● Python’s interoperability & performance
● Pillow, gevent, lxml
● wheel
  ○ The shiny new binary distribution, or bdist
  ○ Supports most Windows, Mac, & Linux
  ○ No compiler necessary
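A wheel's filename records which interpreters and platforms it targets, following the `{distribution}-{version}(-{build})?-{python}-{abi}-{platform}.whl` convention from the wheel specification. The sketch below splits a filename into those tags. The helper names are hypothetical and the two filenames are illustrative examples, not a claim about what any index currently hosts.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename):
    """Split a wheel filename into its tags, dropping the optional build tag."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: %r" % filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 6:
        del parts[2]  # optional build tag sits right after the version
    if len(parts) != 5:
        raise ValueError("unexpected wheel filename: %r" % filename)
    return WheelTags(*parts)


def is_pure_python(tags):
    """Pure-Python wheels declare no ABI and no platform restriction."""
    return tags.abi_tag == "none" and tags.platform_tag == "any"


pure = parse_wheel_filename("requests-2.18.4-py2.py3-none-any.whl")
binary = parse_wheel_filename("lxml-4.1.1-cp27-cp27mu-manylinux1_x86_64.whl")

print(pure.distribution, is_pure_python(pure))
print(binary.distribution, binary.platform_tag, is_pure_python(binary))
```

The platform-specific wheel is what lets an extension-heavy package install without a compiler on the target machine.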
System libraries?
The world is not written in Python (yet).
● Prebuilt standard libraries managed through the OS
  ○ .dll and .so
  ○ libcrypto (OpenSSL), libxml2, libpng, etc.
● Static vs dynamic linking
  ○ Big wheels
● conda!
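Whether a dynamically linked extension works depends on what the host OS provides. As a rough sketch, the standard library's `ctypes.util.find_library` can report which of the shared libraries named above are discoverable on the current machine. The helper name is hypothetical, and results vary by system.

```python
import ctypes.util


def locate_system_libraries(names):
    """Map each short library name to what the OS loader would find, or None."""
    return {name: ctypes.util.find_library(name) for name in names}


# Short names for libcrypto (OpenSSL), libxml2, and libpng.
found = locate_system_libraries(["crypto", "xml2", "png"])

for name, location in sorted(found.items()):
    print(name, "->", location or "not found")
```

A `None` result here is the kind of gap that static linking (bigger wheels) or conda-managed libraries are meant to close.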
There are two kinds of packages
● Libraries: What developers work on and with day in and day out. Rarely a product (maybe SDKs).
● Applications: Services and other products with high-level, non-code interfaces. Most products.
Don’t pip install prod
Especially not from PyPI
● pip requires developer attention to debug and resolve
● No dependency resolution or transactional installs
● Even dev tools work better with pipsi
  ○ Every application deserves its own env
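The "own env per application" idea can be sketched with the standard library's `venv` module, which is roughly the isolation pipsi provides for each installed tool. The function and application names below are hypothetical, and `with_pip=False` is used only to keep the example fast and free of network access.

```python
import tempfile
import venv
from pathlib import Path


def create_app_env(base_dir, app_name):
    """Create an isolated environment dedicated to a single application."""
    env_dir = Path(base_dir) / app_name
    venv.create(str(env_dir), with_pip=False)
    return env_dir


with tempfile.TemporaryDirectory() as base:
    env_dir = create_app_env(base, "demo-app")
    # Every venv is marked by a pyvenv.cfg file at its root.
    has_config = (env_dir / "pyvenv.cfg").is_file()

print(has_config)
```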
The PayPal story
Wonderfully, spontaneously Python.
● Started in 2009, grew to a team in 2011
● 30+ midtier apps, services, and batch jobs
  ○ Max single service volume: 1.2 billion reqs/day (2016)
  ○ Max single service throughput: 10,000 reqs/sec/worker (2016)
● Multiprotocol, service-focused, gevent-based framework
● Hundreds of users, almost all grassroots
http://github.com/paypal/support
More environments than any other stack.
(8 environments x 10 binary libraries) / 5 team members = ∞ static builds.

PayPal Python Environment Support Matrix (2014)
Operating System   Architecture    Python Version
Linux              32-bit/64-bit   2.6/2.7
Mac                64-bit          2.7
Solaris            32-bit          2.7
Windows            32-bit          2.7
80% environment coverage x 500+ packages = ∞% better

Anaconda Environment Support Matrix (2015)
Operating System   Architecture    Python Version
Linux              32-bit/64-bit   2.7-3.x
Mac                64-bit          2.7-3.x
Windows            32-bit/64-bit   2.7-3.x
Anaconda in PayPal LIVE
Production-first development in practice.
● PayPal was on RHEL5
  ○ Python 2.4
  ○ Plus 2.6-2.7 (sort of)
● No Anaconda. Couldn’t just target conda install
● We could bring our own by putting Miniconda in an RPM...
3 steps to a conda RPM
One way to box a snake:
1. With requirements in hand, conda install --download-only
2. Miniconda + .tar.bz2 archives into the RPM
3. Run a tiny installer script in the RPM postinstall section
Ready to test and deploy!
https://www.paypal-engineering.com/2016/09/07/python-packaging-at-paypal/
https://github.com/paypal/support/blob/master/examples/miniconda
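The three steps can be sketched as the commands a build script and a postinstall script might run. This is not the script from the linked repository: the function names, file paths, and installer filename are hypothetical, and the sketch assumes the Miniconda installer's batch flags (`-b`, `-p`) and conda's `--offline` install from local `.tar.bz2` archives. The commands are only constructed and printed, never executed.

```python
import shlex


def download_command(requirements_file):
    """Step 1: fetch package archives into conda's cache without installing."""
    return [
        "conda", "install", "--yes", "--download-only",
        "--file", requirements_file,
    ]


def postinstall_commands(installer, prefix, archives):
    """Step 3: unpack Miniconda, then install the bundled archives offline."""
    conda = prefix + "/bin/conda"
    return [
        ["bash", installer, "-b", "-p", prefix],  # batch mode, fixed prefix
        [conda, "install", "--yes", "--offline"] + list(archives),
    ]


# Hypothetical paths standing in for the files shipped inside the RPM (step 2).
commands = [download_command("requirements.txt")] + postinstall_commands(
    "/opt/app/Miniconda.sh",
    "/opt/app/conda",
    ["/opt/app/pkgs/a.tar.bz2"],
)

for cmd in commands:
    print(" ".join(shlex.quote(part) for part in cmd))
```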
Anaconda internals
A whole new ecosystem.
● Built on OS and Python features
● Userspace Filesystem layout
● Python landmarking
● Paths and linking
● PatchELF
Figure: Anaconda’s internal layout (lib, include, bin, etc.)
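At startup, CPython works out its installation prefix by searching for a landmark file (`os.py`) in its standard library directory, which is part of what lets a self-contained prefix be relocated. The sketch below, with a hypothetical helper name, prints the layout the running interpreter resolved. It works on any Python install, not only Anaconda, and the paths it reports differ per machine.

```python
import os
import sys
import sysconfig


def describe_layout():
    """Report where the running interpreter keeps its main directories."""
    paths = sysconfig.get_paths()
    return {
        "prefix": sys.prefix,
        "stdlib": paths["stdlib"],
        "site_packages": paths["purelib"],
        "scripts": paths["scripts"],
        "include": paths["include"],
        # os.py is the landmark CPython looks for when locating its prefix.
        "landmark_found": os.path.isfile(os.path.join(paths["stdlib"], "os.py")),
    }


layout = describe_layout()
for key in sorted(layout):
    print(key, "=", layout[key])
```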
Userspace images
Anaconda meets old autorun CDs
● Your userspace in its own partition!
● Literally v1 was ISO9660
● E.g., AppImage / kdenlive
Even trendier than selfies!
https://github.com/AppImage/AppImageKit
https://github.com/appimage-packages/kdenlive
The shopkick story
Legacy environment and codebase.
● 150k commits and 12+ services
● CentOS 6 + Python 2.6 + LXC
● 100% Mac local dev (iOS)
One mission: Upgrade stack from 2009 to 2017.
Local production
Closing the rift.
● Production-first: Better to target Linux than macOS
● Docker’s native macOS xhyve virtualization
  ○ Docker Machine (virtualbox)
  ○ Minikube
● Avoid lethal exposure of Docker
● Account for new tools: GitLab & DCOS (later k8s)
OpenSky
Big, blue, and cloud-ready.
● Turn a legacy service’s spec into images for local, stage, and prod
● Library dependencies from: yum, conda, pip, and sky (git repo)
● Service deps specified in terms of docker images
● Plugins + core shipped as PEX to local developers
● Open-sourced just for you: https://github.com/shopkick/sky
docker + conda
Conda just works, despite docker headaches
● Bring your own Miniconda
● Custom installer script that calls conda
  ○ docker build instrumentation severely lacking. Layer proliferation.
● Tips:
  ○ Run after yum, before pip
  ○ Size is always an issue: nomkl unless you really need the speed, conda clean.
  ○ --no-channel-priority useful for mixing and matching
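The ordering and size tips can be sketched as a small planner that emits install commands in yum, conda, pip order, adds `nomkl` unless MKL is requested, and cleans conda's cache afterwards. The function name and package names are hypothetical, this is not the installer script from the talk, and `--no-channel-priority` is left out because its availability depends on the conda version. The commands are only built and printed.

```python
def build_steps(yum_pkgs, conda_pkgs, pip_pkgs, use_mkl=False):
    """Return install commands in the order: yum, then conda, then pip."""
    steps = []
    if yum_pkgs:
        steps.append(["yum", "install", "-y"] + list(yum_pkgs))
    if conda_pkgs:
        pkgs = list(conda_pkgs)
        if not use_mkl:
            pkgs.append("nomkl")  # opt out of MKL builds to save space
        steps.append(["conda", "install", "--yes"] + pkgs)
        # Drop cached archives and index data to shrink the image layer.
        steps.append(["conda", "clean", "--all", "--yes"])
    if pip_pkgs:
        steps.append(["pip", "install", "--no-cache-dir"] + list(pip_pkgs))
    return steps


steps = build_steps(["libxml2-devel"], ["lxml"], ["requests"])
for step in steps:
    print(" ".join(step))
```

Running all conda commands from one script, rather than one Dockerfile instruction each, is one way to limit the layer proliferation mentioned above.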
Wrapping up
One neat package.
● Production-first development
● Leverage the ecosystem
● Conda in OS packages works
● Conda in containers works
● Simple is better than complex