Production-grade Packaging with Anaconda

Python's packaging vs. packaging Python. Challenges in shipping Python products, and how Anaconda fits in.

Mahmoud Hashemi

April 10, 2018

Transcript

  1. Developer naïveté
     A few rookie mistakes:
     • “Packaging is just the last step”
     • “We’ll just fudge something at the end and take the B”
     • “If I can build it, someone will ship it”
  2. The Rule
     “The first 90% of the code accounts for the first 90% of development time. The last 10% of the code accounts for the other 90% of development time.”
     - Tom Cargill, Bell Labs (probably talking about packaging)
  3. Standalone modules
     The smallest unit of Python
     • A .py file is a module
     • Standalone: only imports from the standard library
     • schema, ashes, boltons, bottle.py
     • Targets Python: easy to distribute and integrate
       ◦ “vendoring”
  4. Pure-Python package
     The molecule to the module’s atom
     • A directory full of .py files is a “package”
     • Generally includes an __init__.py
     • Django, requests, hyperlink, face
     • Easy to install with pip
       ◦ pip installs packages, after all, right?
  5. Pardon my dist
     Packages are not proper packages.
     • Proper packages need to be single redistributable archives
     • A distribution is an archive of zero or more packages
     • Motivational case studies: PIL & Pillow. PyCrypto(dome).
     • Built by setuptools through setup.py
       ◦ Example: sdist
       ◦ Simple, source-only .tar.gz
  6. The full package
     Python is more than just .py files.
     • Python’s interoperability & performance
     • Pillow, gevent, lxml
     • wheel
       ◦ The shiny new binary distribution, or bdist
       ◦ Supports most Windows, Mac, & Linux
       ◦ No compiler necessary
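A wheel's filename records which interpreter, ABI, and platform it was built for, following the wheel specification's `{distribution}-{version}(-{build})?-{python tag}-{abi tag}-{platform tag}.whl` pattern. Here is a small sketch of reading those tags; the names `WheelName`, `parse_wheel_filename`, and `is_pure_python` are illustrative, not part of any packaging tool.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelName:
    """Split a wheel filename into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        dist, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        dist, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected number of components in {filename!r}")
    return WheelName(dist, version, build, python_tag, abi_tag, platform_tag)


def is_pure_python(wheel: WheelName) -> bool:
    """Pure-Python wheels carry no ABI and run on any platform."""
    return wheel.abi_tag == "none" and wheel.platform_tag == "any"
```

For example, `requests-2.18.4-py2.py3-none-any.whl` parses as pure Python, while `lxml-4.2.1-cp36-cp36m-manylinux1_x86_64.whl` is tied to CPython 3.6 on 64-bit Linux, which is why no compiler is needed at install time.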
  7. python setup.py sdist bdist_wheel upload
     The modern way to build and upload a Python package
     sum()
  8. System libraries?
     The world is not written in Python (yet).
     • Prebuilt standard libraries managed through the OS
       ◦ .dll and .so
       ◦ libcrypto (OpenSSL), libxml2, libpng, etc.
     • Static vs dynamic linking
       ◦ Big wheels
     • conda!
  9. Basically...
     1. .py - standalone modules
     2. sdist - Pure-Python packages
     3. wheel - Python packages
     4. conda - Python + system libraries
     (With room to spare for static vs. dynamic linking)
     But wait...
  10. There are two kinds of packages
      • Libraries: What developers work on and with day in and day out. Rarely a product (maybe SDKs).
      • Applications: Services and other products with high-level, non-code interfaces. Most products.
  11. Don’t pip install prod
      Especially not from PyPI
      • pip requires developer attention to debug and resolve
      • No dependency resolution or transactional installs
      • Even dev tools work better with pipsi
        ◦ Every application deserves its own env
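The "every application deserves its own env" idea can be illustrated with the standard library's `venv` module. This is a sketch of the general approach rather than how pipsi itself is implemented; `create_app_env` and the directory layout are hypothetical.

```python
import venv
from pathlib import Path


def create_app_env(envs_root: Path, app_name: str) -> Path:
    """Create an isolated environment dedicated to a single application."""
    env_dir = envs_root / app_name
    # with_pip=False keeps the sketch fast and offline; a real tool would
    # bootstrap pip and then install the application into this environment.
    venv.create(env_dir, with_pip=False)
    return env_dir
```

Because each application gets its own directory, upgrading one tool's dependencies cannot break another tool installed next to it.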
  12. Shipping product
      Summarizing packaging for Python applications
      1. PEX - Python libraries included
      2. anaconda - Python ecosystem
      3. freezers - Python included
      4. images - system libraries included
      5. containers - sandboxed images
      6. virtual machines - kernel included
      7. hardware - plug and play appliances
  13. Shipping product
      Summarizing packaging for Python applications
      1. PEX - libraries included
      2. anaconda - Python ecosystem
      3. freezers - Python included
      4. images - system libraries included
      5. containers - sandboxed images
      6. virtual machines - kernel included
      7. hardware - plug and play appliances
      http://sedimental.org/talks.html for the rest.
  14. The PayPal story
      Wonderfully, spontaneously Python.
      • Started in 2009, grew to a team in 2011
      • 30+ midtier apps, services, and batch jobs
        ◦ Max single service volume: 1.2 billion reqs/day (2016)
        ◦ Max single service throughput: 10,000 reqs/sec/worker (2016)
      • Multiprotocol, service-focused, gevent-based framework
      • Hundreds of users, almost all grassroots
      http://github.com/paypal/support
  15. More environments than any other stack.
      (8 environments x 10 binary libraries) / 5 team members = ∞ static builds.
      PayPal Python Environment Support Matrix (2014)
      Operating System   Architecture    Python Version
      Linux              32-bit/64-bit   2.6/2.7
      Mac                64-bit          2.7
      Solaris            32-bit          2.7
      Windows            32-bit          2.7
  16. 80% environment coverage x 500+ packages = ∞% better
      Anaconda Environment Support Matrix (2015)
      Operating System   Architecture    Python Version
      Linux              32-bit/64-bit   2.7-3.x
      Mac                64-bit          2.7-3.x
      Windows            32-bit/64-bit   2.7-3.x
  17. Anaconda in PayPal LIVE
      Production-first development in practice.
      • PayPal was on RHEL5
        ◦ Python 2.4
        ◦ Plus 2.6-2.7 (sort of)
      • No Anaconda. Couldn’t just target conda install
      • We could bring our own by putting Miniconda in an RPM...
  18. One way to box a snake: 3 steps to a conda RPM
      1. With requirements in hand, conda install --download-only
      2. Miniconda + .tar.bz2 archives into the RPM
      3. Run a tiny installer script in the RPM postinstall section
      Ready to test and deploy!
      https://www.paypal-engineering.com/2016/09/07/python-packaging-at-paypal/
      https://github.com/paypal/support/blob/master/examples/miniconda
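Steps 1 and 3 above might look roughly like the following. This is a hedged sketch, not the installer from the linked PayPal example: the function names, the `/opt/myapp` paths, and the choice to pass archive paths directly to `conda install --offline` are all assumptions made for illustration.

```python
from pathlib import PurePosixPath
from typing import List, Sequence


def download_command(requirements: Sequence[str]) -> List[str]:
    """Step 1: fetch package archives into the package cache without installing."""
    return ["conda", "install", "--yes", "--download-only", *requirements]


def postinstall_command(prefix: PurePosixPath,
                        archives: Sequence[PurePosixPath]) -> List[str]:
    """Step 3: install the bundled archives with the Miniconda shipped in the RPM.

    --offline keeps conda from contacting any channel on the production host.
    """
    conda = prefix / "bin" / "conda"
    # Sorting makes the generated command deterministic across builds.
    return [str(conda), "install", "--yes", "--offline",
            *[str(archive) for archive in sorted(archives)]]
```

A postinstall script could then hand the second list to `subprocess.check_call`. Note that installing from explicit archive paths bypasses dependency resolution, so the bundle would need to contain every dependency already.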
  19. The innovation never stops: 3+ ways to RPM conda
      1. https://github.com/ImmobilienScout24/snakepit
      2. https://github.com/pelson/conda-rpms
      3. https://github.com/jcrist/conda-pack
      RPM yourself a conda, today!
      (Also worth a look: https://github.com/conda/constructor )
  20. Anaconda internals
      A whole new ecosystem.
      • Built on OS and Python features
      • Userspace Filesystem layout
      • Python landmarking
      • Paths and linking
      • PatchELF
      Anaconda’s internal layout (lib, include, bin, etc)
  21. Userspace images
      Anaconda meets old autorun CDs. Even trendier than selfies!
      • Your userspace in its own partition!
      • Literally v1 was ISO9660
      • E.g., AppImage / kdenlive
      https://github.com/AppImage/AppImageKit
      https://github.com/appimage-packages/kdenlive
  22. “Containers”
      Reusable and disposable, but check the seal...
      • Userspace images
        ◦ + sandboxing
        ◦ + distribution
      • Flatpak & Snappy
        ◦ .deb vs. .rpm round 2
      • Docker / Moby
  23. The shopkick story
      Legacy environment and codebase.
      • 150k commits and 12+ services
      • CentOS 6 + Python 2.6 + LXC
      • 100% Mac local dev (iOS)
      One mission: Upgrade stack from 2009 to 2017.
  24. Local production
      Closing the rift.
      • Production-first: Better to target Linux than MacOS
      • Docker’s native MacOS xhyve virtualization
        ◦ Docker Machine (virtualbox)
        ◦ Minikube
      • Avoid lethal exposure of Docker
      • Account for new tools: GitLab & DCOS (later k8s)
  25. OpenSky
      Big, blue, and cloud-ready.
      • Specify a legacy service into images for local, stage, and prod
      • Library dependencies from: yum, conda, pip, and sky (git repo)
      • Service deps specified in terms of docker images
      • Plugins + core shipped as PEX to local developers
      • Open-sourced just for you: https://github.com/shopkick/sky
  26. docker + conda
      Conda just works, despite docker headaches
      • Bring your own Miniconda
      • Custom installer script that calls conda
        ◦ docker build instrumentation severely lacking. Layer proliferation.
      • Tips:
        ◦ Run after yum, before pip
        ◦ Size is always an issue: nomkl unless you really need the speed, conda clean.
        ◦ --no-channel-priority useful for mixing and matching
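The tips above suggest an ordering for a custom installer script: yum first, then conda (with nomkl and a cache clean to keep the image small), then pip. The sketch below only generates the command lines in that order; `installer_steps` and its parameters are hypothetical, and it assumes each package list is non-empty.

```python
import shlex
from typing import List, Sequence


def installer_steps(yum_pkgs: Sequence[str],
                    conda_pkgs: Sequence[str],
                    pip_pkgs: Sequence[str],
                    need_mkl: bool = False,
                    mix_channels: bool = False) -> List[str]:
    """Return shell commands in the order: yum, conda, conda clean, pip."""
    conda_cmd = ["conda", "install", "--yes"]
    if mix_channels:
        # Lets packages from several channels be mixed and matched.
        conda_cmd.append("--no-channel-priority")
    if not need_mkl:
        # Installing nomkl avoids pulling in the large MKL libraries.
        conda_cmd.append("nomkl")
    conda_cmd.extend(conda_pkgs)
    steps = [
        ["yum", "install", "-y", *yum_pkgs],
        conda_cmd,
        ["conda", "clean", "--all", "--yes"],  # drop cached archives to save space
        ["pip", "install", "--no-cache-dir", *pip_pkgs],
    ]
    return [shlex.join(step) for step in steps]
```

Running all of these from one script in a single build step, rather than as separate build instructions, is one way to limit the layer proliferation mentioned above.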
  27. Wrapping up
      One neat package.
      • Production-first development
      • Leverage the ecosystem
      • Conda in OS packages works
      • Conda in containers works
      • Simple is better than complex
  28. Thanks! Questions?
      Show’s over for real this time, slides & more: sedimental.org/talks.html
      @mhashemi