
Production-grade Packaging with Anaconda


Python's packaging vs. packaging Python. Challenges in shipping Python products, and how Anaconda fits in.


Mahmoud Hashemi

April 10, 2018

Transcript

  1. Production-grade Packaging with Anaconda. Mahmoud Hashemi, AnacondaCon 2018

  2. Packaging: turning code into a deployable archive. But how?

  3. “TBD” (Packaging section, my first design doc)

  4. Developer naïveté. A few rookie mistakes:

    • “Packaging is just the last step”
    • “We’ll just fudge something at the end and take the B”
    • “If I can build it, someone will ship it”
  5. The Rule

    “The first 90% of the code accounts for the first 90% of development time. The last 10% of the code accounts for the other 90% of development time.”
    Tom Cargill, Bell Labs (probably talking about packaging)
  6. Part 1: Production-first development

  7. Where’s the code going?

    Think distance and target environment before writing anything.
  8. Part 2: Python Packaging revue

    Best practices, I googled them for you.
  9. Standalone modules: the smallest unit of Python

    • A .py file is a module
    • Standalone: only imports from the standard library
    • schema, ashes, boltons, bottle.py
    • Targets Python: easy to distribute and integrate
      ◦ “vendoring”
  10. Pure-Python package: the molecule to the module’s atom

    • A directory full of .py files is a “package”
    • Generally includes an __init__.py
    • Django, requests, hyperlink, face
    • Easy to install with pip
      ◦ pip installs packages, after all, right?
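
The module and package definitions on the two slides above can be tried out with the standard library alone. The following is a minimal sketch; the names `hello_mod` and `hello_pkg` are made up for illustration and do not come from the talk.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_demo_tree(root: Path) -> None:
    """Write one standalone module and one pure-Python package under root."""
    # A single .py file is a module.
    (root / "hello_mod.py").write_text("GREETING = 'hello from a module'\n")
    # A directory of .py files, usually with an __init__.py, is a package.
    pkg = root / "hello_pkg"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("from .core import greet\n")
    (pkg / "core.py").write_text(
        "def greet():\n    return 'hello from a package'\n"
    )


def import_demo():
    """Build the tree in a temp dir, import both pieces, return their output."""
    with tempfile.TemporaryDirectory() as tmp:
        make_demo_tree(Path(tmp))
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            mod = importlib.import_module("hello_mod")
            pkg = importlib.import_module("hello_pkg")
            return mod.GREETING, pkg.greet()
        finally:
            sys.path.remove(tmp)
            # Drop the demo modules so repeated runs start clean.
            for name in ("hello_mod", "hello_pkg", "hello_pkg.core"):
                sys.modules.pop(name, None)


if __name__ == "__main__":
    print(import_demo())
```

Both pieces become importable simply by being on `sys.path`, which is why standalone modules are so easy to vendor.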
  11. Pardon my dist: packages are not proper packages.

    • Proper packages need to be single redistributable archives
    • A distribution is an archive of zero or more packages
    • Motivational case studies: PIL & Pillow. PyCrypto(dome).
    • Built by setuptools through setup.py
      ◦ Example: sdist
      ◦ Simple, source-only .tar.gz
  12. The full package: Python is more than just .py files.

    • Python’s interoperability & performance
    • Pillow, gevent, lxml
    • wheel
      ◦ The shiny new binary distribution, or bdist
      ◦ Supports most Windows, Mac, & Linux
      ◦ No compiler necessary
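
A wheel advertises which platforms it supports in its filename, which is how pip can pick a prebuilt binary without a compiler. The sketch below splits a filename into the components defined by the wheel format (PEP 427). The example filenames are illustrative and are not taken from the talk.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelName:
    """Split a wheel filename into its PEP 427 components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel: %r" % filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        dist, version, py, abi, plat = parts
        build = None
    elif len(parts) == 6:
        # The optional build tag sits between version and Python tag.
        dist, version, build, py, abi, plat = parts
    else:
        raise ValueError("unexpected wheel filename: %r" % filename)
    return WheelName(dist, version, build, py, abi, plat)


def is_pure_python(wheel: WheelName) -> bool:
    """A wheel tagged none-any carries no ABI- or platform-specific parts."""
    return wheel.abi_tag == "none" and wheel.platform_tag == "any"


if __name__ == "__main__":
    for name in (
        "requests-2.18.4-py2.py3-none-any.whl",         # illustrative
        "lxml-4.2.1-cp36-cp36m-manylinux1_x86_64.whl",  # illustrative
    ):
        wheel = parse_wheel_filename(name)
        print(wheel.distribution, wheel.platform_tag, is_pure_python(wheel))
```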
  13. sum(): python setup.py sdist bdist_wheel upload

    The modern way to build and upload a Python package.
  14. Questions? Thanks for coming!

    Show’s over, check out my projects: github.com/mahmoud
  15. Big questions!

    Packaging is never as easy as the docs make it out to be.
  16. System libraries? The world is not written in Python (yet).

    • Prebuilt standard libraries managed through the OS
      ◦ .dll and .so
      ◦ libcrypto (OpenSSL), libxml2, libpng, etc.
    • Static vs dynamic linking
      ◦ Big wheels
    • conda!
  17. Seems like... Python’s Packaging < Packaging Python

  18. Basically...

    1. .py - standalone modules
    2. sdist - Pure-Python packages
    3. wheel - Python packages
    4. conda - Python + system libraries
    (With room to spare for static vs. dynamic linking)
    But wait...
  19. Production: You can’t ship to production without a product!

  20. There are two kinds of packages

    • Libraries: What developers work on and with day in and day out. Rarely a product (maybe SDKs).
    • Applications: Services and other products with high-level, non-code interfaces. Most products.
  21. PyPI is not an app store

    Python is general-purpose, but pip and PyPI are not.
  22. Don’t pip install prod. Especially not from PyPI.

    • pip requires developer attention to debug and resolve
    • No dependency resolution or transactional installs
    • Even dev tools work better with pipsi
      ◦ Every application deserves its own env
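
The "every application deserves its own env" idea can be sketched with the standard library's `venv` module. This is an assumption made for illustration: pipsi itself is a separate tool and is not used here, and the application name `mytool` is made up.

```python
import tempfile
import venv
from pathlib import Path


def create_app_env(envs_root: Path, app_name: str) -> Path:
    """Create an isolated environment dedicated to one application."""
    env_dir = envs_root / app_name
    # with_pip=False keeps the sketch fast and offline; a real setup would
    # want an installer inside the env to put the application there.
    venv.EnvBuilder(with_pip=False).create(str(env_dir))
    return env_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env = create_app_env(Path(tmp), "mytool")  # hypothetical app name
        # pyvenv.cfg marks the directory as a self-contained environment.
        print((env / "pyvenv.cfg").exists())
```

Because each application lives under its own directory, upgrading or removing one cannot break the dependencies of another.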
  23. Shipping product: summarizing packaging for Python applications

    1. PEX - Python libraries included
    2. anaconda - Python ecosystem
    3. freezers - Python included
    4. images - system libraries included
    5. containers - sandboxed images
    6. virtual machines - kernel included
    7. hardware - plug and play appliances
  24. Shipping product: summarizing packaging for Python applications

    1. PEX - libraries included
    2. anaconda - Python ecosystem
    3. freezers - Python included
    4. images - system libraries included
    5. containers - sandboxed images
    6. virtual machines - kernel included
    7. hardware - plug and play appliances
    http://sedimental.org/talks.html for the rest.
  25. Part 3: Putting Anaconda into Production. Bigger stakes, bigger snakes.

  26. The PayPal story: Wonderfully, spontaneously Python.

    • Started in 2009, grew to a team in 2011
    • 30+ midtier apps, services, and batch jobs
      ◦ Max single service volume: 1.2 billion reqs/day (2016)
      ◦ Max single service throughput: 10,000 reqs/sec/worker (2016)
    • Multiprotocol, service-focused, gevent-based framework
    • Hundreds of users, almost all grassroots
    http://github.com/paypal/support
  27. More environments than any other stack.

    (8 environments x 10 binary libraries) / 5 team members = ∞ static builds.

    PayPal Python Environment Support Matrix (2014)

    Operating System | Architecture  | Python Version
    Linux            | 32-bit/64-bit | 2.6/2.7
    Mac              | 64-bit        | 2.7
    Solaris          | 32-bit        | 2.7
    Windows          | 32-bit        | 2.7
  28. If only there was some sort of... cross-platform package manager?

    They’re a very rare breed!
  29. Oh!

  30. 80% environment coverage x 500+ packages = ∞% better

    Anaconda Environment Support Matrix (2015)

    Operating System | Architecture  | Python Version
    Linux            | 32-bit/64-bit | 2.7-3.x
    Mac              | 64-bit        | 2.7-3.x
    Windows          | 32-bit/64-bit | 2.7-3.x
  31. Anaconda in PayPal LIVE: production-first development in practice.

    • PayPal was on RHEL5
      ◦ Python 2.4
      ◦ Plus 2.6-2.7 (sort of)
    • No Anaconda. Couldn’t just target conda install
    • We could bring our own by putting Miniconda in an RPM...
  32. 3 steps to a conda RPM. One way to box a snake:

    1. With requirements in hand, conda install --download-only
    2. Miniconda + .tar.bz2 archives into the RPM
    3. Run a tiny installer script in the RPM postinstall section
    Ready to test and deploy!
    https://www.paypal-engineering.com/2016/09/07/python-packaging-at-paypal/
    https://github.com/paypal/support/blob/master/examples/miniconda
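
Step 3 above, the "tiny installer script", might look roughly like the sketch below. This is not the script from the PayPal repository. It is a hypothetical outline that only builds the commands (a batch-mode Miniconda install, then an offline `conda install` of the bundled archives) so they can be inspected before being run. All paths and archive names are made up.

```python
from typing import List, Sequence


def postinstall_commands(
    installer: str, prefix: str, archives: Sequence[str]
) -> List[List[str]]:
    """Return the commands a postinstall step might run, in order."""
    conda = prefix.rstrip("/") + "/bin/conda"
    commands = [
        # The Miniconda installer shipped inside the RPM: -b runs it in
        # batch mode, -p selects the install prefix.
        ["bash", installer, "-b", "-p", prefix],
    ]
    if archives:
        # Install the pre-downloaded .tar.bz2 packages without network access.
        commands.append(
            [conda, "install", "--offline", "--yes", *sorted(archives)]
        )
    return commands


if __name__ == "__main__":
    # Hypothetical payload locations inside the installed RPM.
    for cmd in postinstall_commands(
        "/opt/myapp/payload/Miniconda.sh",
        "/opt/myapp/conda",
        ["/opt/myapp/payload/gevent.tar.bz2", "/opt/myapp/payload/lxml.tar.bz2"],
    ):
        print(" ".join(cmd))
```

A real postinstall would pass each command to `subprocess.check_call` and fail the install on a non-zero exit.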
  33. 3+ ways to RPM conda. The innovation never stops:

    1. https://github.com/ImmobilienScout24/snakepit
    2. https://github.com/pelson/conda-rpms
    3. https://github.com/jcrist/conda-pack
    RPM yourself a conda, today!
    (Also worth a look: https://github.com/conda/constructor )
  34. Part 4: Putting Anaconda into Containerized Envs

    RPMs and debs are so 90s.
  35. Anaconda internals: a whole new ecosystem.

    • Built on OS and Python features
    • Userspace Filesystem layout
    • Python landmarking
    • Paths and linking
    • PatchELF
    [Figure: Anaconda’s internal layout (lib, include, bin, etc)]
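
"Python landmarking" refers to how the interpreter locates its own prefix: on POSIX systems CPython searches upward from the executable for a landmark file, `lib/pythonX.Y/os.py`. The sketch below is a simplified illustration of that search against a fake directory tree. The real logic handles more cases (`PYTHONHOME`, `pyvenv.cfg`, build directories), and the function names here are made up.

```python
import tempfile
from pathlib import Path


def find_prefix_from_landmark(start: Path, landmark: Path) -> Path:
    """Walk upward from start until landmark exists beneath a directory."""
    for candidate in [start, *start.parents]:
        if (candidate / landmark).is_file():
            return candidate
    raise LookupError("landmark not found above %s" % start)


def build_fake_env(root: Path, version: str = "3.6") -> Path:
    """Create a bare bin/ + lib/pythonX.Y/os.py layout and return bin/."""
    bin_dir = root / "bin"
    lib_dir = root / "lib" / ("python" + version)
    bin_dir.mkdir(parents=True)
    lib_dir.mkdir(parents=True)
    (lib_dir / "os.py").write_text("# stand-in landmark file\n")
    return bin_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env_root = Path(tmp) / "env"
        bin_dir = build_fake_env(env_root)
        found = find_prefix_from_landmark(bin_dir, Path("lib/python3.6/os.py"))
        print(found == env_root)
```

Because the prefix is derived from where the executable sits, an environment laid out this way can live anywhere in userspace without system-wide registration.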
  36. Userspace images: Even trendier than selfies!

    Anaconda meets old autorun CDs
    • Your userspace in its own partition!
    • Literally v1 was ISO9660
    • E.g., AppImage / kdenlive
    https://github.com/AppImage/AppImageKit
    https://github.com/appimage-packages/kdenlive
  37. “Containers”: Reusable and disposable, but check the seal...

    • Userspace images
      ◦ + sandboxing
      ◦ + distribution
    • Flatpak & Snappy
      ◦ .deb vs. .rpm round 2
    • Docker / Moby
  38. The shopkick story: legacy environment and codebase.

    • 150k commits and 12+ services
    • CentOS 6 + Python 2.6 + LXC
    • 100% Mac local dev (iOS)
    One mission: Upgrade stack from 2009 to 2017.
  39. Local production: Closing the rift.

    • Production-first: Better to target Linux than MacOS
    • Docker’s native MacOS xhyve virtualization
      ◦ Docker Machine (virtualbox)
      ◦ Minikube
    • Avoid lethal exposure of Docker
    • Account for new tools: GitLab & DCOS (later k8s)
  40. OpenSky: Big, blue, and cloud-ready.

    • Specify a legacy service as images for local, stage, and prod
    • Library dependencies from: yum, conda, pip, and sky (git repo)
    • Service deps specified in terms of docker images
    • Plugins + core shipped as PEX to local developers
    • Open-sourced just for you: https://github.com/shopkick/sky
  41. docker + conda: Conda just works, despite docker headaches

    • Bring your own Miniconda
    • Custom installer script that calls conda
      ◦ docker build instrumentation severely lacking. Layer proliferation.
    • Tips:
      ◦ Run after yum, before pip
      ◦ Size is always an issue: nomkl unless you really need the speed, conda clean.
      ◦ --no-channel-priority useful for mixing and matching
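
The tips above suggest an ordering for a custom installer script: yum first, then conda, then pip, with a cleanup step to keep the image small. The following sketch only assembles that command sequence and is not the script used at shopkick. The package names are placeholders, and running everything from one script in a single build step is an assumption about how to limit layer proliferation.

```python
from typing import List, Sequence


def image_build_steps(
    yum_pkgs: Sequence[str],
    conda_pkgs: Sequence[str],
    pip_pkgs: Sequence[str],
) -> List[List[str]]:
    """Order the install commands: yum, then conda, then pip."""
    steps: List[List[str]] = []
    if yum_pkgs:
        steps.append(["yum", "install", "-y", *yum_pkgs])
    if conda_pkgs:
        # nomkl selects non-MKL builds, trading some speed for a much
        # smaller image; --no-channel-priority eases mixing channels.
        steps.append(
            ["conda", "install", "--yes", "--no-channel-priority",
             "nomkl", *conda_pkgs]
        )
        # Remove cached package archives so they do not bloat the layer.
        steps.append(["conda", "clean", "--all", "--yes"])
    if pip_pkgs:
        steps.append(["pip", "install", *pip_pkgs])
    return steps


if __name__ == "__main__":
    # Placeholder package names for illustration only.
    for step in image_build_steps(["gcc"], ["numpy"], ["boltons"]):
        print(" ".join(step))
```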
  42. Wrapping up: One neat package.

    • Production-first development
    • Leverage the ecosystem
    • Conda in OS packages works
    • Conda in containers works
    • Simple is better than complex
  43. Thanks! Questions?

    Show’s over for real this time, slides & more: sedimental.org/talks.html @mhashemi