script, by calling the interpreter directly with -c or -m • When Python is invoked with -c, the program is passed in as a string, and -c terminates the option list • python -c "import datetime; print(datetime.datetime.now())"
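A minimal sketch of what "terminates the option list" means in practice, assuming Python 3.7+ and using the current interpreter via sys.executable. The helper name run_inline is hypothetical and only for illustration.

```python
import subprocess
import sys


def run_inline(program, *args):
    """Run `program` through `python -c` and return its stdout as text."""
    result = subprocess.run(
        [sys.executable, "-c", program, *args],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Everything after the -c program ends up in sys.argv rather than being
# parsed as an interpreter option, so "-v" here does not enable verbose mode.
argv_seen = run_inline("import sys; print(sys.argv)", "-v", "extra")
print(argv_seen)  # ['-c', '-v', 'extra']
```

The first element of sys.argv is the literal string "-c", which is how a program can tell it was started this way.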
virtual machine • Bytecode compilation generates .pyc files • If the Python process has write access, .pyc files are stored on the filesystem – else in memory
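The bytecode compilation step can also be triggered by hand. The sketch below assumes Python 3 and uses the standard py_compile module; the file name hello.py is made up for the example.

```python
import os
import py_compile
import tempfile

workdir = tempfile.mkdtemp()
source_path = os.path.join(workdir, "hello.py")  # hypothetical module
with open(source_path, "w") as handle:
    handle.write("GREETING = 'hello'\n")

# py_compile.compile returns the path of the bytecode file it wrote.
# By default on Python 3 this is inside a __pycache__ directory next to
# the source file.
pyc_path = py_compile.compile(source_path, doraise=True)
print(pyc_path)
```

During a normal import the interpreter does this automatically, which is why no explicit call is usually needed.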
Python runtime = bytecode compiler + virtual machine • The Python virtual machine is a loop that steps through bytecode instructions and executes them one at a time • No build or make step is required to run a Python program
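The two halves of the runtime can be observed separately with the built-in compile function and the standard dis module. This is only an illustrative sketch; the exact instruction names differ between Python versions, so none are listed here as expected output.

```python
import dis

source = "total = 2 + 3"

# Step 1: the bytecode compiler turns source text into a code object.
code_object = compile(source, "<example>", "exec")

# Step 2: list the instructions the virtual machine's loop will step through.
opnames = [instr.opname for instr in dis.get_instructions(code_object)]
print(opnames)

# Step 3: hand the code object to the virtual machine for execution.
namespace = {}
exec(code_object, namespace)
print(namespace["total"])  # 5
```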
a module • In Python 2.7, a collection of modules under one directory with an __init__.py is considered a package • Even if there is no initialization code to run when the package is imported, an empty __init__.py file is still needed for the interpreter to find any modules or subpackages in that directory. • http://bit.ly/1hGmaAN
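As a small sketch of the rule above, the code below builds a throwaway package in a temporary directory and imports a module from it. The names shapes_demo and square are hypothetical.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
package_dir = os.path.join(root, "shapes_demo")  # hypothetical package name
os.makedirs(package_dir)

# An empty __init__.py is enough to mark the directory as a package.
open(os.path.join(package_dir, "__init__.py"), "w").close()

with open(os.path.join(package_dir, "square.py"), "w") as handle:
    handle.write("def area(side):\n    return side * side\n")

# Make the parent directory importable, then import the submodule.
sys.path.insert(0, root)
importlib.invalidate_caches()
square = importlib.import_module("shapes_demo.square")
print(square.area(3))  # 9
```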
path for module files • PYTHONHOME – change the location of the standard Python libraries – can be set to a single directory • Since Python 3.3 (PEP 420 namespace packages), any directory on sys.path whose name matches the package name being looked for is recognized as contributing modules and subpackages to that package, even without an __init__.py.
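The Python 3.3+ behaviour can be sketched by placing two directories with the same name, and no __init__.py, on sys.path. The names ns_demo, part_a and part_b are invented for this example.

```python
import importlib
import os
import sys
import tempfile


def make_portion(module_name, value):
    """Create <tmp>/ns_demo/<module_name>.py (no __init__.py); return <tmp>."""
    root = tempfile.mkdtemp()
    portion = os.path.join(root, "ns_demo")
    os.makedirs(portion)
    with open(os.path.join(portion, module_name + ".py"), "w") as handle:
        handle.write("VALUE = %r\n" % value)
    return root


first_root = make_portion("part_a", "a")
second_root = make_portion("part_b", "b")
sys.path[:0] = [first_root, second_root]
importlib.invalidate_caches()

# Both directories contribute modules to the single package "ns_demo".
part_a = importlib.import_module("ns_demo.part_a")
part_b = importlib.import_module("ns_demo.part_b")
portions = len(sys.modules["ns_demo"].__path__)
print(part_a.VALUE, part_b.VALUE, portions)  # a b 2
```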
under one directory with an __init__.py is considered a package • On Linux-based systems, software is treated as a collection of well-defined units called packages • From the OS perspective, a package is an archive file along with its dependencies, version number, name, vendor, checksum etc. • In Python, what an OS calls a package is called a distribution • package(distribution) -> build -> install -> execute
__init__.py is considered a package • Any directory with a __main__.py is treated as an executable • Typing python -m package will execute package/__main__.py if it exists
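Both ways of running a __main__.py can be tried with a throwaway package. This sketch assumes Python 3.7+ and default interpreter settings (the working directory is on the import path for -m); greeter_demo is a hypothetical name.

```python
import os
import subprocess
import sys
import tempfile

root = tempfile.mkdtemp()
package_dir = os.path.join(root, "greeter_demo")  # hypothetical package name
os.makedirs(package_dir)
open(os.path.join(package_dir, "__init__.py"), "w").close()
with open(os.path.join(package_dir, "__main__.py"), "w") as handle:
    handle.write("print('hello from __main__')\n")


def run(*interpreter_args):
    """Run the current interpreter from `root` and return its stdout."""
    result = subprocess.run(
        [sys.executable, *interpreter_args],
        cwd=root,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


as_module = run("-m", "greeter_demo")   # python -m greeter_demo
as_directory = run("greeter_demo")      # python greeter_demo
print(as_module)
print(as_directory)
```

Both invocations end up executing greeter_demo/__main__.py.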
a default import hook for Python >=2.4 • If a zipfile (in either source or compiled form) has an __init__.py file, it’s considered a package • If a zipfile has a __main__.py file, it’s considered an executable
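A zip archive with a top-level __main__.py can be handed straight to the interpreter. The sketch below assumes Python 3.7+; the archive name app_demo.zip and the module helper are made up for illustration.

```python
import os
import subprocess
import sys
import tempfile
import zipfile

archive_path = os.path.join(tempfile.mkdtemp(), "app_demo.zip")
with zipfile.ZipFile(archive_path, "w") as archive:
    # A helper module stored inside the archive.
    archive.writestr(
        "helper.py", "def message():\n    return 'hello from a zip'\n"
    )
    # The entry point: the archive itself is placed on sys.path when run,
    # so the sibling module can be imported.
    archive.writestr("__main__.py", "import helper\nprint(helper.message())\n")

result = subprocess.run(
    [sys.executable, archive_path],
    capture_output=True,
    text=True,
    check=True,
)
zip_output = result.stdout.strip()
print(zip_output)  # hello from a zip
```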
can make tarballs of Python code and knows how to invoke compilers • A setup.py file has a call to distutils’ main entry point – the setup function • setup.py file can build, distribute, publish or install … • … which some perceive as a flaw, since everything is bundled together in setup.py
doesn’t ship with Python • setuptools introduced easy_install, which has been replaced by pip • setuptools monkeypatches distutils • For setuptools to work with an existing setup.py, edit the target package's setup.py and add from setuptools import setup • Doing this replaces the existing import of the setup function
from PyPI • vastly better than easy_install • pip can do operations like list, upgrade etc. • pip ships with wheels support and can build wheels and cache them • Python 3.4 ships with pip
(and often conflicting) requirements, to coexist on the same computer, including a copy of: 1. the Python binary 2. the entire Python standard library 3. the pip installer 4. site-packages directory
• Run arbitrary code to build and install and recompile • Need to recompile code to create a new virtualenv • Ergo slow • Hard to maintain • Defined by and require distutils/setuptools
to be moved to the correct location on the target host • Doesn’t require a build step – can be installed directly on the host with an installer like pip • Python files do not have to be precompiled
build or install step is required, just put them on PYTHONPATH or sys.path and import them • Key principle of eggs is that they should be discoverable and importable.
eggs: 1. .egg format: a directory or zipfile containing the project’s code and resources, along with an EGG-INFO subdirectory that contains the project’s metadata 2. .egg-info format: a file or directory placed adjacent to the project’s code and resources, that directly contains the project’s metadata.
system (C compilers etc.) • No arbitrary code execution for installation – no setup.py • No arbitrary code execution == faster installation for pure Python and native C extension packages • Creates .pyc files during installation to match the Python interpreter used
• Less dependent on system Python so long as it doesn’t ship with extensions that link to libpython • With manylinux, possible to distribute wheels for Linux platforms*
create a universal wheel • To create a universal wheel, create a setup.cfg file with a [bdist_wheel] section containing universal=1 • Don’t publish a universal wheel for a project with C extensions, as pip will prefer the wheel over the source distribution
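Whether a wheel is universal is visible in its file name, which under PEP 427 has the form name-version(-build)-python-abi-platform.whl. The sketch below parses that convention by hand; the function names and the example file names are hypothetical, and a real tool would more likely rely on a packaging library.

```python
def parse_wheel_filename(filename):
    """Split a wheel file name into its PEP 427 components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel: %r" % filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build_tag = None
    elif len(parts) == 6:
        name, version, build_tag, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected wheel file name: %r" % filename)
    return {
        "name": name,
        "version": version,
        "build": build_tag,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def is_universal(filename):
    """True for pure-Python wheels tagged for both Python 2 and Python 3."""
    tags = parse_wheel_filename(filename)
    python_tags = set(tags["python"].split("."))
    return (
        {"py2", "py3"} <= python_tags
        and tags["abi"] == "none"
        and tags["platform"] == "any"
    )


print(is_universal("example_pkg-1.0-py2.py3-none-any.whl"))             # True
print(is_universal("example_pkg-1.0-cp39-cp39-manylinux1_x86_64.whl"))  # False
```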
package • Any directory with a __main__.py is treated as an executable • The zipimport module provides a default import hook for Python >=2.4 • If the Python import framework sees a zip file with a proper __init__.py, it can be treated as a directory • pex == all of the above put together
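The import half of this picture can be sketched with the standard library alone: a package stored inside a zip file becomes importable once the archive is on sys.path. This is not how pex itself is built, only an illustration of the mechanism it relies on; bundle_demo.zip and zipped_pkg are invented names.

```python
import importlib
import os
import sys
import tempfile
import zipfile

archive_path = os.path.join(tempfile.mkdtemp(), "bundle_demo.zip")
with zipfile.ZipFile(archive_path, "w") as archive:
    # The __init__.py makes zipped_pkg a package inside the archive.
    archive.writestr("zipped_pkg/__init__.py", "")
    archive.writestr(
        "zipped_pkg/tools.py", "def double(x):\n    return 2 * x\n"
    )

# Treat the archive like a directory on the import path.
sys.path.insert(0, archive_path)
importlib.invalidate_caches()
tools = importlib.import_module("zipped_pkg.tools")
print(tools.double(21))  # 42
print(tools.__file__)    # a path that points inside bundle_demo.zip
```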
API YES YES NO NO NO NO NO • Decoupled dependency metadata YES YES NO NO NO YES NO • Ivy based metadata YES NO NO NO NO YES NO • Pluggable YES YES YES YES* YES YES YES • Scriptable YES NO NO YES YES* YES NO • Human readable YES NO NO YES YES YES YES* • Natively polyglot YES NO NO NO YES YES NO • Generic artifact hosting support YES NO NO NO NO NO NO
share a large amount of code • Complex dependencies of third party libs • Variety of languages, code gen frameworks etc. • No need to maintain strict backwards compatibility
Allows for easy collaboration between many authors. • Encourages a cohesive codebase where problems are refactored - not worked around. • Simplifies dependency management within the codebase. All of the code you run with is visible at a single commit in the repo.
Many BUILD files per source tree define targets and goals • 1:1:1 rule – one target per directory representing a single package • Goals describe what you want to do to the targets • Can see all goals with the command pants goal list
not isolate build time dependencies • Uses system-provided .so files: any dependency that ships built .so files (greenlet, for example) is packaged, but the libraries those files rely on at runtime are not.
containers guarantees that the software will always run the same, regardless of its environment. • “The goal is to encapsulate a software component and all its dependencies • run it - without extra dependencies regardless of the underlying machine and the contents of the container.”
a container runtime • It is also an image format • overlay networking • With 1.12 in swarm mode, it’s also a cluster scheduler • Process manager • … and much, much more (service discovery, load balancing, TLS ...) • All compiled into one gigantic binary running as root
Brief history of Python packaging • distutils, setuptools, pip and virtualenv • sdists and bdists • eggs and wheels • pex + pants • Docker • Nix + conda (frankly, another talk in its own right)