Slide 1

Slide 1 text

Conda: A Cross-Platform Package Manager for Any Binary Distribution
Aaron Meurer, Ilan Schnell
Continuum Analytics, Inc.

Slide 2

Slide 2 text

or, Solving the Packaging Problem

Slide 3

Slide 3 text

What is the packaging problem?

Slide 4

Slide 4 text

History

Slide 5

Slide 5 text

Two sides: Installing and Building

Slide 6

Slide 6 text

Two sides: Installing (User) and Building (Developer)

Slide 7

Slide 7 text

Installing
• setup.py install
• easy_install
• pip
• apt-get
• rpm
• emerge
• homebrew
• port
• fink
• …

Slide 8

Slide 8 text

setup.py install
• Fine if it’s pure Python, not so much if it isn’t
• You have to have compilers installed
distutils.errors.DistutilsError: Setup script exited with error: command 'gcc' failed with exit status 1

Slide 9

Slide 9 text

setup.py install: You are your own package manager

Slide 10

Slide 10 text

pip
• Only works with Python
• Not so great for scientific packages that depend on big C libraries
• Try installing h5py if you don’t have HDF5

Slide 11

Slide 11 text

pip: You are a “self integrator”

Slide 12

Slide 12 text

Building

Slide 13

Slide 13 text

Problems
• distutils is not really designed for compiled packages
• numpy.distutils “fork”
• setuptools is overcomplicated
• import setuptools monkeypatches distutils
• Entry points require pkg_resources
• pkg_resources.DistributionNotFound: flake8==2.1.0
• Each egg adds an entry to sys.path
• import sys; new=sys.path[sys.__plen:]; del sys.path[sys.__plen:]; p=getattr(sys,'__egginsert',0); sys.path[p:p]=new; sys.__egginsert = p+len(new)

Slide 14

Slide 14 text

Package maintainers hate having packages that no one can install

Slide 15

Slide 15 text

What is the packaging problem?

Slide 16

Slide 16 text

No content

Slide 17

Slide 17 text

What about wheels?
• Python package specific
• Can’t build wheels for C libraries
• Can’t make a wheel for Python itself
• Still doesn’t address the problem that some metadata is only in the package itself
• You are still a “self integrator”

Slide 18

Slide 18 text

System Packaging solutions
• Linux: yum (rpm), apt-get (dpkg)
• OS X: macports, homebrew, fink
• Windows: chocolatey, npackd

Slide 19

Slide 19 text

System Packaging solutions
• Linux: yum (rpm), apt-get (dpkg)
• OS X: macports, homebrew, fink
• Windows: chocolatey, npackd
• Cross-platform: conda

Slide 20

Slide 20 text

Conda
• System-level package manager (Python agnostic)
• Python, hdf5, and h5py are all conda packages
• Cross-platform (works on Windows, OS X, and Linux)
• Doesn’t require administrator privileges
• Installs binaries (no more compiler woes)
• Metadata stored separately in the repository index
• Uses a SAT solver to resolve dependencies before packages are installed
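The last bullet can be illustrated with a toy example. The sketch below is not conda's solver: it brute-forces every true/false assignment over a tiny, made-up package index instead of calling a real SAT solver, and the package versions and dependency data in INDEX are hypothetical. It only shows the idea of treating each package version as a boolean variable and finding a consistent selection before anything is installed.

```python
from itertools import product

# Hypothetical index: (name, version) -> requirements, where each
# requirement is (dependency name, set of acceptable versions).
INDEX = {
    ("h5py", "2.3"): [("hdf5", {"1.8.13"}), ("python", {"2.7", "3.4"})],
    ("hdf5", "1.8.9"): [],
    ("hdf5", "1.8.13"): [],
    ("python", "2.7"): [],
    ("python", "3.4"): [],
}


def is_consistent(chosen, requests):
    """Check every constraint for one candidate selection of packages."""
    names = [name for name, _ in chosen]
    if len(names) != len(set(names)):
        return False  # at most one version of each package

    def satisfied(dep_name, ok_versions):
        return any(n == dep_name and v in ok_versions for n, v in chosen)

    # Everything the user asked for must be present...
    if not all(satisfied(n, vs) for n, vs in requests.items()):
        return False
    # ...and so must the dependencies of every selected package.
    return all(satisfied(n, vs) for pkg in chosen for n, vs in INDEX[pkg])


def solve(requests):
    """Return the smallest consistent selection, or None if none exists."""
    candidates = sorted(INDEX)
    best = None
    # One boolean per package version: installed or not.
    for bits in product([False, True], repeat=len(candidates)):
        chosen = [c for c, on in zip(candidates, bits) if on]
        if is_consistent(chosen, requests):
            if best is None or len(chosen) < len(best):
                best = chosen
    return best
```

Calling solve({"h5py": {"2.3"}, "python": {"3.4"}}) selects h5py together with a matching hdf5 and python, while a conflicting request such as solve({"h5py": {"2.3"}, "hdf5": {"1.8.9"}}) returns None, so the conflict is detected before any file is touched.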

Slide 21

Slide 21 text

Basic conda usage
• Install a package: conda install sympy
• List all installed packages: conda list
• Search for packages: conda search llvm
• Create a new environment: conda create -n py3k python=3
• Remove a package: conda remove nose
• Get help: conda install --help

Slide 22

Slide 22 text

Advanced usage
• Install a package in an environment: conda install -n py3k sympy
• Update all packages: conda update --all
• Export list of packages: conda list --export > packages.txt
• Install packages from an export: conda install --file packages.txt
• See package history: conda list --revisions
• Revert to a revision: conda install --revision 23
• Remove unused packages and cached tarballs: conda clean -pt

Slide 23

Slide 23 text

What is a conda package?

Slide 24

Slide 24 text

What is a conda package?
Just a tar.bz2 file with the files from the package, and some metadata
/lib
/include
/bin
/man
/info
    files
    index.json

Slide 25

Slide 25 text

What is a conda package?
Just a tar.bz2 file with the files from the package, and some metadata
/lib
/include
/bin
/man
/info
    files
    index.json
Files are not Python specific. Any kind of program at all can be a conda package.
Metadata is static.
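To make the layout concrete, here is a minimal sketch that builds such an archive in memory with Python's tarfile module and reads the static metadata back without running any code from the package. The helper names make_package and read_index, and the example metadata values passed to them, are made up for illustration; this is not how conda itself builds or reads packages.

```python
import io
import json
import tarfile


def make_package(index, files):
    """Return the bytes of a .tar.bz2 holding `files` plus info/ metadata.

    `index` is a dict of static metadata; `files` maps archive paths
    to byte strings.
    """
    members = dict(files)
    # info/files lists the payload; info/index.json holds the metadata.
    members["info/files"] = ("\n".join(sorted(files)) + "\n").encode()
    members["info/index.json"] = json.dumps(index).encode()

    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:bz2") as tar:
        for name, data in members.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_index(package_bytes):
    """Read info/index.json from package bytes without unpacking the rest."""
    with tarfile.open(fileobj=io.BytesIO(package_bytes), mode="r:bz2") as tar:
        return json.load(tar.extractfile("info/index.json"))
```

Because the metadata is a plain JSON file inside the archive, a tool can learn a package's name, version, and dependencies by reading one member, which is what "metadata is static" buys you.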

Slide 26

Slide 26 text

Python Agnostic
A conda package can be anything:
• Python packages
• Python itself
• C libraries (GDAL, netCDF4, dynd, …)
• R
• Node JS
• Perl

Slide 27

Slide 27 text

Installation
• The tarball is unarchived in the pkgs directory
• Files are hard-linked to the install path
• Shebang lines and other instances of a placeholder prefix are replaced with the install prefix
• The metadata is updated, so that conda knows that it is installed
• The post-link script is run (these are rare)
And that’s it
conda install sympy
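The linking and prefix-replacement steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not conda's implementation: PLACEHOLDER is a hypothetical placeholder string, link_package and demo are made-up names, and the sketch copies (rather than hard-links) any file that needs its prefix rewritten so the cached copy in pkgs is left untouched. The metadata update and post-link script are omitted.

```python
import os
import tempfile

# Hypothetical placeholder that a build would bake into text files.
PLACEHOLDER = "/opt/placeholder_prefix"


def link_package(pkg_dir, prefix, has_prefix=()):
    """Install an unpacked package from pkg_dir into the prefix directory.

    Files named in `has_prefix` (relative, '/'-separated) are copied with
    the placeholder rewritten; all other files are hard-linked.
    """
    installed = []
    for root, _dirs, names in os.walk(pkg_dir):
        for name in names:
            src = os.path.join(root, name)
            rel = os.path.relpath(src, pkg_dir).replace(os.sep, "/")
            dst = os.path.join(prefix, *rel.split("/"))
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            if rel in has_prefix:
                with open(src) as f:
                    text = f.read()
                with open(dst, "w") as f:
                    f.write(text.replace(PLACEHOLDER, prefix))
            else:
                os.link(src, dst)  # no data is copied
            installed.append(rel)
    return sorted(installed)


def demo():
    """Create a throwaway pkgs/ and envs/ tree and install into it."""
    root = tempfile.mkdtemp()
    pkg = os.path.join(root, "pkgs", "sympy-0.7.5-0")
    os.makedirs(os.path.join(pkg, "bin"))
    os.makedirs(os.path.join(pkg, "lib"))
    with open(os.path.join(pkg, "bin", "isympy"), "w") as f:
        f.write("#!" + PLACEHOLDER + "/bin/python\n")
    with open(os.path.join(pkg, "lib", "data.txt"), "w") as f:
        f.write("payload\n")
    env = os.path.join(root, "envs", "test")
    installed = link_package(pkg, env, has_prefix={"bin/isympy"})
    return pkg, env, installed
```

Hard links require the pkgs directory and the install prefix to live on the same filesystem, which is why the demo keeps both under one temporary directory.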

Slide 28

Slide 28 text

Installation
And that’s it
conda install sympy

Slide 29

Slide 29 text

Environments
• Environments are simple: just link the package to a different directory
• Hard-links are very cheap, and very fast
• Conda environments are completely independent installations of everything
• No fiddling with PYTHONPATH or symlinking site-packages
• “Activating” an environment just means changing your PATH so that its bin/ or Scripts/ comes first
conda create -n py3k python=3.4
• Unix: source activate py3k
• Windows: activate py3k
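Since activation is described as nothing more than a PATH change, it can be modelled as a pure string operation. The function below is a hedged sketch of that idea only; activated_path is a made-up name, the example prefixes are hypothetical, and the real activate scripts may do more than this.

```python
def activated_path(env_prefix, current_path, windows=False):
    """Return a PATH value with the environment's bin/ or Scripts/ first."""
    sep = ";" if windows else ":"
    bin_dir = env_prefix + ("\\Scripts" if windows else "/bin")
    # Drop empty entries and any earlier copy of the same directory,
    # so activating twice does not grow PATH.
    rest = [p for p in current_path.split(sep) if p and p != bin_dir]
    return sep.join([bin_dir] + rest)
```

For example, activated_path("/home/me/anaconda/envs/py3k", "/usr/bin:/bin") gives "/home/me/anaconda/envs/py3k/bin:/usr/bin:/bin", so the environment's python is found before the system one.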

Slide 30

Slide 30 text

Environments
[Diagram: files in each environment under /envs are hard links to the unpacked packages under /pkgs]
/pkgs
    /python-3.4.1-0
        /bin/python
    /sympy-0.7.5-0
        /bin/isympy
        /lib/python3.4/site-packages/sympy
/envs
    /sympy-env
        /bin/python
        /bin/isympy
        /lib/python3.4/site-packages/sympy
    /test
        /bin/python

Slide 31

Slide 31 text

Environments
Uses:
• Testing (Python 2.6, 2.7, 3.3)
• Development
• Trying new packages from PyPI
• Separating deployed apps with different dependency needs
• Trying new versions of Python
• Reproducible science

Slide 32

Slide 32 text

Building

Slide 33

Slide 33 text

Conda Recipes
• meta.yaml contains metadata
• build.sh is the build script for Unix and bld.bat is the build script for Windows
Recipe files: meta.yaml, build.sh, bld.bat, and optionally fix.patch, run_test.py, post-link.sh
conda build path/to/recipe/

Slide 34

Slide 34 text

Example meta.yaml

Slide 35

Slide 35 text

Conda Recipes
Lots more:
• Command line entry points
• Fine-grained control over conda’s relocation logic
• Inequalities for versions of dependencies (like >=1.2,<2.0)
• “Preprocessing selectors” allow using the same meta.yaml for many platforms
• See http://conda.pydata.org/docs/build.html for full documentation
conda build path/to/recipe/
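A version inequality such as >=1.2,<2.0 reads as "every comma-separated clause must hold". The sketch below shows that reading for purely numeric, dot-separated versions. It is a simplified illustration, not conda's matcher: parse_version and matches are made-up names, and real version strings (with letters, build strings, or wildcards) need more careful ordering rules than a tuple of integers.

```python
import operator

# Comparison operators, with two-character symbols listed first so that
# ">=" is not mistaken for ">".
OPS = [
    (">=", operator.ge),
    ("<=", operator.le),
    ("==", operator.eq),
    ("!=", operator.ne),
    (">", operator.gt),
    ("<", operator.lt),
]


def parse_version(text):
    """Turn '1.10.2' into (1, 10, 2) so that comparison is numeric."""
    return tuple(int(part) for part in text.strip().split("."))


def matches(version, spec):
    """Return True if `version` satisfies every clause of `spec`."""
    v = parse_version(version)
    for clause in spec.split(","):
        clause = clause.strip()
        for symbol, compare in OPS:
            if clause.startswith(symbol):
                if not compare(v, parse_version(clause[len(symbol):])):
                    return False
                break
        else:
            raise ValueError("unsupported clause: %r" % clause)
    return True
```

Comparing integer tuples rather than strings matters: matches("1.10", ">=1.2,<2.0") is True here, whereas a plain string comparison would sort "1.10" before "1.2".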

Slide 36

Slide 36 text

• conda build is only a convenient wrapper
• You can also build packages manually just by following the package specification (http://conda.pydata.org/docs/spec.html)

Slide 37

Slide 37 text

Sharing
• Once you have a conda package, the easiest way to share it is to upload it to Binstar
• Others can install your package with conda install -c binstar_username package
• Or add your channel to their configuration with conda config --add channels binstar_username

Slide 38

Slide 38 text

Self Hosting
• You can also self-host
• Store packages in a directory by platform (osx-64, linux-32, linux-64, win-32, win-64)
• Run conda index on that directory to generate the repodata.json
• Serve this up, or use a file:// url as a channel
• Binstar is just a very convenient hosted wrapper around conda index
conda index directory/osx-64
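Conceptually, indexing a platform directory means collecting each package's static metadata into one file that clients can download. The sketch below illustrates that idea only. It is not the conda index implementation, and the output here is deliberately minimal: it maps each filename to the contents of its info/index.json under a "packages" key, while the real repodata.json carries additional fields. The names write_package, build_repodata, and demo are made up.

```python
import io
import json
import os
import tarfile
import tempfile


def write_package(path, index):
    """Write a .tar.bz2 containing only info/index.json (enough for the demo)."""
    data = json.dumps(index).encode()
    with tarfile.open(path, "w:bz2") as tar:
        info = tarfile.TarInfo("info/index.json")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))


def build_repodata(platform_dir):
    """Collect info/index.json from every package in one platform directory."""
    packages = {}
    for filename in sorted(os.listdir(platform_dir)):
        if filename.endswith(".tar.bz2"):
            path = os.path.join(platform_dir, filename)
            with tarfile.open(path, "r:bz2") as tar:
                packages[filename] = json.load(tar.extractfile("info/index.json"))
    return {"packages": packages}


def demo():
    """Index a throwaway linux-64 directory holding one fake package."""
    platform_dir = os.path.join(tempfile.mkdtemp(), "linux-64")
    os.makedirs(platform_dir)
    write_package(
        os.path.join(platform_dir, "sympy-0.7.5-py34_0.tar.bz2"),
        {"name": "sympy", "version": "0.7.5", "build": "py34_0"},
    )
    repodata = build_repodata(platform_dir)
    with open(os.path.join(platform_dir, "repodata.json"), "w") as f:
        json.dump(repodata, f, indent=2)
    return platform_dir, repodata
```

Because the index is just a JSON file sitting next to the packages, any static file server, or a file:// path, can act as a channel.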

Slide 39

Slide 39 text

Final words
• conda is completely open source (BSD): https://github.com/conda/conda
• We have a mailing list ([email protected])
• A big thanks to Continuum for paying me to work on open source

Slide 40

Slide 40 text

Thanks!
• Sean Ross-Ross (principal binstar.org developer)
• Bryan Van de Ven (original conda author)
• Ilan Schnell (principal conda developer)
• Travis Oliphant (Continuum CEO)