From python scripts to packages: an overview for scientists

cournape
January 08, 2013

A high-level overview of tools to make your code usable by others, for scientists

Transcript

  1. FROM SCRIPTS TO PACKAGES
    practices for hassle-free code reuse
    David Cournapeau, Enthought Ltd - [email protected]

  2. How do you make your software actually usable by others?

  3. A FAMILIAR STORY

  4. • A few scripts for data munging
    • Later shared with co-workers ...
    • ... who modify them
    • A few months later: multiple, incompatible scripts,
    different results

  5. • That "someone else" may be you, 6 months later!

  6. WHAT IS THIS TALK ABOUT ?
    • How to organise a project as it grows
    • How to share software effectively
    • Set of good practices to foster collaboration
    • Tools to help apply those practices
    • Python-focused (but mostly applies to other technologies)

  7. PACKAGING 101

  8. VERSION SOURCE CONTROL
    • Systems to record snapshots of your code
    • Always use one
    • Usage can be low-overhead
    • Use a well-known one: Subversion, Mercurial, Git
    • Free hosted solutions available: Bitbucket, GitHub, Gitorious

  9. CODE ORGANISATION
    Don’t be creative, follow conventions

  10. EXAMPLE
    # Top directory (source tree root)
    foo-1.0
    foo-1.0/README
    foo-1.0/LICENSE
    foo-1.0/setup.py
    # Top python package (“import foo”)
    foo-1.0/foo/__init__.py
    foo-1.0/foo/bar.py

  11. KEY POINTS
    • Use a sensible name for the source tree
    • README to describe the purpose of the software
    • Add a LICENSE file to specify the license
    • setup.py will be used for packaging:
    • use the distutils library to provide essential packaging capabilities
    • Don’t use a ‘Lib’ or ‘src’ directory at the top
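The layout from the EXAMPLE slide can be scaffolded in a few lines. The sketch below is only an illustration, not part of the talk: the helper name `make_skeleton` and the `LAYOUT` list are made up here, and the files are created empty in a temporary directory.

```python
import tempfile
from pathlib import Path

# Files of the conventional layout, relative to the source tree root.
# (Hypothetical helper data: the names come from the EXAMPLE slide.)
LAYOUT = [
    "README",
    "LICENSE",
    "setup.py",
    "foo/__init__.py",
    "foo/bar.py",
]


def make_skeleton(root):
    """Create empty placeholder files for LAYOUT under ``root``.

    Returns the sorted list of created files, relative to ``root``.
    """
    root = Path(root)
    for relative in LAYOUT:
        path = root / relative
        # Create the package directory ("foo") on demand.
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for name in make_skeleton(Path(tmp) / "foo-1.0"):
            print(name)
```

Note that the python package directory (`foo`) sits directly under the source tree root, with no intermediate `Lib` or `src` directory.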

  12. DISTUTILS
    # setup.py file content
    from distutils.core import setup

    setup(
        name="foo", version="1.0",
        # the one-line summary is passed as "description"
        description="a few words about the package",
        author="John Doe", author_email="[email protected]",
        license="BSD", url="http://www.example.com",
        packages=["foo"],
    )

  13. WHY SETUP.PY
    • Install your package
    python setup.py install
    • Build windows installers
    python setup.py bdist_wininst
    • Build tarballs
    python setup.py sdist

  14. WHY SETUP.PY ? (2)
    • Most python users know what to do with it
    • Simple setup.py files are easy to write
    • Works on every platform

  15. FROM A PACKAGE TO A
    PROJECT

  16. RATIONALE
    • Package vs project: from one to N developers
    • A project usually involves documentation, tests, scheduled
    releases, etc.
    • In many ways, this is about bootstrapping to make other
    people do the work for you !

  17. DOCUMENTATION
    • Multiple kinds of documentation:
    • API documentation: what a given function/class does
    • “Proper” documentation: usage-oriented, should become the main
    documentation as the project matures
    • Use sphinx to deal with documentation

  18. • Sphinx transforms reST (text-like format) into html, latex/pdf
    • Knows how to extract inline documentation from python
    code
    • Can embed math (latex), code snippets
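As a rough illustration of "inline documentation", here is what a reST docstring might look like. The function `add` is hypothetical (the talk only imports it later in a test), and the `:param:` / `:returns:` fields are the field-list style that Sphinx understands. With the `sphinx.ext.autodoc` extension enabled, a directive such as `.. autofunction:: foo.bar.add` in a `.rst` file would pull this text into the generated pages, assuming the function lived in `foo/bar.py`.

```python
def add(a, b):
    """Return the sum of two numbers.

    :param a: first operand
    :param b: second operand
    :returns: ``a + b``
    """
    return a + b


if __name__ == "__main__":
    # The docstring is plain reST text attached to the function object;
    # this is the text a documentation tool can extract.
    print(add.__doc__)
```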

  19. SIMPLE EXAMPLE
    foo-1.0
    foo-1.0/setup.py
    # created by sphinx
    foo-1.0/doc/
    foo-1.0/doc/Makefile
    foo-1.0/doc/src/conf.py
    foo-1.0/doc/src/...
    • Doc bootstrapped with sphinx-quickstart script
    • Makefile to help build documentation, e.g. make html

  20. HOSTING YOUR DOC

  21. HOSTING YOUR DOC (2)
    • http://readthedocs.org is a free service to host sphinx
    documentation
    • Only requirement: provide a link to your sphinx
    documentation
    • Documentation automatically built for you when code is
    updated

  22. TESTING
    • Testing is about validating your code
    • Becomes more important as the project grows in #people
    and #size

  23. WRITING A SIMPLE TEST
    # test_add.py file
    from unittest import TestCase
    from foo import add

    class TestSimple(TestCase):
        def test1(self):
            self.assertEqual(add(1, 2), 3)
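The slide's test imports `add` from `foo`, which the talk never shows. The sketch below is a self-contained variant under that assumption: `add` is a hypothetical stand-in defined in the same file, and `run_tests` is a helper invented here to run the test case from Python instead of the command line.

```python
import unittest


def add(a, b):
    """Hypothetical stand-in for ``foo.add``."""
    return a + b


class TestSimple(unittest.TestCase):
    def test_add_integers(self):
        self.assertEqual(add(1, 2), 3)

    def test_add_is_commutative(self):
        self.assertEqual(add(2, 5), add(5, 2))


def run_tests():
    """Load TestSimple into a suite, run it, and return the result object."""
    suite = unittest.TestLoader().loadTestsFromTestCase(TestSimple)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```

Each `test_*` method is collected and run independently, so one failing assertion does not hide the others.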

  24. RUNNING A SIMPLE TEST
    • Tests can be run with a test runner, e.g. unittest discovery
    (built in for python >= 2.7, available for older versions as the
    3rd party discover package):
    python -m unittest discover foo
    • Other popular testing frameworks: py.test, nose
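Discovery can also be driven from Python rather than from the command line. The sketch below is an assumption-laden demo, not the talk's setup: it writes a throwaway test file (`test_discover_demo.py`, a name made up here) into a temporary directory and lets `unittest.TestLoader.discover` find it. The helper names `discover_and_run` and `demo` are hypothetical.

```python
import tempfile
import textwrap
import unittest
from pathlib import Path

# Source of a throwaway test module, written to disk by demo().
TEST_SOURCE = textwrap.dedent('''
    from unittest import TestCase

    class TestSimple(TestCase):
        def test1(self):
            self.assertEqual(1 + 2, 3)
''')


def discover_and_run(start_dir):
    """Collect every test*.py file under start_dir and run the tests."""
    suite = unittest.TestLoader().discover(str(start_dir), pattern="test*.py")
    return unittest.TextTestRunner(verbosity=0).run(suite)


def demo():
    """Create one test file in a temporary directory and discover it."""
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "test_discover_demo.py").write_text(TEST_SOURCE)
        return discover_and_run(tmp)


if __name__ == "__main__":
    demo()
```

This is roughly what `python -m unittest discover` does for a directory: import every file matching the pattern and run the test cases it finds.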

  25. ORGANISING TESTS
    • Make your tests importable
    • Put tests inside your package, not alongside it
    foo-1.0
    foo-1.0/setup.py
    # tests subdirectory inside the package directories
    foo-1.0/foo/tests
    foo-1.0/foo/tests/__init__.py
    foo-1.0/foo/tests/...

  26. TRAVIS-CI

  27. TRAVIS-CI
    • http://travis-ci.org is a free hosted service to run tests (not
    python specific)
    • Starting point: a simple text configuration file in your project:
    # .travis.yml file
    language: python
    python:
      - "2.7"
      - "3.2"
    script: python tests/test_all_of_the_units.py

  28. WRAPPING UP

  29. CONCLUSION
    • Giving away code is not enough: it needs to be reusable
    • Reusability is key to help collaboration and ensure
    maintainability
    • Lots of freely available tools to help document, test and
    distribute software
    • Helping others may be helping yourself in the future

  30. LINKS
    • Hosted source code repositories: http://github.com,
    https://bitbucket.org/
    • Writing great documentation:
    http://jacobian.org/writing/great-documentation/
    • Hosting sphinx documentation: https://readthedocs.org/
    • Hosted test runner: http://travis-ci.org
    • To go further: http://guide.python-distribute.org/
