PyData 2013 keynote (IPython & friends)

Similar to other talks I've given recently, this was my keynote at the Silicon Valley edition of PyData 2013.

Full video: http://vimeo.com/63250251

Fernando Perez

March 20, 2013

Transcript

  1. IPython Open Source Academia Wrapup
    IPython
    A modern vision of interactive computing
    Fernando Pérez
    http://fperez.org, @fperez_org
    Henry H. Wheeler Jr. Brain Imaging Center, UC Berkeley
    PyData 2013, Silicon Valley
    March 20, 2013

  2. Outline
    1 IPython: Interactive Python
    2 The Life of an Open Source Project
    3 Academia vs Open Source
    4 Wrapup
    FP (UC Berkeley) IPython 3/20/13 2 / 34

  3. In the beginning, IBM said...
    Let there be FORTRAN

  5. Beyond (Floating Point) Number Crunching
    Hardware floating point
    Arbitrary precision integers
    Rationals
    Interval arithmetic
    Symbolic manipulation
    FORTRAN
    Extended precision floating point
    Text processing
    Databases
    Graphical user interfaces
    Web interfaces
    Hardware control
    Multi-language integration
    Data formats: HDF5, XML, ...

  6. The purpose of computing is insight, not numbers.
    Richard Hamming, 1962

  7. The computer as microscope
    Exploratory: Problem’s definition evolves as we understand it.
    No ‘requirements’ to build an application against.
    Mathematica, Maple, Matlab, IDL, etc.
    All have an interactive environment.
    Applications Languages

  8. IPython: part of a Rich Ecosystem
    IPython
    NetworkX

  9. The Lifecycle of a Scientific Idea (schematically)
    1 Individual exploratory work
    2 Collaborative development
    3 Parallel production runs (HPC, cloud, ...)
    4 Publication (with reproducible results!)
    5 Education
    6 Goto 1.
    The Problem with most tools
    Barriers and discontinuities in workflow in between all the steps

  11. IPython’s goal:
    Fluid transitions in all these steps

  12. Demo

  13. Pillar #1: An architecture for interactive computing

  14. Pillar #2: the Notebook Format
    JSON but version control-friendly
    Easy for machine processing, fixable by hand if need be.
    Lots of hooks for metadata
    Not Python-specific (Ruby, JS notebooks exist, R, Julia planned)
    Produce Markdown, reST, LaTeX, HTML, etc...
    An open format for sharing, publishing and
    archiving executable computational work
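
The "JSON but version control-friendly" and "easy for machine processing" points can be illustrated with a minimal sketch. The document structure and field names below are hypothetical and much simpler than the real notebook schema; only the standard-library `json` module is used. Sorted keys and fixed indentation are one plausible way to keep diffs small, not necessarily what IPython does.

```python
import json

# Hypothetical, simplified notebook-like document. The field names are
# illustrative assumptions, not the official notebook schema.
notebook = {
    "metadata": {"name": "demo"},
    "cells": [
        {"cell_type": "markdown", "source": "# A title"},
        {"cell_type": "code", "source": "print(1 + 1)", "outputs": ["2"]},
    ],
}


def dumps_stable(doc):
    """Serialize with sorted keys and indentation so text diffs stay small."""
    return json.dumps(doc, indent=1, sort_keys=True)


def code_sources(doc):
    """Return the source text of every code cell in the document."""
    return [cell["source"] for cell in doc["cells"] if cell["cell_type"] == "code"]


text = dumps_stable(notebook)      # what would be written to disk
roundtrip = json.loads(text)       # what any other tool would read back
print(code_sources(roundtrip))
```

Because the file is plain JSON, any language with a JSON parser can extract or transform cells in the same way, which is what makes conversion to other formats tractable.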

  15. Outline
    1 IPython: Interactive Python
    2 The Life of an Open Source Project
    3 Academia vs Open Source
    4 Wrapup

  16. Documented protocols and formats:
    a growing ecosystem around IPython

  17. An Emacs Notebook Client!
    Takafumi Arakaki
    http://tkf.github.com/emacs-ipython-notebook

  18. Microsoft Visual Studio 2010 integrated console
    Dino Viehland and Shahrokh Mortazavi (Microsoft)
    http://pytools.codeplex.com

  19. A vim client to control an IPython kernel/console
    Paul Ivanov (Berkeley)
    https://github.com/ivanov/vim-ipython

  20. Notebooks on Windows Azure Cloud
    Shahrokh Mortazavi (Microsoft), B.G., F.P.
    http://bit.ly/JQeojD

  21. Star Cluster: IPython parallel+Notebook on Amazon EC2
    Justin Riley (MIT)
    http://web.mit.edu/star/cluster

  22. NBViewer: easy notebook sharing
    Matthias Bussonnier
    http://nbviewer.ipython.org

  23. Other projects using IPython
    Scientific
    EPD: Enthought Python Distribution.
    Anaconda: Continuum Python Distribution.
    Sage: open source mathematics.
    PyRAF: Space Telescope Science Institute
    CASA: Nat. Radio Astronomy Observatory
    Ganga: CERN
    PyMAD: neutron spectrom., Laue Langevin
    Sardana: European Synchrotron Radiation
    ASCEND: eng. modeling (Carnegie Mellon).
    JModelica: dynamical systems.
    DASH: Denver Aerosol Sources and Health.
    Trilinos: Sandia National Lab.
    DoD: baseline configuration.
    NiPype: computational pipelines, MIT.
    PyIMSL Studio, by Visual Numerics.
    ...
    Web/Other
    Visual Studio 2010: MS.
    Django.
    TurboGears.
    Pylons web framework
    Zope and Plone CMS.
    Axon Shell, BBC
    Kamaelia.
    Schevo database.
    Pitz: distributed task/bug tracking.
    iVR (interactive Virtual Reality).
    Movable Python (portable Python environment).
    ...

  24. How did we get here?
    A brief history of IPython
    October 2001: “just a little afternoon hack”
    My own $PYTHONSTARTUP:
    ipython-0.0.1.py: 259 lines.
    In [N]: prompts and _N results cache.
    IPP (Interactive Python Prompt) by Janko Hauser (Oceanography)
    LazyPython by Nathan Gray (CS Caltech)
    2002: Ignore John Hunter’s Gnuplot support patches
    ... let there be matplotlib
    (actually finish my PhD!)
    2005: Brian Granger, Min Ragan-Kelley
    First parallel tools, Twisted-based
    2005-2008: Ville Vainio, Gaël Varoquaux, Laurent Dufréchou
    Core maintenance, Wx integration.
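
The "In [N]: prompts and _N results cache" idea from the history above can be sketched in a few lines. This is a toy illustration, not IPython's implementation: the class name `ResultsCache` and its `run` method are hypothetical, and it only evaluates single expressions.

```python
class ResultsCache:
    """Toy numbered results cache; a sketch, not IPython's actual code."""

    def __init__(self):
        self.count = 0
        # User namespace in which expressions are evaluated.
        self.namespace = {"Out": {}}

    def run(self, expression):
        """Evaluate one expression and cache a non-None result as _N."""
        self.count += 1
        value = eval(expression, self.namespace)
        if value is not None:
            self.namespace["Out"][self.count] = value
            self.namespace["_%d" % self.count] = value
            self.namespace["_"] = value  # most recent result
        return value


shell = ResultsCache()
shell.run("2 + 3")     # cached under the name _1
shell.run("_1 * 10")   # later input can refer back to the first result
```

Keeping earlier results addressable by number is what lets an exploratory session build on previous outputs without re-running them.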

  25. Summer 2009: NIH-funded cleanup by Brian.
    March 2010: prototype networked shell using ØMQ
    2-day sprint with Brian
    Enthought funds Qt console. Min ports parallel code to ØMQ
    Core architecture ready, foundation for Notebook
    Fall 2010
    James Gao at Berkeley builds (5th!) Notebook Prototype.
    Summer 2011
    Brian rebuilds James’ prototype into today’s Notebook.

  26. An important plot
    http://www.ohloh.net/p/ipython

  27. (Incomplete) Cast of Characters
    Brian Granger - Physics, Cal State San Luis Obispo
    Min Ragan-Kelley - Nuclear Engineering, UC Berkeley
    Matthias Bussonnier - Physics, Institut Curie, Paris
    Brad Froehle - Mathematics, UC Berkeley
    Paul Ivanov - Neuroscience, UC Berkeley.
    Robert Kern - Enthought
    Thomas Kluyver - Biology, U. Sheffield
    Jonathan March - Enthought
    Evan Patterson - Physics, Caltech/Enthought
    Jörgen Stenarson - Elect. Engineering, Sweden.
    Stefan van der Walt - UC Berkeley
    John Hunter - TradeLink Securities, Chicago.
    Prabhu Ramachandran - Aerospace Engineering, IIT Bombay.
    Satra Ghosh - MIT Neuroscience
    Gaël Varoquaux - Neurospin (Orsay, France)
    Ville Vainio - CS, Tampere University of Technology, Finland
    Barry Wark - Neuroscience, U. Washington.
    Ondrej Certik - Physics, U Nevada Reno
    Darren Dale - Cornell
    Justin Riley - MIT
    Mark Voorhies - UC San Francisco
    Nicolas Rougier - INRIA Nancy Grand Est
    Thomas Spura - Fedora project
    Many more! (~220 commit authors)

  28. Outline
    1 IPython: Interactive Python
    2 The Life of an Open Source Project
    3 Academia vs Open Source
    4 Wrapup

  29. Support at the edges of academic funding
    Enthought, Austin, TX: Lots!
    Microsoft: WinHPC support, Visual Studio integration, Azure (thanks to Shahrokh Mortazavi).
    DoD/DRC Inc: funding through Sept. 2012 (thanks to Jose Unpingco and Chris Kees).
    NIH: via NiPy grant
    NSF: via Sage compmath grant
    Google: Summer of Code 2005, 2010.
    Tech-X Corp., Boulder, CO: Parallel/notebook (previous versions)
    Recent stable funding (2 years, 7 people, J. Taylor):

  30. Open Source:
    skills, tools and practices we need!
    A culture where things get done.
    Wildly collaborative
    Reproducible by necessity
    Version control, testing, documentation, public peer review, etc.

  31. Reward Structure in academia:
    we punish all of the above
    Departmental boundaries: interdisciplinary work is a great buzzword, not such a great career path.
    Computational heritage is built on code, not on citations
    Continuous evolution vs publication milestones
    Authorship in collaborative works vs the first-author paper.
    Scholarship and intellectual effort embedded in the code.

  32. NumFOCUS: Open Code, Better Science
    Promote the health of our open source scientific computing ecosystem
    Support the development of multiple projects.
    Community-created and driven.
    A neutral ground for industry, academia and government to support scientific open source.
    501(c)(3) - donations are tax-exempt in the USA
    http://numfocus.org

  33. Outline
    1 IPython: Interactive Python
    2 The Life of an Open Source Project
    3 Academia vs Open Source
    4 Wrapup

  34. The future of IPython: a 2-year roadmap
    Spring/summer 2013: IPython 1.0
    Notebook document management (nbconvert)
    JavaScript internals cleanup
    Fall 2013
    Interactive JavaScript API
    With callbacks to remote kernels.
    2014
    Multiuser server
    Simple to deploy
    Trusted (shell OK) Unix users in a lab, group, class, etc.
    https://github.com/ipython/ipython/wiki/Roadmap:-IPython

  35. In closing: our vision of scientific computing
    Build on the right abstractions
    The kernel: unify interactive and parallel computing
    → you only have one brain!
    A single protocol: many kernels, many clients.
    Communications and logging
    the protocol is the notebook file format.
    Insight and communication (Hamming)
    “Literate computing” vs “literate programming”.
    Build a community and an ecosystem
    “How to Scale a Code in the Human Dimension”, M. Turk,
    http://arxiv.org/abs/1301.7064.
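
The "single protocol: many kernels, many clients" idea can be sketched as plain JSON-serializable messages. This is a simplified illustration under stated assumptions: the helper `make_message` is hypothetical, and the field names are loosely modeled on a header/parent/content layout rather than copied from the documented IPython wire format.

```python
import json
import uuid


def make_message(msg_type, content, session, parent_header=None):
    """Build a simplified message dict (hypothetical helper, not IPython's API)."""
    return {
        "header": {
            "msg_id": str(uuid.uuid4()),
            "session": session,
            "msg_type": msg_type,
        },
        # A reply carries the header of the request it answers.
        "parent_header": parent_header or {},
        "content": content,
    }


session = str(uuid.uuid4())

# A client (terminal, Qt console, notebook, editor plugin) asks for execution.
request = make_message("execute_request", {"code": "1 + 1"}, session)

# A kernel in any language answers and links the reply to the request.
reply = make_message(
    "execute_reply", {"status": "ok"}, session, parent_header=request["header"]
)

wire = json.dumps(request)  # language-neutral text that any client can produce
```

Since clients and kernels only need to agree on messages like these, a new frontend or a kernel for another language can join without changes to the other side.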

  38. John D. Hunter, 1968-2012: http://matplotlib.org
    Memorial fund: http://numfocus.org/johnhunter
