Science and Python: an interactively biased retrospective of a (mostly) successful decade

Fernando Perez
November 12, 2012

Slides for my PyCon Canada keynote; they cover material similar to my other recent talks, with a slightly different emphasis.

Video: http://www.youtube.com/watch?v=F4rFuIb1Ie4

Transcript

  1. Me and computers? SciComp Examples IPython Trouble
    Science and Python: an interactively biased retrospective
    of a (mostly) successful decade
    Fernando Pérez
    http://fperez.org, @fperez_org
    PyCon Canada
    Toronto
    Nov 11, 2012

  2. Outline
    1 Me and computers?
    2 Scientific Computing
    3 Two examples
    4 IPython: Interactive Python
    5 Trouble in paradise
    FP (UC Berkeley) Python for science 11/11/12 2 / 46

  4. Physics undergrad, plot fractals in TurboPascal
    Program on paper, use mom’s office PC on weekends.
    Debug on paper. Think a lot away from the screen.

  5. VGA graphics!
    Binary compiled in 1991, screenshot taken May 2011!

  6. Clueless about the internet (which I unplugged)
    Image credit: Carlos Latuff

  7. A learning moment...
    Image credit: crazytales562 on Flickr

  8. Outline
    1 Me and computers?
    2 Scientific Computing
    3 Two examples
    4 IPython: Interactive Python
    5 Trouble in paradise

  9. Computing is not the ‘third branch’ of science...
    It is now the backbone of theory and experiment!

  10. Scientific computing
    The computer as a microscope
    Exploratory: Problem’s definition evolves as we understand it.
    No ‘requirements’ to build an application against.
    Mathematica, Maple, Matlab, IDL, etc.
    All have an interactive environment.
    Applications Languages

  11. Beyond (Floating Point) Number Crunching
    Hardware floating point
    Arbitrary precision integers
    Rationals
    Interval arithmetic
    Symbolic manipulation
    FORTRAN
    Extended precision floating point
    Text processing
    Databases
    Graphical user interfaces
    Web interfaces
    Hardware control
    Multi-language integration
    Data formats: HDF5, XML, ...
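
Several of the number types listed on this slide ship with Python's standard library. The sketch below is illustrative rather than taken from the talk: it contrasts hardware floating point with arbitrary precision integers, exact rationals, and extended precision decimals. Interval arithmetic and symbolic manipulation need third-party packages and are not shown.

```python
from decimal import Decimal, getcontext
from fractions import Fraction

# Hardware floating point: binary doubles cannot represent 0.1 exactly,
# so this comparison fails.
float_sum = 0.1 + 0.2
float_is_exact = (float_sum == 0.3)

# Arbitrary precision integers: Python ints grow as large as needed.
big = 2 ** 200

# Rationals: exact arithmetic on fractions.
rational_sum = Fraction(1, 10) + Fraction(2, 10)

# Extended precision floating point: decimals with a chosen number of
# significant digits (50 here, an arbitrary choice for illustration).
getcontext().prec = 50
root_two = Decimal(2).sqrt()

print(float_is_exact)
print(big)
print(rational_sum)
print(root_two)
```

The rational sum compares equal to `Fraction(3, 10)`, while the floating point sum does not compare equal to `0.3`.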

  12. Scientific Python: a Rich Ecosystem
    IPython
    NetworkX

  13. Outline
    1 Me and computers?
    2 Scientific Computing
    3 Two examples
    4 IPython: Interactive Python
    5 Trouble in paradise

  14. Data analysis for epilepsy surgery
    Isolating the origin of drug-resistant epileptic seizures which require surgery.
    John Hunter, Department of Pediatric Neurology, University of Chicago.

  15. Electrode location in 3D, combined with MRI data

  16. Correlation analysis of seizure data

  17. Matplotlib: 2d plotting
    Matplotlib 1.2.0 is official, with Python 3 support!!!

  18. Matplotlib: 3d plotting

  19. JPL: Mars mission trajectory design and nav data
    Ted Drain and Lynn Craig, Jet Propulsion Laboratory (NASA/Caltech)
    From: Name Elided
    Date: Oct 2, 2007 7:15 PM
    Subject: Fwd: matplotlib bug numbers
    To: John Hunter
    One of my lead developers mentioned that they had sent a bug to you about the annotations feature of
    MatPlotLib. Would you be able to let me know what the timeline is to resolve that bug? The reason is that
    the feature is needed for the Phoenix project and their arrival at Mars will be in March sometime, but they
    are doing their testing in the coming few months. This annotation feature is used on reports that present
    the analysis of the trajectory to the navigation team and it shows up on our schedule. It would really
    help me to know approximately when it could be resolved.
    B-plane plots are used to show the trajectory of a spacecraft with respect to the target body (specifically
    perpendicular to the incoming asymptote of the spacecraft trajectory) and we plot them with the y-axis
    inverted. The plot is used heavily in flight operations so it is important to our customers.
    In addition, we have what is called a thundering herd plot where many different trajectory solutions
    (determined from different measurement sources) are plotted together. The annotations are important there so
    we can see which plot corresponds to each source of data. I hope it helps to know how your code will be
    used in spacecraft navigation.
    Thanks for all your efforts.

  20. JPL: Mars mission data visualization
    Expected communication power levels between an orbiting spacecraft and
    a lander as it goes through the atmosphere:

  21. August 23, 2011
    The astronomy ‘event of a generation’
    Josh Bloom, UC Berkeley Astronomy
    @profjsb

  22. Monday Tuesday Wednesday
    Supernova PTF11kly:
    Event of a Generation found on Tuesday
    Nearest Type Ia supernova in > 25 years
    Soon visible with binoculars
    http://bit.ly/ptf11kly

  23. Outline
    1 Me and computers?
    2 Scientific Computing
    3 Two examples
    4 IPython: Interactive Python
    5 Trouble in paradise

  24. Why IPython?
    (something other than
    “I’d rather not finish my dissertation”)

  26. The Lifecycle of a Scientific Idea (schematically)
    1 Individual exploratory work
    2 Collaborative development
    3 Production work (HPC, cloud, parallel)
    4 Publication (with reproducible results!)
    5 Education
    6 Goto 1.
    The Problem with most tools
    Barriers and discontinuities in workflow in between all the steps

  28. IPython’s goal:
    Fluid transitions in all these steps

  29. Demo

  30. Pillar #1: An architecture for interactive computing

  31. Pillar #2: the Notebook Format
    JSON but version control-friendly
    Easy for machine processing, fixable by hand if need be.
    Lots of hooks for metadata
    Not Python-specific (R and Ruby notebooks exist, Julia planned)
    Produce Markdown, reST, LaTeX, HTML, etc...
    An open format for sharing, publishing and
    archiving executable computational work
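
As a rough illustration of why a JSON notebook is easy to process by machine and friendly to version control, here is a minimal sketch. The field names (`cells`, `cell_type`, `source`, `metadata`) are an assumption modeled on the later nbformat 4 layout, not taken from the slides, and the notebook format in use at the time of the talk differed in its details. The helper `notebook_to_markdown` is hypothetical and is not an IPython API.

```python
import json

# A tiny notebook-like document (illustrative structure, see lead-in).
notebook = {
    "metadata": {"language": "python"},
    "cells": [
        {"cell_type": "markdown", "metadata": {},
         "source": ["# Demo\n", "Some notes."]},
        {"cell_type": "code", "metadata": {},
         "source": ["x = 2 + 2\n", "print(x)"]},
    ],
}

# Sorted keys and one item per line keep textual diffs small and stable.
text = json.dumps(notebook, indent=1, sort_keys=True)


def notebook_to_markdown(nb):
    """Render cells as Markdown; code cells become indented code blocks."""
    parts = []
    for cell in nb["cells"]:
        body = "".join(cell["source"])
        if cell["cell_type"] == "code":
            body = "\n".join("    " + line for line in body.splitlines())
        parts.append(body)
    return "\n\n".join(parts)


markdown = notebook_to_markdown(json.loads(text))
print(markdown)
```

Because the file is plain JSON, the same loop could emit reST or HTML instead, and a broken file can still be repaired in a text editor.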

  32. Documented protocols and formats:
    a growing ecosystem around IPython

  33. Microsoft Visual Studio 2010 integrated console
    Dino Viehland and Shahrokh Mortazavi (Microsoft); http://pytools.codeplex.com

  34. A vim client to control an IPython kernel/console
    Paul Ivanov (Berkeley), https://github.com/ivanov/vim-ipython

  35. An Emacs Notebook Client!
    Takafumi Arakaki: http://tkf.github.com/emacs-ipython-notebook.

  36. Notebooks on Windows Azure Cloud
    Shahrokh Mortazavi (Microsoft), B.G., F.P.: http://bit.ly/JQeojD.

  37. Star Cluster: IPython parallel+Notebook on Amazon EC2
    Justin Riley (MIT): http://web.mit.edu/star/cluster

  38. One-click single notebook on Amazon EC2
    Carl Smith (UK): https://notebookcloud.appspot.com.

  39. Other projects using IPython
    Scientific
    EPD: Enthought Python Distribution.
    Sage: open source mathematics.
    PyRAF: Space Telescope Science Institute
    CASA: Nat. Radio Astronomy Observatory
    Ganga: CERN
    PyMAD: neutron spectrom., Laue Langevin
    Sardana: European Synchrotron Radiation
    ASCEND: eng. modeling (Carnegie Mellon).
    JModelica: dynamical systems.
    DASH: Denver Aerosol Sources and Health.
    Trilinos: Sandia National Lab.
    DoD: baseline configuration.
    Mayavi: 3d visualization, Enthought.
    NiPype: computational pipelines, MIT.
    PyIMSL Studio, by Visual Numerics.
    ...
    Web/Other
    Visual Studio 2010: MS.
    Django.
    Turbo Gears.
    Pylons web framework
    Zope and Plone CMS.
    Axon Shell, BBC
    Kamaelia.
    Schevo database.
    Pitz: distributed task/bug tracking.
    iVR (interactive Virtual Reality).
    Movable Python (portable Python environment).
    ...

  40. (Incomplete) Cast of Characters
    Brian Granger - Physics, Cal State San Luis Obispo
    Min Ragan-Kelley - Nuclear Engineering, UC Berkeley
    Matthias Bussonnier - Physics, Institut Curie, Paris
    Jonathan March - Enthought
    Thomas Kluyver - Biology, U. Sheffield
    Jörgen Stenarson - Elect. Engineering, Sweden.
    Paul Ivanov - Neuroscience, UC Berkeley.
    Robert Kern - Enthought
    Evan Patterson - Physics, Caltech/Enthought
    Brad Froehle - Mathematics, UC Berkeley
    Stefan van der Walt - UC Berkeley
    John Hunter - TradeLink Securities, Chicago.
    Prabhu Ramachandran - Aerospace Engineering, IIT Bombay.
    Satra Ghosh - MIT Neuroscience
    Gaël Varoquaux - Neurospin (Orsay, France)
    Ville Vainio - CS, Tampere University of Technology, Finland
    Barry Wark - Neuroscience, U. Washington.
    Ondrej Certik - Physics, U Nevada Reno
    Darren Dale - Cornell
    Justin Riley - MIT
    Mark Voorhies - UC San Francisco
    Nicolas Rougier - INRIA Nancy Grand Est
    Thomas Spura - Fedora project
    Many more! (~150 commit authors)

  41. Support
    Thank you!
    Enthought, Austin, TX: Lots!
    Microsoft: WinHPC support, Visual Studio integration, Azure
    (thanks to Shahrokh Mortazavi).
    DoD/DRC Inc: funding through Sept. 2012 (thanks to Jose
    Unpingco and Chris Kees).
    NIH: via NiPy grant
    NSF: via Sage compmath grant
    Google: Summer of Code 2005, 2010.
    Tech-X Corp., Boulder, CO: Parallel/notebook (previous versions)

  42. Outline
    1 Me and computers?
    2 Scientific Computing
    3 Two examples
    4 IPython: Interactive Python
    5 Trouble in paradise

  43. Too few are lifting too many
    [Figure: Normalized commit rates since Jan-2010. x-axis: individual committer (1 to 10); y-axis: commit rate (0.0 to 1.0); projects: cython, ipython, matplotlib, mayavi, numpy, scipy, sympy]
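
The slide does not say how the commit rates were normalized. One plausible reading, sketched below as an assumption, is to rank a project's committers by commit count and divide each count by the top committer's count. The author names and counts are made-up placeholders, not data from the talk, and `normalized_commit_rates` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical commit log: one author name per commit (placeholder data).
commit_authors = ["ana"] * 50 + ["bo"] * 10 + ["cy"] * 5 + ["di"] * 1


def normalized_commit_rates(authors, top_n=10):
    """Rank committers by commit count and scale so the top one is 1.0."""
    counts = [n for _, n in Counter(authors).most_common(top_n)]
    if not counts:
        return []
    return [n / counts[0] for n in counts]


rates = normalized_commit_rates(commit_authors)
print(rates)
```

A curve that drops steeply after the first one or two committers is the pattern the slide title describes: a few people carry most of the work.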

  44. Challenges, weaknesses?
    Speed MATTERS! See JuliaLang.org
    numpy.f2py: Fortran
    Cython
    Numba: an LLVM JIT for typed code/arrays. Very exciting
    development.
    Packaging
    The neverending nightmare.

  46. Distutils, you say?
    Image credit: http://goodsky.homestead.com/files/Labyrinth800X600.jpg

  47. OK, we’re brave, we go in...
    Image credit: http://xkcd.com/246

  48. setuptools & friends
    It’s called easy_install, so it should be easy...
    Image credit: http://pixdaus.com/single.php?id=47779

  49. The language lured me in...
    But I stayed for the community!
    Real friendships and incredible people
    A culture of generous and mutual cross-project collaboration
    But we have a ton of work to do!
    The tools we need must be built by scientists.
    Lots of space for truly innovative thinking, and Python is an
    expressive tool for the exercise.

  50. John D. Hunter, 1968-2012: matplotlib.org
    Memorial fund: numfocus.org/johnhunter
