Slide 1

Slide 1 text

Me and computers? SciComp Examples IPython Trouble
Science and Python: an interactively biased retrospective of a (mostly) successful decade
Fernando Pérez
http://fperez.org, @fperez_org
PyCon Canada, Toronto, Nov 11, 2012

Slide 2

Slide 2 text

Outline
1 Me and computers?
2 Scientific Computing
3 Two examples
4 IPython: Interactive Python
5 Trouble in paradise
FP (UC Berkeley) Python for science 11/11/12 2 / 46

Slide 3

Slide 3 text

Outline
1 Me and computers?
2 Scientific Computing
3 Two examples
4 IPython: Interactive Python
5 Trouble in paradise

Slide 4

Slide 4 text

Physics undergrad, plot fractals in Turbo Pascal
Program on paper, use mom’s office PC on weekends.
Debug on paper.
Think a lot away from the screen.

Slide 5

Slide 5 text

VGA graphics! Binary compiled in 1991, screenshot taken May 2011!

Slide 6

Slide 6 text

Clueless about the internet (which I unplugged)
Image credit: Carlos Latuff

Slide 7

Slide 7 text

A learning moment...
Image credit: crazytales562 on Flickr

Slide 8

Slide 8 text

Outline
1 Me and computers?
2 Scientific Computing
3 Two examples
4 IPython: Interactive Python
5 Trouble in paradise

Slide 9

Slide 9 text

Computing is not the ‘third branch’ of science...
It is now the backbone of theory and experiment!

Slide 10

Slide 10 text

Scientific computing
The computer as a microscope
Exploratory: the problem’s definition evolves as we understand it.
No ‘requirements’ to build an application against.
Mathematica, Maple, Matlab, IDL, etc. all have an interactive environment.
(Diagram labels: Applications, Languages)

Slide 11

Slide 11 text

Beyond (Floating Point) Number Crunching
Hardware floating point
Arbitrary precision integers
Rationals
Interval arithmetic
Symbolic manipulation
FORTRAN
Extended precision floating point
Text processing
Databases
Graphical user interfaces
Web interfaces
Hardware control
Multi-language integration
Data formats: HDF5, XML, ...
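A few of the numeric items above can be tried with nothing but the Python standard library. This is a minimal sketch, not something shown in the talk; the variable names are illustrative, and `decimal` is used here as one possible stand-in for extended precision floating point.

```python
from decimal import Decimal, getcontext
from fractions import Fraction

# Arbitrary precision integers: Python ints grow as needed, no overflow.
big = 2 ** 200

# Rationals: exact arithmetic, no rounding error.
third = Fraction(1, 3)
exact_sum = third + third + third  # exactly 1

# Hardware floating point: fast, but 0.1 and 0.2 are not exactly representable.
float_sum = 0.1 + 0.2

# Extended precision (decimal) floating point with 50 significant digits.
getcontext().prec = 50
root_two = Decimal(2).sqrt()

print(big.bit_length())
print(exact_sum == 1)
print(float_sum == 0.3)
print(root_two)
```

Interval arithmetic and symbolic manipulation need third-party packages and are left out of this sketch.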

Slide 12

Slide 12 text

Scientific Python: a Rich Ecosystem IPython NetworkX

Slide 13

Slide 13 text

EEG analysis for epilepsy; High quality plotting: matplotlib; JPL
Outline
1 Me and computers?
2 Scientific Computing
3 Two examples
4 IPython: Interactive Python
5 Trouble in paradise

Slide 14

Slide 14 text

Data analysis for epilepsy surgery
Isolating the origin of drug-resistant epileptic seizures that require surgery.
John Hunter, Department of Pediatric Neurology, University of Chicago.

Slide 15

Slide 15 text

Electrode location in 3D, combined with MRI data

Slide 16

Slide 16 text

Correlation analysis of seizure data

Slide 17

Slide 17 text

Matplotlib: 2d plotting
Matplotlib 1.2.0 is official, with Python 3 support!!!

Slide 18

Slide 18 text

Matplotlib: 3d plotting

Slide 19

Slide 19 text

JPL: Mars mission trajectory design and nav data
Ted Drain and Lynn Craig, Jet Propulsion Laboratory (NASA/Caltech)

From: Name Elided
Date: Oct 2, 2007 7:15 PM
Subject: Fwd: matplotlib bug numbers
To: John Hunter

One of my lead developers mentioned that they had sent a bug to you about the annotations feature of MatPlotLib. Would you be able to let me know what the timeline is to resolve that bug? The reason is that the feature is needed for the Phoenix project and their arrival at Mars will be in March sometime, but they are doing their testing in the coming few months. This annotation feature is used on reports that present the analysis of the trajectory to the navigation team and it shows up on our schedule. It would really help me to know approximately when it could be resolved.

B-plane plots are used to show the trajectory of a spacecraft with respect to the target body (specifically perpendicular to the incoming asymptote of the spacecraft trajectory) and we plot them with the y-axis inverted. The plot is used heavily in flight operations so it is important to our customers. In addition, we have what is called a thundering herd plot where many different trajectory solutions (determined from different measurement sources) are plotted together. The annotations are important there so we can see which plot corresponds to each source of data.

I hope it helps to know how your code will be used in spacecraft navigation. Thanks for all your efforts.

Slide 20

Slide 20 text

JPL: Mars mission data visualization
Expected communication power levels between an orbiting spacecraft and a lander as it goes through the atmosphere:

Slide 21

Slide 21 text

August 23, 2011
The astronomy ‘event of a generation’
Josh Bloom, UC Berkeley Astronomy
@profjsb

Slide 22

Slide 22 text

(Image panels: Monday, Tuesday, Wednesday)
Supernova PTF11kly: Event of a Generation found on Tuesday
Nearest Type Ia supernova in > 25 years
Soon visible with binoculars
http://bit.ly/ptf11kly

Slide 23

Slide 23 text

Outline
1 Me and computers?
2 Scientific Computing
3 Two examples
4 IPython: Interactive Python
5 Trouble in paradise

Slide 24

Slide 24 text

Why IPython? (something other than “I’d rather not finish my dissertation”)

Slide 26

Slide 26 text

The Lifecycle of a Scientific Idea (schematically)
1 Individual exploratory work
2 Collaborative development
3 Production work (HPC, cloud, parallel)
4 Publication (with reproducible results!)
5 Education
6 Goto 1.
The Problem with most tools
Barriers and discontinuities in workflow in between all the steps

Slide 28

Slide 28 text

IPython’s goal: Fluid transitions in all these steps

Slide 29

Slide 29 text

Demo

Slide 30

Slide 30 text

Pillar #1: An architecture for interactive computing

Slide 31

Slide 31 text

Pillar #2: the Notebook Format
JSON but version control-friendly
Easy for machine processing, fixable by hand if need be.
Lots of hooks for metadata
Not Python-specific (R and Ruby notebooks exist, Julia planned)
Produce Markdown, reST, LaTeX, HTML, etc...
An open format for sharing, publishing and archiving executable computational work
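The properties listed above can be sketched with the standard `json` module. The document layout and field names below are hypothetical and simplified; they are not the actual IPython notebook schema. The functions `dump_notebook` and `to_markdown` are illustrative helpers, not IPython APIs.

```python
import json

# Hypothetical, simplified notebook document. Field names are illustrative
# and do NOT follow the real IPython notebook schema.
notebook = {
    "metadata": {"title": "demo", "language": "python"},
    "cells": [
        {"type": "markdown", "source": ["# A tiny notebook\n", "Some prose.\n"]},
        {"type": "code", "source": ["x = 2 + 2\n", "print(x)\n"], "metadata": {}},
    ],
}

def dump_notebook(nb):
    """Serialize with stable key order and one item per line,
    so that version control diffs stay small and readable."""
    return json.dumps(nb, indent=1, sort_keys=True)

def to_markdown(nb):
    """Convert to Markdown: prose is kept as-is and code cells
    become indented code blocks."""
    parts = []
    for cell in nb["cells"]:
        if cell["type"] == "code":
            parts.append("".join("    " + line for line in cell["source"]))
        else:
            parts.append("".join(cell["source"]))
    return "\n".join(parts)

text = dump_notebook(notebook)
assert json.loads(text) == notebook  # round-trips without loss
print(to_markdown(notebook))
```

Storing each cell's source as a list of lines is one way to keep line-based diffs meaningful; whether a given format does this is a design choice, stated here as an assumption.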

Slide 32

Slide 32 text

Documented protocols and formats: a growing ecosystem around IPython

Slide 33

Slide 33 text

Microsoft Visual Studio 2010 integrated console Dino Viehland and Shahrokh Mortazavi (Microsoft); http://pytools.codeplex.com

Slide 34

Slide 34 text

A vim client to control an IPython kernel/console Paul Ivanov (Berkeley), https://github.com/ivanov/vim-ipython

Slide 35

Slide 35 text

An Emacs Notebook Client! Takafumi Arakaki: http://tkf.github.com/emacs-ipython-notebook.

Slide 36

Slide 36 text

Notebooks on Windows Azure Cloud Shahrokh Mortazavi (Microsoft), B.G., F.P.: http://bit.ly/JQeojD.

Slide 37

Slide 37 text

Star Cluster: IPython parallel+Notebook on Amazon EC2 Justin Riley (MIT): http://web.mit.edu/star/cluster

Slide 38

Slide 38 text

One-click single notebook on Amazon EC2 Carl Smith (UK): https://notebookcloud.appspot.com.

Slide 39

Slide 39 text

Other projects using IPython

Scientific
EPD: Enthought Python Distribution.
Sage: open source mathematics.
PyRAF: Space Telescope Science Institute
CASA: Nat. Radio Astronomy Observatory
Ganga: CERN
PyMAD: neutron spectrom., Laue Langevin
Sardana: European Synchrotron Radiation
ASCEND: eng. modeling (Carnegie Mellon).
JModelica: dynamical systems.
DASH: Denver Aerosol Sources and Health.
Trilinos: Sandia National Lab.
DoD: baseline configuration.
Mayavi: 3d visualization, Enthought.
NiPype: computational pipelines, MIT.
PyIMSL Studio, by Visual Numerics.
...

Web/Other
Visual Studio 2010: MS.
Django.
Turbo Gears.
Pylons web framework
Zope and Plone CMS.
Axon Shell, BBC Kamaelia.
Schevo database.
Pitz: distributed task/bug tracking.
iVR (interactive Virtual Reality).
Movable Python (portable Python environment).
...

Slide 40

Slide 40 text

(Incomplete) Cast of Characters
Brian Granger - Physics, Cal State San Luis Obispo
Min Ragan-Kelley - Nuclear Engineering, UC Berkeley
Matthias Bussonnier - Physics, Institut Curie, Paris
Jonathan March - Enthought
Thomas Kluyver - Biology, U. Sheffield
Jörgen Stenarson - Elect. Engineering, Sweden.
Paul Ivanov - Neuroscience, UC Berkeley.
Robert Kern - Enthought
Evan Patterson - Physics, Caltech/Enthought
Brad Froehle - Mathematics, UC Berkeley
Stefan van der Walt - UC Berkeley
John Hunter - TradeLink Securities, Chicago.
Prabhu Ramachandran - Aerospace Engineering, IIT Bombay.
Satra Ghosh - MIT Neuroscience
Gaël Varoquaux - Neurospin (Orsay, France)
Ville Vainio - CS, Tampere University of Technology, Finland
Barry Wark - Neuroscience, U. Washington.
Ondrej Certik - Physics, U Nevada Reno
Darren Dale - Cornell
Justin Riley - MIT
Mark Voorhies - UC San Francisco
Nicholas Rougier - INRIA Nancy Grand Est
Thomas Spura - Fedora project
Many more! (~150 commit authors)

Slide 41

Slide 41 text

Support
Thank you!
Enthought, Austin, TX: Lots!
Microsoft: WinHPC support, Visual Studio integration, Azure (thanks to Shahrokh Mortazavi).
DoD/DRC Inc: funding through Sept. 2012 (thanks to Jose Unpingco and Chris Kees).
NIH: via NiPy grant
NSF: via Sage compmath grant
Google: Summer of Code 2005, 2010.
Tech-X Corp., Boulder, CO: Parallel/notebook (previous versions)

Slide 42

Slide 42 text

Outline
1 Me and computers?
2 Scientific Computing
3 Two examples
4 IPython: Interactive Python
5 Trouble in paradise

Slide 43

Slide 43 text

Too few are lifting too many
[Figure: Normalized commit rates since Jan-2010. x-axis: Individual Committer (1 to 10); y-axis: Commit rate (0.0 to 1.0); projects: cython, ipython, matplotlib, mayavi, numpy, scipy, sympy]
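The slide does not spell out how the commit rates were normalized. A minimal sketch of one plausible reading, assuming each committer's count is divided by the count of the project's most active committer; the function name and the commit log are made up for illustration. In practice the author list might come from something like `git log --format=%an`.

```python
from collections import Counter

def normalized_commit_rates(authors, top=10):
    """Return (author, rate) pairs for the `top` most active authors,
    where rate is the commit count divided by the largest count.
    This normalization is an assumption, not taken from the talk."""
    counts = Counter(authors).most_common(top)
    if not counts:
        return []
    peak = counts[0][1]
    return [(name, n / peak) for name, n in counts]

# Hypothetical commit log: one author name per commit (made-up data).
log = ["ana"] * 50 + ["bo"] * 10 + ["cy"] * 5
rates = normalized_commit_rates(log)
for name, rate in rates:
    print(f"{name}: {rate:.2f}")
```

A curve that drops quickly after the first one or two committers is the "too few are lifting too many" pattern the slide title points at.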

Slide 44

Slide 44 text

Challenges, weaknesses?
Speed
MATTERS! See JuliaLang.org
numpy.f2py: Fortran
Cython
Numba: an LLVM JIT for typed code/arrays. Very exciting development.
Packaging
The never-ending nightmare.

Slide 46

Slide 46 text

Distutils, you say? Image credit: http://goodsky.homestead.com/files/Labyrinth800X600.jpg

Slide 47

Slide 47 text

OK, we’re brave, we go in... Image credit: http://xkcd.com/246

Slide 48

Slide 48 text

setuptools & friends
It’s called easy_install, so it should be easy...
Image credit: http://pixdaus.com/single.php?id=47779

Slide 49

Slide 49 text

The language lured me in... But I stayed for the community!
Real friendships and incredible people
A culture of generous and mutual cross-project collaboration
But we have a ton of work to do!
The tools we need must be built by scientists.
Lots of space for truly innovative thinking, and Python is an expressive tool for the exercise.

Slide 50

Slide 50 text

John D. Hunter, 1968-2012: matplotlib.org
Memorial fund: numfocus.org/johnhunter