interactive computing
Fernando Pérez
http://fperez.org, @fperez_org
Henry H. Wheeler Jr. Brain Imaging Center, UC Berkeley
PyData 2013, Silicon Valley
March 20, 2013
integers
Rationals
Interval arithmetic
Symbolic manipulation
FORTRAN
Extended precision floating point
Text processing
Databases
Graphical user interfaces
Web interfaces
Hardware control
Multi-language integration
Data formats: HDF5, XML, ...
Problem’s definition evolves as we understand it.
No ‘requirements’ to build an application against.
Mathematica, Maple, Matlab, IDL, etc.
All have an interactive environment.
Applications Languages
FP (UC Berkeley) IPython 3/20/13 6 / 34
Idea (schematically)
1. Individual exploratory work
2. Collaborative development
3. Parallel production runs (HPC, cloud, ...)
4. Publication (with reproducible results!)
5. Education
6. Goto 1.

The Problem with most tools
Barriers and discontinuities in workflow in between all the steps
JSON but version control-friendly
Easy for machine processing, fixable by hand if need be.
Lots of hooks for metadata
Not Python-specific (Ruby, JS notebooks exist; R, Julia planned)
Produce Markdown, reST, LaTeX, HTML, etc.
An open format for sharing, publishing and archiving executable computational work
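The format properties listed above (plain JSON, diff-friendly, metadata hooks, convertible to Markdown) can be illustrated with a minimal sketch. The structure below is a simplified assumption for illustration only: the field names (`cell_type`, `source`, `language`, `outputs`) and the helper functions are hypothetical and do not reproduce the actual notebook schema.

```python
import json


def code_cell(source, language="python"):
    # Hypothetical, simplified cell layout; not the real notebook schema.
    return {"cell_type": "code", "language": language,
            "source": source, "outputs": []}


def markdown_cell(source):
    return {"cell_type": "markdown", "source": source}


def make_notebook(cells, metadata=None):
    # "metadata" is a free-form dict: a hook for arbitrary annotations.
    return {"metadata": metadata or {}, "cells": list(cells)}


def dumps_notebook(nb):
    # Sorted keys and one-space indentation give stable, line-oriented
    # output, which keeps version-control diffs small and readable.
    return json.dumps(nb, indent=1, sort_keys=True)


def code_sources(nb):
    # Machine processing is plain dict/list traversal.
    return [c["source"] for c in nb["cells"] if c["cell_type"] == "code"]


def to_markdown(nb):
    # Markdown cells pass through; code cells become indented code blocks.
    parts = []
    for cell in nb["cells"]:
        if cell["cell_type"] == "markdown":
            parts.append(cell["source"])
        else:
            lines = cell["source"].splitlines()
            parts.append("\n".join("    " + line for line in lines))
    return "\n\n".join(parts) + "\n"


if __name__ == "__main__":
    nb = make_notebook(
        [markdown_cell("# Title"), code_cell("print(1 + 1)")],
        metadata={"author": "example"},
    )
    print(dumps_notebook(nb))
    print(to_markdown(nb))
```

Because the document is ordinary JSON, any language with a JSON parser can read or repair it, which is the sense in which the format is not Python-specific.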
October 2001: “just a little afternoon hack”
My own $PYTHONSTARTUP: ipython-0.0.1.py: 259 lines.
In [N]: prompts and _N results cache.
IPP (Interactive Python Prompt) by Janko Hauser (Oceanography)
LazyPython by Nathan Gray (CS Caltech)
2002: Ignore John Hunter’s Gnuplot support patches ... let there be matplotlib (actually finish my PhD!)
2005: Brian Granger, Min Ragan-Kelley
First parallel tools, Twisted-based
2005-2008: Ville Vainio, Gaël Varoquaux, Laurent Dufréchou
Core maintenance, Wx integration.
shell using ØMQ
2-day sprint with Brian
Enthought funds Qt console.
Min ports parallel code to ØMQ
Core architecture ready, foundation for Notebook
Fall 2010: James Gao at Berkeley builds (5th!) Notebook prototype.
Summer 2011: Brian rebuilds James’ prototype into today’s Notebook.
San Luis Obispo
Min Ragan-Kelley - Nuclear Engineering, UC Berkeley
Matthias Bussonnier - Physics, Institut Curie, Paris
Brad Froehle - Mathematics, UC Berkeley
Paul Ivanov - Neuroscience, UC Berkeley
Robert Kern - Enthought
Thomas Kluyver - Biology, U. Sheffield
Jonathan March - Enthought
Evan Patterson - Physics, Caltech/Enthought
Jörgen Stenarson - Elect. Engineering, Sweden
Stefan van der Walt - UC Berkeley
John Hunter - TradeLink Securities, Chicago
Prabhu Ramachandran - Aerospace Engineering, IIT Bombay
Satra Ghosh - MIT Neuroscience
Gaël Varoquaux - Neurospin (Orsay, France)
Ville Vainio - CS, Tampere University of Technology, Finland
Barry Wark - Neuroscience, U. Washington
Ondrej Certik - Physics, U Nevada Reno
Darren Dale - Cornell
Justin Riley - MIT
Mark Voorhies - UC San Francisco
Nicholas Rougier - INRIA Nancy Grand Est
Thomas Spura - Fedora project
Many more! (~220 commit authors)
Lots!
Microsoft: WinHPC support, Visual Studio integration, Azure (thanks to Shahrokh Mortazavi).
DoD/DRC Inc: funding through Sept. 2012 (thanks to Jose Unpingco and Chris Kees).
NIH: via NiPy grant
NSF: via Sage compmath grant
Google: Summer of Code 2005, 2010.
Tech-X Corp., Boulder, CO: Parallel/notebook (previous versions)
Recent stable funding (2 years, 7 people, J. Taylor):
Departmental boundaries: interdisciplinary work is a great buzzword, not such a great career path.
Computational heritage is built on code, not on citations.
Continuous evolution vs publication milestones.
Authorship in collaborative works vs the first-author paper.
Scholarship and intellectual effort embedded in the code.
open source scientific computing ecosystem
Support the development of multiple projects.
Community-created and driven.
A neutral ground for industry, academia and government to support scientific open source.
501(c)(3) - donations are tax-exempt in the USA
http://numfocus.org
1.0
Notebook document management (nbconvert)
JavaScript internals cleanup
Fall 2013
Interactive JavaScript API
With callbacks to remote kernels.
2014
Multiuser server
Simple to deploy
Trusted (shell OK) Unix users in a lab, group, class, etc.
https://github.com/ipython/ipython/wiki/Roadmap:-IPython
right abstractions
The kernel: unify interactive and parallel computing → you only have one brain!
A single protocol: many kernels, many clients.
Communications and logging: the protocol is the notebook file format.
Insight and communication (Hamming)
“Literate computing” vs “literate programming”.
Build a community and an ecosystem
“How to Scale a Code in the Human Dimension”, M. Turk, http://arxiv.org/abs/1301.7064.
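The "single protocol: many kernels, many clients" idea can be sketched with a toy, in-process model. Everything here is an illustrative assumption: `ToyKernel` and `make_message` are hypothetical names, the message layout (header, parent header, content) is a simplification, and a real kernel runs in a separate process and talks over ØMQ rather than through direct method calls.

```python
import json
import uuid


def make_message(msg_type, content, session, parent=None):
    # Hypothetical, simplified message: a header identifying the sender,
    # the header of the request being answered (if any), and a payload.
    return {
        "header": {
            "msg_id": uuid.uuid4().hex,
            "session": session,
            "msg_type": msg_type,
        },
        "parent_header": parent["header"] if parent else {},
        "content": content,
    }


class ToyKernel:
    """Toy stand-in for a kernel: one namespace shared by all clients."""

    def __init__(self):
        self.namespace = {}
        self.execution_count = 0
        self.log = []  # every message in and out, as plain JSON-able dicts

    def handle(self, request):
        self.log.append(request)
        self.execution_count += 1
        content = {"execution_count": self.execution_count}
        try:
            # Toy execution only; never run untrusted code this way.
            exec(request["content"]["code"], self.namespace)
            content["status"] = "ok"
        except Exception as exc:
            content["status"] = "error"
            content["ename"] = type(exc).__name__
            content["evalue"] = str(exc)
        reply = make_message(
            "execute_reply", content,
            session=request["header"]["session"], parent=request,
        )
        self.log.append(reply)
        return reply


if __name__ == "__main__":
    kernel = ToyKernel()
    # Two different clients talk to the same kernel with the same messages.
    kernel.handle(make_message("execute_request", {"code": "x = 21 * 2"}, "client-a"))
    kernel.handle(make_message("execute_request", {"code": "y = x + 1"}, "client-b"))
    # The message log is already structured data that can be saved as JSON.
    print(json.dumps(kernel.log, indent=1))
```

Because every client speaks the same message structure, a terminal, a Qt console and a web page could in principle all drive one kernel, and the logged messages are already close to a storable document.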