Computing & Data: From academia to open source

Fernando Perez
February 26, 2013

Slides for a lightning talk for the UC Berkeley workshop "Supporting Data Science - A Campus-Wide Workshop": http://vcresearch.berkeley.edu/datascience/workshop

Transcript

  1. Computing & Data
    From academia to open source
    Fernando Pérez
    http://fperez.org, @fperez_org
    [email protected]
    Henry H. Wheeler Jr. Brain Imaging Center, UC Berkeley
    Supporting Data Science
    Feb 23, 2013

  2. Computing and data
    Now part of the DNA of science
    Much more than “the third/fourth branch” of science
    Computing and data are everybody’s problem...
    Therefore they are nobody’s problem

  3. An educational problem:
    the computer as a research tool
    All scientists need to own their computational processes.
    This means literacy in statistics, linear algebra, algorithms, ...
    But also in ‘software carpentry’ skills:
    version control, software design, testing, documentation, ...
    NOT yet another department on campus (ask Dave Culler)...

  4. Open Source:
    skills, tools and practices we need!
    The culture where these things get done.
    Wildly collaborative
    Reproducible by necessity
    Version control, testing, documentation, public peer review, etc.

  5. Reward Structure in academia:
    we punish all of the above
    Departmental boundaries: interdisciplinary work is a great buzzword,
    not such a great career path.
    Computational heritage is built on code, not on citations of prior
    literature.
    Continuous evolution vs publication milestones
    Authorship in collaborative works vs the first-author paper.
    Scholarship and intellectual effort embedded in the code.
