Slide 1

Slide 1 text

Computing & Data
From academia to open source
Fernando Pérez
http://fperez.org, @fperez_org
Fernando.Perez@berkeley.edu
Henry H. Wheeler Jr. Brain Imaging Center, UC Berkeley
Supporting Data Science, Feb 23, 2013

Slide 2

Slide 2 text

Computing and data
Now part of the DNA of science
Much more than “the third/fourth branch” of science
Computing and data are everybody’s problem...
Therefore they are nobody’s problem

Slide 3

Slide 3 text

An educational problem: the computer as a research tool
All scientists need to own their computational processes.
This means literacy in statistics, linear algebra, algorithms, ...
But also in “software carpentry” skills: version control, software design, testing, documentation, ...
NOT yet another department on campus (ask Dave Culler)...

Slide 4

Slide 4 text

Open Source: skills, tools and practices we need!
The culture where these things get done.
Wildly collaborative
Reproducible by necessity
Version control, testing, documentation, public peer review, etc.

Slide 5

Slide 5 text

Reward Structure in academia: we punish all of the above
Departmental boundaries: interdisciplinary work is a great buzzword, not such a great career path.
Computational heritage is built on code, not on citations of prior literature.
Continuous evolution vs publication milestones
Authorship in collaborative works vs the first-author paper.
Scholarship and intellectual effort embedded in the code.