Slide 1

Slide 1 text

Promoting Open Science in the University
Jake VanderPlas (@jakevdp)
OSBD, Dec 5, 2016

Slide 2

Slide 2 text

Barriers to Open Science in the University
- Incentives: Little incentive for academics to devote time to openness.
- Career paths: Relevant skills are more highly valued outside academia.
- Education: Undergraduate and graduate curricula lag in data science.
- Interdisciplinarity: Siloization of disciplines leads to missed opportunities.

Slide 3

Slide 3 text


Slide 4

Slide 4 text

Original Core Faculty Team
Areas: Data Science Methodology; Biological Sciences; Environmental Sciences; Social Sciences; Physical Sciences
- Cecilia Aragon: Human Centered Design & Engr.
- Magda Balazinska: CSE
- Emily Fox: Statistics
- Carlos Guestrin: CSE
- Bill Howe: CSE
- Jeff Heer: CSE
- Ed Lazowska: CSE
- David Beck: Chem. Engr.
- Tom Daniel: Biology
- Bill Noble: Genome Sciences
- Josh Blumenstock: iSchool
- Mark Ellis: Geography
- Tyler McCormick: Sociology, CSSS
- Ginger Armbrust: Oceanography
- Randy LeVeque: Applied Math
- Thom Richardson: Statistics, CSSS
- Werner Stuetzle: Statistics
- Andy Connolly: Astronomy
- John Vidale: Earth & Space Sciences

Slide 5

Slide 5 text

2014 Kickoff Event: 137 posters from 30 UW departments!

Slide 6

Slide 6 text


Slide 7

Slide 7 text


Slide 8

Slide 8 text


Slide 9

Slide 9 text


Slide 10

Slide 10 text

eScience Major Support:
Gordon and Betty Moore Foundation & Alfred P. Sloan Foundation
- $38 million over 5 years, split between UW, NYU, and UC Berkeley
Washington Research Foundation
- $9.3 million over 5 years for faculty & postdocs
- $7.1 million to the closely-aligned Institute for Neuroengineering
University of Washington
- $550,000/year for staff support
- $600,000/year for faculty support
National Science Foundation
- $2.8 million over 5 years for graduate program development and Ph.D. student funding (IGERT)

Slide 11

Slide 11 text

Incentives, Career Paths, Education, Interdisciplinarity
Addressing the Problems
This level of support has given eScience the opportunity to explore many aspects of these challenges...
Six interrelated “Working Groups”
- Career Paths and Alternative Metrics
- Education and Training
- Software Tools, Environments, and Support
- Reproducibility and Open Science
- Data Science Ethnography
- Working Spaces and Culture

Slide 12

Slide 12 text

Incentivizing Open Research Practices
Rewarding open/reproducible research with “Open Science Badges”
One of the ideas being explored by our Reproducibility working group
https://osf.io/tvyxz/wiki/home/

Slide 13

Slide 13 text

Incentivizing Open Research Practices
Journal of Open Source Software (JOSS): http://joss.theoj.org/
Code as a first-class research product (on par with traditional publications)
- Short papers, with review focused on code
- Submitted & reviewed in the open on GitHub
- Makes code citable and indexable by traditional tracking services

Slide 14

Slide 14 text

New Academic Career Paths
Groups: Data Scientists (full support); Research Scientists (partial support); Research Faculty; Research IT; Ethnography of Data Science
- Jake VanderPlas: Director of Research, Physical Sciences; PhD Astronomy
- Ariel Rokem: Data Scientist; PhD Neuroscience
- Valentina Staneva: Data Scientist; PhD Applied Math
- Bernease Herman: Data Scientist; BS Stats; formerly Amazon & Morgan Stanley
- Bryna Hazelton: Research Scientist; PhD Astrophysics
- Andrew Gartland: Research Scientist; PhD Biostatistics
- Vaughn Iverson: Research Scientist; PhD Oceanography
- Anthony Arendt: Research Scientist; PhD Geophysics
- Joe Hellerstein: Sr. Data Science Fellow; PhD Computer Science; formerly Microsoft Research, Google, IBM Watson
- Dave Beck: Director of Research, Life Sciences; PhD Medicinal Chemistry
- Rob Fatland: Director, Cloud & Data Solutions; PhD Geophysics; formerly NASA & Microsoft Research
- Britta Fiore-Gartland: Director of Ethnography; PhD Communication

Slide 15

Slide 15 text

New Academic Career Paths
Post-doctoral Fellowships
- 2-year/3-year postdoctoral fellowships across a range of departments.
- Joint mentorship: domain + methodology.
- Focus on high-impact researchers who will push boundaries in both areas.

Slide 16

Slide 16 text

New Academic Career Paths
Interdisciplinary Faculty
Example: Mario Juric, Astronomy
- Data Management Lead for LSST
- Professor of Astronomy, UW
- Sr. Fellow at UW eScience
Working on scalable software infrastructure for the LSST project, especially regarding the formation, structure, and evolution of the Milky Way.
Faculty position half-funded through eScience, half through Astronomy (one of six such appointments across campus).

Slide 17

Slide 17 text

New Academic Programs
- Undergraduate Level: “Transcriptable option” for students across departments.
- Master's Level: Stand-alone evening master's program aimed at working professionals.
- PhD Level: “Data Science specialty” for graduate students across disciplines.

Slide 18

Slide 18 text

New Interdisciplinary Courses
http://uwseds.github.io/

Slide 19

Slide 19 text

IGERT/eScience Graduate Fellows
eScience Graduate Fellows, first cohort:
- Cecilia Noecker: Genome Sc. & ML
- Matt Murbach: ChemE & ML
- Ryan Maas: CS & Astro
- Alex Tank: Stats & Allen Inst. for Brain Science
- Grace Telford: Astro & Stats
- Will Gagne-Maynard: Oceanography & MSR

Slide 20

Slide 20 text

Extracurricular Training
- Short trainings, workshops, and bootcamps
- Annual “Hack Weeks” (e.g. AstroHackWeek, GeoHackWeek, NeuroHackWeek)
- Informal seminar series (e.g. Python in Geosciences)
- International coding sprints (e.g. Python in Astronomy)

Slide 21

Slide 21 text

Data Science Studio

Slide 22

Slide 22 text

Restoring the “Water Cooler”
Full 6th floor

Slide 23

Slide 23 text

Office Hours

Slide 24

Slide 24 text

Research Incubator
Quarter-long, in-Studio projects, engagement two days per week
- Each team: 1 project lead + 1 eScience Data Scientist
- 4-6 concurrent teams: Network effects among cohort beyond 1:1 interactions

Slide 25

Slide 25 text

Research Incubator
- Developing a Workflow for Managing Large Hydrologic Spatial Datasets to Assist Water Resources Management and Research
  Project Lead: Nicoleta Cristea, Civil and Environmental Engineering
  eScience Liaisons: Anthony Arendt, Rob Fatland
- Methods for Characterizing Human Centromeres
  Project Lead: Siva Kasinathan, UW School of Medicine
  eScience Liaisons: Andrew Fiore-Gartland, Bryna Hazelton
- Target Detection for Advanced Environmental Monitoring of Marine Renewable Energy
  Project Lead: Emma Cotter, Mechanical Engineering
  eScience Liaison: Bernease Herman
- Improved Stimulation Protocols for Sight Restoration Technologies
  Project Leads: Ione Fine, Geoffrey M. Boynton, UW Psychology
  eScience Liaison: Ariel Rokem
- AralDIF: A Cloud-based Dynamic Information Framework for the Aral Sea Basin
  Project Lead: Amanda Tan, Department of Oceanography
  eScience Liaisons: Rob Fatland, Anthony Arendt

Slide 26

Slide 26 text

Data Science for Social Good
Four teams supported each summer
Teams include:
- Project Leads (1-2 from each org.)
- DSSG Student Fellows (4 per team)
- Data Science Leads (1-2 per team)
- Stakeholders

Slide 27

Slide 27 text

Data Science for Social Good
2015
- Open Sidewalk Graph for Accessible Trip Planning
- Assessing Community Well-being through Open Data and Social Media
- Predictors of Permanent Housing for Homeless Families
- Rerouting Solutions and Expensive Ride Analysis for King County Paratransit
2016
- Mining Online Data for Early Identification of Unsafe Food Products
- Use of ORCA data for improved transit system planning and operation
- Global Open Sidewalks: Creating a shared open data layer and an OpenStreetMap data standard for sidewalks
- CrowdSensing Census: A heterogeneous-based tool for estimating poverty

Slide 28

Slide 28 text

Data Science for Social Good

Slide 29

Slide 29 text


Slide 30

Slide 30 text

Thank You!
Email: [email protected]
Twitter: @jakevdp
GitHub: jakevdp
Web: http://vanderplas.com/
Blog: http://jakevdp.github.io