Slide 1

Slide 1 text

Fostering a community of scientific R developers @_inundata karthik.io

Slide 2

Slide 2 text

bids.berkeley.edu

Slide 3

Slide 3 text

Scott Chamberlain Carl Boettiger

Slide 4

Slide 4 text

P values are just the tip of the iceberg Peng & Leek, 28 Apr 15

Slide 5

Slide 5 text

50+ contributors • ~65 software tools
Substantial contributor to CRAN

Slide 6

Slide 6 text

#1 Completing the data pipelines

Slide 7

Slide 7 text

Data retrieval (from APIs, data storage services, journals, and other remote servers).
Data visualization (interactive graphics in R that extend beyond base and ggplot2).
Data deposition into research repositories, including metadata generation.
Data munging, with limited scope.
Reproducibility (any tools that facilitate reproducible research, such as interfacing with git, tracking provenance, or similar).

Slide 8

Slide 8 text

No content

Slide 9

Slide 9 text

Raw data DataONE CKAN Open Data Internet archive NOAA World Bank

Slide 10

Slide 10 text

Text corpus: Public Library of Science (PLOS), BioMed Central, eLife, Springer, IEEE, arXiv preprints, bioRxiv preprints, Elsevier (just kidding)
“The new science journalism and open science” http://blog.revolutionanalytics.com

Slide 11

Slide 11 text

Data viz
ee_observations(genus = "lynx") %>% ee_maps

Slide 12

Slide 12 text

DATA SHOULD BE machine readable

Slide 13

Slide 13 text

AUTOMATE BORING TASKS paint drying adding metadata

Slide 14

Slide 14 text

Data publication: figshare, dvn, zenodo, dat* + EML, git2r, pipelines from services*
* Both in development

Slide 15

Slide 15 text

No content

Slide 16

Slide 16 text

#2 Building a user and developer community

Slide 17

Slide 17 text

Building capacity
Teach basic lab skills for scientific computing so that researchers can do more in less time and with less pain.
Teach basic concepts, skills and tools for working more effectively with data. Workshops are designed for people with little to no prior computational experience.

Slide 18

Slide 18 text

many transition from users to developers

Slide 19

Slide 19 text

Domain experts are often bad programmers

Slide 20

Slide 20 text

Only 19% of packages on CRAN have unit tests.
46% of those rely on non-standard tests (i.e., a /tests/ directory but neither testthat nor RUnit).
Source: Oliver Keyes

Slide 21

Slide 21 text

/ropensci/onboarding

Slide 22

Slide 22 text

No content

Slide 23

Slide 23 text

No content

Slide 24

Slide 24 text

No content

Slide 25

Slide 25 text

Empowering the next generation of R developers

Slide 26

Slide 26 text

No content

Slide 27

Slide 27 text

“Absolutely stellar gender diversity”
“I think this is the most gender-balanced tech conference I’ve been to. Great diversity in career and academia too.”

Slide 28

Slide 28 text

Cultivating contributors

Slide 29

Slide 29 text

Cultivating contributions

Slide 30

Slide 30 text

No content

Slide 31

Slide 31 text

No content

Slide 32

Slide 32 text

THE NARRATIVE THE DATA THE CODE A RESEARCH PAPER

Slide 33

Slide 33 text

A RESEARCH PAPER

Slide 34

Slide 34 text

https:// /rOpenSci @rOpenSci