Slide 1

Slide 1 text

The workflowr R package: a framework for reproducible and collaborative data science
John Blischak (@jdblischak)
2018-07-11
useR! 2018, Brisbane, Australia
github.com/jdblischak/workflowr

Slide 2

Slide 2 text

My computational challenges
Organizing files
Tracking intermediate results
Sharing results
John Blischak - github.com/jdblischak/workflowr

Slide 3

Slide 3 text


Slide 4

Slide 4 text

Literate programming
Source code: file.Rmd
Results: file.html
yihui.name/knitr
rmarkdown.rstudio.com

Slide 5

Slide 5 text

R Markdown websites
rmarkdown.rstudio.com

Slide 6

Slide 6 text

Version control
version: 2rko6xn, message: Start new…
version: d1zyskv, message: Update parameters…
version: z6o3b97, message: Label axes…
git-scm.com
github.com/ropensci/git2r

Slide 7

Slide 7 text

Version control terminology
repository – the tracked files and their revision history
commit – a snapshot of the current state of the files

Slide 8

Slide 8 text

Web hosting
GitHub Pages – hosts one website per code repository
pages.github.com

Slide 9

Slide 9 text

workflowr
Organized
Reproducible
Shareable
Version-controlled websites

Slide 10

Slide 10 text

Organized

Slide 11

Slide 11 text

Start a new project
> wflow_start("myproject")
1. Creates directory with template files
2. Changes working directory
3. Initiates Git repository and commits files
Also available as RStudio Project Template
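The three steps on this slide can be tried in a throwaway location. The following is a minimal sketch, assuming workflowr is installed. The temporary path, the choice of `change_wd = FALSE`, and the Git name and email are illustrative assumptions and not part of the talk.

```r
library(workflowr)

# Hypothetical location: a unique path under the session's temporary
# directory, so no existing files are touched.
project_dir <- tempfile("myproject-")

# Creates the directory with template files, initialises a Git repository,
# and commits the files. change_wd = FALSE leaves the current working
# directory unchanged, so step 2 on the slide is skipped in this sketch.
wflow_start(
  project_dir,
  change_wd  = FALSE,
  user.name  = "Example User",      # placeholder Git identity
  user.email = "user@example.com"   # placeholder Git identity
)

# Inspect the template files and directories that were created
list.files(project_dir, all.files = TRUE, no.. = TRUE)
```

In everyday use the call on the slide, `wflow_start("myproject")`, is enough; the extra arguments here only keep the sketch from changing the session or relying on a global Git configuration.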

Slide 12

Slide 12 text

Organized directory structure
R Markdown files
HTML files
Website options

Slide 13

Slide 13 text

Reproducible

Slide 14

Slide 14 text

Run code in clean environment
> wflow_build(c("f1.Rmd", "f2.Rmd"))
f1.Rmd
f2.Rmd
github.com/r-lib/callr
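The slide credits callr for the clean environment. The sketch below shows the general idea with callr directly: code run in a fresh R process cannot see objects left over in the current session. It is an illustration of the concept, not a description of workflowr's internals, and the variable names are hypothetical.

```r
library(callr)

# An object that exists only in the current session
leftover <- 42

# callr::r() runs the function in a separate, fresh R process.
# The function checks whether `leftover` is visible there.
seen_in_child <- r(function() exists("leftover"))

# The same check in the current session, for comparison
seen_in_parent <- exists("leftover")

print(c(child = seen_in_child, parent = seen_in_parent))
```

Building each file in its own process means an analysis cannot silently depend on objects or packages loaded while working interactively.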

Slide 15

Slide 15 text

Tracking intermediate results
> wflow_publish("analysis/file.Rmd")
Performs 3 steps:
1. Commits analysis/file.Rmd
2. Builds analysis/file.Rmd
3. Commits docs/file.html and figure files

Slide 16

Slide 16 text

Combining rmarkdown and Git
[Figure: source code and results, labeled with versions 1ong9jt, ln412fy, wr1q7bk, 3tg6lse]

Slide 17

Slide 17 text

View past results

Slide 18

Slide 18 text

Other reproducibility features
output: workflowr::wflow_html
Records the session information at the end
Sets a seed prior to running code
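The effect of these two features can be illustrated in base R. This is a sketch of the underlying idea only, not of how `wflow_html` implements it, and the seed value is an arbitrary assumption.

```r
# Setting the same seed before each run makes random draws repeatable.
# The value 12345 is arbitrary and chosen only for this illustration.
set.seed(12345)
first_run <- rnorm(3)

set.seed(12345)
second_run <- rnorm(3)

# Both runs produce exactly the same numbers
print(identical(first_run, second_run))

# Session information records the R version, platform, and loaded
# packages, so a reader can see what the results were produced with.
print(sessionInfo())
```

Without a fixed seed, any analysis step that uses random numbers would give different output on each build, which makes changes between versions hard to interpret.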

Slide 19

Slide 19 text

Reproducibility report

Slide 20

Slide 20 text

Shareable

Slide 21

Slide 21 text

Distribute results for sharing
Create new GitHub repository
> wflow_git_push()
© 2018 GitHub Inc.
pages.github.com

Slide 22

Slide 22 text

Installation
1. Install R
◦ (Recommended) Install RStudio
◦ (Optional) Install pandoc
◦ (Optional) Install Git
2. Install workflowr from CRAN
◦ install.packages("workflowr")
3. Create an account on GitHub
Documentation: https://jdblischak.github.io/workflowr

Slide 23

Slide 23 text

In summary, using workflowr…
Enables you to start working reproducibly immediately
Allows you to focus on your analysis
Shares your results online

Slide 24

Slide 24 text

Acknowledgements
Co-authors: Peter Carbonetto, Matthew Stephens
Early adopters for testing and feedback
Authors and contributors to knitr, rmarkdown, git2r, callr

Slide 25

Slide 25 text

workflowr
Organized
Reproducible
Shareable
Version-controlled websites