projthis-intro

Ian Lyttle
March 18, 2021

Transcript

  1. projthis: A framework for analysis-based workflows

    Ian Lyttle
    https://ijlyttle.github.io/projthis/

    devtools::install_github("ijlyttle/projthis")

    https://speakerdeck.com/ijlyttle/projthis-intro
    2021-03-18: Iowa State University Graphics Group
  2. Analysis Development

    • Distinct from package development.
    • I learned this term from Hilary Parker; she has a pre-print and an rstudio::conf(2017) presentation.
    • Jenny Bryan advocates for more-humane organization of R workflows. She literally wrote the book (Happy Git with R).
    • Sharla Gelfand has discussed her implementation in a blog post and an rstudio::conf(2020) presentation.
    • See README for more info, links.
  3. Getting started

    # https://github.com/ijlyttle/covidStates

    # create a project
    projthis::proj_create("path/to/covidStates")

    # RStudio IDE launches new window with project

    # Add a few nice words to DESCRIPTION

    # add license (pick one you like)
    usethis::use_mit_license()

    # establish git repository
    usethis::use_git()

    Do something with this file.

    covidStates/
    ├── covidIowa.Rproj
    ├── .Rbuildignore
    ├── .gitignore
    ├── DESCRIPTION
    ├── LICENSE
    └── LICENSE.md
  4. Getting started

    # RStudio IDE restarts

    library("projthis")

    # put repository on GitHub
    usethis::use_github()

    # create README file, add a few nice words
    usethis::use_readme_md()

    covidStates/
    ├── covidIowa.Rproj
    ├── .Rbuildignore
    ├── .gitignore
    ├── DESCRIPTION
    ├── LICENSE
    ├── LICENSE.md
    ├── README.Rmd
    └── README.md
  5. Three main ideas

    projthis has functions to support:

    • Workflow dependencies: sequence of RMarkdown files
    • Package dependencies: use DESCRIPTION (like packages)
    • Automation: GitHub Actions template to run your workflow(s)

    The first two ideas are independent of each other. The automation idea builds on the first two.
  6. Workflow dependencies

    • A workflow is a sequence of RMarkdown files, and a README
    • Data are kept in a data/ subdirectory (automatically created)
    • Each RMarkdown file in sequence has a subdirectory in data/
    • There are rules for reading-writing files from-to data/
        • Can read only from earlier data directories, source
        • Can write only to its own data directory, target
        • Only "exceptions": 00-import, 99-publish

    covidStates/
    ├── workflow/
    │   ├── data/
    │   │   ├── 00-import/
    │   │   ├── 01-county/
    │   │   └── ...
    │   ├── 00-import.Rmd
    │   ├── 00-import.md
    │   ├── 01-clean.Rmd
    │   ├── 01-clean.md
    │   ├── ...
    │   ├── README.Rmd
    │   └── README.md
    └── ...
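The read and write rules above can be sketched in a few lines of base R. This is only an illustration of the idea, not projthis's implementation: `readable_sources()` and `path_source_checked()` are hypothetical helpers, the step name `02-model` is made up, and the sketch assumes file names sort in execution order. It does not model the 00-import and 99-publish "exceptions".

```r
# Hypothetical helper: which data directories may a workflow file read from?
# Assumes names sort in execution order, e.g. "00-import" before "01-clean".
readable_sources <- function(name, all_names) {
  all_names <- sort(all_names)
  all_names[all_names < name]
}

# Hypothetical helper: build a path under data/, refusing any source
# directory that does not come earlier in the sequence.
path_source_checked <- function(name, all_names, source, ...) {
  if (!source %in% readable_sources(name, all_names)) {
    stop("'", name, "' may not read from '", source, "'")
  }
  file.path("data", source, ...)
}

# "02-model" is an invented step name, used only for this example
steps <- c("00-import", "01-clean", "02-model")

readable_sources("01-clean", steps)
path_source_checked("01-clean", steps, "00-import", "sample.csv")
```

Asking `01-clean` to read from `02-model` raises an error, which is the point of the rule: data flows only forward through the sequence.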
  7. Workflow dependencies

    # https://github.com/ijlyttle/covidStates/tree/main/workflow

    # creates workflow directory, also data/ and README.Rmd
    proj_use_workflow("workflow", git_ignore_data = FALSE)

    # add some nice words to README.Rmd

    # with README.Rmd open in the IDE
    proj_workflow_use_rmd("00-import")

    # with any .Rmd *from the workflow* open in the IDE
    proj_workflow_use_rmd("01-clean")

    # draw the rest of the owl

    covidStates/
    ├── workflow/
    │   ├── data/
    │   │   ├── 00-import/
    │   │   ├── 01-county/
    │   │   └── ...
    │   ├── 00-import.Rmd
    │   ├── 00-import.md
    │   ├── 01-clean.Rmd
    │   ├── 01-clean.md
    │   ├── ...
    │   ├── README.Rmd
    │   └── README.md
    └── ...
  8. Workflow dependencies

    # part of the Rmd template, example

    # create or *empty* the target directory
    # used to write this file's data
    projthis::proj_create_dir_target(params$name, clean = TRUE)

    # create function to get path to target directory
    # *note*: params$name is defined in template
    path_target <- projthis::proj_path_target(params$name)

    # create function to get path to source directories
    path_source <- projthis::proj_path_source(params$name)

    # use these functions to access paths to your data
    path_target("sample.csv")
    path_source("00-import", "sample.csv")

    covidStates/
    ├── workflow/
    │   ├── data/
    │   │   ├── 00-import/
    │   │   ├── 01-county/
    │   │   └── ...
    │   ├── 00-import.Rmd
    │   ├── 00-import.md
    │   ├── 01-clean.Rmd
    │   ├── 01-clean.md
    │   ├── ...
    │   ├── README.Rmd
    │   └── README.md
    └── ...
  9. Workflow dependencies

    • To render an entire workflow:

    proj_workflow_render("workflow")

    • Renders each RMarkdown file in order, README last
    • As if you had hit the "Knit" button
    • For a more-comprehensive approach: targets
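The ordering described above ("in order, README last") might look like the following base-R sketch. `workflow_order()` is a hypothetical helper, and alphabetical sorting of the file names is an assumption here rather than a statement of how `proj_workflow_render()` works internally.

```r
# Hypothetical helper: pick out the .Rmd files, sort them by name,
# then move README.Rmd to the end of the sequence.
workflow_order <- function(files) {
  rmd <- sort(files[grepl("\\.Rmd$", files)])
  is_readme <- basename(rmd) == "README.Rmd"
  c(rmd[!is_readme], rmd[is_readme])
}

files <- c("README.Rmd", "01-clean.Rmd", "00-import.Rmd", "00-import.md")
workflow_order(files)

# Rendering by hand would then be a loop over that order, for example:
# for (f in workflow_order(list.files("workflow", full.names = TRUE))) {
#   rmarkdown::render(f)
# }
```

The numeric prefixes do the real work: naming files 00-, 01-, and so on is what makes a plain sort match the intended execution order.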
  10. Package dependencies

    • We use a DESCRIPTION file, just like a package.
    • We enter a "contract": the workflows will work with the latest versions of all the package dependencies.
    • This is the same contract used at CRAN.
    • To specify non-CRAN packages, use Remotes: field.
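To make the Remotes: field concrete, here is a small sketch that writes and reads back a DESCRIPTION file with base R. The title, version, and the `dplyr` and `readr` entries are made up for illustration, and listing dependencies under Imports is an assumption; only `ijlyttle/projthis` (the GitHub repository named on the title slide) comes from the talk.

```r
# Illustrative DESCRIPTION contents; every value except the
# Remotes entry is hypothetical.
desc <- data.frame(
  Package = "covidStates",
  Title = "COVID Analysis by State",
  Version = "0.0.0.9000",
  Imports = "dplyr, readr, projthis",
  Remotes = "ijlyttle/projthis",
  stringsAsFactors = FALSE
)

# Write to a temporary file so nothing in a real project is touched
path <- file.path(tempdir(), "DESCRIPTION")
write.dcf(desc, file = path)

# Read back the two dependency-related fields
read.dcf(path, fields = c("Imports", "Remotes"))
```

A package that is not on CRAN still appears in the dependency list; the Remotes: entry only tells installers where to find it.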
  11. Package dependencies

    • Two functions:

    # scans files for dependencies, updates DESCRIPTION
    proj_update_deps()

    # installs packages named in DESCRIPTION
    proj_install_deps()

    • For a more-comprehensive approach: renv
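To give a feel for what "scans files for dependencies" involves, here is a deliberately naive base-R sketch. `find_deps()` is a hypothetical helper that pattern-matches `library()` calls and `pkg::` prefixes; it is not how `proj_update_deps()` is implemented, and a real scanner (renv::dependencies(), for instance) handles many more cases.

```r
# Hypothetical helper: find package names used via library(pkg),
# library("pkg"), or pkg::fun in a character vector of source lines.
find_deps <- function(lines) {
  # matches such as: library("projthis   or   library(dplyr
  lib <- regmatches(lines, gregexpr("library\\([\"']?[A-Za-z0-9.]+", lines))
  lib <- sub("library\\([\"']?", "", unlist(lib))

  # package names immediately followed by ::
  ns <- regmatches(lines, gregexpr("[A-Za-z0-9.]+(?=::)", lines, perl = TRUE))

  sort(unique(c(lib, unlist(ns))))
}

code <- c(
  'library("projthis")',
  "library(dplyr)",
  "usethis::use_git()"
)
find_deps(code)
```

In practice you would feed it the lines of each workflow file, for example `find_deps(readLines("workflow/00-import.Rmd"))`, and compare the result with what DESCRIPTION already lists.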
  12. Automation

    • proj_workflow_use_action()
    • Actions are sequences of instructions.
    • You will have to customize (you can add a schedule!)
    • Best reference for R users: r-lib/actions
    • Not yet bullet-proof, sometimes Actions just "pukes".
    • Here's an example.

    covidStates/
    ├── .github/
    │   ├── workflows/
    │   │   └── proj….yaml
    │   └── .gitignore
    └── ...
  13. More-comprehensive tools

    projthis is meant to be a starter kit - maybe that's all you'll need.

    For these problems, there are more-comprehensive tools out there:

    • Package dependencies: renv
    • Workflow dependencies: targets
    • Automation: you, yourself, once you get your legs under you

    I'd like to submit to CRAN soon; the more feedback, the better!