this term from Hilary Parker; she has a pre-print and an rstudio::conf(2017) presentation.
• Jenny Bryan advocates for more-humane organization of R workflows. She literally wrote the book (Happy Git with R).
• Sharla Gelfand has discussed her implementation in a blog post and an rstudio::conf(2020) presentation.
• See README for more info, links.
LICENSE └── LICENSE.md

Getting started

# https://github.com/ijlyttle/covidStates

# create a project
projthis::proj_create("path/to/covidStates")
# RStudio IDE launches new window with project

# Add a few nice words to DESCRIPTION

# add license (pick one you like)
usethis::use_mit_license()

# establish git repository
usethis::use_git()

Do something with this file.
dependencies: sequence of RMarkdown files
• Package dependencies: use DESCRIPTION (like packages)
• Automation: GitHub Actions template to run your workflow(s)

The first two ideas are independent of each other. The automation idea builds on the first two.
files, and a README
• Data are kept in a data/ subdirectory (automatically created)
• Each RMarkdown file in sequence has subdirectory in data/
• There are rules for reading-writing files from-to data/
  • Can read only from earlier data directories, source
  • Can write only to its own data directory, target
  • Only "exceptions": 00-import, 99-publish

covidStates/
├── workflow/
│   ├── data/
│   │   ├── 00-import/
│   │   ├── 01-county/
│   │   └── ...
│   ├── 00-import.Rmd
│   ├── 00-import.md
│   ├── 01-clean.Rmd
│   ├── 01-clean.md
│   ├── ...
│   ├── README.Rmd
│   └── README.md
└── ...
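The read-write rules above can be sketched in base R. This is only an illustration of the ordering rule, not how projthis implements it: `allowed_sources()` is a hypothetical helper, and the step names are assumed for the example.

```r
# Hypothetical helper, not part of projthis: given the names of the
# workflow's RMarkdown files and the name of the current file, return the
# data directories that the current file may read from.
allowed_sources <- function(steps, current) {
  steps <- sort(steps)            # the numeric prefix defines the sequence
  idx <- match(current, steps)
  if (is.na(idx)) {
    stop("`current` must be one of `steps`: ", current)
  }
  steps[seq_len(idx - 1)]         # everything strictly earlier in the sequence
}

# assumed step names, for illustration only
steps <- c("00-import", "01-clean", "99-publish")

allowed_sources(steps, "01-clean")    # "00-import"
allowed_sources(steps, "00-import")   # character(0)
```

The first file has no earlier data directory to read from, which is why 00-import is listed as an "exception": it has to bring data in from outside the workflow.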
data/ and README.Rmd
proj_use_workflow("workflow", git_ignore_data = FALSE)

# add some nice words to README.Rmd

# with README.Rmd open in the IDE
proj_workflow_use_rmd("00-import")

# with any .Rmd *from the workflow* open in the IDE
proj_workflow_use_rmd("01-clean")

# draw the rest of the owl

covidStates/
├── workflow/
│   ├── data/
│   │   ├── 00-import/
│   │   ├── 01-county/
│   │   └── ...
│   ├── 00-import.Rmd
│   ├── 00-import.md
│   ├── 01-clean.Rmd
│   ├── 01-clean.md
│   ├── ...
│   ├── README.Rmd
│   └── README.md
└── ...
# create or *empty* the target directory
# used to write this file's data
projthis::proj_create_dir_target(params$name, clean = TRUE)

# create function to get path to target directory
# *note*: params$name is defined in template
path_target <- projthis::proj_path_target(params$name)

# create function to get path to source directories
path_source <- projthis::proj_path_source(params$name)

# use these functions to access paths to your data
path_target("sample.csv")
path_source("00-import", "sample.csv")

covidStates/
├── workflow/
│   ├── data/
│   │   ├── 00-import/
│   │   ├── 01-county/
│   │   └── ...
│   ├── 00-import.Rmd
│   ├── 00-import.md
│   ├── 01-clean.Rmd
│   ├── 01-clean.md
│   ├── ...
│   ├── README.Rmd
│   └── README.md
└── ...
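To make the "function that returns a path" idea concrete, here is a minimal base-R sketch of how such path helpers could be built as closures. `make_path_target()` and `make_path_source()` are hypothetical stand-ins, and the `root = "data"` default is an assumption; the real `projthis::proj_path_target()` and `projthis::proj_path_source()` may behave differently.

```r
# Hypothetical stand-in for projthis::proj_path_target(): returns a function
# that builds paths inside the current file's own data directory.
make_path_target <- function(name, root = "data") {
  function(...) file.path(root, name, ...)
}

# Hypothetical stand-in for projthis::proj_path_source(): returns a function
# that builds paths inside an earlier data directory, and refuses any
# directory whose name does not sort strictly before the current file's name.
make_path_source <- function(name, root = "data") {
  function(source, ...) {
    if (!(source < name)) {
      stop("can read only from earlier data directories: ", source)
    }
    file.path(root, source, ...)
  }
}

path_target <- make_path_target("01-clean")
path_source <- make_path_source("01-clean")

path_target("sample.csv")               # "data/01-clean/sample.csv"
path_source("00-import", "sample.csv")  # "data/00-import/sample.csv"
```

Capturing the file's name once in a closure means the rest of the RMarkdown file never has to spell out directory names for its own output.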
a package.
• We enter a "contract": the workflows will work with the latest versions of all the package dependencies.
• This is the same contract used at CRAN.
• To specify non-CRAN packages, use the Remotes: field.
dependencies, updates DESCRIPTION
proj_update_deps()

# installs packages named in DESCRIPTION
proj_install_deps()

• For a more-comprehensive approach: renv
You will have to customize (you can add a schedule!)
• Best reference for R users: r-lib/actions
• Not yet bullet-proof, sometimes Actions just "pukes".
• Here's an example.

covidStates/
├── .github/
│   ├── workflows/
│   │   └── proj….yaml
│   └── .gitignore
└── ...
- maybe that's all you'll need.

For these problems, there are more-comprehensive tools out there:
• Package dependencies: renv
• Workflow dependencies: targets
• Automation: you, yourself, once you get your legs under you

I'd like to submit to CRAN soon; the more feedback, the better!