Slide 1

Slide 1 text

Get your Research Project/Article Organized, Shareable and Reproducible using R and friends
Emerson M. Del Ponte
Open Plant Pathology
Universidade Federal de Viçosa
4-h Workshop

Slide 2

Slide 2 text

Why Reproducible Research (RR)?
- Data + Protocols + Code
- Knowledge
- Reproducible
- Documenting and Sharing
- Efficiency: short-term effect
- Accessibility: short- and long-term effect
- Transparency
- Information
- Open
- Reproducibility
- Research Practices
- Replicability

Slide 3

Slide 3 text

Why change/learn new things?
- Sponsors/journals require data (standard in molecular)
- Work more efficiently and facilitate collaborations
- Improved reproducibility (data and/or methods)
- Technology (less cumbersome) is becoming available
- Enhanced visibility and transparency
- Multiple citable outcomes: data, code, manuscript, etc.

Slide 4

Slide 4 text

When RR?
Idea → Register (proposal) → Run experiments → Get data → Analyze → Communicate

Slide 5

Slide 5 text

When RR? Idea, Register
Preregistration of studies
- University/Institution
- Laboratory computer

Slide 6

Slide 6 text

When RR? Idea, Register
Preregistration of studies
- University/Institution
- Laboratory computer
- https://osf.io/6tsnj

Slide 7

Slide 7 text

Run experiments When RR?

Slide 8

Slide 8 text

Run experiments When RR?

Slide 9

Slide 9 text

When RR? Get data
Submit datasets accompanying data descriptors to:
- Discipline-specific repositories
- Generalist repositories

Slide 10

Slide 10 text

Analyze Research Compendium When RR?

Slide 11

Slide 11 text

When RR? Communicate
- Abstract
- Preprint
- OA paper
- Paywalled paper
- Quick Files
- Talk
- Poster

Slide 12

Slide 12 text

How RR? Science Collect Analyse Publish Write Summarise Reproduce Re-analyse (meta-analysis) Share data Open Repository Share code open/free tools Collaborative tools Citation manager Pre-prints Open Access

Slide 13

Slide 13 text

How are we plant pathologists doing?
Figure: Count (values shown: 101 and 99). Source: Sparks et al. (unpublished)

Slide 14

Slide 14 text

Are data made available?
Figure categories: No, Upon request, Paywalled, Free access. Source: Sparks et al. (unpublished)

Slide 15

Slide 15 text

Is code made available?
Figure categories: No, Free access. Source: Sparks et al. (unpublished)

Slide 16

Slide 16 text

Is software properly cited?
Figure categories: No, Name only, Version #, Full citation. Source: Sparks et al. (unpublished)
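For R users, the "Version #" and "Full citation" categories are straightforward to reach, because R can report both. The lines below are a minimal sketch using base R only; the package name "stats" is just a stand-in for whatever package an analysis actually relies on.

```r
# Full citation for R itself, suitable for a reference list
r_cite <- citation()
print(r_cite, style = "text")

# Exact R version used for the analysis
r_version <- R.version.string

# Version of a package used in the analysis
# ("stats" ships with R and is only a placeholder here)
pkg_version <- as.character(packageVersion("stats"))

# BibTeX entry that can be pasted into a .bib file
bib_entry <- toBibtex(r_cite)
print(bib_entry)
```

Calling citation("packagename") gives the same kind of entry for an individual package.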

Slide 17

Slide 17 text

What software is being used?
Source: Sparks et al. (unpublished)

Slide 18

Slide 18 text

Why is RR adoption so slow?
- Lack of interest/knowledge (supplemental rarely posted)
- Low incentive/pressure - that may change!
- Perception that it takes time and effort
  - Documenting data and code
  - Versioning and maintaining code
- FOBS - Fear of being scooped?
- Not valued/taught in our graduate programs

Slide 19

Slide 19 text

OK, I want to do it differently, but how?
- Tools
- Workflows
- Environments
- Collaborative & sharing platforms
Research Project: Organized, Documented, Shared, Accessible, Reproducible

Slide 20

Slide 20 text

Save money on software! Use R and friends
- Data wrangling - Excel
- Data visualization - Excel
- Data analysis - SAS, STATA
- Scientific plots - SigmaPlot
- Text editor - MSWord
BIB

Slide 21

Slide 21 text

Start small! Then build on it...
Reproducibility level:
0. Article
1. Article (+ preprint) and Supplemental (zip): Protocols, Data
2. Article + preprint and Repository (citable): Protocols, Data, Code
3. Article + preprint and Research compendium: Raw Data, Clean Data, Analysis (reproducible)

Slide 22

Slide 22 text

A new research/submission workflow? Project Data Analysis Manuscript Preprint Journal Submission system Early view Final publication Poster/Talk Research Compendium

Slide 23

Slide 23 text

How RR?
http://inundata.org/talks/rstd19/
Source: https://research-compendium.science/

Slide 24

Slide 24 text

How RR?
http://inundata.org/talks/rstd19/

Slide 25

Slide 25 text

How RR?
https://www.tandfonline.com/doi/full/10.1080/00031305.2017.1375986
https://peerj.com/preprints/3192/

Slide 26

Slide 26 text

R package structure as inspiration Small Medium Large
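A small compendium can be as simple as a handful of well-named folders. The sketch below creates one such skeleton with base R; the compendium name and the folder names are illustrative assumptions, not a layout prescribed by the workshop, and the files are written to a temporary directory so nothing in your own projects is touched.

```r
# Hypothetical compendium name and layout; adapt both to your project
root <- file.path(tempdir(), "my-compendium")
dirs <- c("data/raw", "data/clean", "analysis", "manuscript")

# Create each folder (and its parents) if it does not exist yet
for (d in dirs) {
  dir.create(file.path(root, d), recursive = TRUE, showWarnings = FALSE)
}

# A README is the entry point for anyone trying to reproduce the work
writeLines(
  c("# my-compendium", "", "What the project is and how to reproduce it."),
  file.path(root, "README.md")
)

# Show the resulting structure
print(list.files(root, recursive = TRUE, include.dirs = TRUE))
```

Keeping raw and clean data in separate folders makes it obvious which files are inputs and which are produced by the analysis.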

Slide 27

Slide 27 text

Structures/templates for RC (not a package)
- Minimal RC
- Short RC webpage
- Full RC Website + manuscript
BIB, CSL

Slide 28

Slide 28 text

Let's work?
5 Workout Sessions (45 min each)
1) Introduction
2) RStudio project + GitHub
3) The research compendium
4) Manuscript in RMarkdown
5) Packages for automating tasks & RStudio Cloud

Slide 29

Slide 29 text

Everything starts as a PROJECT!

Slide 30

Slide 30 text

RStudio and its friend Git
Let's practice
1. Download and install Git (GitHub Desktop)
2. Download and install RStudio
3. Go to GitHub and create a new repository

Slide 31

Slide 31 text

RStudio and its friend Git
Your turn
1. Create a new RStudio Project from Git
2. Add your repository URL

Slide 32

Slide 32 text

No content

Slide 33

Slide 33 text

1. Clone a repository (open in Desktop)
2. https://github.com/emdelponte/paper-FHB-yield-loss
3. Change files
4. Create a pull request to the owner

Slide 34

Slide 34 text

Let's explore a research compendium
Short RC webpage
1. Fork the repository (short webpage compendium)
   a. https://github.com/emdelponte/RC-example-webpage
2. Explore the files
3. Reproduce the analysis
4. Knit to generate the webpage
5. Commit and push to your GH

Slide 35

Slide 35 text

GitHub and Open Science Framework
1. Link GitHub and Open Science Framework
2. Create an OSF Project

Slide 36

Slide 36 text

GitHub and Open Science Framework
1. Link GitHub and Open Science Framework
2. Name your project
3. Enable GitHub Add-on

Slide 37

Slide 37 text

GitHub and Open Science Framework
1. Link GitHub and Open Science Framework
2. Name your project
3. Select GitHub Add-on
4. Configure Add-on

Slide 38

Slide 38 text

GitHub and Open Science Framework
1. Finish description

Slide 39

Slide 39 text

Let's practice: Data management
1. Create/Open an RMarkdown file
   a. Modify the output parameters
2. Load data
   a. .csv file
   b. .xlsx file
   c. .gsheet file
3. Add some basic commenting
4. Do some basic wrangling
5. Export data to .csv
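Steps 2 to 5 of this exercise might look like the sketch below when written in base R. The data are a made-up toy table written to a temporary file so the example runs on its own; the file names and column names are assumptions for illustration only.

```r
# Toy data written to a temporary file so the example is self-contained;
# in practice, point to a file in your project's data folder
raw_path <- file.path(tempdir(), "severity_raw.csv")
writeLines(
  c("plot,treatment,severity",
    "1,control,35",
    "2,control,40",
    "3,fungicide,12",
    "4,fungicide,NA"),
  raw_path
)

# Step 2a: load a .csv file
# (.xlsx files and Google Sheets need add-on packages,
#  e.g. readxl::read_excel() and googlesheets4::read_sheet())
raw <- read.csv(raw_path, stringsAsFactors = FALSE)

# Step 4: basic wrangling - drop rows with missing severity,
# then compute the mean severity per treatment
clean <- raw[!is.na(raw$severity), ]
means <- aggregate(severity ~ treatment, data = clean, FUN = mean)
print(means)

# Step 5: export the cleaned data to .csv
clean_path <- file.path(tempdir(), "severity_clean.csv")
write.csv(clean, clean_path, row.names = FALSE)
```

Placed inside an RMarkdown chunk, the comments double as the basic documentation asked for in step 3.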

Slide 40

Slide 40 text

How RR for data management
Get data
Data file types: Binary, Text files, Web-based

Slide 41

Slide 41 text

Data management: must read!

Slide 42

Slide 42 text

Data management
Organizing, naming, shaping!
Broman and Woo (2018)
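As one illustration of "shaping", the sketch below turns a wide table (one column per assessment day) into a long one (one row per plot and day) using base R's reshape(). The table and its column names are invented for the example and are not taken from Broman and Woo (2018).

```r
# Made-up wide table: one severity column per assessment day
wide <- data.frame(
  plot   = 1:3,
  day_7  = c(5, 8, 2),
  day_14 = c(12, 20, 6)
)

# Reshape to long format: one row per plot and assessment day
long <- reshape(
  wide,
  direction = "long",
  varying   = c("day_7", "day_14"),  # columns to stack
  v.names   = "severity",            # name of the stacked value column
  timevar   = "day",                 # name of the new time column
  times     = c(7, 14),              # values for the time column
  idvar     = "plot"                 # column identifying each unit
)
print(long)
```

The long layout keeps one variable per column, which is the form most R modelling and plotting functions expect.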

Slide 43

Slide 43 text

Analyze

Slide 44

Slide 44 text

Analyze

Slide 45

Slide 45 text

Fork or download a Repository
1. Fork the RC as webpage template
   a. https://github.com/emdelponte/RC-example-webpage
2. Download the files from GitHub
3. Start with a new RStudio Project + Git
4. Reproduce the analysis
5. Change some content/parameters
6. Push it to your GH account

Slide 46

Slide 46 text

Add local folder and create a GitHub Repository

Slide 47

Slide 47 text

Add local folder and create a GitHub Repository

Slide 48

Slide 48 text

Let's practice
1. Fork the RC as website template
   a. https://github.com/emdelponte/RC-example-website
2. Change and/or generate the website (knit)
3. Push changes to your GitHub
4. Create a GH webpage for it
5. Send it to OSF project

Slide 49

Slide 49 text

RMarkdown templates
1. RMarkdown html from the basic RStudio templates
2. Rmdformats: https://github.com/juba/rmdformats
3. Distill for RMarkdown: https://github.com/rstudio/distill
4. RMarkdown websites: https://rmarkdown.rstudio.com/lesson-13.html

Slide 50

Slide 50 text

Manuscript in RMarkdown?
YML header

---
title: "The title goes here"
author: "Author name goes here"
output:
  html_document: default
  word_document:
    reference_docx: template.docx
linestretch: 2
link-citations: yes
linkcolor: blue
csl: chicago-author-date.csl
bibliography: crossref.bib
---

BIB
CSL https://www.zotero.org/styles

Slide 51

Slide 51 text

Automating RC website creation
Workflowr package

Organized
● Provides a project template with organized subdirectories
● Mixes code and results with R Markdown
● Uses Git to version both source code and results

Reproducible
● Displays the code version used to create each result
● Runs each analysis in an isolated R session
  ○ Records the session information of each analysis
  ○ Sets the same seed for random number generation for each analysis

Shareable
● Creates a website to present your research results
● Documents how to host your website for free via GitHub Pages or GitLab Pages
● Creates links to past versions of results

https://jdblischak.github.io/workflowr
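Two of the points listed under Reproducible, a fixed seed and recorded session information, can be illustrated in plain R. This is only a sketch of the underlying idea using base R functions; it is not how workflowr implements these features, and the seed value is an arbitrary choice.

```r
# Fix the seed so the random draw is identical on every run
# (the seed value itself is arbitrary)
set.seed(12345)
run1 <- rnorm(5)

# Resetting the same seed reproduces exactly the same numbers
set.seed(12345)
run2 <- rnorm(5)
print(identical(run1, run2))

# Record the R version, platform and loaded packages with the results
info <- sessionInfo()
print(info)
```

Printing the session information at the end of every analysis document leaves a record of the software environment next to the results it produced.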

Slide 52

Slide 52 text

Workflowr pkg
https://jdblischak.github.io/workflowr/articles/wflow-01-getting-started.html

Slide 53

Slide 53 text

rrtools, an R package to create RC as package! https://github.com/benmarwick/rrtools

Slide 54

Slide 54 text

rrtools, pkg to facilitate creation of RC as a pkg!

Slide 55

Slide 55 text

rrtools, pkg to facilitate creation of RC as a pkg!
https://rstudio.cloud/project/424109

Slide 56

Slide 56 text

Let's make it reproducible for the future?

Slide 57

Slide 57 text

Let's make it reproducible for the future?

remotes::install_github("karthik/holepunch")
library(holepunch)

write_compendium_description(package = "RC template",
                             description = "A template for a research compendium")
write_dockerfile(maintainer = "Your name")
generate_badge() # This generates a badge for your readme.

# At this time push the code to GitHub
# And click on the badge or use the function below to get the build
# ready ahead of time.
build_binder()

Slide 58

Slide 58 text

RC as a website template
https://github.com/emdelponte/RC-template
Live examples:
- https://emdelponte.github.io/paper-FHB-Brazil-meta-analysis/
- https://emdelponte.github.io/paper-fungicides-whitemold/
- https://mladencucak.github.io/AnalysisPLBIreland/index.html

Slide 59

Slide 59 text

www.openplantpathology.org