Slide 1

Slide 1 text

Writing reports in RStudio: an introduction to RMarkdown
Michael Harper

Slide 2

Slide 2 text

Overview
• The benefits of RMarkdown and reproducible research
• The syntax for writing an RMarkdown document
• Some examples of using RMarkdown
• The "sotonthesis" R package: a template for Southampton University theses
• Resources for further reading to master RMarkdown

Slide 3

Slide 3 text

What is RMarkdown?
• A combination of R and Markdown (a simple markup language)
• Save and execute code within the report
• Generate high quality reports directly from the analysis

Slide 4

Slide 4 text

What is Reproducible Research?
Reproducible research is the idea that data analyses, and more generally, scientific claims, are published with their data and software code so that others may verify the findings and build upon them.

Slide 5

Slide 5 text

Non-Reproducible Workflow
[Diagram labels: Dataset1.csv, Dataset2.csv; Script.R, Excel, or other visual editor programs to run analysis; save figures; Figure 1, Results Table; manually add; Report]
Challenges
× Point and click
× Often rely on manual processes
× Datasets may be updated
× Analysis changes
× Figure layouts need updating
Managing analysis can become difficult!

Slide 6

Slide 6 text

Reproducible Workflow with RMarkdown
[Diagram labels: Dataset1.csv, Dataset2.csv; Compile; Report & Analysis]
Benefits
• Single file for code and analysis
• Latest figures and results included within report

Slide 7

Slide 7 text

Why Bother?
BENEFITS FOR YOU
• Ease of updating work
• Efficiency
• Flexibility
BENEFITS FOR OTHERS
• Reproducibility of analysis
• Transparency
• Trust in work
However, there are substantial technical and cultural limitations. See http://benmarwick.github.io/CSSS-Primer-Reproducible-Research/ for some more information.

Slide 8

Slide 8 text

RMarkdown Basics

Slide 9

Slide 9 text

RMarkdown Setup
• RStudio is recommended. Download it directly from the website, as the university version is out of date
• Builds upon a number of packages, including knitr and rmarkdown

    # Run this within R
    install.packages("rmarkdown")

• This should install all the required dependencies within R
• To make PDF reports you will need to have LaTeX installed

Slide 10

Slide 10 text

RMarkdown File Components
• YAML header surrounded by ---
• R code chunks surrounded by ```
• Text mixed with simple text formatting
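As a minimal sketch, the three components might fit together in a single .Rmd file as shown here. The title, author, heading and chunk label are placeholders for illustration, not taken from the slides.

````markdown
---
title: "Example report"
author: "A. N. Author"
output: html_document
---

## Stopping distances

The built-in `cars` dataset records **speed** and *stopping distance*.

```{r cars-plot}
# R code chunk: this code runs when the document is compiled
plot(cars)
```
````

The block between the `---` lines is the YAML header, the fenced `{r}` block is a code chunk, and everything else is ordinary Markdown text.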

Slide 11

Slide 11 text

YAML header
• Acts as the document template settings
• "output" determines which file type will be built from your .Rmd file
• Can customise the font, table of contents and page size within YAML: http://rmarkdown.rstudio.com/pdf_document_format.html

    ---
    title: "Untitled"
    author: "Anonymous"
    output: pdf_document
    ---
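A hedged sketch of what such customisation could look like for PDF output. The option names below are ones documented for pdf_document; the specific values are illustrative assumptions, not settings from the slides.

```yaml
---
title: "Untitled"
author: "Anonymous"
output:
  pdf_document:
    toc: true              # add a table of contents
    number_sections: true  # number the section headings
fontsize: 11pt             # base font size passed to the LaTeX template
geometry: margin=2.5cm     # page margins (LaTeX geometry package)
classoption: a4paper       # page size, passed as a document class option
---
```

Note that the options nested under pdf_document are indented, while fontsize, geometry and classoption sit at the top level of the header.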

Slide 12

Slide 12 text

Code Chunks
• When you render your .Rmd file, R Markdown will run each code chunk and embed the results beneath the code chunk in your final report

```{r}
# Insert any R code
plot(cars)
```

Slide 13

Slide 13 text

Code Chunk Example

Slide 14

Slide 14 text

Code Chunks (2)
• The output of code chunks can be controlled by settings
• Some common settings include:
  - include = FALSE prevents code and results from appearing in the finished file. R Markdown still runs the code in the chunk, and the results can be used by other chunks.
  - echo = FALSE prevents code, but not the results, from appearing in the finished file. This is a useful way to embed figures.
  - message = FALSE prevents messages that are generated by code from appearing in the finished file.
  - warning = FALSE prevents warnings that are generated by code from appearing in the finished file.
  - fig.cap = "..." adds a caption to graphical results.

```{r cars, fig.cap = "A scatter diagram of the distance required for a vehicle to stop"}
plot(cars)
```

• Full list of options available here: https://yihui.name/knitr/options/
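Rather than repeating the same settings on every chunk, defaults can be set once for the whole document. A minimal sketch, assuming the knitr package is installed; in practice this code would usually sit in a chunk at the top of the .Rmd file that itself uses include = FALSE.

```r
# Set default chunk options for every chunk that follows.
knitr::opts_chunk$set(
  echo = FALSE,     # hide the code, keep the results
  message = FALSE,  # hide messages
  warning = FALSE   # hide warnings
)

# Check the defaults that are now in force.
current <- knitr::opts_chunk$get(c("echo", "message", "warning"))
str(current)
```

Individual chunks can still override a default, for example by adding echo = TRUE to the header of a chunk whose code should be shown.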

Slide 15

Slide 15 text

Markdown
Markdown is designed to be easy to write and easy to read
Document elements:
• Headers
• Lists
• Links
• Images
• Block quotes
• LaTeX equations
• Horizontal rules
• Tables
• Footnotes
• Bibliographies and Citations
• Slide breaks
• Italicized text
• Bold text
• Superscripts
• Subscripts
• Strikethrough text
Read more: http://rmarkdown.rstudio.com/authoring_pandoc_markdown.html

Slide 16

Slide 16 text

Markdown Basics
https://www.rstudio.com/wp-content/uploads/2015/02/rmarkdown-cheatsheet.pdf

Slide 17

Slide 17 text

Markdown Basics

Slide 18

Slide 18 text

Bibliographies and Citations
• References are stored within a .bib file, which is called within the YAML header
• Citations go inside square brackets and are separated by semicolons. Each citation must have a key, composed of '@' + the citation identifier from the database, and may optionally have a prefix, a locator, and a suffix. For example:

    Blah blah [@doe99].

• More examples: http://rmarkdown.rstudio.com/authoring_bibliographies_and_citations.html
• Reference managers (e.g. Mendeley) can easily export .bib files (goo.gl/VXj8pA)
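A short sketch of how the pieces could be wired together. The file name references.bib and the key smith04 are hypothetical; doe99 is the key used on the slide, and both keys would need matching entries in the .bib file.

````markdown
---
title: "Untitled"
output: pdf_document
bibliography: references.bib   # hypothetical file name
---

A single citation: Blah blah [@doe99].

Two citations separated by a semicolon, each with a prefix and a locator:
Blah blah [see @doe99, pp. 33-35; also @smith04, ch. 1].
````

When the document is compiled, the citations are formatted and a reference list is appended to the end of the report.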

Slide 19

Slide 19 text

Markdown Example

MARKDOWN

    ## R Markdown

    This is an R Markdown document [@Reference2000]. **Bold text** is great, and *italics* are also useful. The following list shows:

    1. Item 1
    2. Item 2
    3. Item 3

LATEX

    \subsection{R Markdown}

    This is an R Markdown document \cite{Reference2000}. \textbf{Bold text} is great, and \emph{italics} are also useful. The following list shows:

    \begin{enumerate}
      \item Item 1
      \item Item 2
      \item Item 3
    \end{enumerate}

Slide 20

Slide 20 text

Building File
The output format is determined by the "output" option in the YAML header:
• html_document
• pdf_document (LaTeX)
• word_document
• beamer_presentation
• ioslides_presentation
Building the file runs the analysis and then builds the report.
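The same build can also be triggered from the R console with rmarkdown::render(). The sketch below is illustrative only: it writes a small placeholder .Rmd file to a temporary location so that it is self-contained, and it skips the build if pandoc is not available on the machine.

````r
# Write a small R Markdown file to a temporary location (placeholder content).
rmd_file <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  "title: \"Example report\"",
  "output: html_document",
  "---",
  "",
  "```{r}",
  "plot(cars)",
  "```"
), rmd_file)

# render() runs the code chunks and then builds the report.
# Passing output_format overrides the "output" entry in the YAML header.
if (rmarkdown::pandoc_available()) {
  out_file <- rmarkdown::render(rmd_file,
                                output_format = "html_document",
                                quiet = TRUE)
  print(out_file)  # path of the report that was built
}
````

Swapping "html_document" for "word_document" would build a Word file from the same source; "pdf_document" additionally requires a LaTeX installation.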

Slide 21

Slide 21 text

Building Example

Slide 22

Slide 22 text

RMarkdown Examples & Ideas

Slide 23

Slide 23 text

1. Download data from web (Intermediate)
• Call data directly from a web data source
• Graphs update every time the report is compiled
• Some examples:
  - Twitter data
  - Website traffic
  - Weather data
  - Latest satellite imagery
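A minimal sketch of the idea, not a recipe for any particular data source. The helper plot_latest() is hypothetical, and a temporary local CSV stands in for a real web address so that the example runs offline; read.csv() accepts a URL in the same position as a file path.

```r
# Hypothetical helper: read a CSV from a source and plot it.
# `source` can be a local path or a web address such as "https://...".
plot_latest <- function(source) {
  latest <- read.csv(source)
  plot(latest$speed, latest$dist, xlab = "Speed", ylab = "Distance")
  invisible(latest)
}

# Stand-in for a real web source: a local copy of the built-in cars data.
demo_source <- tempfile(fileext = ".csv")
write.csv(cars, demo_source, row.names = FALSE)

latest <- plot_latest(demo_source)
nrow(latest)
```

Placed inside a code chunk, the data would be read again each time the report is compiled, so the graph reflects whatever the source contains at that moment.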

Slide 24

Slide 24 text

1. Download data from web (Intermediate)
Example: Google search terms
More ideas: http://rstudio-pubs-static.s3.amazonaws.com/155168_d306bcd159da4ff5991c961025dbcb8e.html

Slide 25

Slide 25 text

2. Create Diagrams (Advanced)
• Link analysis with figures
• Uses the DiagrammeR package: http://rich-iannone.github.io/DiagrammeR/docs.html
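One way this could look, as a sketch that assumes the DiagrammeR package is installed. The node names and labels are made up for illustration; the point is that a value computed in the analysis is pasted into the diagram definition.

```r
# Assumes the DiagrammeR package is installed.
library(DiagrammeR)

# A value computed in the analysis...
n_obs <- nrow(cars)

# ...is pasted into the diagram definition (Graphviz DOT syntax),
# so the figure updates whenever the data changes.
diagram_spec <- paste0(
  "digraph workflow {\n",
  "  node [shape = box]\n",
  "  data [label = 'Dataset (", n_obs, " rows)']\n",
  "  analysis [label = 'Analysis']\n",
  "  report [label = 'Report']\n",
  "  data -> analysis -> report\n",
  "}"
)

workflow <- grViz(diagram_spec)
workflow  # printing the object displays the diagram
```

Because grViz() returns an HTML widget, the diagram embeds directly in HTML output; other output formats may need an extra conversion step.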

Slide 26

Slide 26 text

3. Customise Layouts (Advanced)
• Pandoc uses templates to create the output report
• These can be altered to create custom templates
• Variables within YAML will be substituted into the template at the $variable$ location
• https://github.com/svmiller/svm-r-markdown-templates/blob/master/svm-rmarkdown-article-example.pdf
• https://pandoc.org/MANUAL.html
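To illustrate the substitution, here is a bare-bones sketch of a custom LaTeX template. It assumes a hypothetical YAML variable named supervisor, and that the file is saved under a hypothetical name such as my-template.tex and selected with the template option of pdf_document. Real templates define many more macros (for lists, code highlighting and so on), so treat this as an outline rather than a working thesis template.

```latex
% my-template.tex (hypothetical file name)
\documentclass{article}

% $title$ is replaced by the "title" entry in the YAML header.
\title{$title$}

% The block between $if(...)$ and $endif$ is only included
% when the YAML header defines a "supervisor" entry (hypothetical variable).
$if(supervisor)$
\author{Supervised by $supervisor$}
$endif$

\begin{document}
\maketitle

% $body$ is replaced by the converted content of the .Rmd file.
$body$

\end{document}
```

With title: "Untitled" and supervisor: "Dr A. Supervisor" in the YAML header, pandoc would fill in both placeholders before LaTeX compiles the document.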

Slide 27

Slide 27 text

Further Gallery Ideas
Interactive Documents & Web Apps: http://rmarkdown.rstudio.com/gallery.html
Also check out these user projects: https://yihui.name/knitr/demo/showcase/

Slide 28

Slide 28 text

sotonthesis template

Slide 29

Slide 29 text

sotonthesis template
• Available on GitHub: www.github.com/mikey-harper/sotonthesis
• Template for RMarkdown
• Builds upon the package bookdown, an extension of RMarkdown designed for long-format reports or books
Benefits
  - Easy to install
  - Build your thesis and progress reports within RMarkdown
  - Advanced YAML customisation
  - Build reports into PDF, Word or HTML
  - No need to edit LaTeX template
  - Meets university thesis template guidelines [1]
[1] http://library.soton.ac.uk/thesis/templates

Slide 30

Slide 30 text

Installing Template

    install.packages("devtools")
    devtools::install_github("rstudio/bookdown")
    devtools::install_github("mikey-harper/sotonthesis")

Slide 31

Slide 31 text

sotonthesis
• Document can be split into multiple .Rmd files
• Reading the bookdown book is essential: https://bookdown.org/yihui/bookdown/
A good knowledge of RMarkdown is vital to use the template effectively and without frustration.

Slide 32

Slide 32 text

Further Resources

Slide 33

Slide 33 text

Online Reading
• http://rmarkdown.rstudio.com/index.html
  Essential reading: used for much of the content of this presentation

Slide 34

Slide 34 text

Cheatsheets
• RMarkdown Cheatsheet: a very useful reference sheet. Print this out and keep it above your desk when you start learning RMarkdown.
• RMarkdown Reference: similar to the cheatsheet, but provides more detail on customisation and document settings.

Slide 35

Slide 35 text

Books
• Available for free online: https://bookdown.org/yihui/bookdown/
• First three chapters here: https://github.com/yihui/knitr-book
• Google "Reproducible Research with R and RStudio"

Slide 36

Slide 36 text

Tips for Mastering RMarkdown
1. Read, Read, Read!
2. Learn the basics of RMarkdown before considering more advanced techniques and the sotonthesis template
3. An understanding of LaTeX is useful for developing reports

Slide 37

Slide 37 text

Summary
• RMarkdown takes time to master, but it is worth it
• Some advanced ideas were presented to highlight the power of RMarkdown

Slide 38

Slide 38 text

Thank You
Slides, contact information and further reading list available at mikeyharper.uk/RMarkdown