Slide 1

Slide 1 text

Jennifer Bryan
RStudio, University of British Columbia
@JennyBryan @jennybc
Happy Git and GitHub for the useR

Slide 2

Slide 2 text

happygitwithr.com

Slide 3

Slide 3 text

http://happygitwithr.com/workshops.html#pre-workshop-set-up

Slide 4

Slide 4 text

https://github.com/ Register a free account NOW! Or sign in.

Slide 5

Slide 5 text

Git for humans
Alice Bartlett, Senior Developer, Financial Times (@alicebartlett)
https://speakerdeck.com/alicebartlett/git-for-humans

Slide 6

Slide 6 text

No content

Slide 7

Slide 7 text

VCS = version control system
Original domain: software development
Git = very popular VCS
Git is being “repurposed” beyond software development

Slide 8

Slide 8 text

adapted from https://www.atlassian.com/git/tutorial/git-basics#!clone

Slide 9

Slide 9 text

rigid structure of Git manages project evolution
master copy in cloud enables sane collaboration

Slide 10

Slide 10 text

Indicate your OS via GitHub emoji reaction!
https://github.com/jennybc/happy-git-with-r/issues/55
Bonus content: learn how to search GitHub!

Slide 11

Slide 11 text

http://happygitwithr.com/installation-pain.html#success-and-operating-systems

Slide 12

Slide 12 text

Make your first pull request! https://github.com/rladies/praise

Slide 13

Slide 13 text

get off the beach! Matthew Hughes via Stephen Heard

Slide 14

Slide 14 text

What would Git adoption feel like?
Install Git. Configure it. Affirm RStudio can find it.
R project? Pre-existing or new.
Dedicate a directory to it.
Make that an RStudio Project.
Make that a Git repository.
Do your usual thing but … instead of just saving, you also make commits.
Push to GitHub periodically.
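The steps above can be sketched from the R console. This is only a minimal sketch, not the workshop's own script: it assumes Git is installed and on the PATH, works in a throwaway directory under `tempdir()`, and the project name `my-analysis`, the file `analysis.R`, and the identity passed with `-c` are placeholders. Pushing is left as a comment because it needs a GitHub remote you own.

```r
# Locate the Git executable; an empty string means Git was not found.
git <- unname(Sys.which("git"))

# Dedicate a directory to the project (a placeholder under tempdir()).
proj <- file.path(tempdir(), "my-analysis")
dir.create(proj, showWarnings = FALSE)

# Do your usual thing: save a file.
writeLines("x <- 3 * 4", file.path(proj, "analysis.R"))

# Helper that runs a Git command inside the project directory.
run_git <- function(...) {
  system2(git, c("-C", shQuote(proj), ...), stdout = TRUE, stderr = TRUE)
}

if (nzchar(git)) {
  run_git("init")                       # make the directory a Git repository
  run_git("add", "analysis.R")          # stage the file
  run_git(
    "-c", "user.name=Example",          # placeholder identity for this commit only
    "-c", "user.email=me@example.com",
    "commit", "-m", shQuote("Add first analysis script")
  )
  print(run_git("log", "--oneline"))    # view the history so far
  # run_git("push")                     # periodically, once a GitHub remote is set up
}
```

In day-to-day use the RStudio Git pane or a Git client would take the place of these calls; the sketch only makes the sequence of steps explicit.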

Slide 15

Slide 15 text

RStudio will offer a Git pane to help you make commits, view history and diffs, and push to / pull from GitHub.

Slide 16

Slide 16 text

You — and possibly other people! — could visit the project on GitHub. For browsing and much more.

Slide 17

Slide 17 text

No content

Slide 18

Slide 18 text

Git for humans
Alice Bartlett, Senior Developer, Financial Times (@alicebartlett)
https://speakerdeck.com/alicebartlett/git-for-humans
Excerpt on repos & commits

Slide 19

Slide 19 text

agony : flow

Slide 20

Slide 20 text

agony : flow

Slide 21

Slide 21 text

agony reduction

Slide 22

Slide 22 text

Use a Git client
RStudio might not be enough: there are some noticeable gaps
I ♥ SourceTree (free, Mac OS + Windows)
More recommendations here: http://happygitwithr.com/git-client.html

Slide 23

Slide 23 text

SourceTree, a free Git client for Windows and Mac.

Slide 24

Slide 24 text

Or do it like this … it’s your call.

Slide 25

Slide 25 text

RStudio can also act as your Git(Hub) client http://www.rstudio.com/ide/docs/version_control/overview

Slide 26

Slide 26 text

Let’s create a new project with the “GitHub first, then RStudio workflow”
http://happygitwithr.com/new-github-first.html

Slide 27

Slide 27 text

No content

Slide 28

Slide 28 text

No content

Slide 29

Slide 29 text

No content

Slide 30

Slide 30 text

http://starlogs.net

Slide 31

Slide 31 text

https://git-man-page-generator.lokaltog.net http://starlogs.net

Slide 32

Slide 32 text

Use GitHub
Or Bitbucket or GitLab or …
Even if you keep things private and don’t collaborate.
Commit and push early and often!
Why, you ask?

Slide 33

Slide 33 text

- Alberto Brandolini

Slide 34

Slide 34 text

The amount of Git skilz necessary to fix a borked up repo is an order of magnitude bigger than to bork it. - Me

Slide 35

Slide 35 text

burn it all down

Slide 36

Slide 36 text

No content

Slide 37

Slide 37 text

No content

Slide 38

Slide 38 text

If that doesn't fix it, git.txt contains the phone number of a friend of mine who understands git. Just wait through a few minutes of 'It's really pretty simple, just think of branches as...' and eventually you'll learn the commands that will fix everything.
“burn it all down” workflow on explainxkcd.com

Slide 39

Slide 39 text

What’s so great about (R) Markdown + Git(Hub)?

Slide 40

Slide 40 text

R + markdown + GitHub
Do your work
Get a presentable, web-friendly version for free
Present-ability is BAKED IN … not a separate process you never get around to

Slide 41

Slide 41 text

stuff you need to write
stuff people like to read

Slide 42

Slide 42 text

stuff you need to write
stuff people like to read
foo.R foo.Rmd foo.md foo.html

Slide 43

Slide 43 text

markdown

Slide 44

Slide 44 text

Markdown (foo.md): easy to write
HTML (foo.html): easy to publish

Slide 45

Slide 45 text

Markdown:

Title (header 1, actually)
=====================================

This is a Markdown document.

## Medium header (header 2, actually)

It's easy to do *italics* or __make things bold__.

> All models are wrong, but some are useful. An approximate answer to the right problem is worth a good deal more than an exact answer to an approximate problem. Absolute certainty is a privilege of uneducated minds-and fanatics. It is, for scientific folk, an unattainable ideal. What you do every day matters more than what you do once in a while. We cannot expect anyone to know anything we didn't teach them ourselves. Enthusiasm is a form of social courage.

Code block below. Just affects formatting here but we'll get to R Markdown for the real fun soon!

```
x <- 3 * 4
```

I can haz equations. Inline equations, such as ... the average is computed as $\frac{1}{n} \sum_{i=1}^{n} x_{i}$. Or display equations like this:

$$
\begin{equation*}
|x|=
\begin{cases} x & \text{if $x≥0$,} \\\\ -x &\text{if $x\le 0$.} \end{cases}

HTML:

Title (header 1, actually)
body { font-family: Helvetica, arial, sans-serif; font-size: 14px; ...
<body>
<h1>Title (header 1, actually)</h1>
<p>This is a Markdown document.</p>
<h2>Medium header (header 2, actually)</h2>
<p>It&#39;s easy to do <em>italics</em> or <strong>make things bold</strong>.</p>
<blockquote>
<p>All models are wrong, but some are...
<p>Code block below. Just affects formatting here but we&#39;ll get to R Markdown for the real fun soon!</p>

Slide 46

Slide 46 text

Markdown → HTML

Title (header 1, actually)
=====================================

This is a Markdown document.

## Medium header (header 2, actually)

It's easy to do *italics* or __make things bold__.

> All models are wrong, but some are useful. An approximate answer to the right problem is worth a good deal more than an exact answer to an approximate problem. Absolute certainty is a privilege of uneducated minds-and fanatics. It is, for scientific folk, an unattainable ideal. What you do every day matters more than what you do once in a while. We cannot expect anyone to know anything we didn't teach them ourselves. Enthusiasm is a form of social courage.

Code block below. Just affects formatting here but we'll get to R Markdown for the real fun soon!

```
x <- 3 * 4
```

I can haz equations. Inline equations, such as ... the average is computed as $\frac{1}{n} \sum_{i=1}^{n} x_{i}$. Or display equations like this:

$$
\begin{equation*}
|x|=
\begin{cases} x & \text{if $x≥0$,} \\\\ -x &\text{if $x\le 0$.} \end{cases}

Slide 47

Slide 47 text

Markdown (foo.md)
HTML (foo.html)
GitHub automatically renders Markdown as if it’s HTML!

Slide 48

Slide 48 text

Markdown, as rendered on GitHub

Slide 49

Slide 49 text

If you have an annoying process for authoring for the web .... or If you avoid authoring for the web, because you’re not sure how ... start writing in Markdown and fling it up on GitHub.

Slide 50

Slide 50 text

What’s so great about (R) Markdown + Git(Hub)?

Slide 51

Slide 51 text

R Markdown:

R Markdown rocks
=====================================

This is an R Markdown document.

```{r}
x <- rnorm(1000)
head(x)
```

See how the R code gets executed and a representation thereof appears in the document? `knitr` gives you control over how to represent all conceivable types of output.

In case you care, the average of the `r length(x)` random normal variates we just generated is `r round(mean(x), 3)`. Those numbers are NOT hard-wired but are computed on-the-fly. As is this figure. No more copy-paste ... copy-paste ... oops forgot to copy-paste.

```{r}
plot(density(x))
```

Note that all the previously demonstrated math typesetting still works. You don't have to choose between having math cred and being web-friendly!

Inline equations, such as ... the average is computed as $\frac{1}{n} \sum_{i=1}^{n} x_{i}$. Or display equations like this:

$$
\begin{equation*}
|x|=
\begin{cases} x & \text{if $x≥0$,} \\\\ -x &\text{if $x\le 0$.}

Markdown:

R Markdown rocks
=====================================

This is an R Markdown document.

```r
x <- rnorm(1000)
head(x)
```

```
## [1] -1.3007 0.7715 0.5585 -1.2854 1.1973 2.4157
```

See how the R code gets executed and a representation thereof appears in the document? `knitr` gives you control over how to represent all conceivable types of output.

In case you care, the average of the 1000 random normal variates we just generated is -0.081. Those numbers are NOT hard-wired but are computed on-the-fly. As is this figure. No more copy-paste ... copy-paste ... oops forgot to copy-paste.

```r
plot(density(x))
```

![plot of chunk unnamed-chunk-2](figure/unnamed-chunk-2.png)

...

Slide 52

Slide 52 text

Markdown → HTML

R Markdown rocks
=====================================

This is an R Markdown document.

```r
x <- rnorm(1000)
head(x)
```

```
## [1] -1.3007 0.7715 0.5585 -1.2854 1.1973 2.4157
```

See how the R code gets executed and a representation thereof appears in the document? `knitr` gives you control over how to represent all conceivable types of output.

In case you care, the average of the 1000 random normal variates we just generated is -0.081. Those numbers are NOT hard-wired but are computed on-the-fly. As is this figure. No more copy-paste ... copy-paste ... oops forgot to copy-paste.

```r
plot(density(x))
```

![plot of chunk unnamed-chunk-2](figure/unnamed-chunk-2.png)

...

Slide 53

Slide 53 text

R Markdown (foo.Rmd): easy to write
Markdown (foo.md)
HTML (foo.html): easy to publish

Slide 54

Slide 54 text

R Markdown (foo.Rmd): easy to write
Markdown (foo.md): easy to GitHub

Slide 55

Slide 55 text

library(rmarkdown)
render("foo.Rmd")
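For anyone who wants to try `render()` without an existing file, here is a minimal sketch. It assumes the rmarkdown package and pandoc may or may not be installed, so the render step is guarded; the file lives under `tempdir()`, and its contents are a made-up example rather than anything from the workshop.

```r
# Build the chunk fence programmatically so this example stays readable.
fence <- strrep("`", 3)

# Write a tiny R Markdown file to a temporary location.
rmd <- file.path(tempdir(), "foo.Rmd")
writeLines(c(
  "---",
  "title: \"Untitled\"",
  "output: html_document",
  "---",
  "",
  "This is an R Markdown document.",
  "",
  paste0(fence, "{r}"),
  "x <- 3 * 4",
  "x",
  fence
), rmd)

# Render only if rmarkdown and pandoc are both available.
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  out <- rmarkdown::render(rmd, quiet = TRUE)  # returns the output file path
  print(basename(out))
}
```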

Slide 56

Slide 56 text

foo.Rmd → foo.html

---
title: "Untitled"
output: html_document
---

foo.Rmd → foo.md → foo.html

---
title: "Untitled"
output:
  html_document:
    keep_md: yes
---

foo.Rmd → foo.md

---
output: md_document
---

foo.Rmd → foo.md

---
output: github_document
---

Slide 57

Slide 57 text

Do I have to do everything in R markdown? What about plain R scripts?

library(rmarkdown)
render("foo.R")

Slide 58

Slide 58 text

R HTML

Slide 59

Slide 59 text

foo.R → foo.html

#' ---
#' title: "Untitled"
#' output: html_document
#' ---

foo.R → foo.md → foo.html

#' ---
#' title: "Untitled"
#' output:
#'   html_document:
#'     keep_md: yes
#' ---

foo.R → foo.md

#' ---
#' output:
#'   md_document
#' ---

foo.R → foo.md

#' ---
#' output:
#'   github_document
#' ---
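The same idea can be tried on a plain R script. The sketch below is illustrative only: the script contents are invented, the file is written under `tempdir()`, and rendering is skipped unless rmarkdown and pandoc are available. The front matter sits in `#'` comments, as on the slide.

```r
# Write a small R script whose YAML front matter sits in #' comments.
script <- file.path(tempdir(), "foo.R")
writeLines(c(
  "#' ---",
  "#' title: \"Untitled\"",
  "#' output: github_document",
  "#' ---",
  "",
  "#' Lines that start with #' become prose in the rendered report.",
  "x <- 3 * 4",
  "x"
), script)

# Render the script directly; guarded so the example is safe to run anywhere.
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  out <- rmarkdown::render(script, quiet = TRUE)
  print(basename(out))
}
```

Ordinary `#` comments stay inside the code chunks, so a script can carry both report prose and regular code comments.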

Slide 60

Slide 60 text

Let’s work with Rmd from RStudio, with Git and GitHub. happygitwithr.com/rmd-test-drive.html

Slide 61

Slide 61 text

get a pseudo-website for free

Slide 62

Slide 62 text

http://happygitwithr.com/repo-browsability.html

Slide 63

Slide 63 text

Impressive showing by Git here

Slide 64

Slide 64 text

Make markdown!
Commit markdown!
Rmd → markdown!
R → markdown!
output_format: github_document

Slide 65

Slide 65 text

https://github.com/blog/2289-publishing-with-github-pages-now-as-easy-as-1-2-3

Slide 66

Slide 66 text

Comma (.csv) and tab (.tsv) delimited files are automatically rendered nicely in GitHub repositories
Example: some Lord of the Rings data
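To get a delimited file worth committing, something like the following would do. It is a sketch that uses the built-in `iris` data as a stand-in (not the Lord of the Rings data from the slide), and the file name `iris-head.csv` is a placeholder.

```r
# Write a few rows of a built-in dataset as a plain comma-delimited file.
csv_path <- file.path(tempdir(), "iris-head.csv")
write.csv(head(iris), csv_path, row.names = FALSE)

# Read it back to confirm the file is ordinary, machine-readable CSV.
round_trip <- read.csv(csv_path)
dim(round_trip)
```

Dropping the row names keeps the file to one header line plus data rows, which is the shape a rendered table view expects.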

Slide 67

Slide 67 text

“README.md as index.html”

Slide 68

Slide 68 text

“one definitive source” .Rmd → .R .md, .html

Slide 69

Slide 69 text

“party in the back”

Slide 70

Slide 70 text

https://github.com/jennybc/gapminder/tree/master/data-raw#readme

Slide 71

Slide 71 text

https://github.com/blog/2289-publishing-with-github-pages-now-as-easy-as-1-2-3 Turn this on for your practice repo!

Slide 72

Slide 72 text

the poor woman’s regression test of a data analysis

Slide 73

Slide 73 text

Your code’s the same Your data’s the same But you updated R + pkgs Surprise!
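One way to make such a surprise visible is to commit rendered output, so that a re-render after an upgrade shows up as a diff. The sketch below imitates that check in plain R under stated assumptions: `render_summary()` and `summary.md` are hypothetical names, and in practice `git diff` on the committed file does the comparison.

```r
# Hypothetical helper: produce a small, deterministic report file.
render_summary <- function(path) {
  set.seed(1)                       # fixed seed, so same code + same data
  x <- rnorm(1000)
  writeLines(c(
    "# Summary",
    sprintf("mean: %.3f", mean(x)),
    sprintf("rendered with: %s", R.version.string)
  ), path)
}

out_file <- file.path(tempdir(), "summary.md")

render_summary(out_file)
before <- readLines(out_file)       # stands in for the committed version

render_summary(out_file)
after <- readLines(out_file)        # stands in for the re-rendered version

# TRUE while nothing has changed; after an R or package upgrade,
# a FALSE here (or a non-empty git diff) flags the change.
identical(before, after)
```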

Slide 74

Slide 74 text

No content

Slide 75

Slide 75 text

No content

Slide 76

Slide 76 text

subtle fig changes due to ggplot2 release

Slide 77

Slide 77 text

Let’s try this out in our Rmd.
Do something. Render, commit, push.
Change something. Render, commit, push.

Slide 78

Slide 78 text

increase flow

Slide 79

Slide 79 text

machine readable & human readable

Slide 80

Slide 80 text

code can be machine & human readable

Slide 81

Slide 81 text

data can be machine & human readable

Slide 82

Slide 82 text

your project can be machine & human readable

Slide 83

Slide 83 text

code comments
README
explanation-of-mystifying-variable-names-and-codes.txt

Slide 84

Slide 84 text

What is here?
When did it last change?
Who changed it?
Why did they change it?
Can I have it?
Oh, I want that other version.

Slide 85

Slide 85 text

Commits are how the files evolve

Slide 86

Slide 86 text

Commit message = short description of what/why changed

Slide 87

Slide 87 text

“diffs” show what actually changed

Slide 88

Slide 88 text

Issues for bug reports, feature requests, to do list, …

Slide 89

Slide 89 text

collaboration

Slide 90

Slide 90 text

Git for humans
Alice Bartlett, Senior Developer, Financial Times (@alicebartlett)
https://speakerdeck.com/alicebartlett/git-for-humans
Excerpt on remotes, push, pull

Slide 91

Slide 91 text

No content

Slide 92

Slide 92 text

in theory
more typical
GitHub
adapted from https://www.atlassian.com/git/tutorial/git-basics#!clone

Slide 93

Slide 93 text

Note the contributions to STAT 545 materials from one prof, 3 TAs, and one kind soul from the internet

Slide 94

Slide 94 text

“Pull requests” are a mechanism to propose, discuss, and merge changes into a repository.

Slide 95

Slide 95 text

when you’re the boss:
link to evolving files, don’t attach static copies to email
plain text everything you can
use Git
put it on the internet somewhere

when you’re not the boss:
try to talk everyone into Google Docs

Slide 96

Slide 96 text

Jennifer Bryan
RStudio, University of British Columbia
@JennyBryan @jennybc
Happy Git and GitHub for the useR