Introduction to Git and GitHub
Patrick Kimes, PhD
Postdoctoral Fellow
Dana-Farber Cancer Institute
Harvard T.H. Chan School of Public Health
Data Science Seminar
November 27, 2018
Slide 2
Slide 2 text
why care about
Git and GitHub?
Slide 3
Slide 3 text
why care about
Git and GitHub?
sharing
collaboration
version control
Slide 4
Slide 4 text
Git
software for managing
files in a folder (repo)
Slide 5
Slide 5 text
GitHub
Git
software for managing
files in a folder (repo)
GitHub
cloud service for
hosting Git repos
Slide 6
Slide 6 text
GitHub
cloud service for
hosting Git repos
/somewhere/on/my/computer/sigclust2/
https://github.com/pkimes/sigclust2/
Git
software for managing
files in a folder (repo)
Slide 9
Slide 9 text
Git
software for managing
files in a folder (repo)
Slide 10
Slide 10 text
version control software
Git
software for managing
files in a folder (repo)
Slide 11
Slide 11 text
Git
software for managing
files in a folder (repo)
version control software
http://phdcomics.com/comics.php?f=1323
“I already have a system”
Slide 12
Slide 12 text
rnaseq-analysis-update2-final.R
rnaseq-analysis.R
rnaseq-analysis-update.R
rnaseq-analysis-update2.R
rnaseq-analysis-update2-final-pkk.R
“I already have a system”
ad infinitum…
Slide 13
Slide 13 text
rnaseq-analysis.R
rnaseq-analysis.R
rnaseq-analysis.R
rnaseq-analysis.R
rnaseq-analysis.R
version control software
“I already have a system”
ad infinitum…
rnaseq-analysis-update2-final.R
rnaseq-analysis.R
rnaseq-analysis-update.R
rnaseq-analysis-update2.R
rnaseq-analysis-update2-final-pkk.R
Slide 14
Slide 14 text
version control software
Git
history of files is stored
as a series of commits
rnaseq-analysis.R
rnaseq-analysis.R
rnaseq-analysis.R
rnaseq-analysis.R
rnaseq-analysis.R
Slide 15
Slide 15 text
version control software
Git
history of files is stored
as a series of commits
rnaseq-analysis.R
Slide 16
Slide 16 text
version control software
Git
history of files is stored
as a series of commits
commit
snapshot of file +
useful message
rnaseq-analysis.R
Slide 17
Slide 17 text
version control software
Add new analysis
Update analysis parameters
Try new method
Remove older results
Clean up notes for release
rnaseq-analysis.R
Slide 18
Slide 18 text
version control software
Add new analysis
Update analysis parameters
Try new method
Remove older results
commit 6e40a27cb9415fd98fa3ef068efbb5e22eb7d497
Author: First Last
Date: Sun Nov 18 11:10:25 2018 -0500
Clean up notes for release
rnaseq-analysis.R
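The log entry on this slide is the kind of record Git keeps for every commit. The sketch below shows one way such a history could be created on the command line. The scratch folder, the file contents, and the email address are placeholders chosen for illustration; the commit messages are taken from the slides.

```shell
set -e

# Hypothetical scratch folder; any folder can be turned into a repo.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet

# Placeholder identity; Git records it as the commit's Author.
git config user.name "First Last"
git config user.email "first.last@example.com"

# Create the file, stage it, and record a snapshot with a message.
echo 'counts <- read.csv("counts.csv")' > rnaseq-analysis.R
git add rnaseq-analysis.R
git commit --quiet -m "Add new analysis"

# Change the file and record a second snapshot.
echo 'alpha <- 0.05' >> rnaseq-analysis.R
git add rnaseq-analysis.R
git commit --quiet -m "Update analysis parameters"

# Show the history: hash, author, date, and message for each commit.
git --no-pager log
```

The file name never changes; each version lives in the history as a commit rather than as a renamed copy.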
Slide 19
Slide 19 text
rnaseq-analysis.R
more commonly
visualized horizontally
Add new analysis
Update analysis parameters
Try new method
Remove older results
Clean up notes for release
Slide 20
Slide 20 text
rnaseq-analysis.R
check out an older commit
Add new analysis
Update analysis parameters
Try new method
Remove older results
Clean up notes for release
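As a rough illustration of checking out an older commit, the sketch below builds a two-commit history and then moves back and forth in it. The scratch folder, file contents, and identity are placeholders, not part of the original slides.

```shell
set -e

# Hypothetical scratch repo with a placeholder identity.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "First Last"
git config user.email "first.last@example.com"

echo 'method <- "old"' > rnaseq-analysis.R
git add rnaseq-analysis.R
git commit --quiet -m "Add new analysis"
first="$(git rev-parse HEAD)"   # remember the first commit's hash

echo 'method <- "new"' > rnaseq-analysis.R
git add rnaseq-analysis.R
git commit --quiet -m "Try new method"

# Check out the older commit: the file returns to its earlier state.
git checkout --quiet "$first"
cat rnaseq-analysis.R           # prints: method <- "old"

# Return to the branch we came from, at its newest commit.
git checkout --quiet -
cat rnaseq-analysis.R           # prints: method <- "new"
```

Nothing is lost by looking at the older commit; the newer snapshot stays in the history and is restored by the last checkout.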
Slide 21
Slide 21 text
rnaseq-analysis.R
Add new analysis
Update analysis parameters
Try new method
Remove older results
Clean up notes for release
inspect a diff between two commits
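A diff between two commits could be inspected along these lines. This is a minimal sketch: the scratch folder, the parameter values, and the identity are made up for illustration.

```shell
set -e

# Hypothetical scratch repo with a placeholder identity.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "First Last"
git config user.email "first.last@example.com"

echo 'alpha <- 0.05' > rnaseq-analysis.R
git add rnaseq-analysis.R
git commit --quiet -m "Add new analysis"

echo 'alpha <- 0.01' > rnaseq-analysis.R
git add rnaseq-analysis.R
git commit --quiet -m "Update analysis parameters"

# Compare the previous commit (HEAD~1) with the newest one (HEAD).
# Removed lines are prefixed with "-", added lines with "+".
git --no-pager diff HEAD~1 HEAD
```

Any two commits can be compared this way by giving their hashes in place of `HEAD~1` and `HEAD`.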
Slide 22
Slide 22 text
rnaseq-analysis.R
Add new analysis
Update analysis parameters
Try new method
Remove older results
Clean up notes for release
commit best practices
Slide 23
Slide 23 text
rnaseq-analysis.R
Add new analysis
Update analysis parameters
Try new method
Remove older results
Clean up notes for release
commit best practices
1. commits should be complete
Slide 24
Slide 24 text
commit best practices
1. commits should be complete
2. commit messages should be meaningful
https://xkcd.com/1296/
https://chris.beams.io/posts/git-commit/
Slide 25
Slide 25 text
Git
version control software for
managing files in a folder
Slide 26
Slide 26 text
repo
folder of files; a Git project
Git
version control software for
managing files in a folder
Slide 27
Slide 27 text
repo
folder of files; a Git project
commit
snapshot of files in a repo
Git
version control software for
managing files in a folder
Slide 28
Slide 28 text
Git
version control software for
managing files in a folder
git repo
Slide 29
Slide 29 text
GitHub
cloud service for
hosting Git projects
Git
version control software for
managing files in a folder
git repo
Slide 30
Slide 30 text
GitHub
cloud service for
hosting Git projects
Git
version control software for
managing files in a folder
git repo
git repo
Slide 31
Slide 31 text
GitHub
git repo
git repo
GitHub: hosting service
Slide 32
Slide 32 text
GitHub
git repo
git repo
GitHub: hosting service
sharing
newest
version
GitHub is more than just a cloud
sharing
share the complete Git history
Slide 44
Slide 44 text
GitHub is more than just a cloud
sharing
collaboration
share the complete Git history
open the code to suggestions and fixes
Slide 45
Slide 45 text
GitHub is more than just a cloud
https://kbroman.org/github_tutorial/pages/why.html
Slide 46
Slide 46 text
GitHub was built for Git
Slide 55
Slide 55 text
Bitbucket
GitLab
GitHub
GitHub isn’t the only option,
but it’s a good one
Slide 56
Slide 56 text
GitHub
cloud service for
hosting Git projects
GitHub
git repo
git repo
Slide 57
Slide 57 text
GitHub
cloud service for
hosting Git projects
GitHub
git repo
git repo
remote
hosted copy of a repo
local
remote
Slide 58
Slide 58 text
GitHub
cloud service for
hosting Git projects
GitHub
git repo
git repo
remote
hosted copy of a repo
local
remote
push/pull
sync commits between
local/remote
push
Slide 59
Slide 59 text
GitHub
cloud service for
hosting Git projects
GitHub
git repo
git repo
remote
hosted copy of a repo
local
remote
push/pull
sync commits between
local/remote
pull
push
Slide 60
Slide 60 text
push/pull
sync commits between
local/remote
GitHub
cloud service for
hosting Git projects
GitHub
git repo
git repo
git repo
remote
pull
push
local
local
remote
hosted copy of a repo
pull
push
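The picture of one remote and two local copies can be tried out without a GitHub account. In the sketch below a bare repo in a local folder stands in for the GitHub-hosted remote; that substitution, the folder names `remote.git`, `local-a`, and `local-b`, and the identity are all assumptions made for illustration.

```shell
set -e

work="$(mktemp -d)"   # hypothetical scratch folder
cd "$work"

# Placeholder identity for the commits made below.
set_identity() {
  git config user.name "First Last"
  git config user.email "first.last@example.com"
}

# Stand-in for the GitHub-hosted repo: a bare repo in a local folder.
git init --quiet --bare remote.git

# First local copy: clone the (still empty) remote, commit, and push.
git clone --quiet remote.git local-a
cd local-a
set_identity
echo 'counts <- read.csv("counts.csv")' > rnaseq-analysis.R
git add rnaseq-analysis.R
git commit --quiet -m "Add new analysis"
git push --quiet -u origin HEAD   # -u remembers the remote branch for later pulls

# Second local copy, e.g. a collaborator: clone, commit, and push.
cd "$work"
git clone --quiet remote.git local-b
cd local-b
set_identity
echo 'alpha <- 0.05' >> rnaseq-analysis.R
git add rnaseq-analysis.R
git commit --quiet -m "Update analysis parameters"
git push --quiet origin HEAD

# Back in the first copy: pull the collaborator's commit.
cd "$work/local-a"
git pull --quiet --ff-only
git --no-pager log --oneline
```

After the pull, both local copies and the remote hold the same two commits. With GitHub, the only change would be cloning from the repo's URL instead of a local folder.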
Slide 61
Slide 61 text
repo
folder of files; a Git project
commit
snapshot of files in a repo
Git
version control software for
managing files in a folder
Slide 62
Slide 62 text
repo
folder of files; a Git project
commit
snapshot of files in a repo
Git
version control software for
managing files in a folder
GitHub
cloud service for
hosting Git projects
remote
hosted copy of a repo
push/pull
sync commits between local/remote
Slide 63
Slide 63 text
awesome!
Slide 64
Slide 64 text
https://xkcd.com/1597/
Git and GitHub IRL
(in real life)
Slide 65
Slide 65 text
Git is
command line
Git is unfriendly
Slide 66
Slide 66 text
enough with the what,
on to the how
Slide 67
Slide 67 text
enough with the what,
on to the how
what you’ll need:
1. Git
2. GitHub account
Slide 68
Slide 68 text
enough with the what,
on to the how
what you’ll need:
1. Git
2. GitHub account
3. Git GUI client
Slide 69
Slide 69 text
enough with the what,
on to the how
what you’ll need:
1. Git
2. GitHub account
3. Git GUI client
GitHub Desktop
GitKraken
Slide 70
Slide 70 text
Git on the
command line
Slide 71
Slide 71 text
Git in RStudio
Slide 73
Slide 73 text
enough with the what,
on to the how
what you’ll need:
1. Git
2. GitHub account
3. Git GUI client
/username
Slide 74
Slide 74 text
enough with the what,
on to the how
what you’ll need:
1. Git
2. GitHub account
3. Git GUI client
link
local/remote
/username
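Linking an existing local repo to a remote might look like the following on the command line. This is only a sketch: a bare repo in a local folder stands in for the GitHub repo, and the folder names, README contents, and identity are placeholders. With GitHub, the remote address would instead be a URL under your account, in the same form as the sigclust2 URL shown earlier.

```shell
set -e

work="$(mktemp -d)"   # hypothetical scratch folder
cd "$work"

# Stand-in for an empty repo created on GitHub (assumption).
git init --quiet --bare sigclust2.git

# An existing local repo with one commit.
git init --quiet sigclust2
cd sigclust2
git config user.name "First Last"
git config user.email "first.last@example.com"
echo "sigclust2" > README.md
git add README.md
git commit --quiet -m "Add README"

# Link local and remote: register the remote under the usual name "origin".
git remote add origin "$work/sigclust2.git"
git remote -v   # lists the remote's fetch and push addresses

# Send the local commits to the remote; -u remembers the link for later push/pull.
git push --quiet -u origin HEAD
```

Once the link exists, later syncs need only `git push` and `git pull` from inside the local folder.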