DFCI Introduction to Git and GitHub

Data Science Seminar
Department of Data Sciences
Dana-Farber Cancer Institute

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Patrick Kimes

November 27, 2018

Transcript

  1. Introduction to Git and GitHub
    Patrick Kimes, PhD
    Postdoctoral Fellow
    Dana-Farber Cancer Institute
    Harvard TH Chan School of Public Health
    Data Science Seminar
    November 27, 2018

  2. why care about
    Git and GitHub?

  3. why care about
    Git and GitHub?
    sharing
    collaboration
    version control

  4. Git
    software for managing
    files in a folder (repo)

  5. GitHub
    Git
    software for managing
    files in a folder (repo)
    GitHub
    cloud service for
    hosting Git repos

  6. GitHub
    GitHub
    cloud service for
    hosting Git repos
    /somewhere/on/my/computer/sigclust2/
    https://github.com/pkimes/sigclust2/
    Git
    software for managing
    files in a folder (repo)

  7. GitHub
    Git
    software for managing
    files in a folder (repo)
    GitHub
    cloud service for
    hosting Git repos

  8. GitHub
    Git
    software for managing
    files in a folder (repo)
    GitHub
    cloud service for
    hosting Git repos

  9. Git
    software for managing
    files in a folder (repo)

  10. version control software
    Git
    software for managing
    files in a folder (repo)

  11. Git
    software for managing
    files in a folder (repo)
    version control software
    http://phdcomics.com/comics.php?f=1323
    “I already have a system”

  12. “I already have a system”
    rnaseq-analysis.R
    rnaseq-analysis-update.R
    rnaseq-analysis-update2.R
    rnaseq-analysis-update2-final.R
    rnaseq-analysis-update2-final-pkk.R
    ad infinitum…

  13. “I already have a system”
    rnaseq-analysis.R
    rnaseq-analysis-update.R
    rnaseq-analysis-update2.R
    rnaseq-analysis-update2-final.R
    rnaseq-analysis-update2-final-pkk.R
    ad infinitum…
    version control software
    rnaseq-analysis.R
    rnaseq-analysis.R
    rnaseq-analysis.R
    rnaseq-analysis.R
    rnaseq-analysis.R

  14. version control software
    Git
    history of files is stored
    as a series of commits
    rnaseq-analysis.R
    rnaseq-analysis.R
    rnaseq-analysis.R
    rnaseq-analysis.R
    rnaseq-analysis.R

  15. version control software
    Git
    history of files is stored
    as a series of commits
    rnaseq-analysis.R

  16. version control software
    Git
    history of files is stored
    as a series of commits
    commit
    snapshot of file +
    useful message
    rnaseq-analysis.R

  17. version control software
    Add new analysis
    Update analysis parameters
    Try new method
    Remove older results
    Clean up notes for release
    rnaseq-analysis.R

  18. version control software
    Add new analysis
    Update analysis parameters
    Try new method
    Remove older results
    commit 6e40a27cb9415fd98fa3ef068efbb5e22eb7d497
    Author: First Last
    Date: Sun Nov 18 11:10:25 2018 -0500
    Clean up notes for release
    rnaseq-analysis.R
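
A minimal sketch of how a history like the one above could be built on the command line. The temporary folder, the file contents, and the placeholder identity are assumptions made for illustration; only the file name and commit messages come from the slides.

```shell
#!/bin/sh
set -e

# Work in a throwaway folder so nothing else on the machine is touched.
repo=$(mktemp -d)
cd "$repo"

# Turn the folder into a Git repo.
git init -q
# Placeholder identity, stored for this repo only.
git config user.name "First Last"
git config user.email "first.last@example.org"

# First commit: a snapshot of the new file plus a useful message.
echo "# RNA-seq analysis" > rnaseq-analysis.R
git add rnaseq-analysis.R
git commit -q -m "Add new analysis"

# Second commit: same file name, updated contents (hypothetical edit).
echo "alpha <- 0.05" >> rnaseq-analysis.R
git add rnaseq-analysis.R
git commit -q -m "Update analysis parameters"

# Show the history, newest commit first.
git log --oneline
```

Each `git commit` records the staged snapshot together with its message, so the file keeps one name while the history holds every version.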

  19. rnaseq-analysis.R
    more commonly
    visualized horizontally
    Add new analysis
    Update analysis parameters
    Try new method
    Remove older results
    Clean up notes for release

  20. rnaseq-analysis.R
    check out an older commit
    Add new analysis
    Update analysis parameters
    Try new method
    Remove older results
    Clean up notes for release

  21. rnaseq-analysis.R
    Add new analysis
    Update analysis parameters
    Try new method
    Remove older results
    Clean up notes for release
    inspect a diff between two commits
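
As a rough command-line sketch of the two operations above (checking out an older commit and inspecting a diff), assuming a throwaway repo with two commits. The file contents and the placeholder identity are hypothetical.

```shell
#!/bin/sh
set -e

# Throwaway repo with two commits (hypothetical file contents).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "First Last"
git config user.email "first.last@example.org"

echo "# RNA-seq analysis" > rnaseq-analysis.R
git add rnaseq-analysis.R
git commit -q -m "Add new analysis"

echo "alpha <- 0.05" >> rnaseq-analysis.R
git add rnaseq-analysis.R
git commit -q -m "Update analysis parameters"

# Inspect a diff between two commits: the previous one (HEAD~1) and the latest (HEAD).
git diff HEAD~1 HEAD

# Check out the older commit; the file on disk returns to its earlier snapshot.
git checkout -q HEAD~1
old_contents=$(cat rnaseq-analysis.R)
echo "older snapshot: $old_contents"

# Go back to the branch you were on, which restores the newest version.
git checkout -q -
cat rnaseq-analysis.R
```

Checking out an older commit does not delete anything; the newer commits stay in the history and come back with the final `git checkout -q -`.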

  22. rnaseq-analysis.R
    Add new analysis
    Update analysis parameters
    Try new method
    Remove older results
    Clean up notes for release
    commit best practices

  23. rnaseq-analysis.R
    Add new analysis
    Update analysis parameters
    Try new method
    Remove older results
    Clean up notes for release
    commit best practices
    1. commits should be complete

  24. commit best practices
    1. commits should be complete
    2. commit messages should be meaningful
    https://xkcd.com/1296/
    https://chris.beams.io/posts/git-commit/

  25. Git
    version control software for
    managing files in a folder

  26. repo
    folder of files; a Git project
    Git
    version control software for
    managing files in a folder

  27. repo
    folder of files; a Git project
    commit
    snapshot of files in a repo
    Git
    version control software for
    managing files in a folder

  28. Git
    version control software for
    managing files in a folder
    git repo

  29. GitHub
    GitHub
    cloud service for
    hosting Git projects
    Git
    version control software for
    managing files in a folder
    git repo

  30. GitHub
    GitHub
    cloud service for
    hosting Git projects
    Git
    version control software for
    managing files in a folder
    git repo
    git repo

  31. GitHub
    git repo
    git repo
    GitHub: hosting service

  32. GitHub
    git repo
    git repo
    GitHub: hosting service
    sharing
    newest
    version

  33. GitHub
    git repo
    git repo
    GitHub: hosting service
    sharing
    git repo
    collaboration

  34. GitHub
    git repo
    git repo
    GitHub: hosting service
    sharing
    git repo
    collaboration

  35. GitHub?

  36. GitHub?

  37. GitHub?

  38. GitHub?

  39. GitHub?

  40. GitHub?

  41. why can’t we just use…
    Google Drive
    Dropbox
    …?

  42. GitHub is more than just a cloud

  43. GitHub is more than just a cloud
    sharing
    share the complete Git history

  44. GitHub is more than just a cloud
    sharing
    collaboration
    share the complete Git history
    open the code to suggestions and fixes

  45. GitHub is more than just a cloud
    https://kbroman.org/github_tutorial/pages/why.html

  46. GitHub was built for Git

  47. GitHub was built for Git

  48. GitHub was built for Git

  49. GitHub was built for Git

  50. GitHub was built for Git

  51. GitHub was built for Git

  52. GitHub was built for Git

  53. GitHub was built for Git

  54. GitHub was built for Git

  55. GitHub isn’t the only option,
    but it’s a good one
    GitHub
    Bitbucket
    GitLab

  56. GitHub
    cloud service for
    hosting Git projects
    GitHub
    git repo
    git repo

  57. GitHub
    cloud service for
    hosting Git projects
    GitHub
    git repo
    git repo
    remote
    hosted copy of a repo
    local
    remote

  58. GitHub
    cloud service for
    hosting Git projects
    GitHub
    git repo
    git repo
    remote
    hosted copy of a repo
    local
    remote
    push/pull
    sync commits between
    local/remote
    push

  59. GitHub
    cloud service for
    hosting Git projects
    GitHub
    git repo
    git repo
    remote
    hosted copy of a repo
    local
    remote
    push/pull
    sync commits between
    local/remote
    pull
    push

  60. push/pull
    sync commits between
    local/remote
    GitHub
    cloud service for
    hosting Git projects
    GitHub
    git repo
    git repo
    git repo
    remote
    pull
    push
    local
    local
    remote
    hosted copy of a repo
    pull
    push
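
The push/pull picture above, with one remote and two local copies, can be tried without a GitHub account. This is only a sketch: a bare repo in a local folder stands in for the hosted remote, and the collaborator names `alice` and `bob`, their identities, and the file contents are hypothetical.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"

# Stand-in for GitHub: a bare repo in a local folder plays the remote.
git init -q --bare remote.git

# Local copy A. Cloning an empty remote prints a harmless warning.
git clone -q remote.git alice
cd alice
git config user.name "Alice Example"
git config user.email "alice@example.org"
echo "# RNA-seq analysis" > rnaseq-analysis.R
git add rnaseq-analysis.R
git commit -q -m "Add new analysis"
# push: send local commits to the remote; HEAD means "the current branch".
git push -q -u origin HEAD
cd ..

# Local copy B clones the remote and receives the complete history.
git clone -q remote.git bob

# Alice commits again and pushes.
cd alice
echo "alpha <- 0.05" >> rnaseq-analysis.R
git commit -q -a -m "Update analysis parameters"
git push -q
cd ..

# pull: Bob fetches the new commit and updates his files.
cd bob
git pull -q --ff-only
git log --oneline
```

With a real GitHub repo the only change would be the address passed to `git clone`; the push and pull steps stay the same.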

  61. repo
    folder of files; a Git project
    commit
    snapshot of files in a repo
    Git
    version control software for
    managing files in a folder

  62. repo
    folder of files; a Git project
    commit
    snapshot of files in a repo
    Git
    version control software for
    managing files in a folder
    GitHub
    cloud service for
    hosting Git projects
    remote
    remote copy of repo
    push/pull
    sync commits between local/remote

  63. awesome!

  64. https://xkcd.com/1597/
    Git and GitHub IRL
    (in real life)

  65. Git is
    command line
    Git is unfriendly

  66. enough with the what,
    on to the how

  67. enough with the what,
    on to the how
    what you’ll need:
    1. Git
    2. GitHub account

  68. enough with the what,
    on to the how
    what you’ll need:
    1. Git
    2. GitHub account
    3. Git GUI client

  69. enough with the what,
    on to the how
    what you’ll need:
    1. Git
    2. GitHub account
    3. Git GUI client
    GitHub
    Desktop
    GitKraken

  70. Git on the
    command line

  71. Git in RStudio

  72. Git in RStudio

  73. enough with the what,
    on to the how
    what you’ll need:
    1. Git
    2. GitHub account
    3. Git GUI client
    /username

  74. enough with the what,
    on to the how
    what you’ll need:
    1. Git
    2. GitHub account
    3. Git GUI client
    link
    local/remote
    /username

  75. enough with the what,
    on to the how

  92. pkimes$ git remote add origin https://github.com/pkimes/dfci-example.git
    pkimes$ git push -u origin master
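
The two commands above link an existing local repo to a remote and push to it. A minimal sketch of that link step, assuming a bare repo in a local folder as a stand-in for the GitHub URL on the slide; the README contents and the placeholder identity are hypothetical.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"

# Stand-in for the empty repo you would create on GitHub.
git init -q --bare remote.git

# An existing local repo with one commit.
git init -q dfci-example
cd dfci-example
git config user.name "First Last"
git config user.email "first.last@example.org"
echo "# dfci-example" > README.md
git add README.md
git commit -q -m "Add README"
# Name the branch master to match the slide; newer Git versions may default to another name.
git branch -M master

# link: tell the local repo where the remote lives, under the conventional name "origin".
git remote add origin "$work/remote.git"
# push: upload master and remember origin/master as its upstream (-u).
git push -q -u origin master

# Check the link and the tracking branch.
git remote -v
git branch -vv
```

After `-u` has recorded the upstream once, a plain `git push` or `git pull` from this repo is enough.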

  97. wait, I’m lost

  98. git repo
    local
    wait, I’m lost

  99. git repo
    local
    wait, I’m lost

  100. GitHub
    git repo
    local
    wait, I’m lost

  101. GitHub
    git repo
    git repo
    local
    remote
    wait, I’m lost

  102. GitHub
    git repo
    local
    remote
    link
    wait, I’m lost
    git repo

  103. GitHub
    git repo
    git repo
    local
    remote
    push
    wait, I’m lost

  104. awesome!

  105. wait, I’m still lost

  106. Alice Bartlett
    Senior Developer, Financial Times
    @alicebartlett
    Git for humans
    https://speakerdeck.com/alicebartlett/git-for-humans

  107. happy Git and GitHub for the useR
    http://happygitwithr.com/

  108. introduction to data science (section VIII)
    https://rafalab.github.io/dsbook/

  109. data science course in a box
    https://github.com/rstudio-education/datascience-box

  110. let’s give it a try!
    https://rafalab.github.io/dsbook/
    Sections 75 - 79
