DFCI Introduction to R and RStudio

Patrick Kimes

October 01, 2019

Transcript

  1. Introduction to R/RStudio
    Patrick Kimes, PhD
    Postdoctoral Fellow
    Dana-Farber Cancer Institute
    Harvard TH Chan School of Public Health
    Top Ten Seminars
    October 1, 2019


  2. top ten seminars in data science
    October 1, 2019      Introduction to R and RStudio
    October 22, 2019     Data visualization with ggplot2
    November 12, 2019    Data visualization principles and plots to avoid
    December 10, 2019    Design of Clinical Trials Basics
    January 21, 2020     Correlation: you are probably using it wrong
    February 11, 2020    How to detect and deal with batch effects
    March 17, 2020       Brief introduction to machine learning
    April 14, 2020       Culprits of the reproducibility crisis: multiple testing and p-hacking
    May 19, 2020         Experimental Design: How many size and should I pool?
    June 16, 2020        Detecting differentially expressed genes with RNA-seq

  3. R and RStudio?

  4. R and RStudio?
    R: programming language
    (think Java, C, C++, Python, …)

  5. R and RStudio?
    R: programming language
    RStudio: software to make data analysis with R easier

  6. R and RStudio?
    R: programming language
    RStudio: software to make data analysis with R easier

  7. (no transcribed text)

  8. R: engine
    RStudio: dashboard

  9. why R and RStudio?

  10. why R and RStudio?
    what about SAS?

  11. why R and RStudio?
    what about SAS?
    • R is free, open source
    • R is the home of new methods
    • R has a large, active community
    • R is highly interoperable, extensible

  12. why R and RStudio?
    what about Python?

  13. why R and RStudio?
    what about Python?
    • Good question! Up to you!
    • R is arguably easier to learn
    • R has more statistical tools
    • R makes exploration and visualization easier

  14. why R and RStudio?

  15. why R and RStudio?
    it gets you to the data fast!

  16. it gets you to
    the data fast!
    and that’s fun!
    https://twitter.com/avogado6/status/1165595520967954432


  17. who already has
    R / RStudio installed?


  18. how to install
    R and RStudio


  19. (no transcribed text)

  20. how to install R and RStudio
    1. Search “R”, Search “RStudio”
    2. Install “R”, Install “RStudio”

  21. how to install R and RStudio
    1. Search “R”, Search “RStudio”
    2. Install “R”, Install “RStudio”
    maybe a few more steps, so please do this later
    a much better guide:
    rafalab.github.io/dsbook/installing-r-rstudio

  22. lucky us!
    we have a workaround!
    https://rstudio.cloud


  23. lucky us!
    we have a workaround!
    https://rstudio.cloud
    do this


  24. lucky us!
    we have a workaround!
    https://rstudio.cloud
    this too

  25. lucky us!
    we have a workaround!
    https://rstudio.cloud
    select
    [from Git repo]


  26. lucky us!
    we have a workaround!
    https://rstudio.cloud
    enter
    https://github.com/pkimes/dfci-introR


  27. welcome to
    RStudio!


  28. local RStudio
    select
    [New Project]

  29. local RStudio
    select
    [Version Control]

  30. local RStudio
    we’ll use [Git]

  31. local RStudio
    enter
    https://github.com/pkimes/dfci-introR

  32. we’re good to go!


  33. you now have a project!
    what’s an RStudio project?
    basically a folder to
    organize an analysis
    • input data
    • R scripts
    • results/figures
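
The folder idea above can be sketched in a few lines of R. This is only an illustration: the folder names (`data`, `scripts`, `results`) and the project name are hypothetical, and the sketch writes to a temporary directory rather than a real project. An actual RStudio project also contains an `.Rproj` file that RStudio creates for you.

```r
# Hypothetical project layout; folder names are illustrative only.
project_dir <- file.path(tempdir(), "my-analysis")
subfolders <- c("data", "scripts", "results")

# Create one subfolder per kind of content (input data, R scripts, results/figures)
for (folder in subfolders) {
  dir.create(file.path(project_dir, folder),
             recursive = TRUE, showWarnings = FALSE)
}

# Show what was created
print(list.files(project_dir))
```

Keeping inputs, scripts, and outputs in separate subfolders is one common convention, not a requirement of RStudio.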


  34. coding coding coding coding
    coding coding coding coding
    coding coding coding coding
    coding coding coding coding
    coding coding coding coding
    let’s give it a try!


  35. what did we (hopefully) cover?
    arithmetic
    variables
    functions
    help
    installing packages
    loading packages
    for-loops
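
For readers without the live session, the topics listed above might look roughly like this in R. This is a minimal sketch, not the code used in the seminar: the variable names are made up, and the help and package lines are commented out so the script runs without a network connection or an interactive session.

```r
# Arithmetic
1 + 2 * 3                      # multiplication is evaluated before addition

# Variables
x <- c(4, 8, 15)               # a numeric vector stored under the name x
x_mean <- mean(x)

# Functions
square <- function(v) v^2      # a user-defined function
squares <- square(x)

# Help
# ?mean                        # opens the help page for mean()

# Installing packages (needed once per machine)
# install.packages("ggplot2")

# Loading packages (needed once per session)
# library(ggplot2)

# For-loops
total <- 0
for (value in x) {
  total <- total + value       # accumulate the running sum
}
print(total)
```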

  36. some pieces in the modern (R) data scientist’s toolbox

  37. some pieces in the modern (R) data scientist’s toolbox
    rmarkdown        documentation, communication
    tidyverse        data manipulation, visualization
    shiny            web application framework
    [bioconductor]   community of genomics packages

  38. some pieces in the modern (R) data scientist’s toolbox
    rmarkdown        documentation, communication
    tidyverse        data manipulation, visualization
    shiny            web application framework
    [bioconductor]   community of genomics packages

  39. .R file

  40. .R file, .Rmd file

  41. .R file, .Rmd file
    formal header

  42. .R file, .Rmd file
    code “chunks”

  43. .R file, .Rmd file
    plain text (markdown)

  44. .R file, .Rmd file
    specified output format

  45. .Rmd file

  46. .Rmd file
    formatted text
    output!
    R code
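
The parts of an .Rmd file named on the slides above (header, output format, markdown text, code chunks) can be seen together in a small example. The sketch below writes a hypothetical .Rmd file from R; the title and contents are invented for illustration, and only the file name `myfile.Rmd` is taken from the slides.

```r
# A minimal, hypothetical .Rmd file, written from R so each part can be labeled.
rmd_lines <- c(
  "---",                         # header starts
  "title: \"My first report\"",
  "output: html_document",       # specified output format
  "---",                         # header ends
  "",
  "## A heading",                # plain text (markdown)
  "",
  "Some *formatted* text.",
  "",
  "```{r}",                      # code chunk starts
  "summary(cars)",               # R code inside the chunk
  "```"                          # code chunk ends
)

rmd_file <- file.path(tempdir(), "myfile.Rmd")
writeLines(rmd_lines, rmd_file)

# Print the file as it would appear in the RStudio editor
cat(readLines(rmd_file), sep = "\n")
```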

  47. rmarkdown: documentation, communication
    myfile.Rmd (markdown + R code chunks)

  48. rmarkdown: documentation, communication
    myfile.Rmd (markdown + R code chunks)
    -> execute R code -> myfile.md (markdown)

  49. rmarkdown: documentation, communication
    myfile.Rmd (markdown + R code chunks)
    -> execute R code -> myfile.md (markdown)
    -> pandoc conversion
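
The two stages on the slides above can be tried from the R console. This is a sketch under assumptions: it needs the knitr and rmarkdown packages plus pandoc, so each step is skipped when they are missing, and the demo file contents are invented. In RStudio, the Knit button runs both stages for you.

```r
# Write a tiny, hypothetical input file to a temporary directory
rmd_file <- file.path(tempdir(), "myfile.Rmd")
writeLines(
  c("---", "title: \"Demo\"", "output: html_document", "---", "",
    "Two plus two:", "", "```{r}", "2 + 2", "```"),
  rmd_file
)

# Stage 1: execute the R code, turning myfile.Rmd into myfile.md
if (requireNamespace("knitr", quietly = TRUE)) {
  md_file <- knitr::knit(rmd_file,
                         output = file.path(tempdir(), "myfile.md"),
                         quiet = TRUE)
  print(md_file)
}

# Stages 1 and 2 together: render() knits the file, then calls pandoc
# to convert the markdown into the requested output format
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  html_file <- rmarkdown::render(rmd_file, quiet = TRUE)
  print(html_file)
}
```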

  50. rmarkdown documentation, communication
    rmarkdown.rstudio.com


  51. coding coding coding coding
    coding coding coding coding
    coding coding coding coding
    coding coding coding coding
    coding coding coding coding
    let’s give it a try!


  52. what did we (hopefully) cover?
    create a new Rmd file
    writing simple markdown
    creating code chunks
    executing code
    knitting documents

  53. some pieces in the modern (R) data scientist’s toolbox
    rmarkdown        documentation, communication
    tidyverse        data manipulation, visualization
    shiny            web application framework
    [bioconductor]   community of genomics packages

  54. some pieces in the modern (R) data scientist’s toolbox
    rmarkdown        documentation, communication
    tidyverse        data manipulation, visualization
    shiny            web application framework
    [bioconductor]   community of genomics packages

  55. tidyverse: data manipulation, visualization
    tidyverse.org
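
As a small taste of what "data manipulation, visualization" can mean in practice, here is a sketch using two tidyverse packages, dplyr and ggplot2, on R's built-in `mtcars` data set. The choice of data set and summary is an assumption made for illustration; the block is skipped if the packages are not installed.

```r
# Assumes dplyr and ggplot2 are installed; otherwise nothing is run.
if (requireNamespace("dplyr", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE)) {
  library(dplyr)
  library(ggplot2)

  # Data manipulation: average fuel economy per cylinder count
  mpg_by_cyl <- mtcars %>%
    group_by(cyl) %>%
    summarize(mean_mpg = mean(mpg), n_cars = n())
  print(mpg_by_cyl)

  # Visualization: car weight against fuel economy, colored by cylinder count
  p <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
    geom_point()
  print(p)
}
```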

  56. some pieces in the modern (R) data scientist’s toolbox
    rmarkdown        documentation, communication
    tidyverse        data manipulation, visualization
    shiny            web application framework
    [bioconductor]   community of genomics packages

  57. shiny: web application framework
    shiny.rstudio.com

  58. shiny: web application framework
    shiny.rstudio.com/gallery/kmeans-example
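
To give a feel for how little code a Shiny app needs, here is a minimal sketch of a k-means app. It is loosely inspired by the idea of the gallery example linked above but is not that example's code; the input and output names (`k`, `clusters`) and the two plotted columns are choices made here for illustration. The block is skipped if shiny is not installed.

```r
# Assumes the shiny package is installed; otherwise nothing is run.
if (requireNamespace("shiny", quietly = TRUE)) {
  library(shiny)

  # User interface: a slider for the number of clusters and a plot
  ui <- fluidPage(
    titlePanel("k-means on iris"),
    sliderInput("k", "Number of clusters", min = 1, max = 6, value = 3),
    plotOutput("clusters")
  )

  # Server logic: re-run k-means whenever the slider changes
  server <- function(input, output) {
    output$clusters <- renderPlot({
      pts <- iris[, c("Sepal.Length", "Sepal.Width")]
      fit <- kmeans(pts, centers = input$k)
      plot(pts, col = fit$cluster, pch = 19)
    })
  }

  app <- shinyApp(ui = ui, server = server)
  # runApp(app)   # uncomment in an interactive session to launch the app
}
```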

  59. some pieces in the modern (R) data scientist’s toolbox
    rmarkdown        documentation, communication
    tidyverse        data manipulation, visualization
    shiny            web application framework
    [bioconductor]   community of genomics packages

  60. bioconductor: community of genomics packages
    bioconductor.org

  61. bioconductor: community of genomics packages
    CRAN Bioconductor
    • genomic focus
    • software
    • annotations
    • data
    • package reviews
    • scope
    • consistency
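
One practical difference between the two repositories is how packages are installed. The sketch below shows the usual commands, commented out so nothing is downloaded when the script runs; `limma` is picked here only as an example of a Bioconductor package and is not mentioned in the slides.

```r
# One-time setup (commented out to avoid installing on every run):
# install.packages("BiocManager")    # BiocManager itself is installed from CRAN
# BiocManager::install("limma")      # Bioconductor packages go through BiocManager

# Check which of the two packages are already available in this R library
available <- vapply(c("BiocManager", "limma"),
                    requireNamespace, logical(1), quietly = TRUE)
print(available)
```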

  62. awesome!


  63. where do we go
    from here?


  64. where do we go
    from here?
    wait, I’m lost


  65. introduction to data science
    rafalab.github.io/dsbook

  66. RStudio Cloud tutorials
    rstudio.cloud

  67. R for data science
    r4ds.had.co.nz

  68. learn the tidyverse
    tidyverse.org/learn

  69. advanced R
    adv-r.hadley.nz

  70. biomedical data science open online training
    rafalab.github.io/pages/harvardx

  71. … and because it’s 2019, deep learning
    tensorflow.rstudio.com

  72. questions?
    wait, I’m still lost
