RNAseq: A five course meal

These slides are from a talk I gave as part of a two-week next-generation sequencing analysis workshop (https://angus.readthedocs.io/en/2019/). In this talk, I provide a framework for understanding RNA-seq workflows based on the R for Data Science mantra of "import, tidy, transform, visualize, model, and communicate" (https://r4ds.had.co.nz/). I also provide strategies for "healthy" workflows by making analogies to a five-course meal.

Rayna M Harris

July 05, 2019

Transcript

  1. 3.

    Special thanks to the Data Intensive Biology Lab and the Birds, Brain, & Banter Lab
    http://calisilab.ucdavis.edu/ http://ivory.idyll.org/lab/
  2. 4.

    I learned R and RNAseq in communities of practice
    The University of Texas at Austin; University of California, Davis; Data Carpentry; Software Carpentry; The Carpentries-es; @cienciaPR @RLadiesGlobal @RLadiesBA @r4ds_es #DISBI2018
  3. 10.

    Data snacks and source code
    `library()` `source()` `data()`
    Substantial, complex, noteworthy, valuable
    Provides insights into what follows
  4. 19.

    “You can’t have any pudding if you don’t eat yer meat. How can you have any pudding if you don’t eat yer meat?” Pink Floyd
    Dessert
  5. 20.

    “Communication breakdown, it’s always the same. I’m having a nervous breakdown, drive me insane.” Led Zeppelin
    Nuts
  6. 23.

    If you were at a banquet, what would you order?
    A. Lobster
    B. Steak
    C. Lobster and steak
    D. Other
    E. Lobster, steak, and other
  7. 24.

    If you were an RNAseq workflow, would you use:
    A. R
    B. Python
    C. R and Python
    D. Other
    E. An R/Python/other mix
  8. 25.

    FlavoRs of differential gene expression models
    library("DESeq2")
    # cts: count matrix; coldata: sample table with a "condition" column
    dds <- DESeqDataSetFromMatrix(countData = cts, colData = coldata, design = ~ condition)
    dds <- DESeq(dds)  # fit the model; results() requires this step
    results(dds, contrast = c("condition", "B", "A"))
    http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#differential-expression-analysis
  9. 26.

    Strategies for a well-balanced RNAseq analysis
    Slide labels: R and/or Python or other, library(), data(), counts, dds, colData, Rmarkdown, Jupyter, GitHub pages, ~ condition, Wrangle, Explore, ggplot(), matplotlib, Communicate
  10. 35.

    My progress: from a novice with data and tools
    https://github.com/raynamharris/DissociationTest
    Harris, Kao, Alarcón, Hofmann, Fenton 2017 https://www.biorxiv.org/content/10.1101/153585v1
  11. 36.

    To a practitioner with reproducible workflows
    https://github.com/raynamharris/DissociationTest
    Harris, Kao, Alarcón, Hofmann, Fenton 2019 https://onlinelibrary.wiley.com/doi/10.1002/hipo.23095
  12. 40.

    Create functions to run all pairwise comparisons
    `contrast = c("treatment", "varB", "varA")`
    https://github.com/macmanes-lab/DoveParentsRNAseq/
  13. 44.

    Should we be analyzing all the data all the time?
    60% training set, 20% query set, 20% testing set
  14. 47.

    Framework for understanding RNAseq workflows
    Slide labels: R and/or Python or other, library(), data(), counts, dds, colData, ggplot(), matplotlib, Rmarkdown, Jupyter, GitHub pages, ~ condition
  15. 48.

    Strategies for creating healthy RNAseq workflows
    Slide labels: R and/or Python or other, library(), data(), counts, dds, colData, ggplot(), matplotlib, Rmarkdown, Jupyter, GitHub pages, ~ condition, Communicate, Wrangle, Explore, Model
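
Slide 40 suggests creating functions to run all pairwise comparisons. As a minimal sketch of that idea, not the code from the linked repository, the Python below enumerates every pair of levels for one factor and prints each pair in the `contrast = c(factor, numerator, denominator)` form that DESeq2's `results()` accepts. The function name `pairwise_contrasts` and the level names are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def pairwise_contrasts(factor, levels):
    """Return one (factor, numerator, denominator) tuple per pair of levels."""
    # combinations() yields each unordered pair exactly once, in input order.
    return [(factor, b, a) for a, b in combinations(levels, 2)]


# Hypothetical treatment levels, used only for illustration.
levels = ["control", "varA", "varB"]
contrasts = pairwise_contrasts("treatment", levels)

for factor, numerator, denominator in contrasts:
    # Each tuple mirrors the R form contrast = c(factor, numerator, denominator).
    print(f'contrast = c("{factor}", "{numerator}", "{denominator}")')
```

Generating the contrasts from the list of levels, rather than typing each one by hand, keeps the set of comparisons complete when a level is added or renamed.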