Slide 1

Slide 1 text

R/Bioconductor-powered Team Data Science Leonardo Collado-Torres Louise A. Huuki-Myers Joshua M. Stolz Nicholas J. Eagles + Geo Pertea https://lcolladotor.github.io/bioc_team_ds/ April 20, 2022 https://speakerdeck.com/lcolladotor/libd-ds-tldr

Slide 2

Slide 2 text

DNA Genotyping: PopTop Joshua M. Stolz

Slide 3

Slide 3 text

Benefit of TOPMed
● TOPMed offers a reference panel covering far more SNPs (~300 million)
● The Rsq value for lower MAF is preserved in populations of African ancestry
● Can this increased power be used to include lower MAFs?
● Should we just filter by Rsq?
MAF: minor allele frequency
Rsq: R squared
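The two filtering options raised on this slide can be sketched as follows. This is a minimal illustration, assuming the per-variant Rsq and per-sample dosages have already been parsed from the imputed output; the `Variant` record, the example values, and the thresholds (Rsq >= 0.3, MAF >= 0.1) are hypothetical and are not PopTop's actual settings.

```python
from typing import List, NamedTuple


class Variant(NamedTuple):
    """Hypothetical record for one imputed SNP (not a PopTop data structure)."""
    snp_id: str
    rsq: float             # imputation quality (R squared)
    dosages: List[float]   # alternate-allele dosage per sample, each in [0, 2]


def minor_allele_frequency(dosages: List[float]) -> float:
    """MAF = min(p, 1 - p), where p is the alternate-allele frequency."""
    alt_freq = sum(dosages) / (2 * len(dosages))
    return min(alt_freq, 1 - alt_freq)


def passes_filters(v: Variant, min_rsq: float, min_maf: float) -> bool:
    """Keep a variant only if it clears both the Rsq and the MAF threshold."""
    return v.rsq >= min_rsq and minor_allele_frequency(v.dosages) >= min_maf


# Made-up variants: one common, two rare with good vs. poor imputation quality
variants = [
    Variant("snp_common", 0.95, [0, 1, 2, 1, 0, 1, 1, 0]),
    Variant("snp_rare_good", 0.90, [0, 0, 0, 1, 0, 0, 0, 0]),
    Variant("snp_rare_poor", 0.10, [0, 0, 0, 1, 0, 0, 0, 0]),
]

# Filtering by Rsq alone keeps the well-imputed rare variant
kept_rsq_only = [v.snp_id for v in variants if v.rsq >= 0.3]
# Adding a MAF cutoff drops it again
kept_both = [v.snp_id for v in variants if passes_filters(v, 0.3, 0.1)]
```

Filtering on Rsq alone retains rare but well-imputed SNPs, which is the trade-off the last two bullets ask about.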

Slide 4

Slide 4 text

PopTop: Nextflow Pipeline
● Parallelizes computationally expensive tasks
● Allows for the automation of large jobs
● Documentation is available at:
  ○ https://research.libd.org/Topmed-Imputation-Pipeline/

Slide 5

Slide 5 text

TOPMed Output: VCF Format

Slide 6

Slide 6 text

Future Work
● Working to make this modular with the upcoming LIBD Data Portal.
● Writing scripts to deliver subsets of samples in a feasible amount of time.
● Continuing to maintain and update the documentation website to make it more robust and user friendly.
@geo_pertea Geo Pertea @Nick-Eagles (GH) Nicholas J Eagles

Slide 7

Slide 7 text

Bulk RNA-seq Processing: SPEAQeasy Nicholas J. Eagles

Slide 8

Slide 8 text

Bulk RNA-seq Processing: Motivation and Challenges
- Data processing should be uniform across time/datasets, documented, and reproducible
  - What aligner was used for this dataset?
  - Did we use hg19 or hg38? What GENCODE release was used?
  - How did we decide which samples to trim, if any?
  - Which version of FastQC was used?
- While many computational steps are involved before analyses (e.g. DE) are possible, data pre-processing should ideally not require technical expertise to apply

Slide 9

Slide 9 text

SPEAQeasy Workflow https://github.com/LieberInstitute/SPEAQeasy Raw sequencing reads R objects, ready for statistical analysis

Slide 10

Slide 10 text

Manuscript and Documentation https://doi.org/10.1186/s12859-021-04142-3 http://research.libd.org/SPEAQeasy/

Slide 11

Slide 11 text

Example Analysis - Demonstrate how to run SPEAQeasy on real data and work with its outputs - Use variant calling results to find and resolve identity issues originating from labelling mistakes - Perform a differential analysis after attaching experiment-specific sample metadata http://research.libd.org/SPEAQeasy-example @lahuuki Louise A Huuki-Myers @JoshStolz2 Joshua M Stolz

Slide 12

Slide 12 text

Configuration
- Each user is recommended to install SPEAQeasy separately
  - git clone git@github.com:LieberInstitute/SPEAQeasy.git
  - cd SPEAQeasy
  - bash install_software.sh "jhpce"
- Control:
  - exact annotation files used (GENCODE/Ensembl versions)
  - command-line arguments to software
  - how to trim samples, if at all
  - aligner (HISAT2/STAR) and pseudo-aligner (kallisto/salmon)
/dcs04/lieber/ds2a/Data/CMC/Data/RNAseq/Raw/SPEAQeasy

Slide 13

Slide 13 text

Recent Improvements and Future Work
Post-publication improvements:
- Alignment-related optimizations resulting in a reduction in disk space and computational time
- Support for Singularity, leading to new Cardiff users and greatly expanding possible users (collaboration with Nick Clifton)
- Added raw counts for transcripts in addition to TPM
Future improvements:
- Want to allow a user to only perform alignment, or only quantify transcripts
- Reduce required memory when counting junctions to produce R objects
@geo_pertea Geo Pertea

Slide 14

Slide 14 text

WGBS Processing: BiocMAP Nicholas J. Eagles

Slide 15

Slide 15 text

WGBS: Motivation and Challenges
- Profile DNA methylation, a critical epigenetic modification, across the entire human genome
  - Differentially methylated regions (DMRs), e.g. between schizophrenia and controls
  - Methylation quantitative trait loci (meQTLs)
- Large data size (> 1B cytosines measured, ~2TB disk space per sample)
  - Careful choices must be made to fit files on JHPCE, generate/load results with available memory
- Several steps are required before raw sequenced reads yield methylation proportions ready for analysis
  - Trimming, alignment to reference genome, extraction of methylation proportions, import to R

Slide 16

Slide 16 text

TL;DR Raw sequencing reads R objects, ready for statistical analysis https://github.com/LieberInstitute/BiocMAP

Slide 17

Slide 17 text

Manuscript and Documentation http://research.libd.org/BiocMAP/

Slide 18

Slide 18 text

Example Vignette
https://github.com/LieberInstitute/BiocMAP/blob/master/documentation/example_analysis/age_neun_analysis.pdf
https://github.com/LieberInstitute/BiocMAP/blob/master/documentation/example_analysis/example_analysis.pdf
Price et al., Genome Biology, 2019 https://doi.org/10.1186/s13059-019-1805-1

Slide 19

Slide 19 text

Workflow in Practice - Datasets where we’ve applied BiocMAP: - 664 PsychENCODE Schizophrenia/control samples (Hippocampus, DLPFC, Caudate) - 20 PsychENCODE fetal samples - 2597 VA PTSD samples (year 1 and 2) - How long does it take to run? - ~3 months for 648 samples (second module) - ~2 weeks for 43 samples (both modules at JHPCE) - How much disk space is required? - ~2TB disk space per sample while generating; 1TB outputs

Slide 20

Slide 20 text

LIBD Data Portal Geo Pertea

Slide 21

Slide 21 text

LIBD Data integration • relational database tracking data assets at LIBD • linking LIMS to processed data • flexible database back-end & indexed file storage • unified web interface for data queries

Slide 22

Slide 22 text

Experiment data flow
[Diagram labels: brains, histological samples, extraction, sequencing, sequencing samples, processing]
subjects: id, brnum, brint, age, sex, race, dx_id
samples: id, name, subj_id, region, sdate
exp_metadata: id, dataset_id, s_id, s_name, sample_id, protocol, restricted, numReads, numMapped, totalMapped, overallMapRate, ..., mitoRate, rRNA_rate, totalAssignedGene
exp_data: exp_id, dtype, ftype (G / E / J / T), f_set_id, f_data real[], version
[Storage labels: H5, filesystem, PostgreSQL database, assay data, Parquet]

Slide 23

Slide 23 text

PostgreSQL relational database demographic data experiment metadata histological sample metadata genomic features (annotations) assay data id subj_id dnum sample_id panel_id batch_id call_rate p10gc, p50gc nPennCNV [ ] SUM16,SUM20 imputation data_path genotype location of data files on file system storage

Slide 24

Slide 24 text

Integration of PostgreSQL and R from back-end to front-end Leveraging R’s data processing and visualization capabilities SQL + R code: Front-end (web application) Back-end PostgreSQL server client selects dataset (sample metadata only) client receives results & plot data middleware (nodejs) retrieve sample data process large data output results SQL / R server returns results & baked plot data (plotly JSON ) srv16

Slide 25

Slide 25 text

sc/snRNA-seq Louise A. Huuki-Myers Joshua M. Stolz With help from: @mattntran Matthew N Tran

Slide 26

Slide 26 text

https://bioconductor.org/packages/3.14/SingleCellExperiment

Slide 27

Slide 27 text

https://doi.org/10.1038/s41592-019-0654-x https://bioconductor.org/books/release/OSCA @stephaniehicks Stephanie C Hicks

Slide 28

Slide 28 text

Quality control + normalization
● emptyDrops() from DropletUtils
  ○ Determine the empty droplets
● isOutlier() from scuttle
  ○ Identify outlier cells/nuclei based on mitochondrial expression and other metrics
● devianceFeatureSelection() + nullResiduals() from scry
  ○ GLM-PCA approximation by Townes, Hicks, Aryee, and Irizarry https://doi.org/10.1186/s13059-019-1861-6
● reducedMNN() from batchelor
  ○ Batch correction since sc/snRNA-seq has strong sample effects
● + much more before you get to annotated clusters of cells
@mattntran Matthew N Tran @Erik-D-Nelson (GH) Erik D Nelson

Slide 29

Slide 29 text

1vAll Markers vs. Mean Ratio Markers https://research.libd.org/DeconvoBuddies/ @lahuuki Louise A Huuki-Myers

Slide 30

Slide 30 text

Deconvolution Louise A. Huuki-Myers

Slide 31

Slide 31 text

What is Deconvolution?
● Inferring the composition of different cell types in bulk RNA-seq data
● Get single-cell-like information from bulk RNA-seq
[Figure labels: Tissue, Bulk RNA-seq, snRNA-seq, Deconvolution, Estimated proportions; cost labels: $$$, $, Free!]
https://twitter.com/BoXia7/status/1261464021322137600
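The idea can be illustrated with a toy two-cell-type mixture. This is only a sketch under simplifying assumptions (bulk expression is a noise-free weighted sum of two known cell-type profiles); the profiles and gene values are made up, and it is not the algorithm used by Bisque or MuSiC.

```python
def estimate_proportion(bulk, profile_a, profile_b):
    """Least-squares estimate of p in bulk ~ p * profile_a + (1 - p) * profile_b."""
    diff = [a - b for a, b in zip(profile_a, profile_b)]
    numerator = sum((y - b) * d for y, b, d in zip(bulk, profile_b, diff))
    denominator = sum(d * d for d in diff)
    p = numerator / denominator
    return min(1.0, max(0.0, p))  # clamp so the proportion stays in [0, 1]


# Hypothetical mean expression of four marker genes in two cell types,
# e.g. as could be summarized from a reference snRNA-seq dataset
neuron = [10.0, 8.0, 1.0, 0.5]
glia = [1.0, 0.5, 9.0, 7.0]

# Simulated bulk sample made of 30% neuron and 70% glia
true_p = 0.3
bulk = [true_p * n + (1 - true_p) * g for n, g in zip(neuron, glia)]

p_neuron = estimate_proportion(bulk, neuron, glia)
print(f"neuron: {p_neuron:.2f}, glia: {1 - p_neuron:.2f}")
```

Real methods handle many cell types, noisy counts, and differences between the bulk and single-nucleus assays, which is why the choice of marker genes and reference data matters.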

Slide 32

Slide 32 text

Mean Proportions By Region: Tran et al, bioRxiv, 2020 (5 donors, 6 cell types)

Slide 33

Slide 33 text

Peric = Mural + Endo Mean Proportions By Region: Tran et al, Neuron, 2021 (8 donors, 10 cell types)

Slide 34

Slide 34 text

Bisque & MuSiC vs SPLITR
● Bisque has a more similar pattern of composition over regions vs. SPLITR
● MuSiC predicts large proportions of Endo + Mural (Peric)
● Both estimate lower proportions of Excit
  ○ MuSiC is more extreme and also predicts a low proportion of Inhib
Different deconvolution methods, bulk RNA-seq data source, marker genes, and reference snRNA-seq data

Slide 35

Slide 35 text

Method Sensitivity to Marker Set
● Run with sets of 20 & 25 marker genes per cell type
● Bisque is more robust to changes in the marker set than MuSiC
[Figure: 25 vs. 20 Genes]
Currently Bisque is our method of choice

Slide 36

Slide 36 text

Dataset | Regions | Samples | Case | Control | Analysis | Publication
BipSeq | sACC + AMY | 511 | 247 BPD | 264 | Revisions | Zandi et al., Nat. Neurosci, 2022
Suicide Genomics | DLPFC | 329 | 226 | 103 | Revisions | Punzi et al., American Journal of Psychiatry, 2022
BrainSeq Phase III | Caudate | 464 | 298 SCZD | 266 | Revisions | Benjamin et al., Nature Neuroscience, 2022
MDDseq | sACC + AMY | 1091 | 704 MDD/BPD | 387 | Main | In Progress
AANRI | DG, Caudate, Hippo, DLPFC | 1647 (263, 464, 447, 453) | - | - | Main | In Progress
Astellas AD | | | | | Main | In Progress
BrainSeq Phase I | DLPFC | 727 | 395 SCZD | 332 | Exploratory | -
BrainSeq Phase II | DLPFC | 453 | 153 SCZD | 300 | Exploratory | -
GTEx | 13 Regions | 2670 | - | - | Exploratory | -
Degradation | AMY, Caudate, DLPFC, HIPPO, mPFC, sACC | 119 | - | - | Exploratory | -

Slide 37

Slide 37 text

Upcoming: Deconvolution Methods Benchmark
● Goal: determine the most accurate deconvolution method for brain bulk RNA-seq data
  ○ Test available software (Bisque, MuSiC, and others) over a variety of conditions
    ■ Reference set qualities
    ■ Marker gene selection
    ■ Preparation of the bulk data
● Requires: A “gold standard” cell type composition reference to measure performance
  ○ snRNA-seq can be enriched for certain cell types
  ○ smFISH + RNAscope allows “direct” measurement from intact tissue, will be used to establish true composition

Slide 38

Slide 38 text

Goals for RNAscope Experiment
● Deconvolution R01 MH123183
  ○ Kristen Maynard, Stephanie C Hicks
● Use six slices of DLPFC to generate corresponding RNA-seq & RNAscope data
● This information will be useful to evaluate and design deconvolution algorithms
[Figure labels: DLPFC; Bulk RNA-seq (polyA, RiboZero); snRNA-seq; Spatial; RNAscope]
@kr_maynard Kristen R Maynard @stephaniehicks Stephanie C Hicks Kelsey D Montgomery

Slide 39

Slide 39 text

What is a TREG? ● Total RNA Expression Gene ● Expression is proportional to the overall RNA expression in a cell ● In smFISH the count of TREG puncta in a cell can estimate the RNA content ○ Linking RNA content to nucleus size http://research.libd.org/TREG/ http://bioconductor.org/packages/TREG/

Slide 40

Slide 40 text

eQTLs Louise A. Huuki-Myers

Slide 41

Slide 41 text

Key inputs
● Genotype Data
  ○ Consider minor allele frequency
  ○ Full TOPMed imputed SNP data set
  ○ Risk SNP subset
● Expression Data
  ○ Gene, exon, junction, transcript
  ○ Position of the feature
● Covariates Data
  ○ Phenotype data: Dx, Age, Sex
  ○ Feature PCs
● Interaction Data
  ○ Example: cell fractions from deconvolution
● Parameters
  ○ Window size
  ○ Minor allele frequency
[Workflow diagram: PopTop genotype data → PLINK files containing SNPs of interest; SPEAQeasy-generated SummarizedExperiment → feature position + expression matrix and covariate data as matrix; deconvolution or other analysis → interaction vector (only for interaction analysis); all inputs → tensorQTL + parameters → eQTL results]

Slide 42

Slide 42 text

MatrixEQTL vs tensorQTL (fastQTL)
MatrixEQTL
● R package
● Many Andrew E Jaffe analyses:
  ○ BrainSeq Phase II
  ○ Burke et al stem cell
  ○ …
  ○ BipSeq by Zandi et al
tensorQTL
● Python, GPU enabled
● Currently utilized in MDDseq project
● Recommended upgrade by Andrew Jaffe, utilized by other LIBD researchers
● github.com/broadinstitute/tensorqtl
https://youtu.be/zOMUXYHtVJM

Slide 43

Slide 43 text

Genome-wide eQTLs: several flavors ● Nominal: evaluate all pairs ● Cis: find most significant pair per feature ● Independent: conditionally independent cis-QTLs using stepwise regression
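The relationship between the nominal and cis flavors can be sketched as a reduction over the nominal table. This is a simplified illustration with made-up feature names, SNP IDs, and p-values; real tools typically also correct for the number of variants tested per feature, which this sketch omits.

```python
# Hypothetical nominal results: one row per (feature, SNP) pair with its p-value
nominal = [
    ("geneA", "rs1", 0.04),
    ("geneA", "rs2", 1e-6),
    ("geneA", "rs3", 0.20),
    ("geneB", "rs2", 0.50),
    ("geneB", "rs4", 0.01),
]


def top_pair_per_feature(rows):
    """Keep the most significant (smallest p-value) SNP for each feature."""
    best = {}
    for feature, snp, pval in rows:
        if feature not in best or pval < best[feature][1]:
            best[feature] = (snp, pval)
    return best


cis_like = top_pair_per_feature(nominal)
for feature, (snp, pval) in sorted(cis_like.items()):
    print(feature, snp, pval)
```

The independent flavor would then condition on each top SNP and repeat the search, which is not shown here.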

Slide 44

Slide 44 text

tensorQTL at JHPCE (GPU-powered)
Data Formatting
● Genotype Data
  ○ Needs PLINK .bed/.bim/.fam files
● Expression Data
  ○ Gene, exon, junction, transcript
  ○ As .bed.gz
● Covariates Data
  ○ Phenotype data: Dx, Age, Sex, Feature PCs
    ■ Categorical variables must be converted to numeric
  ○ File type flexible, need to read in as pandas.DataFrame
How to Run on GPU
● Can be used as a function in a Python script or as a command-line tool
  ○ Requires conversion to correct data formats
● Fast when run on GPU
  ○ Completed MDDseq Amygdala Gene analysis in 2.52 min vs 51.21 min on CPU (vs. 288 min with MatrixEQTL)
    ■ 540 samples x 53.6M pairs
● Use GPU queue when submitting job
  ○ Example sh file: #$ -l gpu,mem_free=50G,h_vmem=50G,h_fsize=100G
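The "categorical variables must be converted to numeric" step can be sketched in plain Python as below. The sample IDs, values, and the choice of "Control" as the reference level are hypothetical; in practice the same encoding could be done with `pandas.get_dummies` before building the covariates `pandas.DataFrame`.

```python
# Hypothetical phenotype rows; column names follow the slide (Dx, Age, Sex)
samples = [
    {"id": "S1", "Dx": "MDD", "Age": 45.2, "Sex": "F"},
    {"id": "S2", "Dx": "Control", "Age": 51.0, "Sex": "M"},
    {"id": "S3", "Dx": "BPD", "Age": 38.7, "Sex": "M"},
]


def encode_covariates(rows, reference_dx="Control"):
    """Replace categorical columns with 0/1 indicator columns.

    The reference diagnosis gets no column of its own, so it is
    represented by zeros in every Dx_* indicator.
    """
    dx_levels = sorted({r["Dx"] for r in rows} - {reference_dx})
    encoded = {}
    for r in rows:
        numeric = {
            "Age": float(r["Age"]),
            "Sex_M": 1.0 if r["Sex"] == "M" else 0.0,
        }
        for level in dx_levels:
            numeric[f"Dx_{level}"] = 1.0 if r["Dx"] == level else 0.0
        encoded[r["id"]] = numeric
    return encoded


covariates = encode_covariates(samples)
for sample_id, values in covariates.items():
    print(sample_id, values)
```

Dropping the reference level avoids a redundant column once an intercept is part of the model.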

Slide 45

Slide 45 text

GWAS-loci eQTL analysis
● Subset genotype dataset to SNPs identified as risk loci by GWAS
● Check for association with cellular fractions predicted by deconvolution
  ○ Run nominal analysis w/ addition of interaction vector
  ○ Adds interaction term to the model
    ■ p ~ g + i + gi
[Figures: PGC Major Depressive Disorder GWAS (Wray et al., Nature Genetics, 2018); Deconvolution Results]
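The interaction model p ~ g + i + gi can be written out as an ordinary least-squares fit. The sketch below assumes p is the expression of one feature, g the genotype dosage, and i the interaction variable (for example a cell fraction); the simulated data and coefficient values are made up, and this is a plain-Python illustration rather than tensorQTL's implementation.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Eliminate this column from every other row
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [x - factor * y for x, y in zip(aug[row], aug[col])]
    return [aug[k][n] / aug[k][k] for k in range(n)]


def fit_interaction_model(p, g, i):
    """OLS for p = b0 + b_g*g + b_i*i + b_gi*(g*i) via the normal equations."""
    design = [[1.0, gk, ik, gk * ik] for gk, ik in zip(g, i)]
    k = 4
    xtx = [[sum(row[r] * row[c] for row in design) for c in range(k)]
           for r in range(k)]
    xty = [sum(row[r] * y for row, y in zip(design, p)) for r in range(k)]
    return solve_linear_system(xtx, xty)


# Simulated, noise-free data: dosages 0/1/2 crossed with three cell fractions
g = [0, 1, 2, 0, 1, 2, 0, 1, 2]
i = [0.1, 0.1, 0.1, 0.5, 0.5, 0.5, 0.9, 0.9, 0.9]
p = [2.0 + 0.5 * gk - 1.0 * ik + 1.5 * gk * ik for gk, ik in zip(g, i)]

b0, b_g, b_i, b_gi = fit_interaction_model(p, g, i)
```

A non-zero b_gi is what an interaction eQTL reports: the genotype effect on expression changes with the cell fraction.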

Slide 46

Slide 46 text

Interaction eQTLs with cell type proportions https://github.com/LieberInstitute/goesHyde_mdd_rnaseq/tree/master/eqtl/code

Slide 47

Slide 47 text

Quality Surrogate Variable Analysis (qSVA) Joshua M. Stolz

Slide 48

Slide 48 text

Differential expression is confounded by degradation
The t-statistics between SCZ vs Control and degradation time DE are correlated. Traditional methods (like RIN) fail to remove this effect.
Jaffe AE, Tao R, Norris AL, Kealhofer M, Nellore A, Shin JH, et al. qSVA framework for RNA quality correction in differential expression analysis. Proc Natl Acad Sci U S A. 2017;114:7130–5.

Slide 49

Slide 49 text

qSVA Original Process
Each sample was allowed to degrade on a bench for 0, 15, 30, and 60 minutes. From this we get the top 1000 expressed regions associated with degradation.
Peterson, Amy. “Quality Surrogate Variable Analysis.” LIBD Rstats Club, LIBD Rstats Club, 11 Dec. 2018, research.libd.org/rstatsclub/2018/12/11/quality-surrogate-variable-analysis/

Slide 50

Slide 50 text

Updated pipeline 2000

Slide 51

Slide 51 text

Degradation is confounded by Region

Slide 52

Slide 52 text

Deconvolution @lahuuki Louise A Huuki-Myers

Slide 53

Slide 53 text

Deconvolution in Degradation Matrix ● Identify 2,976 degradation associated transcripts with cell proportion terms in model (vs. 1,792) ● Controlling expression for qSVs predicted with this set of transcripts shows lower correlations between DE results and degradation statistic (desired result) Cor = -0.091 Cor = -0.051

Slide 54

Slide 54 text

http://research.libd.org/qsvaR http://bioconductor.org/packages/qsvaR/ With ongoing feedback on the documentation from: @HeenaDivecha Heena R Divecha

Slide 55

Slide 55 text

Differential Gene Expression Louise A. Huuki-Myers

Slide 56

Slide 56 text

Key inputs
● Quality Controlled Expression Data
● Model & corresponding data
  ○ Primary Dx (explanatory variable)
  ○ Phenotype data (AgeDeath, Sex)
  ○ Quality control metrics
    ■ mitoRate, rRNA_rate, totalAssignedGene, RIN, ERCC
  ○ snpPCs
    ■ from DNA Genotyping
  ○ qSVs
    ■ from qSVA v1 or v2 (qsvaR)
  ○ Other Analysis
    ■ e.g. Deconvolution cell fractions
Model: ~Dx + pd + QC + snpPC + qSVs + ?
[Workflow diagram: deconvolution or other analysis, qsvaR degradation matrix, PopTop genotype data, and the SPEAQeasy-generated SummarizedExperiment feed model.matrix() (model matrix) and calcNormFactors() (normalized counts); limma + voom process: lmFit() → eBayes() → topTable() → DE results]

Slide 57

Slide 57 text

Modeling with limma: quick overview ● calcNormFactors() from edgeR ○ For normalization of the bulk RNA-seq counts ● model.matrix() from stats ○ Define how you want to model gene expression ○ Covariates like qSVs, ancestry PCs, SPEAQeasy QC metrics, sex, age, diagnosis, … ● lmFit() ○ Fit the linear regression model for all genes ● eBayes() ○ Use empirical Bayes to compute the statistics ● topTable() ○ Extract results for downstream analyses More details at http://bioconductor.org/packages/release/workflows/vignettes/RNAseq123/inst/doc/limmaWorkflow.html edgeR, DESeq2, dream are good alternatives

Slide 58

Slide 58 text

Why limma + voom?
● limma uses linear regression, whereas DESeq2 uses a negative binomial model
  ○ Comparable results
  ○ limma is less computationally expensive (faster)
● Other methods:
  ○ dream: linear mixed effect model (Hoffman et al. Bioinformatics, 2021)
    ■ More precise but computationally expensive
    ■ May be better for analyses with small sample sizes

Slide 59

Slide 59 text

Example DE code with limma
https://github.com/LieberInstitute/goesHyde_mdd_rnaseq/blob/master/differential_expression/code/run_DE.R

Slide 60

Slide 60 text

Interactively explore your model.matrix
● Important for exploring the DE model
http://bioconductor.org/packages/ExploreModelMatrix
https://doi.org/10.12688/f1000research.24187.2

Slide 61

Slide 61 text

Adding Cell Fraction to DE Model
● Including deconvolution results can lead to a more conservative model
  ○ For the most part similar t-statistics
  ○ Fewer significant DE genes

Slide 62

Slide 62 text

Spatially-resolved transcriptomics Leonardo Collado-Torres With help from: @abspangler Abby Spangler @lmwebr Lukas M Weber @stephaniehicks Stephanie C Hicks @MadhaviTippani Madhavi Tippani @sowmyapartybun Sowmya Parthiban @HeenaDivecha Heena R Divecha @PardoBree Brenda Pardo @Nick-Eagles (GH) Nicholas J Eagles @martinowk Keri Martinowich @kr_maynard Kristen R Maynard @CerceoPage Stephanie C Page

Slide 63

Slide 63 text

SpatialExperiment: infrastructure for spatially resolved transcriptomics data in R using Bioconductor
Righelli, Weber, Crowell, et al, bioRxiv, 2021 DOI 10.1101/2021.01.27.428431
Accepted at Oxford Bioinformatics on 04/19/2022
@drighelli Dario Righelli @CrowellHL Helena L Crowell @lmwebr Lukas M Weber

Slide 64

Slide 64 text

bioconductor.org/packages/spatialLIBD
Pardo et al, bioRxiv, 2021 DOI 10.1101/2021.04.29.440149
Accepted at BMC Genomics on 04/20/2022
Maynard, Collado-Torres, Nat Neuro, 2021
@PardoBree Brenda Pardo @abspangler Abby Spangler

Slide 65

Slide 65 text

http://research.libd.org/spatialLIBD/articles/TenX_data_download.html

Slide 66

Slide 66 text

OSTA: https://lmweber.org/OSTA-book/ @lmwebr Lukas M Weber @lcolladotor Leonardo Collado-Torres @abspangler Abby Spangler @HeenaDivecha Heena R Divecha @MadhaviTippani Madhavi Tippani @stephaniehicks Stephanie C Hicks

Slide 67

Slide 67 text

Data-driven clustering: BayesSpace Zhao et al, Nature Biotechnology, 2021 https://doi.org/10.1038/s41587-021-00935-2

Slide 68

Slide 68 text

Spatial registration of your sc/snRNA-seq data Your sc/snRNA-seq data Our spatial data Hodge et al, Nature, 2019 Maynard, Collado-Torres, Nat Neuro, 2021

Slide 69

Slide 69 text

Spot deconvolution: Tangram https://www.nature.com/articles/s41592-021-01264-7/figures/1 @Nick-Eagles (GH) Nicholas J Eagles

Slide 70

Slide 70 text

Our Philosophy + Getting Help

Slide 71

Slide 71 text

Share knowledge openly
● I am an independent researcher, and my team and I are not a data science core, yet we share our knowledge openly so others can get up to speed if needed
  ○ Share research results early through pre-prints (bioRxiv)
  ○ Share code on GitHub with others
    ■ GitHub is widely used as a social coding platform
  ○ Share code snippets that might be useful to others
  ○ Share our experiences
  ○ Maintain and share information on several Slack communication channels
  ○ People are free to adapt what we have done and we would love to learn about what others have come up with, since we might need to update/change our own work
    ■ We do not impose solutions or make decisions for others

Slide 72

Slide 72 text

Different types of help ○ Things we can do ■ Guidance, feasibility, and/or brainstorming ■ Data processing like with bulk RNA-seq with SPEAQeasy, DNA genotyping with PopTop, WGBS with BiocMAP, … ● We would strongly prefer that others learn how to run these tools ○ Aka, please use the documentation we wrote =) ■ Sharing data with external collaborators ● After internal LIBD approval by Rujuta Narurkar ○ Things that are beyond what we can typically do ■ Lead analysis ■ Develop and/or maintain custom solutions ■ Write papers

Slide 73

Slide 73 text

Data Science guidance sessions (DSgs)
● https://lcolladotor.github.io/bioc_team_ds/data-science-guidance-sessions.html
  ○ JHPCE
  ○ R
  ○ Bioconductor
  ○ Understanding code we wrote
  ○ Training others on how they can more effectively get help from us or others
    ■ Providing reproducible examples: reprex in R https://reprex.tidyverse.org/ which provides a solution to “help me help you”
    ■ Framing questions and software bug reports
● The DSgs system works best over the long term
  ○ It’s based on my 3 years of experience as a JHBSPH MPH capstone teaching assistant

Slide 74

Slide 74 text

https://jhpce.jhu.edu/knowledge-base/knowledge-base-articles-from-lieber-institute/ Join us Fridays at 9 AM (check the code of conduct please!)

Slide 75

Slide 75 text

https://www.youtube.com/c/LeonardoColladoTorres/playlists Videos allow us to multiply ourselves We can make you custom selections of videos for a specific problem on DSgs sessions

Slide 76

Slide 76 text

https://github.com/

Slide 77

Slide 77 text

https://github.com/LieberInstitute Email Bill Ulrich your GitHub username to get added @ckbehemoth (GH) William S Ulrich

Slide 78

Slide 78 text

https://github.com/search?q=org%3ALieberInstitute Example question: How do you use aggregateAcrossCells() ?

Slide 79

Slide 79 text

https://github.com/search?q=org%3ALieberInstitute+aggregateAcrossCells&type=code GitHub is our library / encyclopedia It could be yours / LIBD’s too! Code is the ultimate documentation Git commit messages remind you of what you were thinking when you made a change Bill Ulrich or Leo can give you access

Slide 80

Slide 80 text

Spatial registration code example
Code is constantly adapted and improved, both within and across projects
Project 1
● https://github.com/LieberInstitute/HumanPilot/blob/master/Analysis/Layer_Guesses/layer_specificity.R
● https://github.com/LieberInstitute/HumanPilot/blob/master/Analysis/Layer_Guesses/asd_snRNAseq_recast.R
Project 2
● https://github.com/LieberInstitute/spatialDLPFC/blob/main/code/analysis/07_spatial_registration/07_spatial_registration.R
Project 3
● https://github.com/LieberInstitute/Visium_IF_AD/blob/master/code/10_spatial_registration/01_spatial_registration.R
On the horizon: A new function at https://github.com/LieberInstitute/spatialLIBD
It’s hard to keep track of code evolution
● Try to include comments linking back to where you adapted it from
Basically: divide and conquer ^^
@sowmyapartybun Sowmya Parthiban @abspangler Abby Spangler

Slide 81

Slide 81 text

@lcolladotor Leonardo Collado-Torres @lahuuki Louise A Huuki-Myers @JoshStolz2 Joshua M Stolz @Nick-Eagles (GH) Nicholas J Eagles @geo_pertea Geo Pertea @abspangler Abby Spangler @mattntran Matthew N Tran @lmwebr Lukas M Weber @stephaniehicks Stephanie C Hicks @MadhaviTippani Madhavi Tippani @sowmyapartybun Sowmya Parthiban + Many more LIBD, JHU, and external collaborators @PardoBree Brenda Pardo @HeenaDivecha Heena R Divecha @ckbehemoth (GH) William S Ulrich @martinowk Keri Martinowich @kr_maynard Kristen R Maynard @CerceoPage Stephanie C Page