Statistical and Geometric Methods for the Analysis of Sequence Count Data

STATISTICAL AND GEOMETRIC METHODS FOR THE ANALYSIS OF SEQUENCE COUNT DATA
JUSTIN D. SILVERMAN
MEDICAL SCIENTIST TRAINING PROGRAM
COMPUTATIONAL BIOLOGY AND BIOINFORMATICS
DUKE UNIVERSITY
StatsAtHome.com
inschool4life

BACKGROUND
WHAT IS SEQUENCE COUNT DATA?
Multivariate count data Yij representing the number of transcripts of type j sequenced in sample i
Examples:
▸ 16S rRNA sequencing
▸ RNA-seq (± Single Cell)
▸ T-cell receptor sequencing
Extended Applications [Beyond Sequencing]:
▸ Multiparametric Flow Cytometry
▸ Political Polling

BACKGROUND
HUMANS HARBOR TREMENDOUS DIVERSITY
▸ ~100 trillion bacteria colonize epithelial surfaces
▸ 1-10X the number of human cells
▸ Each person hosts ~250 gut bacterial taxa (roughly the number of species in the North Carolina Zoo)
▸ Highly Dynamic!
[Figure: Microbial community composition at different body locations in a healthy human. Relative abundances of the six dominant bacterial phyla (Firmicutes, Actinobacteria, Bacteroidetes, Cyanobacteria, Fusobacteria, Proteobacteria) at each body site: external auditory canal, hair on the head, nostril, mouth, oesophagus, gastrointestinal tract, skin, penis, vagina. Spor, Koren & Ley, Nat Rev Micro, 2011]

BACKGROUND
EXAMPLE QUESTIONS OF INTEREST
Questions:
• How fast does composition change?
• Where is most of the temporal variation? (Partitioning Variation)

OVERVIEW
KEY CONTRIBUTIONS
▸ Framing sequence count data as Count Compositional
▸ Multinomial Logistic-Normal Model as a framework for analyzing sequence count data
▸ MALLARD for analysis of longitudinal sequence count studies
▸ General Inference Method based on marginalization using Kalman Filter
▸ + Experimental design for partitioning technical and biological variation
▸ Total Augmentation Methods to quantify compositional effects as well as address Differential Expression and Correlation Analysis

FRAMING

FRAMING SEQUENCE COUNT DATA
DATA COLLECTION AND SAMPLE PROCESSING
[Figure, adapted from Hamady et al., Nature Methods, 2008: Sample Collection and Storage → DNA Extraction → PCR Amplification → Sequencing]
Annotations on the pipeline: BIOLOGICAL VARIATION AND SIGNAL; TECHNICAL VARIATION AND BIAS; RANDOM SAMPLING

FRAMING SEQUENCE COUNT DATA
SAMPLE POOLING
Performed after PCR Amplification and before sequencing
[Figure: Barcoded DNA after PCR from Sample 1, Sample 2, and Sample 3 passes through DNA Quantification, Subsampling, and Pooling]

FRAMING SEQUENCE COUNT DATA
IMPACT OF MULTIVARIATE RANDOM SAMPLING
[Figure: Counts over Time for Group 1, Group 2, and Group 3. Panel A: True Abundance. Panel B: Abundance after Random Sampling.]
NO PROPORTIONS, NOT CLASSICALLY COMPOSITIONAL
BUT SAMPLING CAUSES COMPOSITIONAL-LIKE EFFECTS
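The effect on this slide can be reproduced with a small simulation. The sketch below is illustrative only: the abundances, the fixed depth of 1000 reads, and the helper name `subsample_counts` are assumptions, not values from the talk. Group 1 grows over time while Groups 2 and 3 keep a constant true abundance; after multinomial sampling at a fixed depth, the expected counts of Groups 2 and 3 fall even though their true abundance never changed.

```python
import numpy as np


def subsample_counts(true_abundance, depth, rng):
    """Draw sequencing-style counts at a fixed depth from true abundances.

    true_abundance: array of shape (T, D), one row per time point.
    depth: number of reads drawn per time point (an assumed constant).
    """
    true_abundance = np.asarray(true_abundance, dtype=float)
    proportions = true_abundance / true_abundance.sum(axis=1, keepdims=True)
    return np.vstack([rng.multinomial(depth, p) for p in proportions])


rng = np.random.default_rng(0)
time = np.arange(1, 21)

# Hypothetical true abundances: Group 1 grows, Groups 2 and 3 stay constant.
true_abundance = np.column_stack([
    250.0 * time,               # Group 1
    np.full(time.size, 400.0),  # Group 2
    np.full(time.size, 150.0),  # Group 3
])

observed = subsample_counts(true_abundance, depth=1000, rng=rng)

# Expected observed counts are depth * proportion, so a taxon with constant
# true abundance appears to decline when another taxon grows.
expected = 1000 * true_abundance / true_abundance.sum(axis=1, keepdims=True)
print(expected[[0, -1]].round(1))
```

Every row of `observed` sums to the chosen depth, which is why the counts carry information about relative rather than absolute abundance.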

GENERAL MODELING APPROACH

CENTRAL THEME
MULTINOMIAL-LOGISTIC NORMAL
• Handles sampling zeros, ≈ biological zeros, and conventional "Dropouts"
• Allows positive and negative covariation between taxa
• Models Multiplicative Errors
ILR = "Isometric Log-Ratio" Transform
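A minimal generative sketch of a multinomial logistic-normal model may help make these bullets concrete. It assumes one particular orthonormal ILR basis (a Helmert-style contrast matrix) and made-up values for the mean, covariance, and sequencing depth; the function names are hypothetical and this is not the implementation used in the talk. Latent coordinates are drawn from a multivariate normal, mapped to proportions with the inverse ILR transform, and counts are then drawn from a multinomial.

```python
import numpy as np


def ilr_basis(d):
    """Orthonormal contrast matrix of shape (d, d - 1); columns sum to zero."""
    basis = np.zeros((d, d - 1))
    for i in range(1, d):
        basis[:i, i - 1] = 1.0 / i
        basis[i, i - 1] = -1.0
        basis[:, i - 1] *= np.sqrt(i / (i + 1.0))
    return basis


def ilr(proportions, basis):
    """ILR transform: project log-proportions onto the contrast basis."""
    return np.log(proportions) @ basis


def ilr_inverse(eta, basis):
    """Map ILR coordinates back to proportions on the simplex."""
    unnormalized = np.exp(eta @ basis.T)
    return unnormalized / unnormalized.sum(axis=-1, keepdims=True)


def sample_mln(mu, sigma, depth, n_samples, rng):
    """Draw counts from a multinomial logistic-normal model."""
    basis = ilr_basis(mu.size + 1)
    eta = rng.multivariate_normal(mu, sigma, size=n_samples)
    pi = ilr_inverse(eta, basis)
    counts = np.vstack([rng.multinomial(depth, p) for p in pi])
    return counts, pi


rng = np.random.default_rng(1)
mu = np.zeros(2)
# A negative off-diagonal entry: the latent normal permits negative as well
# as positive covariation between coordinates.
sigma = np.array([[1.0, -0.5], [-0.5, 1.0]])
counts, pi = sample_mln(mu, sigma, depth=1000, n_samples=5, rng=rng)
print(counts)
```

Zeros in `counts` can arise purely from the multinomial draw even when every entry of `pi` is positive, which is the sense in which the model accommodates sampling zeros.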

PHILR
PHYLOGENETIC ISOMETRIC LOG-RATIO (PHILR) TRANSFORM
Silverman et al., eLife 2017
[Figure: Panel A: Unobserved Absolute Abundances of Bacteroides (#), Lactobacillus (#), and Ruminococcus (#) in Community A and Community B. Panel B: Observed Compositions, Bacteroides (%), Lactobacillus (%), Ruminococcus (%). Panel C: Balances Depicted on Phylogenetic Tree, with balances y1* and y2* (Balance of Bacteroides to Ruminococcus and Lactobacillus; Balance of Ruminococcus to Lactobacillus). Panel D: Transform in Simplex. Panel E: Data Embedded in PhILR Space.]
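The two balances named in the figure can be computed with the standard ILR balance formula, sqrt(r s / (r + s)) times the log-ratio of the geometric means of the two groups, where r and s are the group sizes. The sketch below uses made-up abundances for the three genera and a hypothetical helper `balance`; it is not the PhILR package code, and it omits the taxon and branch-length weights that PhILR can apply.

```python
import numpy as np


def balance(x, numerator, denominator):
    """ILR balance between two groups of parts of the vector x."""
    x = np.asarray(x, dtype=float)
    r, s = len(numerator), len(denominator)
    log_gmean_num = np.log(x[numerator]).mean()
    log_gmean_den = np.log(x[denominator]).mean()
    return np.sqrt(r * s / (r + s)) * (log_gmean_num - log_gmean_den)


# Order: Bacteroides, Ruminococcus, Lactobacillus (hypothetical abundances).
community_a = np.array([100.0, 50.0, 50.0])
community_b = 4.0 * community_a  # same composition, four times the total

for name, community in [("A", community_a), ("B", community_b)]:
    # Balance of Bacteroides to Ruminococcus and Lactobacillus.
    outer = balance(community, [0], [1, 2])
    # Balance of Ruminococcus to Lactobacillus.
    inner = balance(community, [1], [2])
    print(name, round(outer, 3), round(inner, 3))
```

Both communities yield the same pair of balances because a balance depends only on ratios between parts, not on the total.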

TIME-SERIES MODELING

BUILDING A FRAMEWORK
CLASSIC LINEAR GAUSSIAN STATE SPACE MODEL
[Diagram: states θ0, θ1, θ2, ..., θT evolve with state noise W1, W2, ..., WT; observations Y1, Y2, ..., YT carry observation noise V1, V2, ..., VT. The model equations on the slide are labeled State, Observations, and Initialization.]
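The equations on this slide did not survive extraction. For reference, a linear Gaussian state space model is commonly written as below. The matrices F_t and G_t and the initial moments m_0 and C_0 follow the usual dynamic linear model textbook notation and are an assumption about the slide's notation; only θ, Y, V, and W appear in the transcript.

```latex
\begin{align}
  \text{Observations:}   \quad & Y_t = F_t^{\top} \theta_t + v_t,
                               & v_t &\sim N(0, V_t) \\
  \text{State:}          \quad & \theta_t = G_t \theta_{t-1} + w_t,
                               & w_t &\sim N(0, W_t) \\
  \text{Initialization:} \quad & \theta_0 \sim N(m_0, C_0)
\end{align}
```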

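BUILDING A FRAMEWORK
DYNAMIC LINEAR MODEL
[Diagram: states θ0, θ1, θ2, ..., θT with state noise W1, W2, ..., WT; observations Y1, Y2, ..., YT with observation noise V1, V2, ..., VT. The model equations on the slide are labeled State, Observations, and Priors.]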

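BUILDING A FRAMEWORK
MODELING TIME-EVOLUTION
Silverman et al., bioRxiv 2018
[Diagram: states θ0, θ1, θ2, ..., θT (True State with Biological Noise; W1, W2, ..., WT) generate η1, η2, ..., ηT (Addition of Technical Noise; V1, V2, ..., VT), which map through the ILR transform to the Observed Counts Y1, Y2, ..., YT. The model equations on the slide are labeled Observed Counts, Addition of Technical Noise, True State with Biological Noise, and Priors.]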
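Read from the diagram labels, the hierarchy can be sketched as follows. This is a reconstruction under assumptions, not the slide's equations: F_t, G_t, m_0, and C_0 are borrowed from standard dynamic linear model notation, π_t is a name introduced here for the proportions, and the priors on V and W are left unspecified because the transcript does not give them.

```latex
\begin{align}
  \text{Observed Counts:} \quad
    & Y_t \sim \mathrm{Multinomial}(\pi_t), \qquad \pi_t = \mathrm{ILR}^{-1}(\eta_t) \\
  \text{Addition of Technical Noise:} \quad
    & \eta_t \sim N(F_t^{\top} \theta_t, \; V_t) \\
  \text{True State with Biological Noise:} \quad
    & \theta_t \sim N(G_t \theta_{t-1}, \; W_t) \\
  \text{Priors:} \quad
    & \theta_0 \sim N(m_0, C_0), \qquad (V, W) \sim p(V, W)
\end{align}
```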

BUILDING A FRAMEWORK
THE COMPUTATIONAL BOTTLENECK
20 Taxa with 650 Samples
As measured by Time to Effective Sample size of 2000
▸ Metropolis-within-Gibbs → >2 months
▸ Adaptive Hamiltonian MCMC → >2 days
▸ Marginalize State Space → ≈ 2-10 hours
▸ Model Assumptions and Simplifications → ≈ minutes

BUILDING A FRAMEWORK
MODELING TIME-EVOLUTION
Define:
▸ Composition with Technical Variation
▸ System State
▸ Covariance of Technical Variation
▸ Covariance of Temporal Evolution ("Biological Variation")
▸ Observed Counts and Covariates
Inference Goal: [equation shown on slide]

BUILDING A FRAMEWORK
COLLAPSED SAMPLING USING THE KALMAN FILTER
Goal: [equation shown on slide]
Term 1: Can be sampled from directly using Backwards Sampling algorithm (aka Kalman Smoother)
▸ Θ ⊥ Y | H
Term 2: Using 1st order Markov Structure
▸ Arbitrary Prior
▸ 1-step ahead predictive densities calculable by Kalman filter
▸ Product of Multinomial densities
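The goal equation itself is missing from the transcript. One factorization consistent with the two labeled terms is sketched below, assuming Θ = (θ_0, ..., θ_T), H = (η_1, ..., η_T), and that V and W denote the technical and biological covariances; these symbol assignments are inferred from the surrounding slides rather than stated on this one.

```latex
\begin{align}
  p(\Theta, H, V, W \mid Y)
    &= \underbrace{p(\Theta \mid H, V, W)}_{\text{Term 1, using } \Theta \perp Y \mid H}
       \;\; \underbrace{p(H, V, W \mid Y)}_{\text{Term 2}} \\
  p(H, V, W \mid Y)
    &\propto \underbrace{p(V, W)}_{\text{arbitrary prior}}
       \; \underbrace{\prod_{t=1}^{T} p(\eta_t \mid \eta_{1:t-1}, V, W)}_{\text{one-step-ahead predictive densities}}
       \; \underbrace{\prod_{t=1}^{T} \mathrm{Multinomial}\!\left(Y_t \mid \mathrm{ILR}^{-1}(\eta_t)\right)}_{\text{product of multinomial densities}}
\end{align}
```

Under this reading, Θ never appears in Term 2: it has been integrated out analytically, which is what the Kalman filter provides.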

BUILDING A FRAMEWORK
KALMAN FILTER
Conditional Model
(a) Posterior at step t-1
(b) Prior at step t
(c) One-step forecast at step t
(d) Posterior at step t
Quantities Stored from Kalman Filter for Sampling
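Steps (a) through (d) correspond to the textbook Kalman filter recursion, sketched below for time-invariant F, G, V, and W. The names `m`, `C`, `a`, and `R` follow common dynamic linear model notation and are assumptions, as is the choice to treat the η values as the observations of the conditional model; this is not the talk's implementation. The function also accumulates the log of the one-step forecast densities.

```python
import numpy as np


def kalman_filter(eta, F, G, V, W, m0, C0):
    """Kalman filter for eta_t = F' theta_t + v_t, theta_t = G theta_{t-1} + w_t.

    eta: (T, p) observations; F: (q, p); G: (q, q); V: (p, p); W: (q, q).
    Returns filtered moments (m, C), one-step priors (a, R), and the
    summed log density of the one-step forecasts.
    """
    T, q = eta.shape[0], m0.size
    m = np.zeros((T + 1, q))
    C = np.zeros((T + 1, q, q))
    a = np.zeros((T + 1, q))
    R = np.zeros((T + 1, q, q))
    m[0], C[0] = m0, C0  # (a) posterior at step t - 1, starting from the prior
    log_lik = 0.0
    for t in range(1, T + 1):
        # (b) Prior at step t
        a[t] = G @ m[t - 1]
        R[t] = G @ C[t - 1] @ G.T + W
        # (c) One-step forecast at step t
        f = F.T @ a[t]
        Q = F.T @ R[t] @ F + V
        # (d) Posterior at step t
        e = eta[t - 1] - f
        gain = R[t] @ F @ np.linalg.inv(Q)
        m[t] = a[t] + gain @ e
        C[t] = R[t] - gain @ Q @ gain.T
        # Gaussian log density of the one-step forecast error
        _, logdet = np.linalg.slogdet(Q)
        quad = e @ np.linalg.solve(Q, e)
        log_lik += -0.5 * (e.size * np.log(2 * np.pi) + logdet + quad)
    return m, C, a, R, log_lik


# Local-level example with unit variances (illustrative values only).
eta = np.array([[1.0], [0.5], [1.5]])
one = np.eye(1)
m, C, a, R, log_lik = kalman_filter(eta, one, one, one, one, np.zeros(1), one)
print(m.ravel(), log_lik)
```

The arrays `m`, `C`, `a`, and `R` are the kind of quantities that would be stored for a later backward pass.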

BUILDING A FRAMEWORK
KALMAN SMOOTHER / BACKWARDS SAMPLING ALGORITHM
Conditional Model
Quantities Stored from Kalman Filter for Sampling
KALMAN SMOOTHER
(a) For step T
(b) For step 0 ≤ t < T
BACKWARDS SAMPLING
(a) For step T
(b) For step 0 ≤ t < T
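The backward sampling pass can be sketched with the standard forward-filtering backward-sampling recursion. The inputs `m`, `C`, `a`, and `R` are assumed to be the filtered and one-step-prior moments stored by a Kalman filter, indexed from 0 to T with index 0 of `a` and `R` unused; the names and the time-invariant G are assumptions, and the numbers in the example are illustrative only.

```python
import numpy as np


def backward_sample(m, C, a, R, G, rng):
    """Draw one state trajectory theta_0, ..., theta_T from stored moments.

    m, C: filtered means and covariances for t = 0..T.
    a, R: one-step prior means and covariances for t = 1..T (index 0 unused).
    """
    T = m.shape[0] - 1
    theta = np.zeros_like(m)
    # (a) For step T: draw from the final filtered distribution.
    theta[T] = rng.multivariate_normal(m[T], C[T])
    # (b) For steps t = T - 1, ..., 0: condition on the draw at t + 1.
    for t in range(T - 1, -1, -1):
        B = C[t] @ G.T @ np.linalg.inv(R[t + 1])
        h = m[t] + B @ (theta[t + 1] - a[t + 1])
        H = C[t] - B @ R[t + 1] @ B.T
        H = 0.5 * (H + H.T)  # symmetrize against round-off
        theta[t] = rng.multivariate_normal(h, H)
    return theta


# Stored moments for a scalar local-level model after one observation.
m = np.array([[0.0], [2.0 / 3.0]])
C = np.array([[[1.0]], [[2.0 / 3.0]]])
a = np.array([[0.0], [0.0]])
R = np.array([[[0.0]], [[2.0]]])

rng = np.random.default_rng(2)
theta_draw = backward_sample(m, C, a, R, np.eye(1), rng)
print(theta_draw.ravel())
```

Replacing each random draw with its mean `h` gives the smoothed means, which is the sense in which backward sampling and the Kalman smoother share the same recursion.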

Sampling Algorithm:
Step 1: Sample using adaptive HMCMC marginalizing over Θ using Kalman Filter
Step 2: Sample using Backwards sampling (aka Kalman Smoother)
Summary:
• By inverting the "classic" Metropolis within Gibbs sampler we can take advantage of adaptive Hamiltonian MCMC
• Θ (typically very high dimensional) can be removed from HMCMC and instead sampled directly using recurrence relationships of Kalman Smoother (very fast)

BUILDING A FRAMEWORK
MALLARD
Multinomial Logistic-Normal Dynamic Linear Models
[Image credit: www.audubon.org]

PARTITIONING BIOLOGICAL AND TECHNICAL VARIATION

SIMULATED AND REAL DATA
AN EXAMPLE STUDY DESIGN
Silverman et al., bioRxiv 2018
[Diagram: 28 DAILY SAMPLES, 120 HOURLY SAMPLES, 20 REPLICATE SAMPLES, 4x. Labels: STANDARD LONGITUDINAL MODEL; CONDITION TO HANDLE REPLICATES.]
• Mixed frequency to address potential signal aliasing
• Replicate samples to identify and partition technical vs. biological variation.

SIMULATED DATA
Silverman et al., bioRxiv 2018
Silverman et al., eLife 2017

SIMULATED DATA Silverman et al., bioRxiv 2018

REAL DATA Silverman et al., bioRxiv 2018

BUILDING A FRAMEWORK
THE EFFECT OF SAMPLING INTERVAL
Silverman et al., bioRxiv 2018
[Figure: Total Variation versus Sampling Interval (Hours), split into Biological and Technical components, with curves labeled W+V and V: Technical Variation.]
• Biological Variation varies with sampling interval
• Technical Variation does not
• "Break Even point" for experimental design

REAL DATA
THERE ARE SUB-DAILY DYNAMICS
Silverman et al., bioRxiv 2018
[Figure: Panel A: Balance Value (e.i.) for Vessel 1, Vessel 2, Vessel 3, and Vessel 4 from Day 21 to Day 25. Panel B: the Balance shown, with its – and + sides, among Bacteroidetes, Proteobacteria, and Fusobacteria.]

BUILDING A FRAMEWORK
KEY CONTRIBUTIONS
▸ Multinomial Logistic-Normal Models confront multiple challenges in the analysis of sequence count data (e.g., Count Composition or Zeros)
▸ MALLARD for analysis of longitudinal sequence count data
▸ Marginalized Inference Using Kalman Filter
▸ + Experimental design for partitioning technical and biological variation - "beyond detection, characterization and correction"

TOTAL AUGMENTATION METHODS

COMPOSITION: SENSITIVITY AND TOTAL AUGMENTATION
KEY QUESTIONS
▸ What is the geometric relationship between compositional data analysis and standard practice?
▸ When will compositional effects lead to spurious conclusions?
▸ Can we use these insights to improve current practice and compositional analysis?

FRAMING SEQUENCE COUNT DATA
IT'S ALL ABOUT THE BIOLOGICAL TOTALS
These are the biological totals
- Not the Sequencing Depth
- Not Sparsity

COMPOSITION: SENSITIVITY AND TOTAL AUGMENTATION
RELATIVE TOTALS
▸ We may not know NB or NA but may be able to quantify uncertainty in their ratios.
▸ Examples:
▸ Bounds on maximal change in total microbial load in a healthy individual over 24 hours.
▸ Maximum reasonable difference between any two healthy human adults
▸ Potentially from external measurements

COMPOSITION: SENSITIVITY AND TOTAL AUGMENTATION
THE "FULL" MODEL WE OFTEN WANT
The marginal of this model along the "Log Simplex" is a Logistic Normal Linear Model
[Figure: axes ψ1j and ψ2j (True abundance); the "Log Simplex" ψ1j + ψ2j = 0; the components ψ∥ and ψ⊥; the sum ψ1j + ψ2j.]

COMPOSITION: SENSITIVITY AND TOTAL AUGMENTATION
WE CAN PERFORM INFERENCE ON THE FULL MODEL
▸ Given sequence count data and side information / assumptions on the relative totals
▸ We can perform efficient inference on the full model up to a linear intercept
▸ We call this framework TRAMs (Total Relative Augmentation Models) and this particular model the TRAM-CLM (Conjugate Linear Model)

COMPOSITION: SENSITIVITY AND TOTAL AUGMENTATION
TRAM-CLM
[Equations shown on slide, with labels: Full Model; Relative Model; Total Model; Marginal along ψ∥; Nearly unlimited flexibility here]

COMPOSITION: SENSITIVITY AND TOTAL AUGMENTATION
RELATIVE TOTAL MODELS
Lots of potential distributions over these log-ratios. However, it becomes more difficult with more than 2 samples.
α is the maximum bound on the log-ratio of totals between any two samples
[Figure: distribution over the log-ratio, horizontal axis from −2 to 2]
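One way to see what α controls: absolute abundance is proportion times total, so the log fold change in absolute abundance equals the log fold change in proportion plus the log-ratio of totals. If that log-ratio is bounded by α in absolute value, each taxon's absolute log fold change lies in an interval of half-width α around its relative log fold change. The sketch below illustrates this arithmetic with made-up proportions and a hypothetical helper; it is not the TRAM model itself, which places a distribution over the log-ratios rather than a hard interval.

```python
import numpy as np


def log_fold_change_bounds(prop_a, prop_b, alpha):
    """Bounds on log(abundance_B / abundance_A) for each taxon.

    log(abs_B / abs_A) = log(prop_B / prop_A) + log(N_B / N_A),
    and |log(N_B / N_A)| <= alpha bounds the second term.
    """
    relative_lfc = np.log(np.asarray(prop_b) / np.asarray(prop_a))
    return relative_lfc - alpha, relative_lfc + alpha


# Hypothetical proportions of three taxa in samples A and B.
prop_a = np.array([0.5, 0.3, 0.2])
prop_b = np.array([0.8, 0.12, 0.08])

lower, upper = log_fold_change_bounds(prop_a, prop_b, alpha=0.5)
# The direction of change is unambiguous only if the interval excludes zero.
sign_is_determined = (lower > 0) | (upper < 0)
print(sign_is_determined)
```

With α = 0 the totals are assumed equal and every nonzero relative change is taken at face value; as α grows without bound, no direction of absolute change can be concluded from the proportions alone.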

COMPOSITION: SENSITIVITY AND TOTAL AUGMENTATION
METHODS FOR DIFFERENTIAL EXPRESSION
▸ Alpha ≈ 0, Centered
▸ DESeq2
▸ ALDEx2
[Figure: scatter of points over var 1, var 2, and var 3 for totals NA, NB, NC]

COMPOSITION: SENSITIVITY AND TOTAL AUGMENTATION
METHODS FOR DIFFERENTIAL EXPRESSION
▸ Alpha = ∞
▸ CoDA Practice
[Figure: scatter of points over var 1, var 2, and var 3 for totals NA, NB, NC]

COMPOSITION: SENSITIVITY AND TOTAL AUGMENTATION
METHODS FOR DIFFERENTIAL EXPRESSION
▸ Alpha = 4, Centered
▸ "TRAM Naive"
[Figure: scatter of points over var 1, var 2, and var 3 for totals NA, NB, NC]

COMPOSITION: SENSITIVITY AND TOTAL AUGMENTATION
METHODS FOR DIFFERENTIAL EXPRESSION
▸ Alpha = 1, Not Centered
▸ "TRAM Weak Knowledge"
[Figure: scatter of points over var 1, var 2, and var 3 for totals NA, NB, NC]

COMPOSITION: SENSITIVITY AND TOTAL AUGMENTATION
EXPECTED RESULTS
[Figure: Count (log scale, 100 to 10000) for Taxa 1 to 21 under Condition 1 and Condition 2, with taxa annotated by method code.]
A: ALDEx2
D: DESeq2
TW: TRAM - Weak Knowledge
TN: TRAM - Naive

LESS EXPECTED RESULTS COMPOSITION: SENSITIVITY AND TOTAL AUGMENTATION
[Figure: Count (log scale, 100 to 10000) by Taxa (1 to 21) for Conditions 1 and 2. Labels above each taxon mark the methods that flag it. A: Aldex2; D: Deseq2; TW: TRAM - Weak Knowledge; TN: TRAM - Naive]

INTRODUCTION TO ALPHA DIAGRAMS COMPOSITION: SENSITIVITY AND TOTAL AUGMENTATION
Summarize Posterior by Quantile of Zero Effect
[Figure: posterior intervals (0.5, 0.8, 0.9, 0.95) for X1, X2, X3 on an axis from −3 to 3, annotated with q0=0.161, q0=0.491, q0=0.852]
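As a minimal sketch of the summary on this slide: the later alpha diagrams label q0 as ecdf(0), so q0 can be read as the empirical CDF of the posterior draws evaluated at zero. The function names, the toy draws, and the use of central credible intervals below are illustrative assumptions, not taken from the talk.

```python
import numpy as np

def zero_quantile(samples):
    """Return q0: the fraction of posterior draws that are <= 0 (ecdf at 0)."""
    samples = np.asarray(samples, dtype=float)
    return float(np.mean(samples <= 0.0))

def excludes_zero(q0, level):
    """True if a central credible interval of the given level excludes zero.

    Assumption: intervals are central (equal tail mass on each side), so zero
    falls outside exactly when q0 lies in one of the two tails.
    """
    tail = (1.0 - level) / 2.0
    return q0 < tail or q0 > 1.0 - tail

# Hypothetical posterior draws for three effects (toy data, not from the talk).
rng = np.random.default_rng(0)
draws = {
    "X1": rng.normal(1.0, 1.0, size=4000),
    "X2": rng.normal(0.0, 1.0, size=4000),
    "X3": rng.normal(-1.0, 1.0, size=4000),
}

for name, s in draws.items():
    q0 = zero_quantile(s)
    flags = {lvl: excludes_zero(q0, lvl) for lvl in (0.5, 0.8, 0.9, 0.95)}
    print(name, round(q0, 3), flags)
```

A q0 near 0.5 means zero sits in the middle of the posterior, while a q0 near 0 or 1 means most posterior mass lies on one side of zero.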

ALPHA DIAGRAMS - NAIVE COMPOSITION: SENSITIVITY AND TOTAL AUGMENTATION
[Figure: q0 = ecdf(0) (0.00 to 1.00) against alpha (0 to 40), with taxa 4, 17, and 21 labeled]

ALPHA DIAGRAMS - WEAK KNOWLEDGE COMPOSITION: SENSITIVITY AND TOTAL AUGMENTATION
[Figure: q0 = ecdf(0) (0.00 to 1.00) against alpha (0 to 40), with taxa 4, 17, and 21 labeled]
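An alpha diagram plots q0 = ecdf(0) against alpha, which suggests a simple recipe: recompute the posterior at each alpha and record q0. The sketch below assumes a user-supplied function that returns posterior draws for a given alpha. The toy sampler is a hypothetical stand-in, not the TRAM model, and the meaning of alpha is not assumed beyond its role as the sensitivity parameter on the x-axis.

```python
import numpy as np

def alpha_diagram(draw_posterior, alphas):
    """Return q0 = ecdf(0) of the posterior draws at each value of alpha."""
    return np.array(
        [np.mean(np.asarray(draw_posterior(a), dtype=float) <= 0.0) for a in alphas]
    )

def toy_posterior(alpha, n=2000, seed=0):
    """Hypothetical sampler: the effect's posterior mean drifts down as alpha grows.

    The fixed seed reuses the same underlying noise at every alpha, so the
    resulting curve reflects only the shift in the mean.
    """
    rng = np.random.default_rng(seed)
    return rng.normal(loc=1.0 - alpha / 20.0, scale=0.5, size=n)

alphas = np.linspace(0, 40, 9)  # same 0 to 40 range as the slide's x-axis
curve = alpha_diagram(toy_posterior, alphas)

for a, q in zip(alphas, curve):
    print(f"alpha={a:5.1f}  q0={q:.3f}")
```

A taxon whose curve stays near 0 or near 1 across the whole range gives a conclusion that is insensitive to alpha, while a curve that crosses 0.5 shows the sign of the effect depends on the choice of alpha.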

BUILDING A FRAMEWORK OPEN QUESTIONS
▸ How can we build better priors that align with biological knowledge?
▸ How can we adapt this for RNA-seq (what type of biological knowledge could be used)?
▸ How can we develop similar sensitivity summaries for correlation analysis?

ACKNOWLEDGEMENTS
Duke University: Lawrence David, Sayan Mukherjee, Rachael Bloom, Heather Durand
Universitat de Girona: Vera Pawlowsky-Glahn
Universitat Politècnica de Catalunya: Juan José Egozcue
Wife and Collaborator: Rachel Silverman
Funding: Duke Collaborative Quantitative Approaches to Problems in the Basic and Clinical Sciences; Duke MSTP NIH T32
xkcd.com StatsAtHome.com inschool4life

BUILDING A FRAMEWORK A NOTE ON PRIOR CHOICE
All Log-Ratio Transforms:
Not so realistic prior:
More realistic prior:
