Team lcolladotor Journal Club: Nonnegative Spatial Factorization

Team lcolladotor Journal Club: Nonnegative spatial factorization applied to spatial genomics
F. William Townes & Barbara E. Engelhardt, Nature Methods, 2022
doi.org/10.1038/s41592-022-01687-w
Presented by: Louise Huuki-Myers, Nov 20, 2024

Overview
• Introduce NSF (nonnegative spatial factorization)
• Show the ability of nonnegative factorization to identify a parts-based representation in simulations
• Benchmark dimension reduction methods on three spatial datasets
  ◦ Slide-seq, XYZeq, 10x Visium

Nonnegative Matrix Factorization
• “Parts-based representation”
• V = W x H
  ◦ No negative components
Image credit: https://en.wikipedia.org/wiki/Non-negative_matrix_factorization
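As an illustration of V = W x H with no negative components, here is a minimal NumPy sketch of the classic multiplicative-update algorithm for NMF. It is not the method from the paper; the function name `nmf_multiplicative`, the iteration count, and the toy matrix are assumptions made for this example.

```python
import numpy as np

def nmf_multiplicative(V, n_components, n_iter=1000, eps=1e-10, seed=0):
    """Approximate a nonnegative matrix V as W @ H with W, H >= 0.

    Uses multiplicative updates that minimize squared error.
    The updates only multiply by nonnegative ratios, so W and H
    stay nonnegative throughout.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = V.shape
    W = rng.random((n_rows, n_components)) + eps
    H = rng.random((n_components, n_cols)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy data (hypothetical): an exactly rank-3 nonnegative matrix.
rng = np.random.default_rng(1)
V = rng.random((20, 3)) @ rng.random((3, 30))

W, H = nmf_multiplicative(V, n_components=3)
rel_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_error:.4f}")
```

Because every entry of W and H is nonnegative, each component can only add to the reconstruction, which is what gives the "parts-based" reading of the factors.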

Methods
• FA (factor analysis)
  ◦ Ignores spatial context
• MEFISTO
  ◦ Spatially aware dimension reduction
  ◦ Velten et al., Nat Methods, 2022
• NSF (nonnegative spatial factorization)
  ◦ A model for spatially aware dimension reduction using an exponentiated GP (Gaussian process) prior over the spatial locations with a Poisson or negative binomial likelihood for count data
• NSFH (NSF hybrid model)
  ◦ Generalizes both NSF and probabilistic NMF to partition variability into spatial and nonspatial sources
• PNMF (probabilistic NMF)
• RSF (real-valued spatial factorization)
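To make the NSF description concrete, the sketch below draws data from a model of that shape: GP samples over spot coordinates are exponentiated to give nonnegative spatial factors, combined with nonnegative loadings, and passed to a Poisson likelihood. It is a generative toy only, with no inference. The squared-exponential kernel, the lengthscale, the uniform loadings, and all sizes are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def rbf_kernel(X, lengthscale=0.2, variance=1.0):
    """Squared-exponential covariance between all pairs of rows of X.

    Assumption: this kernel is chosen for simplicity in this sketch.
    """
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    return variance * np.exp(-0.5 * sq_dist / lengthscale**2)

rng = np.random.default_rng(0)
n_spots, n_genes, n_factors = 100, 12, 3  # hypothetical sizes

# Spot coordinates in the unit square.
X = rng.random((n_spots, 2))

# Draw one GP sample per factor via the Cholesky factor of the kernel.
# A small diagonal jitter keeps the decomposition numerically stable.
K = rbf_kernel(X) + 1e-6 * np.eye(n_spots)
L = np.linalg.cholesky(K)
F = L @ rng.standard_normal((n_spots, n_factors))  # spots x factors

# Exponentiating makes every factor value strictly positive.
factors = np.exp(F)

# Nonnegative loadings (genes x factors), then Poisson counts.
W = rng.random((n_genes, n_factors))
rate = factors @ W.T               # spots x genes, nonnegative
Y = rng.poisson(rate)

print(Y.shape, Y.min(), Y.max())
```

Swapping `np.exp(F)` for `F` itself would give real-valued factors, which is the distinction between NSF and RSF in the list above.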

Fig. 1 Simulation
“To illustrate the ability of nonnegative models to recover a parts-based factorization, we simulated multivariate count data from two sets of spatial patterns.”
• “FA and RSF estimated latent factors consisting of linear combinations of the true factors” ⚠
• “Nonnegative models PNMF and NSF identified each pattern as a separate factor” ✅
• “Leiden clustering accurately identified spatially disjoint patterns in the ggblocks simulation” ⚠

Simulation
• 200 features randomly assigned to four shapes
[Diagram: count matrix of genes (Gene 1 ... Gene 200) by spots (Spot 1 ... Spot n)]
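A simulation of this kind can be sketched in a few lines of NumPy. The four square blocks, the grid size, and the background and signal rates below are hypothetical stand-ins chosen for this example; they are not the shapes or parameters used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
side, n_genes, n_patterns = 12, 200, 4  # grid size is an assumption

# Four hypothetical binary shapes on a side x side grid of spots.
patterns = np.zeros((n_patterns, side, side))
patterns[0, 1:5, 1:5] = 1
patterns[1, 1:5, 7:11] = 1
patterns[2, 7:11, 1:5] = 1
patterns[3, 7:11, 7:11] = 1
factors = patterns.reshape(n_patterns, -1).T  # spots x patterns

# Randomly assign each of the 200 genes to exactly one shape.
assignment = rng.integers(0, n_patterns, size=n_genes)
loadings = np.eye(n_patterns)[assignment]     # genes x patterns, one-hot

# Poisson counts: low background everywhere, higher rate inside the shape.
background, signal = 0.2, 5.0                 # assumed rates
rate = background + signal * factors @ loadings.T  # spots x genes
counts = rng.poisson(rate)

print(counts.shape)  # (144, 200): spots x genes
```

A parts-based method should recover each block as its own factor, whereas a real-valued method is free to return signed mixtures of the blocks.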

Benchmark
• Three spatial datasets from different technologies
  ◦ Slide-seq: mouse hippocampus
  ◦ Visium: mouse brain
  ◦ XYZeq: mouse liver
• 95%:5% training:validation split
• Goodness of fit quantified with Poisson deviance
  ◦ Observed counts in the validation set vs. predicted mean values from the model fit
  ◦ Small deviance indicates a good fit
  ◦ Also used RMSE
• Varying number of components (L)
[Diagram: field of view and resolution, fine <-> coarse, for Slide-seq, XYZeq, Visium]
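The two goodness-of-fit measures and the 95%:5% split can be sketched as follows. This is a generic implementation of Poisson deviance and RMSE, averaged over entries; the averaging, the function names, and the toy data are assumptions for this example, and the paper's exact evaluation code may differ.

```python
import numpy as np

def poisson_deviance(y, mu, eps=1e-12):
    """Mean Poisson deviance between observed counts y and predicted means mu."""
    y = np.asarray(y, dtype=float)
    mu = np.maximum(np.asarray(mu, dtype=float), eps)
    # The term y * log(y / mu) is defined as 0 where y == 0.
    safe_y = np.where(y > 0, y, 1.0)
    log_term = np.where(y > 0, y * np.log(safe_y / mu), 0.0)
    return float(np.mean(2.0 * (log_term - (y - mu))))

def rmse(y, mu):
    """Root mean squared error between observed counts and predicted means."""
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    return float(np.sqrt(np.mean((y - mu) ** 2)))

def train_validation_split(n_obs, validation_fraction=0.05, seed=0):
    """Return (train_idx, val_idx) from a random permutation of observations."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_obs)
    n_val = max(1, int(round(validation_fraction * n_obs)))
    return order[n_val:], order[:n_val]

# Toy check (hypothetical data): the true mean should fit held-out
# counts better than a constant mean does.
rng = np.random.default_rng(1)
mu_true = rng.gamma(shape=2.0, scale=2.0, size=1000)
y = rng.poisson(mu_true)
train_idx, val_idx = train_validation_split(len(y))

const_mean = np.full(len(val_idx), y[train_idx].mean())
print("deviance, true mean:    ", poisson_deviance(y[val_idx], mu_true[val_idx]))
print("deviance, constant mean:", poisson_deviance(y[val_idx], const_mean))
print("RMSE, true mean:        ", rmse(y[val_idx], mu_true[val_idx]))
```

Deviance is zero when the predicted means equal the observed counts and grows as the fit worsens, which is why smaller values are better.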

Fig 2. Benchmark on Slide-seq
• “The unsupervised models (FA and PNMF) had higher deviance than their spatially aware analogs”
• “NSF and NSFH had similar deviances, suggesting that including a mixture of spatial and nonspatial components (NSFH) did not degrade generalization in comparison to a strictly spatial model”
• “Spatial importance scores indicated that most genes were strongly spatially variable, although a small number were entirely nonspatial (Fig. 2b).”
[Plot: deviance vs. number of components; smaller deviance is better]
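One way to read a per-gene spatial importance score is as the share of a gene's total loading weight that falls on spatial factors in a hybrid model. The sketch below assumes that definition; the function name `spatial_importance` and the toy loadings are hypothetical, and the paper's exact normalization may differ.

```python
import numpy as np

def spatial_importance(W_spatial, W_nonspatial):
    """Fraction of each gene's loading weight carried by spatial factors.

    Assumed definition for illustration: rows are genes, columns are
    factors, and all loadings are nonnegative. Returns a value in
    [0, 1] per gene (0 when a gene has no loading weight at all).
    """
    W_spatial = np.asarray(W_spatial, dtype=float)
    W_nonspatial = np.asarray(W_nonspatial, dtype=float)
    spatial = W_spatial.sum(axis=1)
    total = spatial + W_nonspatial.sum(axis=1)
    return np.divide(spatial, total, out=np.zeros_like(spatial), where=total > 0)

# Toy loadings (hypothetical): three genes, two spatial and one nonspatial factor.
W_spatial = [[2.0, 0.0], [0.0, 0.0], [1.0, 1.0]]
W_nonspatial = [[0.0], [3.0], [2.0]]
scores = spatial_importance(W_spatial, W_nonspatial)
print(scores)  # gene 1 fully spatial, gene 2 fully nonspatial, gene 3 split evenly
```

Under this reading, a score near 1 marks a strongly spatially variable gene and a score of 0 marks an entirely nonspatial one.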

Extended Fig 4. RMSE
• “Using RMSE instead of Poisson deviance, spatially aware models RSF, NSF and NSFH outperformed the nonspatial models FA and PNMF”

Fig 3. Biological Relevance
• “Spatial factors mapped to specific brain regions”
  ◦ 1. choroid plexus
  ◦ 6. medial habenula
  ◦ 8. dentate gyrus
  ◦ 10. meninges layer

Fig S1 + S2
• These regions were also identified by other nonnegative models such as PNMF and by Leiden clustering, albeit less clearly
• “Real-valued factor models FA and RSF were unable to identify distinct regions”

Fig 4. NSFH in XYZeq data
• Identified normal liver vs. tumor tissue
• Identified spatial (b) and nonspatial (c) factors

Fig 5. Visium Benchmark
NSF works well on Visium data
• “NSF had generalization accuracy comparable to the best-performing model RSF”
• RSF had the lowest predictive RMSE, followed by NSF
• NSF outperforms the hybrid NSFH (unlike in the other benchmarks)

Fig 6. NSFH in Visium
• A. Spatial factors map to brain regions
• B. “The top genes for each spatial component again showed expression patterns overlapping with their associated factors”
• C. “a few of [the nonspatial factors] did exhibit spatial localization”
• Lower resolution in Visium may cause rare spatial factors to be misclassified

Conclusion
• Nonnegative factors are more easily interpretable
• NSF is introduced as a spatially aware dimension reduction method
• Spatial and nonspatial factors can be modeled with NSFH (hybrid NSF)
• Spatial factors can define brain regions and tissue types in different types of spatial data
• NSF performs strongly on Visium data, second only to RSF in predictive RMSE
• NSF can be implemented via TensorFlow

Questions?
• How can we use NSF?
• Can this replace/supplement spatial clustering?
