Louise Huuki-Myers
November 21, 2024

Team lcolladotor Journal Club: Nonnegative Spatial Factorization

Team lcolladotor Journal Club: Nonnegative spatial factorization applied to spatial genomics

F. William Townes & Barbara E. Engelhardt, Nature Methods, 2022, doi.org/10.1038/s41592-022-01687-w

Presented by: Louise Huuki-Myers, Nov 20, 2024
Recording of our journal club meeting: https://youtu.be/KiutcM4f4TE

NSF is introduced as a spatially aware dimension reduction method and benchmarked on spatial transcriptomic datasets. NSF performed strongly on Visium data. Factorization can identify spatial features such as brain regions and tissue types, and the hybrid model (NSFH) can also identify non-spatial factors. Modeling results showed that most genes have strong spatial variability.


Transcript

  1. Team lcolladotor Journal Club: Nonnegative spatial factorization applied to spatial genomics
     F. William Townes & Barbara E. Engelhardt, Nature Methods, 2022, doi.org/10.1038/s41592-022-01687-w
     Presented by: Louise Huuki-Myers, Nov 20, 2024
  2. Overview
     • Introduce NSF (nonnegative spatial factorization method)
     • Show ability of nonnegative factorization to identify a parts-based representation in simulations
     • Benchmark dimension reduction methods on three spatial datasets
       ◦ Slide-seq, XYZeq, 10x Visium
  3. Nonnegative Matrix Factorization
     • “Parts-based representation”
     • V = W x H
       ◦ No negative components
     Image credit: https://en.wikipedia.org/wiki/Non-negative_matrix_factorization
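
As a minimal sketch of the V = W x H idea on this slide, the snippet below factors a small nonnegative matrix with the classic multiplicative-update rule for the squared-error objective. The function name `nmf`, the matrix sizes, and the iteration count are illustrative assumptions, not taken from the paper or the talk.

```python
import numpy as np

def nmf(V, n_components, n_iter=500, eps=1e-10, seed=0):
    """Approximate nonnegative V (m x n) as W (m x k) @ H (k x n).

    Uses multiplicative updates, which keep W and H nonnegative
    as long as they start nonnegative.
    """
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, n_components)) + eps
    H = rng.random((n_components, n)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a nonnegative ratio, so no entry turns negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy input (assumption): an exactly rank-3 nonnegative matrix.
rng = np.random.default_rng(1)
V = rng.random((20, 3)) @ rng.random((3, 30))
W, H = nmf(V, n_components=3)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_err:.4f}")
```

Because every entry of W and H stays nonnegative, each component can only add to the reconstruction, which is what gives the "parts-based" reading.
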
  4. Methods
     • FA (factor analysis)
       ◦ Ignores spatial context
     • MEFISTO
       ◦ Spatially aware dimension reduction
       ◦ Velten et al., Nat Methods, 2022
     • NSF (nonnegative spatial factorization)
       ◦ A model for spatially aware dimension reduction using an exponentiated GP prior over the spatial locations with a Poisson or negative binomial likelihood for count data
     • NSFH (NSF hybrid model)
       ◦ Generalizes both NSF and probabilistic NMF to partition variability into both spatial and nonspatial sources
     • PNMF (probabilistic NMF)
     • RSF (real-valued spatial factorization)
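
To make the NSF description above concrete, here is a rough generative sketch: latent functions are drawn from a Gaussian process (GP) over spot coordinates, exponentiated so the factors are positive, mixed with nonnegative gene loadings, and passed through a Poisson likelihood. This only samples from a model of that shape; it is not the authors' implementation and does no inference. The squared-exponential kernel, the gamma-distributed loadings, and all names and parameter values are assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(X, lengthscale=0.2, variance=1.0):
    """Squared-exponential kernel between all pairs of spot coordinates (assumed kernel)."""
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * sq_dist / lengthscale**2)

def sample_nsf_like_counts(X, n_genes=50, n_factors=3, seed=0):
    """Sample counts from an NSF-style model: Poisson(exp(GP) @ nonnegative loadings)."""
    rng = np.random.default_rng(seed)
    n_spots = X.shape[0]
    # Jitter on the diagonal keeps the Cholesky factorization numerically stable.
    K = rbf_kernel(X) + 1e-6 * np.eye(n_spots)
    chol = np.linalg.cholesky(K)
    f = chol @ rng.standard_normal((n_spots, n_factors))  # GP draws, one column per factor
    factors = np.exp(f)                                   # exponentiation makes factors positive
    loadings = rng.gamma(shape=1.0, scale=1.0, size=(n_genes, n_factors))  # nonnegative
    rate = factors @ loadings.T                           # spots x genes Poisson means
    counts = rng.poisson(rate)
    return counts, factors, loadings

# Hypothetical spot coordinates in the unit square.
X = np.random.default_rng(0).random((100, 2))
counts, factors, loadings = sample_nsf_like_counts(X)
print(counts.shape)
```

Swapping `rng.poisson` for a negative binomial draw would correspond to the other likelihood mentioned on the slide.
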
  5. Fig. 1 Simulation
     “To illustrate the ability of nonnegative models to recover a parts-based factorization, we simulated multivariate count data from two sets of spatial patterns.”
     • “FA and RSF estimated latent factors consisting of linear combinations of the true factors” ⚠
     • “Nonnegative models PNMF and NSF identified each pattern as a separate factor” ✅
     • “Leiden clustering accurately identified spatially disjoint patterns in the ggblocks simulation” ⚠
  6. Simulation
     • 200 features randomly assigned to four shapes
     [Diagram: count matrix of genes (Gene 1 ... Gene 200) by spots (Spot 1 ... Spot n)]
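
The simulation setup on this slide can be sketched roughly as follows: define four spatial patterns on a grid, assign each of 200 genes at random to one pattern, and draw Poisson counts. The four quadrant patterns below are simple stand-ins for the shapes used in the paper, and the grid size, background rate, and signal rate are made-up values for illustration.

```python
import numpy as np

def simulate_block_counts(side=12, n_genes=200, seed=0):
    """Simulate a spots x genes count matrix from four disjoint spatial patterns."""
    rng = np.random.default_rng(seed)
    half = side // 2
    # Four binary patterns on a side x side grid (quadrants as stand-in shapes).
    patterns = np.zeros((4, side, side))
    patterns[0, :half, :half] = 1
    patterns[1, :half, half:] = 1
    patterns[2, half:, :half] = 1
    patterns[3, half:, half:] = 1
    factors = patterns.reshape(4, -1).T             # spots x 4
    assignment = rng.integers(0, 4, size=n_genes)   # one pattern per gene
    loadings = np.zeros((n_genes, 4))
    loadings[np.arange(n_genes), assignment] = 1.0
    # Hypothetical rates: low background everywhere, higher inside the gene's pattern.
    rate = 0.5 + 10.0 * (factors @ loadings.T)
    counts = rng.poisson(rate)
    return counts, factors, assignment

counts, factors, assignment = simulate_block_counts()
print(counts.shape)  # (spots, genes)
```
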
  7. Benchmark
     • Three spatial datasets using different technologies
       ◦ Slide-seq - mouse hippocampus
       ◦ Visium - mouse brain
       ◦ XYZeq - mouse liver
     • 95%:5% Training:Testing split
     • Goodness of fit quantified with Poisson deviance
       ◦ Observed counts in validation vs. predicted mean values from model fit
       ◦ Small deviance shows good fit
       ◦ Also used RMSE
     • Varying number of components (L)
     [Figure: field of view vs. resolution (fine to coarse) for Slide-seq, XYZeq, and Visium]
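
The evaluation described on this slide can be sketched as below: hold out 5% of spots, then compare observed counts with predicted means using Poisson deviance and RMSE. The function names are hypothetical, and averaging the deviance over all entries is an assumption; the paper may normalize differently.

```python
import numpy as np

def holdout_split(n_spots, val_frac=0.05, seed=0):
    """Randomly split spot indices into training and validation sets."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_spots)
    n_val = max(1, int(round(val_frac * n_spots)))
    return perm[n_val:], perm[:n_val]

def poisson_deviance(y, mu, eps=1e-12):
    """Mean Poisson deviance between observed counts y and predicted means mu."""
    y = np.asarray(y, dtype=float)
    mu = np.clip(np.asarray(mu, dtype=float), eps, None)
    # The y * log(y / mu) term is taken to be 0 where y == 0.
    ratio = np.where(y > 0, y / mu, 1.0)
    return float(2.0 * np.mean(y * np.log(ratio) - (y - mu)))

def rmse(y, mu):
    """Root mean squared error between observed counts and predicted means."""
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    return float(np.sqrt(np.mean((y - mu) ** 2)))

train_idx, val_idx = holdout_split(100)
print(len(train_idx), len(val_idx))
print(poisson_deviance([0, 0], [1, 1]), rmse([0, 0], [3, 4]))
```

A perfect prediction gives a deviance of zero, and larger values indicate a worse fit on the held-out spots.
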
  8. Fig 2. Benchmark on Slide-Seq
     • “The unsupervised models (FA and PNMF) had higher deviance than their spatially aware analogs”
     • “NSF and NSFH had similar deviances, suggesting that including a mixture of spatial and nonspatial components (NSFH) did not degrade generalization in comparison to a strictly spatial model”
     • “Spatial importance scores indicated that most genes were strongly spatially variable, although a small number were entirely nonspatial (Fig. 2b).”
     [Figure: deviance vs. number of components; smaller deviance is better]
  9. Extended Fig 4. RMSE
     • “Using RMSE instead of Poisson deviance spatially aware models RSF, NSF and NSFH outperformed the nonspatial models FA and PNMF”
  10. Fig 3. Biological Relevance
     • “Spatial factors mapped to specific brain regions”
       ◦ 1. choroid plexus
       ◦ 6. medial habenula
       ◦ 8. dentate gyrus
       ◦ 10. meninges layer
  11. Fig S1 + S2
     • These regions were also identified by other nonnegative models such as PNMF and by Leiden clustering, albeit less clearly
     • “Real-valued factor models FA and RSF were unable to identify distinct regions”
  12. Fig 4. NSFH in XYZeq data
     • Identified normal liver vs. tumor tissue
     • Identified spatial (b) and non-spatial factors (c)
  13. Fig 5. Visium Benchmark
     NSF works well on Visium data
     • “NSF had generalization accuracy comparable to the best-performing model RSF”
     • RSF had the lowest predictive RMSE, followed by NSF
     • NSF outperforms hybrid NSFH (different from other benchmarks)
  14. Fig 6. NSFH in Visium
     • A. Spatial factors map to brain regions
     • B. “The top genes for each spatial component again showed expression patterns overlapping with their associated factors”
     • C. “a few of (non-spatial factors) did exhibit spatial localization”
     • Lower resolution in Visium may cause rare spatial factors to be misclassified
  15. Conclusion
     • Non-negative factors are more easily interpretable
     • NSF is introduced as a spatially aware dimension reduction method
     • Spatial and non-spatial factors can be modeled with NSFH (hybrid NSF)
     • Spatial factors can define brain regions and tissue types in different types of spatial data
     • NSF performs strongly on Visium data, with accuracy comparable to the best-performing model, RSF
     • NSF can be implemented via TensorFlow
  16. Questions?
     • How can we use NSF?
     • Can this replace/supplement spatial clustering?