scmap – projection of single-cell RNA-seq data across datasets

Vladimir Kiselev

March 20, 2018

Transcript

  1. scmap – projection of single-cell RNA-seq data across datasets. Vlad(imir) Kiselev (postdoc @ Martin Hemberg team), Head of Cellular Genetics Informatics team
  2. Single-cell RNA-seq. The introductory slides were kindly provided by Mike Stubbington (from his Human Cell Atlas presentation).
  3. Single-cell RNA-seq atlases
    Human Cell Atlas: October 2016
    Mouse Cell Atlas: 400,000 single cells, all major mouse organs (Han et al., Cell, February 2018)
    Fly Cell Atlas: all cells in a fly (~25 million), December 2017
  4. Yes! A method for projecting cells from a single-cell RNA-seq dataset onto cell types or individual cells from other experiments: scmap (www.bioconductor.org)
  5. How does it work? [Figure: Query and Reference panels for scmap-cluster and scmap-cell; method legend: scmap-cluster, scmap-cell, SVM, RF; cell types A, B, C and an unknown cell type] This cell will be assigned to cell type A.
  6. How does it work? [Figure: Query and Reference panels for scmap-cluster and scmap-cell; method legend: scmap-cluster, scmap-cell, SVM, RF; cell types A, B, C and an unknown cell type] This cell will be assigned to cell type B.
  7. How does it work? [Figure: Query and Reference panels for scmap-cluster and scmap-cell; method legend: scmap-cluster, scmap-cell, SVM, RF; cell types A, B, C and an unknown cell type] This cell will be assigned to cell type C.
  8. How does it work? [Figure: Query and Reference panels for scmap-cluster and scmap-cell; method legend: scmap-cluster, scmap-cell, SVM, RF; cell types A, B, C and an unknown cell type] This cell will be assigned to a cell from cell type A.
  9. How does it work? [Figure: Query and Reference panels for scmap-cluster and scmap-cell; method legend: scmap-cluster, scmap-cell, SVM, RF; cell types A, B, C and an unknown cell type] This cell will be assigned to a cell from cell type C.
  10. How does it work? [Figure: Query and Reference panels for scmap-cluster and scmap-cell; method legend: scmap-cluster, scmap-cell, SVM, RF; cell types A, B, C and an unknown cell type] This cell will be unassigned.
  11. Discovery vs validation [Figure: Query and Reference panels for scmap-cluster and scmap-cell; method legend: scmap-cluster, scmap-cell, SVM, RF; panels labelled Validation and Discovery]
  12. Datasets
    We used publicly available datasets to validate and benchmark scmap. In all datasets the cell types were identified by the authors.

    Dataset        Organism  Tissue              # of cells  Experimental protocol
    Yan            human     Embryo development          90  Tang et al.
    Goolam         mouse     Embryo development         124  Smart-Seq2
    Deng           mouse     Embryo development         268  Smart-Seq, Smart-Seq2
    Pollen         human     Cerebral cortex            301  SMARTer
    Li             human     Colorectal tumors          561  SMARTer
    Usoskin        mouse     Brain                      622  STRT-Seq
    Kolodziejczyk  mouse     Embryo stem cells          704  SMARTer
    Xin            human     Pancreas                  1492  SMARTer
    Tasic          mouse     Cortex                    1679  SMARTer
    Baron          mouse     Pancreas                  1886  inDrop
    Muraro         human     Pancreas                  2126  CEL-Seq2
    Segerstolpe    human     Pancreas                  2209  Smart-Seq2
    Klein          mouse     Embryo stem cells         2717  inDrop
    Zeisel         mouse     Brain                     3005  STRT-Seq UMI
    Baron          human     Pancreas                  8569  inDrop
    Shekhar        mouse     Retina                   27499  Drop-Seq
    Macosko        mouse     Retina                   44808  Drop-Seq
  13. Algorithm (scmap, www.bioconductor.org)
    1. Feature (gene, transcript) selection
    2. Index creation
    3. Projection
  14. Feature selection (Reference): curse of dimensionality
    • With increased dimensions, data becomes sparse
    • Definitions of density and distance between points become less meaningful
    • Classification algorithms do not work well
    https://shapeofdata.wordpress.com/2013/04/02/the-curse-of-dimensionality/
    [Figure: panels for N = 2, N = 3, …, N = 16, N = 17]
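
The point about distances becoming less meaningful in high dimensions can be illustrated with a small simulation. This is a minimal sketch and not part of the talk or of scmap: the uniform random data, the function name `relative_contrast`, and the contrast measure (d_max - d_min) / d_min are all assumptions chosen for illustration. The dimensions 2, 3, 16 and 17 echo the panel labels on the slide; 1000 is added to make the effect obvious.

```python
import math
import random


def relative_contrast(n_dims, n_points=200, seed=0):
    """Return (d_max - d_min) / d_min for Euclidean distances from one
    random query point to n_points random points in the unit hypercube.

    Hypothetical helper for illustration only; uniform data is an assumption.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(n_dims)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(n_dims)]
        dists.append(math.dist(query, point))
    # A small value means the nearest and farthest points are almost
    # equally far away, so "nearest neighbour" carries little information.
    return (max(dists) - min(dists)) / min(dists)


if __name__ == "__main__":
    for n in (2, 3, 16, 17, 1000):
        print(f"N = {n:4d}  relative contrast = {relative_contrast(n):.3f}")
```

In low dimensions the nearest point is much closer than the farthest one, while in high dimensions the two distances become nearly equal. This is one motivation for selecting a limited set of informative features before comparing cells.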