scmap – projection of single-cell RNA-seq data across datasets

Vladimir Kiselev

March 20, 2018

Transcript

  1. scmap – projection of single-cell RNA-seq data across datasets

    Vlad(imir) Kiselev (postdoc @ Martin Hemberg team), Head of Cellular Genetics Informatics team
  2. Single-cell RNA-seq

    The introductory slides were kindly provided by Mike Stubbington (from his Human Cell Atlas presentation).
  3. The Art of Clean Up, Ursus Wehrli

  4. The Art of Clean Up, Ursus Wehrli

  6. Moore’s law in single-cell RNA-seq experiments

    Svensson et al., Nature Protocols, April 2018
  7. Single-cell RNA-seq atlases

    • Human Cell Atlas: October 2016
    • Mouse Cell Atlas: 400,000 single cells, all major mouse organs (Han et al., Cell, February 2018)
    • Fly Cell Atlas: all cells in a fly (~25 million), December 2017
  8. Typical analysis Macosko et al, Nature Biotechnology, 2016

  9. Can we make use of all these data in an integrative manner?
  10. Yes! scmap

    A method for projecting cells from a single-cell RNA-seq dataset onto cell types or individual cells from other experiments. www.bioconductor.org
  11. The Power of bioRxiv

  12. The Power of bioRxiv

  13. How does it work?

    [Figure: Query and Reference; scmap-cluster and scmap-cell; panels a-c (methods: scmap-cluster, scmap-cell, SVM, RF); legend: Cell type A, Cell type B, Cell type C, Unknown cell type.] This cell will be assigned to cell type A.
  14. How does it work?

    [Same figure.] This cell will be assigned to cell type B.
  15. How does it work?

    [Same figure.] This cell will be assigned to cell type C.
  16. How does it work?

    [Same figure.] This cell will be assigned to the cell from cell type A.
  17. How does it work?

    [Same figure.] This cell will be assigned to the cell from cell type C.
  18. How does it work?

    [Same figure.] This cell will be unassigned.
  19. Discovery vs validation

    [Figure: Query and Reference; scmap-cluster and scmap-cell; panels a-c (methods: scmap-cluster, scmap-cell, SVM, RF); labels: Validation, Discovery.]
  20. Datasets

    | Dataset | Organism | Tissue | # of cells | Experimental protocol |
    |---|---|---|---|---|
    | Yan | human | Embryo development | 90 | Tang et al |
    | Goolam | mouse | Embryo development | 124 | Smart-Seq2 |
    | Deng | mouse | Embryo development | 268 | Smart-Seq, Smart-Seq2 |
    | Pollen | human | Cerebral cortex | 301 | SMARTer |
    | Li | human | Colorectal tumors | 561 | SMARTer |
    | Usoskin | mouse | Brain | 622 | STRT-Seq |
    | Kolodziejczyk | mouse | Embryo stem cells | 704 | SMARTer |
    | Xin | human | Pancreas | 1492 | SMARTer |
    | Tasic | mouse | Cortex | 1679 | SMARTer |
    | Baron | mouse | Pancreas | 1886 | inDrop |
    | Muraro | human | Pancreas | 2126 | CEL-Seq2 |
    | Segerstolpe | human | Pancreas | 2209 | Smart-Seq2 |
    | Klein | mouse | Embryo stem cells | 2717 | inDrop |
    | Zeisel | mouse | Brain | 3005 | STRT-Seq UMI |
    | Baron | human | Pancreas | 8569 | inDrop |
    | Shekhar | mouse | Retina | 27499 | Drop-Seq |
    | Macosko | mouse | Retina | 44808 | Drop-Seq |

    We used publicly available datasets to validate and benchmark scmap. In all datasets the cell types were identified by the authors.
  21. Algorithm

    1. Feature (gene, transcript) selection
    2. Index creation
    3. Projection
  22. Feature selection (Reference)

    Curse of dimensionality

    • With increased dimensions data becomes sparse
    • Definitions of density and distance between points become less meaningful
    • Classification algorithms do not work well

    https://shapeofdata.wordpress.com/2013/04/02/the-curse-of-dimensionality/

    [Figure: panels for N = 2, N = 3, …, N = 16, N = 17.]
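
The three steps named on the Algorithm slide (feature selection, index creation, projection) and the "assigned / unassigned" outcomes on the "How does it work?" slides can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the scmap implementation: the slides do not say how features are chosen, how the index is built, or which similarity and cut-off are used. Here variance ranking stands in for feature selection, a per-gene median per cell type stands in for the index, cosine similarity is the similarity measure, and the `threshold=0.7` default is a hypothetical value. All function names are made up for this sketch.

```python
from math import sqrt
from statistics import median, pvariance


def select_features(cells, n_features):
    """Step 1: keep the n_features genes with the highest variance.

    Assumption: variance ranking is a simple stand-in; the slides do not
    state which selection criterion scmap uses.
    """
    genes = sorted(cells[0])
    variance = {g: pvariance([c[g] for c in cells]) for g in genes}
    return sorted(genes, key=lambda g: variance[g], reverse=True)[:n_features]


def build_index(cells, labels, features):
    """Step 2: summarise each reference cell type by its per-gene median."""
    index = {}
    for label in set(labels):
        members = [c for c, l in zip(cells, labels) if l == label]
        index[label] = [median(c[g] for c in members) for g in features]
    return index


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zero)."""
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return sum(a * b for a, b in zip(u, v)) / norm if norm else 0.0


def project(cell, index, features, threshold=0.7):
    """Step 3: return the most similar cell type, or "unassigned".

    The threshold value is hypothetical and chosen only for this example.
    """
    profile = [cell[g] for g in features]
    best = max(index, key=lambda label: cosine(profile, index[label]))
    if cosine(profile, index[best]) < threshold:
        return "unassigned"
    return best


# Toy reference: three cell types, each marked by one gene; g4 is uninformative.
reference = [
    {"g1": 9, "g2": 1, "g3": 0, "g4": 5},
    {"g1": 8, "g2": 0, "g3": 1, "g4": 5},
    {"g1": 1, "g2": 9, "g3": 0, "g4": 5},
    {"g1": 0, "g2": 8, "g3": 1, "g4": 5},
    {"g1": 0, "g2": 1, "g3": 9, "g4": 5},
    {"g1": 1, "g2": 0, "g3": 8, "g4": 5},
]
labels = ["A", "A", "B", "B", "C", "C"]

features = select_features(reference, n_features=3)
index = build_index(reference, labels, features)

looks_like_a = {"g1": 7, "g2": 1, "g3": 1, "g4": 5}
ambiguous = {"g1": 5, "g2": 5, "g3": 5, "g4": 5}

print(project(looks_like_a, index, features))  # A
print(project(ambiguous, index, features))     # unassigned
```

The second query expresses all three marker genes equally, so its similarity to every cell type falls below the cut-off and it is left unassigned rather than forced into the nearest type.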