
CSHL Biological Data Science 2022

Evan Biederstedt
November 10, 2022

Transcript

  1. Cell Annotation Platform
    Evan Biederstedt, Department of Biomedical Informatics
    Defining Cell Types and States for the Human Cell Atlas and Beyond
  2. Motivation
    Problem: We don’t know our cells
    • Human body: ~37 trillion cells
    • Textbooks: “300” major cell types?
    • Genomics: 100+ subtypes of neurons in the human retina alone
    Source: https://www.genome.gov/Multimedia/Slides/GSPFuture2014/10_Regev.pdf
  3. Motivation
    Consider each cell in a 20,000+-dimensional gene expression space
    Gene expression profile: cellular function + cell identity
    [Figure: cells (e.g. dendritic cell, T cell) placed on axes Gene 1, Gene 2, Gene 3, …]
  4. Motivation: Annotations
    • MKI67: gene associated with cellular proliferation
    • HBB: gene associated with hemoglobin production
    • PAX2: gene associated with kidney development
    Source: https://www.nature.com/articles/s41588-021-00818-x
  5. Motivation: Annotations
    Researchers manually examine prominent molecular patterns in light of prior biological knowledge, and annotate cells.
    Problem: There’s no standard source of accumulated annotations with associated molecular data for researchers to explore.
  6. Motivation: Annotations
    [Figure labels: NK Cells, Cytokines, Monocytes]
    Problem
    • Individual research groups end up annotating (potentially millions of) cells manually, which results in inconsistent terms and labels between groups.
    • This approach cannot scale. We need a solution for creating comprehensive references with a standardized nomenclature for all species.
    • There’s no medium for researchers to compare annotations across studies and potentially resolve conflicting results.
    • There’s no central location to access annotations used in publications.
    • How can we create a Human Cell Atlas?
  7. Cell Annotation Platform (CAP)
    • Community-driven platform to create, explore, and store annotations
    • Infrastructure to accumulate, share, and analyze annotation terms with associated molecular signatures to interpret cellular identities
    • Encourage researchers to converge upon consensus nomenclature
    • Homepage
  8. Cell Annotation Platform (CAP): Main Components
    • Data Repository
    • Annotation Upload and Publication
    • Annotation UI: Create Annotations
    • “CellCards” Reference Summaries
  9. MVP User Workflow
    1. Sign in & User Profile
    2. Upload (annotated or unannotated) data
    3. Collaboratively edit and save
    4. Publish version (with DOI)
    5. Downloadable results
    6. Browse / Search
    Upload + Edit + Publish + Download
  10. CAP organization
    • Workspace: Collaborative “repo” for researchers to organize annotations
    • Publication: Version
    • Datasets: Cell annotations with molecular data
    • Cell Label: Term associated with a cell or molecular subpopulation
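
The organization on this slide can be read as a small data model. The sketch below is only one possible reading, not CAP's actual schema: every class name, field, and the `publish` method are hypothetical, and it assumes that a publication is a frozen, numbered copy of the workspace's datasets.

```python
import copy
from dataclasses import dataclass, field
from typing import List


@dataclass
class CellLabel:
    term: str                                   # e.g. "T cell"
    cell_ids: List[str] = field(default_factory=list)


@dataclass
class Dataset:
    name: str
    labels: List[CellLabel] = field(default_factory=list)


@dataclass
class Publication:
    version: int                                # assumed: simple running counter
    datasets: List[Dataset] = field(default_factory=list)


@dataclass
class Workspace:
    name: str
    datasets: List[Dataset] = field(default_factory=list)
    publications: List[Publication] = field(default_factory=list)

    def publish(self) -> Publication:
        # Deep-copy the working datasets so that later edits in the
        # workspace leave the published version untouched.
        pub = Publication(version=len(self.publications) + 1,
                          datasets=copy.deepcopy(self.datasets))
        self.publications.append(pub)
        return pub


# Toy usage with made-up names.
ws = Workspace("example-workspace")
ws.datasets.append(Dataset("sample-1", [CellLabel("T cell", ["c1", "c2"])]))
v1 = ws.publish()
ws.datasets[0].labels[0].term = "CD8+ T cell"   # edit made after publishing
```

Copying on publish is what lets a citable version stay stable while collaborators keep editing the workspace.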
  11. Publications
    • Collections of datasets, typically corresponding to a scientific journal article
    • Timestamped
    • DOIs for citations in journals
    • Versioning
    • Downloadable annotations in standardized formats
  12. Workspace
    • Collaborative space to edit collections of annotations & other relevant metadata
    • Advanced user form
    • Specify which annotations & which metadata fields are relevant
    • Allow user to “hide” irrelevant metadata within dataset
  13. Workspace
    • Autocomplete recommendations (with synonyms and related terms) from EMBL-EBI ontologies
    • “Nudges” to encourage consensus and standardization (if possible), but no requirements
  14. Collaborations
    • User roles for collaborative work on annotations
    • User roles:
        • viewer (read-only)
        • editor (write access)
        • owner (administrative)
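
The three roles can be sketched as an ordered permission check. This is a minimal illustration, not CAP's implementation: the action names and the assumption that roles are strictly hierarchical (an owner can do everything an editor can, and so on) are not stated on the slide.

```python
from enum import IntEnum


class Role(IntEnum):
    VIEWER = 1   # read-only
    EDITOR = 2   # write access
    OWNER = 3    # administrative


# Hypothetical action names mapped to the lowest role allowed to perform them.
REQUIRED_ROLE = {
    "read": Role.VIEWER,
    "write": Role.EDITOR,
    "administer": Role.OWNER,
}


def is_allowed(role: Role, action: str) -> bool:
    """Return True if `role` is at least the role required for `action`."""
    return role >= REQUIRED_ROLE[action]
```

For example, `is_allowed(Role.EDITOR, "write")` is true while `is_allowed(Role.VIEWER, "write")` is false.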
  15. Cell Synonyms & Categories & Evidence
    • Synonyms
    • Categories, e.g. “CD8+ T cell” is a subset of “T Lymphocyte”
    • Relationships between annotations
    • Rationales
    • List of marker genes used
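
Synonyms and categories can be sketched with two lookup tables: one mapping synonyms to a canonical term, and one mapping each term to its parent category. The tables and function names below are hypothetical and hold only the slide's example plus one illustrative synonym; a real platform would draw these from an ontology.

```python
# Synonym -> canonical term (illustrative entries only).
SYNONYMS = {
    "T cell": "T Lymphocyte",
}

# Term -> parent category, following the slide's example.
PARENT = {
    "CD8+ T cell": "T Lymphocyte",
}


def canonical(term: str) -> str:
    """Resolve a synonym to its canonical term; unknown terms map to themselves."""
    return SYNONYMS.get(term, term)


def is_subset_of(term: str, category: str) -> bool:
    """Return True if `term` lies strictly below `category` in the hierarchy."""
    target = canonical(category)
    current = canonical(term)
    seen = set()                      # guards against accidental cycles
    while current in PARENT and current not in seen:
        seen.add(current)
        current = PARENT[current]
        if current == target:
            return True
    return False
```

Because the category argument is resolved through the synonym table first, `is_subset_of("CD8+ T cell", "T cell")` gives the same answer as asking about “T Lymphocyte”.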
  16. Interactive Exploration: Molecular Data
    For every dataset on CAP, any user may:
    • Explore the annotations associated with this dataset
    • Select cells on the embedding
    • Explore the heat maps with precalculated differential expression (DE) values for each annotation
    • Using the selection tool, select cells and calculate new DE values
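
To make "calculate new DE values for a selection" concrete, here is a minimal sketch that compares the selected cells against all other cells with a per-gene log2 fold change. The slides do not say which DE method CAP uses, so the statistic, the pseudocount, the function name, and the toy counts are all assumptions.

```python
import math


def log2_fold_change(expr, selected, pseudocount=1.0):
    """Per-gene log2 fold change of selected cells versus the remaining cells.

    expr: dict mapping cell id -> dict mapping gene -> expression value
    selected: set of cell ids chosen with a selection tool
    """
    inside = [c for c in expr if c in selected]
    outside = [c for c in expr if c not in selected]
    if not inside or not outside:
        raise ValueError("need at least one selected and one unselected cell")

    genes = sorted({g for profile in expr.values() for g in profile})
    result = {}
    for g in genes:
        mean_in = sum(expr[c].get(g, 0.0) for c in inside) / len(inside)
        mean_out = sum(expr[c].get(g, 0.0) for c in outside) / len(outside)
        # The pseudocount keeps the ratio finite when a gene is unexpressed.
        result[g] = math.log2((mean_in + pseudocount) / (mean_out + pseudocount))
    return result


# Toy counts, invented for illustration.
toy_expr = {
    "c1": {"HBB": 7.0, "MKI67": 0.0},
    "c2": {"HBB": 7.0, "MKI67": 0.0},
    "c3": {"HBB": 1.0, "MKI67": 0.0},
    "c4": {"HBB": 1.0, "MKI67": 0.0},
}
de = log2_fold_change(toy_expr, selected={"c1", "c2"})
```

A production tool would normally use a statistical test with multiple-testing correction instead of a bare fold change; the sketch only shows the shape of the computation.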
  17. Interactive Exploration: Molecular Data
    Users can now interactively add annotations:
    • Users select cells (based either on predefined clusters, or selections via the selection tool), and add cell annotations
    • User roles: only users who own the dataset or have been invited as collaborators can add annotations
  18. Annotation Transfer
    • REF dataset used to transfer cell annotations to QUERY dataset
    • A promising way to relieve the bottleneck posed by cell annotation
    Source: Clarke, Z.A., Andrews, T.S., Atif, J. et al. Tutorial: guidelines for annotating single-cell transcriptomic maps using automated and manual methods. Nat Protoc 16, 2749–2764 (2021).
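
The REF-to-QUERY idea can be illustrated with a nearest-centroid classifier: average the reference cells of each label, then give each query cell the label of the closest average. This is a deliberately simple stand-in, not the method used by CAP or described in the cited tutorial; the function names and toy profiles are hypothetical.

```python
import math
from collections import defaultdict


def label_centroids(ref_profiles, ref_labels):
    """Average the expression profiles of reference cells sharing a label."""
    grouped = defaultdict(list)
    for cell_id, profile in ref_profiles.items():
        grouped[ref_labels[cell_id]].append(profile)
    return {
        label: [sum(column) / len(column) for column in zip(*profiles)]
        for label, profiles in grouped.items()
    }


def transfer_labels(ref_profiles, ref_labels, query_profiles):
    """Assign each query cell the label of the nearest reference centroid."""
    centroids = label_centroids(ref_profiles, ref_labels)
    return {
        cell_id: min(centroids, key=lambda lab: math.dist(profile, centroids[lab]))
        for cell_id, profile in query_profiles.items()
    }


# Toy two-gene profiles, invented for illustration.
ref = {"r1": [9.0, 0.0], "r2": [7.0, 1.0], "r3": [0.0, 8.0], "r4": [1.0, 6.0]}
ref_labels = {"r1": "T cell", "r2": "T cell", "r3": "Monocyte", "r4": "Monocyte"}
query = {"q1": [8.0, 1.0], "q2": [0.5, 7.5]}
predicted = transfer_labels(ref, ref_labels, query)
```

The predictions are only suggestions: a researcher would still review each transferred label before keeping it.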
  19. Upcoming: Annotation Transfer
    • User chooses model & transfer algorithm
    • View predictions overlaid on molecular data: accept/edit/decline
    • Publish & share
    We are eager to include more models! Talk to us!
  20. Thank you!
    • Nils Gehlenborg
    • John Marioni
    • Rahul Satija
    • David Osumi-Sutherland
    • Aviv Regev
    • Peter Kharchenko
    • Chloé Villani
  21. Denis Ilguzin, Maxim Svetlakov, Levon Ghukasyan, Michael Loktionov, Sultan Arapov, Mary Futey, Nick Akhmetov, Lusine Barseghyan, Tigran Markosjan, Konstantin Boyandin, Uğur Bayindir, David Osumi-Sutherland, Pavel Istomin, Dennis Bolgov, Andrey Isaev, Mo Lotfollahi, David Fischer, Evan Biederstedt