All the Clades in the World: Building a Semantically-Rich and Testable Ontology of Phylogenetic Clade Definitions

Presentation on the Phyloreferencing project (http://phyloref.org), delivered at the SPNHC+TDWG 2018 meeting in Dunedin, New Zealand. The abstract for this talk is available at http://dx.doi.org/10.3897/biss.2.25776

Gaurav Vaidya

August 29, 2018

Transcript

  1. All The Clades In The World
    Building a Semantically-Rich and Testable
    Ontology of Phylogenetic Clade Definitions
    Gaurav Vaidya, Hilmar Lapp, Nico Cellinese
    www.phyloref.org

  2. Overview
    • The limitations of names and the value of clade
    definitions
    • Curating clade definitions in the Web Ontology
    Language (OWL)

  3. What’s in a name?
    • Names and Concepts do not
    reconcile that easily
    • Names are text strings
    • Context is lacking or subjective
    • Meaning is not computable

  4. Linnean names point to concepts
    Antoine Laurent de Jussieu
    Genera Plantarum, 1789

  5. Linnean names point to concepts
    Antoine Laurent de Jussieu
    Genera Plantarum, 1789

  6. Linnean names point to concepts
    Antoine Laurent de Jussieu
    Genera Plantarum, 1789
    …and 200+
    …and 400+

  7. Our life perspective

  8. Genealogical relationships among taxa
    Life organized in nested hierarchies

  9. What is a clade?
    A clade, also known as a monophyletic group,
    represents a branch on the tree of life. It is a group
    of organisms that consists of a common ancestor
    and all its descendants.
    [Diagram labels: Node = ancestor; Descendants]
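
The definition above (a common ancestor plus all of its descendants) can be sketched in a few lines of Python. This is only an illustration: the toy tree, its node labels, and the `clade` helper are hypothetical and are not taken from the talk or from the Phyloreferencing code.

```python
# Hypothetical toy tree stored as a parent -> children mapping.
TREE = {
    "root": ["A", "n1"],
    "n1": ["B", "n2"],
    "n2": ["C", "D"],
}


def clade(tree, ancestor):
    """Return the ancestor together with every node descended from it."""
    members = {ancestor}
    stack = [ancestor]
    while stack:
        node = stack.pop()
        for child in tree.get(node, []):
            members.add(child)
            stack.append(child)
    return members


if __name__ == "__main__":
    # The clade rooted at n2 contains n2 and its two descendant tips.
    print(sorted(clade(TREE, "n2")))
```

Any set of tips that cannot be produced this way from a single node (for example B and D without C) is not a clade on this tree.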

  10. (No text on this slide.)

  11. • We can unclutter concepts, and thereby
    nomenclature

  12. Tree-thinking
    Common descent → evolution at the center of taxonomy
    [Tree diagram: tips A, B, C, D; labels: Branches, Synapomorphies, Clades = taxa]
    Discovery

  13. Tree-thinking
    Common descent → evolution at the center of taxonomy
    Discovery
    Communication
    How??
    [Plot: density of diversification rate]

  14. Tree-thinking
    Berberidopsidaceae
    Opiliones
    Zingiberaceae
    Hamamelidaceae
    Sarcolaenaceae
    Lingulidae
    Hymenoptera
    Mammalia
    Apocynaceae
    Galliformes
    Rubiaceae
    Anarthriaceae
    Lineidae
    Crocodylidae
    Stylosiphonia
    Andrenidae Cracidae
    Gavialis
    Globba
    Glottidia
    Micrella
    Streptothamnus
    Rhodoleia
    Phalangiidae Tachyglossa
    Lyginia
    Mediusella
    Chamaeclitandra

  15. Tree-thinking
    Berberidopsidaceae
    Opiliones
    Zingiberaceae
    Hamamelidaceae
    Sarcolaenaceae
    Lingulidae
    Hymenoptera
    Mammalia
    Apocynaceae
    Galliformes
    Rubiaceae
    Anarthriaceae
    Lineidae
    Crocodylidae
    Stylosiphonia
    Andrenidae Cracidae
    Gavialis
    Globba
    Glottidia
    Micrella
    Streptothamnus
    Rhodoleia
    Phalangiidae Tachyglossa
    Lyginia
    Mediusella
    Chamaeclitandra
    These names are not generated in an evolutionary-based framework
    (Groups defined by character similarity vs. common descent)

  16. Both the Encyclopedia of Life (EOL) and the Open Tree of Life suggest that
    Campanuloideae is a misspelling of Campaniloidea (marine gastropods!)
    GBIF does not currently have Campanuloideae in its backbone taxonomy.
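
One plausible reason for such a suggestion is that the two names differ by only a couple of characters, so a matcher that sees names purely as text strings treats them as near-duplicates. This is an assumption made for illustration: the slide does not say how EOL or the Open Tree of Life generate their suggestions. The sketch below computes a standard Levenshtein edit distance; the `edit_distance` function name is hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum single-character insertions,
    deletions, or substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,                        # delete char_a
                curr[j - 1] + 1,                    # insert char_b
                prev[j - 1] + (char_a != char_b),   # substitute or match
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    # A plant subfamily and a gastropod superfamily, two edits apart.
    print(edit_distance("Campanuloideae", "Campaniloidea"))
```

A small edit distance says nothing about evolutionary relatedness, which is the point of the slide: the string alone carries no computable meaning.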

  17. Are you kidding me?
    These are the Campanuloideae!
    Wang et al. 2014

  18. (No text on this slide.)

  19. (No text on this slide.)

  20. Life as a street map
    How to navigate life as a machine

  21. Mapping data to phylogenetic
    knowledge space

  22. Phylogenetic Definitions
    Statements formally expressing the patterns we discover
    (analogous to map coordinates)
    • Node-Based (minimum clade): The clade originating with the last common ancestor of B and C.
    • Branch-Based (maximum clade): The clade originating with the first ancestor of B that is not an ancestor of A.
    • Apomorphy-Based: The clade originating with the first ancestor of C to evolve X.
    [Diagrams: one tree with tips A, B, C per definition; X marks the apomorphy]
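
The three kinds of definition can be sketched as lookups on a toy tree. Everything here is hypothetical and for illustration only: the tree (which adds a tip D to the slide's A, B, C), the internal node labels, the helper names, and the set of nodes assumed to carry trait X. The apomorphy sketch also assumes X arose once and was never lost on the lineage leading to C.

```python
# Hypothetical toy tree as a child -> parent mapping: (A, (D, (B, C))).
PARENT = {"A": "root", "n1": "root", "D": "n1", "n2": "n1", "B": "n2", "C": "n2"}


def lineage(node):
    """The node followed by its ancestors, ending at the root."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def clade(ancestor):
    """All nodes whose lineage passes through the given ancestor."""
    all_nodes = set(PARENT) | set(PARENT.values())
    return {n for n in all_nodes if ancestor in lineage(n)}


def node_based(spec_1, spec_2):
    """Last common ancestor of two specifiers (minimum clade)."""
    other = set(lineage(spec_2))
    return next(n for n in lineage(spec_1) if n in other)


def branch_based(internal, external):
    """First (earliest) ancestor of `internal` that is not an
    ancestor of `external` (maximum clade)."""
    excluded = set(lineage(external))
    return [n for n in lineage(internal) if n not in excluded][-1]


def apomorphy_based(specifier, nodes_with_trait):
    """First (earliest) ancestor of `specifier` that carries the trait."""
    return [n for n in lineage(specifier) if n in nodes_with_trait][-1]


if __name__ == "__main__":
    print(sorted(clade(node_based("B", "C"))))      # minimum clade for B and C
    print(sorted(clade(branch_based("B", "A"))))    # maximum clade for B, excluding A
    print(sorted(clade(apomorphy_based("C", {"n2", "B", "C"}))))
```

On this tree the node-based definition picks the smaller clade (B and C only), while the branch-based definition also takes in D, which is why the slide labels them minimum and maximum clades.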

  23. http://purl.obolibrary.org/obo/UBERON_0001702

  24. http://purl.obolibrary.org/obo/UBERON_0001702

  25. (No text on this slide.)

  26. Hillis and Wilcox, 2005
    http://dx.doi.org/10.1016/j.ympev.2004.10.007

  27. http://phyloref.org/curation-tool/

  28. (No text on this slide.)

  29. (No text on this slide.)

  30. Yuan et al. 2016 BEAST tree (Fig. 3), as curated by the Open Tree of Life
    (https://doi.org/10.1093/sysbio/syw055)

  31. (No text on this slide.)

  32. (No text on this slide.)

  33. (No text on this slide.)

  34. Phyloreferences
    • Portable to any phylogeny that has all the specifiers
    • Resolvable on any phylogeny represented in RDF by
    using an OWL 2 DL reasoner
    • Extensible to other kinds of definitions if need be
    • Working on better ways to match specifiers
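
The portability point in the first bullet can be sketched in plain Python. The project itself resolves phyloreferences with an OWL 2 DL reasoner over phylogenies in RDF; the code below is a simplified stand-in for that, not the project's method. The phyloreference dictionaries, both phylogenies, and the `resolve` function are hypothetical.

```python
def lineage(parent, node):
    """The node followed by its ancestors in the given child -> parent map."""
    path = [node]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def resolve(phyloref, parent):
    """Return the node a phyloreference resolves to on a phylogeny,
    or None if a specifier is missing or no node satisfies it."""
    nodes = set(parent) | set(parent.values())
    specifiers = phyloref["internal"] + phyloref["external"]
    if not all(s in nodes for s in specifiers):
        return None  # not portable to this phylogeny
    first, *rest = phyloref["internal"]
    # Ancestors shared by all internal specifiers, ordered tip to root.
    shared = [n for n in lineage(parent, first)
              if all(n in lineage(parent, r) for r in rest)]
    excluded = {n for e in phyloref["external"] for n in lineage(parent, e)}
    candidates = [n for n in shared if n not in excluded]
    if not candidates:
        return None
    # External specifiers ask for the largest clade that excludes them;
    # otherwise take the smallest clade holding all internal specifiers.
    return candidates[-1] if phyloref["external"] else candidates[0]


# Two hypothetical phylogenies that share only some tips.
PHYLOGENY_1 = {"A": "root", "n1": "root", "D": "n1", "n2": "n1", "B": "n2", "C": "n2"}
PHYLOGENY_2 = {"B": "m1", "C": "m1", "m1": "top", "E": "top"}

NODE_REF = {"internal": ["B", "C"], "external": []}
BRANCH_REF = {"internal": ["B"], "external": ["A"]}

if __name__ == "__main__":
    for name, tree in [("phylogeny 1", PHYLOGENY_1), ("phylogeny 2", PHYLOGENY_2)]:
        print(name, resolve(NODE_REF, tree), resolve(BRANCH_REF, tree))
```

The same definition resolves to different nodes on different phylogenies, and it resolves to nothing on a phylogeny that lacks one of its specifiers, which is why specifier matching matters.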

  35. http://www.github.com/phyloref

  36. http://phyloref.org/blog/
    http://twitter.com/phyloref

  37. Acknowledgements
    • Funded by the US National Science Foundation
    through collaborative grants DBI-1458484
    and DBI-1458604.
