Linked biology – from phenotypes towards phylogenetic trees
MSc dissertation presented to the Graduate Program of the Institute of Computing of the University of Campinas to obtain a Master's (Mestre) degree in Computer Science.
in the production of a huge amount of data, for example: • Phenotype descriptions • Trees of species • Trees of genes • DNA sequence • RNA sequence • etc.
characteristics and behavior of an individual, resulting from the interaction of its genotype (genetic makeup) with the environment. • Eye color • Hair color • The sound of your voice
Ontology: Phenotype and Trait Ontology (PATO). [Figure: excerpt of the PATO class hierarchy showing the terms morphology, size, structure, semicircular, triangular, quadrangular and aliform, connected by subClassOf relations.] Definition (semicircular): A 2-D shape quality that has the shape or form of half a circle.
ontology include • Standardization of terminology • Explicit definitions of concepts • The creation of structured representations of information that facilitate computability.
into EQs makes this difficult in two ways.
1. The mapping between a character state and an EQ is not necessarily one-to-one. Example: the character "Pattern of the end of tail" with the states "banded or uniformly dark" and "uniform pale yellow or white".
2. In the EQ formalism the attribute that forms part of a traditional character description is implicit in the hierarchical structure of the quality ontology. Example: the character "dorsal fin shape" with the state "circular" becomes the entity dorsal fin (VSAO:0000165) and the quality circular (PATO:0000411), where circular (PATO:0000411) subClassOf 2D shape (PATO:0002006) subClassOf shape (PATO:0000052).
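The second point can be illustrated with a minimal Python sketch. It assumes only the four term identifiers and the two subClassOf links shown on the slide. The `EQ` class, the `ancestors` helper and the rule "take the most general superclass as the attribute" are hypothetical names and choices made for illustration, not part of the dissertation.

```python
from dataclasses import dataclass

# subClassOf links copied from the slide; the real PATO hierarchy is far larger.
SUBCLASS_OF = {
    "PATO:0000411": "PATO:0002006",  # circular -> 2D shape
    "PATO:0002006": "PATO:0000052",  # 2D shape -> shape
}

LABELS = {
    "VSAO:0000165": "dorsal fin",
    "PATO:0000411": "circular",
    "PATO:0002006": "2D shape",
    "PATO:0000052": "shape",
}


@dataclass(frozen=True)
class EQ:
    """An Entity-Quality statement: an anatomical entity plus a quality term."""
    entity: str
    quality: str


def ancestors(term: str) -> list[str]:
    """Follow subClassOf links upward and return the superclasses in order."""
    chain = []
    while term in SUBCLASS_OF:
        term = SUBCLASS_OF[term]
        chain.append(term)
    return chain


def implicit_attribute(eq: EQ) -> str:
    """Hypothetical rule: the attribute is the most general superclass of the quality."""
    chain = ancestors(eq.quality)
    return LABELS[chain[-1]] if chain else LABELS[eq.quality]


state = EQ(entity="VSAO:0000165", quality="PATO:0000411")
character = f"{LABELS[state.entity]} {implicit_attribute(state)}"
print(character)  # dorsal fin shape
```

The EQ statement itself never mentions "shape"; the attribute only appears once the quality hierarchy is traversed, which is why recovering traditional characters from EQs needs the ontology.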
[Figure: overview of the approach in seven numbered steps. Data access draws on databases (TreeBase, Dryad, etc.) and on Excel, XML, CSV and RDF files. Steps named on the slide: morphological data matrices and phylogenetic tree extraction, graph import, LSID linking, character association. The resulting graph is connected to ontologies and to Linked Data sources (RDF Book Mashup, DBtune, DBpedia, Jamendo, US Census Data, Project Gutenberg, DBLP, FOAF, Revyu, MusicBrainz, GeoNames, World Factbook).]
Morphological data matrices

Taxon                 Nostrils' form   Transversal section of the tail   Nuchal scales
Varanus albiguralis   2                1                                 2
Varanus brevicauda    1                2                                 1

Character states: • Nostrils' form: 1 = well round, 2 = oval or split-like • Transversal section of the tail: 1 = laterally compressed, 2 = roundish • Nuchal scales: 1 = same size as head scales, 2 = bigger than head scales
Matrix components labelled on the slide: Characters, Character-States, Taxon, Assigned States.
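As a small illustration of how such a matrix can be read, the sketch below stores the characters, their coded states and the taxon rows from the slide, then decodes each row into readable character/state pairs. The names `CHARACTERS`, `MATRIX` and `decode` are assumptions introduced here for the example.

```python
# Characters and their coded states, as listed on the slide.
CHARACTERS = {
    "nostrils' form": {1: "well round", 2: "oval or split-like"},
    "transversal section of the tail": {1: "laterally compressed", 2: "roundish"},
    "nuchal scales": {1: "same size as head scales", 2: "bigger than head scales"},
}

# One row per taxon; codes follow the character order above.
MATRIX = {
    "Varanus albiguralis": [2, 1, 2],
    "Varanus brevicauda": [1, 2, 1],
}


def decode(taxon: str) -> dict[str, str]:
    """Translate a taxon's coded row into character -> state label pairs."""
    row = MATRIX[taxon]
    return {
        character: states[code]
        for (character, states), code in zip(CHARACTERS.items(), row)
    }


for taxon in MATRIX:
    print(taxon, decode(taxon))
```

Keeping the state labels next to the codes makes explicit that a code such as 2 means something different under each character.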
1st Conceptual graph based model
[Figure: the morphological data matrix and its SDD encoding mapped onto graph nodes. Matrix components: Characters, Character-States, Taxon, Assigned States. SDD elements shown on the slide:]
Datasets
  Dataset
    CategoricalCharacter, id="c6"
      Representation: Label "nostrils' form", Detail "Monitors' nostrils may have different forms..."
      States
        StateDefinition, id="s12": Label "well round", Detail "Nostrils look like a quite per..."
        StateDefinition, id="s13": Label "oval or split-like", Detail "Nostrils are not perfectly rou..."
    CodedDescription, id="D1"
      Representation: Label "V. albiguralis", Detail "White-throated monitor. Distribution: Africa (West..."
      SummaryData
        Categorical, ref="c6"
          State, ref="s13"
[Graph node labels shown: OTU Type, OTU, Character-State Type, State, Character, each with Label and Detail.]
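The mapping from SDD elements to graph nodes can be sketched in Python as follows. This is a rough illustration under stated assumptions: the XML is a simplified, namespace-free fragment modelled on the element names visible on the slide, not a complete or valid SDD file. The relation names `STATE_OF` and `HAS_STATE` and the function `to_graph` are hypothetical and are not taken from the dissertation.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free fragment modelled on the slide; not a complete SDD file.
SDD = """
<Dataset>
  <CategoricalCharacter id="c6">
    <Representation><Label>nostrils' form</Label></Representation>
    <States>
      <StateDefinition id="s12">
        <Representation><Label>well round</Label></Representation>
      </StateDefinition>
      <StateDefinition id="s13">
        <Representation><Label>oval or split-like</Label></Representation>
      </StateDefinition>
    </States>
  </CategoricalCharacter>
  <CodedDescription id="D1">
    <Representation><Label>V. albiguralis</Label></Representation>
    <SummaryData>
      <Categorical ref="c6"><State ref="s13"/></Categorical>
    </SummaryData>
  </CodedDescription>
</Dataset>
"""


def label(element: ET.Element) -> str:
    """Read the Label nested under the element's own Representation child."""
    return element.findtext("Representation/Label")


def to_graph(xml_text: str):
    """Return (nodes, edges): nodes map id -> (type, label); edges are (source, relation, target)."""
    root = ET.fromstring(xml_text)
    nodes, edges = {}, []
    for char in root.iter("CategoricalCharacter"):
        nodes[char.get("id")] = ("Character", label(char))
        for state in char.iter("StateDefinition"):
            nodes[state.get("id")] = ("Character-State", label(state))
            # Hypothetical relation name: a state belongs to its character.
            edges.append((state.get("id"), "STATE_OF", char.get("id")))
    for desc in root.iter("CodedDescription"):
        nodes[desc.get("id")] = ("OTU", label(desc))
        for state_ref in desc.iter("State"):
            # Hypothetical relation name: the OTU is assigned this state.
            edges.append((desc.get("id"), "HAS_STATE", state_ref.get("ref")))
    return nodes, edges


nodes, edges = to_graph(SDD)
for edge in edges:
    print(edge)
```

In this sketch the OTU reaches a character only through the state assigned to it, mirroring the OTU, Character-State, Character chain named on the slide.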
[Figure: graph import of two SDD files, Prasinus.sdd and Varanus.sdd, into the 1st conceptual graph based model (node labels: OTU Type, OTU, Character-State Type, State, Character, with Label and Detail). OTUs shown: Varanus beccarii, Varanus bogerti, Varanus komodoensis. The character "nuchal scales" appears several times, with state labels built from the descriptors keeled, triangular, hull-shaped and strongly keeled in one case, and "same size as head scales" and "bigger than head scales" in the other.]
Similarity Index
[Figure: the 1st and 2nd conceptual graph based models side by side. 1st model node labels: OTU Type, OTU, Character-State Type, State, Character, with Label and Detail. 2nd model node labels: OTU Type, OTU, Character Type, Character, Character-State Type, Character-State, with Label and Detail. The final build adds HTU Type, HTU and TreeEdge to the 2nd model.]
Scenario 1
Author     Character                        Taxon              State
Author A   planated parts within the LBS    Pseudosporochnus   restricted to the extremities
Author B   Planation                        Pseudosporochnus   restricted to the extremities

Scenario 2
Author     Character                        Taxon              State
Author C   Planation of vegetative leaves   Pseudosporochnus   inapplicable
Author D   Planation of vegetative leaves   Pseudosporochnus   restricted to the extremities
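The two scenarios suggest that character labels and state assignments carry different evidence when comparing characters from different authors. The Python sketch below makes this concrete using two generic measures chosen here as stand-ins: a string ratio from `difflib` for labels and a Jaccard index for (taxon, state) assignments. These are assumptions for illustration only; the dissertation's own heuristic similarity measure is not described on these slides and may differ.

```python
from difflib import SequenceMatcher


def label_similarity(a: str, b: str) -> float:
    """String similarity of two character labels, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def assignment_similarity(a: set, b: set) -> float:
    """Jaccard index of two sets of (taxon, state) assignments."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# (character label, {(taxon, state), ...}) per author, taken from the two scenarios.
AUTHOR_A = ("planated parts within the LBS",
            {("Pseudosporochnus", "restricted to the extremities")})
AUTHOR_B = ("Planation",
            {("Pseudosporochnus", "restricted to the extremities")})
AUTHOR_C = ("Planation of vegetative leaves",
            {("Pseudosporochnus", "inapplicable")})
AUTHOR_D = ("Planation of vegetative leaves",
            {("Pseudosporochnus", "restricted to the extremities")})


def compare(x, y):
    """Return (label similarity, assignment similarity) for two characters."""
    return label_similarity(x[0], y[0]), assignment_similarity(x[1], y[1])


scenario_1 = compare(AUTHOR_A, AUTHOR_B)
scenario_2 = compare(AUTHOR_C, AUTHOR_D)
print("Scenario 1:", scenario_1)
print("Scenario 2:", scenario_2)
```

In Scenario 1 the labels differ but the assignments agree completely; in Scenario 2 the labels are identical but the assignments disagree, so neither signal is sufficient on its own.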
Author 1 • Cauline cladotaxy • Protoxylem position within the cauline stele • Xylem configuration in the rachis • Xylem configuration in the leaflets • Development of the LBS • Organotaxy of the LBS • Presence of planated parts within the LBS • Extent of the planation
Author 2 • Cauline cladotaxy • Protoxylem position within the cauline stele • Xylem configuration in the rachis • Xylem configuration in the leaflets • Development of the foliar organ • Phyllotaxy • Planation
[Figure: graph of character nodes from both authors. Labels shown: Protoxylem position within the cauline stele, Organotaxy of the LBS, Xylem configuration in the leaflets, Planation, Development of the foliar organ, Phyllotaxy, Xylem configuration in the rachis, Extent of the planation, Presence of planated parts within the LBS, Development of the LBS.]
of phylogenetic trees. [Figure: phylogenetic tree with a root node and the taxa Marattia, Pseudosporochnus, Zygopteris, Equisetum and Ophioglossum, annotated with the characters Webbing within the LBS, Webbing of the terminal units and Branchiness of the LBS. States are coded 0 and 1; state labels shown: absent, present (leaflets), unbranched. The final build adds EvolvedTrait relations to the tree.]
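One simple way to trace trait changes on a tree stored as a graph is to compare the state at each node with the state at its parent and record an edge wherever they differ. The Python sketch below shows that idea only. The topology, the node names and the 0/1 states are entirely hypothetical, since the slide's actual tree and state assignments are not recoverable from the transcript, and the function `trace_changes` is not the dissertation's algorithm.

```python
# Hypothetical topology and states, for illustration only:
# the tree and codes on the slide are not recoverable.
PARENT = {
    "htu1": "root",
    "taxon_a": "htu1",
    "taxon_b": "htu1",
    "taxon_c": "root",
}

# Hypothetical 0/1 codes of one character at every node, internal nodes included.
STATE = {"root": 0, "htu1": 1, "taxon_a": 1, "taxon_b": 0, "taxon_c": 0}


def trace_changes(parent: dict, state: dict) -> list:
    """Return (parent, child, old_state, new_state) for every tree edge where the state changes."""
    return [
        (p, c, state[p], state[c])
        for c, p in parent.items()
        if state[p] != state[c]
    ]


changes = trace_changes(PARENT, STATE)
for p, c, old, new in changes:
    # Each tuple could be stored as an EvolvedTrait relation between the two nodes.
    print(f"{p} -> {c}: {old} -> {new}")
```

This presupposes that states are already known for internal nodes; how those states are obtained is outside the scope of the sketch.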
data that are represented in many standards, often not interconnectable, and designed and implemented an approach to link and combine these resources. • Our approach enables us to discover and make explicit the potential semantics that arise from linking previously unconnected information.
to transform phenotype descriptions and phylogenetic trees into graph representations. • A heuristic similarity measure. • A visual tool prototype to analyze character correlations. • An algorithm to trace changes in traits.
Proceedings of the 6th Seminar on Ontology Research in Brazil, volume 1041, pages 154 – 165, September 2013. • Coupling phenotype descriptions and phylogenetic trees: from SDD to ontologies via graph databases. Talk at TDWG 2013 Annual Conference Florence, Italy, 28th of October – 1st of November 2013.
other knowledge bases. • Further investigation of the similarity measure. • Extend the correlation analysis to the relations between character nodes and ontology terms.
Régine Vignes Lebbe • Laboratory of Information Systems (LIS) • IC infrastructure and staff • Financial support: • CNPq (grant 138197/2011-3) • Microsoft Research FAPESP Virtual Institute (NavScales Project) • CNPq (MuZOO Project and PRONEX-FAPESP) • INCT in Web Science (CNPq 557.128/2009-9) • CAPES • FAPESP