Hobern_Towards a Digital Knowledge base

Atlas of Living Australia

August 05, 2013

Transcript

  1. Towards a digital knowledge base: Supporting biodiversity science in the 21st Century

    Donald Hobern, GBIF Executive Secretary, dhobern@gbif.org
    Global Biodiversity Information Facility (GBIF)
    Canberra, 13 June 2013
    Photo: Bar-tailed Godwits (Limosa lapponica) and Red Knot (Calidris canutus), in Merimbula, NSW on 10/11/2010. The Red Knot was banded on 25/11/2006, age approximately 2, at Miranda, New Zealand and was sighted again on 23/05/2007 at Broome, WA and back in Miranda on 21/11/2009 (information from Birds Australia).
  2. Centuries of biodiversity research

    Images CC-licensed by ap2il, pneumaticpost, Ben Salter and Thomas Hawk
  3. Decades of biodiversity informatics

  4. Weaknesses with distributed data

    • Curator digitises: Id: CURC-00015436, Species: Acalles dubius, Location: Berlin, Germany, Latitude: 13.41, Longitude: 52.52
    • Taxonomist reidentifies: Id: CURC-00015436, Species: Acalles camelus (was Acalles dubius), Location: Berlin, Germany, Latitude: 13.41, Longitude: 52.52
    • Distribution Modeller uses: Id: CURC-00015436, Species: Acalles dubius, Location: Berlin, Germany, Latitude: 13.41, Longitude: 52.52
    • Environment Agency rejects (x): Id: CURC-00015436, Species: Acalles dubius, Location: Berlin, Germany, Latitude: 13.41, Longitude: 52.52
    • Climate Change Researcher corrects: Id: CURC-00015436, Species: Acalles dubius, Location: Berlin, Germany, Latitude: 52.52 (was 13.41), Longitude: 13.41 (was 52.52)
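The record on this slide carries Latitude 13.41 and Longitude 52.52 for Berlin, which the Climate Change Researcher corrects by swapping the two values. A minimal sketch of how such a swap might be flagged automatically, assuming a rough hand-entered bounding box for the stated country (the `BoundingBox` class, the `GERMANY` box and `check_coordinates` are hypothetical names, not part of the talk or of any GBIF tool):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned latitude/longitude box in decimal degrees."""
    min_lat: float
    max_lat: float
    min_lon: float
    max_lon: float

    def contains(self, lat: float, lon: float) -> bool:
        return (self.min_lat <= lat <= self.max_lat
                and self.min_lon <= lon <= self.max_lon)


# Rough, hand-entered box around Germany (an assumption for illustration only).
GERMANY = BoundingBox(min_lat=47.0, max_lat=55.5, min_lon=5.5, max_lon=15.5)


def check_coordinates(lat: float, lon: float, box: BoundingBox) -> str:
    """Classify a coordinate pair against the expected area."""
    if box.contains(lat, lon):
        return "ok"
    # If the pair only fits once the values are exchanged, a swap is likely.
    if box.contains(lon, lat):
        return "possibly swapped"
    return "outside expected area"


# Values as digitised on the slide, then as corrected.
as_digitised = check_coordinates(13.41, 52.52, GERMANY)
as_corrected = check_coordinates(52.52, 13.41, GERMANY)
```

A check like this only raises a flag; deciding whether the coordinates or the locality text is wrong still needs a person or further evidence.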
  5. Toward a shared knowledgebase

    Contributors: Curator (digitises), Taxonomist (reidentifies), Distribution Modeller (uses, publishes model), Environment Agency (flags issue), Climate Change Researcher (corrects), Technician and Photographer (sequences and images)

    Global Biodiversity Knowledgebase:
    • Id: CURC-00015436, Version: 1, Species: Acalles dubius, Location: Berlin, Germany, Latitude: 13.41, Longitude: 52.52
    • Id: CURC-00015436, Version: 2, Species: Acalles dubius, Location: Berlin, Germany, Latitude: 13.41, Longitude: 52.52, Issue detected: Locality
    • Id: CURC-00015436, Version: 3, Species: Acalles dubius, Location: Berlin, Germany, Latitude: 52.52, Longitude: 13.41
    • Id: CURC-00015436, Version: 4, Species: Acalles camelus, Location: Berlin, Germany, Latitude: 52.52, Longitude: 13.41
    • Id: CURC-00015436, Version: 6, Species: Acalles camelus, Location: Berlin, Germany, Latitude: 52.52, Longitude: 13.41, Sequence: ATTGCA..., Image: 15436.jpg

    Distribution model (Acalles camelus) uses: Id: CURC-00015436, Version: 6, Species: Acalles camelus, Location: Berlin, Germany, Latitude: 52.52, Longitude: 13.41, Sequence: ATTGCA..., Image: 15436.jpg
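The versioned record on this slide can be read as an append-only history in which every contributor adds a new version instead of editing a private copy. The sketch below illustrates that idea under stated assumptions: `VersionedRecord`, `commit` and `at` are hypothetical names, and the pairing of contributors with versions 1 to 4 is one reading of the diagram, not something the slide states outright. It stops at version 4 because the slide does not show version 5.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class VersionedRecord:
    """Append-only history of one occurrence record (illustrative only)."""
    record_id: str
    history: List[Dict[str, Any]] = field(default_factory=list)

    def commit(self, agent: str, action: str, **changes: Any) -> int:
        """Store a new version built from the previous one plus `changes`."""
        fields = dict(self.history[-1]["fields"]) if self.history else {}
        fields.update(changes)
        version = len(self.history) + 1
        self.history.append(
            {"version": version, "agent": agent, "action": action, "fields": fields}
        )
        return version

    def at(self, version: int) -> Dict[str, Any]:
        """Return a copy of the fields as they stood at `version` (1-based)."""
        return dict(self.history[version - 1]["fields"])

    def latest(self) -> Dict[str, Any]:
        return self.at(len(self.history))


record = VersionedRecord("CURC-00015436")
record.commit("Curator", "digitises", species="Acalles dubius",
              location="Berlin, Germany", latitude=13.41, longitude=52.52)
record.commit("Environment Agency", "flags issue", issue="Locality")
# The correction swaps the coordinates and clears the open issue.
record.commit("Climate Change Researcher", "corrects",
              latitude=52.52, longitude=13.41, issue=None)
record.commit("Taxonomist", "reidentifies", species="Acalles camelus")
```

Because earlier versions are never overwritten, a model published against one version can still be traced back to exactly the data it used.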
  6. So what is needed?

    • Data standards
    • Persistent storage
    • Culture of open reuse
    • Collaborative curation
    • Mobilising all primary data
    • Interpretation of primary data
  7. Data standards

    ScientificName: Imbophorus pallidus
    Family: Pterophoridae
    Locality: Stirling Range
    Country: Australia
    State: WA
    Latitude: -34.3
    Longitude: 118.0
    CoordinatePrecision: 10000m
    CoordinateMethod: Google Earth
    DateCollected: 1963-09-15
    BasisOfRecord: Preserved specimen
    TypeStatus: Paratypus

    • What species?
    • Where was it found?
    • When was it found?
    • What is the evidence?
    • Other information
      – Specimen data
      – Sampling event information
      – Sequences, images, etc.
  8. Data standards

    Integrated access for records of the occurrence of any species:
    • What?
    • When?
    • Where?
    • What evidence?
    • Data owner?
    • Link to full record

    Diagram labels: Presence only; Collections; Ecological Monitoring; Genomics; Darwin Core
  9. Data standards

    Integrated access for records of the occurrence of any species:
    • What?
    • When?
    • Where?
    • What evidence?
    • Data owner?
    • Link to full record

    Diagram labels: Presence only; Collections; Ecological Monitoring; Genomics; Darwin Core

    Fully compatible with existing Darwin Core data, plus:
    • Which species were recorded together?
    • Which sets of data are directly comparable?
    • Which species were most abundant in each sample?

    Diagram labels: Presence/absence; Darwin Core + Core Survey Fields; Sample Id; Method Id; Relative abundance; ...
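The extra survey fields named on this slide (Sample Id, Method Id, Relative abundance) are what make it possible to ask which species were recorded together and which were most abundant in each sample. A small sketch of that calculation follows; the counts, the sample and method identifiers, and the function name `relative_abundance` are all made up for illustration and do not come from the talk.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Made-up rows: (sample_id, method_id, species, individual_count).
SURVEY_ROWS: List[Tuple[str, str, str, int]] = [
    ("S1", "M1", "Acalles dubius", 6),
    ("S1", "M1", "Acalles camelus", 2),
    ("S2", "M1", "Acalles camelus", 5),
]


def relative_abundance(rows) -> Dict[str, Dict[str, float]]:
    """Return, per sample, each species' share of the individuals counted."""
    totals: Dict[str, int] = defaultdict(int)
    for sample_id, _method_id, _species, count in rows:
        totals[sample_id] += count

    shares: Dict[str, Dict[str, float]] = defaultdict(dict)
    for sample_id, _method_id, species, count in rows:
        shares[sample_id][species] = count / totals[sample_id]
    return dict(shares)


abundance = relative_abundance(SURVEY_ROWS)
# Species recorded together in sample S1, and the most abundant one.
together_in_s1 = sorted(abundance["S1"])
most_abundant_s1 = max(abundance["S1"], key=abundance["S1"].get)
```

Shares are only comparable between samples that used the same method, which is why the method identifier travels with every row.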
  10. Generalised biodiversity data

    • During a specified recording event
      – At a specified time and place, and
      – With other specified properties
    • One or more specified observers
      – With various specified properties
    Recorded
    • One or more organisms
      – Identified to specified biological taxa, and
      – With other specified properties

    Diagram labels: EVIDENCE; ASSERTION
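The generalised model on this slide (an event, one or more observers, one or more organisms) maps naturally onto a few small record types. The sketch below is one possible rendering, with hypothetical class and field names. The example values reuse the specimen from slide 7, and the observer name is a placeholder because the slide does not give one.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RecordingEvent:
    time: str   # when the recording took place
    place: str  # where it took place
    properties: Dict[str, str] = field(default_factory=dict)


@dataclass
class Observer:
    name: str
    properties: Dict[str, str] = field(default_factory=dict)


@dataclass
class Organism:
    taxon: str  # the biological taxon the organism was identified to
    properties: Dict[str, str] = field(default_factory=dict)


@dataclass
class BiodiversityRecord:
    """During `event`, `observers` recorded `organisms`."""
    event: RecordingEvent
    observers: List[Observer]
    organisms: List[Organism]

    def __post_init__(self) -> None:
        # The slide requires at least one observer and one organism.
        if not self.observers:
            raise ValueError("at least one observer is required")
        if not self.organisms:
            raise ValueError("at least one organism is required")


example = BiodiversityRecord(
    event=RecordingEvent(time="1963-09-15", place="Stirling Range, WA, Australia"),
    observers=[Observer(name="unknown collector")],  # placeholder, not in the source
    organisms=[Organism(taxon="Imbophorus pallidus",
                        properties={"basisOfRecord": "Preserved specimen"})],
)
```

Keeping open-ended `properties` on each part mirrors the "other specified properties" bullets without fixing a schema in advance.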
  11. Data standards – Darwin Core Archive

    Darwin Core Archive
    • GBIF and ALA’s preferred standard for sharing data
    • ZIP file with data spreadsheet and metadata
    • Easily stored, interpreted and replicated
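To make "ZIP file with data spreadsheet and metadata" concrete, the sketch below packs one occurrence row into a ZIP together with a `meta.xml` descriptor that maps columns to Darwin Core terms. It is a simplified illustration, not a validated archive: the record id `example-1`, the choice of columns, and the helper name `build_archive` are assumptions, the row reuses the values from slide 7 under current Darwin Core term names, and a dataset-level metadata document (commonly `eml.xml`) is omitted.

```python
import csv
import io
import zipfile

COLUMNS = ["id", "scientificName", "decimalLatitude",
           "decimalLongitude", "eventDate", "basisOfRecord"]

# One row based on slide 7; the id is a made-up placeholder.
ROWS = [["example-1", "Imbophorus pallidus", "-34.3", "118.0",
         "1963-09-15", "PreservedSpecimen"]]

# Descriptor telling a reader how occurrence.csv is laid out.
META_XML = """<?xml version="1.0" encoding="UTF-8"?>
<archive xmlns="http://rs.tdwg.org/dwc/text/">
  <core encoding="UTF-8" fieldsTerminatedBy="," linesTerminatedBy="\\n"
        fieldsEnclosedBy='"' ignoreHeaderLines="1"
        rowType="http://rs.tdwg.org/dwc/terms/Occurrence">
    <files><location>occurrence.csv</location></files>
    <id index="0"/>
    <field index="1" term="http://rs.tdwg.org/dwc/terms/scientificName"/>
    <field index="2" term="http://rs.tdwg.org/dwc/terms/decimalLatitude"/>
    <field index="3" term="http://rs.tdwg.org/dwc/terms/decimalLongitude"/>
    <field index="4" term="http://rs.tdwg.org/dwc/terms/eventDate"/>
    <field index="5" term="http://rs.tdwg.org/dwc/terms/basisOfRecord"/>
  </core>
</archive>
"""


def build_archive(rows) -> bytes:
    """Return the bytes of a ZIP holding occurrence.csv and meta.xml."""
    table = io.StringIO()
    writer = csv.writer(table, lineterminator="\n")
    writer.writerow(COLUMNS)   # header row, skipped via ignoreHeaderLines="1"
    writer.writerows(rows)

    archive = io.BytesIO()
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("occurrence.csv", table.getvalue())
        zf.writestr("meta.xml", META_XML)
    return archive.getvalue()


archive_bytes = build_archive(ROWS)
```

Writing `archive_bytes` to a file with a `.zip` extension gives a single self-describing package, which is what makes the format easy to store and replicate.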
  12. Persistent storage

    • Long-term data management
    • Stable data identifiers and citation
    • Data replication
  13. Culture of open reuse

  14. Collaborative curation

  15. Collaborative curation

  16. Mobilisation of all primary data

  17. Building the knowledgebase

    • e-Infrastructure (services and culture to support data reuse)
    • Organised Data (specific views and indexes of data)
    • Models (best estimates of reality)
    • Primary Data (all streams of biodiversity data)
    • Assessments and Indicators
    • Environmental, Climatic and Sociological Data
  18. Parallels with climate data https://mitpress.mit.edu/books/vast-machine

  19. Thank you

    Donald Hobern, dhobern@gbif.org
    GBIF Director
    Global Biodiversity Information Facility (GBIF)
    June 2013