VertNet bettertaxonomy.py presentation

Gaurav Vaidya

May 28, 2014

Transcript

  1. Scientific names (genus, species)

  2. Scientific names (genus, species)

    Meaningless name
    Semi-meaningless name
    Meaningful name
  3. Scientific names (genus, species)

    Meaningless name: UNKNOWN, Blank, Incidental
    Semi-meaningless name
    Meaningful name
  4. Scientific names (genus, species)

    Meaningless name: UNKNOWN, Blank, Incidental
    Semi-meaningless name: "FROG OR TOAD, UNIDENTIFIED"; "Green bird with blunt bill. Unidentified."; "UNKNOWN ANTSHRIKE"
    Meaningful name
  5. Scientific names (genus, species)

    Meaningless name: UNKNOWN, Blank, Incidental
    Semi-meaningless name: "FROG OR TOAD, UNIDENTIFIED"; "Green bird with blunt bill. Unidentified."; "UNKNOWN ANTSHRIKE"
    Meaningful name
      Outdated name
      Current name
  6. Scientific names (genus, species)

    Meaningless name: UNKNOWN, Blank, Incidental
    Semi-meaningless name: "FROG OR TOAD, UNIDENTIFIED"; "Green bird with blunt bill. Unidentified."; "UNKNOWN ANTSHRIKE"
    Meaningful name
      Outdated name: Oncifelis, Acanthidositta, Acanthocottus
      Current name
  7. Scientific names (genus, species)

    Meaningless name: UNKNOWN, Blank, Incidental
    Semi-meaningless name: "FROG OR TOAD, UNIDENTIFIED"; "Green bird with blunt bill. Unidentified."; "UNKNOWN ANTSHRIKE"
    Meaningful name
      Outdated name: Oncifelis, Acanthidositta, Acanthocottus
      Current name: Leopardus, Acanthisitta, Myoxocephalus
  8. Scientific names (genus, species) Python 3 script (bettertaxonomy.py)

  9. Scientific names (genus, species) Python 3 script (bettertaxonomy.py)

    1. Internal database
  10. Scientific names (genus, species) Python 3 script (bettertaxonomy.py)

    1. Internal database
    2. GBIF Checklists
  11. Scientific names (genus, species) Python 3 script (bettertaxonomy.py)

    1. Internal database
    2. GBIF Checklists
    3. TaxRefine
  12. Scientific names (genus, species) Python 3 script (bettertaxonomy.py)

    1. Internal database: Local CSV
    2. GBIF Checklists
    3. TaxRefine
  13. Scientific names (genus, species) Python 3 script (bettertaxonomy.py)

    1. Internal database: Local CSV, Dropbox/GitHub
    2. GBIF Checklists
    3. TaxRefine
  14. Scientific names (genus, species) Python 3 script (bettertaxonomy.py)

    1. Internal database: Local CSV, Dropbox/GitHub
    2. GBIF Checklists: Mammal Species of the World
    3. TaxRefine
  15. Scientific names (genus, species) Python 3 script (bettertaxonomy.py)

    1. Internal database: Local CSV, Dropbox/GitHub
    2. GBIF Checklists: Mammal Species of the World, ITIS
    3. TaxRefine
  16. Scientific names (genus, species) Python 3 script (bettertaxonomy.py)

    1. Internal database: Local CSV, Dropbox/GitHub
    2. GBIF Checklists: Mammal Species of the World, ITIS
    3. TaxRefine
  17. Scientific names (genus, species) Python 3 script (bettertaxonomy.py)

    1. Internal database: Local CSV, Dropbox/GitHub
    2. GBIF Checklists: Mammal Species of the World, ITIS
    3. TaxRefine
  18. Scientific names (genus, species) Python 3 script (bettertaxonomy.py)

    1. Internal database: Local CSV, Dropbox/GitHub
    2. GBIF Checklists: Mammal Species of the World, ITIS, NCBI Taxonomy
    3. TaxRefine
  19. Scientific names (genus, species) Python 3 script (bettertaxonomy.py)

    1. Internal database: Local CSV, Dropbox/GitHub
    2. GBIF Checklists: Mammal Species of the World, ITIS, NCBI Taxonomy, Fishbase
    3. TaxRefine
  20. Scientific names (genus, species) Python 3 script (bettertaxonomy.py)

    1. Internal database: Local CSV, Dropbox/GitHub
    2. GBIF Checklists: Mammal Species of the World, ITIS, NCBI Taxonomy, Fishbase, …
    3. TaxRefine
  21. Scientific names (genus, species) Python 3 script (bettertaxonomy.py)

    1. Internal database: Local CSV, Dropbox/GitHub
    2. GBIF Checklists: Mammal Species of the World, ITIS, NCBI Taxonomy, Fishbase, …
    3. TaxRefine
  22. Scientific names (genus, species) Python 3 script (bettertaxonomy.py)

    1. Internal database: Local CSV, Dropbox/GitHub
    2. GBIF Checklists: Mammal Species of the World, ITIS, NCBI Taxonomy, Fishbase, …
    3. TaxRefine: Semi-meaningless names!
  23. Usage • $ python3 better_taxonomy.py input.csv -i internal.csv > output.csv

  24. Usage • $ python3 better_taxonomy.py input.csv -i internal.csv > output.csv

    • Program: https://github.com/gaurav/bettertaxonomy/tree/develop
  25. Usage • $ python3 better_taxonomy.py input.csv -i internal.csv > output.csv

    • Program: https://github.com/gaurav/bettertaxonomy/tree/develop
    • From: https://docs.google.com/spreadsheets/d/16Dpuo-NqXjpjCLVrHHyQ0uvQxNQA35oIQOewPHrLLfg/edit?usp=sharing
  26. Usage • $ python3 better_taxonomy.py input.csv -i internal.csv > output.csv

    • Program: https://github.com/gaurav/bettertaxonomy/tree/develop
    • From: https://docs.google.com/spreadsheets/d/16Dpuo-NqXjpjCLVrHHyQ0uvQxNQA35oIQOewPHrLLfg/edit?usp=sharing
    • To: https://docs.google.com/spreadsheets/d/1Jpr6stWGR3qe0swQ4_215kogau2lqmWxaDivBbGVd60/edit?usp=sharing
  27. Next steps

    1. Speed: caching, combine queries
    2. Prioritise checklists: by class, in configuration file
    3. Higher taxonomy: support semi-meaningful names
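
The three-stage lookup from slides 9 to 22 (internal database, then GBIF Checklists, then TaxRefine) and the caching idea from the "Next steps" slide can be sketched in Python roughly as follows. This is an illustration only, not the code in bettertaxonomy.py: the function names, the `outdated,current` column layout of the internal CSV, and the stub lookups standing in for GBIF Checklists and TaxRefine are all assumptions made for the example. The three genus pairs are the outdated and current names from slide 7, matched in the order they appear there.

```python
import csv
import io
from functools import lru_cache
from typing import Optional

# Hypothetical internal database. The column names are an assumption;
# the genus names are the examples shown on slide 7.
INTERNAL_CSV = """\
outdated,current
Oncifelis,Leopardus
Acanthidositta,Acanthisitta
Acanthocottus,Myoxocephalus
"""


def load_internal(text: str) -> dict:
    """Parse CSV text into an outdated -> current name mapping."""
    rows = csv.DictReader(io.StringIO(text))
    return {row["outdated"]: row["current"] for row in rows}


INTERNAL = load_internal(INTERNAL_CSV)


def lookup_internal(name: str) -> Optional[str]:
    """Stage 1: exact match against the local mapping."""
    return INTERNAL.get(name)


def lookup_gbif(name: str) -> Optional[str]:
    """Stage 2 stub: a real version would query GBIF checklists here."""
    return None


def lookup_taxrefine(name: str) -> Optional[str]:
    """Stage 3 stub: a real version would query TaxRefine here."""
    return None


# Sources are tried in this order; the first one that answers wins.
SOURCES = [
    ("internal", lookup_internal),
    ("gbif", lookup_gbif),
    ("taxrefine", lookup_taxrefine),
]


@lru_cache(maxsize=None)
def resolve(name: str) -> tuple:
    """Return (resolved name, source). Unmatched names pass through unchanged.

    lru_cache memoises results, so a name that repeats across many rows
    triggers the (potentially slow) lookups only once.
    """
    for source, lookup in SOURCES:
        match = lookup(name)
        if match is not None:
            return match, source
    return name, "unmatched"


if __name__ == "__main__":
    for name in ["Oncifelis", "Oncifelis", "UNKNOWN ANTSHRIKE"]:
        print(name, "->", resolve(name))
    print(resolve.cache_info())
```

Keeping the sources in an ordered list means a different priority, for example one chosen per taxonomic class from a configuration file, would only require building `SOURCES` differently; the `resolve` loop itself would stay the same.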