Generating reproducible workflows for the publication of open and FAIR data

Peter Desmet
December 12, 2019

Talk at "Research Data Management & Data Stewardship: much more than a FAIRytale" in Brussels, Belgium - December 12, 2019. https://training.vib.be/research-data-management-data-stewardship-much-more-fairytale

Transcript

  1. 12 December 2019, Brussels. Peter Desmet & Lien Reyserhove. Generating repeatable workflows for the publication of open and FAIR data
  2. Hi!

  3. Research data management

  4. Open data publication

  5. Research software development

  6. None
  7. We want to study (invasive) alien species to inform/guide environmental policy
  8. Reporting on invasive alien species

  9. Alien species in Belgium: • What species? • Where are they? • How are they getting here? • What is their impact? • Future distributions? • Future impact?
  10. What species are alien in Belgium?

  11. Let’s check the Alien species checklist for Belgium

  12. We don’t have one! And certainly not one that is verified, open and FAIR
  13. We do have: a number of authoritative checklists with a more specialized scope
  14. Data for alien plants

  15. Data for alien plants

  16. Open & FAIR? Checklist | Open | Findable | Accessible | Interoperable | Reusable

  17. Data for alien molluscs

  18. Open & FAIR? Checklist | Open | Findable | Accessible | Interoperable | Reusable

  19. Checklist | Open | Findable | Accessible | Interoperable | Reusable. How to go from …
  20. Checklist | Open | Findable | Accessible | Interoperable | Reusable. … to open & FAIR data?
  21. Checklist | Open | Findable | Accessible | Interoperable | Reusable. … to open & FAIR data? Unified
  22. TrIAS data publication workflow

  23. Workflow: 1. Data management (tidy data)

  24. Authors can manage their own data

  25. Tidy data (Wickham 2014)

  26. Tidy data (Wickham 2014): Each row is an observation

  27. Tidy data (Wickham 2014): Each column is a variable

  28. Tidy data (Wickham 2014): Each table is an observational unit

  29. Set up a repository

  30. Template structure

  31. Upload raw data

  32. Workflow: 1. Data management (tidy data); 2. Standardization (interoperable)

  33. Darwin Core

  34. Darwin Core

  35. Darwin Core

  36. Reproducible data transformation

  37. Literate programming (R Markdown)

  38. Generate standardized data

  39. Repeatable

  40. Repeatable

  41. Collaborative

  42. Versioned

  43. Workflow: 1. Data management (tidy data); 2. Standardization (interoperable); 3. Documentation (understandable)
  44. Documenting with metadata

  45. Bringing it all together

  46. Bringing it all together

  47. Workflow: 1. Data management (tidy data); 2. Standardization (interoperable); 3. Documentation (understandable); 4. Publication (open)
  48. Publishing data

  49. Published data

  50. Published data

  51. Workflow: 1. Data management (tidy data); 2. Standardization (interoperable); 3. Documentation (understandable); 4. Publication (open); 5. Registration (FAIR)
  52. Global Biodiversity Information Facility

  53. Registering a dataset with GBIF

  54. Dataset on GBIF

  55. FAIR metadata

  56. FAIR data

  57. Alien molluscs

  58. Checklist | Open | Findable | Accessible | Interoperable | Reusable. FAIR datasets

  59. Going even further

  60. Mission: Imagine a future where dynamically, from year to year, we can track the progression of alien species (AS), identify emerging species, assess their current and future risk and timely inform policy in a seamless data-driven workflow. One that is built on open science and open data infrastructures. By using international biodiversity standards and facilities, we would ensure interoperability, repeatability and sustainability. This would make the process adaptable to future requirements in an evolving IAS policy landscape both locally and internationally.
  61. Checklist | Open | Findable | Accessible | Interoperable | Reusable. Creating a unified checklist. Unified
  62. Multiple checklists on GBIF

  63. Using GBIF as an infrastructure

  64. Repeatable process

  65. Documented process

  66. FAIR unified checklist

  67. We now have an Alien species checklist for Belgium

  68. Going even further

  69. Checklist-based indicators

  70. Open occurrence data

  71. Occurrence-based indicators

  72. Reproducible, open & FAIR

  73. trias-project.be

  74. Thank you! Peter Desmet & Lien Reyserhove (2019) Generating reproducible workflows for the publication of open and FAIR data. Presentation. http://bit.ly/trias-open-fair | @trias_project: Tracking Invasive Alien Species (TrIAS), trias-project.be | @oscibio: Open science lab for biodiversity, oscibio.inbo.be | @peterdesmet
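
Workflow steps 1 and 2 (slides 23 to 38: tidy data, then standardization to Darwin Core) can be illustrated with a minimal sketch. The talk names R Markdown for this step; the Python below is only a stand-in and not the TrIAS code. The raw column layout, the sample rows, the `tidy` and `to_darwin_core` helpers, and the `example:taxon:` identifier pattern are all hypothetical. The output field names (`taxonID`, `scientificName`, `kingdom`, `taxonRank`) are existing Darwin Core terms, but the mapping shown is an assumption.

```python
import csv
import io

# Hypothetical raw checklist in a "wide" layout: one column per region,
# so the "region" variable is hidden in the column headers.
# The presence values are invented for illustration only.
RAW_CSV = """species,rank,Flanders,Wallonia
Fallopia japonica,species,present,present
Impatiens glandulifera,species,present,absent
"""

REGION_COLUMNS = ("Flanders", "Wallonia")


def tidy(rows):
    """Reshape wide rows so that each row is one observation (taxon x region)
    and each column is one variable."""
    for row in rows:
        for region in REGION_COLUMNS:
            yield {
                "species": row["species"],
                "rank": row["rank"],
                "region": region,
                "status": row[region],
            }


def to_darwin_core(tidy_rows):
    """Map tidy rows to one taxon record per name, keyed by Darwin Core terms.

    The mapping and the identifier pattern are assumptions for this sketch.
    """
    taxa = {}
    for row in tidy_rows:
        name = row["species"]
        if name not in taxa:
            taxa[name] = {
                "taxonID": f"example:taxon:{len(taxa) + 1}",
                "scientificName": name,
                "kingdom": "Plantae",  # both sample taxa are plants
                "taxonRank": row["rank"],
            }
    return list(taxa.values())


raw_rows = list(csv.DictReader(io.StringIO(RAW_CSV)))
tidy_rows = list(tidy(raw_rows))
taxa = to_darwin_core(tidy_rows)
```

Keeping the reshaping and the mapping in a script, rather than editing the raw file by hand, is what makes the transformation repeatable when the source checklist is updated (slides 39 to 42).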