Challenges and needs of reproducible workflows of Open Weather and Climate Data

ECMWF, which also operates the two Copernicus services on Climate Change (C3S) and Atmosphere Monitoring (CAMS), offers a range of open environmental data sets on climate, air quality, fire and floods.

Through Copernicus, a wealth of data is being made freely and openly available, and a new range of users, not necessarily ‘expert’ users, are interested in exploiting the data.
This makes the reproducibility of workflows particularly important. A full, free and open data policy is a vital prerequisite for reproducible workflows. Reproducibility, however, has to be reflected in all aspects of the data processing chain. The biggest challenge is currently limited data ‘accessibility’, where ‘accessibility’ means more than just improving data access. Accessibility is strongly linked to reproducibility and requires improvements and developments along the entire data processing chain, including the development of example workflows and reproducible training materials, data standards and interoperability, and the development or improvement of suitable open-source software tools.

The presentation will go through each step of some example workflows for open meteorological and climate data and will discuss the reproducibility and ‘accessibility’ challenges, as well as the future needs that must be met to make open meteorological and climate data fully accessible and reproducible.


Julia Wagemann

October 14, 2019

Transcript

  1. Challenges and Needs of REPRODUCIBLE WORKFLOWS of Open Big Weather and Climate Data. Julia Wagemann, PhD candidate at University of Marburg, Visiting Scientist at ECMWF. @JuliaWagemann #repwork19. Reading, 14 Oct 2019
  2. Basic NEEDS for reproducible workflows: ❖ Open data

  3. Basic NEEDS for reproducible workflows: ❖ Open data ❖ Open-source software / FOSS4G
  4. Basic NEEDS for reproducible workflows: ❖ Open data ❖ Open-source software / FOSS4G
  5. Basic NEEDS for reproducible workflows: ❖ Open data ❖ Open-source software / FOSS4G
  6. Islands of Open Big Earth Data

  7. Islands of Open Big Earth Data: Meteorological / climate community, Earth Observation community
  8. Islands of Open Big Earth Data: Meteorological / climate community • Copernicus Climate Data Store • GRIB, NetCDF Earth Observation community • Google Earth Engine • GeoTiff, JPEG2000
  9. Principal components of a Big Earth Data workflow: Access, Processing, Visualization
  10. Reproducibility challenge - Example

  11. Reproducibility challenge - DATA ACCESSIBILITY (Access) • different data are accessible via different data access systems • it is still about downloading data • community-specific data formats (GRIB, NetCDF, GeoTiff) • data structure and complexity (analyses vs forecast, multiple dimensions)
  12. ERA5 surface pressure

  13. Reproducibility challenge - DATA ACCESSIBILITY - Question: If a process involves data download that might take several months... Do we call it reproducible?
  14. One great example of a reproducible workflow... Processing: cdsapi, xarray, cfgrib, xESMF, cartopy https://colab.research.google.com/drive/1wWHz_SMCHNuos5fxWRUJTcB6wqkTJQCR
  15. REPRODUCIBILITY does not happen by accident either. “Interoperability does not happen by accident.” (Cliff Kottman)
  16. Reproducibility is ‘going the extra mile’

  17. Additional NEEDS for reproducible workflows: ❖ Open data ❖ Open-source software / FOSS4G
  18. Additional NEEDS for reproducible workflows: ❖ Open data ❖ Open-source software / FOSS4G Reproducibility as personal code of conduct
  19. Additional NEEDS for reproducible workflows: ❖ Open data ❖ Open-source software / FOSS4G Prioritise Interoperability of data systems Reproducibility as personal code of conduct API
  20. “Problems cannot be solved by the same level of thinking that created them!” (Albert Einstein)
  21. Thank you! Questions? Julia Wagemann, PhD candidate at University of Marburg, Visiting Scientist at ECMWF. @JuliaWagemann #repwork19. Reading, 14 Oct 2019
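
The access step behind slides 12 to 14 could be sketched as follows. This is an illustration only, not the code from the linked notebook. It assumes the cdsapi package named on slide 14 and a Climate Data Store dataset called `reanalysis-era5-single-levels`; the request keys follow the form commonly used around the time of the talk and may differ in newer versions of the service. The helper names `build_request`, `request_fingerprint` and `download` are hypothetical. The idea is that a download becomes easier to reproduce when the exact request is kept as data and can be fingerprinted, so that a rerun can show it asked for the same fields.

```python
import hashlib
import json

# Assumed dataset name from the Copernicus Climate Data Store catalogue;
# check the catalogue entry before running a real request.
DATASET = "reanalysis-era5-single-levels"


def build_request(year, month, day, time="12:00"):
    """Return a request dictionary for one ERA5 surface pressure field.

    The key names are an assumption based on the 2019-era CDS API;
    newer versions of the service may expect different keys.
    """
    return {
        "product_type": "reanalysis",
        "variable": "surface_pressure",
        "year": f"{year:04d}",
        "month": f"{month:02d}",
        "day": f"{day:02d}",
        "time": time,
        "format": "netcdf",
    }


def request_fingerprint(dataset, request):
    """Hash the dataset name and request so reruns can be compared."""
    canonical = json.dumps({"dataset": dataset, "request": request}, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def download(dataset, request, target):
    """Fetch the data with cdsapi; needs the package and a valid CDS API key."""
    import cdsapi  # imported lazily so the request can be built offline

    client = cdsapi.Client()
    client.retrieve(dataset, request, target)
    return target


if __name__ == "__main__":
    request = build_request(2019, 10, 14)
    print(json.dumps(request, indent=2))
    print("fingerprint:", request_fingerprint(DATASET, request))
    # download(DATASET, request, "era5_sp.nc")  # uncomment with valid credentials
```

Storing the printed request and fingerprint next to the analysis code records what was asked for. It does not address the download time raised on slide 13, which depends on the data access system itself.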