

Challenges and needs of reproducible workflows of Open Weather and Climate Data

ECMWF, which also operates the two Copernicus services on Climate Change (C3S) and Atmosphere Monitoring (CAMS), offers a range of open environmental data sets on climate, air quality, fire and floods.

Through Copernicus, a wealth of data is being made available under a free and open policy, and a new range of users, not necessarily ‘expert’ users, is interested in exploiting the data.
This makes the reproducibility of workflows particularly important. A full, free and open data policy is a vital prerequisite for reproducible workflows. Reproducibility, however, has to be reflected in all aspects of the data processing chain. The biggest challenge is currently limited data ‘accessibility’, where ‘accessibility’ means more than just improving data access. Accessibility is strongly linked with reproducibility and requires improvements and developments along the entire data processing chain, including example workflows and reproducible training materials, data standards and interoperability, and the development or improvement of the right open-source software tools.

The presentation will go through each step of some example workflows for open meteorological and climate data and will discuss the reproducibility and ‘accessibility’ challenges, as well as what is still needed to make open meteorological and climate data fully accessible and reproducible.

Julia Wagemann

October 14, 2019

Transcript

  1. Challenges and Needs of REPRODUCIBLE WORKFLOWS of Open Big Weather and Climate Data
    Julia Wagemann, PhD candidate at University of Marburg, Visiting Scientist at ECMWF
    @JuliaWagemann #repwork19, Reading, 14 Oct 2019
  2. Islands of Open Big Earth Data
    Meteorological / climate community
    • Copernicus Climate Data Store
    • GRIB, NetCDF
    Earth Observation community
    • Google Earth Engine
    • GeoTiff, JPEG2000
  3. Reproducibility challenge - DATA ACCESSIBILITY
    • different data are accessible via different data access systems
    • it is still about downloading data
    • community-specific data formats (GRIB, NetCDF, GeoTiff)
    • data structure and complexity (analyses vs forecast, multiple dimensions)
    Access
  4. Reproducibility challenge - DATA ACCESSIBILITY - Question
    If a process involves data download that might take several months... Do we call it reproducible?
  5. One great example of a reproducible workflow...
    • Processing
    cdsapi, xarray, cfgrib, xESMF, cartopy
    https://colab.research.google.com/drive/1wWHz_SMCHNuos5fxWRUJTcB6wqkTJQCR
  6. Additional NEEDS for reproducible workflows:
    ❖ Open data
    ❖ Open-source software / FOSS4G
    Reproducibility as personal code of conduct
  7. Additional NEEDS for reproducible workflows:
    ❖ Open data
    ❖ Open-source software / FOSS4G
    Prioritise Interoperability of data systems
    Reproducibility as personal code of conduct
    API
  8. “Problems cannot be solved by the same level of thinking that created them!” (Albert Einstein)
  9. Thank you! Questions?
    Julia Wagemann, PhD candidate at University of Marburg, Visiting Scientist at ECMWF
    @JuliaWagemann #repwork19, Reading, 14 Oct 2019
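
Slide 5 names cdsapi and xarray as parts of a reproducible workflow but the transcript does not show any code. The following is a minimal sketch of how the access step might be made repeatable, not the notebook from the talk. The dataset name, the variable, the request keys and the helper names (`build_request`, `request_fingerprint`, `download`) are assumptions chosen for illustration; the request keys follow the style the Climate Data Store used around the time of the talk and may differ today. Only `download` needs the third-party `cdsapi` package and a CDS API key, and it is deliberately not called here.

```python
import hashlib
import json


def build_request(variable, year, month, day, time="12:00"):
    """Return a CDS-style request with every choice spelled out explicitly."""
    return {
        "product_type": "reanalysis",  # assumed product type
        "variable": variable,
        "year": str(year),
        "month": f"{month:02d}",
        "day": f"{day:02d}",
        "time": time,
        "format": "netcdf",  # assumed key name and value
    }


def request_fingerprint(dataset, request):
    """Hash the dataset name and request so a result can be traced to its inputs."""
    canonical = json.dumps({"dataset": dataset, "request": request}, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def download(dataset, request, target):
    """Fetch the data; requires the cdsapi package and a configured CDS API key."""
    import cdsapi  # imported lazily so the rest of the sketch runs without it

    client = cdsapi.Client()
    client.retrieve(dataset, request, target)
    return target


# Hypothetical example values, not taken from the slides.
DATASET = "reanalysis-era5-single-levels"
request = build_request("2m_temperature", 2019, 10, 14)
fingerprint = request_fingerprint(DATASET, request)
```

Storing the fingerprint next to the downloaded file is one simple way to record exactly which request produced it. Once a file has been retrieved with `download(DATASET, request, "era5_t2m.nc")`, it could be opened with `xarray.open_dataset("era5_t2m.nc")` for the processing step.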