Catalog-Driven, Reproducible Workflows for Ocean Science

Rich Signell
August 13, 2015

Presented as an ESIP IT&I "Rant and Rave" Webinar. A recording of this presentation is here: https://www.youtube.com/watch?v=05ax0lkQFrg


  1. Catalog-driven, Reproducible Workflows for Ocean Science

    Rich Signell, USGS, Woods Hole, MA, USA
    Filipe Fernandes, Centro Universidade Monte Serrat, Santos, Brazil
  2. 2015 Boston Light Swim, Aug 15, 7:00am

    Since 1907, 8 miles, no wet suit. How cold will the water be?
  3. NECOFS Massbay Forecast

  4. US Integrated Ocean Observing System (IOOS®)

    IOOS® Plan defines:
    • Global Component
    • Coastal Component
      – 17 Federal Agencies
      – 11 Regional Associations
    SECOORA Model Skill-Assessment Project: Deborah Hernandez and Vembu Subramanian
  5. IOOS Core Principles

    • Adopt open standards & practices
    • Avoid customer-specific stovepipes
    • Standardized access services implemented at data providers
    [Diagram labels: Customer, Web access service, Data Provider, Observations, Models]
  6. Ocean grids are often not regularly spaced!

    Stretched surface and terrain-following vertical coordinates
    Curvilinear orthogonal horizontal coordinates
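To illustrate what a terrain-following vertical coordinate means in practice, here is a minimal sketch based on the CF `ocean_sigma_coordinate` formula, z = eta + sigma * (depth + eta), where sigma runs from 0 at the surface to -1 at the sea floor. The function name and the sample depths are illustrative assumptions, not values from the slides or from any particular model.

```python
def sigma_to_z(sigma_levels, bottom_depth, eta=0.0):
    """Convert dimensionless sigma levels to heights z in meters (negative down).

    Uses the CF ocean_sigma_coordinate formula:
        z = eta + sigma * (bottom_depth + eta)
    sigma is 0 at the free surface and -1 at the sea floor;
    eta is the sea surface height above the reference datum.
    """
    return [eta + s * (bottom_depth + eta) for s in sigma_levels]


# Hypothetical values: the same five sigma levels over shallow and deep water.
sigma = [0.0, -0.25, -0.5, -0.75, -1.0]
shallow = sigma_to_z(sigma, bottom_depth=10.0)
deep = sigma_to_z(sigma, bottom_depth=100.0, eta=0.5)
print(shallow)
print(deep)
```

The same number of levels covers both water columns, so layer thickness changes from point to point. That is why such grids cannot be treated as a regular stack of fixed depths.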
  7. Unstructured (e.g. triangular) grid

  8. Time Series, Trajectories

    Meteorology and Wave Buoy in the Gulf of Maine. Image courtesy of NOAA.
    Ocean Glider. Photo by Dave Fratantoni, Woods Hole Oceanographic Institution.
  9. NetCDF Climate and Forecast (CF) Conventions provide a solution

    Groups using CF:
    • GO-ESSP: Global Organization for Earth System Science Portal
    • IOOS: Integrated Ocean Observing System
    • ESMF: Earth System Modeling Framework
    • OGC: Open Geospatial Consortium (GALEON: WCS profile)
    CF Convention Draft Spec for Unstructured Grid: http://bit.ly/ugrid_cf
  10. IOOS Recommended Web Services and Data Encodings

    Data Type | Web Service | Encoding
    In-situ data (buoys, piers, towed sensors) | OGC Sensor Observation Service (SOS) | XML or CSV
    Gridded data (model outputs, satellite) | OPeNDAP with Climate and Forecast Conventions | Binary DAP using Climate and Forecast (CF) conventions
    Images of data | OGC Web Map Service (WMS) | GeoTIFF, PNG etc., possibly with standardized styles
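For the gridded-data row, a small sketch of how an OPeNDAP (DAP2) client asks for only part of a variable: the request URL carries an index-based hyperslab of the form `var[start:stride:stop]`, one bracket per dimension. The endpoint, variable name and index ranges below are hypothetical, and the helper is an illustration rather than part of any library.

```python
def dap_subset_url(base_url, variable, slices, response="ascii"):
    """Build a DAP2 request URL that subsets one variable by index.

    slices holds one (start, stride, stop) tuple per dimension, written in
    the DAP2 hyperslab syntax var[start:stride:stop] (stop is inclusive).
    response is the DAP2 suffix, e.g. "ascii" for text or "dods" for binary.
    """
    hyperslab = "".join(
        f"[{start}:{stride}:{stop}]" for start, stride, stop in slices
    )
    return f"{base_url}.{response}?{variable}{hyperslab}"


# Hypothetical endpoint and variable: first time step, surface layer,
# and a small horizontal window of a 4-D temperature field.
url = dap_subset_url(
    "http://example.org/thredds/dodsC/forecast.nc",
    "temp",
    [(0, 1, 0), (0, 1, 0), (10, 1, 20), (30, 1, 40)],
)
print(url)
```

Libraries such as Iris or netCDF4-Python build requests like this behind the scenes, so in a notebook the slicing is usually written as ordinary array indexing.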
  11. OGC Sensor Observation Service (SOS)

    • Provides standard access to sensor data
      – GetCapabilities: provides the means to access SOS service metadata
      – DescribeSensor: retrieves detailed information about the sensors and processes generating those measurements
      – GetObservation: provides access to sensor observations and measurement data via a spatio-temporal query that can be filtered by phenomena
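As a rough sketch of a GetObservation call, the snippet below assembles a key-value-pair request URL using only the standard library. The endpoint, offering name and time window are hypothetical. The parameter names follow the KVP form accepted by many SOS 1.0.0 servers, but whether a given server supports GET requests or CSV output is an assumption to check against its GetCapabilities response.

```python
from urllib.parse import urlencode


def sos_get_observation_url(endpoint, offering, observed_property,
                            start, end, response_format="text/csv"):
    """Build a GetObservation request URL in key-value-pair form."""
    params = {
        "service": "SOS",
        "version": "1.0.0",
        "request": "GetObservation",
        "offering": offering,                   # which station or network
        "observedProperty": observed_property,  # the phenomenon filter
        "eventTime": f"{start}/{end}",          # time window as start/end
        "responseFormat": response_format,
    }
    return f"{endpoint}?{urlencode(params)}"


# Hypothetical endpoint and offering, for illustration only.
url = sos_get_observation_url(
    "http://example.org/sos",
    offering="station-44013",
    observed_property="sea_water_temperature",
    start="2015-08-10T00:00:00Z",
    end="2015-08-15T00:00:00Z",
)
print(url)
```

In the workflow described in this deck the same request would normally be made through OWSLib or PyOOS instead of by hand.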
  12. IOOS Data Infrastructure Diagram

    [Diagram labels]
    Nonstandard Model Output Data Files: ROMS, ADCIRC, HYCOM, SELFE, NCOM, FVCOM
    Nonstandard Data Files: Observed data (buoy, gauge, ADCP, glider)
    NcML
    Standardized (CF-1.6, UGRID-0.9) Virtual Datasets
    Common Data Model: Grid, Ugrid, TimeSeries, Profile, Trajectory, TimeSeriesProfile
    THREDDS Data Server, ERDDAP, NetCDF-Java
    Web Services: OPeNDAP+CF, WCS, NetCDF Subset, WMS, ncISO, SOS
    Catalog Services: Geoportal Server, GeoNetwork, GI-CAT, CKAN-pyCSW
    Clients: Matlab, Panoply, IDV, ArcGIS, NetCDF4-Python, Python, Godiva2, NetCDF-Java Library or Broker
  13. Metadata harvest for Catalog Search

  14. None

  15. Iris: Data Access using OPeNDAP+CF:

  16. OWSLib for SOS and CSW

  17. Final Result

  18. None
  19. None
  20. rsignell-usgs | ocefpaf & github

  21. 144 python packages on IOOS channel!

  22. None
  23. Summary

    • Standards, web services and catalogs allow us to serve data in a unified way
    • Python gives us a free scientific access, analysis and visualization environment
    • IPython/Jupyter notebooks give us documented workflows and a browser interface
    • Anaconda and anaconda.org let anyone easily reproduce our workflows
    • Result: more efficient and effective access to ocean data, and anyone can assess ocean model skill
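To make "assess ocean model skill" concrete, here is a minimal sketch of two common comparison statistics, mean bias and root-mean-square error, computed between a modeled and an observed series that share the same time stamps. The choice of metrics is an assumption (the slides do not say which ones the project uses) and the temperatures are made-up numbers.

```python
import math


def bias(model, obs):
    """Mean of (model - observation); positive means the model runs high."""
    return sum(m - o for m, o in zip(model, obs)) / len(obs)


def rmse(model, obs):
    """Root-mean-square error between paired model and observed values."""
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / len(obs))


# Made-up water temperatures in degrees C, already matched in time.
observed = [17.0, 17.5, 18.0, 18.5]
modeled = [18.0, 18.0, 19.0, 18.5]

print("bias:", bias(modeled, observed))
print("rmse:", rmse(modeled, observed))
```

In a real workflow the two series would first be interpolated to common times, for example with Pandas, before any statistic is computed.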
  24. Client Software Stack

    • Environment
      – IPython Notebooks, Anaconda, Binstar, Wakari, Github
    • Search
      – CSW using OWSLib
    • Access
      – OPeNDAP+CF using Iris and Pyugrid
      – Sensor Observation Service (SOS) using OWSLib and PyOOS
    • Analysis and Plotting
      – Scipy, Pandas, Matplotlib, Cartopy, Vincent, Folium
  25. Using Github issues for everything

  26. Using Github to Capture Successes & Lessons Learned

  27. OWSLib CSW

  28. OGC Catalog Services for the Web (CSW)

    • Provides standardized services for search
      – GetCapabilities: returns the list of queryables
      – GetRecords: allows geospatial, temporal, keyword and free text search (and other queryables)
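A GetRecords search that combines free text with a bounding box could be posted to a CSW 2.0.2 endpoint as XML along the following lines. This is a sketch built with the standard library only. The search term and bounding box are hypothetical, the helper names are illustrative, and the property names (`csw:AnyText`, `ows:BoundingBox`) and coordinate axis order are assumptions that should be checked against the queryables a given server reports.

```python
import xml.etree.ElementTree as ET

NS = {
    "csw": "http://www.opengis.net/cat/csw/2.0.2",
    "ogc": "http://www.opengis.net/ogc",
    "gml": "http://www.opengis.net/gml",
}
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def q(prefix, tag):
    """Return a namespace-qualified tag name for ElementTree."""
    return f"{{{NS[prefix]}}}{tag}"


def build_getrecords(text, bbox, max_records=10):
    """Build a GetRecords body with a free-text AND bounding-box filter.

    bbox is (min_x, min_y, max_x, max_y); servers differ on axis order.
    """
    root = ET.Element(q("csw", "GetRecords"), {
        "service": "CSW", "version": "2.0.2",
        "resultType": "results", "maxRecords": str(max_records),
    })
    query = ET.SubElement(root, q("csw", "Query"), {"typeNames": "csw:Record"})
    ET.SubElement(query, q("csw", "ElementSetName")).text = "full"
    constraint = ET.SubElement(query, q("csw", "Constraint"), {"version": "1.1.0"})
    filt = ET.SubElement(constraint, q("ogc", "Filter"))
    both = ET.SubElement(filt, q("ogc", "And"))

    # Free-text match anywhere in the record.
    like = ET.SubElement(both, q("ogc", "PropertyIsLike"),
                         {"wildCard": "*", "singleChar": "?", "escapeChar": "\\"})
    ET.SubElement(like, q("ogc", "PropertyName")).text = "csw:AnyText"
    ET.SubElement(like, q("ogc", "Literal")).text = f"*{text}*"

    # Geospatial match against the record's bounding box.
    box = ET.SubElement(both, q("ogc", "BBOX"))
    ET.SubElement(box, q("ogc", "PropertyName")).text = "ows:BoundingBox"
    env = ET.SubElement(box, q("gml", "Envelope"))
    ET.SubElement(env, q("gml", "lowerCorner")).text = f"{bbox[0]} {bbox[1]}"
    ET.SubElement(env, q("gml", "upperCorner")).text = f"{bbox[2]} {bbox[3]}"
    return ET.tostring(root, encoding="unicode")


# Hypothetical search: temperature records in a box around Massachusetts Bay.
xml_body = build_getrecords("sea_water_temperature", (-71.2, 42.0, -70.5, 42.6))
print(xml_body)
```

OWSLib generates an equivalent request from Python filter objects, which is the route this deck's notebooks take, so hand-built XML like this is mainly useful for seeing what goes over the wire.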
  29. CSW Request