Catalog-driven, Reproducible Workflows for Ocean Science

Rich Signell
January 14, 2016

Presentation at the Hazards-SEES Kick-off meeting, MIT



  1. Catalog-driven, Reproducible Workflows for Ocean Science

    Rich Signell, USGS, Woods Hole, MA, USA
    Filipe Fernandes, SECOORA, Salvador, Brazil
    Kyle Wilcox, Axiom Data Science, Wickford, RI
    Hazards-SEES Kick-off Meeting, MIT, 2016-01-14
  2. The 4th Network Layer: Data

    • “We need an end-to-end, layer-by-layer, designed information technology … that are composed of no more than a stack of protocols”
    • “We need open standards… and above all, we need to teach scientists to work in this new layer of data”
    From the essay “I have seen the Paradigm Shift, and It Is Us”, by John Wilbanks, in the book “The Fourth Paradigm”
    Layer stack: Data, Web, TCP/IP, Ethernet
  3. US Integrated Ocean Observing System (IOOS®)

    IOOS® Plan defines:
    • Global Component
    • Coastal Component
      – 17 Federal Agencies
      – 11 Regional Associations
  4. IOOS Core Principles

    • Adopt open standards & practices
    • Avoid customer-specific stovepipes
    • Standardized access services implemented at data providers
    Diagram: Customer, Web access service, Data Provider (Observations, Models)
  5. Ocean grids are often not regularly spaced!

    Stretched surface and terrain-following vertical coordinates
    Curvilinear orthogonal horizontal coordinates
  6. Unstructured (e.g. triangular) grid

  7. NetCDF Climate and Forecast (CF) Conventions + UGRID + SGRID

    Groups using CF:
    • GO-ESSP: Global Organization for Earth System Science Portal
    • IOOS: Integrated Ocean Observing System
    • ESMF: Earth System Modeling Framework
    • OGC: Open Geospatial Consortium (GALEON: WCS profile)
  8. Time Series, Trajectories

    Meteorology and Wave Buoy in the Gulf of Maine. Image courtesy of NOAA.
    Ocean Glider. Photo by Dave Fratantoni, Woods Hole Oceanographic Institution
  9. IOOS Data Infrastructure Diagram

    Nonstandard Model Output Data Files: ROMS, ADCIRC, HYCOM, SELFE, NCOM, FVCOM
    Nonstandard Data Files: Observed data (buoy, gauge, ADCP, glider)
    NcML; Common Data Model (Grid, Ugrid, TimeSeries, Profile, Trajectory, TimeSeriesProfile)
    Standardized (CF-1.6, UGRID-0.9) Virtual Datasets
    Servers: THREDDS Data Server, ERDDAP
    Web Services: OPeNDAP+CF, WCS, NetCDF Subset, WMS, ncISO, SOS
    Catalog Services: Geoportal Server, GeoNetwork, GI-CAT, CKAN-pyCSW
    Clients: Matlab, Panoply, IDV, ArcGIS, Python, Web Portals
    Library or Broker: NetCDF-Java, NetCDF4-Python
  10. Catalog Search

  11. Interoperable access in Matlab (nctoolbox)

  12. Interoperable Access in Python (Iris)

  13. IOOS System Test

  14. 2015 Boston Light Swim, Aug 15, 7:00am

    Since 1907, 8 miles, no wet suit
    How cold will the water be?
  15. NECOFS Massbay Forecast

  16. Reproducible IPython/Jupyter Notebook

  17. None
  18. None
  19. None
  20. Final Result

  21. None
  22. None
  23. Reproducible in Minutes for Free

  24. 163 Python packages on IOOS channel!

  25. WMS-driven Model Viewing Portal

  26. Catalog Search

    Catalog services can be federated via OGC CSW (Catalog Service for the Web)
  27. WMS services in TerriaJS

  28. Benefits of Standards-Based, Catalog-Driven, Reproducible Workflows

    • Find the real problems
      – Easy problems that can be fixed in minutes to days
      – Harder problems to guide future work
    • Fixes for specific workflows benefit everyone
    • Build success stories
    • Create reproducible workflows that others can learn from, expand on, or transform
    • Standardized workflows help develop the 4th network layer for data
  29. None
  30. Pablo Otero, Lagrangian Map Tracking

  31. Questions

    1. What do you see as being the most important items for a successful Hazards project?
    2. What are you most looking forward to achieving personally and as a team?
    3. What are the main things that you need from other team members?
    4. What are the biggest challenges for your role in the Hazards project?
    • Work with team members to achieve standardized data and services and to use community tools. Standards for Lagrangian scientific feature types, CZML, TerriaJS
  32. [ rsignell-usgs | ocefpaf ] & github
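
Slide 26 notes that catalog services can be federated via OGC CSW. As a minimal sketch of what such a catalog search request might look like, the Python below builds a CSW 2.0.2 GetRecords URL using only the standard library. The endpoint `https://catalog.example.org/csw`, the function name `build_getrecords_url`, and the search term `sea_water_temperature` are illustrative assumptions and do not come from the slides, which do not show the search code itself.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical endpoint, for illustration only; substitute a real CSW service.
CSW_ENDPOINT = "https://catalog.example.org/csw"


def build_getrecords_url(endpoint, search_text, max_records=10):
    """Return a CSW 2.0.2 GetRecords (key-value pair) URL for a free-text search."""
    params = {
        "service": "CSW",
        "version": "2.0.2",
        "request": "GetRecords",
        "typeNames": "csw:Record",
        "elementSetName": "full",
        "resultType": "results",
        "constraintLanguage": "CQL_TEXT",
        "constraint_language_version": "1.1.0",
        # Free-text match against all indexed metadata fields.
        "constraint": f"csw:AnyText LIKE '%{search_text}%'",
        "maxRecords": str(max_records),
    }
    # urlencode percent-escapes the CQL constraint so it is safe in a URL.
    return f"{endpoint}?{urlencode(params)}"


url = build_getrecords_url(CSW_ENDPOINT, "sea_water_temperature")

# Decode the query string again to show what the server would receive.
query = parse_qs(urlparse(url).query)
print(query["request"][0])
print(query["constraint"][0])
```

In practice a client library such as OWSLib would usually handle the request and the parsing of the response. The sketch only illustrates that a catalog query is a plain, standardized HTTP request, which is what lets the same workflow run against any compliant catalog.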