Slide 1

Slide 1 text

Catalog-driven workflows using CSW
Rich Signell, USGS, Woods Hole, MA, USA
Filipe Fernandes, SECOORA, Brazil
Kyle Wilcox, Axiom Data Science, Wickford, RI, USA
ESIP Winter Meeting, Washington, DC, 2016-01-08

Slide 2

Slide 2 text

The 4th Network Layer: Data
• “We need an end-to-end, layer-by-layer, designed information technology … that are composed of no more than a stack of protocols”
• “We need open standards… and above all, we need to teach scientists to work in this new layer of data”
From the essay “I have seen the Paradigm Shift, and It Is Us”, by John Wilbanks, in the book “The Fourth Paradigm”
Layer stack (figure): Data, Web, TCP/IP, Ethernet

Slide 3

Slide 3 text

US Integrated Ocean Observing System (IOOS®)
• Global Component
• Coastal Component
– 17 Federal Agencies
– 11 Regional Associations

Slide 4

Slide 4 text

IOOS Core Principles
• Adopt open standards & practices
• Avoid customer-specific stovepipes
• Standardized access services implemented at data providers
Diagram labels: Customer, Web access service, Data Provider (Observations, Models)

Slide 5

Slide 5 text

Numerical Model Output

Slide 6

Slide 6 text

Time Series, Trajectories
Meteorology and Wave Buoy in the Gulf of Maine. Image courtesy of NOAA.
Ocean Glider. Photo by Dave Fratantoni, Woods Hole Oceanographic Institution.

Slide 7

Slide 7 text

IOOS Data Infrastructure Diagram
Diagram labels:
• Nonstandard model output data files: ROMS, ADCIRC, HYCOM, SELFE, NCOM, FVCOM
• Observed data (buoy, gauge, ADCP, glider); nonstandard data files
• NcML; Common Data Model; standardized (CF-1.6, SGRID-0.1, UGRID-0.9) virtual datasets
• Data types: Grid, TimeSeries, Profile, Trajectory, TimeSeriesProfile, Sgrid, Ugrid, Rectilinear
• Servers and libraries: THREDDS Data Server, ERDDAP, NetCDF-Java Library or Broker
• Web services: OPeNDAP, NetCDF Subset, WMS, WCS, SOS, ncISO
• Catalog services: Geoportal Server, GeoNetwork, CKAN, pycsw
• Clients: Matlab, Panoply, IDV, ArcGIS, NetCDF4-Python, Python, EDC, NetCDF-Java, web portals

Slide 8

Slide 8 text

Catalog Search

Slide 9

Slide 9 text

Interoperable Access in Python (Iris)

Slide 10

Slide 10 text

IOOS System Test

Slide 11

Slide 11 text

2015 Boston Light Swim
• 2015 Aug 15, 7:00 am start
• 8 mile swim
• No wet suit
How cold will the water be?

Slide 12

Slide 12 text

NECOFS Massbay Forecast

Slide 13

Slide 13 text

Reproducible Jupyter Notebook
Go to https://github.com/ocefpaf/boston_light_swim and click “launch binder” to run the notebook in the cloud

Slide 17

Slide 17 text

Final Result

Slide 20

Slide 20 text

pycsw

Slide 21

Slide 21 text

Workflow for the USGS CMG Portal

Slide 22

Slide 22 text

Workflow (3/3)
Axiom Data Science
– Runs a CSW search (in a cron job) on the modeling groups’ pycsw services, filtering on datasets that contain a project called “CMG_Portal”
– Datasets that have valid WMS services are added to the portal
See for details of the workflow
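The two steps above can be sketched in Python using only the standard library. This is a minimal illustration under stated assumptions, not the actual Axiom Data Science cron job: the function names `build_getrecords` and `select_portal_records`, the choice of a `csw:AnyText` wildcard match, and the record layout (dicts with `title`, `keywords`, and `references` keys) are all hypothetical. The sketch only checks that a record advertises a WMS reference; confirming that the WMS service is actually valid (for example, by requesting GetCapabilities) is left out.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by the OGC CSW 2.0.2 and Filter Encoding 1.1 schemas.
CSW_NS = "http://www.opengis.net/cat/csw/2.0.2"
OGC_NS = "http://www.opengis.net/ogc"
ET.register_namespace("csw", CSW_NS)
ET.register_namespace("ogc", OGC_NS)


def build_getrecords(project):
    """Build a CSW GetRecords POST body that matches `project` in any text field.

    Assumption: the project name is searched with a wildcard PropertyIsLike
    filter on csw:AnyText; a real deployment may filter on another property.
    """
    root = ET.Element(
        f"{{{CSW_NS}}}GetRecords",
        {"service": "CSW", "version": "2.0.2", "resultType": "results"},
    )
    query = ET.SubElement(root, f"{{{CSW_NS}}}Query", {"typeNames": "csw:Record"})
    ET.SubElement(query, f"{{{CSW_NS}}}ElementSetName").text = "full"
    constraint = ET.SubElement(query, f"{{{CSW_NS}}}Constraint", {"version": "1.1.0"})
    flt = ET.SubElement(constraint, f"{{{OGC_NS}}}Filter")
    like = ET.SubElement(
        flt,
        f"{{{OGC_NS}}}PropertyIsLike",
        {"wildCard": "*", "singleChar": "?", "escapeChar": "\\"},
    )
    ET.SubElement(like, f"{{{OGC_NS}}}PropertyName").text = "csw:AnyText"
    ET.SubElement(like, f"{{{OGC_NS}}}Literal").text = f"*{project}*"
    return ET.tostring(root, encoding="unicode")


def select_portal_records(records, project="CMG_Portal"):
    """Return (title, WMS URL) pairs for records tagged with `project`.

    `records` is a hypothetical list of dicts standing in for parsed CSW
    results; only the presence of a WMS reference is checked here.
    """
    selected = []
    for rec in records:
        wms_urls = [
            ref["url"]
            for ref in rec.get("references", [])
            if "wms" in ref.get("scheme", "").lower()
        ]
        if project in rec.get("keywords", []) and wms_urls:
            selected.append((rec["title"], wms_urls[0]))
    return selected


# Made-up sample records (example.com URLs are placeholders).
SAMPLE_RECORDS = [
    {"title": "Model run A", "keywords": ["CMG_Portal"],
     "references": [{"scheme": "OGC:WMS", "url": "http://example.com/a/wms"}]},
    {"title": "Model run B", "keywords": ["CMG_Portal"],
     "references": [{"scheme": "OPeNDAP:OPeNDAP", "url": "http://example.com/b/dodsC"}]},
    {"title": "Model run C", "keywords": ["other"],
     "references": [{"scheme": "OGC:WMS", "url": "http://example.com/c/wms"}]},
]

print(build_getrecords("CMG_Portal"))
print(select_portal_records(SAMPLE_RECORDS))
```

In this sketch only "Model run A" is selected, because it is the only sample record that both carries the project tag and lists a WMS endpoint. Running such a script on a schedule is what lets the portal pick up new datasets without manual registration.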

Slide 24

Slide 24 text

WMS-driven Model Viewing Portal

Slide 26

Slide 26 text

Interoperable access in Matlab (nctoolbox)

Slide 29

Slide 29 text

Catalog-driven dynamic portals

Slide 31

Slide 31 text

Benefits of catalog-driven applications
• Dynamically adapt to new or changing data
• Find the machine-to-machine issues
– Easy problems that can be fixed in minutes to days
– Harder problems to guide future work
• Fixes for your workflow benefit everyone
• Build success stories
• Create reproducible workflows that others can learn from, expand on, or transform
• Standardized workflows help develop the 4th network layer for data