"NOAA One-Stop", Ken Casey, NCEI

The OneStop Project is designed to improve NOAA's data discovery and access framework. Focusing on all layers of the framework and not just the user interface, OneStop is addressing data format and metadata best practices, ensuring more data are available through modern web services, working to improve the relevance of dataset searches, and improving both collection-level metadata management and granule-level metadata systems to accommodate the wide variety and vast scale of NOAA's data.

ESIP Federation

July 13, 2016

Transcript

  1. NATIONAL OCEANIC AND ATMOSPHERIC ADMINISTRATION
    The NOAA OneStop Data Discovery and Access Framework Project
    Kenneth S. Casey, PhD
    13 July 2016
    ESIP Tech Deep Dive
  2. Motivation and Scope
    In response to the President’s Open Government Initiative and related policies, NOAA has committed to providing improved public access to all of its environmental information, to enable research and commercial innovation through ease of data discovery and use.
    ▪ OneStop supports NOAA's efforts by leveraging existing access technologies and infusing specific innovations to provide improved discovery, access, and visualization services for NOAA’s data.
    ▪ OneStop is viewed by NESDIS as a pathfinder effort with an initial focus on selected high-priority datasets from NESDIS and other program data meeting OneStop standards, but eventually scalable across NOAA’s data.
    ▪ OneStop is implementing the USGEO Common Framework for Earth Observation Data and leveraging/supporting the NOAA Big Data Project (BDP) and Big Earth Data Initiative (BEDI).
  3. Architected for Success: Design, Architecture, and Storage ConOps
  4. OneStop Data Framework: 30,000 ft
    [Diagram. Labels: Data Storage Services; Catalog Services; Showcase User Interface; Data access, subset, visualization, and granule services; Disk Storage; Metadata; Other User Interfaces; Other Metadata Systems; Other Data Access Systems; BDP Cloud; CLASS Tape. Regions marked "Inside OneStop" and "Outside OneStop".]
  5. Design and Architecture: 10,000 ft
    • Build on foundation of existing, mature data standards and web services
    • Emphasize not just interface, but the supporting data infrastructure
    • User-centered, design focused
    Design Principles
    ✅ Use existing enterprise capabilities when possible
    ✅ Rely on loose coupling of reusable system components
    ✅ Use standards at interfaces
    ✅ Use Open Source
    DIP = Dissemination Information Package
    [Architecture diagram. Components: Showcase User Interface; Account Management; Metadata Repository; Metadata Editor; WAF; Metadata Evaluation; Authentication; Authorization; Audit; Search Engine; ETL Tools; Data Ranking; Discovery Service; Disk-Based Storage (DIP); Tape-Based Storage (DIP); Cloud-Based Storage (DIP); Hyrax; TDS; FTP; HTTPS; ERDDAP; ArcGIS; LAS; WMS Proxy. Service groups: Security Services; User Interface Services; Storage Services; Data Access Services; Metadata Management Services; Catalog Services.]
  6. Storage Services Unified for Users
    OneStop Discovery, Metadata, Stewardship, and Access Services
    • OneStop/NCEI disk storage
    • CLASS tape-based storage
    • BDP cloud storage
    Unified access for the user regardless of storage medium
    Any and all services for a given dataset provided to the user
    Success Goal: 66% of users tested prefer new interface over old* (metric to be vetted by professional external review team)
  7. Storage Concept of Operations
    Reflects decision to place storage within NCEI 5009 system boundary. Agreement between NCEI, OSGS, and OSPO to consider OneStop storage as the next step toward enterprise Storage Infrastructure Service (SIS) and as a step toward a key Mission Science Network (MSN) capability.
    [Diagram label: Mission Science Network (Future)]
  8. OneStop Featured Data Groups and OneStop “Readiness”
  9. “OneStop Ready”
    Readiness Metric | Requirement
    ISO Compliant Collection-level Metadata | Every collection level record in the data group has an ISO compliant metadata record.
    ISO Completeness Collection-level Rubric V2 | Every collection level record in the data group shall have a completeness score of at least 90%.
    OneStop Collection-level Readiness Rubric | Browse graphic, GCMD science keywords...
    Standardized metadata exists for each granule or is embedded within each granule | ACDD and CF conventions for embedded metadata
    Granule metadata contains OneStop-required content | See OneStop granule metadata specification
    Machine Independent Data File Format | Each granule is formatted in a machine readable format, such as netCDF
    Each granule is accessible via a URL | Minimally, direct download https/ftps but prefer interoperable services (USGEO Common Framework)
    Data Stewardship Maturity Matrix (DSMM) | Assessment is complete and documented in collection-level metadata record
    Product Maturity Matrix (PMM) | Optional. If PMM exists, then document results in collection level metadata
    % readiness for a data group assessed in each of Collection Metadata, Granule Metadata, Data Formats, Data Access, DSMM. Data group as a whole considered “OneStop Ready” when it reaches 95% overall or higher.
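The readiness rule on this slide can be illustrated with a short Python sketch. The slide gives the five assessed categories and the 95% threshold but does not say how the category scores are combined, so the unweighted mean used here is an assumption, as are the function names, the dictionary keys, and the convention that "Y" counts as 100 and "N" as 0. Several rows of the July status table in this deck (for example 84%, Y, Y, Y, Y shown as 97% overall) appear consistent with a plain average, but that is an observation, not a documented formula.

```python
from statistics import mean

# From the slide: a data group is "OneStop Ready" at 95% overall or higher.
READY_THRESHOLD = 95.0

# The five assessed categories named on the slide (key names are hypothetical).
CATEGORIES = (
    "collection_metadata",
    "granule_metadata",
    "data_formats",
    "data_access",
    "dsmm",
)


def overall_readiness(scores):
    """Return the unweighted mean of the assessed category scores (0-100).

    ASSUMPTION: the deck does not state the combination rule; a plain
    average is used here. Categories that are missing or None (not
    assessed / not applicable) are skipped.
    """
    assessed = [scores[c] for c in CATEGORIES if scores.get(c) is not None]
    if not assessed:
        raise ValueError("no assessed categories")
    return mean(assessed)


def is_onestop_ready(scores):
    """Apply the 95% threshold from the slide to the overall score."""
    return overall_readiness(scores) >= READY_THRESHOLD


# Example using hypothetical inputs: Y is counted as 100 (an assumption).
example = {
    "collection_metadata": 84,
    "granule_metadata": 100,
    "data_formats": 100,
    "data_access": 100,
    "dsmm": 100,
}
print(overall_readiness(example), is_onestop_ready(example))
```

Skipping unassessed categories rather than scoring them as zero is a design choice made for this sketch; a stricter reading would count them against the group.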
  10. OneStop Featured Data Groupings
    Data Group | Subject Matter Expert | Number of Collections
    Digital Elevation Models | Barry Eakins, Kelly Carignan | 137
    CO-OPS NWLON PORTS | Tom Ryan | 1
    World Ocean Atlas 2013 | Tim Boyer | 8
    Group for High Resolution SST | Korak Saha | 81
    NDBC C-MAN | Tom Ryan | 1
    NOAA Climate Data Records | Jesse Glance, Tom Zhao | 32
    OCS Hydro | Jason Baillio | 17,763
    COAPS SAMOS | Chris Paver | 1
    NEXRAD Level 2 and 3 | Steve Ansari | 2
    Reformatted Legacy GOES GVAR data | Ken Knapp | 2 to 10
    S-NPP/JPSS | Axel Graumann | 75
    Water Column Sonar Data | Chuck Anderson, Carrie Wall-Bell | 368
    ESSA images | Jason Cooper | 1
  11. “OneStop Ready” Status - April
    Columns: Data Group; Percent Ready; Collection Metadata; Granule Metadata; Data Formats; Data Access; DSMM; PMM (optional)
    Digital Elevation Models: P P P N
    CO-OPS NWLON PORTS: Y N Y Y N
    World Ocean Atlas 2013: N N Y Y N
    Group for High Resolution SST: Y Y Y Y N
    NDBC C-MAN: Y N Y Y N
    NOAA Climate Data Records: P N Y P N Y
    OCS Hydro: P N P Y N
    COAPS SAMOS: P N P Y N
    NEXRAD Level 2 and 3: P N N P N
    Reformatted Legacy GOES GVAR data: N N N N N
    S-NPP/JPSS: P N P N N
    Water Column Sonar Data: P P N N N
    ESSA images: N N N N N
    Y = yes, ready; P = partially ready; N = not ready; grey = not yet assessed or not applicable
  12. “OneStop Ready” Status - July
    Columns: Data Group; Percent Ready; Collection Metadata; Granule Metadata; Data Formats; Data Access; DSMM; PMM (optional)
    Digital Elevation Models: 100% Y N/A Y Y Y
    CO-OPS NWLON PORTS: 97% 84% Y Y Y Y
    World Ocean Atlas 2013: 77% 88% Y Y Y N
    Group for High Resolution SST: 80% 95% 95% Y Y 10%
    NDBC C-MAN: 57% 84% N Y Y N
    NOAA Climate Data Records: 70% 90% P Y 75% Y Y
    OCS Hydro: P N P Y N
    COAPS SAMOS: P N P Y N
    NEXRAD Level 2 and 3: 73% 90% P Y P 75%
    Reformatted Legacy GOES GVAR data: N N N N N
    S-NPP/JPSS: 35% 50% N P N 75%
    Water Column Sonar Data: P P N N N
    ESSA images: N N N N N
    Y = yes, ready; P = partially ready; N = not ready; grey = not yet assessed or not applicable
  13. User-Centered Development: Progress on User Interface
  14. Wire Frames: Intro Page
    Drawing from https://standards.usa.gov/ and other sources
    NOAA OneStop https://www.ncei.noaa.gov/onestop
  15. Wire Frames: Simple Search
    NOAA OneStop https://www.ncei.noaa.gov/onestop
  16. Wire Frames: Grouped Results
    NOAA OneStop https://www.ncei.noaa.gov/onestop
  17. Wire Frames: Icon Grid Results
    NOAA OneStop https://www.ncei.noaa.gov/onestop
  18. Wire Frames: Map Results
    NOAA OneStop https://www.ncei.noaa.gov/onestop
  19. User Interface: Under the Hood
    Overview of System Components: DEM
  20. User Interface: Under the Hood
    Data Loading:
    • Pulling data from:
      ◦ MD Geoportal: GHRSST
      ◦ CO WAF: DEM
    • Write metadata to local Elasticsearch via OneStop API
    • Write metadata to Geoportal for availability via CSW & OpenSearch
    [Diagram label: DEM]
  21. User Interface: Under the Hood
    Server API:
    • Translates JSON search request from UI into Elasticsearch query
    • Now supports temporal and spatial searching in addition to simple text search
    • Returns top 10 results (pagination features to be developed later)
    [Diagram label: DEM]
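As an illustration of the translation step this slide describes, the following Python sketch turns a UI-style JSON search request into an Elasticsearch query body. It is not the OneStop implementation: the request keys (`text`, `start`, `end`, `bbox`) and the index field names (`title`, `description`, `temporal_start`, `temporal_end`, `spatial_bounding`) are hypothetical. Only the query clauses themselves (`bool`, `multi_match`, `range`, `geo_shape` with an `envelope` shape, `match_all`, and `size`) are standard Elasticsearch query DSL.

```python
from typing import Any, Dict, List


def build_es_query(request: Dict[str, Any], size: int = 10) -> Dict[str, Any]:
    """Translate a hypothetical UI search request into an Elasticsearch body.

    All request keys and index field names are assumptions made for this
    sketch; the slide does not document the real request or index schema.
    """
    must: List[Dict[str, Any]] = []
    filters: List[Dict[str, Any]] = []

    # Simple text search across (hypothetical) title and description fields.
    text = request.get("text")
    if text:
        must.append(
            {"multi_match": {"query": text, "fields": ["title", "description"]}}
        )

    # Temporal search: keep records whose time range overlaps the request.
    start, end = request.get("start"), request.get("end")
    if start:
        filters.append({"range": {"temporal_end": {"gte": start}}})
    if end:
        filters.append({"range": {"temporal_start": {"lte": end}}})

    # Spatial search: bbox is [west, south, east, north]; an Elasticsearch
    # envelope is given as [[min_lon, max_lat], [max_lon, min_lat]].
    bbox = request.get("bbox")
    if bbox:
        west, south, east, north = bbox
        filters.append(
            {
                "geo_shape": {
                    "spatial_bounding": {
                        "shape": {
                            "type": "envelope",
                            "coordinates": [[west, north], [east, south]],
                        },
                        "relation": "intersects",
                    }
                }
            }
        )

    # With no text criterion, match everything and rely on the filters.
    if not must:
        must.append({"match_all": {}})

    # size=10 mirrors "returns top 10 results" on the slide.
    return {"size": size, "query": {"bool": {"must": must, "filter": filters}}}
```

Putting the temporal and spatial criteria in the `filter` clause keeps them out of relevance scoring, so ranking depends on the text match alone; that is a design choice for this sketch rather than something stated in the talk.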
  22. User Interface: Under the Hood
    UI Features:
    • Text search against DEM and GHRSST metadata
    • Clickable flipcard results grid
    • NOAA look/feel
    • Header/Footer
    In progress:
    • Spatial & temporal search
    [Diagram label: DEM]
  23. User Interface Demo: Next Week!
  24. Ensuring Consistency and Rigor: Metadata Tool Development
  25. Metadata Tool Assessments
    ▪ Assessed metadata tools in use today against requirements determined from collected user stories
      ▪ ATRAC
      ▪ AMS/Accession Tracking DataBase (ATDB)
      ▪ DOCUCOMP
      ▪ CEdit
      ▪ Send2NCEI
      ▪ Geoportal Server
      ▪ Geonetwork
      ▪ EMMA
      ▪ MERMAid
    ▪ The Metadata Tool Analysis suggested that none currently meet all needs, and highlighted a path forward...
  26. Metadata Tool System
    Overview
    [Diagram. Labels: ISO Adaptor/Validator; Geoportal + ElasticSearch; Metadata database; Collection CLOB; Granule CLOB; WAF; Other Metadata Tools; OneStop UI; DIF; JSON; SPASE; Etc...; ETL; Kibana (New Rubric); data.noaa.gov; Google, schema.org; WDS, WIS, etc. Sides marked "Inputs" and "Outputs".]
  27. DSMM Graphics Tool
    Look for Ge Peng’s session on Tuesday, next week at ESIP!
  28. Ensuring Community Alignment: Map to USGEO Common Framework
  29. USGEO Common Framework...
    Discovery
    • Provide CSW and OpenSearch
    • Provide Project Open Data and Schema.org metadata
    • Mint DOIs
    • Publish WAF to data.gov, expose via OAI-PMH
    • Use Resolvable Identifiers (e.g., ORCID)
    Access
    • All: HTTPS/FTP
    • Grids: WMS, WMTS, DAP, WCS
    • Unstructured Grids: UGRID
    • In Situ: SOS, WFS, DAP
    • Features: WFS
    • Tables: TableDAP
    Documentation
    • ISO 19115-1 and -3 preferred
    • ISO 19115-2 accepted
    • ISO 19157 for Data Quality
    • SensorML for Instruments
    • Dynamic conversion of ISO to Project Open Data JSON
    Formats
    • Numerical: netCDF4/HDF5
    • Imagery: GeoTIFF
    • Points/Lines/Polygons: GML
    • Hydrological: WaterML2.0
    • Weather: WXXM
    Vocabularies
    • Spatial Reference System: EPSG Geodetic P.D.
    • Hydrologic: WBD
    • Keywords: OMB Circular A-16, GCMD
    • Parameter Names: CF
    • Content Models: US GIN, Darwin Core, NEPAnode
    This map was generated by Kenneth S. Casey, based on the USGEO Common Framework for Earth Observation Data (2016).
  30. ...OneStop Currently Addressing
    Same framework map as the previous slide (Discovery, Access, Documentation, Formats, Vocabularies), with legend: OneStop Currently Addresses; OneStop Partly Addresses; OneStop Not Addressing.
  31. Organized for Success: Project Organization, Personnel, and Schedule
  32. Project Organization
    [Organization chart. Labels: OneStop Project Teams; OneStop Integrated Project Team (IPT): NCEI, ACIO-S, OSGS; Tom Karl/NCEI … ACIO-S; NESDIS DAA; NESDIS AA; Kenneth Casey, Project Manager; User Engagement; Cross-LO Engagement; Architecture Team; IT Services and Tools Team; Metadata and Data Improvement Team; Agile Team.]
  33. Organization Chart (for positions > 10% FTE)
    [Organization chart; labels in source order:] MD CO NC Martin Aubrey Partha Chowdhuri Rich Fozzard Steven Marcus Project Manager Kenneth S. Casey Asst: Mike Chapman IT Tools and Services Team John Relph Metadata and Data Improvement Team Nancy Ritchey Thomas Jaensch Raisa Ionin MD Robert Partee Jason Shapiro CO Paul Lemieux Justin Reid NC Architecture Team Dave Fischman Jay Morris (OSGS) OneStop IPT User Interface Team (Agile) Dave Neufeld Evan McQuinn CO Aaron Rosenberg Procurement Support James Goudouros (OSGS) Don Collins Yuanjie Li Phil Jones Anna Milan Robert Briscoe Tom Carey Joseph Mangin (ACIO-S) Sonny Zinn Arianna Jakositz Aaron Caldwell Semere Ghebrechristos J. Mize, K. Martinolich (MS) Data Group SMEs (CCOG/CWC)
    Legend: Funded by other; Funded by OneStop; ✅ = At ESIP Next Week
  34. OneStop Schedule (top level with selected milestones shown)
    Columns: Task Name; Days; Start; Finish; Progress (milestone rows show a date and progress only)
    1.0 Architecture | 60 | Thu 10/1/15 | Fri 12/24/16
      Design and Architecture Document | Thu 12/17/16 | 100%
    2.0 Identify Web Services | 59 | Thu 10/1/15 | Wed 12/23/15 | 100%
    3.0 Data/Metadata Best Practices | 59 | Thu 10/1/15 | Wed 12/23/15 | 100%
    4.0 Storage/IT Support | 237 | Wed 11/2/15 | Thu 9/22/16 | 71%
      Storage ConOps | Tue 5/24/2016 | 100%
    5.0 Development Team Setup | 123 | Thu 10/1/15 | Thu 4/28/16 | 100%
    6.0 Develop Beta Version | 164 | Mon 4/7/16 | Tue 12/6/16 | 14%
      Release Beta | Wed 12/7/16
    7.0 Internal Evaluation Report | 26 | Wed 12/7/17 | Tue 1/17/17 | 0%
    8.0 Develop Release 1.0 | 53 | Tue 1/17/17 | Fri 3/31/17 | 0%
      Release 1.0 | Mon 4/3/17
    9.0 Professional Usability Study | 20 | Mon 4/3/17 | Fri 4/28/17 | 0%
    10.0 Develop Release 1.1 | 64 | Mon 4/3/17 | Fri 6/30/17 | 0%
      Release 1.1 | Mon 7/3/17
    11.0 Data and Metadata Improvement | 325 | Tue 12/15/15 | Wed 3/31/17 | 20%
      2 data groupings | Thu 6/30/16 | 100%
      5 data groupings | Fri 9/30/2016 | 20%
      10 data groupings | Wed 3/31/17 | 0%
    12.0 Relevance Ranking Improvement | 265 | Fri 2/5/16 | Wed 3/15/17 | 5%
    13.0 Metadata Management System | 378 | Tue 12/15/15 | Wed 6/15/17 | 5%
    14.0 WMS Proxy | 336 | Fri 2/5/16 | Wed 6/15/17 | 4%
  35. Summary: Accomplishments to Date
    • OneStop Storage ConOps (v2.0 signed by NCEI/OSGS)
    • Hiring completed for ERT, GST, and CIRES team members, plus dedicated IT support in NC
    • Ongoing engagement following Communications Plan
    • Detailed Project Management Plan with quarterly updates
    • Agile Epics and five sprints completed
    • Initial user interface now functioning - demo next week at ESIP
    • USGEO Common Framework map documented
    • Data Stewardship Maturity Matrix (DSMM) Quick Start Guide
    • DSMM Graphic Visualizer tool released
    • Defined where to capture DSMM results in ISO record
    • 2 Data Groupings/Metadata Improved! (DEMs and NWLON/PORTS). GHRSST to Q4 (80%) due to DSMM effort. Progress on other datasets continues (see detailed tracking).
  36. Questions?
    OneStop
    See you at ESIP: https://2016esipsummermeeting.sched.org/event/c241039c436b775d4e2282d64bdfd912?iframe=no