"NOAA One-Stop", Ken Casey, NCEI

The OneStop Project is designed to improve NOAA's data discovery and access framework. Focusing on all layers of the framework and not just the user interface, OneStop is addressing data format and metadata best practices, ensuring more data are available through modern web services, working to improve the relevance of dataset searches, and improving both collection-level metadata management and granule-level metadata systems to accommodate the wide variety and vast scale of NOAA's data.

ESIP Federation

July 13, 2016

Transcript

  1. NATIONAL OCEANIC AND ATMOSPHERIC ADMINISTRATION
    The NOAA OneStop Data Discovery and Access Framework Project
    Kenneth S. Casey, PhD
    13 July 2016
    ESIP Tech Deep Dive

  2. Motivation and Scope
    In response to the President’s Open Government Initiative and related
    policies, NOAA has committed to providing improved public access to
    all of its environmental information, to enable research and commercial
    innovation through ease of data discovery and use.
    ▪ OneStop supports NOAA's efforts by leveraging existing access
    technologies and infusing specific innovations to provide improved
    discovery, access, and visualization services for NOAA’s data
    ▪ OneStop is viewed by NESDIS as a pathfinder effort with an
    initial focus on selected high-priority datasets from NESDIS and
    other program data meeting OneStop standards, but eventually
    scalable across NOAA’s data
    ▪ OneStop is implementing the USGEO Common Framework for
    Earth Observation Data and leveraging/supporting the NOAA Big
    Data Project (BDP) and Big Earth Data Initiative (BEDI)

  3. Architected for Success:
    Design, Architecture, and Storage ConOps

  4. OneStop Data Framework: 30,000 ft
    Diagram labels:
    Data Storage Services
    Catalog Services
    Showcase User Interface
    Data access, subset, visualization, and granule services
    Disk Storage
    Metadata
    Other User Interfaces
    Other Metadata Systems
    Other Data Access Systems
    Inside OneStop
    Outside OneStop
    BDP Cloud
    CLASS Tape

  5. Design and Architecture: 10,000 ft
    • Build on foundation of existing, mature data standards and web services
    • Emphasize not just interface, but the supporting data infrastructure
    • User-centered, design focused
    Design Principles
    ✅ Use existing enterprise capabilities when possible
    ✅ Rely on loose coupling of reusable system components
    ✅ Use standards at interfaces
    ✅ Use Open Source
    DIP = Dissemination Information Package
    Architecture diagram labels:
    User Interface Services: Showcase User Interface, Account Management
    Security Services: Authentication, Authorization, Audit
    Catalog Services: Search Engine, ETL Tools, Data Ranking, Discovery Service
    Metadata Management Services: Metadata Repository, Metadata Editor, WAF, Metadata Evaluation
    Data Access Services: Hyrax, TDS, FTP, HTTPS, ERDDAP, ArcGIS, LAS, WMS Proxy
    Storage Services: Disk-Based Storage (DIP), Tape-Based Storage (DIP), Cloud-Based Storage (DIP)

  6. Storage Services Unified for Users
    OneStop Discovery, Metadata, Stewardship, and Access Services
    OneStop/NCEI disk storage
    CLASS tape-based storage
    BDP cloud storage
    Unified access for the user regardless of storage medium
    Any and all services for a given dataset provided to the user
    Success Goal: 66% of users tested prefer new interface over old*
    (metric to be vetted by professional external review team)
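
The idea of unified access regardless of storage medium can be illustrated with a small sketch. Everything in it is hypothetical: the interface, class names, granule identifiers, and URLs are invented for illustration and are not taken from the OneStop implementation. It only shows how a caller might ask for a granule without knowing whether it sits on disk, tape, or cloud storage.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Protocol


class StorageBackend(Protocol):
    """Hypothetical interface: anything that can map a granule ID to an access URL."""

    def locate(self, granule_id: str) -> Optional[str]:
        ...


@dataclass
class MappedBackend:
    """Toy backend using an in-memory mapping; stands in for disk, tape, or cloud."""

    name: str
    urls: Dict[str, str]

    def locate(self, granule_id: str) -> Optional[str]:
        return self.urls.get(granule_id)


def resolve(granule_id: str, backends: List[StorageBackend]) -> str:
    """Return the first URL any backend offers, so the caller never picks a medium."""
    for backend in backends:
        url = backend.locate(granule_id)
        if url is not None:
            return url
    raise KeyError(f"no storage backend holds {granule_id!r}")


# Illustrative data only; these identifiers and URLs are made up.
disk = MappedBackend("disk", {"granule-a": "https://example.org/disk/granule-a.nc"})
cloud = MappedBackend("cloud", {"granule-b": "https://example.org/cloud/granule-b.nc"})
print(resolve("granule-b", [disk, cloud]))
```

The caller passes an ordered list of backends, so preferring one medium over another becomes a matter of list order rather than user choice.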

  7. Storage Concept of Operations
    Reflects decision to place storage within NCEI 5009 system boundary.
    Agreement between NCEI, OSGS, and OSPO to consider OneStop storage as
    the next step toward enterprise Storage Infrastructure Service (SIS)
    and as a step toward a key Mission Science Network (MSN) capability.
    Mission Science Network (Future)

  8. OneStop Featured Data Groups and
    OneStop “Readiness”

  9. “OneStop Ready”
    Readiness Metric | Requirement
    ISO Compliant Collection-level Metadata | Every collection level record in the data group has an ISO compliant metadata record.
    ISO Completeness Collection-level Rubric V2 | Every collection level record in the data group shall have a completeness score of at least 90%.
    OneStop Collection-level Readiness Rubric | Browse graphic, GCMD science keywords...
    Standardized metadata exists for each granule or is embedded within each granule | ACDD and CF conventions for embedded metadata
    Granule metadata contains OneStop-required content | See OneStop granule metadata specification
    Machine Independent Data File Format | Each granule is formatted in a machine readable format, such as netCDF
    Each granule is accessible via a URL | Minimally, direct download https/ftps but prefer interoperable services (USGEO Common Framework)
    Data Stewardship Maturity Matrix (DSMM) | Assessment is complete and documented in collection-level metadata record
    Product Maturity Matrix (PMM) | Optional. If PMM exists, then document results in collection level metadata
    Percent readiness for a data group is assessed in each of five categories: Collection Metadata, Granule Metadata, Data Formats, Data
    Access, and DSMM. A data group as a whole is considered “OneStop Ready” when it reaches 95% overall or higher.
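
The readiness rule above can be sketched in a few lines. The slide gives the five categories and the 95% threshold but does not say how the per-category scores are combined, so the unweighted mean used here is an assumption, as are the function names, the dictionary keys, and the choice to skip a category marked not applicable.

```python
from typing import Dict, Optional

READY_THRESHOLD = 95.0  # from the slide: "95% overall or higher"

# The five assessed categories named on the slide (key names are hypothetical).
CATEGORIES = (
    "collection_metadata",
    "granule_metadata",
    "data_formats",
    "data_access",
    "dsmm",
)


def overall_readiness(scores: Dict[str, Optional[float]]) -> float:
    """Unweighted mean of the category percentages (assumed formula).

    A score of None marks a category as not applicable; it is left out of the mean.
    """
    assessed = [scores[c] for c in CATEGORIES if scores.get(c) is not None]
    if not assessed:
        raise ValueError("no categories assessed")
    return sum(assessed) / len(assessed)


def is_onestop_ready(scores: Dict[str, Optional[float]]) -> bool:
    """Apply the 95% threshold to the combined score."""
    return overall_readiness(scores) >= READY_THRESHOLD


# Illustrative values only, not taken from the status tables.
example = {
    "collection_metadata": 84.0,
    "granule_metadata": 100.0,
    "data_formats": 100.0,
    "data_access": 100.0,
    "dsmm": 100.0,
}
print(round(overall_readiness(example), 1), is_onestop_ready(example))
```

With these illustrative inputs the mean is 96.8, which clears the threshold even though one category is well below 95% on its own.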

  10. OneStop Featured Data Groupings
    Data Group | Subject Matter Expert | Number of Collections
    Digital Elevation Models | Barry Eakins, Kelly Carignan | 137
    CO-OPS NWLON PORTS | Tom Ryan | 1
    World Ocean Atlas 2013 | Tim Boyer | 8
    Group for High Resolution SST | Korak Saha | 81
    NDBC C-MAN | Tom Ryan | 1
    NOAA Climate Data Records | Jesse Glance, Tom Zhao | 32
    OCS Hydro | Jason Baillio | 17,763
    COAPS SAMOS | Chris Paver | 1
    NEXRAD Level 2 and 3 | Steve Ansari | 2
    Reformatted Legacy GOES GVAR data | Ken Knapp | 2 to 10
    S-NPP/JPSS | Axel Graumann | 75
    Water Column Sonar Data | Chuck Anderson, Carrie Wall-Bell | 368
    ESSA images | Jason Cooper | 1

  11. “OneStop Ready” Status - April
    Columns: Data Group, Percent Ready, Collection Metadata, Granule Metadata, Data Formats, Data Access, DSMM, PMM (optional)
    Digital Elevation Models P P P N
    CO-OPS NWLON PORTS Y N Y Y N
    World Ocean Atlas 2013 N N Y Y N
    Group for High Resolution SST Y Y Y Y N
    NDBC C-MAN Y N Y Y N
    NOAA Climate Data Records P N Y P N Y
    OCS Hydro P N P Y N
    COAPS SAMOS P N P Y N
    NEXRAD Level 2 and 3 P N N P N
    Reformatted Legacy GOES GVAR data N N N N N
    S-NPP/JPSS P N P N N
    Water Column Sonar Data P P N N N
    ESSA images N N N N N
    Y = yes, ready; P = partially ready; N= not ready;
    grey = not yet assessed or not applicable
    View Live Table Here

  12. “OneStop Ready” Status - July
    Columns: Data Group, Percent Ready, Collection Metadata, Granule Metadata, Data Formats, Data Access, DSMM, PMM (optional)
    Digital Elevation Models 100% Y N/A Y Y Y
    CO-OPS NWLON PORTS 97% 84% Y Y Y Y
    World Ocean Atlas 2013 77% 88% Y Y Y N
    Group for High Resolution SST 80% 95% 95% Y Y 10%
    NDBC C-MAN 57% 84% N Y Y N
    NOAA Climate Data Records 70% 90% P Y 75% Y Y
    OCS Hydro P N P Y N
    COAPS SAMOS P N P Y N
    NEXRAD Level 2 and 3 73% 90% P Y P 75%
    Reformatted Legacy GOES GVAR data N N N N N
    S-NPP/JPSS 35% 50% N P N 75%
    Water Column Sonar Data P P N N N
    ESSA images N N N N N
    Y = yes, ready; P = partially ready; N= not ready;
    grey = not yet assessed or not applicable
    View Live Table Here

  13. User-Centered Development:
    Progress on User Interface

  14. Wire Frames: Intro Page
    Drawing from https://standards.usa.gov/
    and other sources
    NOAA OneStop
    https://www.ncei.noaa.gov/onestop

  15. Wire Frames: Simple Search
    NOAA OneStop
    https://www.ncei.noaa.gov/onestop

  16. Wire Frames: Grouped Results
    NOAA OneStop
    https://www.ncei.noaa.gov/onestop

  17. Wire Frames: Icon Grid Results
    NOAA OneStop
    https://www.ncei.noaa.gov/onestop

  18. Wire Frames: Map Results
    NOAA OneStop
    https://www.ncei.noaa.gov/onestop

  19. User Interface: Under the Hood
    Overview of System Components:
    DEM

  20. User Interface: Under the Hood
    Data Loading:
    ● Pulling data from:
    ○ MD Geoportal: GHRSST
    ○ CO WAF: DEM
    ● Write metadata to local Elasticsearch via OneStop API
    ● Write metadata to Geoportal for availability via CSW & OpenSearch
    DEM
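
As a rough illustration of the harvesting step, the sketch below lists the XML metadata records exposed by a web accessible folder (WAF), which is typically just an HTML directory listing. The listing content and the example.org URL are invented, and the function and class names are hypothetical. Fetching the records and posting them to the OneStop API are left out because the slides do not describe those interfaces.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin


class WafLinkParser(HTMLParser):
    """Collect links ending in .xml from a WAF-style HTML directory listing."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.records: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and href.lower().endswith(".xml"):
            # Resolve relative links against the folder URL.
            self.records.append(urljoin(self.base_url, href))


def list_waf_records(listing_html: str, base_url: str) -> List[str]:
    """Return absolute URLs of the XML records linked from the listing."""
    parser = WafLinkParser(base_url)
    parser.feed(listing_html)
    return parser.records


# Made-up directory listing standing in for a real WAF response.
SAMPLE_LISTING = (
    '<html><body>'
    '<a href="../">Parent Directory</a>'
    '<a href="dem_001.xml">dem_001.xml</a>'
    '<a href="dem_002.xml">dem_002.xml</a>'
    '<a href="readme.txt">readme.txt</a>'
    '</body></html>'
)
print(list_waf_records(SAMPLE_LISTING, "https://example.org/waf/dem/"))
```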

  21. User Interface: Under the Hood
    Server API:
    ● Translates JSON search request from UI into Elasticsearch query
    ● Now supports temporal and spatial searching in addition to simple text search
    ● Returns top 10 results (pagination features to be developed later)
    DEM
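
The translation step described above might look roughly like the following. This is a sketch under stated assumptions, not the OneStop API: the request keys (`text`, `start`, `end`, `bbox`) and the index field names (`temporal_start`, `temporal_end`, `spatial_bounding`) are hypothetical. The query body itself uses standard Elasticsearch query DSL constructs (`bool`, `query_string`, `range`, `geo_shape`), and `size` is fixed at 10 to match the "top 10 results" behaviour.

```python
def build_es_query(request: dict) -> dict:
    """Translate a UI search request into an Elasticsearch query body.

    Request keys and index field names are hypothetical placeholders.
    """
    must = []
    filters = []

    text = request.get("text")
    if text:
        must.append({"query_string": {"query": text}})

    # Temporal overlap: the record starts before the search window ends
    # and ends after the search window starts.
    if request.get("end"):
        filters.append({"range": {"temporal_start": {"lte": request["end"]}}})
    if request.get("start"):
        filters.append({"range": {"temporal_end": {"gte": request["start"]}}})

    bbox = request.get("bbox")  # assumed order: [west, south, east, north]
    if bbox:
        west, south, east, north = bbox
        filters.append({
            "geo_shape": {
                "spatial_bounding": {
                    "shape": {
                        "type": "envelope",
                        # Envelope corners are upper-left, then lower-right.
                        "coordinates": [[west, north], [east, south]],
                    },
                    "relation": "intersects",
                }
            }
        })

    if not must:
        must.append({"match_all": {}})

    return {"size": 10, "query": {"bool": {"must": must, "filter": filters}}}


example_request = {
    "text": "sea surface temperature",
    "start": "2016-01-01",
    "end": "2016-07-13",
    "bbox": [-100.0, 20.0, -80.0, 30.0],
}
print(build_es_query(example_request))
```

Putting the temporal and spatial constraints in `filter` rather than `must` keeps them from affecting relevance scores, so ranking depends only on the text match.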

  22. User Interface: Under the Hood
    UI Features:
    ● Text search against DEM and GHRSST metadata
    ● Clickable flipcard results grid
    ● NOAA look/feel
    ● Header/Footer
    In progress:
    ● Spatial & temporal search
    DEM

  23. User Interface Demo: Next Week!

  24. Ensuring Consistency and Rigor:
    Metadata Tool Development

  25. Metadata Tool Assessments
    ▪ Assessed metadata tools in use today against
    requirements determined from collected user stories
    ▪ ATRAC
    ▪ AMS/Accession Tracking DataBase (ATDB)
    ▪ DOCUCOMP
    ▪ CEdit
    ▪ Send2NCEI
    ▪ Geoportal Server
    ▪ Geonetwork
    ▪ EMMA
    ▪ MERMAid
    ▪ The Metadata Tool Analysis suggested that none currently
    meet all needs, and highlighted a path forward...

  26. Metadata Tool System
    Overview
    Diagram labels:
    ISO Adaptor/Validator
    Geoportal + ElasticSearch
    Metadata database
    Collection CLOB
    Granule CLOB
    WAF
    Other Metadata Tools
    OneStop UI
    DIF
    JSON
    SPASE
    Etc...
    ETL
    Kibana (New Rubric)
    data.noaa.gov
    Google, schema.org
    Inputs
    Outputs
    WDS, WIS, etc.

  27. DSMM Graphics Tool
    Look for Ge Peng’s session on Tuesday, next week at ESIP!

  28. Ensuring Community Alignment:
    Map to USGEO Common Framework

  29. USGEO Common Framework...
    Discovery
    • Provide CSW and OpenSearch
    • Provide Project Open Data and Schema.org metadata
    • Mint DOIs
    • Publish WAF to data.gov, expose via OAI-PMH
    • Use Resolvable Identifiers (e.g., ORCID)
    Access
    • All: HTTPS/FTP
    • Grids: WMS, WMTS, DAP, WCS
    • Unstructured Grids: UGRID
    • In Situ: SOS, WFS, DAP
    • Features: WFS
    • Tables: TableDAP
    Documentation
    • ISO 19115-1 and -3 preferred
    • ISO 19115-2 accepted
    • ISO 19157 for Data Quality
    • SensorML for Instruments
    • Dynamic conversion of ISO to Project Open Data JSON
    Formats
    • Numerical: netCDF4/HDF5
    • Imagery: GeoTIFF
    • Points/Lines/Polygons: GML
    • Hydrological: WaterML2.0
    • Weather: WXXM
    Vocabularies
    • Spatial Reference System: EPSG Geodetic P.D.
    • Hydrologic: WBD
    • Keywords: OMB Circular A-16, GCMD
    • Parameter Names: CF
    • Content Models: US GIN, Darwin Core, NEPAnode
    This map was generated by Kenneth S. Casey,
    based on the USGEO Common Framework for
    Earth Observation Data (2016).
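
One item on the map, the conversion of ISO metadata to Project Open Data JSON, lends itself to a short sketch. The example below pulls a handful of fields from an ISO 19115/19139 XML record and emits them under the corresponding Project Open Data keys (`identifier`, `title`, `description`, `keyword`). The sample record and the function name are made up for illustration, and a complete Project Open Data entry requires more fields (for example `modified`, `publisher`, `contactPoint`, and `accessLevel`) than are mapped here.

```python
import json
import xml.etree.ElementTree as ET

# Namespaces used by ISO 19139 XML encodings of ISO 19115 metadata.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}
ID_INFO = "gmd:identificationInfo/gmd:MD_DataIdentification"


def _text(root, path):
    """Return the stripped text at path, or None when the element is absent."""
    node = root.find(path, NS)
    return node.text.strip() if node is not None and node.text else None


def iso_to_pod(iso_xml: str) -> dict:
    """Map a few ISO fields onto Project Open Data keys (partial mapping)."""
    root = ET.fromstring(iso_xml)
    keyword_path = (
        f"{ID_INFO}/gmd:descriptiveKeywords/gmd:MD_Keywords"
        "/gmd:keyword/gco:CharacterString"
    )
    return {
        "identifier": _text(root, "gmd:fileIdentifier/gco:CharacterString"),
        "title": _text(
            root, f"{ID_INFO}/gmd:citation/gmd:CI_Citation/gmd:title/gco:CharacterString"
        ),
        "description": _text(root, f"{ID_INFO}/gmd:abstract/gco:CharacterString"),
        "keyword": [k.text.strip() for k in root.findall(keyword_path, NS) if k.text],
    }


# Made-up record for illustration only.
SAMPLE_ISO = """
<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:fileIdentifier><gco:CharacterString>example-id</gco:CharacterString></gmd:fileIdentifier>
  <gmd:identificationInfo>
    <gmd:MD_DataIdentification>
      <gmd:citation><gmd:CI_Citation>
        <gmd:title><gco:CharacterString>Example Sea Surface Temperature</gco:CharacterString></gmd:title>
      </gmd:CI_Citation></gmd:citation>
      <gmd:abstract><gco:CharacterString>Illustrative record.</gco:CharacterString></gmd:abstract>
      <gmd:descriptiveKeywords><gmd:MD_Keywords>
        <gmd:keyword><gco:CharacterString>OCEANS</gco:CharacterString></gmd:keyword>
      </gmd:MD_Keywords></gmd:descriptiveKeywords>
    </gmd:MD_DataIdentification>
  </gmd:identificationInfo>
</gmd:MD_Metadata>
"""
print(json.dumps(iso_to_pod(SAMPLE_ISO), indent=2))
```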

  30. ...OneStop Currently Addressing
    (Same USGEO Common Framework map as on slide 29.)
    Legend:
    OneStop Currently Addresses
    OneStop Partly Addresses
    OneStop Not Addressing

  31. Organized for Success:
    Project Organization, Personnel, and Schedule

  32. Project Organization
    OneStop Project Teams
    OneStop Integrated Project Team (IPT)
    NCEI, ACIO-S, OSGS
    Tom Karl/NCEI … ACIO-S
    NESDIS DAA
    NESDIS AA
    Kenneth Casey, Project Manager
    User Engagement
    Cross-LO Engagement
    Architecture Team
    IT Services and Tools Team
    Metadata and Data Improvement Team
    Agile Team

  33. Organization Chart
    (for positions > 10% FTE)
    MD
    CO NC
    Martin Aubrey
    Partha Chowdhuri
    Rich Fozzard
    Steven Marcus
    Project Manager
    Kenneth S. Casey
    Asst: Mike Chapman
    IT Tools and Services Team
    John Relph
    Metadata and Data Improvement Team
    Nancy Ritchey
    Thomas Jaensch
    Raisa Ionin
    MD
    Robert Partee
    Jason Shapiro
    CO
    Paul Lemieux
    Justin Reid
    NC
    Architecture Team
    Dave Fischman
    Jay Morris (OSGS)
    OneStop IPT
    User Interface Team (Agile)
    Dave Neufeld
    Evan McQuinn
    CO
    Aaron Rosenberg
    Procurement Support
    James Goudouros (OSGS)
    Don Collins
    Yuanjie Li
    Phil Jones
    Anna Milan
    Robert Briscoe
    Tom Carey
    Joseph Mangin (ACIO-S)
    Sonny Zinn
    Arianna Jakositz
    Aaron Caldwell
    Semere Ghebrechristos
    Funded by other
    Funded by OneStop
    J. Mize, K. Martinolich (MS)
    Data Group SMEs (CCOG/CWC)
    ✅ = At ESIP Next Week

  34. OneStop Schedule
    (top level with selected milestones shown)
    Task Name Days Start Finish Progress
    1.0 Architecture 60 Thu 10/1/15 Fri 12/24/16
    Design and Architecture Document Thu 12/17/15 100%
    2.0 Identify Web Services 59 Thu 10/1/15 Wed 12/23/15 100%
    3.0 Data/Metadata Best Practices 59 Thu 10/1/15 Wed 12/23/15 100%
    4.0 Storage/IT Support 237 Wed 11/2/15 Thu 9/22/16 71%
    Storage ConOps Tue 5/24/2016 100%
    5.0 Development Team Setup 123 Thu 10/1/15 Thu 4/28/16 100%
    6.0 Develop Beta Version 164 Mon 4/7/16 Tue 12/6/16 14%
    Release Beta Wed 12/7/16
    7.0 Internal Evaluation Report 26 Wed 12/7/16 Tue 1/17/17 0%
    8.0 Develop Release 1.0 53 Tue 1/17/17 Fri 3/31/17 0%
    Release 1.0 Mon 4/3/17
    9.0 Professional Usability Study 20 Mon 4/3/17 Fri 4/28/17 0%
    10.0 Develop Release 1.1 64 Mon 4/3/17 Fri 6/30/17 0%
    Release 1.1 Mon 7/3/17
    11.0 Data and Metadata Improvement 325 Tue 12/15/15 Wed 3/31/17 20%
    2 data groupings Thu 6/30/16 100%
    5 data groupings Fri 9/30/2016 20%
    10 data groupings Wed 3/31/17 0%
    12.0 Relevance Ranking Improvement 265 Fri 2/5/16 Wed 3/15/17 5%
    13.0 Metadata Management System 378 Tue 12/15/15 Wed 6/15/17 5%
    14.0 WMS Proxy 336 Fri 2/5/16 Wed 6/15/17 4%
    link to full schedule

  35. Summary: Accomplishments to Date
    • OneStop Storage ConOps (v2.0 signed by NCEI/OSGS)
    • Hiring completed for ERT, GST, and CIRES team members, plus
    dedicated IT support in NC.
    • Ongoing engagement following Communications Plan
    • Detailed Project Management Plan with quarterly updates
    • Agile Epics and five sprints completed
    • Initial user interface now functioning - demo next week at ESIP
    • USGEO Common Framework map documented
    • Data Stewardship Maturity Matrix (DSMM) Quick Start Guide
    • DSMM Graphic Visualizer tool released
    • Defined where to capture DSMM results in ISO record
    • 2 Data Groupings/Metadata Improved! (DEMs and
    NWLON/PORTS). GHRSST to Q4 (80%) due to DSMM effort.
    Progress on other datasets continues (see detailed tracking).

  36. Questions?
    OneStop
    See you at ESIP:
    https://2016esipsummermeeting.sched.org/event/c241039c436b775d4e2282d64bdfd912?iframe=no