Programmatic Metadata Augmentation and Processing for Enhancing Data Discoverability

Michael Shensky, University of Texas at Austin

Transcript

  1. Programmatic Metadata Augmentation and Processing for Enhancing Discoverability of Geospatial Data

    Michael Shensky, GIS & Geospatial Data Coordinator
  2. Metadata & the Texas GeoData Portal Project

    The Texas GeoData portal project was completed in 2019 after 2 years of active development, much of it focused on metadata.

    - Build a feature-rich, easy-to-use geodata portal
    - Develop scalable, sustainable metadata processes
  3. Project Challenges

    - No existing metadata in standard GIS formats
    - Digital collection managers new to working with GIS software
    - Collaboration between IT and project stakeholder group
    - Over 60,000 datasets to process
    - Data from multiple, diverse collections
    - Many scanned map raster datasets not georeferenced
    - Limited metadata for individual datasets, more available at collection level
    - ArcGIS Desktop ArcPy vs ArcGIS Pro ArcPy
  4. Current Texas GeoData Portal Technology Stack

    - PostgreSQL
    - ArcGIS Server
    - GeoBlacklight
    - Solr
    - Docker
    - Python + ArcPy
  5. Metadata for Individual Datasets (Simple)

    - Guide for Metadata Authors
    - Scripted Metadata Export
    - Manual Metadata Entry
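The transcript does not show the script behind "Scripted Metadata Export", and the deck's own tooling is Python + ArcPy. As a rough, standard-library-only sketch of the general idea, the following writes a title and abstract into an ISO 19139-style XML fragment. The function names and the sample values are hypothetical, and the output is deliberately minimal: it omits elements a schema-valid ISO 19139 record would require (contact, date stamp, and so on).

```python
import xml.etree.ElementTree as ET

# Namespaces used by ISO 19139 metadata records.
GMD = "http://www.isotc211.org/2005/gmd"
GCO = "http://www.isotc211.org/2005/gco"
ET.register_namespace("gmd", GMD)
ET.register_namespace("gco", GCO)


def _char_string(parent, tag, text):
    """Append <gmd:tag><gco:CharacterString>text</...></...> to parent."""
    wrapper = ET.SubElement(parent, f"{{{GMD}}}{tag}")
    ET.SubElement(wrapper, f"{{{GCO}}}CharacterString").text = text
    return wrapper


def build_iso19139(title, abstract):
    """Return a minimal (not schema-valid) ISO 19139-style XML string.

    Hypothetical helper for illustration; not the deck's actual script.
    """
    root = ET.Element(f"{{{GMD}}}MD_Metadata")
    ident = ET.SubElement(root, f"{{{GMD}}}identificationInfo")
    data_id = ET.SubElement(ident, f"{{{GMD}}}MD_DataIdentification")
    citation = ET.SubElement(data_id, f"{{{GMD}}}citation")
    ci_citation = ET.SubElement(citation, f"{{{GMD}}}CI_Citation")
    _char_string(ci_citation, "title", title)
    _char_string(data_id, "abstract", abstract)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Sample values are made up for the example.
    print(build_iso19139("Sample scanned map", "A hypothetical record."))
```

Looping a function like this over a table of dataset titles and descriptions is one way a single script could stand in for repeated manual entry.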
  6. Metadata for Collections of Raster Datasets (Complex)

    [Workflow diagram with steps numbered 1 to 6; labeled ISO 19139]
  7. Raster Spatial Keyword Lookup

    Automated select by location run using map extent and Natural Earth data for countries, states, lakes, etc.

    ISO 19139
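The select-by-location lookup on slide 7 can be approximated without GIS software. The sketch below swaps in a plain bounding-box overlap test for the true geometry intersection the deck presumably ran with ArcPy against Natural Earth features. The `REFERENCE_FEATURES` table and its coordinates are rough, hand-entered illustrations, not Natural Earth data, and all names here are hypothetical.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    """Axis-aligned extent in decimal degrees."""
    west: float
    south: float
    east: float
    north: float


def intersects(a, b):
    """True when two boxes overlap or touch (no antimeridian handling)."""
    return (a.west <= b.east and b.west <= a.east
            and a.south <= b.north and b.south <= a.north)


# Illustrative, approximate extents only; a real workflow would test
# against actual feature geometries rather than their bounding boxes.
REFERENCE_FEATURES = {
    "Texas": BBox(-106.7, 25.8, -93.5, 36.5),
    "Louisiana": BBox(-94.1, 28.9, -88.8, 33.0),
    "Oklahoma": BBox(-103.0, 33.6, -94.4, 37.0),
}


def spatial_keywords(extent, features=REFERENCE_FEATURES):
    """Return the sorted names of reference features the extent overlaps."""
    return sorted(name for name, box in features.items()
                  if intersects(extent, box))


if __name__ == "__main__":
    # Extent of a hypothetical scanned map near the Texas-Louisiana border.
    print(spatial_keywords(BBox(-94.5, 31.0, -93.0, 32.0)))
```

Bounding boxes over-match near irregular borders, so this approach would tag some maps with neighboring places that they do not actually cover; intersecting against real polygons avoids that at the cost of needing a geometry library.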
  8. Metadata Schema Mapping Close Up

    - MARC Schema
    - ISO 19139 Schema
    - GeoBlacklight Schema
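The mapping table itself is not in the transcript, so the following is only a sketch of what the final step of such a crosswalk might look like: turning an already-parsed record into a GeoBlacklight document. The output field names follow the GeoBlacklight 1.0 metadata schema as commonly documented; the input dictionary keys (`identifier`, `title`, `abstract`, `subjects`, `place_keywords`, `bbox`) are hypothetical and chosen for the example.

```python
def envelope(west, east, north, south):
    """Format an extent in the ENVELOPE(W, E, N, S) syntax used by solr_geom."""
    return f"ENVELOPE({west}, {east}, {north}, {south})"


def to_geoblacklight(record):
    """Map a simplified record dict to GeoBlacklight 1.0-style fields.

    `record` uses hypothetical keys; `bbox` is (west, south, east, north).
    Only a subset of GeoBlacklight fields is filled in.
    """
    west, south, east, north = record["bbox"]
    return {
        "geoblacklight_version": "1.0",
        "dc_identifier_s": record["identifier"],
        "layer_slug_s": record["identifier"],
        "dc_title_s": record["title"],
        "dc_description_s": record.get("abstract", ""),
        "dc_subject_sm": list(record.get("subjects", [])),
        "dct_spatial_sm": list(record.get("place_keywords", [])),
        "solr_geom": envelope(west, east, north, south),
    }


if __name__ == "__main__":
    # Made-up record for illustration.
    sample = {
        "identifier": "example-map-001",
        "title": "Sample scanned map",
        "subjects": ["Maps"],
        "place_keywords": ["Texas"],
        "bbox": (-98.0, 30.0, -97.5, 30.5),
    }
    print(to_geoblacklight(sample))
```

Keeping the crosswalk in one function makes the argument-order change explicit: the input extent is west, south, east, north, while `ENVELOPE` expects west, east, north, south.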
  9. Front End Publishing

  10. OpenGeoMetadata Harvesting

  11. GeoData Portal Demonstration

    - Federated Search
    - Browse
    - Facet
    - Featured Collections
    - Download Data
    - Download Metadata
  12. Future Work

    676 datasets shared… 60,000+ datasets to go

  13. Lessons Learned

    - Discoverability of data in large collections extremely dependent on metadata quality
    - Scripted processes can greatly enhance metadata processing efficiency
    - Scalability, sustainability, and adaptability must be carefully considered when planning metadata workflows for a geodata portal

    Acknowledgements

    GIS Project Stakeholders past and present
    - Jessica Trelogan, Jenifer Flaxbart, Katie Pierce Meyer, Albert Palacios, Katherine Strickland, Elle Covington, Benn Chang, Dennis Trombatore, Anna Lamphear, Loretta Wallace, Beth Dodd

    UT Libraries IT
    - Dustin Slater, Crystal Arnspiger, Larry Yang, Heather Langley, Noah King, Dave Ronn, Perry Thompson, Brandon Stennett