Crunching Data In GeoServer with Discrete Global Grid Systems (DGGS)

Discrete Global Grid Systems are a way to tessellate the entire planet into zones sharing similar characteristics, with multiple resolutions to address different precision needs, allowing integration of data coming from different sources and on-demand analysis of data.

Come to this presentation for an introduction to DGGS concepts, to learn when they are a good fit for a specific problem, and to get an update on their implementation in GeoServer.

Simone Giannecchini

October 04, 2021

Transcript

  1. Andrea Aime Simone Giannecchini GeoSolutions Crunching Data In GeoServer with

    Discrete Global Grid Systems (DGGS)
  2. GeoSolutions • Offices in Italy & US, Worldwide clients •

    30+ collaborators, 25+ Engineers • Our products • Our Offer Enterprise Support Services Deployment Subscription Professional Training Customized Solutions GeoNode
  3. Affiliations • We strongly support Open Source, it is in our

    core • We actively participate in OGC working groups and get funded to advance new open standards • We support standards critical to GEOINT
  4. Introduction to DGGS concepts

  5. DGGS • Discrete Global Grid System (DGGS) • Earth partitioning

    (no overlap). Each partition is called a “zone”. Each zone has a unique identifier. • Zones should have the same area (but not all implementations do) • Partitioning has no arbitrary limits (poles, dateline) • Multi-resolution, zones have parent/child relationships
  6. DGGS libraries • A DGGS library implements the geometric structure

    of a particular DGGS. At a minimum: • Conversion between zone identifiers and their polygonal geometry • From point/polygon to zones • Given a zone, return its parent, children, and neighbors • DGGS and libraries used in the GeoServer implementation: rHealPix, Uber’s H3
  7. rHealPix • Zones are equal area • Parents contain exactly

    9 children • The union of the children makes up the parent • A cell has 4 neighbors, diagonal ones are not considered close • Zone identifiers are easy to reason about • P is the parent of P1 • P1 is the parent of P12 • Only a Python-based implementation is available
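The identifier scheme on this slide lends itself to plain string handling. The following is a minimal Python sketch, assuming an rHealPix identifier is a resolution 0 letter followed by one digit in 0-8 per resolution level (consistent with the P, P1, P12 example). The function names are hypothetical and do not belong to any rHealPix library.

```python
from typing import List, Optional


def resolution(zone_id: str) -> int:
    """Resolution is the number of digits after the leading letter."""
    return len(zone_id) - 1


def parent(zone_id: str) -> Optional[str]:
    """Drop the last digit; resolution 0 zones have no parent."""
    return zone_id[:-1] if len(zone_id) > 1 else None


def children(zone_id: str) -> List[str]:
    """Each zone has exactly 9 children, one per digit 0-8 (assumed)."""
    return [f"{zone_id}{digit}" for digit in range(9)]


def descendants(zone_id: str, target_resolution: int) -> List[str]:
    """Expand a zone down to the requested, finer resolution."""
    if target_resolution < resolution(zone_id):
        raise ValueError("target resolution is coarser than the zone")
    zones = [zone_id]
    while resolution(zones[0]) < target_resolution:
        zones = [child for zone in zones for child in children(zone)]
    return zones


if __name__ == "__main__":
    print(parent("P12"))                 # parent of P12
    print(children("P"))                 # the 9 children of P
    print(len(descendants("N66", 4)))    # zones under N66 at r=4
```

Because every level appends exactly one digit, the parent/child relationship is a prefix test, which is what makes these identifiers easy to reason about.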
  8. H3 • Hexagon based system, with a few pentagons in

    the mix • Each zone has 6 or 7 children • The union of the children does not exactly match the parent • Not equal area • Neighboring zones are all at the same center-to-center distance • Zone identifiers are hard to reason about • 817c3ffffffffff is a child of 807dfffffffffff • Excellent implementation, with bindings in many languages
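H3 identifiers look opaque, but they are 64-bit integers printed in hexadecimal, and the parent/child link can be recovered with bit operations. The sketch below assumes the bit layout described in the H3 documentation (4 resolution bits starting at bit 52, 7 base cell bits starting at bit 45, then fifteen 3-bit digits, with unused digits set to 7). The function names are hypothetical; in practice the official H3 bindings should be preferred.

```python
RES_OFFSET = 52          # resolution occupies bits 52-55
RES_MASK = 0xF
BASE_CELL_OFFSET = 45    # base cell occupies bits 45-51
BASE_CELL_MASK = 0x7F
MAX_RESOLUTION = 15
DIGIT_BITS = 3
UNUSED_DIGIT = 0b111     # digits finer than the zone resolution are all ones


def h3_resolution(zone_id: str) -> int:
    """Extract the resolution from a hexadecimal H3 identifier."""
    return (int(zone_id, 16) >> RES_OFFSET) & RES_MASK


def h3_base_cell(zone_id: str) -> int:
    """Extract the resolution 0 base cell number."""
    return (int(zone_id, 16) >> BASE_CELL_OFFSET) & BASE_CELL_MASK


def h3_parent(zone_id: str, parent_resolution: int) -> str:
    """Compute the ancestor of a zone at a coarser resolution."""
    index = int(zone_id, 16)
    current = (index >> RES_OFFSET) & RES_MASK
    if not 0 <= parent_resolution <= current:
        raise ValueError("parent resolution must be in [0, zone resolution]")
    # Overwrite the resolution field.
    index = (index & ~(RES_MASK << RES_OFFSET)) | (parent_resolution << RES_OFFSET)
    # Mark every digit finer than the parent resolution as unused.
    for level in range(parent_resolution + 1, current + 1):
        index |= UNUSED_DIGIT << ((MAX_RESOLUTION - level) * DIGIT_BITS)
    return format(index, "x")


if __name__ == "__main__":
    print(h3_resolution("817c3ffffffffff"))
    print(h3_parent("817c3ffffffffff", 0))
```

Applied to the identifiers on this slide, the sketch maps 817c3ffffffffff at resolution 1 back to 807dfffffffffff at resolution 0.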
  9. Viewing the DGGS

  10. DGGS “geometry” datastore • First step, encapsulate the DGGS libraries

    behind a common interface • Then build a GeoServer data store reporting the zone structure and attributes → WMS, WFS! • Difficulties binding to the rHealPix Python implementation: • Used JEP to call into the Python interpreter • Performance and scalability are limited
  11. Viewing rHealPix with the store rHealPix on WMS, plate carrée

    r=1 r=0
  12. Viewing H3 with the store H3 on WMS, plate carrée

    r=1 r=0
  13. WFS download • WFS download allows other software to display

    and manipulate DGGS zones • In this example, the WFS generated a shapefile, which was then displayed in QGIS
  14. Representing data with DGGS

  15. ClickHouse DataStore • DGGS zones count can grow very large

    (hundreds of trillions to cover the entire planet at max resolution) • DGGS is especially interesting for analysis • ClickHouse DGGS datastore • OLAP database • Tables partitioned by default, can spread partitions over nodes • Runs queries using all cores and all nodes
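To get a feel for how quickly zone counts grow, the following Python sketch computes the number of zones per resolution. It assumes the H3 count formula from the H3 documentation (2 + 120 × 7^r, with a maximum resolution of 15) and an rHealPix configuration with 6 resolution 0 zones and 9 children per zone. The function names are hypothetical.

```python
def h3_zone_count(resolution: int) -> int:
    """Number of H3 zones covering the planet at a given resolution."""
    return 2 + 120 * 7 ** resolution


def rhealpix_zone_count(resolution: int) -> int:
    """Number of rHealPix zones, assuming 6 base zones and 9 children each."""
    return 6 * 9 ** resolution


if __name__ == "__main__":
    for r in (0, 5, 10, 15):
        print(f"r={r:2d}  H3={h3_zone_count(r):>22,}  rHealPix={rhealpix_zone_count(r):>22,}")
```

At resolution 15 the H3 count is in the hundreds of trillions, which is why storing and querying zone data calls for a database designed for large analytical workloads.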
  16. Sampling Sentinel 2 • Australian Capital Territory • Sampled Sentinel

    2 at resolution 11 • Stored results in ClickHouse OLAP database
  17. Multi-resolution database • Computed NDVI, NDWI, NDBI • Zone parents

    computed by aggregation (fast) • Multi-resolution representation • Each resolution stored in ClickHouse
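The sampling and aggregation steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the NDVI formula is the standard (NIR - Red) / (NIR + Red), the parent value is taken as the mean of its children (the slides do not say which aggregate was used), and the parent identifier is obtained by dropping the last character, as in rHealPix-style identifiers. The reflectance values are made up.

```python
from collections import defaultdict
from statistics import fmean
from typing import Dict


def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index for one sample."""
    total = nir + red
    return (nir - red) / total if total else 0.0


def aggregate_to_parents(values: Dict[str, float]) -> Dict[str, float]:
    """Compute parent zone values as the mean of the available children."""
    groups = defaultdict(list)
    for zone_id, value in values.items():
        groups[zone_id[:-1]].append(value)  # assumed prefix-based parent id
    return {parent_id: fmean(group) for parent_id, group in groups.items()}


if __name__ == "__main__":
    # Hypothetical (NIR, Red) reflectance samples keyed by zone identifier.
    samples = {"P120": (0.50, 0.10), "P121": (0.40, 0.20), "P130": (0.30, 0.30)}
    fine = {zone: ndvi(nir, red) for zone, (nir, red) in samples.items()}
    coarse = aggregate_to_parents(fine)   # one level up: P12, P13
    print(fine)
    print(coarse)
```

Repeating `aggregate_to_parents` on its own output yields one table per resolution, which mirrors the multi-resolution layout described on this slide.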
  18. OGC API - DGGS

  19. DGGS API Data retrieval API reminiscent of OGC API Features,

    with features unique to DGGS https://tb16.geo-solutions.it/geoserver/ogc/dggs/api?f=text%2Fhtml
  20. Zones access • /collections/{collectionId}/zones • Retrieve DGGS zones • “resolution”

    parameter mandatory • Space filtering • By “bbox”, like in OGC API Features (CRS84) • By “geom”, CRS84 polygon • By “zones”, array of zone identifiers (most efficient) https://observablehq.com/@mxfh/iterative-h3-polyfill
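A request against the zones resource can be assembled as shown in the Python sketch below. The base URL and the collection name `dggs:s2-h3` come from other slides in this deck, and the parameter names come from this slide. The comma-separated encoding of `bbox` and `zones` is an assumption, the helper name is hypothetical, and the Testbed-16 server may no longer be reachable, so the code only builds the URL.

```python
from typing import Optional, Sequence
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "https://tb16.geo-solutions.it/geoserver/ogc/dggs"


def zones_url(
    collection_id: str,
    resolution: int,
    bbox: Optional[Sequence[float]] = None,
    zones: Optional[Sequence[str]] = None,
) -> str:
    """Build a zones request; "resolution" is mandatory, filters are optional."""
    params = {"resolution": resolution}
    if bbox is not None:
        # CRS84 order assumed: min lon, min lat, max lon, max lat.
        params["bbox"] = ",".join(str(value) for value in bbox)
    if zones is not None:
        params["zones"] = ",".join(zones)  # assumed comma-separated list
    return f"{BASE_URL}/collections/{collection_id}/zones?{urlencode(params)}"


if __name__ == "__main__":
    url = zones_url("dggs:s2-h3", 2, zones=["8075fffffffffff"])
    print(url)
    print(parse_qs(urlsplit(url).query))
```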
  21. Neighboring zones • /collections/{collectionId}/neighbors • Retrieve neighbors of a given

    zone (by id) • Specify an integer search radius Neighbors of “N66” with a search radius of 2 Neighbors of “8075fffffffffff” with a search radius of 2
  22. Parents and children • /collections/{collectionId}/parents (up to r=0) • /collections/{collectionId}/children

    (to specified r) Children of “N66” at r=4 Children of “8075fffffffffff” at r=2
  23. Access by point and polygon • /collections/{collectionId}/point (location and target

    resolution) • /collections/{collectionId}/polygon (target res and compaction)
  24. DGGS based analysis

  25. Data Access and Processing API • Another group in TB16

    worked on DAPA • Allows access to data and quick aggregates: • min/max/avg/stddev • by time, area, both • DGGS extra parameter: resolution • Spatial filter, polygon or list of DGGS zones • Turned into ClickHouse queries • ClickHouse query computation uses all cores/nodes • Especially fast if the spatial filter is expressed as a set of DGGS zones
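The kind of aggregate described on this slide can be illustrated in plain Python. This sketch is not the GeoServer or ClickHouse implementation: the record layout (`zone`, `time`, `ndvi` keys) and the function name are hypothetical, and the standard deviation is computed as a population standard deviation, which is an assumption. It shows why a zone list is a cheap spatial filter: selecting records reduces to a set membership test on the zone identifier, with no geometry intersection.

```python
from statistics import fmean, pstdev
from typing import Dict, Iterable, List, Optional, Set


def aggregate(records: Iterable[dict], zones: Set[str], field: str) -> Optional[Dict[str, float]]:
    """Compute min/max/avg/stddev of a field over records in the given zones."""
    selected: List[float] = [r[field] for r in records if r["zone"] in zones]
    if not selected:
        return None
    return {
        "min": min(selected),
        "max": max(selected),
        "avg": fmean(selected),
        "stddev": pstdev(selected),  # population standard deviation (assumed)
    }


if __name__ == "__main__":
    # Hypothetical records; zone identifiers and values are illustrative only.
    records = [
        {"zone": "P120", "time": "2020-01-01", "ndvi": 0.1},
        {"zone": "P121", "time": "2020-01-01", "ndvi": 0.2},
        {"zone": "P122", "time": "2020-01-01", "ndvi": 0.3},
        {"zone": "P130", "time": "2020-01-01", "ndvi": 0.9},
    ]
    print(aggregate(records, {"P120", "P121", "P122"}, "ndvi"))
```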
  26. DGGS based DAPA https://tb16.geo-solutions.it/geoserver/ogc/dggs/collections/dggs:s2-h3/processes?f=text%2Fhtml https://tb16.geo-solutions.it/geoserver/ogc/dggs/collections/dggs:s2-h3/processes/area:aggregate-space-time?resolution=11&f=application%2Fgeo%2Bjson • Space/time

    aggregation • 8.9 million records aggregated in less than a second • On spinning disks (RAID)
  27. On demand accuracy • The “resolution” parameter lets users control

    the trade-off between speed and accuracy • Analytics users, for example in Jupyter notebooks, can use a low resolution for development, then switch to a higher resolution to get final results
  28. Want to know more? • If you’d like to get

    more details about this activity, please look up and read: • OGC Testbed-16, DGGS and DGGS API ER • OGC Testbed-16, Data Access and Processing Engineering Report https://www.ogc.org/docs/er
  29. Want to try? • Source code under the OGC API

    umbrella: • Part of the “ogcapi” community module
  30. The End Questions? andrea.aime@geosolutionsgroup.com simone.giannecchini@geosolutionsgroup.com info@geosolutionsgroup.com