Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Gentle Introduction into Geospatial (using SQL in SAP HANA)

Gentle Introduction into Geospatial (using SQL in SAP HANA)

Presented at GeekWeekWro in Wrocław on 2018-04-21

Vitaliy Rudnytskiy

April 21, 2018
Tweet

More Decks by Vitaliy Rudnytskiy

Other Decks in Programming

Transcript

  1. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 1 Public The World in a Database: Geospatial Analytics for the rest of us Vitaliy @Sygyzmundovych Rudnytskiy GeekWeekWro, Wrocław, 2018/04
  2. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 2 Public Let me introduce myself - Vitaliy Rudnytskiy (aka Witalij Rudnicki) @Sygyzmundovych - SAP’s Developers Center (developers.sap.com) - IoT, Data, Analytics - Based in Wrocław, Poland 51°04'40.3"N 16°57'48.8"E (WGS84)
  3. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 4 Public Wrocław SAP Meetup: https://www.meetup.com/Wroclaw-SAP-Community-Meetup/ You are welcome to join ☺
  4. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 6 Public Disclaimer The information in this presentation is confidential and proprietary to SAP and may not be disclosed without the permission of SAP. Except for your obligation to protect confidential information, this presentation is not subject to your license agreement or any other service or subscription agreement with SAP. SAP has no obligation to pursue any course of business outlined in this presentation or any related document, or to develop or release any functionality mentioned therein. This presentation, or any related document and SAP's strategy and possible future developments, products and or platforms irections and functionality are all subject to change and may be changed by SAP at any time for any reason without notice. The information in this presentation is not a commitment, promise or legal obligation to deliver any material, code or functionality. This presentation is provided without a warranty of any kind, either express or implied, including but not limited to, the implied warranties of merchantability, fitness for a particular purpose, or non-infringement. This presentation is for informational purposes and may not be incorporated into a contract. SAP assumes no responsibility for errors or omissions in this presentation, except if such damages were caused by SAP’s intentional or gross negligence. All forward-looking statements are subject to various risks and uncertainties that could cause actual results to differ materially from expectations. Readers are cautioned not to place undue reliance on these forward-looking statements, which speak only as of their dates, and they should not be relied upon in making purchasing decisions.
  5. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 9 Public Greater Business Insight & Action with Geographic Data Customer Examples Improve situational awareness with location intelligence to keep travelers safe Track thoughts of flights per day using rich 3D mapping interface with spatial and temporal coordinate to improve flight operations Predict impact of natural disasters, analyze cluster maps to assess critical infrastructure (roads, hospitals etc.).
  6. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 10 Public Greater Business Insight & Action with Geographic Data Customer Examples NSW Fire Rescue Identify & visualize emergency call trends RTSA Storm Tracker: Track storms in real-time to estimate damage based on predicted wind speeds City of Boston Manage roads with real- time analysis of snow fall
  7. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 11 Public How I got into the geospatial for the first time https://blogs.sap.com/2016/06/24/from-wroclaw-to-warsaw-and-back-iot-analysis-with-sap-hana-sql
  8. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 12 Public Why SQL? The main tools for Analytics/Data Science by popularity (2017) 1. Python 2. R language 3. SQL 4. Java 5. Unix shell/awk/gawk 6. C/C++ 7. Perl 8. Julia https://www.kdnuggets.com/2017/05/poll-analytics- data-science-machine-learning-software-leaders.html
  9. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 13 Public How I got into the geospatial for the second time https://twitter.com/Sygyzmundovych/status/808250140251324416
  10. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 15 Public The Forrester Wave™: Translytical Data Platforms, Q4 2017 “…Translytical data platforms, an emerging technology, deliver faster access to business data to support various workloads and use cases. EA pros can use them to drive new business initiatives…” source: https://reprints.forrester.com/#/assets/2/308/'RES134282'/reports “SAP crushes translytical workloads. SAP HANA is a shared-nothing, in-memory data platform, the core of SAP's translytical platform, which supports many use cases, including real- time applications, analytics, translytical apps, systems of insight, and advanced analytics…”
  11. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 16 Public SAP HANA, Express edition developers.sap.com/sap-hana-express • targeted to run in resource-constrained environments and contains a rich set of capabilities for a developer to work with, including geospatial data types and algorithms • free productive use up to 32 GB of RAM, • comes as a binary installer, as a pre-configured virtual machine image (ova file), as a docker or cloud image Database management Web server JavaScript Graphic modeler Data virtualization Extract, load, transform and replication Columnar store – Transaction and analytical processing Multicore and parallelization Advanced compression Multitenancy Multitier storage Graph Predictive Search Series data Business functions Apache Hadoop and Apache Spark integration Streaming analytics Application lifecycle management High availability and disaster recovery Openness Data modeling Administration and security Spatial Text analytics SAP Fiori® user experience(UX) Application development Data integration and quality Advanced analytical processing Data quality Remote data sync
  12. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 17 Public Spatial Data Types Vector, raster data, point cloud Vector data Raster data Point LineString Polygon CircularString MultiPoint MultiLineString MultiPolygon GeometryCollection 14 35 25 17 39 59 16 15 17 X Y Z Complex types
  13. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 19 Public Spatial: data types in SAP HANA and in SQL/MM standard Geometries  Point – a single location in space  MultiPoint – a collection of points  LineString – a geometry with a length  MultiLineString – a collection of LineStrings  Polygon – a region of space  MultiPolygon – a collection of polygons  CircularString – a sequence of circular arc strings https://help.sap.com/viewer/cbbbfc20871e4559abfd45a78ad58c02/latest/en-US/7a2ef60e787c10148e86fd0f4c60cb29.html
  14. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 20 Public Spatial predicates in SQL SAP HANA Spatial g1 g2 g1 g2 g1.ST_Touches(g2) (g1 ∩ g2 ≠ ∅) ∧ (B(g1) ∩ B(g2) = ∅) g1.ST_Within(g2) g1 ∩ g2 = g1 ∧ I(g1) ∩ E(g2) = ø g1.ST_Equals(g2) g1 = g2 g2 g1 g1 g1.ST_Crosses(g2) I(g1) ∩ I(g2) ≠ ∅ ∧ (g1 ∩ g2 ≠ g1) ∧ (g1 ∩ g2 ≠ g2)] g2 g1 g1 g2 g1.ST_Overlaps(g2) (I(g1) ∩ I(g2) ≠ ∅) ∧ (I(g1) ∩ E(g2) ≠ ∅) ∧ (E(g1) ∩ I(g2) ≠ ∅) g1.ST_Intersects(g2) g1 ∩ g2 ≠ ø g1 g2 g1.ST_Disjoint(g2) g1 ∩ g2 = ø g1 g2 g2 g1 g2 g1 g2 g1.ST_Contains(g2) g1 ∩ g2 = g2 ∧ I(g1) ∩ I(g2) ≠ ø g2 g1 g1 g1.ST_Covers(g2) * g1 ∩ g2 = g2 g2 g1 g2 * No OGC standard g1 g2 g2 g1 g1 g2 g1 g2 g1 g2
  15. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 22 Public Using spatial predicates in SQL SAP HANA Spatial In the SELECT clause SELECT "LOCATION".ST_Within([geometry]), "NAME" FROM "CUSTOMERS"; In the WHERE clause SELECT "NAME" FROM "CUSTOMERS" WHERE "LOCATION".ST_Within([geometry]) = 1; As JOIN condition SELECT * FROM "STORES" AS S LEFT JOIN "CUSTOMERS" AS C ON C."LOCATION".ST_Within(S."TRADING_AREA") = 1;
  16. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 23 Public Trying it out: number π by throwing virtual darts https://blogs.sap.com/2016/12/14/calculating-number-%CF%80-by-throwing-darts-digitally-in-sap-hana/ https://www.youtube.com/watch?v=M34TO71SKGk
  17. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 24 Public Transformation functions SAP HANA Spatial SQL methods to transform geometries "GEOM".ST_Boundary() "GEOM".ST_Buffer([distance]) "GEOM".ST_ConvexHull() "GEOM".ST_Envelope() Buffer Convex Hull Envelope Boundary
  18. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 25 Public Trying it out: ASCII world map from open data shape files https://blogs.sap.com/2017/02/13/ascii-art-map-of-the-world-using-sap-hana-geospatial-processing/
  19. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 26 Public Spatial data exchange formats SAP HANA Spatial Well-Known Text (WKT) 'POINT(1 1)' Well-Known Binary (WKB) 0101000000000000000000F03F000000000000F03F ESRI Shapefiles Popular format for geospatial vector data GeoJSON {"x" : 8.6, "y" : 49.3, "spatialReference" : {"wkid" : 4326}} Scalable Vector Graphic (SVG) <rect width="1" height="1" fill="deepskyblue" stroke="black" strokewidth=" 1" x="1" y="-1"/> SAP HANA Data sources Analysis / Apps files input output
  20. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 27 Public Examples of popular sources of geo data shape files OpenStreetMap: Coastlines, Lands and Waterbodies http://openstreetmapdata.com/data Natural Earth (public domain map dataset available at 1:10m, 1:50m, and 1:110m scales) http://www.naturalearthdata.com/downloads/ World Borders from Thematic Mapping http://thematicmapping.org/downloads/ GeoNames http://www.geonames.org
  21. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 28 Public Spatial aggregation SAP HANA Spatial ST_UnionAggr ("SHAPE") Returns the spatial union of all of the geometries ST_EnvelopeAggr ("SHAPE") Returns the bounding rectangle for all of the geometries in a group ST_ConvexHullAggr ("SHAPE") Returns the convex hull for all of the geometries in a group ST_IntersectionAggr ("SHAPE") Returns the spatial intersection of all of the geometries in a group ST_AsSVGAggr("SHAPE") Returns a complete or partial SVG document that renders the geometries in a group
  22. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 31 Public Set operations SAP HANA Spatial "SHAPE_A".ST_Union("SHAPE_B") "SHAPE_A".ST_Intersection("SHAPE_B") "SHAPE_A".ST_Difference("SHAPE_B") "SHAPE_A".ST_SymDifference("SHAPE_B") Union Intersection Difference Symmetric Difference
  23. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 32 Public A neighbor with the longest border? https://blogs.sap.com/2018/02/20/get-to-know-your-neighbors-with-sap-hana-geospatialtuesday/ For Germany:
  24. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 33 Public The neighbor with the longest border? https://blogs.sap.com/2018/02/20/get-to-know-your-neighbors-with-sap-hana-geospatialtuesday/ source: https://en.wikipedia.org/wiki/Geography_of_Germany
  25. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 34 Public Understanding the Coastline Paradox https://blogs.sap.com/2018/03/27/sap-hana-faces-coastline-paradox-geospatialtuesday/
  26. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 36 Public Spatial Data Types and Sources Interior, exterior, and boundary Interior All points that are part of the geometry – except for the boundary Exterior (all white) All points that are not part of the geometry
  27. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 40 Public The center of Europe? Source: https://en.wikipedia.org/wiki/Geographical_midpoint_of_Europe
  28. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 41 Public The center of Europe? Source: https://en.wikipedia.org/wiki/Geographical_midpoint_of_Europe
  29. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 42 Public Computation functions SAP HANA Spatial SQL methods to compute geometry properties "LINE".ST_Length() "GEOM".ST_Perimeter() "GEOM".ST_Area() "GEOM".ST_Centroid() "GEOM".ST_Distance("GEOM") "GEOM".ST_isValid() "GEOM".ST_isEmpty() "LINE".ST_isClosed() "GEOM".ST_isSimple() Length Perimeter Area Centroid Distance
  30. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 43 Public SAP HANA, tell me where the center of Europe is https://blogs.sap.com/2017/04/11/sap-hana-tell-me-where-the-center-of-europe-is/
  31. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 54 Public Round Earth and Planar World Projections to Planar World ▪ distance/area calculations with (acceptable?) distortion ▪ projections: cylindrical, conical, azimuthal, hybrid, … r Round Earth Models ▪ accurate results in specific UoM ▪ more complex calculations Projection Transformation
  32. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 56 Public Where do you put the reference line? A meridian, based at the Royal Observatory, Greenwich, in London, was established in 1851. In October of 1884, 41 delegates from 25 nations met in Washington, D.C. for the International Meridian Conference. This conference selected the meridian passing through Greenwich as the official prime meridian due to its popularity.
  33. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 58 Public Wrocław Meridian(s) Source: http://www.fakt.pl/wydarzenia/polska/wroclaw/dlaczego-we-wroclawiu-mamy-az-dwa-poludniki/6qjq2ef
  34. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 61 Public Trying it out: Different points of view on ASCII world maps https://blogs.sap.com/2017/02/28/ascii-art-map-of-the-world-different-point-of-view/
  35. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 65 Public Trying it out: ASCII world map seen from the space ;) https://blogs.sap.com/2017/03/06/ascii-art-map-of-the-world-the-polar-azimuthal-equidistant-projection/
  36. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 68 Public Round Earth Models Sphere is a simplified Earth model radius of earth = = 6371 km (constant) equator r Spheroid as proper earth model (here: WGS84) semi major axis = equatorial radius: = 6378.137 km semi minor axis = polar radius: = 6356.752 km flattening: = ( − ) / equator er pr
  37. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 70 Public Spatial Reference Systems The Earth is not flat – and it is not a perfect sphere either Due to its rotation, its approximate shape is more like that of an oblate spheroid (ellipsoid) The basic structure of an SRS consists of three parts:  Reference ellipsoid – what is the shape of the Earth?  Geometric datum – where is the Earth's center?  Coordinate system There are global fitting ellipsoids and many ‟locally optimized” ellipsoids A few of the “old” ellipsoids are still in use, e.g. the Airy ellipsoid (from 1830) or the Bessel ellipsoid (from 1841) Mass distribution around the earth. Source: Wikipedia
  38. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 73 Public Spatial Reference Systems GPS: WGS 1984, aka EPSG:4326: http://epsg.io/4326 Another one in the everyday use: OpenLayers:900913, aka EPSG:3857: http://epsg.io/3857 In Europe: http://www.crs-geo.eu In the US: https://www.nga.mil In Poland: http://gisplay.pl/geodezja/uklady-wspolrzednych.html Master EPSG Dataset (online and downloadable): http://www.epsg.org/EPSGhome.aspx There are 3000+ in total existing
  39. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 76 Public EPN Tracking network Source: http://www.epncb.oma.be/_networkdata/stationmaps.php
  40. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 79 Public What is the distance between two points? select new st_point('POINT (0 0)').st_distance(new st_point('POINT (0 1)')) as distance from dummy; -- The result is „1” But on Earth between the two GPS points? select new st_point('POINT (0 0)',4326).st_distance(new st_point('POINT (0 1)',4326)) as distance from dummy; -- Now the result is „110574.38855796008”, i.e. 110.57 km https://help.sap.com/viewer/cbbbfc20871e4559abfd45a78ad58c02/latest/en-US/7a2ea357787c101488ecd1b725836f07.html WGS84 - SRID 4326 „The WGS84 standard provides a spheroidal reference surface for the Earth. It is the spatial reference system used by the Global Positioning System (GPS). The coordinate origin of WGS 84 is the Earth's center, and is considered accurate up to ±1 meter. WGS stands for World Geodetic System.”
  41. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 86 Public Let’s look at one particular country: Poland Source: https://pl.wikipedia.org/wiki EPSG 3328
  42. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 90 Public What is so special about meridian 15 East?
  43. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 91 Public Access Functions SAP HANA Spatial Methods to access geometry properties "POINT".ST_X(), "POINT".ST_Y() "GEOM".ST_XMax(), "GEOM".ST_XMin() "LINE".ST_NumPoints(), "LINE".ST_PointN([n]) "GEOM".ST_Dimension() "LINE".ST_StartPoint(), "LINE".ST_EndPoint() "GEOM".ST_GeometryType() "GEOM".ST_Srid() "GEOM".ST_ExteriorRing() "GEOM".ST_InteriorRingN([n]) Geometry Type = ST_Polygon NumPoints = 3 Start End 2nd Point ST_YMin() ST_YMax() Exterior Ring
  44. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 92 Public Is CET time zone in the middle of Europe? https://blogs.sap.com/2017/09/18/sap-hana-is-cet-time-zone-in-the-middle-of-europe/
  45. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 94 Public And what are central points of Europe’s mainland? https://blogs.sap.com/2018/01/16/finding-the-central-point-of-continents-mainlands-with-sqlscript-spatial- support/
  46. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 95 Public What are Europe’s extreme points? https://blogs.sap.com/2017/10/31/finding-extreme-points-of-europe-with-sap-hana/
  47. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 96 Public Clustering SAP HANA Spatial You use clustering to divide/separate points into “meaningful” groups. SAP HANA supports three clustering algorithms ▪ Grid ▪ K-means ▪ DBSCAN When clustering, you can ▪ Associate each data point with a cluster ▪ Derive cluster properties, like centroid and envelope ▪ Aggregate data of the cluster’s data points spherical clusters non-spherical clusters
  48. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 97 Public Clustering https://mapy.geoportal.gov.pl/iMapLite/KMZBPublic.html
  49. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 100 Public 4*3 grid clustering Spatial Clustering Grid Quick and easy way to use clustering Methods ▪ ST_ClusterID() ▪ ST_ClusterEnvelope() Cells tessellate the grid. Every cell has a unique identifier. SELECT ST_ClusterID(), ST_ClusterEnvelope(), COUNT(*) FROM "CUSTOMERS" GROUP CLUSTER BY "LOCATION" USING GRID X CELLS 10 Y CELLS 10;
  50. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 101 Public k-means clustering Spatial Clustering K-means Centroid-based, iterative clustering method Methods ▪ ST_ClusterID() ▪ ST_ClusterCentroid() Suitable for spherical clusters. Initialization methods: RANDOM PARTITION, FORGY SELECT ST_ClusterID(), ST_ClusterCentroid(), COUNT(*) FROM "CUSTOMERS" GROUP CLUSTER BY "LOCATION" USING KMEANS CLUSTERS 10;
  51. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 102 Public Spatial Clustering DBSCAN Density-based clustering method Methods ▪ ST_ClusterID() Suitable for non-spherical clusters. Parameters: EPS (distance), MINPTS (min nr of points) SELECT ST_ClusterID(), COUNT(*) AS C FROM "CUSTOMERS" GROUP CLUSTER BY "LOCATION" USING DBSCAN EPS 40 MINPTS 50; DBSCAN clustering
  52. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 103 Public Clustering abandoned places in Poland https://blogs.sap.com/2018/02/28/geonames.org-in-sap-hana-opendataday-meets-geospatialtuesday
  53. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 108 Public openSAP course (archived): Spatial Analysis with SAP HANA Course Characteristics Duration: 3 weeks (4-6 hours per week) Course Content Week 1: Introduction to SAP HANA Spatial Week 2: Spatial Analysis Week 3: Development of Spatial Applications Week 4: Final Exam https://open.sap.com/courses/hsgs1
  54. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 109 Public openSAP course (archived): Maps and 3D Made Easy with SAPUI5 Course Characteristics Duration: 3 weeks (4-6 hours per week) Course Content Week 1: Visualize Geospatial Data Week 2: Connect Your Data to 3D Content Week 3: Beyond the Basics Week 4: Final Exam Target Audience • Application developers • Application architects • User interface designers https://open.sap.com/courses/3dmv1
  55. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 111 Public Esri MOOCs http://www.esri.com/mooc
  56. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. 115 Public Basics you have learned today Spatial data types Spatial predicates Spatial functions Access Computations Transformations Spatial sets and aggregations Data exchange formats Spatial Reference Systems (SRS)
  57. © 2014 SAP SE or an SAP affiliate company. All

    rights reserved. Thank you && Dziękuję!! SAP HANA Developers Center http://developers.sap.com/hana @sygyzmundovych Vitaliy Rudnytskiy