Upgrade to Pro — share decks privately, control downloads, hide ads and more …

GPU version of PostGIS and GiST-Index

GPU version of PostGIS and GiST-Index

PGconf.Online 2021
GPU version of PostGIS and GiST-Index
- A new horizon of geospatial data analytics -

Avatar for KaiGai Kohei

KaiGai Kohei

February 26, 2021
Tweet

More Decks by KaiGai Kohei

Other Decks in Technology

Transcript

  1. GPU version of PostGIS and GiST-Index ~A new horizon of

    geospatial data analytics~ HeteroDB,Inc Chief Architect & CEO KaiGai Kohei <[email protected]>
  2. about Us  KaiGai Kohei (海外浩平)  Chief Architect &

    CEO of HeteroDB  Contributor of PostgreSQL (2006-)  Primary Developer of PG-Strom (2012-)  Interested in: Big-data, GPU, NVME/PMEM, ... about Myself about HeteroDB  Established: 4th-Jul-2017  Location: Shinagawa, Tokyo, Japan  Businesses: ✓ Development of high-performance data-processing software on top of heterogeneous architecture. ✓ Technology consulting service on GPU&DB area. PGconf.online 2021 - GPU version of PostGIS and GiST-Index 2 PG-Strom
  3. What is PG-Strom? PGconf.online 2021 - GPU version of PostGIS

    and GiST-Index 3  Adds custom-scan/join/aggregate path with GPU execution  SQL to GPU binary code generation & Just-in-time compilation  Designed for IoT/M2M grade log-data processing  Closely connected NVME-SSD and GPU using P2P data transfer [I/O acceleration]  Geospatial analytics with GPU-version of PostGIS and GiST-index support PG-Strom: An extension of PostgreSQL to pull out maximum capability of GPU and NVME for processing of terabytes scale data App GPU off-loading for IoT/Big-Data for ML/Analytics PG-Strom’s Features ➢ JIT of GPU code with SQL & Transparent SQL acceleration ➢ SSD-to-GPU Direct SQL ➢ Columnar Store (Arrow_Fdw) ➢ Asymmetric Partition-wise JOIN/GROUP BY ➢ GPU Memory Store ➢ PostGIS support ➢ GiST-Index on GPU ➢ BRIN-Index support ➢ NVME-over-Fabric support ➢ Data-frame exchange for Python scripts NEW
  4. GPU’s overview PGconf.online 2021 - GPU version of PostGIS and

    GiST-Index 4 Over 10years history in HPC, then massive popularization in Machine-Learning NVIDIA Tesla V100 Super Computer (TITEC; TSUBAME3.0) Computer Graphics Machine-Learning More than thousands cores and TB/s grade memory bandwidth on a chip for highly computing intensive workloads. Simulation
  5. Target: Fast data search of mobile devices ▌Mobile devices 

    Location (Longitude, Latitude) is updated very frequently.  Its log data often contains (device_id, timestamp, location (point), other attributes) ▌Area definitions  Relatively small items, and almost static data.  Polygon often has very complicated form, thus heavy “collision detection”. ▌Purpose  Area marketing, Logistics analytics, Advertisement delivery, Emergency Alert, etc… Latest Location (Point) Area definition (Polygon) Mobile device Extract mobile devices within the target areas PGconf.online 2021 - GPU version of PostGIS and GiST-Index 5 GPS
  6. about PostGIS (1/2)  Extension to add geometry type, functions

    and operators.  First release at 2005, then contentious development over 15 years.  More than 400 functions / operators  R-tree on GiST-index framework  Parallel query execution © GAIA RESOURCES © OSGeo © KTGIS.net PGconf.online 2021 - GPU version of PostGIS and GiST-Index 7
  7. about PostGIS (2/2) PGconf.online 2021 - GPU version of PostGIS

    and GiST-Index 8 St_Distance(a,b) St_Crosses(a,b) St_Contains(a,b)
  8. PostGIS’s optimization (1/2) ▌Bounding Box  The least rectangle that

    contains a polygon (that is often very complicated).  (x1 ,y1 ) - (x2 ,y2 ) form by FP32 (= 16bytes)  PostGIS assigns a bounding-box when it stores geometry values  It allows to skip heavy operations prior to geolocational operations obviously disjointed Tokyo Germany France PGconf.online 2021 - GPU version of PostGIS and GiST-Index 9
  9. PostGIS’s optimization (2/2) ▌R-Tree on GiST-Index  GiST (Generalized Search

    Tree) - A framework of index at PostgreSQL  PostGIS implements R-tree for Geometry type  Check for contains (‘@’ operator), overlaps (‘&&’ operator)  Works efficiently to check relationship between very large number of points and many polygons. # Data published by Geospatial Information Authority of Japan $ shp2pgsql N03-20_200101.shp | psql gistest gistest=# ¥d+ List of relations Schema | Name | Type | Owner | Size | --------+-------------+----------+--------+------------+ public | geo_japan | table | kaigai | 243 MB | gistest=# ¥di+ List of relations Schema | Name | Type | Owner | Table | Size | --------+--------------------+-------+--------+-----------+---------+ public | geo_japan_pkey | index | kaigai | geo_japan | 2616 kB | public | geo_japan_geom_idx | index | kaigai | geo_japan | 14 MB | PGconf.online 2021 - GPU version of PostGIS and GiST-Index 10
  10. Why GPU is capable for SQL acceleration? PGconf.online 2021 -

    GPU version of PostGIS and GiST-Index 11 ▌GPU’s characteristics  SIMT (Single-Instruction Multiple-Threads) architecture  More than thousands processor cores and 1.0TB/s grade memory band ➔ Designed for “same operations on large number of values” ▌SQL characteristics  “same operations on large number of values”, like evaluation of WHERE-clause, JOIN or GROUP BY. SQL runs WHERE, JOIN, GROUP BY on very large number of rows Thousands processor units on GPU evaluates thousands rows in parallel CPU Parallel GPU Parallel
  11. Performing PostGIS functions on GPU (1/2) postgres=# ¥d _gistest Table

    "public._gistest" Column | Type | Collation | Nullable | Default --------+----------+-----------+----------+-------------------------------------- id | integer | | not null | nextval('_gistest_id_seq'::regclass) a | geometry | | | b | geometry | | | postgres=# explain verbose select * from _gistest where st_contains(a,b); QUERY PLAN ------------------------------------------------------------------------------------ Custom Scan (GpuScan) on public._gistest (cost=4251.50..4251.50 rows=1 width=196) Output: id, a, b GPU Filter: st_contains(_gistest.a, _gistest.b) GPU Preference: None Kernel Source: /var/lib/pgdata/pgsql_tmp/pgsql_tmp_strom_21028.4.gpu Kernel Binary: /var/lib/pgdata/pgsql_tmp/pgsql_tmp_strom_21028.5.ptx (6 rows) PGconf.online 2021 - GPU version of PostGIS and GiST-Index 12
  12. Performing PostGIS functions on GPU (2/2) $ less /var/lib/pgdata/pgsql_tmp/pgsql_tmp_strom_21028.4.gpu :

    #include "cuda_postgis.h" #include "cuda_gpuscan.h" DEVICE_FUNCTION(cl_bool) gpuscan_quals_eval(kern_context *kcxt, kern_data_store *kds, ItemPointerData *t_self, HeapTupleHeaderData *htup) { void *addr __attribute__((unused)); pg_geometry_t KVAR_2; pg_geometry_t KVAR_3; assert(htup != NULL); EXTRACT_HEAP_TUPLE_BEGIN(addr, kds, htup); EXTRACT_HEAP_TUPLE_NEXT(addr); pg_datum_ref(kcxt,KVAR_2,addr); // pg_geometry_t EXTRACT_HEAP_TUPLE_NEXT(addr); pg_datum_ref(kcxt,KVAR_3,addr); // pg_geometry_t EXTRACT_HEAP_TUPLE_END(); return EVAL(pgfn_st_contains(kcxt, KVAR_2, KVAR_3)); } : PGconf.online 2021 - GPU version of PostGIS and GiST-Index 13 Load the geometry values from column-A and column-B Calls GPU-revision of st_contains() for each thread GPU code automatically generated for evaluation of WHERE-clause
  13. Basic performance (1/2) --- GpuScan (GPU-rev PostGIS) =# SELECT count(*)

    FROM ft WHERE st_contains('polygon ((10 10,90 10,90 12,12 12,12 88,90 88,90 90,¥ 10 90,10 10))’, st_makepoint(x,y)); count -------- 236610 (1 row) Time: 44.680 ms --- Vanilla PostGIS =# SET pg_strom.enabled = off; SET =# SELECT count(*) FROM tt WHERE st_contains('polygon ((10 10,90 10,90 12,12 12,12 88,90 88,90 90,¥ 10 90,10 10))', st_makepoint(x,y)); count -------- 236610 (1 row) Time: 622.314 ms Count number of points in the specified area from 5 million points. PGconf.online 2021 - GPU version of PostGIS and GiST-Index 14
  14. Basic performance (2/2) PGconf.online 2021 - GPU version of PostGIS

    and GiST-Index 15 (100,100) (0,0) (90,90) (90,10) (90,12) (12,12) (12,88) (90,88) (10,90) (10,10)
  15. GPU-version of PostGIS  geometry st_makepoint(float8, float8[, float8[, float8]]) 

    float8 st_distance(geometry,geometry)  bool st_dwithin(geometry,geometry,float8)  bool st_contains(geometry,geometry)  bool st_crosses(geometry,geometry)  text st_relate(geometry,geometry)  ...and more in the future version Current Status of the supported functions PGconf.online 2021 - GPU version of PostGIS and GiST-Index 16
  16. Extract Points within Polygon PGconf.online 2021 - GPU version of

    PostGIS and GiST-Index 17 Number of Polygons x Points combination is too large even for GPU Latest Location (Point) Area definition (Polygon) Mobile device GPS 100K-10M devices 100 - 100K polygons 10M-1T Combinations?
  17. How GiST-Index (R-tree) works ▌How GiST-Index (R-tree) works ✓ R1

    is a rectangle [(Xmin ,Ymin ) – (Xmax ,Ymax )] that contains all of (R3, R4, R5) and their pointers ✓ R4 is a rectangle that contains all of (R11,R12) and their pointers ✓ R12 is a rectangle that contains the target geometry and its ItemPointer ✓ Sequentially evaluates the entry for each tree-node. Dive into the next depth only if matched. ✓ Not fast as like B-tree, because of sequential evaluation for each depth in R-tree. (xmin,ymin) (xmax,ymax) PGconf.online 2021 - GPU version of PostGIS and GiST-Index 19 Search Key (Lon,Lat) 〇 〇
  18. How GiST-Index (R-tree) works ▌How GiST-Index (R-tree) works ✓ R1

    is a rectangle [(Xmin ,Ymin ) – (Xmax ,Ymax )] that contains all of (R3, R4, R5) and their pointers ✓ R4 is a rectangle that contains all of (R11,R12) and their pointers ✓ R12 is a rectangle that contains the target geometry and its ItemPointer ✓ Sequentially evaluates the entry for each tree-node. Dive into the next depth only if matched. ✓ Not fast as like B-tree, because of sequential evaluation for each depth in R-tree. (xmin,ymin) (xmax,ymax) PGconf.online 2021 - GPU version of PostGIS and GiST-Index 20 Search Key (Lon,Lat) 〇 〇 × 〇 ×
  19. How GiST-Index (R-tree) works ▌How GiST-Index (R-tree) works ✓ R1

    is a rectangle [(Xmin ,Ymin ) – (Xmax ,Ymax )] that contains all of (R3, R4, R5) and their pointers ✓ R4 is a rectangle that contains all of (R11,R12) and their pointers ✓ R12 is a rectangle that contains the target geometry and its ItemPointer ✓ Sequentially evaluates the entry for each tree-node. Dive into the next depth only if matched. ✓ Not fast as like B-tree, because of sequential evaluation for each depth in R-tree. (xmin,ymin) (xmax,ymax) PGconf.online 2021 - GPU version of PostGIS and GiST-Index 21 Search Key (Lon,Lat) 〇 〇 × 〇 × 〇 ×
  20. How GiST-Index (R-tree) works ▌How GiST-Index (R-tree) works ✓ R1

    is a rectangle [(Xmin ,Ymin ) – (Xmax ,Ymax )] that contains all of (R3, R4, R5) and their pointers ✓ R4 is a rectangle that contains all of (R11,R12) and their pointers ✓ R12 is a rectangle that contains the target geometry and its ItemPointer ✓ Sequentially evaluates the entry for each tree-node. Dive into the next depth only if matched. ✓ Not fast as like B-tree, because of sequential evaluation for each depth in R-tree. (xmin,ymin) (xmax,ymax) PGconf.online 2021 - GPU version of PostGIS and GiST-Index 22 Search Key (Lon,Lat) 〇 〇 × 〇 × 〇 × × ×
  21. GPU version of GiST-Index PGconf.online 2021 - GPU version of

    PostGIS and GiST-Index 23 ▌Overview  PG-Strom may utilize GiST-Index for joining a table with area definition (polygons; small) and a table with locational data (points; large).  Both the area definition table and its index are loaded onto GPU on GpuJoin.  GpuJoin looks at the GiST-index first for rough pruning, using bounding-box  Then, evaluate the “collision detection” with polygon value on the table  Above operations by thousands cores of GPU in parallel, so we expected its search performance is better, but... collision detection of Polygons x Points as a part of GpuJoin GiST-Index (R-tree) Area Definitions (Polygon) A table with Location Data (Points) Thousands threads search R-tree index in parallel
  22. Simple Test: Random points and St_Contains (2020-Sep) PGconf.online 2021 -

    GPU version of PostGIS and GiST-Index 24 (123.0, 20.0) (154.2, 46.2) SELECT n03_001,n03_004,count(*) FROM geo_japan j, geopoint p WHERE st_contains(j.geom, st_makepoint(x,y)) AND j.n03_001 like ‘Tokyo’ GROUP BY n03_001,n03_004; 10M of randomly generated geolocational points n03_001 | n03_004 | count ---------+-------------+------- Tokyo | Akiruno | 105 Tokyo | Miyake | 76 Tokyo | Mitaka | 17 Tokyo | Setagaya-ku | 67 Tokyo | Chuo-ku | 12 Tokyo | Nakano-ku | 18 Tokyo | Hachijo | 105 : : : Tokyo | Toshima-ku | 14 Tokyo | Adachi-ku | 55 Tokyo | Aogashima | 7 Tokyo | Ome | 117 (63 rows) CPU-rev: 30.539s GPU-rev: 33.841s (Slow!) Area definition data by the Geospatial Information Authority of Japan
  23. Background) GPU’s Thread Scheduling PGconf.online 2021 - GPU version of

    PostGIS and GiST-Index 25  GPU groups its processor cores by Streaming Multiprocessor(SM) ✓ Cores, Registers and L1 cache (shared memory) are managed per streaming-multiprocessor  A SM can run up to 1024 threads simultaneously on the shared 64 cores/SM ✓ Like a very large scale hyper-threading in CPU  Threads are scheduled per warp (32threads). called SIMT (Single-Instruction Multiple-Thread) architecture ✓ Uniformed workloads, like matrix operations, fully utilizes the processor cores ✓ If a particular thread tends to consume large cycles, other threads in a warp must wait. GPU Block Diagram (Tesla V100; 80SMs) Streaming Multiprocessors (64CUDA cores/SM) •••…••• •••…••• •••…••• •••…••• •••…••• •••…••• Thread Group (1~1024 Threads) Warp (32 threads)
  24. Background) Internal design of GpuJoin PGconf.online 2021 - GPU version

    of PostGIS and GiST-Index 26 SELECT * FROM A, B WHERE A.id = B.id; Table-A 512 threads fetches 512 rows at once. Table-B • • • • • • 〇 × 〇 × 〇 × GpuHashJoin / GpuNestLoop N = __syncthreads_count(...) Write out the JOIN results (if any) thread-0 allocates result buffer for N-items Fetch next frame, and repeat the above steps. GpuHashJoin • Hash calculation • Search Hash table • Evaluation of JOIN condition GpuNestLoop • Evaluation of JOIN condition ➔little differences in processing cycles between the threads References
  25. Issues of index-search in GPU PGconf.online 2021 - GPU version

    of PostGIS and GiST-Index 27 Large variance of thread’s processing time, that leads synchronization wait! Table-A • • • • • • Reference Index-A × × × × × 〇 N = __syncthreads_count(...) Reference Walk down to the leaf node of R-tree, but no matched item No matched items, by just looking at the root node of R-tree Walk down to the leaf node of R-tree, and evaluate JOIN conditions Other GPU cores must be idle by the completion of the longest operations.
  26. Naive implementation of GPU GiST-Index Search (2020-Sep) PGconf.online 2021 -

    GPU version of PostGIS and GiST-Index 28 Little utilization ratio of GPU cores, due to inter-core synchronization • • • • • • • • Every thread loads a row from the input buffer, and extract the key for GiST-Index. Hit on the GiST-Index? Search the GiST-Index by the key Is the JOIN-condition true? Evaluation of the JOIN-condition write out JOIN-results (if found) nitems = __syncthreads_count(found); Repeat found=true found=false No No A thread that found a matched entry can block other 511 threads in the same thread-group until completion of the evaluation of JOIN-condition. Very low efficiency of GPU core usage.
  27. A new more optimal implementation (2020-Nov) PGconf.online 2021 - GPU

    version of PostGIS and GiST-Index 29 Minimization of the synchronization point, to pull up utilization ratio • • • • • • • • Every thread loads a row from the input buffer, and extract the key for GiST-Index. Search the GiST-Index by the key Hit on the GiST-Index? Allocation of the temporary buffer, to save the pointers that hit GiST-Index above. Consumption of temporary buffer exceeds 512 items. __syncthreads() Evaluation of the JOIN-condition Is the JOIN-condition true? No No nitems = __syncthreads_count(found) Write out JOIN results Repeat found=false found=true As long as temporary buffer has space, threads continue to fetch rows and search GiST-Index. ➔ Then, evaluate JOIN-conditions by all the threads at once.
  28. Simple Test: Random points and St_Contains (2020-Nov; the latest) PGconf.online

    2021 - GPU version of PostGIS and GiST-Index 30 (123.0, 20.0) (154.2, 46.2) SELECT n03_001,n03_004,count(*) FROM geo_japan j, geopoint p WHERE st_contains(j.geom, st_makepoint(x,y)) AND j.n03_001 like ‘Tokyo’ GROUP BY n03_001,n03_004; 10M of randomly generated geolocational points n03_001 | n03_004 | count ---------+-------------+------- Tokyo | Akiruno | 105 Tokyo | Miyake | 76 Tokyo | Mitaka | 17 Tokyo | Setagaya-ku | 67 Tokyo | Chuo-ku | 12 Tokyo | Nakano-ku | 18 Tokyo | Hachijo | 105 : : : Tokyo | Toshima-ku | 14 Tokyo | Adachi-ku | 55 Tokyo | Aogashima | 7 Tokyo | Ome | 117 (63 rows) CPU-rev: 30.539s GPU-rev: 0.316s Area definition data by the Geospatial Information Authority of Japan 100 times faster!!
  29. EXPLAIN of the simple test (1/2) - CPU version postgres=#

    EXPLAIN (analyze, costs off) SELECT n03_001,n03_004,count(*) FROM geo_japan j, geopoint p WHERE st_contains(j.geom, st_makepoint(x,y)) AND j.n03_001 like ‘Tokyo’ GROUP BY n03_001,n03_004; QUERY PLAN -------------------------------------------------------------------------------------------- Finalize GroupAggregate (actual time=30709.855..30710.080 rows=63 loops=1) Group Key: j.n03_001, j.n03_004 -> Gather Merge (actual time=30709.838..30732.270 rows=244 loops=1) Workers Planned: 4 Workers Launched: 3 -> Partial GroupAggregate (actual time=30687.466..30687.572 rows=61 loops=4) Group Key: j.n03_001, j.n03_004 -> Sort (actual time=30687.452..30687.475 rows=638 loops=4) Sort Key: j.n03_001, j.n03_004 Sort Method: quicksort Memory: 73kB -> Nested Loop (actual time=71.496..30686.278 rows=638 loops=4) -> Parallel Seq Scan on geopoint p (actual time=0.012..207.553 rows=2500000 loops=4) -> Index Scan using geo_japan_geom_idx on geo_japan j ¥ (actual time=0.012..0.012 rows=0 loops=10000000) Index Cond: (geom ~ st_makepoint(p.x, p.y)) Filter: (((n03_001)::text ~~ ‘Tokyo’::text) AND st_contains(geom, st_makepoint(p.x, p.y))) Rows Removed by Filter: 0 Planning Time: 0.156 ms Execution Time: 30732.422 ms (21 rows) PGconf.online 2021 - GPU version of PostGIS and GiST-Index 31
  30. EXPLAIN of the simple test (2/2) - GPU version postgres=#

    EXPLAIN (analyze, costs off) SELECT n03_001,n03_004,count(*) FROM geo_japan j, geopoint p WHERE st_contains(j.geom, st_makepoint(x,y)) AND j.n03_001 like ‘Tokyo’ GROUP BY n03_001,n03_004; QUERY PLAN -------------------------------------------------------------------------------------------- GroupAggregate (actual time=329.118..329.139 rows=63 loops=1) Group Key: j.n03_001, j.n03_004 -> Sort (actual time=329.107..329.110 rows=63 loops=1) Sort Key: j.n03_001, j.n03_004 Sort Method: quicksort Memory: 29kB -> Custom Scan (GpuPreAgg) (actual time=328.902..328.911 rows=63 loops=1) Reduction: Local Combined GpuJoin: enabled -> Custom Scan (GpuJoin) on fgeopoint p (never executed) Outer Scan: fgeopoint p (never executed) Depth 1: GpuGiSTJoin(plan nrows: 10000000...60840000, actual nrows: 10000000...2553) HeapSize: 7841.91KB (estimated: 3113.70KB), IndexSize: 13.28MB IndexFilter: (j.geom ~ st_makepoint(p.x, p.y)) on geo_japan_geom_idx Rows Fetched by Index: 4952 JoinQuals: st_contains(j.geom, st_makepoint(p.x, p.y)) -> Seq Scan on geo_japan j (actual time=0.164..17.723 rows=6173 loops=1) Filter: ((n03_001)::text ~~ ‘Tokyo’::text) Rows Removed by Filter: 112726 Planning Time: 0.344 ms Execution Time: 340.415 ms (20 rows) Portion executed on GPU PGconf.online 2021 - GPU version of PostGIS and GiST-Index 32
  31. Conclusion (1/2) PGconf.online 2021 - GPU version of PostGIS and

    GiST-Index 33 ▌GPU-version PostGIS  Enhancement for parallel execution of PostGIS functions  Also support GiST-Index (R-tree) of geometry type  100 times faster results to pickup “points in area” type workloads. ▌Expected Use scenarios  Area marketing analytics  Real-time advertisement delivery  Push event notifications, etc... ➔ GPU+PostGIS allows to run “computing intensive” workloads on your workstation or cloud instance as like you are usually doing. ▌Resources  GitHub: https://github.com/heterodb/pg-strom  Document: http://heterodb.github.io/pg-strom/ja/  Contact: Tw: @kkaigai / ✉ [email protected]
  32. Conclusion (2/2) PGconf.online 2021 - GPU version of PostGIS and

    GiST-Index 34 ▌Resources  GitHub: https://github.com/heterodb/pg-strom  Document: http://heterodb.github.io/pg-strom/ja/  Contact: Tw: @kkaigai / ✉ [email protected] PG-Strom project welcomes your participation. Please contact us.