HTAP By Accident: Getting More From PostgreSQL Using Hardware Acceleration

by Paul Finlayson Adams

Slide 1

Slide 1 text

HTAP By Accident HTAP By Accident

Slide 2

Slide 2 text

Confidential & Proprietary ©Swarm64 AS, 2019 2 There's Gold In Them Thar Hills "The stone age did not end for the lack of stone, and the oil age will end long before the world runs out of oil." Organizations store lots of transactional data. They want to derive value from that data. Fast.

Slide 3

Slide 3 text

Confidential & Proprietary ©Swarm64 AS, 2019 3 From OLTP To HTAP Data engineering today: driven by the right tool for the right job Elasticsearch for search / relevance Hadoop for Map-Reduce CouchDB and MongoDB for document stores Tableau for BI analytics and visualization Why?

Slide 4

Slide 4 text

Confidential & Proprietary ©Swarm64 AS, 2019 4 The Vs Of Big Data Value Veracity Velocity Volume Variety

Slide 5

Slide 5 text

DBAs & Analysts: One DB, Two Perspectives

Slide 6

Slide 6 text

Confidential & Proprietary ©Swarm64 AS, 2019 6 DBAs Volume & Velocity Care about Install & configure databases Monitor performance Do capacity planning They

Slide 7

Slide 7 text

Confidential & Proprietary ©Swarm64 AS, 2019 7 Analysts Veracity & Value Solve business problems by learning from data Convert data into information Help businesses make better decision using information Care about They

Slide 8

Slide 8 text

Confidential & Proprietary ©Swarm64 AS, 2019 8 HTAP By Accident P A U L A D A M S VP Engineering S E B A S T I A N D R E S S L E R Team Lead Solution Engineering

Slide 9

Slide 9 text

Confidential & Proprietary ©Swarm64 AS, 2019 9 The DBAs Benchmark Standardized & comparable: TPC-H* DWH benchmark 22 Queries 8 Tables (2 fact, 6 dimension) Various scale factors (up to PB) Run single or multiple streams (*) TPC-DS is newer but there are fewer (semi-)official data points to compare to

Slide 10

Slide 10 text

Confidential & Proprietary ©Swarm64 AS, 2019 10 Setup Hardware Dual Intel Xeon Gold 6140 384GB RAM 8x960 GB SSD Software CentOS 7.6 PostgreSQL 11.3 Swarm64 DA 2.0 TPC-H 1 TB worth of data Biggest tables: 6bn & 1.5bn rows Configurations Single Node 1 Coordinator + 2 Data Nodes 2 Coordinators + 2 Data Nodes Same as above + Swarm64 DA

Slide 11

Slide 11 text

Part 1: Postgres for Analytics (Single Instance)

Slide 12

Slide 12 text

Confidential & Proprietary ©Swarm64 AS, 2019 12 Postgres & Analytics Postgres One of the world's most trusted and powerful databases Maturity built on decades of community-driven development Well-respected for OLTP workloads Very capable in HTAP and OLAP Well, there is a "but"...

Slide 13

Slide 13 text

Let's Look At Some Analytics Queries

Slide 14

Slide 14 text

Confidential & Proprietary ©Swarm64 AS, 2019 14 Queries By Example: TPC-H Q6 SELECT SUM(l_extendedprice * l_discount) AS revenue FROM lineitem WHERE l_shipdate >= DATE '1993-01-01' AND l_shipdate < DATE '1993-01-01' + INTERVAL '1' YEAR AND l_discount BETWEEN 0.05 - 0.01 AND 0.05 + 0.01 AND l_quantity < 24; Scanning > Parallelism helps, yet limited > Indices may cause scatter-gather Statistics help > They narrow selectivity Typical runtime: 10min

Slide 15

Slide 15 text

Confidential & Proprietary ©Swarm64 AS, 2019 15 Queries By Example: TPC-H Q12 SELECT l_shipmode, SUM(CASE WHEN o_orderpriority = '1-URGENT' OR o_orderpriority = '2-HIGH' THEN 1 ELSE 0 END) AS high_line_count, SUM(CASE WHEN o_orderpriority <> '1-URGENT' AND o_orderpriority <> '2-HIGH' THEN 1 ELSE 0 END) AS low_line_count FROM orders, lineitem WHERE o_orderkey = l_orderkey AND l_shipmode IN ('TRUCK', 'AIR') AND l_commitdate < l_receiptdate AND l_shipdate < l_commitdate AND l_receiptdate >= DATE '1996-01-01' AND l_receiptdate < DATE '1996-01-01' + INTERVAL '1' YEAR GROUP BY l_shipmode ORDER BY l_shipmode; Typical runtime: >10min Expensive finalization > Early data reduction is key JOIN > May scatter-gather Filtering > Data reduction point #1 > Parallelism helps GROUP BY > Data reduction point #2

Slide 16

Slide 16 text

Demo Time

Slide 17

Slide 17 text

Confidential & Proprietary ©Swarm64 AS, 2019 17 Q6 Single Node

Slide 18

Slide 18 text

The Analyst's Point Of View

Slide 19

Slide 19 text

Confidential & Proprietary ©Swarm64 AS, 2019 19 Analytics On Real World Datasets Question How much more generous are passengers being picked up at The Dead Rabbit than those being dropped off? NYC Taxi Billions of rows Pickup location, drop-off location, tip, fare, ...

Slide 20

Slide 20 text

Confidential & Proprietary ©Swarm64 AS, 2019 20 The Analyst's Query SELECT (outbound.tip - inbound.tip) / inbound.tip * 100 AS generosity_increase FROM ( SELECT AVG(tip_amount / fare_amount) AS tip FROM trip_dropoff_locations JOIN trips ON trip_dropoff_locations.id = trips.id WHERE (trip_dropoff_locations.longitude BETWEEN -74.0114803745 AND -74.0105174615) AND (trip_dropoff_locations.latitude BETWEEN 40.7030212228 AND 40.7032184606) AND tip_amount != 0 AND fare_amount != 0) inbound CROSS JOIN ( SELECT AVG(tip_amount / fare_amount) AS tip FROM trip_pickup_locations JOIN trips ON trip_pickup_locations.id = trips.id WHERE (trip_pickup_locations.longitude BETWEEN -74.0114803745 AND -74.0105174615) AND (trip_pickup_locations.latitude BETWEEN 40.7030212228 AND 40.7032184606) AND tip_amount != 0 AND fare_amount != 0) outbound; Range-based scan > Index helps to reduce data volume JOIN > May scatter-gather Serial repetition > Similar query, executes in sequence Filtering > Reduces data volume

Slide 21

Slide 21 text

Confidential & Proprietary ©Swarm64 AS, 2019 21 Analytics On Real World Datasets SELECT (outbound.tip - inbound.tip) / inbound.tip * 100 AS generosity_increase FROM ( SELECT AVG(tip_amount / fare_amount) AS tip FROM trip_dropoff_locations JOIN trips ON trip_dropoff_locations.id = trips.id WHERE (trip_dropoff_locations.longitude BETWEEN -74.0114803745 AND -74.0105174615) AND (trip_dropoff_locations.latitude BETWEEN 40.7030212228 AND 40.7032184606) AND tip_amount != 0 AND fare_amount != 0) inbound CROSS JOIN ( SELECT AVG(tip_amount / fare_amount) AS tip FROM trip_pickup_locations JOIN trips ON trip_pickup_locations.id = trips.id WHERE (trip_pickup_locations.longitude BETWEEN -74.0114803745 AND -74.0105174615) AND (trip_pickup_locations.latitude BETWEEN 40.7030212228 AND 40.7032184606) AND tip_amount != 0 AND fare_amount != 0) outbound; 1min 56s

Slide 22

Slide 22 text

Part 2: Postgres for Analytics (Scaled-Out)

Slide 23

Slide 23 text

Confidential & Proprietary ©Swarm64 AS, 2019 23 Scale-Out: When & Why? Separation of concerns I/O intensive vs. CPU intensive Bottlenecks on your single node I/O CPU RAM Why is that? Data grows More concurrent users More demanding queries

Slide 24

Slide 24 text

Confidential & Proprietary ©Swarm64 AS, 2019 24 Scale-Out: Typical Approach Coordinator Table metadata Node to connect to & query on Does the compute intesive part Data Nodes Table data Nodes where coordinators connect to Performs scanning and other I/O operations (e.g. filtering)

Slide 25

Slide 25 text

Confidential & Proprietary ©Swarm64 AS, 2019 25 Scale-Out: "3rd-Party Options" (Patroni) ... there are more.

Slide 26

Slide 26 text

Confidential & Proprietary ©Swarm64 AS, 2019 26 Scale-Out With PG Tools Native scale-out Use the Postgres Foreign Data Wrapper extension (postgres_fdw) Coordinator tables are postgres_fdw tables Connect to data nodes where the data is actually located Use partitions for parallelism Pitfall postgres_fdw not parallelized out-of-the-box, needs a patch

Slide 27

Slide 27 text

Confidential & Proprietary ©Swarm64 AS, 2019 27 Q6 2 Coordinator + 2 Data Nodes

Slide 28

Slide 28 text

Confidential & Proprietary ©Swarm64 AS, 2019 28 Analytics On Real World Datasets Same query, better result? Split computation & I/O Data gathering on data nodes Final computation is done on the coordinator 1min 46s

Slide 29

Slide 29 text

Part 3: Software & Hardware Acceleration

Slide 30

Slide 30 text

Confidential & Proprietary ©Swarm64 AS, 2019 30 Tuning Postgres Data For Analytics Add Indices They help you on point-lookups, range queries, full text search, ... Upside Access data faster Downside They cost extra storage & CPU They can cause non-optimal I/O patterns Decide for fast reads over fast writes Add Partitions Mostly reduce data on range queries by selecting the right partition Upside Lower data volume & better maintenance (partitions can be plugged out) Downside Parallelism might be limited Changing the partition scheme might be hard

Slide 31

Slide 31 text

Are There Any Other Options?

Slide 32

Slide 32 text

Confidential & Proprietary ©Swarm64 AS, 2019 32 How To Increase Parallelism? Postgres limits parallelism... to prevent resource over-allocation to ensure transactional safety, even on a highly loaded system Patched postgres_fdw Parallelize scans on remote tables for higher throughput Workload management Determine and assign resources prior to query execution Monitor system state to acknowledge change Query rewriting Transform query plans to be executed more efficiently

Slide 33

Slide 33 text

Confidential & Proprietary ©Swarm64 AS, 2019 Optimized Columns ROW- / COLUMN-HYBRID BLOCKS UP TO 3 RANGE-INDICES I/O transfer from storage device WHERE ws_order_number BETWEEN 150 AND 15000 AND ws_sold_date_sk BETWEEN 2450820 AND 2452000 WHERE ws_order_number BETWEEN 150 AND 15000 AND ws_sold_date_sk BETWEEN 2450820 AND 2452000 M ulti-part index

Slide 34

Slide 34 text

Confidential & Proprietary ©Swarm64 AS, 2019 Decompress Pick Rows Pick Columns Result FROM SELECT Parallel Plan Optimized Columns WHERE Executed on the HW Accelerator WHERE 34 + Hardware Acceleration

Slide 35

Slide 35 text

Slide 36

Slide 36 text

Slide 37

Slide 37 text

Confidential & Proprietary ©Swarm64 AS, 2019 37 Analytics On Real World Datasets Same query, best result? Split computation & I/O Data gathering on data nodes Final computation is done on the coordinator Plus Higher scan & filter parallelism Higher throughput due to compression 21s

Slide 38

Slide 38 text

Confidential & Proprietary ©Swarm64 AS, 2019 38 Conclusions Postgres is great for analytics! Single node performs well Postgres can scale-out natively Hardware and software optimizations allow for greater parallelism and higher throughput Postgres can be a true analytics engine

Slide 39

Slide 39 text

Confidential & Proprietary ©Swarm64 AS, 2019 39 Got Questions? Come and find us in the exhibitor area P A U L A D A M S VP Engineering [email protected] @theRealPAdams S E B A S T I A N D R E S S L E R Team Lead Solution Engineering [email protected] @theDressler