== GPU Accelerated PostgreSQL ==

GPU Accelerated Data Processing: Speed of Thought Analytics at Scale

Richard Heyns

In the beginning
In 2008, hackers turbocharged password cracking by using an NVIDIA GeForce 8800 card to speed up brute-force attacks…

Present day
Brytlyt is the fastest, most advanced GPU database on the market today. Our mission is to empower organisations through transformational data analytics, and we believe the way to achieve this is by delivering “Speed of Thought Analytics at Scale”.
• Recognised as the world’s fastest database according to independent benchmarking.
• Four years in research and development.
• Patent-pending IP.
• Fourth-generation GpuManager bridges the gap between SQL and AI.
The true value of Brytlyt lies in how its transformational data processing performance is put to use. Although Brytlyt is the fastest GPU database on the market today, it is how this extreme performance is packaged for the end user that really makes it stand out as the go-to solution for analysts.

Patent-pending intellectual property
Programming database operations for parallel execution on Graphics Processing Units (GPUs) is not trivial and requires a fundamentally new approach: Recursive Interaction Probability (RIP).

Recursive Interaction Probability
• A parallelisable method that can be applied to SQL joins
• Can also be used for filtering data
• Embarrassingly parallelisable
• Very efficient: O(n log n) in Big O notation
• Binary search for a single element: O(log n) in Big O notation
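The complexity figures above can be illustrated with a generic sort-and-search equi-join. This is only a sketch of why one sort plus one binary search per row gives O(n log n) overall; it is not the patent-pending RIP method, whose details the slides do not disclose. The function name `sorted_search_join` and the sample rows are hypothetical.

```python
from bisect import bisect_left, bisect_right


def sorted_search_join(left, right):
    """Equi-join two lists of (key, value) rows via sort + binary search.

    Generic illustration only; NOT Brytlyt's RIP algorithm.
    """
    # Sort the right side once: O(n log n).
    right_sorted = sorted(right, key=lambda row: row[0])
    right_keys = [key for key, _ in right_sorted]

    joined = []
    # Each probe is independent of the others, so on a GPU every left row
    # could be handled by its own thread (embarrassingly parallel).
    for key, left_value in left:
        lo = bisect_left(right_keys, key)   # O(log n) binary search
        hi = bisect_right(right_keys, key)  # end of the run of equal keys
        for _, right_value in right_sorted[lo:hi]:
            joined.append((key, left_value, right_value))
    return joined


# Hypothetical sample data: (trip_id, cab_type) joined to (trip_id, fare).
trips = [(1, "yellow"), (2, "green"), (3, "yellow")]
fares = [(3, 12.5), (1, 7.0), (1, 9.0)]
print(sorted_search_join(trips, fares))
```

The probe loop carries no state between iterations, which is the property that makes this style of join a natural fit for parallel hardware.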

Each vendor in this space has a different approach to where data is stored and processed
[Figure: storage tiers with bandwidths of 8,000 GB/second, 100 GB/second and 1 GB/second]

Brytlyt GpuManager: Four Levels of Parallelisation
• Level 4: GPUs are parallel machines
• Level 3: Data streaming on and off device
• Level 2: Coordinating multiple GPUs
• Level 1: Coordinating multiple machines
…and is fully containerized!

Zero Copy by Extending PyTorch Memory Management
• SQL DDL and DML via PostgreSQL
• AI and ML operations via PyTorch
• GPU RAM

NUMA architecture:

Summary of the 1.1 Billion Taxi Rides Benchmark
[Chart: benchmark runtime in seconds]

TPC-H benchmark on a single Dell R940xa GPU server
• Data throughput: 2.1 billion rows per second, 360 GB per second
• TPC-H Q6 runtime: 0.020 seconds
• Usain Bolt reaction time: 0.150 seconds

Brytlyt indexes are advanced data structures, designed for GPUs to improve performance
Query 1
SELECT cab_type, count(*) FROM trips GROUP BY cab_type;
Query 2
SELECT passenger_count, avg(total_amount) FROM trips GROUP BY passenger_count;
[Chart: indexed queries Q1 and Q2, Brytlyt vs MapD, time in milliseconds]

Brytlyt is exceptionally fast, even without indexing
Query 3
SELECT passenger_count, extract(year from pickup_datetime) AS pickup_year, count(*) FROM trips GROUP BY passenger_count, pickup_year;
Query 4
SELECT passenger_count, extract(year from pickup_datetime) AS pickup_year, cast(trip_distance as int) AS distance, count(*) AS the_count FROM trips GROUP BY passenger_count, pickup_year, distance ORDER BY pickup_year, the_count desc;
[Chart: non-indexed queries Q3 and Q4, Brytlyt vs MapD, time in milliseconds; annotated "Usain Bolt"]

Bitonic Sort
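The slide names the algorithm without detail, so here is a minimal sketch of a textbook bitonic sorting network, assuming the input length is a power of two. It is a generic illustration, not Brytlyt's implementation; the function name `bitonic_sort` is hypothetical.

```python
def bitonic_sort(values):
    """Sort a list whose length is a power of two with a bitonic network."""
    n = len(values)
    if n & (n - 1) != 0:
        raise ValueError("length must be a power of two")
    data = list(values)
    k = 2
    while k <= n:          # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:      # distance between the two compared elements
            # Every index i in this pass is independent of the others;
            # on a GPU each i would map to one thread.
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    # Swap when the pair is out of order for its direction.
                    if (data[i] > data[partner]) == ascending:
                        data[i], data[partner] = data[partner], data[i]
            j //= 2
        k *= 2
    return data


print(bitonic_sort([7, 3, 6, 1, 8, 2, 5, 4]))
```

The compare-and-swap pattern is fixed in advance and does not depend on the data, which is why the network suits hardware that runs many identical threads in lockstep.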

Radix Sort
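As with the previous slide, only the algorithm name is given. The following is a minimal sketch of a least-significant-digit radix sort for non-negative integers, built from the histogram, prefix-sum and scatter steps that GPU radix sorts are commonly described in terms of. It is a generic illustration, not Brytlyt's implementation; `radix_sort` and `bits_per_pass` are hypothetical names.

```python
def radix_sort(values, bits_per_pass=8):
    """LSD radix sort for non-negative integers."""
    data = list(values)
    if not data:
        return data
    if min(data) < 0:
        raise ValueError("only non-negative integers are supported")
    radix = 1 << bits_per_pass
    mask = radix - 1
    largest = max(data)
    shift = 0
    while (largest >> shift) > 0:
        # 1. Histogram: count how often each digit value occurs.
        counts = [0] * radix
        for v in data:
            counts[(v >> shift) & mask] += 1
        # 2. Exclusive prefix sum: start offset of each digit's bucket.
        offsets = [0] * radix
        total = 0
        for d in range(radix):
            offsets[d] = total
            total += counts[d]
        # 3. Stable scatter: place each value into its bucket in order.
        out = [0] * len(data)
        for v in data:
            d = (v >> shift) & mask
            out[offsets[d]] = v
            offsets[d] += 1
        data = out
        shift += bits_per_pass
    return data


print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
```

The number of passes depends only on the key width, not on the number of rows, so the total work grows linearly with the input size for fixed-width keys.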

Secondary Index, Covering Index
SELECT cab_type, count(*) FROM trips GROUP BY cab_type;
SELECT cab_type, count(*) FROM trips GROUP BY cab_type, trip_type, passenger_count;

Brytlyt is not all things to all use cases...
[Chart: axes "relevance" and "age"; regions labelled "High Value Data" and "Brytlyt"]

PostgreSQL parser / planner / optimizer

Query to Query Plan
SELECT schemaname, relname, (pgstattuple(schemaname ||'.'||relname)).* FROM pg_statio_user_tables;

Using flex and bison for a database frontend

Brytlyt is a PostgreSQL fork
[Architecture diagram labels: User Client, PostgreSQL 9.4, Parser, Planner, DB Engine, Disk Storage, Foreign Data Wrapper, 3rd Party Data Sources, Brytlyt GPU Manager, NVIDIA GPU Hardware]

Data Acquisition
• A single line of code is all that is needed to connect to a third-party data source
• Connection treated as a table within Brytlyt and can be used directly within SQL queries
• Over thirty data connectors already exist
• Data connectors are bi-directional with both read and write capability
• Using the Foreign Data Wrapper API it is quick and easy to create a new connector
• Add streaming workloads
• Very easy to incorporate into ETL processing
• Bulk import and export also available
GPU Acceleration
• Brytlyt patent-pending IP combined with Graphics Processing Unit (GPU) acceleration
• Query billions of rows of data in milliseconds, up to 1,000 times faster than alternatives
• Fast GPU JOIN
• Hot data caching in CPU RAM used for immediate access to large data sets
• GPUs used to filter and render billions of geospatial data points in real time
• Horizontal scale-out allows multiple servers to be used in a single cluster
User Interaction
• PostgreSQL client tools used to manage and work with data using ANSI SQL
• Stored procedures, database cursors, native JSON support, arrays and much more
• User Defined Functions can be developed in C/C++/CUDA where necessary
• Torch Machine Learning and AI available with zero data copy
• SpotLyt visualisation tool built on Plot.ly with twenty interactive chart types
• Point map can navigate billions of geospatial data points in real time
• PostgreSQL data connector means any visualisation tool can benefit from GPU acceleration
[Diagram labels: {API}, GPU, CPU, Scale Out, SpotLyt + Geospatial, Foreign Data Wrapper]

Ultra-Fast GPU JOIN: The Foundation of a Relational Database

Brytlyt DB: GPU accelerated PostgreSQL

SpotLyt: Interactive analytics workbench for billion row datasets

BrytMind: SQL + AI + GPU

GPU Accelerated Data Processing: Speed of Thought Analytics at Scale
CEO: Richard Heyns
Email: [email protected]
URL: www.brytlyt.com
Twitter: @brytlytDB

Warsaw PostgreSQL Users Group
