== GPU Accelerated PostgreSQL ==

Millisecond response times for large join and aggregation queries on billion-row datasets are possible when tapping the power of Graphics Processor Units (GPUs). Integrating this GPU-accelerated data engine into PostgreSQL provides a solution that is familiar and easy to use, and that drops neatly into existing investments in software and technology.

Speaker: Richard Heyns
Richard is the Founder and CEO of Brytlyt and led the original research in bringing database operations to the GPU. He holds a BSc in Engineering and has over twenty years’ experience building and leading Business Intelligence and Data Warehousing solutions.

More Decks by Warsaw PostgreSQL Users Group

Transcript

  1. In the beginning: In 2008, hackers turbocharge password cracks with an NVIDIA GeForce 8800 card to speed up brute-force attacks…
  2. Present day: Brytlyt is the fastest, most advanced GPU database on the market today. Our mission is to empower organisations through transformational data analytics, and we believe the way to achieve this is by delivering “Speed of Thought Analytics at Scale”.
    • Recognised as the world’s fastest database according to independent benchmarking.
    • Four years in research and development.
    • Patent-pending IP.
    • Fourth-generation GpuManager bridges the gap between SQL and AI.
    The true value of Brytlyt lies in how its transformational data processing performance is put to use: although Brytlyt is the fastest GPU database on the market today, it is how this extreme performance is packaged for the end user that really makes it stand out as the go-to solution for analysts.
  3. Patent pending intellectual property: Recursive Interaction Probability (RIP). Programming database operations for parallel execution on Graphics Processor Units (GPUs) is not trivial and requires a fundamentally new approach.
  4. Recursive Interaction Probability
    • A parallelisable method that can be applied to SQL joins
    • Which can also be used in filtering data
    • Embarrassingly parallelisable
    • Very efficient: Big O notation = O(n log n)
    • Binary search for a single element: Big O notation = O(log n)
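The complexity figures on slide 4 match a familiar pattern: sort one side once, then probe it with one binary search per row. RIP itself is patent pending and not described in the deck, so the following is only a minimal Python sketch of a generic sort-and-probe equi-join, not Brytlyt's method. The function name `sorted_probe_join` and the sample rows are hypothetical.

```python
from bisect import bisect_left, bisect_right


def sorted_probe_join(left, right):
    """Equi-join two lists of (key, value) rows by sorting and binary search.

    Generic illustration only; this is NOT the RIP algorithm from the slides.
    """
    # Sort the right side once: O(n log n).
    right_sorted = sorted(right, key=lambda row: row[0])
    right_keys = [key for key, _ in right_sorted]

    joined = []
    # Each probe is independent of the others, so on a GPU every left row
    # could be handled by its own thread (here it is a plain loop).
    for key, left_value in left:
        lo = bisect_left(right_keys, key)   # O(log n) per probe
        hi = bisect_right(right_keys, key)
        for _, right_value in right_sorted[lo:hi]:
            joined.append((key, left_value, right_value))
    return joined


# Hypothetical sample rows: (trip_id, cab_type) and (trip_id, amount).
trips = [(1, "yellow"), (2, "green"), (3, "yellow")]
fares = [(3, 12.5), (1, 7.0), (1, 3.5)]
result = sorted_probe_join(trips, fares)
print(result)
```

The same probe, run with a filter value instead of a join key, gives a range lookup on sorted data, which is one way a join primitive can double as a filter.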
  5. Each vendor in this space has a different approach to where data is stored and processed. [Diagram labels: 8,000 GB/second, 100 GB/second, 1 GB/second]
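To get a feel for the three bandwidth figures on slide 5, the sketch below computes how long one full pass over a dataset would take at each rate. The 1,000 GB dataset size is an assumption for illustration, and the transcript does not say which storage tier each figure belongs to, so the rates are left unlabelled.

```python
# Bandwidth figures quoted on the slide, in GB/second.
BANDWIDTHS_GB_PER_S = [8_000, 100, 1]

# Hypothetical dataset size, not taken from the deck.
DATASET_GB = 1_000


def scan_seconds(dataset_gb, bandwidth_gb_per_s):
    """Time to read the dataset once at a sustained bandwidth."""
    return dataset_gb / bandwidth_gb_per_s


scan_times = {bw: scan_seconds(DATASET_GB, bw) for bw in BANDWIDTHS_GB_PER_S}
for bw, seconds in scan_times.items():
    print(f"{bw:>5} GB/s -> {seconds:g} s per full scan")
```

The spread of several orders of magnitude is the point: where the data sits when a query runs dominates how quickly it can be scanned.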
  6. Brytlyt GpuManager: Four Levels of Parallelisation
    • Level 4: GPUs are parallel machines
    • Level 3: Data streaming on and off device
    • Level 2: Coordinating multiple GPUs
    • Level 1: Coordinating multiple machines
    …and is fully containerized!
  7. Zero Copy By Extending PyTorch Memory Management
    • SQL DDL and DML via PostgreSQL
    • AI and ML operations via PyTorch
    • GPU RAM
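The deck does not show how GpuManager extends PyTorch memory management. As a rough CPU-side analogy for what zero copy means, the sketch below uses Python's built-in `memoryview` so that two consumers read and write the same buffer without duplicating it. The variable names are hypothetical and no GPU or PyTorch API is involved.

```python
from array import array

# One buffer of doubles, standing in for a column held in memory.
column = array("d", [1.0, 2.0, 3.0, 4.0])

# A memoryview exposes the same bytes without copying them.
sql_view = memoryview(column)   # stand-in for the SQL engine's handle
ml_view = sql_view[1:3]         # slicing a memoryview is also copy-free

# A write through one view is visible through the other handle,
# which shows both refer to the same underlying memory.
ml_view[0] = 20.0
print(column.tolist())
print(sql_view.tolist(), ml_view.tolist())
```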
  8. TPC-H benchmark on a single Dell R940xa GPU server
    • 2.1 billion rows per second
    • 360 GB per second data throughput
    • 0.020 seconds: TPC-H Q6 runtime
    • 0.150 seconds: Usain Bolt reaction time
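The headline numbers on slide 8 can be related with simple arithmetic. The sketch assumes the rows-per-second and GB-per-second figures describe the same run, which the slide does not state, so the derived bytes-per-row value is only indicative.

```python
ROWS_PER_SECOND = 2.1e9        # from the slide
BYTES_PER_SECOND = 360e9       # 360 GB/s from the slide, taking 1 GB = 1e9 bytes
Q6_RUNTIME_S = 0.020           # TPC-H Q6 runtime from the slide
REACTION_TIME_S = 0.150        # sprinter reaction time from the slide

# Implied average bytes touched per row (assumes both rates are from one run).
bytes_per_row = BYTES_PER_SECOND / ROWS_PER_SECOND

# How many Q6 runs fit into one reaction time.
runs_per_reaction = REACTION_TIME_S / Q6_RUNTIME_S

print(f"~{bytes_per_row:.0f} bytes per row")
print(f"{runs_per_reaction:.1f} Q6 runs per reaction time")
```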
  9. Brytlyt Indexes are advanced data structures, designed for GPU to improve performance
    Query 1: SELECT cab_type, count(*) FROM trips GROUP BY cab_type;
    Query 2: SELECT passenger_count, avg(total_amount) FROM trips GROUP BY passenger_count;
    [Bar chart "Indexed Queries": Q1 and Q2, Brytlyt vs MapD, time in milliseconds, axis 0 to 60]
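Queries 1 and 2 are plain GROUP BY aggregations. To make their shape concrete, the sketch below runs them unchanged against a tiny in-memory SQLite table using Python's standard `sqlite3` module. SQLite is only a stand-in here, the five sample rows are invented, and nothing about the timings on the chart can be inferred from it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trips (cab_type TEXT, passenger_count INTEGER, total_amount REAL)"
)
# Invented sample rows for illustration.
conn.executemany(
    "INSERT INTO trips VALUES (?, ?, ?)",
    [
        ("yellow", 1, 10.0),
        ("yellow", 2, 20.0),
        ("green", 1, 14.0),
        ("yellow", 1, 12.0),
        ("green", 2, 30.0),
    ],
)

# Query 1 from the slide: trips per cab type.
q1 = dict(conn.execute("SELECT cab_type, count(*) FROM trips GROUP BY cab_type;"))

# Query 2 from the slide: average fare per passenger count.
q2 = dict(
    conn.execute(
        "SELECT passenger_count, avg(total_amount) FROM trips GROUP BY passenger_count;"
    )
)

print(q1)
print(q2)
conn.close()
```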
  10. Brytlyt is exceptionally fast, even without indexing
    Query 3: SELECT passenger_count, extract(year from pickup_datetime) AS pickup_year, count(*) FROM trips GROUP BY passenger_count, pickup_year;
    Query 4: SELECT passenger_count, extract(year from pickup_datetime) AS pickup_year, cast(trip_distance as int) AS distance, count(*) AS the_count FROM trips GROUP BY passenger_count, pickup_year, distance ORDER BY pickup_year, the_count desc;
    [Bar chart "Non-indexed Queries": Q3 and Q4, Brytlyt vs MapD, time in milliseconds, axis 0 to 600, with a "Usain Bolt" label]
  11. Secondary Index and Covering Index
    SELECT cab_type, count(*) FROM trips GROUP BY cab_type;
    SELECT cab_type, count(*) FROM trips GROUP BY cab_type, trip_type, passenger_count;
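Brytlyt's index structures are not described in the deck. As a general illustration of the covering-index idea, where an index holds every column a query needs so the base table is never read, the sketch below uses SQLite from Python and inspects the query plan. The index name `idx_trips_cover` and the sample rows are hypothetical, and the exact plan wording can vary between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trips (cab_type TEXT, trip_type INTEGER, "
    "passenger_count INTEGER, total_amount REAL)"
)
# Invented sample rows for illustration.
conn.executemany(
    "INSERT INTO trips VALUES (?, ?, ?, ?)",
    [("yellow", 1, 1, 10.0), ("green", 1, 2, 14.0), ("yellow", 2, 1, 12.0)],
)

# Index over every column the second query on the slide references.
conn.execute(
    "CREATE INDEX idx_trips_cover ON trips (cab_type, trip_type, passenger_count)"
)

query = (
    "SELECT cab_type, count(*) FROM trips "
    "GROUP BY cab_type, trip_type, passenger_count"
)

# Each EXPLAIN QUERY PLAN row ends with a human-readable detail string.
plan = " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query))
rows = conn.execute(query).fetchall()

print(plan)
print(rows)
conn.close()
```

An index on `cab_type` alone would be enough to cover the first query on the slide but not the second, since the second also groups by `trip_type` and `passenger_count`.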
  12. Brytlyt is a PostgreSQL fork
    [Architecture diagram labels: User Client, PostgreSQL 9.4, Parser, Planner, DB Engine, Brytlyt GPU Manager, NVIDIA GPU Hardware, Disk Storage, Foreign Data Wrapper, 3rd Party Data Sources]
  13. Data Acquisition
    • A single line of code is all that is needed to connect to a third-party data source
    • Connection treated as a table within Brytlyt and can be used directly within SQL queries
    • Over thirty data connectors already exist
    • Data connectors are bi-directional with both read and write capability
    • Using the Foreign Data Wrapper API it is quick and easy to create a new connector
    • Add streaming workloads
    • Very easy to incorporate into ETL processing
    • Bulk import and export also available

    GPU Acceleration
    • Brytlyt patent-pending IP combined with Graphics Processor Unit (GPU) acceleration
    • Query billions of rows of data in milliseconds, up to 1,000 times faster than alternatives
    • Fast GPU JOIN
    • Hot data caching in CPU RAM used for immediate access to large data sets
    • GPUs used to filter and render billions of geospatial data points in real time
    • Horizontal scale-out allows multiple servers to be used in a single cluster

    User Interaction
    • PostgreSQL client tools used to manage and work with data using ANSI SQL
    • Stored procedures, database cursors, native JSON support, arrays and much more
    • User Defined Functions can be developed in C/C++/CUDA where necessary
    • Torch Machine Learning and AI available with zero data copy
    • SpotLyt visualisation tool built on Plot.ly with twenty interactive chart types
    • Point map can navigate billions of geospatial data points in real time
    • PostgreSQL data connector means any visualisation tool can benefit from GPU acceleration

    [Diagram labels: {API}, GPU, CPU, Scale Out, SpotLyt + Geospatial, Foreign Data Wrapper]
  14. GPU Accelerated Data Processing: Speed of Thought Analytics at Scale
    CEO: Richard Heyns
    Email: [email protected]
    URL: www.brytlyt.com
    Twitter: @brytlytDB