Large-scale GPU-Accelerated Data Visualization with MapD

Slide 1

Slide 1 text

Large-scale GPU-Accelerated Data Visualization with MapD
January 30, 2018

Slide 2

Slide 2 text

Aaron Williams, VP of Global Community
@_arw_ [email protected] /in/aaronwilliams/ /williamsaaron

Christophe Viau, Data Visualization Engineer
[email protected] /in/christopheviau/ /biovisualize

Slide 3

Slide 3 text

“Every business will become a software business, build applications, use advanced analytics and provide SaaS services.” - Smart CEO Guy

Slide 4

Slide 4 text

MapD System Architecture
Accelerating the existing data infrastructure

Slide 5

Slide 5 text

MapD Demos

Slide 6

Slide 6 text

Core Density Makes a Huge Difference (*fictitious example)

CPU Processing: 20 cores
  Latency: 1 ns per task
  Throughput: (1 task/ns) x (20 cores) = 20 tasks/ns

GPU Processing: 40,000 cores
  Latency: 10 ns per task
  Throughput: (0.1 tasks/ns) x (40,000 cores) = 4,000 tasks/ns

Latency: time to do a task. Throughput: number of tasks per unit time.
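The arithmetic on this slide can be reproduced with a short sketch. It assumes, as the fictitious example does, that every core works on independent tasks with no coordination overhead; the function name is illustrative only.

```python
def throughput_tasks_per_ns(latency_ns_per_task: float, cores: int) -> float:
    """Aggregate throughput when every core runs independent tasks in parallel."""
    # One core finishes 1/latency tasks per ns; all cores together finish cores/latency.
    return cores / latency_ns_per_task


# Figures taken from the slide's fictitious example.
cpu = throughput_tasks_per_ns(latency_ns_per_task=1.0, cores=20)
gpu = throughput_tasks_per_ns(latency_ns_per_task=10.0, cores=40_000)

print(f"CPU: {cpu:g} tasks/ns, GPU: {gpu:g} tasks/ns, ratio: {gpu / cpu:g}x")
```

Each GPU core is ten times slower per task in this example, yet the much larger core count gives the GPU the higher aggregate throughput.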

Slide 7

Slide 7 text

Query Compilation with LLVM

Traditional DBs can be highly inefficient
• Each operator in SQL is treated as a separate function
• This incurs tremendous overhead and prevents vectorization

MapD compiles queries with LLVM to create one custom function
• Queries run at speeds approaching hand-written functions
• LLVM enables generic targeting of different architectures (GPUs, x86, ARM, etc.)
• Code can be generated to run a query on CPU and GPU simultaneously
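The contrast between operator-at-a-time execution and a single compiled function can be sketched in plain Python. This is only an analogy: Python's built-in compile/exec stands in for LLVM, and the operator names, the compile_query helper, and the sample rows are hypothetical. It is not MapD's code generator.

```python
from typing import Callable, Dict, List

Row = Dict[str, float]


# Interpreted style: each operator is a separate function, and every step
# materializes an intermediate list before the next operator runs.
def op_filter(rows: List[Row], pred: Callable[[Row], bool]) -> List[Row]:
    return [r for r in rows if pred(r)]


def op_project(rows: List[Row], expr: Callable[[Row], float]) -> List[float]:
    return [expr(r) for r in rows]


def run_interpreted(rows: List[Row]) -> float:
    kept = op_filter(rows, lambda r: r["qty"] > 2)
    values = op_project(kept, lambda r: r["price"] * r["qty"])
    return sum(values)


# Compiled style: generate source for one fused loop and compile it once.
def compile_query(filter_expr: str, value_expr: str) -> Callable[[List[Row]], float]:
    source = (
        "def fused(rows):\n"
        "    total = 0.0\n"
        "    for r in rows:\n"
        f"        if {filter_expr}:\n"
        f"            total += {value_expr}\n"
        "    return total\n"
    )
    namespace: dict = {}
    exec(compile(source, "<query>", "exec"), namespace)
    return namespace["fused"]


# Roughly: SELECT SUM(price * qty) FROM rows WHERE qty > 2
rows = [
    {"price": 2.0, "qty": 1},
    {"price": 3.0, "qty": 4},
    {"price": 1.5, "qty": 6},
]
fused = compile_query('r["qty"] > 2', 'r["price"] * r["qty"]')
print(run_interpreted(rows), fused(rows))
```

Both paths return the same result; the fused version makes a single pass over the rows with no intermediate lists and no per-operator call overhead, which is the property the slide attributes to compiled queries.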

Slide 8

Slide 8 text

Keeping Data Close to Compute
MapD maximizes performance by optimizing memory use

GPU RAM (L1): 24GB to 256GB, 1000-6000 GB/sec
  Hot Data. Speedup = 1500x to 5000x over Cold Data

CPU RAM (L2): 32GB to 3TB, 70-120 GB/sec
  Warm Data. Speedup = 35x to 120x over Cold Data

SSD or NVRAM STORAGE (L3): 250GB to 20TB, 1-2 GB/sec
  Cold Data

Diagram labels: Compute Layer, Storage Layer, Data Lake/Data Warehouse/System Of Record.
Speed increases toward GPU RAM; space increases toward storage.
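To get a feel for these bandwidth figures, the sketch below estimates how long one full scan of a dataset would take at each tier. It assumes the scan is purely bandwidth-bound and that the dataset fits in the tier; the 120 GB dataset size and the function names are arbitrary choices for illustration.

```python
# Bandwidth ranges in GB/sec, as quoted on the slide: (low, high).
TIERS_GB_PER_SEC = {
    "GPU RAM (L1)": (1000, 6000),
    "CPU RAM (L2)": (70, 120),
    "SSD or NVRAM (L3)": (1, 2),
}


def scan_seconds(dataset_gb: float, bandwidth_gb_per_sec: float) -> float:
    """Time to read the whole dataset once at the given bandwidth."""
    return dataset_gb / bandwidth_gb_per_sec


def scan_time_ranges(dataset_gb: float) -> dict:
    """Map each tier to (fastest, slowest) scan time in seconds."""
    return {
        name: (scan_seconds(dataset_gb, high), scan_seconds(dataset_gb, low))
        for name, (low, high) in TIERS_GB_PER_SEC.items()
    }


ranges = scan_time_ranges(120)  # hypothetical 120 GB dataset
for name, (fastest, slowest) in ranges.items():
    print(f"{name}: {fastest:.3f} s to {slowest:.3f} s")
```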

Slide 9

Slide 9 text

The GPU Open Analytics Initiative Model
Standard in-memory format; zero-copy interchange
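The idea of zero-copy interchange can be illustrated on the CPU with Python's standard library. This is an analogy only, not the initiative's format or API: two consumers that agree on one in-memory layout can share a buffer instead of copying it.

```python
from array import array

# One contiguous buffer of 64-bit floats, standing in for a column of data.
column = array("d", [1.5, 2.5, 4.0])

# A memoryview exposes the same memory without copying it.
view = memoryview(column)
head = view[:2]          # slicing a memoryview still shares the buffer

# Slicing the array itself allocates a new, independent copy.
copied = column[:2]

# A write by the producer is visible through the shared view, not the copy.
column[0] = 10.0
print(head[0], copied[0])
```

The shared view sees the update because both names refer to the same memory, while the copy keeps the old value and costs extra memory and time to create.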

Slide 10

Slide 10 text

MapD: Extreme Analytics

100x Faster Queries
MapD Core: The world’s fastest columnar database, built specifically for GPUs

+

Visualization at the Speed of Thought
MapD Immerse: A visualization front end that leverages the speed & rendering superiority of GPUs

Slide 11

Slide 11 text

MapD Immerse
Using a hybrid approach to speed and scale visualization

Basic charts are frontend rendered using D3 and other related toolkits

Scatterplots, pointmaps + polygons are backend rendered using the Iris Rendering Engine on GPUs

Geo-Viz is composited over a frontend rendered basemap

Slide 12

Slide 12 text

Built for an open-source ecosystem

Extending multiple APIs
● Dc.js (docs): Mapd-charting (docs)
● Crossfilter: Mapd-crossfilter
● Vega (editor): Mapd Raster
● GPU DB Connector (docs)

Part of an ecosystem
● Related projects like Deck.gl
● Building blocks like Mapbox, which uses Leaflet
● Using smaller building blocks, like D3.js

Slide 13

Slide 13 text

Try MapD
It’s free and it’s easy

Play with the live demos: https://www.mapd.com/demos/
Try the Test Drive: https://mapd.io/testdrive-enterprise
Install the Community Edition: https://www.mapd.com/platform/download-community/
Join our forums: https://community.mapd.com/
Review these slides: https://speakerdeck.com/mapd

Slide 14

Slide 14 text

AWS Credits Available

Free GPU Compute! We’re looking for interesting use cases. Email Aaron Williams ([email protected]) with your ideas!

Slide 15

Slide 15 text

Aaron Williams, VP of Global Community
@_arw_ [email protected] /in/aaronwilliams/ /williamsaaron

Christophe Viau, Data Visualization Engineer
[email protected] /in/christopheviau/ /biovisualize
