
Big Data Bellevue Workshop

OmniSci
November 14, 2018


The goal of this talk is to deliver an introductory workshop on Graphics Processing Units (GPUs) and their application to general-purpose analytics. The implications of GPU technology for machine learning and deep learning have already been enormous, and there is a blossoming open source community to prove it. People are leveraging GPUs to make databases faster, to improve complex geospatial analysis, and to create visual analytics tools that work even with (literally) big data.

Aaron Williams, VP of Global Community, will introduce OmniSci (previously MapD), a San Francisco startup that builds open source and enterprise GPU-enabled SQL database solutions. Aaron will explain how OmniSci's SQL engine and Immerse visualization platform were built from the ground up to harness all that GPU power. Bring your laptop if you'd like to follow along with the short workshop, where we will use their web-based data visualization interface, which uses the GPU to accelerate query speed and rendering.


Transcript

  1. GPU-Powered Analytics Workshop | Big Data Bellevue | Bellevue | November 14, 2018 | slides: https://speakerdeck.com/omnisci
  2. © OmniSci 2018 | Agenda: Why GPUs? • Live Demos • Software + Hardware FTW • Short Break • Setting up OmniSci Cloud • Importing Data • Creating Dashboards • API Access
  3. Core Density Makes a Huge Difference: GPU Processing, 40,000 cores vs. CPU Processing, 20 cores. Latency is the time to do one task; throughput is the number of tasks per unit time. CPU: 1 ns per task, (1 task/ns) × (20 cores) = 20 tasks/ns. GPU: 10 ns per task, (0.1 tasks/ns) × (40,000 cores) = 4,000 tasks/ns. (*fictitious example)
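The arithmetic on this slide can be sketched in a few lines. The core counts and per-task latencies are the slide's own (admittedly fictitious) numbers; the point is that throughput scales with core count even when per-core latency is worse:

```python
# Latency vs. throughput, using the slide's fictitious numbers.
CPU_CORES, CPU_LATENCY_NS = 20, 1.0        # CPU: 1 ns per task per core
GPU_CORES, GPU_LATENCY_NS = 40_000, 10.0   # GPU: 10 ns per task per core

def throughput_tasks_per_ns(cores, latency_ns):
    """Tasks completed per nanosecond across all cores."""
    return cores * (1.0 / latency_ns)

cpu = throughput_tasks_per_ns(CPU_CORES, CPU_LATENCY_NS)   # 20 tasks/ns
gpu = throughput_tasks_per_ns(GPU_CORES, GPU_LATENCY_NS)   # 4,000 tasks/ns
print(f"CPU: {cpu:.0f} tasks/ns, GPU: {gpu:.0f} tasks/ns ({gpu / cpu:.0f}x)")
```

Even though each GPU core is 10x slower per task here, the 2,000x advantage in core count nets a 200x win in aggregate throughput.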
  4. GPU Parallelism Drives Fast Analytics at Scale: High Memory Bandwidth • Native Rendering Pipeline • Supercomputer Processing
  5. Advanced Memory Management: a compute layer over a storage layer (data lake / data warehouse / system of record), with three tiers:
    • GPU RAM (L1): 24GB to 256GB, 1000-6000 GB/sec. Hot Data, speedup 1500x to 5000x over Cold Data.
    • CPU RAM (L2): 32GB to 3TB, 70-120 GB/sec. Warm Data, speedup 35x to 120x over Cold Data.
    • SSD or NVRAM storage (L3): 250GB to 20TB, 1-2 GB/sec. Cold Data.
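The tiering idea above can be illustrated with a minimal placement sketch. This is an assumption-laden toy, not OmniSci's actual caching logic: tier names and capacities come from the slide, while the access-count thresholds and the `place` helper are invented for illustration:

```python
# Toy sketch of 3-tier data placement (NOT OmniSci internals):
# hot columns go to GPU RAM, warm ones to CPU RAM, the rest stay on disk.
TIERS = [
    ("GPU RAM (L1)", 256),      # capacity in GB (top of the slide's ranges)
    ("CPU RAM (L2)", 3_000),
    ("SSD/NVRAM (L3)", 20_000),
]

def place(column_gb, access_count, hot_threshold=100, warm_threshold=10):
    """Pick the fastest tier the column is hot enough for and fits in."""
    if access_count >= hot_threshold and column_gb <= TIERS[0][1]:
        return TIERS[0][0]
    if access_count >= warm_threshold and column_gb <= TIERS[1][1]:
        return TIERS[1][0]
    return TIERS[2][0]

print(place(50, 500))   # small, frequently scanned -> GPU RAM (L1)
print(place(50, 20))    # moderately used -> CPU RAM (L2)
print(place(50, 1))     # rarely touched -> SSD/NVRAM (L3)
```

The payoff is the bandwidth gap on the slide: a scan served from GPU RAM moves data orders of magnitude faster than one that has to come off disk.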
  6. OmniSci Core: Query Compilation with LLVM. Traditional DBs can be highly inefficient: each operator in SQL is treated as a separate function, which incurs tremendous overhead and prevents vectorization. OmniSci compiles queries with LLVM to create one custom function: queries run at speeds approaching hand-written functions; LLVM enables generic targeting of different architectures (GPUs, x86, ARM, etc.); and code can be generated to run a query on CPU and GPU simultaneously.
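The contrast on this slide can be sketched with a toy query, `SELECT SUM(amount) WHERE category = 3`. This is an illustration of operator fusion in general, not OmniSci's generated code: the "interpreted" version materializes a result per operator, while the "compiled" version is the single fused loop that LLVM code generation would aim to produce:

```python
# Toy illustration of per-operator execution vs. one fused function.
rows = [(i, i % 10) for i in range(100_000)]  # (amount, category)

def run_interpreted(rows):
    """Each SQL operator is a separate pass with its own intermediate result."""
    filtered = [r for r in rows if r[1] == 3]   # WHERE category = 3
    projected = [r[0] for r in filtered]        # SELECT amount
    return sum(projected)                       # SUM(amount)

def run_fused(rows):
    """The whole query as one tight loop: no intermediates, vectorizable."""
    total = 0
    for amount, category in rows:
        if category == 3:
            total += amount
    return total

assert run_interpreted(rows) == run_fused(rows)
```

The fused form touches each row once and allocates nothing, which is why compiled queries can approach hand-written code while the per-operator style pays overhead at every step.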
  7. OmniSci Innovations Powering Extreme Analytics: 3-Tier Memory Caching • Query Compilation • In-Situ Rendering
  8. Next Steps:
    • omnisci.com/demos: play with our demos (every demo you saw in this talk was live!)
    • omnisci.cloud: get an OmniSci instance in 60 seconds
    • omnisci.com/platform/downloads/: download the Community Edition
    • community.omnisci.com: ask questions and share your experiences
  9. Aaron Williams, VP, Global Community at OmniSci. @_arw_ | [email protected] | /in/aaronwilliams/ | /williamsaaron. Thank you! Any questions? slides: https://speakerdeck.com/omnisci