Speed Meets Scale: Interactively Analyzing & Visualizing Billions of Rows of Spatiotemporal Data

OmniSci
December 09, 2019

Data analytics is fundamentally changing: millions of rows are becoming billions of rows (through finer-grained collection or through mash-ups with relevant third-party data), and geospatial and time-series data types are becoming more commonplace. On top of these technical challenges, analysts are being pushed to make more effective, data-driven decisions in real time. The shift last decade from legacy databases to in-memory databases helped, but the speed and scale of traditional solutions have not kept pace with these challenges, and the lagging user experience costs time and money for companies that are increasingly data rich but insight poor. In this talk we’ll look at a way to classify this new category of big, visual, interactive data, and at how OmniSci leverages the fastest hardware (fast memory, fast parallel processing) to run SQL queries hundreds of times faster than traditional tools. We will demonstrate a 10B-row dataset that can be queried in less than 300 milliseconds without any indexing or aggregation of the data.

Transcript

  1. Speed Meets Scale: Interactively Analyzing & Visualizing Billions of Rows of Spatiotemporal Data | Minneanalytics | Minneapolis | December 9, 2019
  2. Aaron Williams, VP, Global Community | @_arw_ | aaron@omnisci.com | /in/aaronwilliams/ | /williamsaaron | slides: https://speakerdeck.com/omnisci
  3. None
  4. OmniSci’s mission is to make analytics instant, powerful, and effortless for everyone.
  5. The Spring of Revolutions

  6. The Origin: TweetMap 400M Tweets public demo: https://omnisci.com/demos/tweetmap

  7. Technological Advantages: exploit modern compilation techniques in analytic workflows; efficiently use the modern memory hierarchy; rethink analytic operations for modern hardware
  8. None
  9. Times in Seconds

  10. None
  11. OmniSci on CPU 1.2B Drug Prescription Claims public data: https://community.omnisci.com/browse/dataset-library

  12. OmniSci Scale 11.6B Ship Positions public demo: https://omnisci.com/demos/ships/

  13. Points and Polygons 1B Taxi Rides + 1M Buildings public demo: https://omnisci.com/demos/taxis/
  14. Efficient use of the modern memory hierarchy: minimize unnecessary data movement and exploit spatial/temporal locality
    • GPU RAM (L1): 32GB to 256GB, 1-7 TB/sec. Hot data: speedup = 250x to 1750x over cold data
    • CPU RAM (L2): 32GB to 3TB, 140-560 GB/sec. Warm data: speedup = 35x to 140x over cold data
    • SSD or NVRAM storage (L3): 250GB to 20TB, 1-4 GB/sec. Cold data
    Diagram labels: COMPUTE LAYER, STORAGE LAYER, Data Lake/Data Warehouse/System Of Record
  15. Exploit modern compiler infrastructure for analytics: LLVM-based JIT compilation of both SQL queries and user-defined kernels
    Traditional analytics engines use a ‘Chain of Iterators’ model (Volcano):
    • Each operator in SQL is treated as a separate function
    • This incurs significant overhead and prevents vectorization
    OmniSci compiles both queries and UDF kernels using LLVM:
    • LLVM enables generic targeting of different architectures (GPUs, x86, ARM, etc.)
    • Code can be generated to run a query on CPU and GPU simultaneously
    • Queries and UDFs can run at speeds approaching hand-written functions
    • It also allows support of modern analytic frontends (Python, Julia, Swift) for greater productivity
  16. Self Discovery
    • omnisci.com/demos: Play with our live demos for yourself!
    • omnisci.cloud: Get an OmniSci instance in 60 seconds
    • omnisci.com/platform/downloads/: Download a 30-day trial of OmniSci
    • community.omnisci.com: Ask questions and share your experiences
    © OmniSci 2018
  17. About OmniSci: used by 100+ global orgs | $92 million in funding | open-source community | top-tier venture backing
  18. Thank you! Aaron Williams, VP, Global Community | @_arw_ | aaron@omnisci.com | /in/aaronwilliams/ | /williamsaaron | slides: https://speakerdeck.com/omnisci
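
A note on the slide 14 numbers: the hot and warm speedups quoted there appear to be plain bandwidth ratios. The sketch below reproduces them under one assumption that the slide does not state explicitly, namely that the baseline is the top of the cold-storage range (4 GB/sec) and that 1 TB/sec is counted as 1000 GB/sec. The dictionary keys and function name are made up for illustration.

```python
# Bandwidth ranges quoted on slide 14, in GB/sec.
# Assumption: 1 TB/sec is treated as 1000 GB/sec.
TIERS = {
    "GPU RAM (hot)": (1000, 7000),
    "CPU RAM (warm)": (140, 560),
    "SSD/NVRAM (cold)": (1, 4),
}


def speedup_over_cold(tier, baseline_gb_per_sec=4):
    """Return the (low, high) bandwidth ratio of a tier against cold storage.

    Assumption: the baseline is the fastest cold-storage figure (4 GB/sec).
    """
    low, high = TIERS[tier]
    return low / baseline_gb_per_sec, high / baseline_gb_per_sec


if __name__ == "__main__":
    for name in TIERS:
        low, high = speedup_over_cold(name)
        print(f"{name}: {low:g}x to {high:g}x over cold data")
```

With that baseline the GPU RAM tier comes out at 250x to 1750x and the CPU RAM tier at 35x to 140x, matching the figures on the slide. These are ratios of peak bandwidth only, not measured query speedups.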
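
A note on slide 15: the contrast between a ‘Chain of Iterators’ plan and a compiled query can be sketched in a few lines. This is only an illustration of the shape of the two approaches in plain Python. It does not use LLVM and is not OmniSci's implementation, and the table, column names, and query are hypothetical.

```python
# Hypothetical sample table: one dict per row.
TRIPS = [
    {"distance": 1.0, "fare": 5.0},
    {"distance": 3.5, "fare": 12.5},
    {"distance": 8.0, "fare": 30.0},
]


# Volcano style: every operator is its own iterator that pulls rows from
# its child, so each row crosses several function boundaries.
def scan(table):
    for row in table:
        yield row


def select(child, predicate):
    for row in child:
        if predicate(row):
            yield row


def project(child, column):
    for row in child:
        yield row[column]


def volcano_sum(table):
    # SELECT SUM(fare) FROM trips WHERE distance > 2
    plan = project(select(scan(table), lambda r: r["distance"] > 2.0), "fare")
    return sum(plan)


# Fused style: the same query written as one tight loop, which is roughly
# the form a query compiler aims to generate for the whole plan.
def fused_sum(table):
    total = 0.0
    for row in table:
        if row["distance"] > 2.0:
            total += row["fare"]
    return total


if __name__ == "__main__":
    print(volcano_sum(TRIPS), fused_sum(TRIPS))
```

Both functions return the same answer. The difference is that the fused version has no per-operator call overhead, and a single loop of this kind is the sort of code a compiler can vectorize or retarget to other hardware.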