Supercharging Analytics with GPUs - Masood Krohy

Masood Krohy at the April 30, 2019 event of EDPP Montreal (montrealml.dev/data)

Title: Supercharging Analytics with GPUs: OmniSci/cuDF vs Postgres/Pandas/PDAL

Presentation/Demo video: check out PatternedScience's YouTube channel at https://www.youtube.com/channel/UCjbEIZlS2DA45Bswi5EXWRw

Summary: GPUs are known to significantly accelerate machine learning model training, especially with deep learning libraries like TensorFlow. But did you know that there are now solid options to also accelerate data analytics workloads, BI tools and dashboards with the help of GPUs? Join us for a presentation of performance benchmarks of GPU-based options and their CPU-based counterparts. We compare the performance of OmniSci Core DB (a GPU database) against that of Postgres DB (for data analytics) and PDAL (for LiDAR processing). On the in-memory side, we benchmark cuDF (NVIDIA's GPU DataFrame) against the widely popular Pandas DataFrame. We will share results and include some code walk-throughs and live benchmarking. Coming out of this technical talk, you will have insight into how GPUs can accelerate your data analytics and geospatial workloads.

Code on GitHub: https://github.com/patternedscience/GPU-Analytics-Perf-Tests

Bio: Masood Krohy is a Data Science Platform Architect/Advisor and most recently acted as the Chief Architect of UniAnalytica, an advanced data science platform with wide, out-of-the-box support for time-series and geospatial use cases. He has worked with several corporations in different industries in the past few years to design, implement and productionize Deep Learning and Big Data products. He holds a Ph.D. in computer engineering.

PatternedScience

April 30, 2019

Transcript

  1. Supercharging Analytics with GPUs

    A GPU-vs-CPU performance benchmark: (OmniSci Core DB / cuDF GPU DataFrame) vs (Postgres DB / Pandas DataFrame / PDAL)
    Presenter: Masood Krohy, Ph.D.
    Version 1.2, Apr 30, 2019
    Copyright © 2019, PatternedScience Inc. www.patterned.science
  2. This is an independent benchmark carried out on the UniAnalytica platform by PatternedScience Inc.

    The technical talk and meetup are sponsored by OmniSci and PatternedScience.
  3. Presentation Layout

    • Presenter Bio
    • Data Specs & Summary of Results
    • Code Walkthrough & Live Benchmarking
    • Final Notes / Q&A
  4. Presenter Bio: Masood Krohy

    Data Science Platform Architect & Advisor
    • Ph.D. in Computer Engineering: Analytical modeling of botnets. Validated by data collected in industry. 3 top publications.
    • Senior Analyst, Rogers: Managing the analytics reporting/statistical analyses of the national benchmarking program.
    • Data Scientist, Intact: First Data Scientist of the company. Led the Big Data mining project for the UBI program.
    • Lead Data Scientist, CN: Implemented an object-within-object detection system to detect cracks in railway equipment.
    • Sr Data Science Advisor, B.Yond: Implemented a pattern detection system for a stream of alarms coming from telecom devices.
    • Chief Architect, UniAnalytica (advanced data science platform): Platform contains the open-source technologies benchmarked here, among many others.
    Timeline markers on the slide: 2013, 2014, 2016, 2017, 2018, 2019
  5. Processing Times (financial time-series): mean and variance of a column (trading volume)

    In-memory DataFrames
    • cuDF (GPU): 8.72 ms
    • Pandas (CPU): 673 ms
    On-disk Databases
    • OmniSci (GPU): 1.6 s
    • Postgres (CPU): 3.05 s
    Chart axis: 0 to 5 sec. Bar heights are approximate. Numbers are exact. First-run result for OmniSci (2nd and subsequent runs are faster due to caching).

    Hardware Specs
    • CPU node/worker: 8 vCPUs, ~60GB RAM
    • GPU node/worker: V100 GPU, 8 vCPUs, ~60GB RAM
    • CPU model in both workers: Intel Xeon CPU E5-2686 v4 @ 2.30GHz

    Data Specs
    • Covering last 20 years for 63 ETF symbols
    • Average per-minute price (available when traded)
    • 50 million records
    • 3.5 GB CSV file size
    • 6 GB in-memory size (RAM, Pandas DF)
    • 5 GB in-memory size (GPU memory, cuDF)
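The in-memory half of the mean-and-variance benchmark on slide 5 can be sketched roughly as follows. This is only an illustration on synthetic data, not the code from the talk: the column name `volume`, the data size, and the helper `time_mean_and_variance` are assumptions. Since cuDF offers a pandas-like API, the GPU variant would presumably run the same two calls on a cuDF DataFrame (for instance one built with `cudf.DataFrame.from_pandas`), which is left out here because it needs a CUDA-capable GPU.

```python
import time

import numpy as np
import pandas as pd

# Synthetic stand-in for the ETF data set; the column name "volume" is an assumption.
rng = np.random.default_rng(seed=0)
df = pd.DataFrame({"volume": rng.integers(0, 1_000_000, size=1_000_000)})


def time_mean_and_variance(frame, column="volume"):
    """Return (mean, variance, elapsed_seconds) for one column of a DataFrame."""
    start = time.perf_counter()
    mean = frame[column].mean()
    variance = frame[column].var()  # pandas computes the sample variance (ddof=1) by default
    elapsed = time.perf_counter() - start
    return mean, variance, elapsed


mean, variance, elapsed = time_mean_and_variance(df)
print(f"mean={mean:.2f} variance={variance:.2f} elapsed={elapsed * 1000:.2f} ms")
```

A single timed call like this is noisy; repeating it several times and reporting the best or median run gives steadier numbers.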
  6. Processing Times (financial time-series): sorting a column (trading volume)

    In-memory DataFrames
    • cuDF (GPU): 374 ms
    • Pandas (CPU): 15.1 s
    On-disk Databases
    • OmniSci (GPU): 604 ms
    • Postgres (CPU): 11.4 s
    Chart axis: 0 to 15 sec. Bar heights are approximate. Numbers are exact. First-run result for OmniSci (2nd and subsequent runs are faster due to caching).

    Hardware and data specs: same as slide 5.
  7. Processing Times (financial time-series): mixed analytics (math ops + sorting) [largest return in %]

    In-memory DataFrames
    • cuDF (GPU): 327 ms
    • Pandas (CPU): 10.7 s
    On-disk Databases
    • OmniSci (GPU): 3.6 s
    • Postgres (CPU): 13.2 s
    Chart axis: 0 to 15 sec. Bar heights are approximate. Numbers are exact. First-run result for OmniSci (2nd and subsequent runs are faster due to caching).

    Hardware and data specs: same as slide 5.
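Slide 7 describes the mixed-analytics query only as math ops plus sorting to find the largest return in %. One possible shape for the database side is sketched below, using SQLite from the Python standard library as a stand-in for Postgres or OmniSci. The table name `etf`, the `open`/`close` columns, and the per-record definition of return are all assumptions made for illustration; the benchmark's actual schema and formula may differ.

```python
import sqlite3

# In-memory SQLite database standing in for the on-disk databases in the benchmark.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE etf (symbol TEXT, ts TEXT, open REAL, close REAL, volume INTEGER)"
)

# Three made-up records; the real data set has about 50 million.
rows = [
    ("AAA", "2019-04-30 09:30", 100.0, 101.0, 500),
    ("AAA", "2019-04-30 09:31", 101.0, 100.5, 300),
    ("BBB", "2019-04-30 09:30", 50.0, 52.0, 900),
]
conn.executemany("INSERT INTO etf VALUES (?, ?, ?, ?, ?)", rows)

# Math ops (percentage return per record) followed by a sort to pick the largest.
query = """
    SELECT symbol, ts, (close - open) / open * 100.0 AS return_pct
    FROM etf
    ORDER BY return_pct DESC
    LIMIT 1
"""
best = conn.execute(query).fetchone()
print(best)
```

The query forces the engine to evaluate the arithmetic on every row before sorting, which is why this kind of workload exercises both compute and ordering.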
  8. Code Walkthrough & Live Benchmarking

  9. Data: Montreal LiDAR aerial scan

    Benchmarking of an operation on a single tile
    Source: City of Montreal (http://donnees.ville.montreal.qc.ca/dataset/lidar-aerien-2015)
  10. Data: Montreal LiDAR aerial scan

    Map labels: Sherbrooke St., Berri St.
  11. Data: Montreal LiDAR aerial scan

    Cropping a building out of a tile
  12. Processing Times (LiDAR): cropping a building out of a tile

    On-disk database and filesystem
    • OmniSci (GPU): 268 ms
    • PDAL (CPU): 17 s
    Chart axis: 0 to 20 sec. Bar heights are approximate. Numbers are exact. First-run result for OmniSci (2nd and subsequent runs are faster due to caching).

    Data Specs (Montreal LiDAR aerial scan, stats of 1 tile)
    • 18,306,827 points
    • 82 MB LAZ
    • 1.6 GB CSV
    • 681 MB in the OmniSci DB

    Hardware Specs
    • CPU node/worker: 8 vCPUs, ~60GB RAM
    • GPU node/worker: V100 GPU, 8 vCPUs, ~60GB RAM
    • CPU model in both workers: Intel Xeon CPU E5-2686 v4 @ 2.30GHz
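Cropping a building out of a tile, as benchmarked on slide 12, amounts to keeping the points that fall inside a region. The sketch below assumes the region is a simple axis-aligned bounding box and shows the idea three ways: a plain-Python filter, a PDAL pipeline using the `filters.crop` stage, and a SQL range filter of the kind a database could run. The coordinates, file names, and the table and column names (`lidar_tile`, `x`, `y`, `z`) are hypothetical, and the talk's actual crop may have used a polygon or different queries.

```python
import json

# Hypothetical bounding box around one building, in the tile's coordinate system.
XMIN, XMAX = 300_100.0, 300_150.0
YMIN, YMAX = 5_040_200.0, 5_040_260.0


def crop_points(points, xmin, xmax, ymin, ymax):
    """Keep the (x, y, z) points whose x and y fall inside the box (inclusive)."""
    return [p for p in points if xmin <= p[0] <= xmax and ymin <= p[1] <= ymax]


# CPU/filesystem side: a PDAL pipeline description using the filters.crop stage.
pdal_pipeline = {
    "pipeline": [
        "tile.laz",  # hypothetical input file name
        {"type": "filters.crop", "bounds": f"([{XMIN},{XMAX}],[{YMIN},{YMAX}])"},
        "building.laz",  # hypothetical output file name
    ]
}

# Database side: the same box expressed as a SQL range filter (names are assumptions).
sql_crop = (
    "SELECT x, y, z FROM lidar_tile "
    f"WHERE x BETWEEN {XMIN} AND {XMAX} AND y BETWEEN {YMIN} AND {YMAX}"
)

# Two made-up points: the first lies inside the box, the second outside.
sample = [(300_120.0, 5_040_230.0, 35.2), (300_500.0, 5_040_230.0, 12.0)]
print(crop_points(sample, XMIN, XMAX, YMIN, YMAX))
print(json.dumps(pdal_pipeline, indent=2))
print(sql_crop)
```

The JSON printed here could be saved to a file and passed to the `pdal pipeline` command, while the SQL string would be sent to the database through its client.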
  13. Code Walkthrough & Live Benchmarking

  14. Final Notes

    Component versions evaluated
    • CUDA 10.0
    • Postgres 10-2
    • Pandas 0.24
    • OmniSci (MapD) Core DB v4.4.2. More specifically, v20190123 (commit 568e77d) [just after v4.4.2 due to CUDA 10 support]
    • cuDF (GPU DataFrame) 0.5.0
    • PDAL 1.8.0

    Code and results are on GitHub: https://github.com/patternedscience/GPU-Analytics-Perf-Tests
    We love feedback! To discuss technical results and give feedback, please email us at research@patterned.science
    Follow us for future results & announcements: https://www.linkedin.com/company/patterned-science/
  15. Q&A