GPU Acceleration in the PyData community

1 GPU Acceleration in the PyData community
Jacob Tomlinson, Dask Maintainer and RAPIDS Developer
Pangeo CNES 2024

2 Modern Applications Need Accelerated Computing
Petabyte scale data | Massive models | Real-time performance
Applications: LLMs, Forecasting, Fraud Detection, Genomic Analysis, Cybersecurity, Recommenders
[Chart: single-threaded perf (1.5X per year, 1.1X per year) vs. accelerated computing, log-scale axis from 10^1 to 10^7]

3 Accelerated Computing Swim Lanes
RAPIDS makes accelerated computing more seamless while enabling specialization for maximum performance

4 RAPIDS Adopted Across Industries
• RAPIDS | Dask | XGBoost: 100x faster feature engineering, 20x faster model training, increased forecast accuracy
• cuGraph: processing relationships between 10 million biological entities through more than a billion edges
• RAPIDS Accelerator for Apache Spark: 70% cost savings, 33% performance improvement

5 Accelerated pandas
cudf.pandas: the zero code change GPU accelerator for pandas built on cuDF
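A minimal sketch of what "zero code change" looks like in a script. The only addition is the opt-in at the top; everything after `import pandas as pd` is ordinary pandas. The column names and values are made up for illustration, and the fallback to plain pandas when cuDF is missing is an assumption of this sketch, not something the slide shows.

```python
# Opt in to the accelerator before pandas is imported.
# If cuDF (or a usable GPU) is not available, carry on with plain pandas.
try:
    import cudf.pandas
    cudf.pandas.install()
    accelerated = True
except Exception:  # broad on purpose: missing package or missing GPU driver
    accelerated = False

import pandas as pd

# Illustrative data only (hypothetical station names and temperatures).
df = pd.DataFrame(
    {
        "station": ["a", "b", "a", "b"],
        "temp_c": [11.0, 14.5, 12.0, 15.5],
    }
)

# Unmodified pandas code: runs on the GPU when the accelerator is active,
# on the CPU otherwise.
mean_temp = df.groupby("station")["temp_c"].mean()
print("accelerated:", accelerated)
print(mean_temp)
```

The same opt-in is available without editing the script at all, via `python -m cudf.pandas script.py` on the command line or `%load_ext cudf.pandas` in a notebook.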

6 Bringing NVIDIA accelerated computing to Polars
Polars GPU Engine Powered by RAPIDS cuDF
https://developer.nvidia.com/blog/polars-gpu-engine-powered-by-rapids-cudf-now-available-in-open-beta/

7 Accelerated Dask
Just set “cudf” and “cupy” as the backend and use Dask-CUDA Workers
• Configurable Backend and GPU-Aware Workers
• Memory Spilling (GPU->CPU->Disk)
• Optimized Memory Management
• Accelerated RDMA and Networking (UCX)
• Community tools like xarray-cupy
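As a rough sketch of the "set the backend" step, Dask's configuration system accepts `dataframe.backend` and `array.backend` keys. The example below only sets and reads back those keys, so it runs without a GPU; the helper `start_gpu_client` is a hypothetical name introduced here, and actually creating cuDF- or CuPy-backed collections requires those libraries plus `dask-cuda` to be installed.

```python
import dask

# Backend selection named on the slide.
GPU_BACKENDS = {"dataframe.backend": "cudf", "array.backend": "cupy"}


def start_gpu_client():
    """Start one Dask-CUDA worker per visible GPU (hypothetical helper).

    Imports are local so the module still loads on machines
    without dask-cuda or dask.distributed installed.
    """
    from dask.distributed import Client
    from dask_cuda import LocalCUDACluster

    return Client(LocalCUDACluster())


# Scope the GPU backends to a block; collections created inside it
# through the backend-aware creation functions would use cuDF / CuPy.
with dask.config.set(GPU_BACKENDS):
    active = (
        dask.config.get("dataframe.backend"),
        dask.config.get("array.backend"),
    )

print(active)
```

The same two settings can also be supplied through the environment variables `DASK_DATAFRAME__BACKEND` and `DASK_ARRAY__BACKEND`, which keeps the script itself unchanged.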

8 Accelerated Apache Spark
Zero code change acceleration for Spark DataFrames and SQL
CPU Spark:
    spark.sql("""
    select order count(*) as order_count
    from orders""")
GPU Spark:
    spark.conf.set("spark.plugins", "com.nvidia.spark.SQLPlugin")
    spark.sql("""
    select order count(*) as order_count
    from orders""")
Average Speed-Ups: >5x
• Operates as a software plugin to popular Apache Spark platform
• Automatically accelerates supported operations (with CPU fallback if needed)
• Requires no code changes
• Works with Spark standalone, YARN clusters, Kubernetes clusters
• Deploy on: Apache Spark 3.4.1, RAPIDS Spark release 24.04
See GTC session S62257 for details
NVIDIA Decision Support Benchmark 3TB (Public Cloud)
Amazon EMR, Google Cloud Dataproc

9 cuML
Accelerated machine learning with a scikit-learn API
CPU (scikit-learn):
    >>> from sklearn.ensemble import RandomForestClassifier
    >>> clf = RandomForestClassifier()
    >>> clf.fit(x, y)
GPU (cuML):
    >>> from cuml.ensemble import RandomForestClassifier
    >>> clf = RandomForestClassifier()
    >>> clf.fit(x, y)
50+ GPU-Accelerated Algorithms: Time Series, Preprocessing, Classification, Tree Models, Cross Validation, Clustering, Explainability, Dimensionality Reduction, Regression
A100 GPU vs. AMD EPYC 7642 (96 logical cores); cuML 23.04, scikit-learn 1.2.2, umap-learn 0.5.3

10 Accelerated NetworkX
nx-cugraph: the zero-code change GPU backend for NetworkX
• Zero-code-change GPU acceleration of NetworkX code
• Accelerates algorithms up to 600x, based on algorithm and graph size
• Support for 60 popular graph algorithms and growing
• Falls back to using CPU NetworkX for unsupported algorithms
NetworkX 3.2, CPU: Intel(R) Xeon(R) Platinum 8480CL 2TB, GPU: NVIDIA H100 80GB
    pip install nx-cugraph-cu12 --extra-index-url https://pypi.nvidia.com
    conda install -c rapidsai -c conda-forge -c nvidia nx-cugraph
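One way to request the GPU backend for a single call is the `backend=` keyword that NetworkX exposes on dispatchable algorithms. The sketch below is illustrative: it passes `backend="cugraph"` only when the `nx_cugraph` package is importable, which is an assumption of this example about how to keep the same script runnable on CPU-only machines. The tiny star graph is a stand-in for a real workload.

```python
import importlib.util

import networkx as nx

# Use the GPU backend only when nx-cugraph is installed;
# otherwise call NetworkX exactly as usual.
has_nx_cugraph = importlib.util.find_spec("nx_cugraph") is not None
backend_kwargs = {"backend": "cugraph"} if has_nx_cugraph else {}

# Small illustrative graph: node 0 is the hub, nodes 1-4 are leaves.
G = nx.star_graph(4)

# Same algorithm call either way; only the dispatch target differs.
scores = nx.betweenness_centrality(G, **backend_kwargs)

hub = max(scores, key=scores.get)
print("backend:", "cugraph" if has_nx_cugraph else "networkx (CPU)")
print("most central node:", hub)
```

On a graph this small the GPU offers no benefit; the speed-ups quoted on the slide depend on algorithm and graph size.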
