Slide 1

Slide 1 text

GPU accelerating your computation in Python
Jacob Tomlinson
Senior Software Engineer, RAPIDS
Dask Core Maintainer
EGU General Assembly 2022, EGU22-7610, https://doi.org/10.5194/egusphere-egu22-7610, 2022.

Slide 2

Slide 2 text

RAPIDS
https://github.com/rapidsai

Slide 3

Slide 3 text

Jake VanderPlas - PyCon 2017

Slide 4

Slide 4 text

Open Source Data Science Ecosystem
Familiar Python APIs

Pipeline stages (in CPU memory): Data Preparation, Model Training, Visualization

Pandas: Analytics
Scikit-Learn: Machine Learning
NetworkX: Graph Analytics
PyTorch, TensorFlow, MxNet: Deep Learning
Matplotlib: Visualization
Dask

Slide 5

Slide 5 text

RAPIDS
End-to-End Accelerated GPU Data Science

Pipeline stages (in GPU memory): Data Preparation, Model Training, Visualization

cuDF, cuIO: Analytics
cuML: Machine Learning
cuGraph: Graph Analytics
PyTorch, TensorFlow, MxNet: Deep Learning
cuxfilter, pyViz, plotly: Visualization
Dask

Slide 6

Slide 6 text

RAPIDS Matches Common Python APIs
CPU-based Clustering

    from sklearn.datasets import make_moons
    import pandas

    X, y = make_moons(n_samples=int(1e2), noise=0.05, random_state=0)
    X = pandas.DataFrame({'fea%d'%i: X[:, i] for i in range(X.shape[1])})

    from sklearn.cluster import DBSCAN

    dbscan = DBSCAN(eps = 0.3, min_samples = 5)
    y_hat = dbscan.fit_predict(X)

Slide 7

Slide 7 text

RAPIDS Matches Common Python APIs
GPU-accelerated Clustering

    from sklearn.datasets import make_moons
    import cudf

    X, y = make_moons(n_samples=int(1e2), noise=0.05, random_state=0)
    X = cudf.DataFrame({'fea%d'%i: X[:, i] for i in range(X.shape[1])})

    from cuml import DBSCAN

    dbscan = DBSCAN(eps = 0.3, min_samples = 5)
    y_hat = dbscan.fit_predict(X)
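The CPU and GPU snippets differ only in which modules they import, so a script could in principle pick its backend at run time. The following is a minimal sketch of that idea, not something shown in the talk: `first_available` is a hypothetical helper, and it assumes that a missing or unusable GPU library surfaces as an `ImportError`, which may not hold on every system.

```python
import importlib


def first_available(candidates):
    """Return the first module in `candidates` that imports cleanly, else None.

    Hypothetical helper for illustration; it is not part of RAPIDS.
    """
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Not installed (or failed to import): try the next candidate.
            continue
    return None


# Prefer the GPU DataFrame library and fall back to the CPU one.
xdf = first_available(["cudf", "pandas"])
if xdf is not None:
    print("DataFrame backend:", xdf.__name__)
else:
    print("Neither cudf nor pandas is installed")
```

With a backend chosen this way, `xdf.DataFrame(...)` could stand in for `pandas.DataFrame(...)` or `cudf.DataFrame(...)` in the clustering examples, under the assumption that the calls used are supported by both libraries.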

Slide 8

Slide 8 text

Benchmarks: Single-GPU cuML vs Scikit-learn
1x V100 vs. 2x 20 Core CPUs (DGX-1, RAPIDS 0.15)
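The benchmark figures themselves are not reproduced in this transcript. As a rough sketch of how a comparison like this could be measured on your own hardware, the hypothetical helpers below time two callables and report their ratio. The names `best_time` and `speedup` are illustrative, and the sketch assumes each call blocks until its result is ready.

```python
import time


def best_time(fn, *args, repeats=3, **kwargs):
    """Return the fastest wall-clock time in seconds over `repeats` calls."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args, **kwargs)
        timings.append(time.perf_counter() - start)
    return min(timings)


def speedup(baseline_fn, candidate_fn, *args, repeats=3):
    """Return how many times faster `candidate_fn` ran than `baseline_fn`."""
    baseline = best_time(baseline_fn, *args, repeats=repeats)
    candidate = best_time(candidate_fn, *args, repeats=repeats)
    return baseline / candidate
```

For the clustering examples, `baseline_fn` could be the scikit-learn `fit_predict` call and `candidate_fn` the cuML one, each given data already held in its own memory space. Taking the minimum over repeats reduces the influence of one-off start-up costs on the first call.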

Slide 9

Slide 9 text

RAPIDS Everywhere
The Next Phase of RAPIDS

Exactly as it sounds: our goal is to make RAPIDS as usable and performant as possible wherever science is done. We will continue to work with more open source projects to further democratize acceleration and efficiency in science.

Slide 10

Slide 10 text

Statistical genetics toolkit in Python

Slide 11

Slide 11 text

Work with us
Everyone Can Help!

Integrations, feedback, documentation support, pull requests, new issues, or code donations welcomed!

APACHE ARROW: https://arrow.apache.org/ @ApacheArrow
GPU OPEN ANALYTICS INITIATIVE: http://gpuopenanalytics.com/ @GPUOAI
RAPIDS: https://rapids.ai @RAPIDSai
DASK: https://dask.org @Dask_dev

Slide 12

Slide 12 text

THANK YOU
Jacob Tomlinson
[email protected]
@_jacobtomlinson