gully
April 25, 2018

# GPUs for astronomy data

Essentially all astronomical data has correlated noise. New statistical approaches for overcoming correlated noise have recently become available and are now popular in the astronomy community. These approaches involve linear algebra manipulations amenable to GPU acceleration. Two barriers have prevented widespread adoption of GPUs for astronomy data: access to (NVIDIA) hardware, and the finite learning curve of programming (CUDA). Recent Python frameworks are lowering the barrier to programming GPUs.
In this presentation I demonstrate the performance of the PyTorch GPU programming framework on fitting a line to data with correlated noise. The task applies Gaussian Process regression with linear least squares. The hardware was an NVIDIA Tesla K40 GPU on a Sandy Bridge node of the NASA Pleiades supercomputer at NASA Ames Research Center.


## Transcript

1. ### GPUs for astronomy data

Michael Gully-Santiago, PhD, Scientist, baeri.org, Kepler/K2 Guest Observer Office. April 25, 2018. NVIDIA, Santa Clara, CA.
2. ### What I want to say

- (Essentially all) astronomical data has correlated noise
- New statistical approaches for overcoming correlated noise have recently become available and are now popular in the astronomy community
- These approaches involve linear algebra manipulations amenable to GPU acceleration
- Two barriers have prevented widespread adoption: access to (NVIDIA) hardware, and the finite learning curve of programming (CUDA)
- Recent Python frameworks are lowering the barrier to programming GPUs
3. ### How to deal with correlated noise

[Figure: data scattered around the line y = m x + b, alongside the true covariance matrix of the observations.]

Slides from Dan Foreman-Mackey: https://speakerdeck.com/dfm/an-astronomers-introduction-to-gaussian-processes-v2
4. ### Linear least-squares

$$A = \begin{bmatrix} x_1 & 1 \\ x_2 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{bmatrix} \qquad C = \begin{bmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_n^2 \end{bmatrix} \qquad y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}$$

$$\begin{bmatrix} m \\ b \end{bmatrix} = S\, A^\mathsf{T} C^{-1} y \qquad S = \left[ A^\mathsf{T} C^{-1} A \right]^{-1}$$

Here $S\, A^\mathsf{T} C^{-1} y$ is the maximum-likelihood estimate (and, in this case only, the mean of the posterior); $S$ is the posterior covariance.

Before: you get biased estimates of m and b when you ignore the covariance.

Slides from Dan Foreman-Mackey: https://speakerdeck.com/dfm/an-astronomers-introduction-to-gaussian-processes-v2
5. ### Linear least-squares, including the covariance

After: including the covariance yields realistic error bars.

$$\begin{bmatrix} m \\ b \end{bmatrix} = S\, A^\mathsf{T} C^{-1} y \qquad S = \left[ A^\mathsf{T} C^{-1} A \right]^{-1}$$

(The same design matrix $A$, covariance $C$, and data vector $y$ as on the previous slide, with $C$ now the full, non-diagonal covariance of the observations.)

Slides from Dan Foreman-Mackey: https://speakerdeck.com/dfm/an-astronomers-introduction-to-gaussian-processes-v2
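The generalized least-squares solution on this slide can be sketched in a few lines of numpy. The data, true slope/intercept, and covariance kernel below are synthetic placeholders invented for illustration, not values from the talk:

```python
# Generalized least squares: beta = S A^T C^{-1} y, S = (A^T C^{-1} A)^{-1}.
import numpy as np

rng = np.random.default_rng(42)
n = 50
x = np.sort(rng.uniform(-5, 5, n))
A = np.column_stack([x, np.ones(n)])            # design matrix [[x_i, 1]]

# Correlated noise: diagonal variances plus a squared-exponential term.
C = 0.1 * np.eye(n) + 0.3 * np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)

m_true, b_true = 0.5, -0.2
y = A @ [m_true, b_true] + rng.multivariate_normal(np.zeros(n), C)

Cinv_A = np.linalg.solve(C, A)                  # C^{-1} A without forming C^{-1}
S = np.linalg.inv(A.T @ Cinv_A)                 # posterior covariance of (m, b)
beta = S @ (A.T @ np.linalg.solve(C, y))        # maximum-likelihood (m, b)
m_hat, b_hat = beta
```

Using `np.linalg.solve` rather than explicitly inverting `C` is both cheaper and more numerically stable; only the tiny 2x2 matrix is ever inverted.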
6. ### Inverting the C matrix is very computationally expensive

In practice, it can only be done on small (N ~ 3000) datasets. The cost scales poorly (naively O(N^3)); Cholesky factorization helps. Applied mathematics research has developed approximate solvers, and `george` and `celerite` have improved scaling on CPUs. The problem is even more acute if the inversion occurs inside a repeated likelihood function call for optimization or sampling. That's where GPUs come in.

https://github.com/dfm/george
https://github.com/dfm/celerite

8. ### PyTorch just got even better with v0.4.0

- Includes the much-needed `MultivariateNormal` distribution
- Looks more like numpy every day!
9. ### Demo of solving y = m x + b with a covariance matrix

Three ways:

1. CPU with numpy
2. CPU with PyTorch
3. GPU with PyTorch

I used the NASA Advanced Supercomputing (NAS) High End Computing Capability (HECC) Pleiades system based at NASA Ames: one 16-core Sandy Bridge GPU-enhanced node possessing an NVIDIA Tesla K40 (GPU) and CUDA/8.0.
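The three approaches share the same algebra, which is the appeal of PyTorch's numpy-like API: one fit function runs on CPU or GPU depending only on where the tensors live. This sketch uses the modern `torch.linalg` module (the talk predates it, using v0.4.0) and synthetic noiseless data, so the fit recovers the true slope and intercept exactly:

```python
# Sketch of the three-way demo: numpy on CPU, PyTorch on CPU, PyTorch on GPU.
import numpy as np
import torch

n = 1000
x = np.linspace(-5, 5, n)
A_np = np.column_stack([x, np.ones(n)])
C_np = 0.1 * np.eye(n) + 0.3 * np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)
y_np = 0.5 * x - 0.2                      # noiseless line, m=0.5, b=-0.2

def fit_numpy(A, C, y):
    """GLS fit: S A^T C^{-1} y with S = (A^T C^{-1} A)^{-1}."""
    Cinv_A = np.linalg.solve(C, A)
    S = np.linalg.inv(A.T @ Cinv_A)
    return S @ (A.T @ np.linalg.solve(C, y))

def fit_torch(A, C, y):
    """Identical algebra in PyTorch; device follows the input tensors."""
    Cinv_A = torch.linalg.solve(C, A)
    S = torch.linalg.inv(A.T @ Cinv_A)
    return S @ (A.T @ torch.linalg.solve(C, y))

# 1. CPU with numpy
beta_np = fit_numpy(A_np, C_np, y_np)

# 2. CPU with PyTorch
A_t, C_t, y_t = (torch.from_numpy(a) for a in (A_np, C_np, y_np))
beta_cpu = fit_torch(A_t, C_t, y_t)

# 3. GPU with PyTorch: same code, tensors moved to the device
if torch.cuda.is_available():
    beta_gpu = fit_torch(A_t.cuda(), C_t.cuda(), y_t.cuda()).cpu()
```

Step 3 is literally step 2 with `.cuda()` calls: no CUDA kernels are written by hand, which is the "lowered barrier" the talk is about.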
10. ### The linear least-squares setup, revisited

$$\begin{bmatrix} m \\ b \end{bmatrix} = S\, A^\mathsf{T} C^{-1} y \qquad S = \left[ A^\mathsf{T} C^{-1} A \right]^{-1}$$

(The same design matrix $A$, covariance matrix $C$, and data vector $y$ as on slide 4: the maximum-likelihood estimate, the posterior mean in this case only, and the posterior covariance $S$.)
11. ### GPUs outperform CPUs by 18x at N=3200

Simulating the noise limits CPUs to N=3200. (Michael Gully-Santiago, using NASA Ames Pleiades: NVIDIA K40 on one Sandy Bridge node.)
12. ### Simulating the noise requires drawing from one huge multivariate normal

(aka Gaussian) distribution. GPUs outperform CPUs by 75x on this task for a C = 3200 x 3200 matrix. We can simulate a 23000 x 23000 matrix before we run out of memory on the NVIDIA K40.

https://github.com/gully/bombcat
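Drawing correlated noise is exactly what the `MultivariateNormal` distribution from slide 8 provides: internally it Cholesky-factors the covariance, which is the expensive step the GPU accelerates. A minimal sketch, with a small placeholder size and a made-up covariance kernel (not the talk's benchmark setup):

```python
# Drawing samples from a large multivariate normal with PyTorch.
import torch

n = 512
device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.linspace(0.0, 10.0, n, device=device, dtype=torch.float64)

# Toy covariance: diagonal jitter plus a squared-exponential kernel.
C = 0.1 * torch.eye(n, device=device, dtype=torch.float64) \
    + torch.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)

mvn = torch.distributions.MultivariateNormal(
    loc=torch.zeros(n, device=device, dtype=torch.float64),
    covariance_matrix=C,
)
draws = mvn.sample((100,))   # 100 realizations of the correlated noise
```

Memory is the binding constraint: a dense float64 covariance of size N x N costs 8 N^2 bytes, so a 23000 x 23000 matrix is about 4.2 GB before any factorization workspace, consistent with running out of memory on a 12 GB K40.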
13. ### Kaggle Kernels are hosted, social Jupyter notebooks

They offer NVIDIA K80 GPUs.

Resources: https://www.kaggle.com/xgully/fit-a-line-to-data-with-gpus/
14. ### Dan Foreman-Mackey (Flatiron Center for Computational Astrophysics)

Dan has pioneered many methods for modern astrophysical statistical inference. Resources:

- https://speakerdeck.com/dfm/pyastro16
- https://speakerdeck.com/dfm/an-astronomers-introduction-to-gaussian-processes-v2
- https://speakerdeck.com/dfm/data-analysis-with-mcmc
- http://dfm.io/
- https://github.com/dfm/tf-tutorial
15. ### What I said

- (Essentially all) astronomical data has correlated noise
- New statistical approaches for overcoming correlated noise have recently become available and are now popular in the astronomy community
- These approaches involve linear algebra manipulations amenable to GPU acceleration
- Two barriers have prevented widespread adoption: access to (NVIDIA) hardware, and the finite learning curve of programming (CUDA)
- Recent Python frameworks are lowering the barrier to programming GPUs

17. ### Examples of correlated noise in astronomy

- Kepler high-precision time-series brightness measurements of variable stars
- Model-data comparison of stellar spectroscopy
- Thermal-induced noise in optical calibration systems
- Galaxy surface brightness distributions
- Apparent bulk motion of gas blobs on roiling stellar surfaces
- Placement of any points on a chart when systematic bias is present