
The first AI simulation of a black hole

Black holes accrete gas from their surroundings in a chaotic, turbulent manner. This colloquium will describe the pilot application of deep learning for black hole weather forecasting. Our early results indicate that black hole simulations can benefit tremendously from AI. Along the way, I will explain the difference between artificial intelligence (AI), machine learning, and deep learning and why the scientific community is so excited about these topics.

Slides from a colloquium presented in late 2021.

Rodrigo Nemmen

October 23, 2021


Transcript

  1. The First AI Simulation of a Black Hole ("Do androids dream of electric black holes?"). Rodrigo Nemmen, Universidade de São Paulo, w/ Roberta Duarte, João Paulo Navarro, Ivan Almeida.
  2. Einstein meets The Matrix: The First AI Simulation of a Black Hole. Rodrigo Nemmen, Universidade de São Paulo, w/ Roberta Duarte, João Paulo Navarro, Ivan Almeida.
  3. Einstein meets The Matrix: The First AI Simulation of a Black Hole. Rodrigo Nemmen, IAG USP, w/ Roberta Duarte, Ivan Almeida, João Paulo Navarro.
  4. Index: 1. AI, machine and deep learning? 2. AI in astronomy. 3. Nonlinear systems. 4. Black holes and AI. 5. Why is everybody so excited? (Rodrigo Nemmen | Univ. de São Paulo)
  5. Programs that mimic human behavior: learn, reason and adapt from data. Statistical techniques that allow programs to learn from experience. Deep neural networks. (Rodrigo Nemmen | Univ. de São Paulo) Label shown on the diagram: AI.
  6. Same diagram, second build: AI contains ML. Programs that mimic human behavior (AI); statistical techniques that allow programs to learn from experience (ML); deep neural networks.
  7. Same diagram, final build: AI contains ML, which contains DL. Programs that mimic human behavior (AI); statistical techniques that allow programs to learn from experience (ML); deep neural networks (DL).
  8. ML theory = minimize a loss function encoding a purpose. Toy model: y = f(x) = θ1 x + θ2. The loss is a surface over the parameters θ1 and θ2, and the solution is the minimum of that surface. (See the sketch below.)
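A minimal sketch of that idea in Python (the toy data, learning rate, and step count are illustrative assumptions, not from the talk): fit θ1 and θ2 by gradient descent on a mean-squared-error loss.

```python
import numpy as np

# Toy data drawn from a "true" line y = 2x + 0.5, plus noise (assumed).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = 2.0 * x + 0.5 + 0.1 * rng.standard_normal(50)

theta1, theta2 = 0.0, 0.0   # parameters to learn
lr = 0.5                    # learning rate

for step in range(500):
    y_hat = theta1 * x + theta2             # model prediction
    loss = np.mean((y_hat - y) ** 2)        # the loss "encoding a purpose"
    grad1 = np.mean(2.0 * (y_hat - y) * x)  # d(loss)/d(theta1)
    grad2 = np.mean(2.0 * (y_hat - y))      # d(loss)/d(theta2)
    theta1 -= lr * grad1                    # walk downhill on the loss surface
    theta2 -= lr * grad2

print(theta1, theta2)  # converges near (2.0, 0.5)
```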
  9. The same recipe, scaled up: y = f(x) = θ1 x + θ2 becomes a very complex function with a million more parameters, mapping x (images) onto y (labels: house, child, cat, dog, woman, people, room). ML theory is still: minimize a loss function encoding a purpose.
  10. The loss is now a surface over θ1, θ2 and a million more parameters, but training the very complex function y = f(image) still means minimizing a loss function encoding a purpose (x: images; y: labels such as house, child, cat, dog, woman, people, room).
  11. y = f(x): a very complex function. But what function?
  12. Neural networks: a nonlinear function that maps inputs onto outputs, inspired by the brain. Input units x, hidden units a^(2), output unit h_Θ(x); the weights Θ give the forward mapping between layers:
    a^(2)_1 = g(Θ^(1)_10 x_0 + Θ^(1)_11 x_1 + Θ^(1)_12 x_2 + Θ^(1)_13 x_3)
    a^(2)_2 = g(Θ^(1)_20 x_0 + Θ^(1)_21 x_1 + Θ^(1)_22 x_2 + Θ^(1)_23 x_3)
    a^(2)_3 = g(Θ^(1)_30 x_0 + Θ^(1)_31 x_1 + Θ^(1)_32 x_2 + Θ^(1)_33 x_3)
    h_Θ(x) = a^(3)_1 = g(Θ^(2)_10 a^(2)_0 + Θ^(2)_11 a^(2)_1 + Θ^(2)_12 a^(2)_2 + Θ^(2)_13 a^(2)_3)
    g(x) is the activation function: Sigmoid g(x) = 1/(1 + e^(-x)); ReLU g(x) = 0 for x ≤ 0, x for x > 0; Leaky ReLU g(x) = 0.01x for x < 0, x for x ≥ 0. (See the sketch below.)
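A minimal NumPy sketch of that forward mapping, with the three activation functions from the slide (the random weights are purely illustrative):

```python
import numpy as np

def sigmoid(z):              # g(z) = 1 / (1 + e^(-z))
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):                 # g(z) = 0 for z <= 0, z for z > 0
    return np.maximum(0.0, z)

def leaky_relu(z, a=0.01):   # g(z) = a*z for z < 0, z for z >= 0
    return np.where(z < 0.0, a * z, z)

rng = np.random.default_rng(0)
x = np.concatenate(([1.0], rng.standard_normal(3)))  # bias unit x0 = 1, inputs x1..x3

Theta1 = rng.standard_normal((3, 4))  # layer-1 weights: 4 inputs -> 3 hidden units
Theta2 = rng.standard_normal((1, 4))  # layer-2 weights: 4 hidden -> 1 output

a2 = sigmoid(Theta1 @ x)              # hidden activations a^(2)_1..3
a2 = np.concatenate(([1.0], a2))      # prepend bias unit a^(2)_0 = 1
h = sigmoid(Theta2 @ a2)              # output h_Theta(x) = a^(3)_1
print(h)
```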
  13. Deep learning: increase the number of layers and neurons → a complex nonlinear function, a good approximator for empirical functions x → f(x) that are too complex to have an analytical form.
  14. Supervised learning works but requires too many samples: training a machine by showing examples instead of programming it; when the output is wrong, tweak the parameters of the machine (PLANE vs CAR). Works well for: speech → words, image → categories, portrait → name, photo → caption, text → topic, ... AI today: mostly DL (supervised learning). (Slide adapted from Yann LeCun.)
  15. Same slide, with the caveats: supervised learning is data-hungry (needs a lot of data to be effective) and computationally expensive to train; but once trained, it is very fast to compute f(x). AI today is mostly deep learning. (Slide adapted from Yann LeCun.)
  16. Many kinds of complex functions can be learned from (empirical or simulated) data: a 🚗's next velocity = f(GPS coords, ...); my next ... = f(age, location, ..., # of ...); "this is a dog" = f(🐕); the next move = f(...). Breakthroughs in many fields, incl. astronomy.
  17. Many kinds of complex functions can be learned from (empirical or simulated) data: "this is a dog" = f(🐕); the next move = f(...); your next ... = f(age, location, photo, # of ...); a 🚗's next velocity = f(GPS coords, ...). Breakthroughs in many fields, incl. astronomy. The AI holy grail, competence in any goal: artificial general intelligence, or strong AI.
  18. More DL breakthroughs. Games as a testbed of intelligence: AI mastered all board games, Quake III Arena, StarCraft and poker (Silver+2018; Schrittwieser+2019; Vinyals+2019; Brown & Sandholm 2019). Chip design: AI figures out in hours what takes engineers years (Goldie & Mirhoseini 2021). Weather nowcasting: more precise weather forecasting (Chattopadhyay+2020; Ravuri+2021). Protein structure finder AlphaFold: a 50-year biology grand challenge finally solved (Tunyasuvunakool+2021).
  19. DL applications in astronomy; the number of applications is growing as we speak. Classification of sources and spectra (Barchi+2019; Hausen & Robertson 2019; Sharma+2019). Fast interpolation on grids of models (Almeida+2021, arXiv:2102.05809). Approximate simulations, e.g. cosmology (Zhang+2019; Perraudin+2019; Peek & Burkhart 2019; Breen+2020). Filtering / recovering image features beyond the deconvolution limit (Schawinski+2017; George & Huerta 2017).
  20. AIA Meetups ("Encontros AIA"): seminars by researchers on AI in Astronomy. Fridays at 1 pm, biweekly. Organized by Natali de Santi and Roberta Duarte. Slack: http://tiny.cc/aia-slack; Google Meet: http://tiny.cc/aia-meet
  21. Challenge: compute the electromagnetic spectrum of black hole accretion disks. RIAF + thin disk; synchrotron and inverse Compton emission (Nemmen 2009; 2014). Downside: it takes ≈ 1 minute per black hole, which makes statistics difficult.
  22. AI as a turbocharged interpolator. AGNNES: deep learning Bayesian inference for low-luminosity active galactic nuclei spectra (Almeida, Duarte & N 2021, arXiv:2102.05809; Ivan Almeida, Roberta Duarte). The old downside of ≈ 1 minute per black hole made statistics difficult; now a spectrum takes 0.1 ms (10⁵ times faster), and Bayesian inference becomes possible. (A toy emulator sketch follows below.)
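This is not AGNNES itself, just a minimal sketch of the emulator idea: run the slow physical model once over a grid of parameters, train a network on the results, and replace the minute-long computation with a sub-millisecond forward pass. The function `slow_spectrum_model` and the parameter ranges below are made-up placeholders.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Placeholder for the expensive physical model (~1 min/spectrum in reality):
# maps disk parameters -> luminosity sampled at fixed frequencies (fake formula).
def slow_spectrum_model(params, n_freq=50):
    mdot, mass = params
    nu = np.linspace(8.0, 20.0, n_freq)          # log10(frequency/Hz), illustrative
    return mdot * np.exp(-(nu - 14.0) ** 2) + 0.1 * mass * nu

rng = np.random.default_rng(1)
X = rng.uniform([0.1, 1.0], [1.0, 10.0], size=(2000, 2))  # (mdot, mass) grid
Y = np.array([slow_spectrum_model(p) for p in X])          # slow part, done once

emulator = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
emulator.fit(X, Y)                                         # train offline

# Afterwards a spectrum costs one forward pass (sub-ms), cheap enough
# to sit inside an MCMC loop for Bayesian inference.
print(emulator.predict([[0.5, 5.0]]).shape)                # (1, 50)
```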
  23. AGNNES in action (Almeida, Duarte & N 2021, arXiv:2102.05809): 0.1 ms per spectrum, 10⁵ times faster, Bayesian inference possible. [Paper figures shown: Figure 5, posterior distributions of the fitted parameters for M87, with vertical lines delimiting the 1σ and 2σ regions; Figure 7, NGC 315 SED posteriors for (a) RIAF and (b) jet parameters; SED axes log(frequency/Hz) vs log(luminosity/erg/s).] From the paper: some sources require exploring values outside the training set; Sgr A*'s mass accretion rate is too small, outside the Ṁ range of this work, so AGNNES cannot fit its SED. AGNNES best reproduces LLAGN SEDs with (νLν)peak > 10³⁶ erg/s; future work may expand the training set to broader ranges of accretion rates and black hole masses.
  24. Can deep neural nets learn chaos and fluid dynamics? Predicting the future evolution of very large spatiotemporally chaotic systems from data. (Movie credit: Jon McKinney)
  25. Machine learning predicts the future of simple chaotic systems: model-free prediction, with no access during training to the underlying equations (Pathak+2018, PRL 120, 024102; Breen+2020; Tamayo+2020). Test cases: the Kuramoto-Sivashinsky (KS) equation, y_t = -y y_x - y_xx - y_xxxx + μ cos(2πx/λ), predicted with a single reservoir of size Dr = 5000 (L = 22, μ = 0), and the 3-body problem (coordinates x1, x2). [Figure: space-time panels of the KS field; time axis in units of Lyapunov time.]
  26. The same figure, now labeling the upper panels: these are the actual numerical solutions of the KS equation (time in units of Lyapunov time).
  27. The same figure, now labeling the ML predictions: the model-free forecast tracks the true solution for several Lyapunov times; deviations begin at the marked point.
  28. The same figure, with the closing question: does this work for astrophysical fluids?
  29. This work: use deep learning to predict the future of accreting black holes. A deep neural network learns the mapping f(state at t0) → state at t, for t > t0. Can deep neural nets learn fluid dynamics? Can we make models faster? (A minimal next-step-forecast sketch follows below.)
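A minimal sketch of the next-step-forecasting setup, not the paper's actual pipeline: train a network to map one snapshot to the next, then roll it out iteratively by feeding its own output back in. The toy data, network size, and shapes are assumptions.

```python
import torch
import torch.nn as nn

# Toy stand-in for simulation snapshots: a sequence of flattened density fields.
T, N = 500, 64 * 64
snapshots = torch.randn(T, N)                # replace with real simulation frames

net = nn.Sequential(                         # f: state(t) -> state(t + dt)
    nn.Linear(N, 512), nn.ReLU(),
    nn.Linear(512, N),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(10):                      # supervised next-step training
    pred = net(snapshots[:-1])               # inputs: frames 0..T-2
    loss = loss_fn(pred, snapshots[1:])      # targets: frames 1..T-1
    opt.zero_grad()
    loss.backward()
    opt.step()

# Iterative rollout: the net "imagines" the future from its own outputs.
frame = snapshots[-1]
with torch.no_grad():
    for _ in range(100):
        frame = net(frame)                   # errors compound step by step
```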
  30. Training dataset: black hole accretion disk simulations (Almeida & N 2020; Ivan Almeida). Let a Schwarzschild black hole accrete a torus of gas and solve the hydrodynamic equations; angular momentum is transported and turbulence develops.
  31. Teaching machines to simulate black hole weather (Roberta Duarte, PhD student; João Paulo Navarro). The black hole mass accretion rate time series (code units) is split into training, cross-validation and test intervals; each point in the time series is a simulation snapshot. N+, arXiv:2011.12819; Duarte+, arXiv:2102.06242. [The slide shows a draft of "Deep learning to make black hole weather forecasting" by Roberta Duarte and Rodrigo Nemmen (Universidade de São Paulo, IAG) with João Paulo Peçanha (NVIDIA), including an example density field log ρ(r, t = 257387 GM/c³) from the hydrodynamic simulations of Almeida & Nemmen (2019), computed with the PLUTO code (Mignone et al. 2007), axes R, z in units of 2GM/c².]
  32. The same training / cross-validation / test split on the accretion-rate time series (N+, arXiv:2011.12819; Duarte+, arXiv:2102.06242). From the paper draft visible on the slide: numerical simulations of accretion are computationally expensive; deep learning may overcome this (He et al. 2019), and CNNs, which extract information from 2D fields, have shown better results than time-series architectures such as RNNs and LSTMs (Bai et al. 2018; Zhang et al. 2016). Here a U-Net architecture is used: a U-shaped network composed of convolution and pooling layers. (A minimal U-Net sketch follows below.)
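The paper draft names a U-Net; here is a minimal single-level sketch of that architecture in PyTorch. The channel counts, depth and input size are illustrative, not the paper's actual network.

```python
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    """One-level U-Net: encode, bottleneck, decode, with a skip connection."""
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU())
        self.pool = nn.MaxPool2d(2)                        # downsample 2x
        self.mid = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)  # upsample 2x
        # 32 in-channels: 16 upsampled + 16 from the skip connection
        self.dec = nn.Sequential(nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(16, 1, 1))      # back to 1 field

    def forward(self, x):
        e = self.enc(x)                         # fine-scale features
        m = self.mid(self.pool(e))              # coarse-scale features
        u = self.up(m)
        return self.dec(torch.cat([u, e], 1))   # concatenate skip, then decode

# One density snapshot in, predicted next snapshot out (batch, channel, H, W).
rho = torch.randn(1, 1, 64, 64)
print(TinyUNet()(rho).shape)                    # torch.Size([1, 1, 64, 64])
```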
  33. The blue pill, the nice side of The Matrix: "Everything is wonderful." "Relax, AI will solve all problems for us." "I know the steak does not exist."
  34. Neural network's imagination versus actual data ("nowcasting"). Duarte, N & Navarro, arXiv:2102.06242, Figure 6: comparison between the target (Black Hole) and the iterative prediction (Deep Learning, one-sim case), log ρ over (R, z) in units of 2GM/c², at Δt = 2.5 × 10⁴ M = 0.1 tvisc after the last cross-validation snapshot (t = 607570, code units).
  35. Evolution of mass fluctuations from the iterative one-sim model (Figure 7): ground truth, deep-learning prediction and residual versus time (× 10⁵ GM/c³). Up to the dividing line this is reality; everything after is a forecast, entirely imagined by the NN.
  36. The neural network imagines the future well for a long time! The net learns turbulence and forecasts surprisingly well out to Δt = 6 × 10⁴ GM/c³ = 0.3 tvisc, running 10⁴× faster than 200 CPUs in parallel; DL is promising for solving multidimensional chaotic systems. (Ground truth, deep-learning prediction and residual versus time in units of 10⁵ GM/c³; Duarte, N & Navarro, arXiv:2102.06242.)
  37. [Figure 7 again, full view: evolution of mass fluctuations from the iterative one-sim model; ground truth, deep-learning prediction and residual versus time (× 10⁵ GM/c³).]
  38. Onset of "artificial Alzheimer": after Δt = 6 × 10⁴ GM/c³, the neural net's imagination drifts exponentially away from reality (ground truth, deep-learning prediction and residual; Duarte, N & Navarro, arXiv:2102.06242). (A sketch of how one might measure this drift follows below.)
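One way to quantify the drift, an illustrative diagnostic rather than the paper's exact metric: track the relative L2 residual between the iterative rollout and the ground-truth frames at each step; exponential divergence shows up as linear growth of the log-residual.

```python
import numpy as np

def rollout_residuals(truth, predict_next):
    """Relative L2 error of an iterative forecast against ground-truth frames."""
    frame = truth[0]
    errs = []
    for target in truth[1:]:
        frame = predict_next(frame)          # feed the prediction back in
        errs.append(np.linalg.norm(frame - target) / np.linalg.norm(target))
    return np.array(errs)

# Toy stand-ins: correlated "truth" frames and a slightly-wrong one-step model.
rng = np.random.default_rng(2)
truth = np.cumsum(rng.standard_normal((200, 64)), axis=0)
errs = rollout_residuals(truth, lambda f: 1.001 * f)   # placeholder model

# Exponential drift appears as a roughly constant growth rate of log(errs).
print(np.log(errs[1:] / errs[:-1]).mean())
```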
  39. Black Hole versus Deep Learning, side by side (Figure 6, one-sim case): log ρ over (R, z) in units of 2GM/c² at Δt = 2.5 × 10⁴ M = 0.1 tvisc, i.e. 125 time steps after the last cross-validation snapshot.
  40. The same comparison at Δt = 5 × 10⁴ M = 0.2 tvisc (250 time steps): Black Hole versus Deep Learning, log ρ over (R, z) in units of 2GM/c².
  41. The same comparison at Δt = 10⁵ M = 0.4 tvisc (500 time steps): the AI injects mass into the domain like there is no tomorrow. (A mass-conservation check is sketched below.)
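A simple way to expose that failure, an assumed diagnostic rather than one from the paper: integrate the density over the grid for each predicted frame and compare against the ground truth's total mass budget.

```python
import numpy as np

def total_mass(rho, dR, dz, R):
    """Total mass on an axisymmetric (R, z) grid: M = integral of rho * 2*pi*R dR dz."""
    return float(np.sum(rho * 2.0 * np.pi * R) * dR * dz)

# Toy grid and frames; in practice rho_* come from the simulation / NN rollout.
R = np.linspace(0.1, 400.0, 256)[None, :]     # cylindrical radius, broadcast over z
dR = dz = 400.0 / 256
rho_truth = np.ones((256, 256))
rho_pred = 1.05 * rho_truth                    # an NN over-injecting mass by 5%

drift = total_mass(rho_pred, dR, dz, R) / total_mass(rho_truth, dR, dz, R) - 1.0
print(f"fractional mass error: {drift:+.2%}")  # grows with rollout time
```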
  42. Steps to cure the AI's Alzheimer. Improve the training dataset: 3D GRMHD simulations; include the dynamics (turbulent field); add the velocity field to the training set. Improve the AI methods: recurrent NNs, GANs, transformers, self-attention.
  43. The red pill of AI. My own view: deep learning is fancy curve-fitting (with millions of free parameters). ML has no accomplishments in astronomy: what important result could only have been obtained with ML? None so far. An "effective field theory" of deep neural nets is terra incognita; choosing NN architectures etc. is more art than science, with no understanding of the errors.
  44. The most serious issue is philosophical: AI does not solve the underlying physics equations ("equation-free, data-driven"). Even if it learns well, can we trust its predictions? What about new phenomena it predicts? It is like a baby that can stand up but does not understand gravity: would you trust the baby to make physics predictions?
  45. Moves AlphaGo would play versus moves human players would play, with AlphaGo's evaluation of each move: AI is giving us profound insights (though not yet in astrophysics).
  46. Summary: deep learning and black holes. Deep neural nets are good approximators for very complex empirical functions; we train machines by showing them data (versus programming them). AI seems to learn black hole accretion physics and makes blazing-fast predictions (arXiv:2011.12819, 2102.06242), but there are serious issues with mass continuity. Future: will AI replace numerical fluid solvers? 1 year of work in 1 hour; science less restricted by hardware? Lots of open questions remain. rodrigonemmen.com | blackholegroup.org
  47. If you are also interested in black hole physics, high-energy astrophysics, or GPUs and HPC, get in touch! 🙂 [email protected] @nemmen blackholegroup.org @BlackHolesUSP
  48. If you are also interested in high-energy astrophysics, simulations of accretion / jets, or AGN feedback (winds, jets), get in touch! 🙂 [email protected] @nemmen blackholegroup.org @BlackHolesUSP
  49. If you are also interested in simulations of accretion / jets, AGN feedback (winds, jets), or GPU acceleration / AI, get in touch! 🙂 [email protected] @nemmen blackholegroup.org @BlackHolesUSP