Slide 1

Slide 1 text

OPEN SOFTWARE FOR ASTRONOMICAL DATA ANALYSIS by Dan Foreman-Mackey

Slide 2

Slide 2 text

No content

Slide 3

Slide 3 text

0. open software for astrophysics

Slide 4

Slide 4 text

credit: Adrian Price-Whelan / data: SAO/NASA ADS

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

many fundamental software packages have a shockingly small number of maintainers.

Slide 7

Slide 7 text

credit: Adrian Price-Whelan

Slide 8

Slide 8 text

* astronomical software can be very high impact
* we should think about career trajectories & mechanisms for supporting this work

Slide 9

Slide 9 text

No content

Slide 10

Slide 10 text

1. case study: Gaussian processes

Slide 11

Slide 11 text

[figure panels: raw [ppt] and de-trended [ppt] vs. time [days]; N = 1000]
reference: DFM+ (2017)

Slide 12

Slide 12 text

[figure panels: raw [ppt] and de-trended [ppt] vs. time [days]; N = 1000]
reference: DFM+ (2017)

Slide 13

Slide 13 text

reference: Aigrain & DFM (2022)

Slide 14

Slide 14 text

reference: Aigrain & DFM (2022)

Slide 15

Slide 15 text

[figure panels: ignoring correlated noise / accounting for correlated noise]
reference: Aigrain & DFM (2022)

Slide 16

Slide 16 text

reference: Aigrain & DFM (2022)

Slide 17

Slide 17 text

a Gaussian Process is a drop-in replacement for chi-squared
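
One way to read this claim, as a minimal sketch rather than the talk's own code: the chi-squared log-likelihood is the special case of a multivariate Gaussian log-likelihood with a diagonal covariance matrix, and a GP swaps in a dense covariance built from a kernel. The squared-exponential kernel, the stand-in residuals, and all function names below are illustrative assumptions.

```python
import jax
import jax.numpy as jnp
from jax.scipy.linalg import solve_triangular

jax.config.update("jax_enable_x64", True)


def chi2_log_likelihood(resid, yerr):
    # Independent Gaussian noise: -0.5 * chi^2 plus the normalization term.
    chi2 = jnp.sum((resid / yerr) ** 2)
    norm = jnp.sum(jnp.log(2 * jnp.pi * yerr**2))
    return -0.5 * (chi2 + norm)


def gaussian_log_likelihood(resid, cov):
    # General multivariate Gaussian: r^T C^{-1} r plus log det(2 pi C),
    # both computed from a Cholesky factorization of C.
    chol = jnp.linalg.cholesky(cov)
    alpha = solve_triangular(chol, resid, lower=True)
    log_det = 2 * jnp.sum(jnp.log(jnp.diag(chol)))
    n = resid.shape[0]
    return -0.5 * (alpha @ alpha + log_det + n * jnp.log(2 * jnp.pi))


def squared_exp_kernel(t, amp, scale):
    # Illustrative kernel choice (an assumption, not taken from the slides).
    dt = t[:, None] - t[None, :]
    return amp**2 * jnp.exp(-0.5 * (dt / scale) ** 2)


t = jnp.linspace(0.0, 10.0, 50)
yerr = 0.1 * jnp.ones_like(t)
resid = 0.05 * jnp.sin(t)  # stand-in for "data minus model"

# With a diagonal covariance the two expressions agree.
ll_chi2 = chi2_log_likelihood(resid, yerr)
ll_diag = gaussian_log_likelihood(resid, jnp.diag(yerr**2))

# The GP version adds off-diagonal terms that describe correlated noise.
cov_gp = squared_exp_kernel(t, 0.05, 1.0) + jnp.diag(yerr**2)
ll_gp = gaussian_log_likelihood(resid, cov_gp)
```

The only change between the two calls is the covariance matrix, which is the sense in which the GP likelihood can be dropped into code that currently evaluates chi-squared.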

Slide 18

Slide 18 text

more details: Aigrain & Foreman-Mackey (2023) arXiv:2209.08940

Slide 19

Slide 19 text

No content

Slide 20

Slide 20 text

[1] model building
[2] computational cost

Slide 21

Slide 21 text

reference: Luger, DFM, Hedges (2021)

Slide 22

Slide 22 text

[2] computational cost

Slide 23

Slide 23 text

[1] bigger/better computers
[2] exploit matrix structure
[3] approximate linear algebra
[4] etc.
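
As one illustration of option [2], here is a sketch under stated assumptions: the slides do not say which matrix structure is meant, and this is not the algorithm from the papers referenced in the talk. If the covariance has the form diag(d) + U U^T with U of shape (N, K), the Woodbury identity replaces the O(N^3) dense solve with an O(N K^2) one. The function names and the random test problem are hypothetical.

```python
import jax
import jax.numpy as jnp

jax.config.update("jax_enable_x64", True)


def solve_dense(d, U, y):
    # Baseline: build the full N x N matrix and solve it, O(N^3).
    cov = jnp.diag(d) + U @ U.T
    return jnp.linalg.solve(cov, y)


def solve_woodbury(d, U, y):
    # Woodbury identity for C = D + U U^T with D = diag(d):
    #   C^{-1} y = D^{-1} y - D^{-1} U (I + U^T D^{-1} U)^{-1} U^T D^{-1} y
    # Only a K x K system is solved, so the cost is O(N K^2).
    Dinv_y = y / d
    Dinv_U = U / d[:, None]
    small = jnp.eye(U.shape[1]) + U.T @ Dinv_U
    return Dinv_y - Dinv_U @ jnp.linalg.solve(small, U.T @ Dinv_y)


key_u, key_y = jax.random.split(jax.random.PRNGKey(0))
N, K = 200, 3
d = 0.5 * jnp.ones(N)
U = jax.random.normal(key_u, (N, K))
y = jax.random.normal(key_y, (N,))

x_dense = solve_dense(d, U, y)
x_fast = solve_woodbury(d, U, y)
```

Both functions return the same vector up to round-off; only the second avoids forming and factorizing the N x N matrix.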

Slide 24

Slide 24 text

1 3 2

Slide 25

Slide 25 text

No content

Slide 26

Slide 26 text

No content

Slide 27

Slide 27 text

1 3 2

Slide 28

Slide 28 text

[figure panels: raw [ppt] and de-trended [ppt] vs. time [days]; N = 1000]
reference: DFM+ (2017)

Slide 29

Slide 29 text

reference: Gordon, Agol, DFM (2020) / tinygp.readthedocs.io

Slide 30

Slide 30 text

* a Gaussian Process is a drop-in replacement for chi-squared
* model building & computational cost are (solvable!) challenges
* you should check out tinygp!

Slide 31

Slide 31 text

2. case study: probabilistic inference

Slide 32

Slide 32 text

have: physics => data

Slide 33

Slide 33 text

want: data => physics

Slide 34

Slide 34 text

[1] physical models
[2] legacy code

Slide 35

Slide 35 text

No content

Slide 36

Slide 36 text

[figure axes: number of parameters (a few, tenish, not outrageously many) and patience required]
reference: DFM (priv. comm.)

Slide 37

Slide 37 text

[figure axes: number of parameters (a few, tenish, not outrageously many) and patience required; curve label: emcee]
reference: DFM (priv. comm.)

Slide 38

Slide 38 text

[figure axes: number of parameters (a few, tenish, not outrageously many) and patience required; curve labels: emcee, how things should be]
reference: DFM (priv. comm.)

Slide 39

Slide 39 text

No content

Slide 40

Slide 40 text

No content

Slide 41

Slide 41 text

No content

Slide 42

Slide 42 text

No content

Slide 43

Slide 43 text

[figure: WASP-39b / NIRSpec, Transit Depth [%] vs. Wavelength [micron]; legend: Alderson et al. 2023, Joint Fit (N = 50)]
reference: Soichiro Hattori, Ruth Angus, DFM, ... (in prep)

Slide 44

Slide 44 text

[figure: showing 23 of the 404 parameters (8 per channel + 4 shared)]
reference: Soichiro Hattori, Ruth Angus, DFM, ... (in prep)

Slide 45

Slide 45 text

how?

Slide 46

Slide 46 text

d(physics => data) / d(physics)

Slide 47

Slide 47 text

automatic differentiation aka “backpropagation”
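
A minimal sketch of what this looks like in practice, assuming a toy straight-line forward model that is not from the slides: `jax.grad` turns a function that computes a scalar from the parameters into a function that computes its gradient, with no hand-written derivative code. The model, data, and names below are hypothetical.

```python
import jax
import jax.numpy as jnp


def forward_model(params, t):
    # Toy "physics => data" map: a straight line (hypothetical example).
    slope, intercept = params
    return slope * t + intercept


def neg_log_likelihood(params, t, y, yerr):
    # Chi-squared style objective, up to an additive constant.
    resid = (y - forward_model(params, t)) / yerr
    return 0.5 * jnp.sum(resid**2)


# jax.grad differentiates with respect to the first argument by default.
grad_fn = jax.grad(neg_log_likelihood)

t = jnp.array([0.0, 1.0, 2.0, 3.0])
y = jnp.array([1.0, 3.0, 5.0, 7.0])
yerr = jnp.ones_like(t)
params = jnp.array([1.0, 0.0])

grad_auto = grad_fn(params, t, y, yerr)

# Hand-derived gradient of the same objective, for comparison.
r = (y - forward_model(params, t)) / yerr**2
grad_manual = jnp.array([-jnp.sum(r * t), -jnp.sum(r)])
```

The two gradients agree; the point of autodiff is that the hand-derived version never has to be written, even when the forward model is far more complicated than a line.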

Slide 48

Slide 48 text

No content

Slide 49

Slide 49 text

[1] physical models
[2] legacy code

Slide 50

Slide 50 text

[1] domain-specific libraries
[2] emulation

Slide 51

Slide 51 text

No content

Slide 52

Slide 52 text

* gradient-based inference using autodiff can improve efficiency
* there are practical challenges with these methods in astro
* of interest: domain-specific libraries & emulation
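
To make the first point concrete, here is a minimal sketch of one gradient-based method, Hamiltonian Monte Carlo, with gradients supplied by `jax.grad`. The slides do not name a specific sampler, so the choice of HMC, the toy Gaussian target, and the step-size settings are all assumptions made for illustration.

```python
import jax
import jax.numpy as jnp


def log_prob(q):
    # Toy target: a standard normal in 10 dimensions (illustrative only).
    return -0.5 * jnp.sum(q**2)


grad_log_prob = jax.grad(log_prob)


def leapfrog(q, p, step_size, num_steps):
    # Integrate Hamiltonian dynamics using gradients of log_prob.
    p = p + 0.5 * step_size * grad_log_prob(q)
    for _ in range(num_steps - 1):
        q = q + step_size * p
        p = p + step_size * grad_log_prob(q)
    q = q + step_size * p
    p = p + 0.5 * step_size * grad_log_prob(q)
    return q, p


def hmc_step(key, q, step_size=0.1, num_steps=20):
    key_p, key_u = jax.random.split(key)
    p = jax.random.normal(key_p, q.shape)
    q_new, p_new = leapfrog(q, p, step_size, num_steps)
    # Metropolis accept/reject on the change in total energy.
    h_old = -log_prob(q) + 0.5 * jnp.sum(p**2)
    h_new = -log_prob(q_new) + 0.5 * jnp.sum(p_new**2)
    accept = jnp.log(jax.random.uniform(key_u)) < h_old - h_new
    return jnp.where(accept, q_new, q), accept


hmc_step_jit = jax.jit(hmc_step)

key = jax.random.PRNGKey(1)
q = jnp.zeros(10)
num_accepted = 0
for _ in range(50):
    key, subkey = jax.random.split(key)
    q, accepted = hmc_step_jit(subkey, q)
    num_accepted += int(accepted)
```

Each proposal uses the gradient to move a long way through parameter space while keeping a high acceptance rate, which is where the efficiency gain over gradient-free proposals comes from as the number of parameters grows.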

Slide 53

Slide 53 text

3. aside: JAX

Slide 54

Slide 54 text

No content

Slide 55

Slide 55 text

import numpy as np

def linear_least_squares(x, y):
    A = np.vander(x, 2)
    return np.linalg.lstsq(A, y)[0]

Slide 56

Slide 56 text

import jax.numpy as jnp

def linear_least_squares(x, y):
    A = jnp.vander(x, 2)
    return jnp.linalg.lstsq(A, y)[0]
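
Because the JAX version is built from `jax.numpy` operations, it can be passed to JAX transformations. The sketch below compiles the slide's function with `jax.jit` and differentiates the fitted coefficients with respect to the data with `jax.jacfwd`. The example data and variable names are assumptions added for illustration.

```python
import jax
import jax.numpy as jnp


def linear_least_squares(x, y):
    # Columns of A are [x, 1], so the result is [slope, intercept].
    A = jnp.vander(x, 2)
    return jnp.linalg.lstsq(A, y)[0]


x = jnp.array([0.0, 1.0, 2.0, 3.0])
y = jnp.array([1.0, 3.2, 4.9, 7.1])  # made-up data near y = 2 x + 1

# Just-in-time compile the fit.
fit = jax.jit(linear_least_squares)
coeffs = fit(x, y)

# Sensitivity of [slope, intercept] to each data point y_i.
dcoeffs_dy = jax.jacfwd(linear_least_squares, argnums=1)(x, y)

# For a straight-line fit:
#   d(slope)/dy_i = (x_i - mean(x)) / sum((x - mean(x))^2)
expected_slope_sensitivity = (x - x.mean()) / jnp.sum((x - x.mean()) ** 2)
```

The first row of `dcoeffs_dy` matches the closed-form sensitivity of the slope, which is a quick check that the derivative passes through the least-squares solve correctly.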

Slide 57

Slide 57 text

No content

Slide 58

Slide 58 text

4. open research practices

Slide 59

Slide 59 text

No content

Slide 60

Slide 60 text

No content

Slide 61

Slide 61 text

No content

Slide 62

Slide 62 text

No content

Slide 63

Slide 63 text

No content

Slide 64

Slide 64 text

No content

Slide 65

Slide 65 text

No content

Slide 66

Slide 66 text

* open software is foundational to astrophysics research
* there are opportunities at the interface of astro & applied fields
* there are ways you can participate & benefit right away

Slide 67

Slide 67 text

I want to chat about…
[1] your data analysis problems
[2] building astronomical software
[3] writing documentation & tutorials

Slide 68

Slide 68 text

get in touch! dfm.io github.com/dfm