Spectral Timing, Bayesian Inference and You. Daniela Huppenkothen, SRON Netherlands Institute for Space Research (Tiana_Athriel, dhuppenkothen)

These slides: dhuppenkothen/spectral-timing-bayesian-inference-and-you Documentation: Tutorials: notebooks Iconic 60s Theme Song:

How do black holes accrete matter?

• General Relativity • plasma physics + magnetic fields • jet physics + particle acceleration • stellar winds

Spectral States. [Figure: light curves (time [s]) and energy spectra (photon energy [keV] versus photon energy × flux [detector units]) in the “soft state” and the “hard state”; axis labels: high-frequency/low-frequency X-ray brightness, total brightness. Malzac 2008. Credit: Sera Markoff]

Variability: GX 339-4. [Figure: light curve and its Fast Fourier Transform, with labelled components: stochastic variability (red noise), quasi-periodic oscillation (QPO), white noise. Yamaoka et al. (2010); Huppenkothen et al. (2019)]
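The step from a light curve to a periodogram can be sketched in a few lines. This is a minimal illustration, not the stingray implementation: it uses a direct (slow) discrete Fourier transform from the Python standard library so that it runs without dependencies, the Leahy normalisation (in which pure Poisson noise averages to a power of 2), and simulated data with an assumed 8 Hz sinusoidal signal. All names (`leahy_periodogram`, `poisson`) and all numbers are chosen for this sketch.

```python
import cmath
import math
import random


def poisson(rng, lam):
    """Draw one Poisson variate with Knuth's algorithm (fine for small rates)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def leahy_periodogram(counts, dt):
    """Return (frequencies, powers) for a binned light curve.

    Powers are Leahy-normalised: P = 2 |a_k|^2 / N_photons.
    Only positive frequencies below the Nyquist frequency are returned.
    """
    n = len(counts)
    total = sum(counts)
    freqs, powers = [], []
    for k in range(1, n // 2):
        a_k = sum(c * cmath.exp(-2j * math.pi * k * j / n)
                  for j, c in enumerate(counts))
        freqs.append(k / (n * dt))
        powers.append(2.0 * abs(a_k) ** 2 / total)
    return freqs, powers


# Simulated light curve (illustrative values): 4 s at 1/64 s resolution,
# mean of 20 counts per bin, modulated by a sinusoid at 8 Hz.
rng = random.Random(42)
dt = 1.0 / 64.0
n_bins = 256
f_signal = 8.0
counts = [
    poisson(rng, 20.0 * (1.0 + 0.5 * math.sin(2.0 * math.pi * f_signal * j * dt)))
    for j in range(n_bins)
]

freqs, powers = leahy_periodogram(counts, dt)
peak_freq = freqs[powers.index(max(powers))]
noise = [p for f, p in zip(freqs, powers) if f != peak_freq]
mean_noise_power = sum(noise) / len(noise)
print(f"peak at {peak_freq:.2f} Hz, mean noise power {mean_noise_power:.2f}")
```

For real data one would use a fast Fourier transform and average over many segments; the point here is only the normalisation and the mapping from bin index to frequency.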

Time Lags: temporal variations and energy spectra are intricately linked. [Figure, Cygnus X-1 (Nowak, 2000): time lag (8–13 keV relative to 2–4 keV) versus frequency for a hard state observation of Cyg X-1 obtained by RXTE in December 1996. The trend can be very roughly approximated with a power law of slope −0.7, but note the clear step-like features, which correspond roughly to different Lorentzian features in the power spectrum.] [Figures, 1H0707-495 (Uttley et al., 2014): the ratio spectrum of 1H0707-495 to a continuum model (Fabian et al. 2009) versus energy (keV); the broad iron K and iron L bands are clearly evident in the data. The origin of the soft excess below 1 keV in this source had been debatable, but in this work was found to be dominated by relativistically broadened emission lines. Also shown: the frequency-dependent lags (lag [s] versus temporal frequency [Hz]) in 1H0707-495 between the continuum-dominated hard band at 1–4 keV and the reflection-dominated soft band at 0.3–1 keV.]
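How a time lag is read off two light curves can be sketched as follows. The cross spectrum of a soft-band and a hard-band light curve has a phase at each Fourier frequency, and dividing that phase by 2πf converts it to a time lag. This is a noise-free toy example with an assumed 2 Hz signal and an assumed 0.05 s delay in the hard band; the sign convention used here (positive lag means the hard band lags the soft band) is one choice among several, and software packages differ on it. Function names are hypothetical.

```python
import cmath
import math


def dft_bin(x, k):
    """Discrete Fourier transform of the sequence x at frequency index k."""
    n = len(x)
    return sum(v * cmath.exp(-2j * math.pi * k * j / n) for j, v in enumerate(x))


def time_lag(soft, hard, k, dt):
    """Return (frequency, time lag) at Fourier index k.

    The cross spectrum is taken as S(f) * conj(H(f)), so a positive lag
    means that the hard band lags behind the soft band.
    """
    n = len(soft)
    freq = k / (n * dt)
    cross = dft_bin(soft, k) * dft_bin(hard, k).conjugate()
    phase_lag = cmath.phase(cross)          # radians, in (-pi, pi]
    return freq, phase_lag / (2.0 * math.pi * freq)


# Toy light curves (illustrative values): the hard band is a copy of the
# soft band, delayed by 0.05 s.
dt = 1.0 / 32.0
n_bins = 128
f0 = 2.0
true_lag = 0.05
times = [j * dt for j in range(n_bins)]
soft = [10.0 + math.sin(2.0 * math.pi * f0 * t) for t in times]
hard = [10.0 + math.sin(2.0 * math.pi * f0 * (t - true_lag)) for t in times]

k = round(f0 * n_bins * dt)                 # Fourier index of the 2 Hz signal
freq, lag = time_lag(soft, hard, k, dt)
print(f"lag at {freq:.1f} Hz: {lag:.3f} s")
```

Because the phase is only defined modulo 2π, lags longer than half a period at a given frequency wrap around; real analyses also average the cross spectrum over segments and frequencies before taking the phase.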

Problem: infer accretion physics from spectral timing representations statistics! stingray!

Why Stingray? (I needed some software to do timing)

• Limited functionality • Last updated 2009 (?) • Designed for spectroscopy • has some timing capabilities • + closed-source packages maintained by individual groups

Huppenkothen et al. (2019) • spectral timing classes + functions • building blocks • simulation, pulsars, modeling • command-line interface built on stingray • quick-look (spectral) timing analysis • graphical user interface • interactive data analysis

Top-level functionality: • Events • Lightcurve • Powerspectrum / Crossspectrum • AveragedPowerspectrum / AveragedCrossspectrum • Crosscorrelation / Autocorrelation • Bispectrum • Lag-Energy Spectra • supporting functionality: statistics, good time intervals, I/O. Sub-modules: • pulse • modeling • simulator • deadtime

Where to get help join the slack!

Current + Future Work • rework the modeling interface + autodiff: current GSoC project • multi-taper periodograms: current GSoC project • update to DAVE GUI: current GSoC project • fix bugs (there are always bugs) • improve API (aka: what were we thinking?!) • improve documentation (there is never enough documentation) • performance + memory optimization • better integration with current X-ray missions • better integration with spectral modeling packages • better integration with astropy.timeseries and lightkurve • higher-order Fourier products

Get Involved! Stingray is a project for the community, but there’s only so much we as maintainers can do …

How to get involved (please do!) • find bugs (and report them as a GitHub Issue) • fix bugs (as a GitHub Pull Request) • make feature requests (also via GitHub Issue) • implement new features (also via GitHub Pull Request) • test documentation/tutorials (and report mistakes/fix bugs etc) • … Don’t know where to start? • “Good First Issue” tag on GitHub • join the slack + ask us! We’ll help :)

Please cite the ApJ paper if you use stingray!

Statistical Inference

All models are wrong, but some are useful. — George Box

Nature is complex!

… so is our data (collection)!

[Figure: brightness versus energy [keV]; labelled components: corona, reflection, iron line]

Maximum Likelihood Estimation “How probable is it that my data D came from a model M with parameters θ?”

$$\log p(D|\theta) = \log\left(\prod_{i=1}^{n} p(d_i|\theta)\right) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\left(d_i - f(x_i,\theta)\right)^2$$
“chi-square fitting” caution: this is not a probability of the parameters
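The Gaussian log-likelihood on this slide translates directly into code. The sketch below assumes the simplest possible model, a constant f(x, θ) = θ, a known σ, and a handful of made-up data points; it checks that the closed-form expression equals the sum of the individual log-densities and that the maximum-likelihood value coincides with the least-squares (chi-square) solution, here the sample mean. All names and numbers are illustrative.

```python
import math


def gaussian_loglike(data, model, sigma):
    """log p(D|theta) for independent Gaussian errors with known sigma."""
    n = len(data)
    sum_sq = sum((d - m) ** 2 for d, m in zip(data, model))
    return -0.5 * n * math.log(2.0 * math.pi * sigma ** 2) - sum_sq / (2.0 * sigma ** 2)


def log_normal_pdf(x, mean, sigma):
    """Log-density of a single Gaussian measurement."""
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - 0.5 * ((x - mean) / sigma) ** 2


# Made-up measurements and a known uncertainty, for illustration only.
data = [4.8, 5.3, 5.1, 4.9, 5.4]
sigma = 0.3

# Constant model f(x, theta) = theta, evaluated on a grid of theta values.
thetas = [4.0 + 0.01 * i for i in range(201)]
loglikes = [gaussian_loglike(data, [theta] * len(data), sigma) for theta in thetas]
theta_mle = thetas[loglikes.index(max(loglikes))]

# The closed form agrees with summing the log-density of each data point.
direct = sum(log_normal_pdf(d, theta_mle, sigma) for d in data)
closed = gaussian_loglike(data, [theta_mle] * len(data), sigma)

sample_mean = sum(data) / len(data)
print(f"MLE = {theta_mle:.2f}, sample mean = {sample_mean:.2f}")
```

Note that the curve traced out by `loglikes` is a function of θ, but it is not a probability distribution over θ: it does not integrate to one and contains no prior.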

When this might not be enough … • your likelihood is not unimodal • your likelihood might be skewed or otherwise complex • you might have useful prior information • you might be interested in the properties of a population • you might have a complex model/data collection that you can simulate, but not write down an analytical function for

Complex/Weird Likelihoods

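Useful Prior Information. With D = data and θ = parameters, the likelihood is $p(D|\theta) = \prod_{i=1}^{n} p(d_i|\theta)$, the prior is $p(\theta)$, and the posterior is $p(\theta|D) \propto p(\theta) \prod_{i=1}^{n} p(d_i|\theta)$. You can do this with stingray.modeling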
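To make the role of the prior concrete, here is a minimal sketch that evaluates the posterior on a grid. It does not use stingray.modeling; it assumes a Poisson likelihood for a constant count rate θ, a Gaussian prior centred on 6 with width 1, and five made-up count measurements. With these choices the maximum-likelihood estimate is the sample mean (4), while the prior pulls the maximum-a-posteriori (MAP) estimate up to 5.

```python
import math


def log_likelihood(theta, counts):
    """Poisson log-likelihood: sum_i log p(d_i | theta)."""
    return sum(d * math.log(theta) - theta - math.lgamma(d + 1) for d in counts)


def log_prior(theta, mean=6.0, width=1.0):
    """Gaussian prior on theta (an assumption made for this sketch)."""
    return -0.5 * math.log(2.0 * math.pi * width ** 2) - 0.5 * ((theta - mean) / width) ** 2


# Made-up counts per time bin, for illustration only.
counts = [3, 5, 4, 6, 2]

thetas = [0.01 * i for i in range(1, 1001)]           # grid over (0, 10]
log_like = [log_likelihood(t, counts) for t in thetas]
log_post = [ll + log_prior(t) for ll, t in zip(log_like, thetas)]

theta_mle = thetas[log_like.index(max(log_like))]
theta_map = thetas[log_post.index(max(log_post))]

# Normalise on the grid so that the posterior weights sum to one.
peak = max(log_post)
weights = [math.exp(lp - peak) for lp in log_post]
norm = sum(weights)
posterior = [w / norm for w in weights]
posterior_mean = sum(t * p for t, p in zip(thetas, posterior))

print(f"MLE = {theta_mle:.2f}, MAP = {theta_map:.2f}, posterior mean = {posterior_mean:.2f}")
```

A grid only works for one or two parameters; for realistic models the same unnormalised log-posterior is handed to an optimiser or an MCMC sampler instead.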

Simulation-Based Inference

The Chandra ABC Guide to Pile-Up There are many effects or models that you can (and should) simulate, but that are hard to take into account in model fitting.

Yamaoka et al (2010); Huppenkothen et al. (2019) Variability: GX 339-4 Fast Fourier Transform We understand how to do this, unless the source is very bright

NuSTAR Credit: NuSTAR Observatory Guide Adapted from Chaplin et al (2012) dead time ms

without dead time with dead time
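Dead time is easy to simulate even though it is hard to write into a likelihood. The sketch below assumes a non-paralyzable detector: after each recorded photon the detector ignores everything for a fixed interval. The incident rate (500 counts/s) and the dead time (2.5 ms) are illustrative values chosen for this example, and the function names are hypothetical. For this model the recorded rate should approach r / (1 + r τ).

```python
import random


def simulate_events(rate, duration, rng):
    """Poisson arrival times: exponential waiting times with the given rate."""
    t, events = 0.0, []
    while True:
        t += rng.expovariate(rate)
        if t >= duration:
            return events
        events.append(t)


def apply_dead_time(events, dead_time):
    """Keep an event only if at least dead_time has passed since the
    last *recorded* event (non-paralyzable dead time)."""
    recorded = []
    last = float("-inf")
    for t in events:
        if t - last >= dead_time:
            recorded.append(t)
            last = t
    return recorded


rng = random.Random(7)
incident_rate = 500.0      # counts per second (assumed)
dead_time = 0.0025         # seconds (assumed)
duration = 100.0           # seconds

events = simulate_events(incident_rate, duration, rng)
recorded = apply_dead_time(events, dead_time)

recorded_rate = len(recorded) / duration
expected_rate = incident_rate / (1.0 + incident_rate * dead_time)
min_gap = min(b - a for a, b in zip(recorded, recorded[1:]))
print(f"incident {len(events) / duration:.1f}/s, recorded {recorded_rate:.1f}/s, "
      f"expected {expected_rate:.1f}/s")
```

Besides lowering the count rate, the enforced minimum gap makes the recorded events no longer Poissonian, which is what distorts the noise level of the periodogram for bright sources.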

Problem: (Bayesian) parameter inference. [Figure: Huppenkothen et al. (2017), The Astrophysical Journal, 834:90, Figure 5. Left panel: averaged periodogram of the part of Chandra observation 17696 containing the QPOs at 73 mHz and 1.03 Hz. Right panel: periodogram of the two Fermi/GBM triggers simultaneous with the Chandra data. The MAP model, with four (Chandra) or two (Fermi/GBM) Lorentzian components, is shown in purple.]

$p(\theta|D) \propto p(D|\theta)\,p(\theta)$, with D = data and θ = parameters: posterior ∝ likelihood × prior. The likelihood is intractable! ☹

simulation-based inference*: replace an intractable likelihood by a (physics) simulator. *Also known as: Approximate Bayesian Computation, Likelihood-Free Inference

Step 1: draw parameters from prior. Step 2: simulate data sets (simulator). Step 3: compare simulated to observed data. Step 4: keep parameters that produce simulations similar to the data. (Sadegh + Vrugt, 2014; see also: Brehmer et al (2018a, 2018b), Cranmer et al (2020))
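The four steps above can be sketched as a rejection sampler. This is a toy example under stated assumptions: the "simulator" draws Poisson counts with a constant rate θ, the prior is uniform between 0 and 20, the summary statistic is the mean count, and the "observed" data are themselves simulated with θ = 7. Real applications need a physically meaningful simulator and carefully chosen summaries; all names here are hypothetical.

```python
import math
import random


def poisson(rng, lam):
    """Draw one Poisson variate with Knuth's algorithm (fine for small rates)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulator(theta, n_bins, rng):
    """Toy simulator: Poisson counts with constant rate theta in each bin."""
    return [poisson(rng, theta) for _ in range(n_bins)]


def summary(counts):
    """Summary statistic used to compare data sets: the mean count."""
    return sum(counts) / len(counts)


def abc_rejection(observed, n_draws, epsilon, rng, prior_low=0.0, prior_high=20.0):
    target = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(prior_low, prior_high)           # step 1: draw from prior
        simulated = simulator(theta, len(observed), rng)     # step 2: simulate data
        distance = abs(summary(simulated) - target)          # step 3: compare
        if distance < epsilon:                               # step 4: keep if similar
            accepted.append(theta)
    return accepted


rng = random.Random(1)
observed = simulator(7.0, 20, rng)          # stand-in for real data
accepted = abc_rejection(observed, n_draws=5000, epsilon=0.2, rng=rng)
posterior_mean = sum(accepted) / len(accepted)
print(f"kept {len(accepted)} of 5000 draws, posterior mean = {posterior_mean:.2f}")
```

The accepted values of θ are approximate posterior samples. Shrinking `epsilon` improves the approximation but lowers the acceptance rate, which is the inefficiency that neural posterior estimation is designed to avoid.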

Tejero-Cantero et al (2020), Greenberg et al (2019) (Sequential) Neural Posterior Estimation

Accurate Timing with SBI: Simulated Data. Huppenkothen & Bachetti (under review)

Figure 17. Marginalized posterior probability distributions for the red noise Huppenkothen & Bachetti (under review)

GRS 1915+105 Huppenkothen & Bachetti (under review)

Huppenkothen & Bachetti (under review)

GRS 1915+105 Huppenkothen & Bachetti (under review)

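What are the properties of the evolution of the QPO centroid frequency?
$$p(\theta|D) \propto \left[\prod_{i=1}^{n} p(D_i|\theta)\right] p(\theta)$$
$$p(\{\theta_i\}_{i=1}^{n}, \alpha \,|\, D) \propto \prod_{i=1}^{n} \left[p(D_i|\theta_i)\, p(\theta_i|\alpha)\right] p(\alpha)$$
Hierarchical Bayesian Modeling is ideal for population-level problems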
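The hierarchical posterior can be written as a single log-probability function. The sketch below is a toy version under stated assumptions: each observation i yields a measured QPO centroid frequency D_i with a known Gaussian uncertainty, the true frequencies θ_i are drawn from a Gaussian population with hyperparameters α = (μ, τ), and the hyperprior is flat in μ between 0 and 100 Hz and half-normal in τ. The data values, the distributions, and the function names are all illustrative, not taken from the analysis shown in the slides.

```python
import math


def log_normal_pdf(x, mean, sd):
    """Log-density of a Gaussian with the given mean and standard deviation."""
    return -0.5 * math.log(2.0 * math.pi * sd ** 2) - 0.5 * ((x - mean) / sd) ** 2


def log_hyperprior(mu, tau):
    """log p(alpha), unnormalised: flat in mu on (0, 100), half-normal in tau."""
    if not (0.0 < mu < 100.0) or tau <= 0.0:
        return float("-inf")
    return -0.5 * (tau / 5.0) ** 2


def log_posterior(thetas, mu, tau, data, errors):
    """log p({theta_i}, alpha | D) up to an additive constant."""
    lp = log_hyperprior(mu, tau)
    if lp == float("-inf"):
        return lp
    for d, err, theta in zip(data, errors, thetas):
        lp += log_normal_pdf(d, theta, err)     # log p(D_i | theta_i)
        lp += log_normal_pdf(theta, mu, tau)    # log p(theta_i | alpha)
    return lp


# Made-up centroid frequencies (Hz) and uncertainties, for illustration only.
data = [2.1, 2.6, 3.0, 3.4]
errors = [0.1, 0.1, 0.1, 0.1]
thetas = list(data)                              # start the thetas at the measurements

lp_good = log_posterior(thetas, mu=2.775, tau=0.5, data=data, errors=errors)
lp_bad = log_posterior(thetas, mu=50.0, tau=0.5, data=data, errors=errors)
lp_invalid = log_posterior(thetas, mu=2.775, tau=-1.0, data=data, errors=errors)
print(f"log posterior: plausible {lp_good:.2f}, implausible {lp_bad:.2f}")
```

A sampler explores the individual θ_i and the population parameters (μ, τ) jointly, so each observation is informed by all the others through α.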

Things I’m curious about • Where else should we be using simulation-based inference? • What do you think are the next big challenges in spectral timing? • What population-level problems should we be addressing with hierarchical inference?

Conclusions • Use stingray! It’s fun! • Also, help us build stingray! That’s also fun! • Bayesian statistics allows you to go beyond standard fitting problems to take into account • SBI enables principled inference when models are complex, numerical and/or generate data • New statistical methods and software tools will set us up for answering complex questions with complex models, using current and future instrumentation

Questions? Comments? Complaints?