Data Science for X-ray Timing

In this talk I summarize current challenges in high-energy astronomy related to current and future data sets at both optical and X-ray wavelengths, and propose some possible solutions for dealing with complex data sets, in particular unevenly sampled data. I also touch upon machine learning methods and how they can help us understand the complicated time series we observe from black holes and neutron stars.

Daniela Huppenkothen

July 21, 2018

Transcript

  1. (Bayesian) Inference for X-ray Timing. Daniela Huppenkothen (DIRAC, University of Washington; eScience Institute, University of Washington)
  2. (Bayesian) Inference for X-ray Timing. Daniela Huppenkothen (DIRAC, University of Washington; eScience Institute, University of Washington). Data Science
  3. • periodograms, cross spectra • time lags • coherence • bispectra, bicoherence • covariance spectra • lag-energy spectra • …
  4. • periodograms, cross spectra • time lags • coherence • bispectra, bicoherence • covariance spectra • lag-energy spectra • … (collectively: Fourier methods)
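
As a minimal sketch of two of the Fourier products listed on this slide, the snippet below computes a cross spectrum and the time lags derived from it for two evenly sampled light curves. The function names, the sign convention (a positive lag means the second series trails the first), and the lack of any normalization are illustrative assumptions, not taken from the talk.

```python
import numpy as np


def cross_spectrum(x, y, dt):
    """Return positive Fourier frequencies and the cross spectrum of x and y.

    x and y are evenly sampled series of equal length with time step dt.
    The zero-frequency term is dropped; no normalization is applied.
    """
    freqs = np.fft.rfftfreq(len(x), d=dt)[1:]
    fx = np.fft.rfft(x - np.mean(x))[1:]
    fy = np.fft.rfft(y - np.mean(y))[1:]
    return freqs, np.conj(fx) * fy


def time_lag(freqs, cross):
    """Convert the phase of the cross spectrum into a time lag per frequency.

    With the convention used in cross_spectrum, a positive value means
    that y lags behind x. Lags are only unambiguous while the phase
    stays within (-pi, pi).
    """
    return -np.angle(cross) / (2.0 * np.pi * freqs)
```

Passing the same series twice, `cross_spectrum(x, x, dt)`, returns its unnormalized periodogram, since the cross spectrum of a series with itself is the squared Fourier amplitude.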
  5. How do we deal with complex data? How do we deal with large volumes of data? Where (if at all) does machine learning fit in? How can we avoid reinventing the wheel? How can we make our research reproducible and open? How can we train the next generation of scientists?
  6. How do we deal with complex data? How do we deal with large volumes of data? Where (if at all) does machine learning fit in? How can we avoid reinventing the wheel? How can we make our research reproducible and open? How can we train the next generation of scientists?
  7. + extremely flexible + can produce many PSD shapes + can work with uneven sampling + deterministic and stochastic processes + probabilistic generative model − requires Gaussian uncertainties − computationally expensive, O(N^3). Rasmussen & Williams (2006)
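
Assuming this slide refers to Gaussian process regression (the subject of Rasmussen & Williams 2006), the following is a minimal sketch of how such a model handles unevenly sampled data with Gaussian uncertainties. The squared-exponential kernel, the fixed hyperparameters `amp` and `scale`, and the function names are assumptions chosen for illustration; the Cholesky factorization is the step responsible for the O(N^3) cost.

```python
import numpy as np


def sq_exp_kernel(t1, t2, amp, scale):
    """Squared-exponential covariance between two sets of time stamps."""
    dt = t1[:, None] - t2[None, :]
    return amp**2 * np.exp(-0.5 * (dt / scale) ** 2)


def gp_predict(t, y, yerr, t_new, amp=1.0, scale=1.0):
    """Zero-mean GP posterior mean and variance at the times t_new.

    t may be unevenly sampled; yerr holds the Gaussian 1-sigma
    uncertainties of the measurements y.
    """
    # Covariance of the observations: kernel plus measurement noise
    K = sq_exp_kernel(t, t, amp, scale) + np.diag(yerr**2)
    L = np.linalg.cholesky(K)  # O(N^3) in the number of data points
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))

    # Covariance between prediction times and observation times
    Ks = sq_exp_kernel(t_new, t, amp, scale)
    mean = Ks @ alpha

    # Posterior variance of the latent function (without measurement noise)
    v = np.linalg.solve(L, Ks.T)
    var = amp**2 - np.sum(v**2, axis=0)
    return mean, var
```

In practice the hyperparameters would be inferred from the data rather than fixed, and a general-purpose solver is used here instead of a dedicated triangular solver to keep the sketch dependent on NumPy only.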
  8. Continuous Auto-Regressive Moving Average Process. Kelly et al (2014), The Astrophysical Journal, 788:33 (18pp), 2014 June 10.

    Figure 2. PSD for the light curve shown in Figure 1. The true PSD is given by the solid black line, the periodogram by the orange circles, the PSD from the maximum-likelihood estimate assuming a CARMA(5, 1) model (chosen to minimize AICc) by the blue dashed line, and the blue region contains 95% of the probability on the PSD assuming a CARMA(5, 1) model. There is a −1 …

    Figure 4. Simulated light curve from a CARMA(5, 3) process irregularly sampled over three observing seasons. The black line denotes the true values, and the blue dots denote the measured values. Also shown are interpolated and forecasted values, based on the best-fitting CARMA(5, 1) process; a CARMA(5, 1) model had the minimum AICc value. The solid blue line and cyan region denote the expected value and 1σ error bands of the interpolated …
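
To illustrate how a continuous-time autoregressive model copes with irregular sampling, here is a minimal sketch that simulates the simplest member of the CARMA family, a CAR(1) process (also known as an Ornstein-Uhlenbeck process or damped random walk), at arbitrary time stamps. This is a simplified stand-in, not the CARMA(5, 3) or CARMA(5, 1) models from Kelly et al (2014); the function name and the parameters `tau` and `sigma` are assumptions made for the example.

```python
import numpy as np


def simulate_car1(times, tau, sigma, rng):
    """Draw a CAR(1) realization at the (possibly irregular) times given.

    The process obeys dx = -(x / tau) dt + sigma dW. Because its
    transition density is Gaussian and known in closed form, each step
    is exact regardless of the size of the gap between samples.
    """
    var_stat = 0.5 * sigma**2 * tau  # stationary variance of the process
    x = np.empty(len(times))
    x[0] = rng.normal(0.0, np.sqrt(var_stat))
    for i in range(1, len(times)):
        # Correlation with the previous sample decays with the time gap
        decay = np.exp(-(times[i] - times[i - 1]) / tau)
        step_std = np.sqrt(var_stat * (1.0 - decay**2))
        x[i] = decay * x[i - 1] + rng.normal(0.0, step_std)
    return x
```

The same closed-form transition is what makes likelihood evaluation for such models possible without interpolating the data onto a regular grid.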
  9. Dead Time: White Noise. Bachetti et al (2015), Huppenkothen & Bachetti (2018). The Astrophysical Journal, 800:109 (12pp), 2015 February 20.

    Figure 1 (panels a to d). Left: the cospectrum and the PDS are compared in the case of pure Poisson noise, without (a) and with … 225 cts s−1. The cospectrum mean is always zero. In these plots, it has been increased by two for display purposes …
  10. Dead Time: White Noise. Bachetti et al (2015), Huppenkothen & Bachetti (2018). The Astrophysical Journal, 800:109 (12pp), 2015 February 20.

    Figure 1 (panels a to d). Left: the cospectrum and the PDS are compared in the case of pure Poisson noise, without (a) and with … 225 cts s−1. The cospectrum mean is always zero. In these plots, it has been increased by two for display purposes …

    Caution! The power spectrum and the cospectrum do not have the same statistical distribution!
  11. Dead Time: White Noise. Huppenkothen & Bachetti (2018).

    Fig. 1. Distribution of Leahy-normalized cospectral densities (left) and power spectral densities (right), respectively, for the simulated data. In dark grey, we show fine-grained histograms of the simulated powers. In red we plot the theoretical probability distribution the simulated powers should follow: a Laplace distribution with µ = 0 and scale parameter = 2 for the cospectral densities and a χ²-distribution with 2 …
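
The difference between the two distributions can be checked numerically. The sketch below simulates pairs of independent, dead-time-free Poisson light curves (a simplifying assumption; the slides concern data affected by dead time) and collects Leahy-normalized powers and cospectral densities. The function names, segment sizes and count rates are illustrative choices, not values from the talk.

```python
import numpy as np


def leahy_power_and_cospectrum(counts_x, counts_y):
    """Leahy-normalized powers of counts_x and cospectrum of the pair."""
    # Drop the zero-frequency and Nyquist terms (even-length input assumed)
    fx = np.fft.rfft(counts_x)[1:-1]
    fy = np.fft.rfft(counts_y)[1:-1]
    n_x, n_y = counts_x.sum(), counts_y.sum()
    power = 2.0 * np.abs(fx) ** 2 / n_x
    cospectrum = 2.0 * np.real(np.conj(fx) * fy) / np.sqrt(n_x * n_y)
    return power, cospectrum


def simulate_white_noise(n_segments, n_bins, mean_counts, seed):
    """Collect powers and cospectral densities from pure Poisson noise."""
    rng = np.random.default_rng(seed)
    powers, cospectra = [], []
    for _ in range(n_segments):
        # Two independent detectors observing the same constant source
        x = rng.poisson(mean_counts, size=n_bins)
        y = rng.poisson(mean_counts, size=n_bins)
        p, c = leahy_power_and_cospectrum(x, y)
        powers.append(p)
        cospectra.append(c)
    return np.concatenate(powers), np.concatenate(cospectra)
```

Histograms of the two returned arrays show the qualitative difference: the powers are non-negative and scatter around 2, while the cospectral densities are symmetric about zero and take negative values about half the time.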
  12. What about stochastic variability? There is currently no closed-form statistical distribution for the cospectrum with red noise! :-(
  13. “Machine Learning, a subfield of computer science, involves the development of mathematical algorithms that discover knowledge from specific data sets, and then "learn" from the data in an iterative fashion that allows predictions to be made.” (NYAS Machine Learning Symposium)
  14. GRS 1915+105. Huppenkothen et al (2017).

    Figure: light curve of GRS 1915+105 in stacked panels covering MJD 50200 to 55600. x-axis: Time in MJD; y-axis: Count rate [counts/s], with ticks at 100, 200 and 300.
  15. GRS 1915+105. Belloni et al (2000), Klein-Wolt et al (2002), Hannikainen et al (2004). Page excerpts (pp. 274-275) from T. Belloni et al.: A model-independent analysis of the variability of GRS 1915+105.

    Fig. 2a-l. One example light curve and CD from each of the 12 classes described in the text. The light curves have a 1 s bin size, and the CDs correspond to the same points. The class name and the observation number are indicated on each panel.

    … quiet, high-variable and oscillating parts described in Belloni et al. (1997a). In the CD, a C-shaped distribution is evident, with the lower-right branch slightly detached from the rest, and corresponding to the low count rate intervals (typically a few hundred seconds long).

    • class κ: Very similar to the previous class are observations in class λ. The timing structure, as shown by Belloni et al. (1997b), is the same, only with shorter typical time scales (Fig. 2o,p). In the CD, an additional cloud between the two branches is visible (see Belloni et al. 1997b).
    • class ρ: Taam et al. (1997) and Vilhu & Nevalainen … in Vilhu & Nevalainen 1998, where data with lower time resolution were considered).
    • class ν: There are two main differences between observations in this class and those of class ρ. The first is that they are considerably more irregular in the light curve, and at times they show a long quiet interval, where the source moves to the right part of the CD (see Fig. 2s,t). The second is that, at … 1 s time resolution, the … of the ‘flares’, notably … (see Fig. 17b).
    • class α: Light curves of … in Fig. 2v) like those of classes ρ and ν, and the flare as a curved trail of soft (low HR2) points (Fig. 2u).
    • class β: This class shows complex behavior in the light curves, some of which can be seen within other classes. What identifies class β however, is the presence in the CD of a characteristic straight elongated branch stretching diagonally.

    The number of the classes presented above could be reduced, …
  16. GRS 1915+105. Machine learning. Belloni et al (2000), Klein-Wolt et al (2002), Hannikainen et al (2004). Page excerpts (pp. 274-275) from T. Belloni et al.: A model-independent analysis of the variability of GRS 1915+105.

    Fig. 2a-l. One example light curve and CD from each of the 12 classes described in the text. The light curves have a 1 s bin size, and the CDs correspond to the same points. The class name and the observation number are indicated on each panel.