and direction of trend
strength of seasonality
timing of peak seasonality
spectral entropy
Called “features” in the machine learning literature.

Biggish time series data: Visualizing many time series

John W Tukey: Cognostics. Computer-produced diagnostics (Tukey and Tukey, 1985).
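As a rough illustration of two of the features named above, the sketch below computes strength of trend and strength of seasonality from an STL decomposition in base R. The formulas used (one minus the ratio of remainder variance to the variance of the component plus remainder, floored at zero) are an assumed, commonly used definition; the slides do not state one. The function name `stl_strength` and the choice of `AirPassengers` as test data are illustrative only.

```r
# Strength of trend and seasonality from an STL decomposition.
# Assumed definitions (not stated in the source):
#   trend strength    = max(0, 1 - Var(R) / Var(T + R))
#   seasonal strength = max(0, 1 - Var(R) / Var(S + R))
stl_strength <- function(y) {
  fit <- stl(y, s.window = "periodic")
  comp <- fit$time.series
  seasonal <- comp[, "seasonal"]
  trend <- comp[, "trend"]
  remainder <- comp[, "remainder"]
  c(
    trend = max(0, 1 - var(remainder) / var(trend + remainder)),
    seasonal = max(0, 1 - var(remainder) / var(seasonal + remainder))
  )
}

# Example on a built-in monthly series with obvious trend and seasonality
strength <- stl_strength(log(AirPassengers))
print(round(strength, 3))
```

Both values lie between 0 and 1, so a collection of series can be compared on a common scale regardless of their units.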
space to:
- Generate new time series with similar features to existing series
- Generate new time series where there are “holes” in the feature space.

Let $\{PC_1, PC_2, \dots, PC_n\}$ be a “population” of time series of specified length and period.
A genetic algorithm uses a process of selection, crossover and mutation to evolve the population towards a target point $T_i$.
Optimize: $\text{Fitness}(PC_j) = -\lvert PC_j - T_i \rvert^2$.
Initial population random, with some series in the neighbourhood of $T_i$.
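The selection, crossover and mutation loop described above can be sketched in base R as follows. This is a toy version under several assumptions that are not in the slides: the two "features" (lag-1 autocorrelation and linear trend slope) stand in for the principal component coordinates, the initial population is purely random, the fitter half of each generation is carried over unchanged, and the names `features`, `fitness` and `evolve` are hypothetical.

```r
set.seed(1)

# Two illustrative features per series (hypothetical stand-ins for PC1, PC2):
# lag-1 autocorrelation and the slope of a linear trend.
features <- function(y) {
  n <- length(y)
  acf1 <- cor(y[-1], y[-n])
  slope <- unname(coef(lm(y ~ seq_len(n)))[2])
  c(acf1, slope)
}

# Fitness = -|PC_j - T_i|^2, as in the text
fitness <- function(y, target) -sum((features(y) - target)^2)

evolve <- function(pop, target, generations = 50, mut_sd = 0.1) {
  n_pop <- nrow(pop)   # assumed to be even
  n_obs <- ncol(pop)
  best <- numeric(generations)
  for (g in seq_len(generations)) {
    fit <- apply(pop, 1, fitness, target = target)
    # Selection: keep the fitter half unchanged
    keep <- order(fit, decreasing = TRUE)[seq_len(n_pop / 2)]
    parents <- pop[keep, , drop = FALSE]
    children <- parents
    for (i in seq_len(nrow(children))) {
      # Crossover: splice two randomly chosen parents at a random cut point
      p <- sample(nrow(parents), 2)
      cut_at <- sample(n_obs - 1, 1)
      child <- c(parents[p[1], 1:cut_at], parents[p[2], (cut_at + 1):n_obs])
      # Mutation: add small Gaussian noise
      children[i, ] <- child + rnorm(n_obs, sd = mut_sd)
    }
    pop <- rbind(parents, children)
    best[g] <- max(apply(pop, 1, fitness, target = target))
  }
  list(population = pop, best_fitness = best)
}

target <- c(0.8, 0.05)                      # target point T_i in feature space
pop0 <- matrix(rnorm(20 * 60), nrow = 20)   # 20 random series of length 60
result <- evolve(pop0, target)
print(range(result$best_fitness))
```

Because the best parents survive each generation, the best fitness never decreases; a fuller version would also enforce the specified seasonal period and seed part of the population near the target.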
Miles (2017). Visualising forecasting algorithm performance using time series instance spaces. International Journal of Forecasting 33(2), 345–358.
RJ Hyndman et al. (2017a). Mcomp: Data from the M-Competitions. Version 2.6. https://CRAN.R-project.org/package=Mcomp
one-hour intervals over one month.
Consisting of several server metrics (e.g. CPU usage and paging views) from many server farms globally.
Aim: find unusual (anomalous) time series.

Biggish time series data: Finding weird time series
$Y_{t-1})$
Strength of trend and seasonality based on STL
Trend linearity and curvature
Size of seasonal peak and trough
Spectral entropy
Lumpiness: variance of block variances (block size 24).
Spikiness: variance of leave-one-out variances of STL remainders.
Level shift: maximum difference in trimmed means of consecutive moving windows of size 24.
Variance change: maximum difference in variances of consecutive moving windows of size 24.
Flat spots: discretize sample space into 10 equal-sized intervals. Find max run length in any interval.
Number of crossing points of mean line.
Kullback-Leibler score: maximum of $D_{KL}(P \,\|\, Q) = \int P(x) \ln \frac{P(x)}{Q(x)} \, dx$, where $P$ and $Q$ are estimated by kernel density estimators applied to consecutive windows of size 48.
Change index: time of maximum KL score.
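Several of the simpler features in this list can be written in a few lines of base R. The sketch below is one possible reading of the definitions above, not the implementation used in the anomalous package: in particular, "consecutive moving windows" is interpreted here as two adjacent windows slid along the series, and the 10% trim level and function names are assumptions.

```r
# Variance of the variances of non-overlapping blocks
lumpiness <- function(y, width = 24) {
  n_blocks <- length(y) %/% width
  blocks <- matrix(y[seq_len(n_blocks * width)], nrow = width)
  var(apply(blocks, 2, var))
}

# Longest run of consecutive observations that fall in the same one of
# 10 equal-width intervals spanning the range of the data
flat_spots <- function(y, bins = 10) {
  interval <- cut(y, breaks = bins, labels = FALSE, include.lowest = TRUE)
  max(rle(interval)$lengths)
}

# Number of times the series crosses its mean
crossing_points <- function(y) {
  above <- y > mean(y)
  sum(above[-1] != above[-length(y)])
}

# Largest absolute difference between trimmed means of adjacent windows
level_shift <- function(y, width = 24, trim = 0.1) {
  starts <- seq_len(length(y) - 2 * width + 1)
  diffs <- vapply(starts, function(s) {
    left <- y[s:(s + width - 1)]
    right <- y[(s + width):(s + 2 * width - 1)]
    abs(mean(right, trim = trim) - mean(left, trim = trim))
  }, numeric(1))
  max(diffs)
}

# A step series: flat at 0 for 48 points, then flat at 5 for 48 points
y_step <- c(rep(0, 48), rep(5, 48))
print(c(
  lumpiness = lumpiness(y_step),
  flat_spots = flat_spots(y_step),
  crossing_points = crossing_points(y_step),
  level_shift = level_shift(y_step)
))
```

On the step series the block variances are all zero, the longest flat run is 48, the mean is crossed once, and the largest shift between adjacent windows equals the step height of 5.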
(2015). Large-scale unusual time series detection. In: Proceedings of the IEEE International Conference on Data Mining. Atlantic City, NJ, USA. 14–17 November 2015.
RJ Hyndman, E Wang, and N Laptev (2017). anomalous: Unusual Time Series Detection. Version 0.1.0. https://github.com/robjhyndman/anomalous
of time series that need forecasting at least monthly.
2. Forecasts are often required by people who are untrained in time series analysis.

Specifications
For a given time series, an automatic forecasting algorithm must:
- determine an appropriate time series model;
- estimate the parameters;
- compute the forecasts with prediction intervals.

Biggish time series data: Automatically forecasting time series
where $L$ is the model likelihood and $k$ is the number of estimated parameters in the model.
If $L$ is Gaussian, then $\text{AIC} \approx c + T \log \text{MSE} + 2k$, where $c$ is a constant, MSE is from one-step forecasts on the training set, and $T$ is the length of the series.
Minimizing the Gaussian AIC is asymptotically equivalent (as $T \to \infty$) to minimizing the MSE from one-step forecasts on a test set via time series cross-validation.
AICc is a bias-corrected small-sample version.
AICc is much faster than CV.
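The Gaussian form of the AIC can be checked numerically. The sketch below fits an AR(1) model by least squares to the built-in `lh` series (an illustrative choice, not from the slides), computes $c + T \log \text{MSE} + 2k$ by hand, and compares it with R's `AIC()`. Two assumptions are made: $k$ counts the error variance as a parameter, which is how `AIC()` counts for `lm` fits, and the AICc correction uses the standard formula $\text{AIC} + 2k(k+1)/(T-k-1)$, which the slides do not spell out.

```r
# AR(1) fitted by least squares to a built-in series (illustrative choice)
y <- as.numeric(lh)
n <- length(y)
dat <- data.frame(y = y[-1], y_lag1 = y[-n])
fit <- lm(y ~ y_lag1, data = dat)

T_eff <- nrow(dat)                     # number of one-step forecasts
mse <- mean(residuals(fit)^2)          # in-sample one-step MSE
k <- length(coef(fit)) + 1             # coefficients plus error variance

c_const <- T_eff * (1 + log(2 * pi))   # the constant c for a Gaussian likelihood
aic_manual <- c_const + T_eff * log(mse) + 2 * k

# Small-sample correction (standard formula, assumed here)
aicc_manual <- aic_manual + 2 * k * (k + 1) / (T_eff - k - 1)

print(c(manual = aic_manual, builtin = AIC(fit), aicc = aicc_manual))
```

For a least-squares fit like this one the hand-computed value matches `AIC(fit)` exactly. Computing it needs only one model fit, whereas time series cross-validation refits the model at many forecast origins.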
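[Figure: Monthly corticosteroid drug sales in Australia. x-axis: Year; y-axis: Total scripts (millions).]
library(fpp2)  # assumed source of the h02 data; also loads forecast and ggplot2
fit <- auto.arima(h02)   # automatic ARIMA model selection
fcast <- forecast(fit)   # point forecasts with prediction intervals
autoplot(fcast)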
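[Figure: Monthly corticosteroid drug sales in Australia. x-axis: Year; y-axis: Total scripts (millions).]
library(fpp2)  # assumed source of the h02 data; also loads forecast and ggplot2
fit <- ets(h02)          # automatic exponential smoothing (ETS) model selection
fcast <- forecast(fit)   # point forecasts with prediction intervals
autoplot(fcast)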
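[Figure: Forecasts from TBATS(1, {0,0}, 1, {<52.1785714285714,9>}). x-axis: Year; y-axis: Thousands of barrels per day.]
library(forecast)  # provides tbats(), forecast() and the gasoline data
fit <- tbats(gasoline)   # weekly data with non-integer seasonal period
fcast <- forecast(fit)
plot(fcast)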
for heterogeneity
ARMA errors for short-term dynamics
Trend (possibly damped)
Seasonal (including multiple and non-integer periods)

Handles non-integer seasonality, regular multiple seasonal periods, but not public holidays.
Entirely automated
Prediction intervals often too wide
Very slow on long series
Combination of STL decomposition with ETS or ARIMA
NNETAR: recurrent ANN; inputs are lagged values and exogenous variables
Prophet (from Facebook): for daily and sub-daily time series. Handles public holiday effects.
FASSTER model (Fast Additive Seasonal Switching with Trend and Exogenous Regressors): generalization of structural model for multiple seasonal time series with switching seasonality. Handles public holiday effects, daily data, sub-daily data, weekly data, etc.
space framework for automatic forecasting using exponential smoothing methods. International Journal of Forecasting 18(3), 439–454.
RJ Hyndman and Y Khandakar (2008). Automatic time series forecasting: the forecast package for R. Journal of Statistical Software 26(3), 1–22.
AM De Livera, RJ Hyndman, and RD Snyder (2011). Forecasting time series with complex seasonal patterns using exponential smoothing. Journal of the American Statistical Association 106(496), 1513–1527.
RJ Hyndman et al. (2017b). forecast: Forecasting Functions for Time Series and Linear Models. Version 8.1. https://CRAN.R-project.org/package=forecast
period $t$ at age $x$, $t = 1, \dots, n$.

$$y_{t,x} = f_t(x) + \sigma_t(x)\,\varepsilon_{t,x}$$
$$f_t(x) = \mu(x) + \sum_{k=1}^{K} \beta_{t,k}\,\phi_k(x) + e_t(x)$$

Estimate $f_t(x)$ using penalized regression splines.
Estimate $\mu(x)$ as the mean of $f_t(x)$ across years.
Estimate $\beta_{t,k}$ and $\phi_k(x)$ using functional principal components.

Biggish time series data: Forecasting functional time series
The eigenfunctions $\phi_k(x)$ show the main regions of variation.
The scores $\{\beta_{t,k}\}$ are uncorrelated by construction.
So we can forecast each $\beta_{t,k}$ using a univariate time series model.
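The decompose, forecast and recombine idea can be sketched with ordinary principal components on discretized curves in base R. This is a simplified stand-in, not the method in the ftsa or demography packages: the curves are simulated rather than smoothed with penalized splines, `prcomp` replaces functional principal components, and a random walk with drift is used as the univariate model purely for illustration.

```r
set.seed(42)

# Simulated smooth curves f_t(x): rows are years t, columns are ages x
x <- seq(0, 1, length.out = 50)
n_years <- 30
beta1 <- cumsum(rnorm(n_years))            # slowly drifting score
beta2 <- rnorm(n_years, sd = 0.3)
curves <- outer(rep(1, n_years), sin(2 * pi * x)) +
  outer(beta1, x) + outer(beta2, cos(2 * pi * x))

# mu(x): mean curve across years
mu <- colMeans(curves)

# Principal components of the centred curves
pca <- prcomp(curves, center = TRUE, scale. = FALSE)
K <- 2
phi <- pca$rotation[, 1:K]                 # basis functions phi_k(x)
beta <- pca$x[, 1:K]                       # scores beta_{t,k}

# Forecast each score series one step ahead with a univariate model.
# A random walk with drift is used here purely for illustration.
beta_next <- apply(beta, 2, function(b) b[n_years] + mean(diff(b)))

# Recombine into a forecast curve for the next year
f_next <- mu + as.numeric(phi %*% beta_next)

print(round(cor(beta[, 1], beta[, 2]), 10))  # scores are uncorrelated
```

Because the score series are uncorrelated, each one can be modelled separately; in practice any univariate forecasting method could take the place of the drift forecast used here.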
[Figure: Australia: male death rates forecasts (2010 and 2059), with 80% prediction intervals. x-axis: Age; y-axis: Log death rate.]
forecasting of mortality and fertility rates: A functional data approach. Computational Statistics & Data Analysis 51(10), 4942–4956.
RJ Hyndman and HL Shang (2009). Forecasting functional time series (with discussion). Journal of the Korean Statistical Society 38(3), 199–221.
HL Shang and RJ Hyndman (2017). Grouped functional time series forecasting: an application to age-specific mortality rates. Journal of Computational and Graphical Statistics 26(2), 330–343.
RJ Hyndman and HL Shang (2017). ftsa: Functional Time Series Analysis. Version 4.8. https://CRAN.R-project.org/package=ftsa
RJH with contributions from Heather Booth, L Tickle, and J Maindonald (2017). demography: Forecasting Mortality, Fertility, Migration and Population Data. Version 1.20. https://CRAN.R-project.org/package=demography