Analysing sub-daily time series data

Talk given at the Melbourne Users of R Network

Rob J Hyndman

October 12, 2017

Transcript

  1. Pedestrian counts

    [Figure: Hourly pedestrian traffic at Southern Cross Station, two panels: a long view (axis ticks Jul 2016 to Apr 2017) and a zoomed view (axis ticks Apr 01 to Jun 01). x-axis: Date; y-axis: Pedestrians counted.]
  2. Pedestrian counts

    [Figure: Seasonality in pedestrian traffic at Southern Cross Station, with separate Weekday and Weekend panels. x-axis: Time (of day); y-axis: Total pedestrians counted.]
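A time-of-day profile like the one in this figure can be computed by averaging counts by hour, separately for weekdays and weekends. The following is a minimal base R sketch under stated assumptions: the data are simulated (not the real Southern Cross Station counts), and the names `ped`, `day_type` and `profile` are hypothetical.

```r
# Simulated hourly counts for four weeks, starting on a Monday.
# These are NOT the real pedestrian data; they only mimic commuter peaks.
set.seed(1)
time <- seq(as.POSIXct("2017-04-03 00:00", tz = "UTC"),
            by = "hour", length.out = 24 * 28)
hour <- as.integer(format(time, "%H"))   # hour of day, 0 to 23
wday <- as.integer(format(time, "%u"))   # 1 = Monday, ..., 7 = Sunday
day_type <- ifelse(wday >= 6, "Weekend", "Weekday")

# Assumed pattern: large peaks at 8am and 5pm on weekdays only
lambda <- ifelse(day_type == "Weekday" & hour %in% c(8, 17), 3000, 200)
count <- rpois(length(time), lambda = lambda)
ped <- data.frame(time, hour, day_type, count)

# Mean count for each hour of the day, by day type
profile <- aggregate(count ~ hour + day_type, data = ped, FUN = mean)
head(profile)
```

Plotting `count` against `hour` for each `day_type` in `profile` gives a two-panel seasonal profile of the same general shape as the figure.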
  3. Call volume

    [Figure: 5 minute call volume at North American bank, two panels: weeks 1 to 33 and weeks 1 to 4. x-axis: Weeks; y-axis: Call volume.]
  4. Electricity demand

    [Figure: Half-hourly electricity demand for Victoria, two panels: a long view (axis ticks Jan 2014 to Jan 2015) and a zoomed view (axis ticks Sep 01 to Nov 01). x-axis: Date; y-axis: Electricity demanded (GW).]
  5. Challenges

    Visualization: Even when plotting a single time series comprising one year of data, it is hard to see the interesting features.
  6. Challenges

    Visualization: Even when plotting a single time series comprising one year of data, it is hard to see the interesting features.

    R classes: The ts, zoo, xts and other time series classes do not work well with sub-daily data. Newer packages (timetk and tibbletime) do not play nicely with modelling functions.
  7. Challenges

    Visualization: Even when plotting a single time series comprising one year of data, it is hard to see the interesting features.

    R classes: The ts, zoo, xts and other time series classes do not work well with sub-daily data. Newer packages (timetk and tibbletime) do not play nicely with modelling functions.
  8. Challenges

    Visualization: Even when plotting a single time series comprising one year of data, it is hard to see the interesting features.

    R classes: The ts, zoo, xts and other time series classes do not work well with sub-daily data. Newer packages (timetk and tibbletime) do not play nicely with modelling functions.

    Forecasting: Most time series modelling frameworks handle sub-daily data poorly. Available models include tbats and prophet, but they have limitations.
  9. TBATS model

    TBATS stands for: Trigonometric terms for seasonality; Box-Cox transformations for heterogeneity; ARMA errors for short-term dynamics; Trend (possibly damped); Seasonal (including multiple and non-integer periods).

    It handles non-integer seasonality and multiple seasonal periods, and it is entirely automated. However, prediction intervals are often too wide, it is very slow on long series, and it allows no exogenous predictors.
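Two of the TBATS components listed above can be written down compactly. The block below is a sketch, and the notation is an assumption rather than something shown on the slide: $\omega$ is the Box-Cox parameter, $m_i$ is the $i$-th seasonal period, and $k_i$ is the number of harmonics used for that period.

```latex
% Box-Cox transformation of the observed series y_t
y_t^{(\omega)} =
\begin{cases}
  \dfrac{y_t^{\omega} - 1}{\omega}, & \omega \neq 0, \\[6pt]
  \log y_t, & \omega = 0,
\end{cases}

% Trigonometric seasonality: the i-th seasonal component is a sum of
% k_i harmonics with frequencies determined by the period m_i
s_t^{(i)} = \sum_{j=1}^{k_i} s_{j,t}^{(i)},
\qquad
\lambda_j^{(i)} = \frac{2\pi j}{m_i}.
```

Because the frequencies $\lambda_j^{(i)}$ depend on $m_i$ only through a ratio, $m_i$ does not need to be an integer, which is why non-integer seasonal periods are possible.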
  10. TBATS model

        library(forecast)
        calls %>% tbats %>% forecast %>% autoplot(include=2500)

    [Figure: Forecasts from TBATS(0.555, {0,0}, -, {<169,6>, <845,4>}), with 80% and 95% prediction intervals. x-axis: Time.]
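The model label in the figure title above follows the pattern TBATS(omega, {p,q}, phi, {<m1,k1>, <m2,k2>}): here the Box-Cox parameter is 0.555, the ARMA error order is (0,0), there is no damping parameter, and the two seasonal periods are 169 and 845, modelled with 6 and 4 Fourier pairs respectively. The sketch below shows one way to declare such periods with `msts()` from the forecast package before calling `tbats()`. The series is simulated, not the bank's call data, and reading 169 as five-minute intervals per day and 845 = 5 × 169 as intervals per working week is an assumption.

```r
library(forecast)

# Simulated 5-minute series covering four "weeks" of 845 observations.
# NOT the real call volume data; periods 169 and 845 come from the
# model label above.
set.seed(42)
n <- 845 * 4
t <- seq_len(n)
y <- 200 + 100 * sin(2 * pi * t / 169) +
  30 * sin(2 * pi * t / 845) + rnorm(n, sd = 10)

# Declare both seasonal periods so that tbats() can use them
y_msts <- msts(y, seasonal.periods = c(169, 845))

fit <- tbats(y_msts)          # may take a while, as the slides warn
fc <- forecast(fit, h = 169)  # forecast one "day" ahead
autoplot(fc, include = 845)   # show only the last "week" of history
```

Printing `fit` shows the selected model label in the same format as the figure title.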
  11. prophet

    Additive regression model developed at Facebook: $y_t = g_t + s_t + h_t + \varepsilon_t$, where $y_t$ = time series; $g_t$ = piecewise linear growth function; $s_t$ = Fourier seasonal terms: daily, weekly and/or yearly; $h_t$ = holiday effect; $\varepsilon_t$ = error (can be ARMA errors).

    Estimated as a Bayesian regression using Stan.
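For concreteness, Fourier seasonal terms of this kind are usually written as a truncated Fourier series. The form below is a sketch in assumed notation (it is not shown on the slide): $P$ is the seasonal period in the time units of $t$, $N$ is the number of Fourier terms, and $a_n$, $b_n$ are the coefficients to be estimated.

```latex
s_t = \sum_{n=1}^{N} \left( a_n \cos\!\left( \frac{2\pi n t}{P} \right)
                          + b_n \sin\!\left( \frac{2\pi n t}{P} \right) \right)
```

A larger $N$ lets the seasonal pattern change more quickly within a period, at the cost of more parameters and a greater risk of overfitting.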
  12. Daily blog traffic

    [Figure: Daily pageviews for the Hyndsight blog (2014-2015). x-axis: Month (May to Apr); y-axis: Pageviews.]
  13. prophet example

        library(prophet)
        m <- prophet(hyndsight)
        future <- make_future_dataframe(m, periods = 365)
        forecast <- predict(m, future)
        plot(m, forecast)

    [Figure: prophet forecast plot of the fitted values and observations. x-axis: ds (2014-07 to 2016-01); y-axis: y.]
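`prophet()` expects a data frame with a date column named `ds` and a value column named `y`, which is why those names appear on the plot axes. The sketch below shows the same workflow end to end on simulated daily data. The numbers are made up (they are not the Hyndsight pageviews) and the name `df` is hypothetical.

```r
library(prophet)

# Simulated daily series with a weekday effect.
# NOT the real Hyndsight pageviews.
set.seed(123)
ds <- seq(as.Date("2014-05-01"), by = "day", length.out = 365)
is_weekday <- as.integer(format(ds, "%u")) <= 5   # Monday to Friday
y <- 1500 + 300 * is_weekday + rnorm(365, sd = 100)

# prophet requires columns named ds (dates) and y (values)
df <- data.frame(ds = ds, y = y)

m <- prophet(df)
future <- make_future_dataframe(m, periods = 30)  # history plus 30 days
fc <- predict(m, future)

# Point forecasts and uncertainty interval for the last few days
tail(fc[, c("ds", "yhat", "yhat_lower", "yhat_upper")])
```

With only one year of history, prophet may disable yearly seasonality by default and print a message to that effect.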
  14. prophet pros and cons

    Pros: Completely automatic, including changepoints. Handles multiple seasonality and holiday effects.

    Cons: Seems to overfit annual seasonality. Number of Fourier terms is hard-coded.