Surfing Silver: Dynamic Bayesian Forecasting for Fun and Profit

2008 was a historic year in many ways, perhaps most prominently for the election of the first African American president. But 2008 also saw an unlikely hero emerge from that record-setting presidential race: Nate Silver and his astonishingly accurate prediction of its results. More important than Nate's remarkable result, however, was the attention it drew to the potential of data and to the importance of uncertainty (through Bayesian statistics). It was in that moment that our modern incarnation of data journalism was born with Nate's (now famous) 538 blog (though ironically the field dates back to an attempt to predict the 1952 presidential election).

In this talk I will walk through the approach that made Nate so successful in 2008, test its efficacy in predicting the early 2016 primary results, and show how these (relatively) simple concepts can be applied in novel ways to tangential fields to great effect (for fun and profit) by estimating the time to failure for industrial machines in our connected world of the IoT.

Jonathan Dinu

April 13, 2016

Transcript

  1. SURFING SILVER: DYNAMIC BAYESIAN FORECASTING FOR FUN AND PROFIT. Jonathan Dinu // April 13th, 2016 // @clearspandex

  2. whoami Jonathan Dinu // April 13th, 2016 // @clearspandex

  3. whoami Jonathan Dinu // April 13th, 2016 // @clearspandex

  4. Jonathan Dinu // April 13th, 2016 // @clearspandex

  5. THE 2008 ELECTION: let me tell you a little story...

  6. SPOILER ALERT... IT'S BEEN DONE BEFORE

  7. > Nate Silver > Drew Linzer > Josh Putnam > Simon Jackman

  8. ANDREW GELMAN

  9. ANDREW GELMAN (1995...)

  10. THE THEORY BEHIND THE MAGIC. Courtesy of 538 and Drew Linzer (Votamatic)

  11. CHALLENGES > Historical Predictions susceptible to Uncertainty > Sparse pre-election Poll Data > Sampling Error and House Effects Bias Polls

  12. WHAT DREW (AND NATE) DID DIFFERENTLY > State level vs. National Polls > Online Updates as more data become available > Not All Polls are Created Equal (weights/averages; sketched after the transcript) > (Probabilistic) Forecasting in addition to Estimation

  13. DYNAMIC BAYESIAN FORECASTING [2] (the National, State, and Forecast equations appear on the slide; a toy version is sketched after the transcript). Not shown here: informative priors based on historical predictions

  14. SO WHY AM I TELLING YOU THIS THEN?

  15. STRUCTURED PREDICTION: SUPERVISED LEARNING ON SEQUENCES

  16. TRADITIONALLY

  17. TRADITIONALLY

  18. STATES + TIME + TRANSITIONS

  19. GRAPHICAL MODELS > Assess Risk (uncertainty) as Probability of Failure > Unobservable (hidden) Failure States > Proactive/Early Prediction > Interpretable Latent Properties > Online Algorithm (iteratively improve) (a filtering sketch appears after the transcript)

  20. KEY IDEAS > Uncertainty > Point vs. Distribution (or confidence intervals) > Bayesian vs. Frequentist methods > Temporal variability. All models are wrong, but some models are useful... or something (a point-vs.-posterior sketch appears after the transcript)

  21. KEY IDEAS (APPLIED)

  22. IOT IMPACT: DETECTING MACHINE FAILURES (slide 12's list with each election term swapped for its IoT analogue; a time-to-failure sketch appears after the transcript) > Structural Predictions susceptible to Uncertainty (Supervised Learning) > Sparse Data (costly to measure) > Sampling Error Biases Inspections (prediction in the absence of data) > Online Updates as more data become available > Not All sensors are Created Equal (weights/averages) > (Probabilistic) Forecasting in addition to Estimation

  23. REMEMBER THIS... National: State: Forecasts:

  24. REMEMBER THIS... National: State: Forecasts:

  25. REMEMBER THIS... National: State: Forecasts:

  26. INDUSTRIAL MACHINES [3] HTTP://WWW.CITEMASTER.NET/GET/8BD1ACC0-F04B-11E3-BBAF-00163E009CC7/SALFNER05PREDICTING.PDF

  27. MORE INTERPRETABLE: WE HAVE TO ACTUALLY FIX THE MACHINES AFTER ALL...

  28. LATENT FACTORS

  29. CAUSALITY!

  30. REFERENCES > The Signal and the Noise > Data Journalism Handbook > Dynamic Bayesian Forecasting of Presidential Elections in the States (Drew A. Linzer) > Time for Change model (Alan Abramowitz) > Bayesian Data Analysis (Gelman) > Causality (Judea Pearl) > 538: How We Are Forecasting the 2016 Primaries > Predicting Time-to-Failure of Industrial Machines with Temporal Data Mining
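
Slide 12's point that "not all polls are created equal" comes down to precision weighting. Here is a minimal sketch of an inverse-variance (sample-size) weighted poll average; the polls themselves are made-up numbers for illustration, not data from the talk:

```python
import numpy as np

# Hypothetical state polls as (share for candidate A, sample size); all numbers invented.
polls = [(0.52, 600), (0.49, 1200), (0.55, 400)]

estimates = np.array([p for p, _ in polls])
variances = np.array([p * (1 - p) / n for p, n in polls])   # sampling variance p(1-p)/n

# Inverse-variance weights: larger, less noisy polls count for more.
weights = 1.0 / variances
weights /= weights.sum()

pooled_mean = weights @ estimates
pooled_se = np.sqrt(1.0 / (1.0 / variances).sum())

print(f"pooled estimate: {pooled_mean:.3f} +/- {1.96 * pooled_se:.3f}")
```

A real poll average would also adjust for house effects and recency, which this sketch leaves out.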
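
Slide 13's national/state/forecast structure refers to Drew Linzer's dynamic Bayesian model, which in full requires MCMC over every state simultaneously. Below is only a toy one-state sketch of the same ideas: a Gaussian random walk over latent support, online updates as polls arrive, and an informative structural prior for election day. Every number (drift, poll noise, prior) is invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

days_to_election, poll_days = 60, 30     # polls observed for the first 30 of 60 days
tau, poll_sd = 0.005, 0.015              # assumed daily drift and poll noise (std)

# Informative structural prior for election-day support (a Time-for-Change-style
# forecast would supply this in practice); numbers invented for illustration.
prior_mean, prior_sd = 0.50, 0.03

# Stand-in for real poll data: noisy observations of a drifting latent support.
true_path = 0.52 + np.cumsum(rng.normal(0.0, tau, poll_days))
polls = rng.normal(true_path, poll_sd)

# Online (Kalman-style) filtering of latent support from a vague starting belief.
mean, var = 0.50, 0.10 ** 2
for y in polls:
    var += tau ** 2                              # latent support drifts one day
    gain = var / (var + poll_sd ** 2)
    mean, var = mean + gain * (y - mean), (1.0 - gain) * var

# Forecast: let the estimate diffuse over the remaining days, then shrink it
# toward the structural prior by precision weighting.
fcast_var = var + (days_to_election - poll_days) * tau ** 2
precision = 1.0 / fcast_var + 1.0 / prior_sd ** 2
fcast_mean = (mean / fcast_var + prior_mean / prior_sd ** 2) / precision
fcast_sd = np.sqrt(1.0 / precision)

draws = rng.normal(fcast_mean, fcast_sd, 10_000)
print(f"forecast: {fcast_mean:.3f} +/- {1.96 * fcast_sd:.3f}, P(win) = {(draws > 0.5).mean():.2f}")
```

The key behavior to notice is the one slide 13 highlights: with sparse data the forecast leans on the informative prior, and as more polls arrive the data take over.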
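
Slides 18-19 recast the machine-failure problem as a graphical model over hidden states, time, and transitions. This is a minimal hidden-Markov-style sketch of online filtering for an unobservable failure state; the transition matrix, alarm probabilities, and sensor history are all invented for illustration:

```python
import numpy as np

# Hidden states: 0 = healthy, 1 = degraded, 2 = failed (all values invented).
P = np.array([[0.97, 0.03, 0.00],     # per-step transition probabilities
              [0.00, 0.90, 0.10],
              [0.00, 0.00, 1.00]])

# Emission model: chance of a "high vibration" alarm firing in each hidden state.
p_alarm = np.array([0.05, 0.40, 0.90])

def forward_filter(alarms, belief=np.array([1.0, 0.0, 0.0])):
    """Online update of P(hidden state | alarms so far) for a binary alarm sensor."""
    for alarm in alarms:
        belief = belief @ P                                   # predict one step ahead
        belief = belief * (p_alarm if alarm else 1.0 - p_alarm)
        belief = belief / belief.sum()                        # condition on the observation
    return belief

alarms = [0, 0, 0, 1, 0, 1, 1, 1]                             # made-up sensor history
print("P(healthy, degraded, failed) =", np.round(forward_filter(alarms), 3))
```

The belief vector is exactly the slide's list: risk as a probability of failure, a hidden failure state, and an online algorithm that improves with each observation.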
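
Slide 20's "point vs. distribution" contrast in one small example: the same (made-up) poll summarized as a frequentist point estimate with a confidence interval versus a full Bayesian posterior that can answer probability questions directly. This sketch assumes scipy is available:

```python
import numpy as np
from scipy import stats

k, n = 540, 1000    # a single made-up poll: 540 of 1000 respondents favor candidate A

# Frequentist summary: a point estimate and a normal-approximation confidence interval.
p_hat = k / n
se = np.sqrt(p_hat * (1 - p_hat) / n)
print(f"point estimate {p_hat:.3f}, 95% CI ({p_hat - 1.96 * se:.3f}, {p_hat + 1.96 * se:.3f})")

# Bayesian summary: a full Beta posterior (uniform prior) that supports any
# probability statement, not just an interval around a point.
posterior = stats.beta(1 + k, 1 + n - k)
lo, hi = posterior.ppf([0.025, 0.975])
print(f"95% credible interval ({lo:.3f}, {hi:.3f}), P(support > 50%) = {posterior.sf(0.5):.3f}")
```

The "P(support > 50%)" line is the kind of statement a forecast (rather than an estimate) is after.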
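
Slide 22 reframes forecasting as estimating time-to-failure. Assuming the same illustrative transition matrix as the filtering sketch above (not real machine data), the expected number of steps until the absorbing "failed" state follows from the fundamental matrix of the chain, weighted by the current belief over the hidden state:

```python
import numpy as np

# Same illustrative transition matrix as the filtering sketch above: states
# 0 = healthy and 1 = degraded are transient, 2 = failed is absorbing.
P = np.array([[0.97, 0.03, 0.00],
              [0.00, 0.90, 0.10],
              [0.00, 0.00, 1.00]])

Q = P[:2, :2]                               # transient-to-transient block
N = np.linalg.inv(np.eye(2) - Q)            # fundamental matrix of the chain
steps = N @ np.ones(2)                      # expected steps to failure per state

# Weight by a filtered belief over the hidden state (e.g. the HMM output above),
# conditioned on the machine not having failed yet; belief values are invented.
belief = np.array([0.25, 0.70, 0.05])
transient = belief[:2] / belief[:2].sum()
print(f"expected time steps to failure: {transient @ steps:.1f}")
```

Because the belief is updated online as sensor data arrive, the time-to-failure estimate updates with it, mirroring how the election forecast tightens as polls come in.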