Nowcasting Business Performance

talk by Francois Bouet at Data Science London, 28/11/12

Data Science London

November 28, 2012

Transcript

  1. GROWTH INTELLIGENCE. What we do: classification of companies, revenue estimation. What we use: machine learning, time series methods. (Wednesday, 28 November 2012)
  2. NOWCASTING. Estimating the value of a time series not readily available at present.
  3. NOWCASTING. Previously called short-term forecasting. More an approach and a goal than a different theory and field.
  4. WEATHER NOWCASTING. Simplified model that is applied quickly. Uses weather models. Forecast at location x given weather at y. → Not applicable to other fields.
  5. SEARCH-BASED NOWCASTING. Popularized by Google. Recent successes: flu predictions; consumer behaviour (travel, movies and products). Based on Google’s data, simple AR models. Only used to study what people are searching for.
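
The "simple AR models" on search data mentioned in this slide can be illustrated with a minimal sketch. It assumes an AR(1) model with one search-volume regressor, target[t] = a + b*target[t-1] + c*signal[t], fitted by ordinary least squares; the function names and the numbers are made up for illustration and are not taken from the talk.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_ar1_with_signal(target, signal):
    """Least-squares fit of target[t] = a + b*target[t-1] + c*signal[t]."""
    rows = [[1.0, target[t - 1], signal[t]] for t in range(1, len(target))]
    ys = [target[t] for t in range(1, len(target))]
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    return solve_linear_system(xtx, xty)


def nowcast(coeffs, last_target, current_signal):
    """Estimate the not-yet-published target from today's signal."""
    a, b, c = coeffs
    return a + b * last_target + c * current_signal


# Made-up data: the search signal is available now, the target lags behind.
search_volume = [10, 12, 9, 15, 11, 14, 8, 13]
published = [5.0]
for t in range(1, len(search_volume)):
    published.append(2.0 + 0.5 * published[-1] + 0.1 * search_volume[t])

coeffs = fit_ar1_with_signal(published, search_volume)
estimate = nowcast(coeffs, published[-1], current_signal=12)
```

The point of the sketch is the data flow: the search signal for the current period is already observable, so it can stand in for the target value that has not been released yet.
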
  6. GDP NOWCASTING. Field with the most generic research. Major research since the 1990s. GDP released quarterly with further revisions. Thousands of signals for GDP nowcasting: industrial production, unemployment, confidence surveys, retail sales, ...
  7. GDP NOWCASTING. Vector auto-regression and the “jagged edge” (figure label: Present). Different frequencies, different lag, missing data.
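
A minimal sketch of how a "jagged edge" arises, assuming integer periods (for example months) and that each series is described by a hypothetical triple (step, lag, values): it is observed every `step` periods and published `lag` periods after the period it refers to. The series names and numbers are illustrative only.

```python
def available_panel(series, now):
    """Return {name: row}, where row[p] is the value for period p,
    or None if that value does not exist or is not yet published at `now`."""
    panel = {}
    for name, (step, lag, values) in series.items():
        row = [None] * (now + 1)
        for i, value in enumerate(values):
            period = i * step
            if period + lag <= now:  # already published at time `now`
                row[period] = value
        panel[name] = row
    return panel


series = {
    # quarterly, published two periods late
    "gdp": (3, 2, [100.0, 101.0, 102.0]),
    # monthly, published one period late
    "retail_sales": (1, 1, [20.0, 21.0, 19.0, 22.0, 23.0, 22.5, 24.0]),
    # monthly, available immediately
    "web_traffic": (1, 0, [5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0]),
}

panel = available_panel(series, now=6)
for name, row in panel.items():
    print(name, row)
```

Each row ends at a different period, and the lower-frequency series has gaps between observations. Those are the three issues the slide lists: different frequencies, different lags, and missing data.
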
  8. Search results, LinkedIn info, web traffic, patents, advertisement spending, liabilities, tweets, assets, website updates, press.
  9. TIE WITH “BIG DATA”. Need to gather signals in large quantity. Machine learning as a pre-processing step and to integrate discrete events. Example: companies in a sector which receive investment.
  10. TIE WITH ESTIMATION THEORY. Beneath all this: getting to a variable not directly observable with the help of measured signals. Replacing probability distributions from physical models with machine-learned knowledge.
  11. METHODOLOGIES. Vector auto-regression. Challenge with large number of signals (predictors): curse of dimensionality when applying VAR. Machine learning approach. Own solution: ziggurat.
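
The dimensionality problem with VAR can be made concrete by counting parameters. An unrestricted VAR(p) on k signals has k*k*p lag coefficients plus k intercepts, before counting the error covariance. The sketch below assumes 4 lags purely for illustration.

```python
def var_parameter_count(n_signals, n_lags):
    """Lag coefficients plus intercepts of an unrestricted VAR(n_lags)."""
    return n_signals * n_signals * n_lags + n_signals


for k in (3, 10, 100, 1000):
    print(k, var_parameter_count(k, n_lags=4))
# 3 39
# 10 410
# 100 40100
# 1000 4001000
```

With quarterly data there are only a few dozen to a few hundred observations per series, so the parameter count overtakes the sample size long before the number of signals reaches the thousands.
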
  12. TIME SERIES + MACHINE LEARNING. Δrevenue avg, std dev, model params.
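
One possible reading of this slide is that each time series is summarised into a few numbers that a machine learning model can consume. The sketch below assumes that reading: it computes the mean and standard deviation of period-to-period revenue changes, and uses the slope and intercept of a least-squares trend line as stand-ins for "model params". The feature names and the choice of a linear trend are assumptions, not the talk's method.

```python
from statistics import mean, stdev


def revenue_features(revenue):
    """Summarise a revenue series (at least 3 points) as a feature dict."""
    deltas = [b - a for a, b in zip(revenue, revenue[1:])]
    n = len(revenue)
    t_mean = (n - 1) / 2
    r_mean = mean(revenue)
    # Ordinary least-squares line through (t, revenue[t])
    slope = sum((t - t_mean) * (r - r_mean) for t, r in enumerate(revenue)) / sum(
        (t - t_mean) ** 2 for t in range(n)
    )
    return {
        "delta_mean": mean(deltas),
        "delta_std": stdev(deltas),  # sample standard deviation
        "trend_slope": slope,
        "trend_intercept": r_mean - slope * t_mean,
    }


features = revenue_features([100, 110, 105, 120, 130])
print(features)
```

A fixed-length feature dict like this lets series of different lengths and frequencies feed the same downstream model.
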
  13. OUR PIPELINE FOR NOWCASTING. Clustering companies in sets (ML). Signals gathering. Time series processing. ML with model for each cluster. → Revenue for each company and each cluster.
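
The pipeline steps can be sketched end to end under strong simplifying assumptions: cluster labels are taken as given (the clustering step is not shown), each company has a single numeric signal, and the per-cluster model is a one-variable linear regression fitted on the companies whose revenue is known. All names and figures are hypothetical; the talk does not specify the actual models.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    return slope, y_mean - slope * x_mean


def nowcast_by_cluster(companies):
    """companies: dicts with keys name, cluster, signal, revenue (None if unknown).

    Returns (revenue per company, total revenue per cluster)."""
    by_cluster = defaultdict(list)
    for company in companies:
        by_cluster[company["cluster"]].append(company)

    company_revenue, cluster_revenue = {}, {}
    for cluster, members in by_cluster.items():
        known = [m for m in members if m["revenue"] is not None]
        # One model per cluster, trained on companies with published revenue
        slope, intercept = fit_line(
            [m["signal"] for m in known], [m["revenue"] for m in known]
        )
        for m in members:
            if m["revenue"] is not None:
                company_revenue[m["name"]] = m["revenue"]
            else:
                company_revenue[m["name"]] = intercept + slope * m["signal"]
        cluster_revenue[cluster] = sum(company_revenue[m["name"]] for m in members)
    return company_revenue, cluster_revenue


companies = [
    {"name": "A", "cluster": "retail", "signal": 10, "revenue": 100.0},
    {"name": "B", "cluster": "retail", "signal": 20, "revenue": 200.0},
    {"name": "C", "cluster": "retail", "signal": 15, "revenue": None},
    {"name": "D", "cluster": "software", "signal": 1, "revenue": 50.0},
    {"name": "E", "cluster": "software", "signal": 3, "revenue": 70.0},
    {"name": "F", "cluster": "software", "signal": 2, "revenue": None},
]

per_company, per_cluster = nowcast_by_cluster(companies)
print(per_company)
print(per_cluster)
```

Fitting one model per cluster means companies with similar behaviour share parameters, which keeps each model small even when the total number of companies and signals is large.
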