MOM! My algorithms SUCK

Abe Stanway
September 19, 2013

Given at Monitorama.eu 2013 in Berlin. http://vimeo.com/75183236

Transcript

  1. this works because humans are excellent visual pattern matchers* *there are, of course, many advanced statistical applications where signal cannot be determined from noise just by looking at the data.
  2. can we teach software to be as good at simple anomaly detection as humans are?
  3. the human definition: “if a datapoint is not within reasonable bounds, more or less, of what usually happens, it’s an anomaly”
  4. so, in math speak, a metric is anomalous if the absolute value of latest datapoint is over three standard deviations above the mean
  5. if you’ve got a normal distribution, chances are you’ve got an exchangeable, stationary series produced by independent random variables
  6. [figure: normal distribution curve around μ, axis marked at μ - σ, with 34.1%, 13.6%, and 2.1% bands on each side] if your datapoint is in here, it’s an anomaly.
  7. a fundamental state change in the process means a different probability distribution function that describes the process
  8. skewed distributions! less than 99.73% of all values lie within 3σ, so breaching 3σ is not necessarily bad [figure labels: 3σ; possibly normal range]
  9. the dirty secret: using SPC-based algorithms results in lots and lots of false positives, and probably lots of false negatives as well
  10. ...after all, as long as the *errors* from the model are normally distributed, we can use 3σ
  11. possible to implement a class of ML algorithms that determine models based on distribution of errors, using Q-Q plots
  12. Q-Q plots can also be used to determine if the PDF has changed, although hard to do with limited sample size
  13. ...and treat it as a way of building noisy situational awareness, not absolute directives (alerts)...
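
The 3σ rule from slide 4, and its use on model errors from slide 10, could be sketched roughly as follows. This is a minimal illustration, not the speaker's implementation. It reads "over three standard deviations" as the absolute deviation of the latest datapoint from the mean of the earlier datapoints. The function names, the trailing-mean model, and the `window` and `k` parameters are all assumptions made for the example.

```python
import statistics


def three_sigma_anomalous(series, k=3.0):
    """Return True if the latest datapoint deviates from the mean of the
    earlier datapoints by more than k standard deviations."""
    if len(series) < 3:
        return False  # not enough history to estimate a spread
    history, latest = series[:-1], series[-1]
    mean = statistics.fmean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        # Perfectly flat history: any change at all counts as a breach.
        return latest != mean
    return abs(latest - mean) > k * stdev


def residuals_anomalous(series, window=5, k=3.0):
    """Apply the same test to the errors of a simple model.

    The model here is a trailing mean over `window` points. That choice is
    an assumption for illustration; the talk does not name a model.
    """
    errors = []
    for i in range(window, len(series)):
        predicted = statistics.fmean(series[i - window:i])
        errors.append(series[i] - predicted)
    return three_sigma_anomalous(errors, k)


if __name__ == "__main__":
    steady = [10, 11, 9, 10, 12, 10, 11, 9, 10, 11]
    print(three_sigma_anomalous(steady))               # ordinary point
    print(three_sigma_anomalous(steady[:-1] + [30]))   # spike at the end

    trend = list(range(20))
    print(residuals_anomalous(trend))                  # steady trend
    print(residuals_anomalous(trend[:-1] + [60]))      # jump off the trend
```

As slides 8 and 9 warn, a check like this assumes roughly normal data (or normal errors), so on skewed series it should be expected to produce false positives and false negatives.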