Unsupervised Anomaly Detection And Forecasting For Enterprise Time Series
Time series are an omnipresent type of data. How can one predict the future and detect anomalies in an online setting on a large set of time series? Talk held at the data science meetup Kempten.
Wolfertschwenden, 16.05.18 Joachim Rosskopf, Dr. Simon Müller
struggling to finish his PhD in theoretical physics, mainly doing data analysis and optimization algorithms there. He is lead of the DataTeam at Zoi GmbH, where he tries to combine his experience in software architecture and technology with business analytics and data science to build interesting, valuable data solutions.
MACHINE LEARNING? IS THAT JUSTIFIED? ▪ Example: Facebook image annotation ▪ Computer vision (CV) ▪ Millions of images per day. Mostly invisible to the user. ▪ A huge source of information about the user, and for advertisement. Image: a photo from the “Mercedes Benz Deutschland” Facebook page
MACHINE LEARNING? IS THAT JUSTIFIED? ▪ Example: YouTube auto-generated subtitles. ▪ Speech recognition / speech-to-text ▪ Millions of hours of speech per day. ▪ Cross-pollination with the Android ecosystem and other services. ▪ Big players have an ML strategy. Image: a Barack Obama speech on YouTube with auto-generated subtitles.
AI WILL UNDER IMPRESS IN THE SHORT TERM, BUT BE TRANSFORMATIVE IN THE LONG TERM.
SPEECH & TEXT, IMAGES OR VIDEOS? ▪ For manufacturing companies, many of the blockbuster advancements don’t apply directly. ▪ There is a lot of uncertainty about where investments will yield business value in the future. ▪ Companies should not simply follow what is hot in the news and works for Google or Facebook.
WHAT IF GERMAN MANUFACTURING IS NOT RELYING ON SPEECH & TEXT, IMAGES OR VIDEOS? OUR COMPANIES ALSO COLLECT A LOT OF DATA. THIS DATA IS MAINLY TIME SERIES DATA.
Forecasting and anomaly detection with different methods on smart meter / IoT data: ◦ The machine learning way ◦ With autoregressive models ◦ With recurrent neural networks (RNN) ▪ Time-to-event/failure prediction on JetEngine data: ◦ Survival statistics as the basis ◦ A nice fusion of RNN and the Weibull distribution ▪ Practical realization with streaming, open source and the cloud.
BUSINESS PROCESSES ▪ A time series is a sequence of data points indexed by a time dimension. ▪ In most cases the sequence is sampled at discrete, equally spaced points in time. ▪ Most common are real-valued univariate series, but multivariate series and series of categorical data also occur. Figure: the household MAC002321 from the London Smart Meter Data Set (Cluster 3)
Figure: the households MAC002321 and MAC000034 from the London Smart Meter Data Set (Cluster 3 and 2)
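To make the definition concrete, here is a minimal sketch of a univariate, equally spaced series as a list of (timestamp, value) pairs, plus a check that the spacing really is regular. The half-hourly step, the start date and the readings are made-up illustration values, not taken from the London data set, and `is_equally_spaced` is a hypothetical helper name.

```python
from datetime import datetime, timedelta


def is_equally_spaced(timestamps):
    """Return True if all consecutive timestamps are the same step apart."""
    steps = {later - earlier for earlier, later in zip(timestamps, timestamps[1:])}
    return len(steps) <= 1


# Assumption: a half-hourly sampling interval and invented kWh readings.
start = datetime(2013, 1, 1)
step = timedelta(minutes=30)
readings = [0.21, 0.18, 0.25, 0.40]

# A univariate time series: one real value per equally spaced timestamp.
series = [(start + i * step, value) for i, value in enumerate(readings)]
timestamps = [t for t, _ in series]
```

A gap in the data (for example a missing reading) makes the check fail, which is usually the first thing to repair during preprocessing.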
combinatorics LoB and IoT systems produce a lot of time series. ▪ People spend a lot of time monitoring, interpreting and predicting time series. ▪ But doing that for a large number of series in a timely fashion is laborious and error-prone. ▪ It gets even more challenging if one wants to base business models or product functionality on these features (e.g. London smart meter dynamic pricing).
Diagram labels: Sales IoT Interaction Inventory Ad Type Medium Campaign Device Category Channel Country Region Device Type Action Location Customer Replenish. Time Class. Order Point Material Turnover. Unfolding the dimensions of a traditional data warehouse leads to a multitude of time series of the respective measures. Predicting them is of great value!
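The combinatorial growth can be sketched in a few lines: every combination of dimension values yields its own time series per measure. The dimension names and values below are hypothetical stand-ins chosen for illustration, not the actual warehouse schema from the slide.

```python
from itertools import product
from math import prod

# Assumption: three illustrative dimensions with a handful of values each.
dimensions = {
    "channel": ["online", "retail", "partner"],
    "country": ["DE", "AT", "CH", "UK"],
    "device_category": ["mobile", "desktop"],
}

# Each combination of dimension values identifies one time series of a measure.
series_keys = list(product(*dimensions.values()))

# The count is the product of the dimension cardinalities.
n_series = prod(len(values) for values in dimensions.values())
```

Even this toy schema gives 3 x 4 x 2 = 24 series per measure; adding one more dimension multiplies the count again, which is why manual monitoring does not scale.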
is able to work on all series equally. A challenge is to do the right preprocessing and algorithm selection. Diagram (axis: More General Model), model families shown: Exponential Smoothing (ETS); Autoregressive Integrated Moving Average (ARIMA); Generalized Autoregressive Conditional Heteroskedasticity (GARCH); Regression Models (e.g. Decision Tree Regression); Neural Networks (e.g. Autoencoder, LSTM, GRU)
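As a point of reference for the model families on this slide, here is a minimal sketch of simple exponential smoothing, the most basic member of the ETS family. It has no trend or seasonality terms, and the function name and the choice of `alpha` are illustrative assumptions rather than anything taken from the talk's notebooks.

```python
def simple_exponential_smoothing(values, alpha):
    """Return the one-step-ahead forecast after smoothing all values.

    The level is updated as: level = alpha * y + (1 - alpha) * level.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    level = values[0]  # initialise the level with the first observation
    for y in values[1:]:
        level = alpha * y + (1.0 - alpha) * level
    return level  # the forecast for the next step is the last level
```

A small `alpha` reacts slowly and smooths heavily; `alpha = 1` reduces to the naive forecast that repeats the last observation.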
Smart Meter London: Explain and explore the dataset
https://mybinder.org/v2/gh/anofox/m3_konferenz/master?filepath=notebooks%2F01_Smart%20Meter%20London%20-%20Exploration.ipynb
Quantile Random Forest - Smart Meter London: Use Random Forest Regression for time series prediction
https://mybinder.org/v2/gh/anofox/m3_konferenz/master?filepath=notebooks%2F02_Smart%20Meter%20London%20-%20Quantile%20Random%20Forest.ipynb
ARIMA, ETS, and GARCH - Time Series Prediction: Use ARIMA, ETS, and GARCH models for time series prediction
https://mybinder.org/v2/gh/anofox/m3_konferenz/master?filepath=notebooks%2F03_Smart%20Meter%20London%20-%20ARIMA%2C%20ETS%2C%20and%20GARCH.ipynb
Deep Learning - Time Series Prediction: Train and predict with a simple RNN
https://mybinder.org/v2/gh/anofox/m3_konferenz/master?filepath=notebooks%2F04_Smart%20Meter%20London%20-%20LSTM.ipynb
Outlier Detection - Smart Meter London: Use an autoencoder together with extreme value theory to mark unlikely events as anomalies
https://mybinder.org/v2/gh/anofox/m3_konferenz/master?filepath=notebooks%2F05_Smart%20Meter%20London%20-%20Outlier_Detection.ipynb
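Regression models such as a random forest need the series recast as a supervised problem. A common way to do that, shown here as a sketch and not necessarily what the notebooks do, is to use the previous `n_lags` values as features for the next value. The second helper flags points that fall outside a prediction interval, which is how quantile forecasts can double as an anomaly test. Both function names are hypothetical.

```python
def make_lag_windows(values, n_lags):
    """Turn a series into (lag features, target) pairs for a regressor."""
    if n_lags < 1:
        raise ValueError("n_lags must be at least 1")
    features, targets = [], []
    for i in range(n_lags, len(values)):
        features.append(list(values[i - n_lags:i]))  # the n_lags previous values
        targets.append(values[i])                    # the value to predict
    return features, targets


def flag_outside_interval(actual, lower, upper):
    """Mark each observation that lies outside its [lower, upper] interval."""
    return [not (lo <= y <= hi) for y, lo, hi in zip(actual, lower, upper)]
```

The lower and upper bounds would come from a model that predicts quantiles (for example the 5% and 95% quantiles); here they are simply passed in.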
▪ The target variable in many enterprise time series is not a continuous value, but rather an event. ▪ This is the typical setting of event time analysis, where we want to predict the remaining lifetime or death of an individual. ▪ Depending on the domain, this point in time depends on different features such as usage, blood pressure, oil temperature, etc.
▪ In event time analysis, typically not all events are observed. ▪ We know how old we are, but not when we will die. Age is a right-censored data point. ◦ Events are known up to a certain point in time. ◦ After the last observed event, no new event has been observed yet, but we still gather data on the features. During this time the event is censored.
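Right censoring is usually encoded as a pair: the observed duration and a flag saying whether the event was actually seen. The sketch below shows that encoding under the assumption that times are plain numbers on a shared clock; the function name and the example values are hypothetical.

```python
def to_censored_observation(start, event_time, cutoff):
    """Encode one individual as (duration, event_observed).

    event_time is None if no event is known; cutoff is the time at which
    observation stopped.
    """
    if event_time is not None and event_time <= cutoff:
        return event_time - start, True   # event seen: exact duration
    return cutoff - start, False          # right-censored: duration is a lower bound


# Assumption: three invented individuals observed until time 200.
observations = [
    to_censored_observation(0, 120, 200),    # failed at 120
    to_censored_observation(50, None, 200),  # still alive at the cutoff
    to_censored_observation(0, 250, 200),    # fails only after the cutoff
]
```

For a censored record the duration is only a lower bound on the true lifetime, which is exactly the information survival methods are built to use.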
engines of the same model. ▪ Each engine starts with a different, unknown degree of initial wear and manufacturing variation. ▪ Each engine degrades over time until a predefined, unknown failure threshold is reached. ▪ 24 features, 1 event = failure at the end of each time series. ▪ Goal: predict, from any point in time, the time remaining until maintenance.
Figure: Kaplan-Meier fit on the training data of the 100 engines, median (199 time units)
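For readers unfamiliar with the Kaplan-Meier estimator, here is a minimal pure-Python sketch of the product-limit formula and the median read off from it. It is an illustration on invented data, not the code that produced the fit on the engine data; in practice a survival library would be used.

```python
def kaplan_meier(durations, observed):
    """Return the survival curve as a list of (time, survival probability)."""
    event_times = sorted({t for t, seen in zip(durations, observed) if seen})
    survival = 1.0
    curve = []
    for t in event_times:
        at_risk = sum(1 for d in durations if d >= t)
        events = sum(1 for d, seen in zip(durations, observed) if d == t and seen)
        survival *= 1.0 - events / at_risk  # product-limit update
        curve.append((t, survival))
    return curve


def median_survival(curve):
    """First time at which the survival curve drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # the curve never reaches 0.5


# Assumption: four invented units, the second one right-censored at time 3.
curve = kaplan_meier([2, 3, 4, 5], [True, False, True, True])
```

Censored units still count as "at risk" until they drop out, which is how the estimator uses them without pretending they failed.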
KNOWN. ▪ Characteristics and benefits of the Weibull distribution: closed form in both the continuous and the discrete case. ▪ Occurs in nature, e.g. in event time analysis, reliability engineering and failure analysis, and in industrial engineering to represent manufacturing and delivery times. ▪ There is literature with practical examples, e.g. for regularization.
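The closed forms mentioned above make the censored log-likelihood easy to write down. The sketch below uses the standard Weibull parameterisation with scale `alpha` and shape `beta`, where the survival function is exp(-(t/alpha)^beta). The function names are my own, and this is the textbook likelihood, not code copied from the WTTE-RNN notebook.

```python
import math


def weibull_survival(t, alpha, beta):
    """P(T > t) for a Weibull with scale alpha and shape beta."""
    return math.exp(-((t / alpha) ** beta))


def continuous_loglik(t, observed, alpha, beta):
    """Log-likelihood of one continuous-time record (requires t > 0)."""
    cumulative_hazard = (t / alpha) ** beta
    log_hazard = (math.log(beta) - math.log(alpha)
                  + (beta - 1.0) * (math.log(t) - math.log(alpha)))
    # Observed event: log density = log hazard - cumulative hazard.
    # Censored record: log survival = - cumulative hazard.
    return (log_hazard if observed else 0.0) - cumulative_hazard


def discrete_loglik(t, observed, alpha, beta):
    """Log-likelihood when the event falls into the step [t, t + 1)."""
    h_now = (t / alpha) ** beta
    h_next = ((t + 1) / alpha) ** beta
    if observed:
        return math.log(math.exp(-h_now) - math.exp(-h_next))  # log P(T = t)
    return -h_next  # log P(T > t), i.e. survived past this step
```

In a WTTE-RNN style setup the network would output `alpha` and `beta` per time step and be trained by maximising this quantity; for very large `t / alpha` the discrete form needs a numerically safer formulation than the direct one shown here.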
Failure Analysis: Explain and explore the dataset
https://mybinder.org/v2/gh/anofox/m3_konferenz/master?filepath=notebooks%2F10_JetEngine%20Failures%20-%20Exploration%20%26%20Basics.ipynb
Time to event prediction - NASA JetEngine Failure Analysis: Train and evaluate the RNN with the adapted Weibull likelihood
https://mybinder.org/v2/gh/anofox/m3_konferenz/master?filepath=notebooks%2F11_JetEngine%20Failures%20-%20WTTE-RNN.ipynb
tolerant ▪ Exactly once ▪ Event time based ▪ Stateful ▪ Distributed ▪ Parallel
[Diagram labels: Processing Time, Event Time, Source, Algo. fit, Algo. predict, Algo. transform, State, Sink]
with experienced minds at our locations in Stuttgart and Berlin ▪ We combine new technologies, tools and methods with our strong implementation competence and the challenges of our customers. ▪ We are computer scientists, electrical engineers, mathematicians, physicists, biologists and business economists. ▪ Our technological drive is unbroken: we use part of our working time to try out new technologies. ▪ Zoi is a 100% digital subsidiary of Kaercher. ZOI IS THE ABBREVIATION FOR ZERO ONE INFINITY: OUR DIGITAL DNA
▪ Scalable, unsupervised, online anomaly detection and time series prediction on business and IoT data. ▪ Quickly deployable as a building block in the virtual private clouds of customers. We come to where your data and processing happen! ▪ We rely on cloud services, open source software and modern data science methods. At the core we rely on battle-tested data analysis. For higher-level intelligence we use state-of-the-art machine learning research. ▪ Predict usage behavior of simple IoT devices, e.g. when the user will next use or activate a function in a product. ▪ Predict inventory or parts demand, with a focus on high granularity, meaning erratic demand (e.g. spare parts).
type of data, which is especially interesting for business and IoT applications. ▪ There are powerful algorithms to detect anomalies or predict future data points in an unsupervised setting. ▪ We demonstrated on two different datasets how continuous time series and events can be treated. ▪ Spark Streaming, open source and the cloud are a decent environment for building streaming anomaly detection and prediction applications. ▪ If you want details or examples, or want to see some math or code, have a look at the notebooks or feel free to reach us after the talk or via email/Twitter.
IDEAS! Unsupervised Anomaly Detection And Forecasting for Enterprise Time Series
Joachim's email: [email protected]
Joachim's Twitter: @jrosskopf
Simon's email: [email protected]
Simon's Twitter: @datamue
WE ARE HIRING! [email protected]