Basics of Machine Learning & Its Application In Predictive Maintenance

Machine Learning & Its Application In Predictive Maintenance Arnab Biswas
arnabbiswas1

Table of Contents • Basics of Machine Learning • Classical Programming vs Machine Learning • Types of Machine Learning • Types of Supervised Learning • Application of ML in Predictive Maintenance (PdM) • Types of Maintenance • Goals & Use Cases for PdM • Data Science For PdM

What is Machine Learning? Task : Predict the price of
an apartment in Bangalore

Classical Programming / Software 1.0 • Consult a domain expert • Survey existing apartments in Bangalore • Identify factors contributing to the price of an apartment • Area • Size • Number of Bedrooms, Bathrooms • Name of the builder • etc. • Write a program that outputs the price based on the attributes identified Reference: https://medium.com/@karpathy/software-2-0-a64152b37c35

Classical Programming / Software 1.0 • Data + Rule → Software 1.0 → Answer

Machine Learning/Software 2.0 • First Step: Collect data (as much
as possible) Reference : https://www.kaggle.com/amitabhajoy/bengaluru-house-price-data

Machine Learning/Software 2.0 • Data + Answer → Learning Algorithm → Rule • No explicit programming!

Software 1.0 vs 2.0 • Software 1.0: Data + Rule → Answer • Software 2.0: Data + Answer → Rule
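
A minimal sketch of the contrast above, using the apartment-price task. The rate, the three data points, and all function names are invented for illustration. The first function is a hand-written rule (Software 1.0); the second derives the rule from data and answers with a least-squares fit through the origin (Software 2.0).

```python
# Software 1.0: a domain expert hard-codes the rule.
def price_rule_v1(area_sqft):
    rate_per_sqft = 5000  # assumed rate, chosen by hand
    return area_sqft * rate_per_sqft


# Software 2.0: the rule (here a single rate) is learned from data + answers.
def learn_rate(areas, prices):
    # Least-squares slope for the model price = rate * area (no intercept).
    numerator = sum(a * p for a, p in zip(areas, prices))
    denominator = sum(a * a for a in areas)
    return numerator / denominator


# Illustrative training data: area in sq ft and the observed price.
areas = [1000, 1500, 2000]
prices = [6_000_000, 9_000_000, 12_000_000]

learned_rate = learn_rate(areas, prices)


def price_rule_v2(area_sqft):
    # Prediction uses the learned rate instead of a hand-picked one.
    return area_sqft * learned_rate
```

If the market changes, only the training data needs to be refreshed for `price_rule_v2`, whereas `price_rule_v1` needs a manual edit.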

Features Labels Observation

Training vs Prediction • Training: Feature + Label → Learning Algorithm → Model • Prediction: Feature → Model → Label

ML Works Better When… • Classical programming would require a long list of rules that is difficult to maintain; ML can simplify the code • The data changes over time: ML “automatically” discovers changes in data, whereas classical programming needs manual updates to the rules • The problem is complex (image, text, audio etc.) • Humans can gain insights from ML models

Humans can gain insights from ML models • Stages of Cancer • Medical textbooks decide the stage based on the number of “yes” answers to these questions: 1. Has the cancer affected more than one lymph node? 2. Are the cancerous lymph nodes both above & below the bottom of the rib cage? 3. Is the cancer found in organs outside the lymphatic system (in the patient's bone marrow)? • A 2018 research paper (University of Modena & Reggio Emilia) • Analyzed 15 variables, identifying 5 features • Due to limited cognitive ability, humans need a handful of the most obvious signifiers/features • ML/AI decides based on hundreds, if not thousands, of distinct features • May include traditional as well as less intuitive features

Machine Learning : Formal Definition • A Machine is Learning
when it improves at a task based on experience at that task, but without explicit programming. Reference : https://cloud.google.com/products/ai/ml-comic-1/

AI vs ML • AI: Quest for developing non-biological systems
that exhibit human-like forms of intelligence. Reference: https://sebastianraschka.com/blog/2020/intro-to-dl-ch01.html

Examples of Machine Learning • Recommending a video/song (Recommender System) • Detecting cancer from an X-ray image (Computer Vision) • Forecasting a company’s revenue based on various factors (Time Series Forecasting) • Summarizing a long document into shorter, meaningful text (Language Processing) • Writing HTML, SQL, Unix code based on human language (Language Processing - GPT-3)

Types of ML Systems • Whether or not trained with
human supervision • Supervised Learning • Unsupervised Learning • Reinforcement Learning • Whether learning is incremental • Online Learning • Batch Learning • Instance based vs Model based learning

Supervised Learning • The user provides the algorithm with inputs (features) and desired outputs (labels) • The trained algorithm can then produce an output for an unseen input • The user (teacher) supervises the algorithm as it learns

Unsupervised Learning • Only input data is known & passed to the algorithm • Output data is unknown • Often used to understand data better before solving a supervised learning problem • Usually harder to understand and evaluate • Applications • Segmenting readers based on their reading habits • Identifying topics of news articles • Anomaly Detection • Dimensionality Reduction • Clustering

Unsupervised Learning : Clustering • Each dot on plot represents
a research article on COVID Reference: https://maksimekin.github.io/COVID19-Literature-Clustering/plots/t-sne_covid-19_interactive.html

Reinforcement Learning • Steps • Learning system (agent) observes an environment • Selects & performs actions • Gets rewarded or punished for actions • Learning system must learn by itself the best strategy (policy) to get the most reward over time • Examples • Robotics • AlphaGo Program • Energy Efficiency Reference: Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow
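
The observe/act/reward loop above can be sketched with a toy epsilon-greedy agent. This is an illustrative assumption, not something from the referenced book: the environment is just a list of fixed rewards, one per action, and `run_bandit` is a hypothetical name.

```python
import random


def run_bandit(rewards, steps=500, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a toy environment with one fixed reward per action."""
    rng = random.Random(seed)
    estimates = [0.0] * len(rewards)  # agent's estimated value of each action
    counts = [0] * len(rewards)
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(rewards))  # explore a random action
        else:
            action = estimates.index(max(estimates))  # exploit the best estimate
        reward = rewards[action]  # the environment rewards the action
        counts[action] += 1
        # Incremental mean update of the chosen action's estimated value.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates.index(max(estimates))  # learned policy: the preferred action


best_action = run_bandit(rewards=[0.2, 1.0])
```

The agent is never told which action is better; it finds the higher-reward action through occasional exploration.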

Supervised Machine Learning • Regression: Goal is to predict a continuous number (Label: Continuous Number) • Classification: Goal is to predict a class label (Label: Distinct Values) Reference: https://sebastianraschka.com/blog/2020/intro-to-dl-ch01.html

Predictive Maintenance

Types of Maintenance • Reactive Maintenance • Parts of equipment are replaced only on failure • Doesn’t waste a part’s life, but results in downtime and unscheduled maintenance • Preventive Maintenance • Replaces a part after a pre-determined useful lifespan, before it fails • Avoids unscheduled maintenance • Under-utilizes parts • Predictive Maintenance • Replaces only the parts close to failure (just-in-time replacement) • Extends a part’s lifespan • Reduces unscheduled maintenance References: https://docs.microsoft.com/en-us/azure/machine-learning/team-data-science-process/predictive-maintenance-playbook and https://arxiv.org/pdf/1912.07383.pdf

Predictive Maintenance (PdM) : Goals • Predict whether equipment is going to fail in the near future • Predict days to failure • Helps in scheduling maintenance • Predict the most probable root cause of a failure • Helps in identifying the part(s) to repair/replace

Sample Use Cases • Failure of engine parts in an aircraft • HVAC equipment failure • Elevator door failure • Wind turbine failure • Train wheel failure

Data Science For Predictive Maintenance • Steps • Convert Business Problem into Data Science Problem • Understand Data • Prepare Data • Build Model • Evaluate Model • Deploy Model • Monitor/Maintain Model Reference: https://en.wikipedia.org/wiki/Cross-industry_standard_process_for_data_mining

Business problem into Data Science problem • Binary Classification • Predict the probability that equipment fails within a future time period • Regression • Predict the amount of time that equipment remains operational before the next failure • Multi-class classification • Predict the probability that equipment fails within the next X, 2X, 3X, … units of time • Predict the probability that equipment fails within a future time period due to a particular root cause

Binary Classification • Goal: Predict probability of failure within next
X unit of time • Labels (Discrete Number) • Failure within X time unit (1) • Healthy (0) Reference: https://docs.microsoft.com/en-us/azure/machine-learning/team-data-science-process/predictive-maintenance-playbook
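
A minimal sketch of how the 0/1 labels above could be derived from a failure history. The function name, the dates, and the 7-day horizon are hypothetical choices for illustration.

```python
from datetime import date, timedelta


def binary_label(observation_date, failure_dates, horizon_days):
    """Return 1 if a failure occurs within horizon_days after the observation, else 0."""
    horizon_end = observation_date + timedelta(days=horizon_days)
    return int(any(observation_date < f <= horizon_end for f in failure_dates))


failures = [date(2021, 3, 10)]  # hypothetical failure history of one piece of equipment
labels = [
    binary_label(date(2021, 3, day), failures, horizon_days=7)
    for day in (1, 3, 5, 10)
]
```

Here X is 7 days: the observations on 3 and 5 March get label 1, while 1 March (failure too far away) and 10 March (failure already happened) get label 0.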

Regression • Goal: Predict remaining useful life (RUL) of the
equipment • Label: Time for which an asset is operational before next failure (RUL) • Continuous Number • Disadvantage • Equipment without any failures cannot be used for modeling
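
The RUL label can be sketched as the number of days until the next recorded failure. The function name and dates below are illustrative assumptions. Returning `None` when no later failure exists makes the stated disadvantage concrete: such rows have no label and cannot be used for training.

```python
from datetime import date


def remaining_useful_life(observation_date, failure_dates):
    """Days until the next failure on or after observation_date; None if none is recorded."""
    upcoming = [f for f in failure_dates if f >= observation_date]
    if not upcoming:
        return None  # no later failure: this row cannot be labeled
    return (min(upcoming) - observation_date).days


failures = [date(2021, 3, 10), date(2021, 6, 1)]  # hypothetical failure history
rul_values = [
    remaining_useful_life(d, failures)
    for d in (date(2021, 3, 1), date(2021, 3, 10), date(2021, 3, 11), date(2021, 6, 2))
]
```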

Multi-class Classification (1) • Goal: Predict the probability of failure
within next …, 3X, 2X, X units of time • Labels (Discrete Number) • Healthy (0) • Failure within 3X time unit (3Z) • Failure within 2X time unit (2Z) • Failure within X time unit (Z)
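
A sketch of the multi-window labels above, assuming the class values Z, 2Z, 3Z are encoded as 1, 2, 3 (that is, Z = 1) and X = 7 days. Both choices, and the function name, are assumptions made for this example.

```python
def multiclass_label(days_to_failure, x_days, n_windows=3):
    """Return the smallest k with failure within k * x_days (k <= n_windows), else 0."""
    if days_to_failure is None:
        return 0  # no known upcoming failure: healthy
    for k in range(1, n_windows + 1):
        if days_to_failure <= k * x_days:
            return k
    return 0  # failure is further away than the largest window: healthy


labels = [multiclass_label(d, x_days=7) for d in (3, 10, 20, 40, None)]
```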

Multi-class Classification (2) • Goal: Predict the probability of failure within the next X units of time due to root cause Pi • Labels • Failure due to different root causes (P1, P2, P3, .. ) • Healthy (0)

Time Series Classification • If business permits, Classification is preferred
over Regression

Data Requirement • Relevant Data • Discuss with domain expert
• Sufficient Data • Duration (Year, Month, Day..) • Larger number of failures • Different types of failures • Quality of data • Garbage In, Garbage Out Reference: Google : Hidden Technical Debt in Machine Learning Systems

Data Collection • Data Source • Temporal Data • Equipment’s
Health • Example: Vibration, Voltage, Temperature, Humidity, Pressure etc. • Collected using IoT sensors • Temporal features reflecting aging pattern & anomalies • Represents normal & faulty behaviors over time • Maintenance history • Example: Dates of Repair activities, Components replaced etc. • Captures degradation patterns • Failure history • Weather • Usage (Load) of the equipment • Static Data • Equipment Metadata • Manufacturer, Make, Model • Manufacture Date, Installation Date, Age • Geographical Location

Data Exploration & Validation • Goal : Visualize & Validate
• Data is relevant • Data includes expected patterns • In case of no obvious patterns, add more features Reference: https://cloud.google.com/blog/products/data-analytics/a-process-for-implementing-industrial-predictive-maintenance-part-ii

Data Pre-Processing • Structure data from various sources into tabular format • Each row represents the state of a piece of equipment at a particular point in time, accompanied by a label • Up-Sampling/Down-Sampling • Data collection frequency may not match prediction frequency • Data may be collected hourly, but failure may be predicted at the day level
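
Down-sampling from the collection frequency to the prediction frequency can be sketched in plain Python. The readings and the choice of mean and max as daily aggregates are illustrative; with pandas, resampling on a datetime index would be the more usual route.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def downsample_daily(readings):
    """Aggregate (timestamp, value) readings into one mean and max per calendar day."""
    by_day = defaultdict(list)
    for timestamp, value in readings:
        by_day[timestamp.date()].append(value)
    return {
        day: {"mean": mean(values), "max": max(values)}
        for day, values in sorted(by_day.items())
    }


# Hypothetical temperature readings (two per day to keep the example short).
readings = [
    (datetime(2021, 3, 1, 0), 70.0),
    (datetime(2021, 3, 1, 12), 74.0),
    (datetime(2021, 3, 2, 0), 80.0),
    (datetime(2021, 3, 2, 12), 90.0),
]
daily = downsample_daily(readings)
```

Each key of `daily` then corresponds to one row of the day-level table.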

Data Pre-Processing • Missing Value Handling • Temporal Data (Examples) • Forward Filling • Interpolation • Domain Specific • Fill a missing pressure value for 1 PM on a Tuesday • with last Tuesday 1 PM’s value • with the Tuesday 1 PM value averaged over the last month • etc. • Strategy should be validated using cross-validation • Removal of duplicates
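
The three filling strategies can be sketched as follows, with `None` marking a missing reading. Function names and sample values are hypothetical. `fill_seasonal` is one possible reading of the "same time last week" rule: with hourly data, `period=168` would look back exactly one week.

```python
def forward_fill(values):
    """Replace each None with the most recent known value (leading Nones stay None)."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def interpolate_linear(values):
    """Fill runs of None linearly between the nearest known neighbours."""
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled  # gaps before the first or after the last known value stay None


def fill_seasonal(values, period):
    """Fill None with the value observed exactly one period earlier, if available."""
    filled = list(values)
    for i, v in enumerate(filled):
        if v is None and i >= period and filled[i - period] is not None:
            filled[i] = filled[i - period]
    return filled


pressure = [10.0, None, None, 16.0, None]
```

Which strategy works best depends on the signal, which is why the choice should be validated rather than assumed.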

Feature Engineering • Goal: Extract valuable information from raw data that the algorithm can’t see on its own

Feature Engineering (Temporal Data) • Aggregation • Data over individual time units (e.g. days) is noisy • Needs to be smoothed by aggregating over time windows • Examples • Temperature: Fluctuating. Average value over a day may rise with degradation • Vibration: May increase drastically before failure. Max over a day could be a good feature References: https://cloud.google.com/blog/products/data-analytics/a-process-for-implementing-industrial-predictive-maintenance-part-ii and https://docs.microsoft.com/en-us/azure/machine-learning/team-data-science-process/predictive-maintenance-playbook

Feature Engineering (Temporal Data): Rolling Aggregate • “How far into the future the model has to predict” influences “how far into the past the model has to look back” to make predictions • Lag Features • The “looking back” period is called the “lag” • Rolling Aggregate (Examples) • Rolling average of temperature over the last 7, 15, 21 days • Rolling max of vibration over the last 7, 15, 21 days • Rolling count of alarms over the last 1, 3, 5, 7 days
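
Lag and rolling-aggregate features can be sketched with two small helpers. The names, the 3-day window, and the temperature values are assumptions chosen to keep the example short; with pandas the equivalent would typically use `shift` and `rolling`.

```python
from statistics import mean


def rolling(values, window, func):
    """Apply func to each trailing window; positions without a full window yield None."""
    return [
        func(values[i - window + 1 : i + 1]) if i >= window - 1 else None
        for i in range(len(values))
    ]


def lag(values, periods):
    """Shift values so that position i holds the value observed at i - periods."""
    return [None] * periods + values[: len(values) - periods]


daily_temperature = [70, 72, 71, 75, 80]
rolling_mean_3 = rolling(daily_temperature, 3, mean)
rolling_max_3 = rolling(daily_temperature, 3, max)
lag_1 = lag(daily_temperature, 1)
```

Every feature at position i uses only values up to i, so no information from the future leaks into a row.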

Feature Engineering (Temporal Data) • Functions For Aggregation • Count • Average • Maximum • Minimum • Median • Standard Deviation • Variance • Sum • Cumulative Sum • Derivative • 2nd Derivative • Count of outliers
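
Most of the functions listed are built into Python or its `statistics` module; the less obvious ones can be sketched as below. The derivative is taken as the first difference between consecutive readings, and an outlier is defined by a z-score threshold of 2. Both are assumptions, as other definitions are equally reasonable.

```python
from itertools import accumulate
from statistics import mean, pstdev


def first_difference(values):
    """Discrete derivative: change between consecutive readings."""
    return [b - a for a, b in zip(values, values[1:])]


def count_outliers(values, z_threshold=2.0):
    """Count readings more than z_threshold population std devs away from the mean."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return 0  # constant signal: nothing can be an outlier
    return sum(abs(v - mu) / sigma > z_threshold for v in values)


readings = [1.0, 2.0, 4.0, 7.0]
derivative = first_difference(readings)
second_derivative = first_difference(derivative)
cumulative_sum = list(accumulate(readings))

# Hypothetical vibration signal with a single spike at the end.
vibration = [1.0] * 9 + [11.0]
n_outliers = count_outliers(vibration)
```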

Feature Engineering • Date • Day • Week • Weekday/Weekend
• Month • Quarter • Year • etc. • Maintenance Data • Days since last failure • Days since last failure because of specific root cause • Days since specific part replaced • Days since last maintenance • Static Data • Age of the equipment
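
A sketch of how date, maintenance, and static features could be combined into one row. All names and dates are hypothetical, and only a few of the listed features are shown.

```python
from datetime import date


def days_since_last(observation_date, event_dates):
    """Days since the most recent event strictly before observation_date; None if none."""
    past = [d for d in event_dates if d < observation_date]
    return (observation_date - max(past)).days if past else None


def build_features(observation_date, install_date, failure_dates, maintenance_dates):
    """Assemble one feature row for a single equipment/date pair."""
    return {
        "month": observation_date.month,
        "is_weekend": observation_date.weekday() >= 5,  # Monday is 0
        "days_since_last_failure": days_since_last(observation_date, failure_dates),
        "days_since_last_maintenance": days_since_last(observation_date, maintenance_dates),
        "age_days": (observation_date - install_date).days,
    }


features = build_features(
    observation_date=date(2021, 3, 1),
    install_date=date(2020, 3, 1),
    failure_dates=[],
    maintenance_dates=[date(2021, 1, 10), date(2021, 2, 20)],
)
```

Equipment with no earlier failure gets `None` for `days_since_last_failure`; how to encode that for the model is a separate decision.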

Model Architecture & Algorithms • Binary Classification: RNN, LSTM; DNN; GBM, Random Forest, SVM (etc.) • Multi-class Classification: RNN, LSTM; DNN; GBM, Random Forest, SVM, Hidden Markov Chain (etc.) • Regression: RNN, LSTM; DNN; GBM, RF, Regression (etc.) Reference: https://cloud.google.com/blog/products/data-analytics/a-process-for-implementing-industrial-predictive-maintenance-part-ii

Cross Validation • Goal • Validates a model during & at the end of training • Helps reduce overfitting • Helps the model generalize to unseen data Reference: https://scikit-learn.org/stable/modules/cross_validation.html

Time Series Cross Validation • In PdM, data is ordered by time • Training, validation & test data must be split in a time-dependent manner • Validation data must come later in time than the training data Reference: https://eng.uber.com/forecasting-introduction/
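
One common way to satisfy these constraints is an expanding-window split, sketched below on row indices that are assumed to be sorted by time. The function name and sizes are illustrative; scikit-learn's `TimeSeriesSplit` provides a comparable ready-made splitter.

```python
def expanding_window_splits(n_samples, n_splits, test_size):
    """Yield (train_indices, validation_indices); validation always follows training."""
    for k in range(n_splits):
        train_end = n_samples - (n_splits - k) * test_size
        if train_end <= 0:
            raise ValueError("not enough samples for the requested splits")
        train = list(range(train_end))
        validation = list(range(train_end, train_end + test_size))
        yield train, validation


splits = list(expanding_window_splits(n_samples=10, n_splits=3, test_size=2))
```

Each fold trains on a longer history than the previous one, and no fold ever validates on data older than its training data.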

Split between Training & Test Data • Split by Time • Separate train & test data by the window size (the “look-ahead time into the future”) • Split by Equipment • Gives better performance on new, unseen equipment
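
Both split strategies can be sketched on a list of row dictionaries. The field names `date` and `equipment_id`, the cutoff, and the 7-day horizon are assumptions for this example. The gap matters because labels of training rows near the cutoff already look ahead by the horizon, so they would otherwise overlap with the test period.

```python
from datetime import date, timedelta


def split_by_time(rows, cutoff, horizon_days):
    """Train on rows up to cutoff; test on rows after cutoff + horizon (gap in between)."""
    gap_end = cutoff + timedelta(days=horizon_days)
    train = [r for r in rows if r["date"] <= cutoff]
    test = [r for r in rows if r["date"] > gap_end]
    return train, test


def split_by_equipment(rows, test_equipment_ids):
    """Hold out whole equipment units so the test set contains only unseen equipment."""
    train = [r for r in rows if r["equipment_id"] not in test_equipment_ids]
    test = [r for r in rows if r["equipment_id"] in test_equipment_ids]
    return train, test


# Hypothetical data: two pieces of equipment, one row per day for 20 days.
rows = [
    {"equipment_id": equipment, "date": date(2021, 3, day)}
    for equipment in ("A", "B")
    for day in range(1, 21)
]
time_train, time_test = split_by_time(rows, cutoff=date(2021, 3, 10), horizon_days=7)
equip_train, equip_test = split_by_equipment(rows, {"B"})
```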

Model Evaluation (Binary Classification) • Goal: What metric to optimize for? • Determining Factors • Imbalanced Data • High Cost of False Alarm • Performance Metrics • Accuracy: Not suitable for imbalanced data • Precision: Lower value corresponds to a higher rate of false alarms • Recall: Higher value corresponds to more true failures being identified • F1 Score: Harmonic mean of precision and recall • ROC (Receiver Operating Characteristic) Curve
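
The metrics above can be computed directly from the confusion counts. The sketch below uses made-up predictions on an imbalanced data set (2 failures in 20 observations) to show why accuracy alone is misleading: a model that never raises an alarm scores the same accuracy as one that catches half the failures.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary labels (1 = failure)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(pairs)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


y_true = [1, 1] + [0] * 18          # illustrative ground truth: 2 failures
always_healthy = [0] * 20           # never raises an alarm
some_model = [1, 0, 1] + [0] * 17   # catches 1 failure, raises 1 false alarm

baseline = classification_metrics(y_true, always_healthy)
candidate = classification_metrics(y_true, some_model)
```

Precision, recall, and F1 separate the two models even though their accuracy is identical.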

Model Serving/Prediction • Goal: Deploy the model in production, so that it starts making predictions on new, unseen data • Need • Data must be pre-processed & feature-engineered exactly the same way as in model training • Suggested Approach : Batch Scoring • The model’s decision is not needed immediately • Example : Once a day, predict which equipment is going to fail in the next 7 days

Model Monitoring/Maintenance • Evaluate the model’s performance in production • Compare predictions vs ground truths • Did the failures really happen as predicted by the model? • Was the equipment healthy when predicted to be healthy? • Degradation of the model’s performance may indicate a need for retraining Reference: https://docs.aws.amazon.com/sagemaker/latest/dg/how-it-works-mlconcepts.html

References • Machine Learning • A visual introduction to machine
learning • Introduction to Machine Learning and Deep Learning by Sebastian Raschka • Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow • Predictive Maintenance • Azure AI guide for predictive maintenance solutions • A process for implementing industrial predictive maintenance • A Survey of Predictive Maintenance: Systems, Purposes and Approaches

Questions? arnabbiswas1
