Basics of Machine Learning & Its Application In Predictive Maintenance

This presentation introduces the concept of Machine Learning and then discusses how Machine Learning is being used in the Predictive Maintenance domain.

Arnab Biswas

August 01, 2020


  1. Table of Contents • Basics of Machine Learning • Classical

    Programming vs Machine Learning • Types of Machine Learning • Types of Supervised Learning • Application of ML in Predictive Maintenance (PdM) • Types of Maintenance • Goals & Use Cases for PdM • Data Science For PdM
  2. Classical Programming / Software 1.0 • Take help of a

    domain expert • Survey existing apartments in Bangalore • Identify factors contributing to the price of an apartment • Area • Size • Number of Bedrooms, Bathrooms • Name of the builder • etc. • Write a program which outputs the price based on the attributes identified Reference : https://medium.com/@karpathy/software-2-0-a64152b37c35
  3. Machine Learning/Software 2.0 • First Step: Collect data (as much

    as possible) Reference : https://www.kaggle.com/amitabhajoy/bengaluru-house-price-data
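
The contrast between Software 1.0 and Software 2.0 on the two slides above can be sketched roughly as follows. This is only an illustration: the apartment records, the hand-written rate, and the function names are invented here and are not taken from the Kaggle dataset or the deck.

```python
# Toy records (area in sq. ft., price in lakh INR); invented for illustration only.
AREAS = [600.0, 900.0, 1200.0, 1500.0]
PRICES = [30.0, 45.0, 60.0, 75.0]


def price_software_1_0(area_sqft):
    """Classical programming: a person writes the rule (hypothetical rate)."""
    return 0.04 * area_sqft


def fit_line(xs, ys):
    """Software 2.0: estimate slope and intercept from data by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(AREAS, PRICES)


def price_software_2_0(area_sqft):
    """Prediction from the parameters learned above."""
    return slope * area_sqft + intercept
```

If the market changes, the hand-written rate has to be edited manually, whereas the learned parameters change simply by refitting on newer data.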
  4. ML Works Better When… • Problems for which classical programming

    requires a long list of rules that is difficult to maintain. ML can simplify the code. • ML “automatically” discovers changes in the data, whereas classical programming needs manual updates to the rules. • ML performs better on complex problems (image, text, audio, etc.) • Humans can gain insights from ML models
  5. Humans can gain insights from ML models • Stages of

    Cancer • Medical textbooks decide based on the number of “yes” answers to these questions: 1. Has the cancer affected more than one lymph node? 2. Are the cancerous lymph nodes both above & below the bottom of the rib cage? 3. Is the cancer found in organs outside the lymphatic system (in the patient's bone marrow)? • A 2018 research paper (University of Modena & Reggio Emilia) • Analyzed 15 variables, identifying 5 features • Due to limited cognitive ability, humans rely on a handful of the most obvious signifiers/features • ML/AI can decide based on hundreds, if not thousands, of distinct features • May include traditional as well as less intuitive features
  6. Machine Learning : Formal Definition • A Machine is Learning

    when it improves at a task based on experience at that task, but without explicit programming. Reference : https://cloud.google.com/products/ai/ml-comic-1/
  7. AI vs ML • AI: Quest for developing non-biological systems

    that exhibit human-like forms of intelligence. Reference: https://sebastianraschka.com/blog/2020/intro-to-dl-ch01.html
  8. Examples of Machine Learning • Recommending a video/song (Recommender System)

    • Detecting cancer based on an X-ray image (Computer Vision) • Forecasting a company’s revenue based on various factors (Time Series Forecasting) • Summarizing a long document into shorter, meaningful text (Language Processing) • Writing HTML, SQL, Unix code based on human language (Language Processing - GPT-3)
  9. Types of ML Systems • Whether or not trained with

    human supervision • Supervised Learning • Unsupervised Learning • Reinforcement Learning • Whether learning is incremental • Online Learning • Batch Learning • Instance based vs Model based learning
  10. Supervised Learning • User provides the algorithm with inputs (features)

    and desired outputs (labels) • The algorithm can create an output for an unseen input • The user (teacher) supervises the algorithm as it learns
  11. Unsupervised Learning • Only input data is known & passed

    to the algorithm • Output data is unknown • Often used to understand data better before solving a supervised learning problem • Usually harder to understand and evaluate • Applications • Segmenting readers based on their reading habits • Identifying topics of news articles • Anomaly Detection • Dimensionality Reduction • Clustering
  12. Unsupervised Learning : Clustering • Each dot on plot represents

    a research article on COVID Reference: https://maksimekin.github.io/COVID19-Literature-Clustering/plots/t-sne_covid-19_interactive.html
  13. Reinforcement Learning • Steps • Learning system (agent) observes an

    environment • Selects & performs actions • Gets rewarded or punished for actions • The learning system must learn by itself the best strategy (policy) to earn the most reward over time. • Examples • Robotics • AlphaGo Program • Energy Efficiency Reference: Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow
  14. Supervised Machine Learning • Regression: Goal is to predict a

    continuous number (Label: Continuous Number) • Classification: Goal is to predict a class label (Label: Distinct Values) Reference: https://sebastianraschka.com/blog/2020/intro-to-dl-ch01.html
  15. Types of Maintenance • Reactive Maintenance • Parts of

    equipment are replaced only on failure • Doesn’t waste the part’s life, but results in downtime and unscheduled maintenance • Preventive Maintenance • Replaces a part after a pre-determined useful lifespan, before it fails • Avoids unscheduled maintenance • Under-utilizes parts • Predictive Maintenance • Replaces only the parts close to failure (just-in-time replacement) • Extends the part’s lifespan • Reduces unscheduled maintenance Reference: https://docs.microsoft.com/en-us/azure/machine-learning/team-data-science-process/predictive-maintenance-playbook https://arxiv.org/pdf/1912.07383.pdf
  16. Predictive Maintenance (PdM) : Goals • Predict whether equipment

    is going to fail in the near future • Predict days to failure • Helps in scheduling maintenance • Predict the most probable root cause of a failure • Helps in identifying the part(s) to repair/replace
  17. Sample Use Cases • Failure of engine parts in an

    aircraft • HVAC equipment failure • Elevator door failure • Wind turbine failure • Failure of train wheels
  18. Data Science For Predictive Maintenance • Steps • Convert Business

    Problem into Data Science problem • Understand Data • Prepare Data • Build Model • Evaluate Model • Deploy Model • Monitor/Maintain Model Reference: https://en.wikipedia.org/wiki/Cross-industry_standard_process_for_data_mining
  19. Business problem into Data Science problem • Binary Classification •

    Predict the probability that equipment fails within a future time period • Regression • Predict the amount of time that equipment remains operational before the next failure • Multi-class classification • Predict the probability that equipment fails within the next …, 3X, 2X, X units of time • Predict the probability that equipment fails within a future time period due to a particular root cause
  20. Binary Classification • Goal: Predict probability of failure within next

    X units of time • Labels (Discrete Number) • Failure within X time units (1) • Healthy (0) Reference: https://docs.microsoft.com/en-us/azure/machine-learning/team-data-science-process/predictive-maintenance-playbook
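
One possible way to derive the binary label described on this slide from a failure history is sketched below. The function name, the dates, and the 7-day horizon are hypothetical, and treating the failure day itself as outside the window is an assumption made here, not something the deck specifies.

```python
from datetime import date, timedelta


def binary_label(observation_day, failure_days, horizon_days):
    """Return 1 if a failure occurs within `horizon_days` after the observation, else 0."""
    window_end = observation_day + timedelta(days=horizon_days)
    # Assumption: only failures strictly after the observation day count.
    return int(any(observation_day < f <= window_end for f in failure_days))


# Hypothetical failure history for one piece of equipment.
failures = [date(2020, 3, 10)]
labels = [
    binary_label(date(2020, 3, d), failures, horizon_days=7)
    for d in (1, 3, 9, 10, 11)
]
```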
  21. Regression • Goal: Predict remaining useful life (RUL) of the

    equipment • Label: Time for which an asset is operational before next failure (RUL) • Continuous Number • Disadvantage • Equipment without any failures cannot be used for modeling
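
A minimal sketch of how the RUL label could be computed from recorded failure dates follows; the function name and dates are hypothetical. It also makes the stated disadvantage concrete: when no later failure is recorded, no label can be produced.

```python
from datetime import date


def remaining_useful_life(observation_day, failure_days):
    """Days until the next recorded failure on or after observation_day.

    Returns None when no such failure exists, so the row cannot be labeled.
    """
    upcoming = [f for f in failure_days if f >= observation_day]
    if not upcoming:
        return None
    return (min(upcoming) - observation_day).days


# Hypothetical failure history for one piece of equipment.
failure_history = [date(2020, 3, 10), date(2020, 6, 1)]
rul_early = remaining_useful_life(date(2020, 3, 1), failure_history)
rul_after_last = remaining_useful_life(date(2020, 7, 1), failure_history)
```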
  22. Multi-class Classification (1) • Goal: Predict the probability of failure

    within the next …, 3X, 2X, X units of time • Labels (Discrete Number) • Healthy (0) • Failure within 3X time units (3Z) • Failure within 2X time units (2Z) • Failure within X time units (Z)
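
The bucketed labels on this slide could be assigned along the following lines. This is a sketch under assumptions: the class index k stands in for the slide's labels Z, 2Z, 3Z, the function name is hypothetical, and three buckets of 7 days are chosen only for the example.

```python
def multiclass_label(days_to_failure, x_days, n_buckets=3):
    """Return the smallest k with failure within k * x_days (k <= n_buckets), else 0 (healthy)."""
    if days_to_failure is None:  # no upcoming failure recorded
        return 0
    for k in range(1, n_buckets + 1):
        if days_to_failure <= k * x_days:
            return k
    return 0


# With X = 7 days: 5 days -> class 1, 10 days -> class 2, 22 days -> healthy.
example_labels = [multiclass_label(d, x_days=7) for d in (5, 10, 21, 22, None)]
```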
  23. Multi-class Classification (2) • Goal: Predict probability of failure within the next

    X units of time due to root cause Pi • Labels • Failure due to different root causes (P1, P2, P3, .. ) • Healthy (0)
  24. Data Requirement • Relevant Data • Discuss with domain expert

    • Sufficient Data • Duration (Year, Month, Day..) • Larger number of failures • Different types of failures • Quality of data • Garbage In, Garbage Out Reference: Google : Hidden Technical Debt in Machine Learning Systems
  25. Data Collection • Data Source • Temporal Data • Equipment’s

    Health • Example: Vibration, Voltage, Temperature, Humidity, Pressure etc. • Collected using IoT sensors • Temporal features reflecting aging pattern & anomalies • Represents normal & faulty behaviors over time • Maintenance history • Example: Dates of Repair activities, Components replaced etc. • Captures degradation patterns • Failure history • Weather • Usage (Load) of the equipment • Static Data • Equipment Metadata • Manufacturer, Make, Model • Manufacture Date, Installation Date, Age • Geographical Location
  26. Data Exploration & Validation • Goal : Visualize & Validate

    • Data is relevant • Data includes expected patterns • In case of no obvious patterns, add more features Reference: https://cloud.google.com/blog/products/data-analytics/a-process-for-implementing-industrial-predictive-maintenance-part-ii
  27. Data Pre-Processing • Structure data from various sources into tabular

    format • Each row represents the state of a piece of equipment at a particular point in time, accompanied by a label • Up-Sampling/Down-Sampling • Data collection frequency may not match the prediction frequency • Data may be collected hourly, but failure may be predicted at the day level
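
Down-sampling hourly readings to the daily prediction level might look like the sketch below. The readings and the function name are made up, and the daily mean is just one possible aggregate.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def downsample_daily(readings):
    """Aggregate (timestamp, value) readings into one mean value per calendar day."""
    by_day = defaultdict(list)
    for ts, value in readings:
        by_day[ts.date()].append(value)
    return {day: mean(values) for day, values in sorted(by_day.items())}


# Hypothetical hourly temperature readings.
hourly = [
    (datetime(2020, 8, 1, 0), 70.0),
    (datetime(2020, 8, 1, 1), 72.0),
    (datetime(2020, 8, 2, 0), 80.0),
]
daily = downsample_daily(hourly)
```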
  28. Data Pre-Processing • Missing Value Handling • Temporal Data (Examples)

    • Forward Filling • Interpolation • Domain Specific • Fill a missing pressure value for 1 PM on a Tuesday • with last Tuesday 1 PM’s value • with the Tuesday 1 PM value averaged over the last month • etc. • Strategy should be validated using cross-validation • Removal of duplicates
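
Forward filling and linear interpolation, two of the strategies listed above, can be sketched in plain Python as follows. Missing values are represented as None and the pressure series is invented for the example.

```python
def forward_fill(values):
    """Replace each None with the most recent known value (leading gaps stay None)."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def interpolate_linear(values):
    """Fill interior gaps linearly between the nearest known neighbors.

    Leading and trailing gaps are left as None.
    """
    result = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            result[i] = values[left] + step * (i - left)
    return result


# Hypothetical pressure readings with gaps.
pressure = [10.0, None, None, 16.0, None]
ffilled = forward_fill(pressure)
interpolated = interpolate_linear(pressure)
```

The two strategies disagree on the interior gap (10.0, 10.0 versus 12.0, 14.0), which is why the slide suggests validating the choice.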
  29. Feature Engineering (Temporal Data) • Aggregation • Data over individual

    time units (e.g. days) is noisy • Needs to be smoothed by aggregating over time windows • Examples • Temperature: Fluctuates. The daily average may rise with degradation • Vibration: May increase drastically before failure. The daily maximum could be a good feature Reference: https://cloud.google.com/blog/products/data-analytics/a-process-for-implementing-industrial-predictive-maintenance-part-ii https://docs.microsoft.com/en-us/azure/machine-learning/team-data-science-process/predictive-maintenance-playbook
  30. Feature Engineering (Temporal Data) : Rolling Aggregate • “How far in future the model has to predict”

    influences “how far in past the model has to look back” to make predictions • Lag Features • “Looking back” period is called “Lag” • Rolling Aggregate (Examples) • Rolling Average of temperature over last 7, 15, 21 days • Rolling Max of vibration over last 7, 15, 21 days • Rolling count of alarms over last 1, 3, 5, 7 days
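
The rolling aggregates listed on this slide can be illustrated with a small trailing-window helper. The helper name and the daily values are hypothetical, and a 3-day window is used instead of 7, 15, or 21 days only to keep the example short.

```python
from statistics import mean


def rolling(values, window, func):
    """Apply func to each trailing window of `window` values; None until the window is full."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(func(values[i + 1 - window : i + 1]))
    return out


# Hypothetical daily values for one piece of equipment.
daily_temperature = [70.0, 71.0, 73.0, 76.0, 80.0]
daily_vibration = [0.2, 0.3, 0.2, 0.9, 0.4]
daily_alarms = [0, 1, 0, 2, 1]

rolling_mean_temperature = rolling(daily_temperature, 3, mean)
rolling_max_vibration = rolling(daily_vibration, 3, max)
rolling_alarm_count = rolling(daily_alarms, 3, sum)
```

Each feature only looks backwards from the current day, so it can be computed at prediction time without using future data.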
  31. Feature Engineering (Temporal Data) • Functions For Aggregation • Count

    • Average • Maximum • Minimum • Median • Standard Deviation • Variance • Sum • Cumulative Sum • Derivative • 2nd Derivative • Count of outliers
  32. Feature Engineering • Date • Day • Week • Weekday/Weekend

    • Month • Quarter • Year • etc. • Maintenance Data • Days since last failure • Days since last failure because of specific root cause • Days since specific part replaced • Days since last maintenance • Static Data • Age of the equipment
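
A rough sketch of the date and maintenance features from this slide follows. The function names, dictionary keys, and dates are hypothetical choices for illustration.

```python
from datetime import date


def days_since_last(event_days, observation_day):
    """Days since the most recent event strictly before observation_day, or None if none."""
    past = [d for d in event_days if d < observation_day]
    return (observation_day - max(past)).days if past else None


def date_features(day):
    """Calendar features derived from a single date."""
    return {
        "weekday": day.weekday(),  # Monday = 0, Sunday = 6
        "is_weekend": day.weekday() >= 5,
        "month": day.month,
        "quarter": (day.month - 1) // 3 + 1,
        "year": day.year,
    }


# Hypothetical failure history and observation day.
past_failures = [date(2020, 7, 1), date(2020, 7, 20)]
observation = date(2020, 8, 1)
features = date_features(observation)
features["days_since_last_failure"] = days_since_last(past_failures, observation)
```

The same `days_since_last` helper could be reused for maintenance dates or part replacements by passing a different list of event dates.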
  33. Model Architecture & Algorithms • Binary Classification: RNN, LSTM; DNN; GBM, Random Forest, SVM (etc.) • Multi-class Classification: RNN, LSTM; DNN; GBM, Random Forest, SVM, Hidden Markov Chain (etc.) • Regression: RNN, LSTM; DNN; GBM, RF Regression (etc.) Reference: https://cloud.google.com/blog/products/data-analytics/a-process-for-implementing-industrial-predictive-maintenance-part-ii
  34. Cross Validation • Goal • Validates a model during &

    at the end of training • Helps detect and reduce overfitting • Checks that the model generalizes well to unseen data Reference: https://scikit-learn.org/stable/modules/cross_validation.html
  35. Time Series Cross Validation • In PdM, data is ordered

    following time • Training, validation, and test data must be split in a time-dependent manner. • Validation data must lie in the future relative to training data Reference: https://eng.uber.com/forecasting-introduction/
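
One common way to satisfy this constraint is an expanding-window split, sketched below over row indices that are assumed to be sorted by time. The function name and parameters are hypothetical; scikit-learn's `TimeSeriesSplit` offers a comparable scheme.

```python
def expanding_window_splits(n_samples, n_splits, test_size):
    """Yield (train_indices, validation_indices); validation always follows training in time."""
    for k in range(n_splits):
        val_end = n_samples - (n_splits - 1 - k) * test_size
        val_start = val_end - test_size
        if val_start <= 0:
            raise ValueError("not enough samples for the requested splits")
        yield list(range(val_start)), list(range(val_start, val_end))


# 10 time-ordered rows, 3 folds, 2 validation rows per fold.
splits = list(expanding_window_splits(n_samples=10, n_splits=3, test_size=2))
```

Each later fold trains on more history, and no fold ever validates on rows older than its training rows.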
  36. Split between Training & Test Data • Split by Time

    • Separate Train & Test data by the window size (“Look ahead time in future”) • Split by Equipment • Better performance with new equipment
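
The two splitting strategies on this slide might be sketched as follows. The row layout (`equipment_id`, integer time step `t`), the function names, and the gap of 2 time steps are assumptions made for the example; the gap plays the role of the look-ahead window that separates train from test.

```python
def split_by_time(rows, cutoff, gap):
    """Train on rows with t <= cutoff; test on rows with t > cutoff + gap.

    Rows inside the gap are dropped so that training labels cannot overlap the test period.
    """
    train = [r for r in rows if r["t"] <= cutoff]
    test = [r for r in rows if r["t"] > cutoff + gap]
    return train, test


def split_by_equipment(rows, test_ids):
    """Hold out whole pieces of equipment so the test set contains only unseen equipment."""
    train = [r for r in rows if r["equipment_id"] not in test_ids]
    test = [r for r in rows if r["equipment_id"] in test_ids]
    return train, test


# Hypothetical data: two pieces of equipment, ten time steps each.
rows = [{"equipment_id": e, "t": t} for e in ("A", "B") for t in range(10)]
time_train, time_test = split_by_time(rows, cutoff=5, gap=2)
equip_train, equip_test = split_by_equipment(rows, test_ids={"B"})
```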
  37. Model Evaluation (Binary Classification) • Goal: What metric to optimize

    for? • Determining Factors • Imbalanced Data • High Cost of False Alarm • Performance Metrics • Accuracy: Not suitable, since it is misleading on imbalanced data • Precision: A lower value corresponds to a higher rate of false alarms • Recall: A higher value corresponds to more true failures being identified • F1 Score: Harmonic mean of precision and recall • ROC (Receiver Operating Characteristic) Curve
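
The metrics on this slide can be computed from the confusion counts as in the sketch below. The label vectors are invented (5 failures among 100 rows) to show why accuracy alone is misleading on imbalanced data; the function name is hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = failure, 0 = healthy)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Invented, imbalanced ground truth: 95 healthy rows, 5 failures.
y_true = [0] * 95 + [1] * 5

# A model that never predicts failure still reaches 95% accuracy.
always_healthy = classification_metrics(y_true, [0] * 100)

# A model with 2 false alarms that catches 4 of the 5 failures.
useful_model = classification_metrics(y_true, [0] * 93 + [1] * 2 + [1] * 4 + [0])
```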
  38. Model Serving/Prediction • Goal: Deploy the model in production, so

    that it starts making predictions on new, unseen data • Need • Data must be pre-processed & feature-engineered exactly the same way as during model training • Suggested Approach : Batch Scoring • Model’s decision is not needed immediately • Example : Once a day, predict the equipment that is going to fail in the next 7 days
  39. Model Monitoring/Maintenance • Evaluate model’s performance in production • Compare

    predictions vs ground truths • Did the failures really happen as predicted by the model? • Was the equipment healthy when it was predicted to be healthy? • Degradation of the model’s performance may indicate a need for retraining Reference: https://docs.aws.amazon.com/sagemaker/latest/dg/how-it-works-mlconcepts.html
  40. References • Machine Learning • A visual introduction to machine

    learning • Introduction to Machine Learning and Deep Learning by Sebastian Raschka • Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow • Predictive Maintenance • Azure AI guide for predictive maintenance solutions • A process for implementing industrial predictive maintenance • A Survey of Predictive Maintenance: Systems, Purposes and Approaches