Maxim Nikiforov. Machine Learning in Non-Functional Testing

We invite you to join the International Summer School on Data Science in Software Engineering. The summer school will be held online on 12-16 July, 2021, organized by the Laboratory of Software Testing, Tomsk Polytechnic University.
Participation is free. The official language of the school is English.

Students, young researchers and practitioners interested in applications of modern data science methods to the development and testing of complex software systems are invited to join. Follow the link to learn the full program and register your participation: https://itr-tpu.timepad.ru/event/1629835/
____
To learn more about Exactpro, visit our website https://exactpro.com/


Exactpro

July 14, 2021
Transcript

  1. Early Detection of Frozen Processes in Loaded Systems

    Build Software to Test Software, exactpro.com
    July 2021
    Maxim Nikiforov, Danila Gorkavchenko, Alexey Chistov, Andrey Novikov, Murad Mamedov
  2. Structure

    1. Introduction:
       a. What system we are testing
       b. Purpose and objectives
    2. Initial data preparation:
       a. Preprocessing data and converting it into tabular form
       b. Providing access to the data
    3. Data preparation and modelling:
       a. Two types of datasets
       b. Processing attempts, results, etc.
       c. Proposed Daily Life Cycle (DLC) of the tool
    4. Results and next steps
  3. Target Testing System

    • Testing system contains 350+ processes
    • 200+ processes are active during the business day
    • Components are deployed over 30 servers
    • They perform various real-time activities and more than 100 batch activities of different types

    [Figure: components per application server (servers 1 to 30); legend: active components, components inactive during the BAU run]
  4. Activities in Target System

    There are the following types of activities:
    • Events that run several times per day
    • Daily events
    • Ad-hoc events
    • Weekly/Monthly events

    How to describe an activity:
    • Unique ID
    • Activity type
    • Start time
    • End time
    • Completion status
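
The five attributes listed above map naturally onto a small record type. The following is a minimal sketch only: the class name, field names and sample values are hypothetical and are not taken from the slides.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ActivityRun:
    """One run of an activity (field names are illustrative assumptions)."""
    run_id: str            # unique ID
    activity_type: str     # e.g. "daily", "ad-hoc", "weekly"
    start_time: datetime
    end_time: datetime
    status: str            # completion status

    @property
    def duration_sec(self) -> float:
        # Duration is derived from start and end time rather than stored.
        return (self.end_time - self.start_time).total_seconds()


# Hypothetical sample run, used only to exercise the class.
run = ActivityRun(
    run_id="run-001",
    activity_type="daily",
    start_time=datetime(2021, 7, 14, 8, 1, 0),
    end_time=datetime(2021, 7, 14, 8, 4, 22),
    status="completed",
)
print(run.duration_sec)  # 202.0
```

Deriving the duration from the two timestamps keeps the record consistent with the later slides, where activity duration is the prediction target.
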
  5. Statistical Information Logs

    1. A special process in the testing platform monitors all processes and collects all the information from them.
    2. This process then writes the information to log files at 1-10 second intervals.

    [Diagram: Comp. 1, Comp. 2, ..., Comp. N → Stat. Logs Collector]

    Initial format of the logs after collection. Each line contains:
    • Timestamp
    • Component ID
    • Group ID of the parameters
    • List of parameters in a custom format

    There are 2 groups of parameters:
    • Critical documented parameters (< 1%)
    • Other parameters
  6. Current Monitoring of Events and Statistics

    1. Target System monitoring limitations:
    • Administrative frontend with text information only
    • No graphic representation
    • It is not possible to view the history of updates
    • The system provides interfaces for external monitoring infrastructure

    2. Additional monitoring for QA purposes:
    • It is implemented in Grafana
    • Only a small part of all values is monitored (critical parameters), but the number of them is too high to fit into one dashboard
    • Real-time activity visualization

    [Screenshots: examples of QA activities and critical parameters monitoring]
  7. Goal

    • To develop an automated approach to predicting deviations before they become obvious
    • It should be possible to adapt it to other systems
    • Logs with statistical parameters will be used for the analysis
    • Ideally, it should indicate the root cause of the problem to the operational user (QA)
  8. Easy Access to the Data

    [Diagram]
    • Raw SUT logs → Converter from raw logs to CSV → SUT logs in CSV format
    • SUT logs in CSV format + Activity data → Framework that provides easy access to necessary information about the activities and logs → Joint logs and activity data suitable for further analysis
  9. Log Converter to CSV Format

    Raw log format:

    08:01:00 : Comp1: Group1: Param1=10, Param2=99, Param3=4
    08:01:01 : Comp1: Group1: Param1=11, Param2=98, Param3=4

    CSV format:

    Timestamp | Comp1 - Group1 - Param1 | Comp1 - Group1 - Param2 | Comp1 - Group1 - Param3
    8:01:00   | 10                      | 99                      | 4
    8:01:01   | 11                      | 98                      | 4

    • The conversion is executed on a daily basis and takes 15 minutes to convert daily logs (30 GB)
    • Size of logs is reduced by 93%
    • Data is unified now
    • Avg. number of columns is 7.5k
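
The conversion from raw log lines to one wide row per timestamp can be sketched as follows. This is an illustration under stated assumptions, not the converter from the talk: it assumes the `time : component: group: name=value, ...` layout of the two sample lines, and the function names `parse_line` and `to_wide_rows` are hypothetical. The real parameter list uses a custom format that the slides do not describe.

```python
import re
from collections import defaultdict

# Assumed layout, inferred from the two sample lines on the slide.
LINE_RE = re.compile(r"^(\d{1,2}:\d{2}:\d{2})\s*:\s*(\w+)\s*:\s*(\w+)\s*:\s*(.*)$")


def parse_line(line):
    """Return (timestamp, {column: value}) or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    timestamp, component, group, params = match.groups()
    values = {}
    for item in params.split(","):
        name, _, value = item.strip().partition("=")
        values[f"{component} - {group} - {name}"] = value
    return timestamp, values


def to_wide_rows(lines):
    """Pivot parsed lines into a header and one row per timestamp."""
    rows = defaultdict(dict)
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None:
            timestamp, values = parsed
            rows[timestamp].update(values)
    columns = sorted({col for row in rows.values() for col in row})
    header = ["Timestamp"] + columns
    # Missing values are left empty, so every row has the same columns.
    table = [[ts] + [rows[ts].get(col, "") for col in columns] for ts in sorted(rows)]
    return header, table


RAW_LINES = [
    "08:01:00 : Comp1: Group1: Param1=10, Param2=99, Param3=4",
    "08:01:01 : Comp1: Group1: Param1=11, Param2=98, Param3=4",
]
header, table = to_wide_rows(RAW_LINES)
print(header)
for row in table:
    print(row)
```

Storing each parameter name once in the header instead of on every line is one plausible reason for the large size reduction reported on the slide.
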
  10. Data Processing Framework

    • Pandas dataframe as output
    • Now it is very easy to access the data: one-line command to get a dataframe
    • It does not load unnecessary information into memory and supports additional filtering

    [Code screenshots: data collection for a period of time; data collection for a certain activity]
  11. High Level Plan of the Research

    Goal: predict activity duration.

    Dataset v1, “Activity-based Dataset”: each row in the dataset contains data for a run of some activity. It can be easily prepared based on the historical data. It is for research purposes only.

    Dataset v2, “Time-based Dataset”: each row of the dataset contains data for some period of time for several activities, as they can be executed simultaneously in the system. The dataset can be created in real time and is supposed to be used in production.
  12. Datasets in Details

    Dataset v1:
    • Rows: run id (1, 2, 3, …, N)
    • Activity parameters: Start time, End time, ID, Status
    • Aggregated parameters (max, min, mean for each value in system logs)
    • Target feature: Activity Duration

    Dataset v2:
    • Rows: timestamp (00:00:01, 00:00:02, 00:00:03, …)
    • Time since start for each activity, if activity is started
    • Aggregated parameters (max, min, mean for each value in system logs)
    • Target feature: Time until activity is completed
  13. Dataset v1 Preparation Pipeline

    Step | Action                                                                             | Dataset shape
    1    | Data collected from CSV files                                                      | 2.5M x 7.5k
    2    | Dataset with numbers only (handle object columns, handle unique ids)               | 2.5M x 8.3k
    3    | All statistical parameters are aggregated using a set of functions (min, mean, max) | 11k x 25k
    4    | Drop constant columns                                                              | 11k x 4.5k
    5    | Drop correlated columns (result: Dataset v1)                                       | 11k x 2.3k
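
Steps 3 to 5 of the pipeline (aggregate per run, drop constant columns, drop correlated columns) can be sketched with the standard library. This is an illustrative sketch only: the function names, the 0.9 correlation threshold and the toy parameter names (`cpu`, `queue`, `flag`) are assumptions, and the slides do not state which correlation measure or threshold was used.

```python
from statistics import mean


def aggregate_run(samples):
    """Step 3: collapse the log samples of one run into min/mean/max features."""
    row = {}
    for name in samples[0]:
        values = [s[name] for s in samples]
        row[f"{name}_min"] = min(values)
        row[f"{name}_mean"] = mean(values)
        row[f"{name}_max"] = max(values)
    return row


def drop_constant(rows):
    """Step 4: keep only columns that take more than one value."""
    keep = [c for c in rows[0] if len({r[c] for r in rows}) > 1]
    return [{c: r[c] for c in keep} for r in rows]


def pearson(xs, ys):
    """Pearson correlation; inputs must be non-constant (run after drop_constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def drop_correlated(rows, threshold=0.9):
    """Step 5: greedily keep a column only if it is not strongly
    correlated with any column kept so far (threshold is an assumption)."""
    kept = []
    for col in rows[0]:
        values = [r[col] for r in rows]
        if all(abs(pearson(values, [r[k] for r in rows])) < threshold for k in kept):
            kept.append(col)
    return [{c: r[c] for c in kept} for r in rows]


# Toy data: three runs, two log samples each (values are made up).
runs = [
    [{"cpu": 10, "queue": 1, "flag": 0}, {"cpu": 20, "queue": 3, "flag": 0}],
    [{"cpu": 30, "queue": 2, "flag": 0}, {"cpu": 50, "queue": 2, "flag": 0}],
    [{"cpu": 5, "queue": 7, "flag": 0}, {"cpu": 15, "queue": 9, "flag": 0}],
]
aggregated = [aggregate_run(samples) for samples in runs]
non_constant = drop_constant(aggregated)
dataset_v1 = drop_correlated(non_constant)
print(len(aggregated[0]), len(non_constant[0]), list(dataset_v1[0]))
```

Aggregating with three functions triples the column count, which matches the jump from 8.3k to 25k columns between steps 2 and 3 on the slide.
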
  14. Training a Model for 1 Activity Type

    • Data = One Activity Type
    • Model = Decision Tree
    • Metrics = Root Mean Squared Error

    Results: RMSE = 202 sec, STD = 45 sec
  15. Training a Single Model for All Activities

    • Data = All activities
    • Model = Decision Tree
    • Metrics = Root Mean Squared Error

    Results: RMSE = 767 sec
  16. Problem 1: Different Number of Runs For Activities

    The number of runs differs widely between activity types: Min: 1, Max: 2000+. Additional RMSE is produced by activities with a low number of runs.

    Solution:
    • Use only activities with more than 5 runs
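
The filter from the solution above can be sketched in a few lines. The record layout (a dict with an `activity_type` key) and the function name are assumptions made for the example.

```python
from collections import Counter


def keep_frequent_activities(runs, more_than=5):
    """Keep only runs whose activity type occurs more than `more_than` times."""
    counts = Counter(run["activity_type"] for run in runs)
    return [run for run in runs if counts[run["activity_type"]] > more_than]


# Made-up sample: type "A" has 6 runs, type "B" has 5, type "C" has 1.
runs = (
    [{"activity_type": "A", "duration": 100 + i} for i in range(6)]
    + [{"activity_type": "B", "duration": 200 + i} for i in range(5)]
    + [{"activity_type": "C", "duration": 900}]
)
filtered = keep_frequent_activities(runs)
print(sorted({run["activity_type"] for run in filtered}))  # ['A']
```
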
  17. Problem 2: Default Train-Test Split Function

    • If we use the default train-test split function, some activity types can be missed either in the train or the test dataset

    Solution:
    • Change the approach and make sure that all activities are in both sets with the expected ratio
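
One way to guarantee that every activity type appears in both sets is to split each type separately. The sketch below is an assumption about how this could be done, not the split used in the talk; the function name, the 25% test ratio and the record layout are hypothetical. It requires at least two runs per type.

```python
import random
from collections import defaultdict


def split_per_activity(runs, test_ratio=0.25, seed=0):
    """Split each activity type on its own so both sets contain every type."""
    rng = random.Random(seed)
    by_type = defaultdict(list)
    for run in runs:
        by_type[run["activity_type"]].append(run)

    train, test = [], []
    for group in by_type.values():
        group = group[:]          # do not shuffle the caller's list
        rng.shuffle(group)
        n_test = max(1, round(len(group) * test_ratio))
        n_test = min(n_test, len(group) - 1)   # always leave one run for training
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


# Made-up sample: 8 runs of type "A" and 6 runs of type "B".
runs = (
    [{"activity_type": "A", "duration": 100 + i} for i in range(8)]
    + [{"activity_type": "B", "duration": 200 + i} for i in range(6)]
)
train, test = split_per_activity(runs)
print(len(train), len(test))
```

With scikit-learn, a similar effect is available through the `stratify` argument of `train_test_split`, which preserves the class proportions of the labels passed to it.
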
  18. Problem 3: Avg. Duration of Different Types of Activities is Different

    • Longer activities produce bigger RMSE, even if the model has good performance for short activities
    • It is required to normalise the target value before training

    Solution:
    • Logarithmic target value
    • Stop using the absolute value of RMSE in seconds and calculate it based on the target value
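
The effect described above can be shown with two made-up activities whose predictions are each off by 10%. The exact formulas used in the talk are not given on the slide, so the relative RMSE below (RMSE of the error divided by the true value) and the `log1p`/`expm1` target transform are assumptions chosen for illustration.

```python
import math


def rmse(y_true, y_pred):
    """Absolute RMSE, in the units of the target (seconds)."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def relative_rmse(y_true, y_pred):
    """RMSE of the relative errors (p - t) / t, as a fraction of the target."""
    return math.sqrt(sum(((p - t) / t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def to_log_target(duration_sec):
    """Compress the target range before training."""
    return math.log1p(duration_sec)


def from_log_target(value):
    """Invert to_log_target to get seconds back from a model prediction."""
    return math.expm1(value)


# Made-up durations: a short and a long activity, both predicted 10% off.
short_true, short_pred = [10.0, 12.0], [11.0, 10.8]
long_true, long_pred = [1000.0, 1200.0], [1100.0, 1080.0]

print(rmse(short_true, short_pred), rmse(long_true, long_pred))
print(relative_rmse(short_true, short_pred), relative_rmse(long_true, long_pred))
```

The absolute RMSE of the long activity is about a hundred times larger, while the relative RMSE is the same for both, which is why a metric based on the target value makes activity types comparable.
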
  19. Attempts to Diminish the RMSE Value

    If we normalize the target column values by the average value, the model predicts the target value more accurately. Limiting the maximum duration of a given activity to its 75th percentile also decreases the RMSE value. However, this eliminates the values for bad activity runs, so the model is trained only on good runs. This makes it look like an inappropriate approach.
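
A small made-up example shows why limiting durations at the 75th percentile hides bad runs. The sketch assumes that "limiting" means clipping values to the percentile (dropping them would have the same effect on the bad run); the function names and sample durations are hypothetical.

```python
from statistics import mean, quantiles


def normalize_by_mean(durations):
    """Express each duration as a multiple of the activity's average duration."""
    average = mean(durations)
    return [d / average for d in durations]


def cap_at_p75(durations):
    """Clip every duration to the 75th percentile of the sample."""
    p75 = quantiles(durations, n=4, method="inclusive")[2]
    return [min(d, p75) for d in durations]


# Four normal runs and one run that hung (values are made up).
durations = [100, 110, 105, 95, 900]
capped = cap_at_p75(durations)
print(capped)
print(max(normalize_by_mean(durations)))
```

After capping, the 900-second run is indistinguishable from a normal one, so a model trained on the capped target never sees what a frozen process looks like.
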
  20. Comparison of Different Models

    • A model trained with RandomForest is always more accurate than one trained with DecisionTree
    • Training a model with RandomForest takes much longer
  21. Checking if it is Possible to Predict the Duration of an Activity

    [Figure: checkpoints at 25%, 50%, 75% and 100% of the avg. duration of activity; axis: Timeline, sec]
  22. Prediction Based on Joint Dataset with Time Marker

    [Figure: checkpoints at 25%, 50%, 75% and 100% of the avg. duration of activity; axis: Timeline, sec]

    • Dataset becomes 3 times bigger
    • Performance improvement is due to time reference field
    • In the dataset, we have several records of data for a single activity run. The approach can be used further for activities with low numbers of runs
  23. Checking Model Accuracy when a Model is Trained on Successful Runs and Tested on Failed Runs

    In this case, we train the model only on successful runs and compare its accuracy when tested on successful runs and on failed runs.
    • Average RMSE when testing on successful runs: 10.7%
    • Average RMSE when testing on failed runs: 326%
  24. Proposed Daily Life Cycle

    Overnight:
    • Collect logs and activity data at EOD
    • Is data enough? If no, multiply data; if yes, preprocess data and train model

    Business hours:
    • Listen to logs in real time and predict activity duration
  25. Summary

    • We created a framework to access the statistics logs data, which itself can be useful for regular QA tasks
    • We applied different approaches and models to increase the quality of training for Dataset v1. However, the performance is still too low, and there is a lot of room for improvement
    • We prepared a time-based dataset
    • We started a PoC of a tool for monitoring the target system in real time
  26. Next Steps

    • Fine-tune the model by altering aggregator functions
    • Find out how much data is needed before predictions in production runs become acceptable
    • Check whether the results for DS v1 are applicable to DS v2
    • Complete the PoC of the product
    • Predict failures of the activities