Maxim Nikiforov. Machine Learning in Non-Functional Testing

Build Software to Test Software exactpro.com
Early Detection of Frozen Processes in Loaded Systems
July 2021
Maxim Nikiforov, Danila Gorkavchenko, Alexey Chistov, Andrey Novikov, Murad Mamedov

Structure
1. Introduction:
   a. What system we are testing
   b. Purpose and objectives
2. Initial data preparation:
   a. Preprocessing data and converting it into tabular form
   b. Providing access to the data
3. Data preparation and modelling:
   a. Two types of datasets
   b. Processing attempts, results, etc.
   c. Proposed DLC (daily life cycle) of the tool
4. Results and next steps

Part 1: Introduction

Target Testing System
• The testing system contains 350+ processes
• 200+ processes are active during the business day
• Components are deployed over 30 servers
• They perform various real-time activities and more than 100 batch activities of different types
[Figure: components per application server (servers 1 to 30), showing active components and components inactive during a BAU run]

Activities in Target System
There are the following types of activities:
• Events that run several times per day
• Daily events
• Ad-hoc events
• Weekly/Monthly events
How to describe an activity:
• Unique ID
• Activity type
• Start time
• End time
• Completion status
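The five fields above map naturally onto a small record type. The following is a minimal sketch, not the authors' data model: the class names, field names, and the status string are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class ActivityType(Enum):
    # Names are hypothetical; the slide only lists the four categories.
    INTRADAY = "several times per day"
    DAILY = "daily"
    AD_HOC = "ad-hoc"
    PERIODIC = "weekly/monthly"


@dataclass(frozen=True)
class Activity:
    activity_id: str
    activity_type: ActivityType
    start_time: datetime
    end_time: datetime
    status: str  # assumed free-form, e.g. "COMPLETED" or "FAILED"

    @property
    def duration_sec(self) -> float:
        """Wall-clock duration of the run in seconds."""
        return (self.end_time - self.start_time).total_seconds()


run = Activity(
    activity_id="A-001",
    activity_type=ActivityType.DAILY,
    start_time=datetime(2021, 7, 1, 8, 0, 0),
    end_time=datetime(2021, 7, 1, 8, 3, 22),
    status="COMPLETED",
)
print(run.duration_sec)
```

The derived duration is the quantity the later modelling slides try to predict.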

Statistical Information Logs
1. A special process in the testing platform monitors all processes and gets all the information from them.
2. This process then writes the information to log files at a 1-10 second interval.
[Diagram: Comp. 1, Comp. 2, ..., Comp. N -> Stat. Logs Collector]
Initial format of the logs after collection. Each line contains:
• Timestamp
• Component ID
• Group ID of the parameters
• List of parameters in a custom format
There are 2 groups of parameters:
• Critical documented parameters (< 1%)
• Other parameters

Current Monitoring of Events and Statistics
1. Target System monitoring limitations:
• Administrative frontend with text information only
• No graphic representation
• It is not possible to view the history of updates
• The system provides interfaces for external monitoring infrastructure
2. Additional monitoring for QA purposes:
• It is implemented in Grafana
• Only a small part of all values is monitored (critical parameters), but the number of them is too high to fit into one dashboard
• Real-time activity visualization
[Figure: examples of QA activities and critical parameters monitoring]

Goal
• To develop an automated approach to predicting deviations before they become obvious
• It should be possible to adapt it to other systems
• Logs with statistical parameters will be used for the analysis
• Ideally, it should indicate the root cause of the problem to the operational user (QA)

Part 2: Initial Data Preparation

Easy Access to the Data
• Raw SUT logs -> converter from raw logs to CSV -> SUT logs in CSV format
• SUT logs in CSV format + activity data -> framework that provides easy access to necessary information about the activities and logs -> joint logs and activity data suitable for further analysis

Log Converter to CSV Format
Raw log format:
08:01:00 : Comp1: Group1: Param1=10, Param2=99, Param3=4
08:01:01 : Comp1: Group1: Param1=11, Param2=98, Param3=4
CSV format:
Timestamp   Comp1 - Group1 - Param1   Comp1 - Group1 - Param2   Comp1 - Group1 - Param3
8:01:00     10                        99                        4
8:01:01     11                        98                        4
• The conversion is executed on a daily basis and takes 15 minutes to convert daily logs (30 GB)
• Size of logs is reduced by 93%
• Data is unified now
• Avg. number of columns is 7.5k
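The conversion from the raw line format to the wide CSV layout can be sketched in a few lines of pandas. This is an illustration built only from the two sample lines on the slide, not the authors' converter; the function names `parse_value` and `to_wide_frame` are hypothetical, and real logs with the "custom format" mentioned earlier would likely need a more careful parser.

```python
import pandas as pd

RAW_LOG = """\
08:01:00 : Comp1: Group1: Param1=10, Param2=99, Param3=4
08:01:01 : Comp1: Group1: Param1=11, Param2=98, Param3=4
"""


def parse_value(text):
    """Return a float when the value is numeric, otherwise the raw string."""
    try:
        return float(text)
    except ValueError:
        return text


def to_wide_frame(raw_log):
    """Turn raw log lines into one row per timestamp, one column per parameter."""
    rows = {}
    for line in raw_log.splitlines():
        if not line.strip():
            continue
        # The first three ": " separators split timestamp, component and group.
        timestamp, component, group, params = (
            part.strip() for part in line.split(": ", 3)
        )
        row = rows.setdefault(timestamp, {})
        for pair in params.split(","):
            name, value = pair.strip().split("=", 1)
            row[f"{component} - {group} - {name}"] = parse_value(value)
    frame = pd.DataFrame.from_dict(rows, orient="index")
    frame.index.name = "Timestamp"
    return frame


wide = to_wide_frame(RAW_LOG)
print(wide.to_csv())
```

Lines from different components that share a timestamp land in the same row, which is what produces the very wide table (about 7.5k columns on average) described above.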

Data Processing Framework
• Pandas dataframe as output
• Now it is very easy to access the data: one-line command to get a dataframe
• Doesn’t load unnecessary information into memory, supports additional filtering
[Figure: code examples of data collection for a period of time and data collection for a certain activity]
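The framework's API is not shown in the transcript, so the following is only a guess at what a one-line accessor with filtering could look like. The function `load_stats`, its parameters, and the sample CSV are hypothetical. It relies on `pandas.read_csv` accepting a callable for `usecols`, so columns for other components are never parsed into memory.

```python
import io

import pandas as pd

# Stand-in for one converted daily CSV file.
CSV_TEXT = """\
Timestamp,Comp1 - Group1 - Param1,Comp1 - Group1 - Param2,Comp2 - Group1 - Param1
08:01:00,10,99,5
08:01:01,11,98,6
08:01:02,12,97,7
"""


def load_stats(source, start, end, component=None):
    """Load rows between start and end, optionally for a single component."""

    def wanted(column):
        # Always keep the timestamp; keep other columns only if they match.
        if column == "Timestamp" or component is None:
            return True
        return column.startswith(f"{component} - ")

    frame = pd.read_csv(source, usecols=wanted, index_col="Timestamp")
    # Zero-padded HH:MM:SS strings sort chronologically, so label slicing works.
    return frame.loc[start:end]


df = load_stats(io.StringIO(CSV_TEXT), "08:01:01", "08:01:02", component="Comp1")
print(df)
```

Collecting data for a certain activity would then amount to passing that activity's start and end times as the period.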

Part 3: Data Preparation and Modelling

High Level Plan of the Research
Goal: predict activity duration
Dataset v1, “Activity-based Dataset”:
• Each row in the dataset contains data for a run of some activity.
• It can be easily prepared based on the historical data.
• It is for research purposes only.
Dataset v2, “Time-based Dataset”:
• Each row of the dataset contains data for some period of time for several activities, as they can be executed simultaneously in the system.
• The dataset can be created in real time and is supposed to be used in production.

Datasets in Details
Dataset v1 (one row per run; run id 1, 2, 3, … N):
• Aggregated parameters (max, min, mean for each value in system logs)
• Activity parameters: Start time, End time, ID, Status
• Target feature: Activity Duration
Dataset v2 (one row per timestamp; 00:00:01, 00:00:02, 00:00:03, …):
• Aggregated parameters (max, min, mean for each value in system logs)
• Time since start for each activity, if activity is started
• Target feature: Time until activity is completed

Dataset v1 Preparation Pipeline
Step and resulting dataset shape:
1. Data collected from CSV files: 2.5M x 7.5k
2. Dataset with numbers only (handle object columns, handle unique ids): 2.5M x 8.3k
3. All statistical parameters are aggregated using a set of functions (min, mean, max): 11k x 25k
4. Drop constant columns: 11k x 4.5k
5. Drop correlated columns: 11k x 2.3k
Result: Dataset v1
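Steps 3 to 5 of the pipeline can be sketched with pandas as follows. This is an assumption-laden illustration: the helper names, the `run_id` column, and the 0.95 correlation threshold are not from the slides, which do not state how correlated columns were detected.

```python
import numpy as np
import pandas as pd


def aggregate_runs(stats, run_column="run_id"):
    """Collapse per-timestamp rows into one row per run with min/mean/max."""
    agg = stats.groupby(run_column).agg(["min", "mean", "max"])
    agg.columns = [f"{name}_{func}" for name, func in agg.columns]
    return agg


def drop_constant(frame):
    """Remove columns that hold a single value across all runs."""
    return frame.loc[:, frame.nunique() > 1]


def drop_correlated(frame, threshold=0.95):
    """Remove each column that correlates strongly with an earlier column."""
    corr = frame.corr().abs()
    # Keep only the upper triangle so every pair is inspected once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    redundant = [col for col in upper.columns if (upper[col] > threshold).any()]
    return frame.drop(columns=redundant)


stats = pd.DataFrame({
    "run_id": [1, 1, 2, 2, 3, 3],
    "a": [1, 2, 3, 5, 4, 9],
    "b": [2, 4, 6, 10, 8, 18],  # exactly 2 * a, so fully redundant
    "c": [7, 7, 7, 7, 7, 7],    # constant, carries no information
})
dataset = drop_correlated(drop_constant(aggregate_runs(stats)))
print(dataset.columns.tolist())
```

Applying three aggregators triples the column count, which matches the jump from 8.3k to about 25k columns before the two pruning steps.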

Training a Model for 1 Activity Type
• Data = One Activity Type
• Model = Decision Tree
• Metrics = Root Mean Squared Error
Results: RMSE = 202 sec, STD = 45 sec
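A setup like this one can be reproduced in outline with scikit-learn. The sketch below uses synthetic data, and the library choice, tree depth, and 5-fold cross-validation are assumptions; the slides do not say how the RMSE and STD figures were computed, so the spread across folds is just one possible reading.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the aggregated features of one activity type.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 300 + 50 * X[:, 0] + rng.normal(scale=10, size=200)  # duration in seconds

model = DecisionTreeRegressor(max_depth=5, random_state=0)
# scikit-learn reports the negated RMSE, so flip the sign back.
scores = -cross_val_score(
    model, X, y, cv=5, scoring="neg_root_mean_squared_error"
)
rmse_mean, rmse_std = scores.mean(), scores.std()
print(f"RMSE = {rmse_mean:.1f} sec, spread across folds = {rmse_std:.1f} sec")
```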

Training a Single Model for All Activities
• Data = All activities
• Model = Decision Tree
• Metrics = Root Mean Squared Error
Results: RMSE = 767 sec

Problem 1: Different Number of Runs for Activities
We can see that the count of runs is not the same for different types of activities: Min: 1, Max: 2000+
Additional RMSE is produced by activities with a low number of runs.
Solution:
• Use only activities with more than 5 runs
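The filter described in the solution is a one-liner in pandas. A minimal sketch, assuming a dataframe with one row per run and a hypothetical `activity_type` column:

```python
import pandas as pd


def keep_frequent(runs, key="activity_type", min_runs=5):
    """Keep only rows whose activity type has more than min_runs runs."""
    runs_per_type = runs[key].map(runs[key].value_counts())
    return runs[runs_per_type > min_runs]


runs = pd.DataFrame({
    "activity_type": ["A"] * 6 + ["B"] * 2 + ["C"] * 5,
    "duration_sec": range(13),
})
filtered = keep_frequent(runs)
print(filtered["activity_type"].value_counts())
```

Type C has exactly 5 runs, so it is dropped along with type B; only type A survives.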

Problem 2: Default Train-Test Split Function
• If we use the default train-test split function, some activity types can be missed either in the train or the test dataset
Solution:
• Change the approach and make sure that all activities are in both sets with the expected ratio
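One common way to get this guarantee is a stratified split. The sketch below assumes scikit-learn's `train_test_split` with its `stratify` parameter; the slides do not say which mechanism the authors actually used, and the column names are hypothetical.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

runs = pd.DataFrame({
    "activity_type": ["A"] * 10 + ["B"] * 10 + ["C"] * 10,
    "duration_sec": range(30),
})

# Stratifying on the activity type keeps each type's share equal in both sets.
train, test = train_test_split(
    runs,
    test_size=0.3,
    stratify=runs["activity_type"],
    random_state=0,
)
print(train["activity_type"].value_counts())
print(test["activity_type"].value_counts())
```

Stratification needs at least two runs per activity type, which fits with filtering out rarely executed activities first.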

Problem 3: Avg. Duration of Different Types of Activities is Different
• Longer activities produce bigger RMSE, even if the model has good performance for short activities
• It is required to normalise the target value before training
Solution:
• Logarithmic target value
• Stop using the absolute value of RMSE in seconds and calculate it based on the target value
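A small numeric example shows why the absolute RMSE is misleading here. The exact relative formula is not given on the slide, so the definition in `relative_rmse` below (error divided by the true value before squaring) is an assumption, as is the use of `log1p` for the logarithmic target.

```python
import numpy as np


def relative_rmse(y_true, y_pred):
    """RMSE of the error expressed as a fraction of the true value."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(((y_pred - y_true) / y_true) ** 2)))


# A short and a long activity, both over-predicted by the same 20%.
durations = np.array([10.0, 1000.0])
predicted = np.array([12.0, 1200.0])

absolute_rmse = float(np.sqrt(np.mean((predicted - durations) ** 2)))
percent_rmse = 100 * relative_rmse(durations, predicted)
print(f"absolute: {absolute_rmse:.1f} sec, relative: {percent_rmse:.1f}%")

# Logarithmic target: train on log1p(duration), invert predictions with expm1.
log_target = np.log1p(durations)
restored = np.expm1(log_target)
```

The absolute figure is dominated by the long activity, while the relative figure treats both runs equally.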

Attempts to Diminish the RMSE Value
If we normalize the target column values by the average value, the related model predicts the target value more accurately. Also, if we limit the maximum duration of a certain activity to the 75th percentile, the RMSE value decreases. However, this eliminates the values for bad activity runs. The model is then trained only on good runs, which looks like an inappropriate approach.
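Both attempts can be expressed as per-activity transforms in pandas. In this sketch the column names are hypothetical, and "limiting" the duration is read as clipping at the 75th percentile; the authors may have dropped such runs instead.

```python
import pandas as pd

runs = pd.DataFrame({
    "activity_type": ["A", "A", "A", "A"],
    "duration_sec": [100.0, 110.0, 90.0, 500.0],  # the last run is a bad one
})
by_type = runs.groupby("activity_type")["duration_sec"]

# Attempt 1: divide each duration by the mean duration of its activity type.
runs["duration_norm"] = runs["duration_sec"] / by_type.transform("mean")

# Attempt 2: cap each duration at its activity type's 75th percentile.
p75 = by_type.transform(lambda s: s.quantile(0.75))
runs["duration_capped"] = runs["duration_sec"].clip(upper=p75)
print(runs)
```

The 500-second run is pulled down to the cap, so the signal that distinguishes a bad run from a good one disappears from the training target.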

Comparison of Different Models
• A model trained using RandomForest is always more accurate than one trained using DecisionTree
• Training a model using RandomForest takes much longer

Checking if it is Possible to Predict the Duration of an Activity
[Figure: timeline in seconds with markers at 25%, 50%, 75% and 100% of the average duration of the activity]

Prediction Based on Joint Dataset with Time Marker
[Figure: timeline in seconds with markers at 25%, 50%, 75% and 100% of the average duration of the activity]
• Dataset becomes 3 times bigger
• Performance improvement is due to the time reference field
• In the dataset, we have several records of data for a single activity run. The approach can be used further for activities with low numbers of runs
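One way to build such a joint dataset is to emit several partial snapshots per run, each carrying the elapsed time as an extra feature. The sketch below is a guess at the construction: the function name, the column names, and the choice of three snapshots at 25%, 50% and 75% of the run (which would make the dataset three times bigger) are assumptions.

```python
import pandas as pd


def run_snapshots(run_stats, fractions=(0.25, 0.5, 0.75)):
    """Build one record per fraction from the log rows seen up to that point."""
    total = run_stats["elapsed_sec"].max()
    rows = []
    for fraction in fractions:
        cutoff = total * fraction
        seen = run_stats[run_stats["elapsed_sec"] <= cutoff]
        seen = seen.drop(columns="elapsed_sec")
        row = {
            f"{col}_{func}": getattr(seen[col], func)()
            for col in seen.columns
            for func in ("min", "mean", "max")
        }
        row["elapsed_sec"] = cutoff   # the time marker
        row["duration_sec"] = total   # the target to predict
        rows.append(row)
    return pd.DataFrame(rows)


# Hypothetical per-second statistics for a single 100-second run.
run_stats = pd.DataFrame({
    "elapsed_sec": range(1, 101),
    "queue_size": range(1, 101),
})
joint = run_snapshots(run_stats)
print(joint)
```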

Checking Model Accuracy when a Model is Trained on Successful Runs and Tested on Failed Runs
In this case, we train the model only on successful runs and compare its accuracy when tested on successful runs and on failed runs.
• Average RMSE when testing on successful runs: 10.7%
• Average RMSE when testing on failed runs: 326%
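The experiment can be imitated on synthetic data to show why the gap appears. Everything here is assumed for illustration: the single `load` feature, the fivefold slowdown of failed runs, the scikit-learn decision tree, and the relative RMSE definition; the numbers it prints are unrelated to the figures on the slide.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def relative_rmse(y_true, y_pred):
    """RMSE of the error expressed as a fraction of the true value."""
    return float(np.sqrt(np.mean(((y_pred - y_true) / y_true) ** 2)))


rng = np.random.default_rng(0)
load = rng.uniform(0, 1, size=(300, 1))
duration = 100 + 100 * load[:, 0] + rng.normal(scale=3, size=300)
failed = rng.random(300) < 0.1
duration[failed] *= 5  # failed runs take far longer than the features suggest

# Train on 80% of the successful runs only.
ok_X, ok_y = load[~failed], duration[~failed]
split = int(0.8 * len(ok_y))
model = DecisionTreeRegressor(max_depth=4, random_state=0)
model.fit(ok_X[:split], ok_y[:split])

rmse_ok = relative_rmse(ok_y[split:], model.predict(ok_X[split:]))
rmse_failed = relative_rmse(duration[failed], model.predict(load[failed]))
print(f"successful: {100 * rmse_ok:.1f}%, failed: {100 * rmse_failed:.1f}%")
```

A large prediction error on a live run is therefore itself a usable signal that the run is deviating from normal behaviour.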

Proposed Daily Life Cycle
Overnight:
1. Collect logs and activity data at EOD
2. Multiply data
3. Is data enough? (Yes / No)
4. Preprocess data and train model
Business hours:
5. Listen to logs in real time and predict activity duration

Part 4: Results and Next Steps

Summary
• We created a framework to access the statistics logs data, which itself can be useful for regular QA tasks
• We applied different approaches and models to increase the quality of training for Dataset v1. However, the performance is still too low, and there is a lot of room for improvement
• We prepared a time-based dataset
• We started a PoC of the tool that allows monitoring the target system in real time

Next Steps
• Fine-tune the model by altering aggregator functions
• Find out how much data we need to start acceptable prediction in production runs
• Check whether the results for DS v1 are applicable for DS v2
• Complete the PoC of the product
• Predict failures of the activities

Thank you!
