TMPA-2021: Early Detection of Tasks With Uncommonly Long Run Duration in Post-Trade Systems

1 25-27 November: Software Testing, Machine Learning and Complex Process Analysis
Early Detection of Tasks With Uncommonly Long Run Duration in Post-Trade Systems
Maxim Nikiforov, Danila Gorkavchenko, Andrey Novikov, Murad Mamedov, Nikita Pushchin

2 Build Software to Test Software exactpro.com
Target Testing System
• 20M trades per day
• 200+ running components
• Components are deployed over 30 servers
• 100+ scheduled activities
[Figure: components per application server (servers 1 to 30), split into active components and components inactive during the BAU run]

3 Goal
• To develop an automated approach to predicting deviations before they become obvious
• It should be possible to adapt it to other systems
• Logs with statistical parameters will be used for the analysis
• Ideally, it should indicate the root cause of the problem to the operational user (QA)

4 Scheduled Events
Attributes which describe scheduled events:
• Unique identifier
• Type of activity
• Start time
• End time
• Completion status
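The attributes above map naturally onto a small record type. The following is a minimal sketch, not the authors' data model: the class name, field names and sample values are all hypothetical, and only the five attributes themselves come from the slide.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ScheduledEvent:
    """One scheduled activity run; all names here are illustrative."""
    event_id: str         # unique identifier
    activity_type: str    # type of activity
    start_time: datetime
    end_time: datetime
    status: str           # completion status (the value below is hypothetical)

    @property
    def duration_sec(self) -> float:
        """Run duration in seconds, the quantity to be predicted."""
        return (self.end_time - self.start_time).total_seconds()


event = ScheduledEvent(
    event_id="run-0001",
    activity_type="EXAMPLE_ACTIVITY",
    start_time=datetime(2021, 11, 25, 8, 1, 0),
    end_time=datetime(2021, 11, 25, 8, 3, 30),
    status="COMPLETED",
)
print(event.duration_sec)
```

Deriving the duration from the start and end times keeps the record free of redundant fields.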

5 Telemetry Logs
Raw log format:
08:01:00 : Comp1: Group1: Param1=10, Param2=99, Param3=4
08:01:01 : Comp1: Group1: Param1=11, Param2=98, Param3=4
• Stored in textual format
• Not structured
• Hundreds of gigabytes of logs
CSV format:
Time     Comp1 - Group1 - Param1   Comp1 - Group1 - Param2   Comp1 - Group1 - Param3
8:01:00  10                        99                        4
8:01:01  11                        98                        4
• The data is now structured
• Size of logs is reduced by 93%
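The conversion from raw log lines to CSV can be sketched as follows. This assumes every line follows exactly the `time : component: group: name=value, ...` pattern shown on the slide; the regular expression, the function names and the `time` column header are illustrative and not taken from the authors' tooling.

```python
import csv
import io
import re

# Assumed line layout: "HH:MM:SS : Component: Group: Name=Value, Name=Value"
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2})\s*:\s*(?P<comp>[^:]+):\s*(?P<group>[^:]+):\s*(?P<params>.*)$"
)


def parse_line(line):
    """Return (timestamp, {column_name: value}) or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    comp, group = match["comp"].strip(), match["group"].strip()
    values = {}
    for pair in match["params"].split(","):
        name, sep, value = pair.partition("=")
        if sep:  # skip fragments without a "name=value" shape
            values[f"{comp} - {group} - {name.strip()}"] = value.strip()
    return match["time"], values


def logs_to_csv(lines):
    """Pivot raw log lines into one CSV row per timestamp."""
    rows = {}     # timestamp -> {column: value}
    columns = {}  # dict used as an ordered set of column names
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        timestamp, values = parsed
        rows.setdefault(timestamp, {}).update(values)
        for name in values:
            columns.setdefault(name, None)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["time", *columns],
                            restval="", lineterminator="\n")
    writer.writeheader()
    for timestamp, values in rows.items():
        writer.writerow({"time": timestamp, **values})
    return out.getvalue()


raw = [
    "08:01:00 : Comp1: Group1: Param1=10, Param2=99, Param3=4",
    "08:01:01 : Comp1: Group1: Param1=11, Param2=98, Param3=4",
]
csv_text = logs_to_csv(raw)
print(csv_text)
```

Writing each component, group and parameter name once in the header instead of on every line is one plausible source of the size reduction reported on the slide.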

8 Dataset Preparation
[Figure: telemetry parameters on a timeline, with the start and end of Activity 1, Activity 2, ..., Activity N marked]
Each row of the dataset (run id 1, 2, 3, ..., N) contains:
• Activity duration
• Aggregated parameters from the telemetry logs
• Activity parameters: start time, end time, ID, status
All statistical parameters are aggregated using a set of functions (min, mean, max).
Reducing data size pipeline:
Step  Operation                                                          Dataset shape
1     Data collected from CSV files                                      2.5M x 7.5k
2     Dataset with numbers only (handle object columns, unique ids)      2.5M x 8.3k
3     Activity-based dataset                                             11k x 25k
4     Drop constant columns                                              11k x 4.5k
5     Drop correlated columns                                            11k x 2.5k
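Steps 3 to 5 of the pipeline (activity-based aggregation, dropping constant columns, dropping correlated columns) can be sketched in plain Python. This is an illustration under stated assumptions rather than the authors' implementation: the data layout, the parameter names (`cpu`, `queue`, `threads`), the use of Pearson correlation and the 0.95 threshold are all hypothetical, since the slide does not specify them.

```python
from statistics import mean

AGGREGATES = {"min": min, "mean": mean, "max": max}
TARGET = "duration"


def aggregate_activity(activity, telemetry):
    """Aggregate the telemetry samples that fall inside one activity's run window."""
    window = [values for ts, values in telemetry
              if activity["start"] <= ts <= activity["end"]]
    row = {TARGET: activity["end"] - activity["start"]}
    for name in sorted({n for values in window for n in values}):
        series = [values[name] for values in window if name in values]
        for suffix, func in AGGREGATES.items():
            row[f"{name}_{suffix}"] = func(series)
    return row


def drop_constant_columns(rows):
    """Remove feature columns whose value is identical in every row."""
    keep = [c for c in rows[0]
            if c == TARGET or len({row[c] for row in rows}) > 1]
    return [{c: row[c] for c in keep} for row in rows]


def pearson(xs, ys):
    """Pearson correlation of two equally long, non-constant sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def drop_correlated_columns(rows, threshold=0.95):
    """Keep a feature only if it is not strongly correlated with one already kept.

    The threshold is an assumed value. Call this after drop_constant_columns,
    because a constant column has zero variance.
    """
    kept = []
    for column in (c for c in rows[0] if c != TARGET):
        values = [row[column] for row in rows]
        if all(abs(pearson(values, [row[k] for row in rows])) <= threshold
               for k in kept):
            kept.append(column)
    return [{c: row[c] for c in [TARGET, *kept]} for row in rows]


# Hypothetical telemetry: (timestamp in seconds, {parameter: value})
telemetry = [
    (0, {"cpu": 10, "queue": 1, "threads": 4}),
    (1, {"cpu": 20, "queue": 5, "threads": 4}),
    (2, {"cpu": 30, "queue": 2, "threads": 4}),
    (3, {"cpu": 40, "queue": 7, "threads": 4}),
    (4, {"cpu": 50, "queue": 7, "threads": 4}),
    (5, {"cpu": 10, "queue": 3, "threads": 4}),
    (6, {"cpu": 90, "queue": 4, "threads": 4}),
    (8, {"cpu": 20, "queue": 9, "threads": 4}),
]
activities = [{"start": 0, "end": 2}, {"start": 3, "end": 4}, {"start": 5, "end": 8}]

dataset = [aggregate_activity(a, telemetry) for a in activities]
reduced = drop_constant_columns(dataset)
final = drop_correlated_columns(reduced)
print(sorted(final[0]))
```

Aggregating with three functions triples the number of feature columns, which is consistent with the jump from 8.3k to 25k columns in the table above.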

9 Training a Model for 1 Activity Type
• Data = one activity type
• Model = Decision Tree
• Metric = Root Mean Squared Error
Results: RMSE = 202 sec, STD = 45 sec

10 Training a Single Model for All Activities
• Data = all activities
• Model = Decision Tree
• Metric = Root Mean Squared Error
Results: RMSE = 767 sec

11 Model Performance Improvements
• Logarithmic target value
• Exclude rare activities
• Report RMSE relative to the target value instead of as an absolute value in seconds
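The three improvements can be illustrated with a short sketch. The slide does not give exact definitions, so the following are assumptions: the target transform is a natural logarithm, the relative RMSE is the root mean square of the per-sample relative error, and the `min_runs` cut-off for rare activities is a hypothetical parameter.

```python
import math
from collections import Counter


def to_log_target(duration_sec):
    """Map a duration in seconds to a logarithmic training target."""
    return math.log(duration_sec)


def from_log_target(value):
    """Invert to_log_target, returning seconds."""
    return math.exp(value)


def exclude_rare(events, min_runs):
    """Drop activity types with fewer than min_runs recorded runs."""
    counts = Counter(e["activity_type"] for e in events)
    return [e for e in events if counts[e["activity_type"]] >= min_runs]


def relative_rmse(actual, predicted):
    """Root mean square of (predicted - actual) / actual, as a fraction."""
    errors = [((p - a) / a) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(errors) / len(errors))


actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
print(f"{relative_rmse(actual, predicted):.1%}")

events = [{"activity_type": "A"}, {"activity_type": "A"}, {"activity_type": "B"}]
frequent = exclude_rare(events, min_runs=2)
```

A relative metric makes errors comparable across activity types whose durations differ by orders of magnitude, which an absolute RMSE in seconds does not.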

12 Comparison of Different Models
• Accuracy is always better when the model is trained with Random Forest than when it is trained with Decision Tree

13 Checking if it is Possible to Predict the Duration of an Activity
[Figure: timeline in seconds with markers at 25%, 50%, 75% and 100% of the average duration of an activity]

14 Prediction Based on Joint Dataset with Time Marker
[Figure: timeline in seconds with markers at 25%, 50%, 75% and 100% of the average duration of an activity]
• Dataset becomes 3 times bigger
• Performance improvement is due to the time reference field
Results: RMSE = 25.6%, STD = 0.58%
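One possible reading of the time-marker augmentation is sketched below. The slide does not describe the procedure in detail, so the following is an assumption throughout: each run is turned into several training rows, one per checkpoint, where a checkpoint sits at a fraction of the activity type's average duration, the features aggregate only the telemetry seen up to that checkpoint, and the fraction itself is stored as the time reference field. The default of three checkpoints is chosen only because the slide reports a threefold dataset; all names are hypothetical.

```python
from statistics import mean


def augment_with_time_marker(run, telemetry, avg_duration,
                             fractions=(0.25, 0.5, 0.75)):
    """Build one training row per checkpoint for a single activity run.

    run: dict with "start" and "end" in seconds
    telemetry: list of (timestamp_sec, value) samples for one parameter
    avg_duration: average duration of this activity type in seconds
    """
    rows = []
    for fraction in fractions:
        # Never look past the actual end of the run.
        cutoff = min(run["start"] + fraction * avg_duration, run["end"])
        seen = [v for ts, v in telemetry if run["start"] <= ts <= cutoff]
        if not seen:
            continue  # no telemetry observed yet at this checkpoint
        rows.append({
            "time_marker": fraction,                # time reference field
            "param_min": min(seen),
            "param_mean": mean(seen),
            "param_max": max(seen),
            "duration": run["end"] - run["start"],  # target: full run duration
        })
    return rows


run = {"start": 0, "end": 120}
telemetry = [(0, 10), (20, 30), (40, 20), (60, 60), (80, 40)]
rows = augment_with_time_marker(run, telemetry, avg_duration=100)
for row in rows:
    print(row)
```

With the marker as a feature, a single model can learn how much of the run the aggregated telemetry covers, which matches the slide's remark that the improvement comes from the time reference field.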

15 Results
• An approach to preparing a dataset of reasonable size was found
• An approach to data augmentation was developed
• The experiments showed that the Random Forest Regressor model predicts activity duration with acceptable performance (RMSE = 25.6%, STD = 0.58%), but there is room for improvement
Future work
• Find out how much data is needed to start making acceptable predictions in production runs
• Prove that similar performance can be reached for a time-based dataset
• Find other ways of aggregating data to get better model performance
• Predict failures of the activities

16 Thank you!
