Pre-processing Network Messages of Trading Systems into Event Logs for Process Mining

Exactpro
November 08, 2019

Julio Carrasquel, Sergey Chuburov and Irina Lomazova

International Conference on Software Testing, Machine Learning and Complex Process Analysis (TMPA-2019)
7-9 November 2019, Tbilisi

Video: https://youtu.be/RHTQTwe4c88

TMPA Conference website: https://tmpaconf.org/
TMPA Conference on Facebook: https://www.facebook.com/groups/tmpaconf/

Transcript

  1. Pre-Processing Network Messages of Trading Systems into Event Logs for Process Mining

    Julio Carrasquel¹, Sergey Chuburov², Irina Lomazova¹
    ¹ National Research University Higher School of Economics, Laboratory of Process-Aware Information Systems (PAIS Lab), Moscow, Russia
    ² QA Team Lead, Exactpro Systems, Kostroma, Russia
    7-9 November, Tbilisi, Ivane Javakhishvili Tbilisi State University
  2. Outline

    1. Introduction
       • Analysis of Trading Systems
       • Process Mining
       • Research Scheme
    2. Order Event Logs
    3. Order Book Event Logs
    4. Remarks and Future Work
  3. Trading Systems

    • Automated trading of securities between market participants, who use buy and sell orders of different types.
    • Analysis of system behavior → a task of utmost importance for financial markets!
  4. Log-based Validation of Trading Systems

    • Passive analysis → avoid test instrumentation within system cores.
      ◦ Why? To keep latency and overhead to a minimum.
      ◦ We analyze the system’s behavior based on system logs.
    • Recent literature → based on data science techniques...
      ◦ I. Itkin, R. Yavorskiy. Overview of Applications of Passive Testing Techniques (2019).
      ◦ I. Itkin et al. User-Assisted Log Analysis for Quality Control of Distributed Fintech Applications (2019).
    • Typical data science techniques → do not focus on end-to-end processes.
  5. Our Approach – Process Mining

    Discover, diagnose, and improve business and software processes, based on system event logs (observed behavior) and formal models.
  6. The Research Work Scheme

    Extracting and preparing event logs by pre-processing the data related to a system process is a crucial step!
  7. Data Pre-Processing Work Results

    • Given as input a set of FIX messages captured during a trading day, we generate two kinds of event logs:
      1. Order Event Logs → each case in the event log refers to all the events of an individual order
         • from an initial event when it is submitted, up to an event when it is discarded (because it traded, it was canceled, etc.).
      2. Order Book Event Logs → each case in the event log refers to all the events of all the orders in a single order book
         • where many orders interact, based on some priority scheme, to trade the same security.
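
As a rough illustration of the first kind of log, the sketch below maps one parsed FIX ExecutionReport to an order event. It is not the authors' pipeline: the choice of tags (37 OrderID, 150 ExecType, 39 OrdStatus, 60 TransactTime, 44 Price, 151 LeavesQty, 54 Side, 17 ExecID), the activity and state names, and the function `to_order_event` are all assumptions based on standard FIX 4.4 field definitions.

```python
# Assumed subsets of standard FIX 4.4 code tables; the talk does not list them.
EXEC_TYPE = {"0": "new", "4": "canceled", "5": "replaced", "8": "rejected", "F": "trade"}
ORD_STATUS = {"0": "new", "1": "partially filled", "2": "filled", "4": "canceled", "8": "rejected"}
SIDE = {"1": "buy", "2": "sell"}


def to_order_event(msg):
    """Map one parsed ExecutionReport (dict: tag -> value) to an event dict.

    Returns None for any message that is not an ExecutionReport (35=8).
    Assumes the tags used below are present in the message.
    """
    if msg.get("35") != "8":
        return None
    return {
        "case_id": msg["37"],                                # OrderID identifies the case
        "activity": EXEC_TYPE.get(msg["150"], msg["150"]),   # what happened
        "state": ORD_STATUS.get(msg["39"], msg["39"]),       # order state afterwards
        "timestamp": msg["60"],
        "price": float(msg["44"]),
        "qty": int(msg["151"]),                              # quantity still open
        "side": SIDE.get(msg["54"], msg["54"]),
        "trade_id": msg["17"] if msg["150"] == "F" else None,  # only trades carry one
    }


report = {"35": "8", "37": "o1", "150": "F", "39": "2", "60": "20191108-10:00:00",
          "44": "9.0", "151": "0", "54": "1", "17": "t1"}
event = to_order_event(report)
```

A real feed would need extra handling that is left out here, for instance market orders that carry no price tag.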
  8. But… How do we obtain the event logs? How does FIX work?

    The Financial Information eXchange (FIX) protocol is divided into two layers:
    • A session layer → user connection management (logon, reject, heartbeat, logout, etc.).
    • An application layer → application-specific messages, i.e., management of orders.
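
One way to separate the two layers during pre-processing is to look at the MsgType field (tag 35). The sketch below assumes the standard FIX 4.x administrative message codes; the helper names `fix_layer` and `application_messages` are hypothetical and not taken from the talk.

```python
# Assumption: standard FIX 4.x session-level (administrative) MsgType codes.
SESSION_MSG_TYPES = {
    "0": "Heartbeat",
    "1": "TestRequest",
    "2": "ResendRequest",
    "3": "Reject",
    "4": "SequenceReset",
    "5": "Logout",
    "A": "Logon",
}


def fix_layer(msg_type: str) -> str:
    """Return 'session' for administrative messages, 'application' otherwise."""
    return "session" if msg_type in SESSION_MSG_TYPES else "application"


def application_messages(messages):
    """Keep only application-layer messages; each message is a dict tag -> value."""
    return [m for m in messages if "35" in m and fix_layer(m["35"]) == "application"]


captured = [{"35": "A"}, {"35": "D"}, {"35": "0"}, {"35": "8"}]
orders_only = application_messages(captured)
```

Only the application-layer messages (here NewOrderSingle `D` and ExecutionReport `8`) would then feed the event logs.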
  9. FIX Message Format - Example

    A FIX message is a sequence of ASCII-encoded tag-value pairs.
    Example: a user with ID User1 has sent a limit order to buy 40 shares of VTB24 at a price of 100.
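
The tag-value structure can be sketched with a small parser. The message below is an illustrative reconstruction of the order described above, not the exact message shown on the slide: the tag choices (49 SenderCompID, 55 Symbol, 54 Side, 38 OrderQty, 40 OrdType, 44 Price) follow standard FIX usage, and header or trailer fields such as body length and checksum are omitted.

```python
SOH = "\x01"  # standard FIX field delimiter (ASCII 0x01)


def parse_fix(raw: str, sep: str = SOH) -> dict:
    """Split a raw FIX string into a dict mapping tag -> value.

    Simplification: a repeated tag (as in repeating groups) overwrites the
    earlier value, which is acceptable for flat order messages only.
    """
    fields = {}
    for pair in raw.split(sep):
        if not pair:
            continue  # skip the empty chunk after the trailing delimiter
        tag, _, value = pair.partition("=")
        fields[tag] = value
    return fields


# Hypothetical message: User1 sends a limit order to buy 40 VTB24 at 100.
raw = SOH.join([
    "8=FIX.4.4",  # BeginString
    "35=D",       # MsgType: NewOrderSingle
    "49=User1",   # SenderCompID
    "55=VTB24",   # Symbol
    "54=1",       # Side: buy
    "38=40",      # OrderQty
    "40=2",       # OrdType: limit
    "44=100",     # Price
]) + SOH

order = parse_fix(raw)
```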
  10. Order Event Logs

    • In an order event log L, each case c ∈ L refers to the observed trace of an order. Each event e in a case c is structured as a tuple:
      e = (case_id, activity, state, timestamp, price, qty, side, [trade_id])
    • Case example – trace of a buy order with size = 100 and price = 9.0.
    In this way, we can keep track of the evolution of the order and its attributes throughout its lifetime in the system.
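
The event tuple and the grouping of events into cases can be sketched as follows. The field names come from the tuple above; the activity and state values, the numeric timestamps, and the helper `build_cases` are assumptions made for illustration.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class OrderEvent(NamedTuple):
    case_id: str                    # order identifier
    activity: str
    state: str
    timestamp: float
    price: float
    qty: int
    side: str
    trade_id: Optional[str] = None  # optional, present only for trades


def build_cases(events):
    """Group events by case_id, ordering each case by timestamp."""
    cases = defaultdict(list)
    for e in sorted(events, key=lambda e: e.timestamp):
        cases[e.case_id].append(e)
    return dict(cases)


# Hypothetical trace of a buy order with size 100 and price 9.0.
events = [
    OrderEvent("o1", "trade", "filled", 3.0, 9.0, 0, "buy", "t2"),
    OrderEvent("o1", "submit", "new", 1.0, 9.0, 100, "buy"),
    OrderEvent("o1", "trade", "partially filled", 2.0, 9.0, 60, "buy", "t1"),
]
log = build_cases(events)
trace = [e.activity for e in log["o1"]]
```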
  11. Order Event Logs – An Example

    An order event log consisting of 5 cases (4 buy orders and 1 sell order).
  12. Synthesizing behavior from an Order Event Log into a Directly-Follows Graph (DFG)

    A DFG generated with the tool Disco from the observed behavior of 5 orders: 3 out of 5 orders were filled, whereas 2 orders were canceled.
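
A directly-follows graph simply counts how often one activity is immediately followed by another within a case. The sketch below computes those counts in plain Python rather than with Disco. The five traces are invented to match the summary above (3 filled, 2 canceled), and the activity names are assumptions.

```python
from collections import Counter


def directly_follows(traces):
    """Count each pair (a, b) where activity b directly follows a in a trace."""
    dfg = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            dfg[(a, b)] += 1
    return dfg


# Hypothetical activity sequences for 5 orders.
traces = [
    ["new", "fill"],
    ["new", "fill"],
    ["new", "partial fill", "fill"],
    ["new", "cancel"],
    ["new", "partial fill", "cancel"],
]
dfg = directly_follows(traces)
```

Each key of `dfg` is an edge of the graph and each value is the frequency that tools like Disco display on that edge.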
  13. Analyzing Order Event Logs – Remarks (1/2)

    • Synthesis of order behavior into automata.
    • The obtained model can be compared against automata specifying the allowed behavior.
    • HOWEVER… A lot of valuable information in the log is lost in the synthesis process, e.g., order attributes (price, size, etc.).
  14. Analyzing Order Event Logs – Remarks (2/2)

    • So we are working on enhancing the approach to introduce:
      1. Rule-based Checking: Given an order event log L and a set of rules R, e.g., defined in temporal logic, determine whether each case c ∈ L satisfies each rule r ∈ R (van der Aalst, 2005).
         • “Do all orders of a given class X (market, limit, etc.) satisfy all their class constraints?”
      2. Model Enhancement: Enrich discovered DFGs with additional event attributes (price, size, type, etc.).
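
Rule-based checking can be sketched by treating each rule as a predicate over a case and reporting which cases violate which rules. This is only an illustration of the idea, written as plain Python predicates rather than temporal-logic formulas; the two rules, the state names, and `check_log` are hypothetical.

```python
TERMINAL_STATES = {"filled", "canceled"}  # assumed names for final order states


def starts_with_new(case):
    """Rule: the first event of every order is its submission."""
    return bool(case) and case[0]["activity"] == "new"


def nothing_after_terminal(case):
    """Rule: once an order reaches a terminal state, no further event follows."""
    return all(e["state"] not in TERMINAL_STATES for e in case[:-1])


def check_log(log, rules):
    """Return {case_id: [names of violated rules]} for the failing cases only."""
    violations = {}
    for case_id, case in log.items():
        failed = [rule.__name__ for rule in rules if not rule(case)]
        if failed:
            violations[case_id] = failed
    return violations


log = {
    "o1": [{"activity": "new", "state": "new"},
           {"activity": "fill", "state": "filled"}],
    "o2": [{"activity": "new", "state": "new"},
           {"activity": "fill", "state": "filled"},
           {"activity": "cancel", "state": "canceled"}],  # cancel after fill
}
violations = check_log(log, [starts_with_new, nothing_after_terminal])
```

Class-specific constraints (market, limit, etc.) would fit the same shape: one predicate per constraint, applied only to cases of that class.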
  15. Order Book Event Logs

    • Order Event Logs are limited to diagnosing the behavior of individual orders; they do not capture how orders interact in an order book.
    • Order Book Event Logs → each case c ∈ L refers to the sequence of events of all the orders in an order book, from the first to the last transaction during a trading day (or part of it) involving a specific security.
    Each event e in a case c is structured as a tuple:
      e = (secId, activity, timestamp, o₁, o₂)
    where:
      secId is the security identifier (case identifier).
      o₁ and o₂ are orders of the form o = (orderId, state, qty, price, side).
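
The order book event tuple can be written down as nested records, with cases keyed by security instead of by order. The field names follow the tuple above. Treating o₂ as optional (for events that involve a single order, such as a submission) is an assumption, as are the activity names and the helper `build_book_cases`.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class Order(NamedTuple):
    order_id: str
    state: str
    qty: int
    price: float
    side: str


class BookEvent(NamedTuple):
    sec_id: str                  # security identifier = case identifier
    activity: str
    timestamp: float
    o1: Order
    o2: Optional[Order] = None   # assumed absent when only one order is involved


def build_book_cases(events):
    """Group order book events by security, ordered by timestamp."""
    cases = defaultdict(list)
    for e in sorted(events, key=lambda e: e.timestamp):
        cases[e.sec_id].append(e)
    return dict(cases)


events = [
    BookEvent("VTB24", "trade", 3.0,
              Order("b1", "partially filled", 60, 9.0, "buy"),
              Order("s1", "filled", 0, 9.0, "sell")),
    BookEvent("VTB24", "submit", 1.0, Order("b1", "new", 100, 9.0, "buy")),
    BookEvent("VTB24", "submit", 2.0, Order("s1", "new", 40, 9.0, "sell")),
]
cases = build_book_cases(events)
```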
  16. Order Book Event Logs – An Example

    An order book log consisting of just one case; in other words, it describes the history of a trading session in a single order book, where many orders interact.
    Each event displays the activity executed and the orders involved, along with their new states after the execution of the event.
    e = (secId, activity, timestamp, o₁, o₂)
  17. Order Book Event Logs – Replay

    We developed a graphical prototype to replay cases in an order book event log. It provides a convenient visualization of order book dynamics, for example, for understanding the arrival of orders and their crossing.
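
The replay idea can be approximated without a graphical interface by rebuilding the set of resting orders after each event. The sketch below is not the authors' prototype: it assumes each event is a plain tuple (secId, activity, timestamp, o₁, o₂) whose orders carry their state after the event, that o₂ may be None, and that the state names in `ACTIVE_STATES` mark orders still resting in the book.

```python
ACTIVE_STATES = {"new", "partially filled"}  # assumed states of orders still in the book


def replay(case):
    """Yield (timestamp, book) after each event.

    book maps order_id -> latest order tuple (order_id, state, qty, price, side).
    """
    book = {}
    for sec_id, activity, timestamp, o1, o2 in case:
        for order in (o1, o2):
            if order is None:
                continue
            order_id, state = order[0], order[1]
            if state in ACTIVE_STATES:
                book[order_id] = order      # order enters or stays in the book
            else:
                book.pop(order_id, None)    # filled or canceled orders leave it
        yield timestamp, dict(book)         # copy, so snapshots stay independent


# Hypothetical case: two submissions followed by a trade between them.
case = [
    ("VTB24", "submit", 1.0, ("b1", "new", 100, 9.0, "buy"), None),
    ("VTB24", "submit", 2.0, ("s1", "new", 40, 9.0, "sell"), None),
    ("VTB24", "trade", 3.0,
     ("b1", "partially filled", 60, 9.0, "buy"),
     ("s1", "filled", 0, 9.0, "sell")),
]
snapshots = list(replay(case))
```

Each snapshot could then be rendered as one frame of a visual replay, with buy and sell orders sorted by the book's priority scheme.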
  18. Concluding Remarks and Future Work

    • We developed an approach for extracting well-structured event logs for process mining analysis from FIX messages of trading systems.
    • Order Event Logs – each case is related to an order execution.
      ◦ Synthesis of behavior in the log into automata.
      ◦ Future work: rule-based checking, model enhancement.
    • Order Book Event Logs – each case is a trading session of a specific security in an order book.
      ◦ Replay of order books supported by a graphical prototype.
      ◦ Automata are not a suitable target model for synthesis.
      ◦ Future work: relate order book event logs to other models that can describe distributed resources, e.g., Petri nets.
    • Future work: Market Participant Logs? Extract and analyze trader behavior.
  19. General Research Lines

    https://pais.hse.ru/research/projects/tradingsystems/
    This talk covered a single line of our research on the analysis of trading systems with process mining and formal models. Our joint research work covers the following research lines:
    1. Data Pre-Processing: Techniques for extracting well-structured event logs related to different processes in trading systems.
    2. Formal Modelling: Use and development of adequate formalisms for describing trading system behavior. Automata are not sufficient!
    3. Simulation: Development of tools for simulating the models. Models can be run to generate artificial behavior!
    4. Conformance Diagnosis: Development of methods for comparing event logs (observed behavior) and formal models (expected behavior).
  20. Pre-Processing Network Messages of Trading Systems into Event Logs for Process Mining

    Julio Carrasquel¹, Sergey Chuburov², Irina Lomazova¹
    ¹ National Research University Higher School of Economics, Laboratory of Process-Aware Information Systems (PAIS Lab), Moscow, Russia
    ² QA Team Lead, Exactpro Systems, Kostroma, Russia
    Thank you!