Pre-processing Network Messages of Trading Systems into Event Logs for Process Mining

Exactpro
November 08, 2019

Julio Carrasquel, Sergey Chuburov and Irina Lomazova

International Conference on Software Testing, Machine Learning and Complex Process Analysis (TMPA-2019)
7-9 November 2019, Tbilisi

Video: https://youtu.be/RHTQTwe4c88

TMPA Conference website https://tmpaconf.org/
TMPA Conference on Facebook https://www.facebook.com/groups/tmpaconf/

Transcript

  1. 7-9 November, Tbilisi
    Ivane Javakhishvili Tbilisi State University
    Pre-Processing Network Messages of Trading Systems into Event Logs for Process Mining
    Julio Carrasquel¹, Sergey Chuburov², Irina Lomazova¹
    ¹ National Research University Higher School of Economics, Laboratory of Process-Aware Information Systems (PAIS Lab), Moscow, Russia
    ² QA Team Lead, Exactpro Systems, Kostroma, Russia

  2. Outline
    1. Introduction
    ● Analysis of Trading Systems
    ● Process Mining
    ● Research Scheme
    2. Order Event Logs
    3. Order Book Event Logs
    4. Remarks and Future Work

  3. Trading Systems
    ● Trading systems automate the trading of securities between market participants, who submit buy or sell orders of different types.
    ● Analysis of system behavior → a task of utmost importance for financial markets!

  4. Log-based Validation of Trading Systems
    ● Passive analysis → avoid testing instrumentation within system cores.
    ○ Why? To keep latency and overhead to a minimum.
    ○ We analyze the system's behavior based on system logs.
    ● Recent works in the literature → based on data science techniques...
    ○ I. Itkin, R. Yavorskiy. Overview of Applications of Passive Testing Techniques (2019).
    ○ I. Itkin et al. User-Assisted Log Analysis for Quality Control of Distributed Fintech Applications (2019).
    ● Typical data science techniques → do not focus on end-to-end processes.

  5. Our Approach – Process Mining
    Discover, diagnose, and improve business & software processes, based on system event logs (observed behavior) and formal models.

  6. The Research Work Scheme
    Extracting and preparing event logs, by pre-processing the data related to a system process, is a crucial step!

  7. Data Pre-Processing Work Results
    ● Given as input a set of FIX messages captured during a trading day, we generate two kinds of event logs:
    1. Order Event Logs → each case in the event log refers to all the events of an individual order
    ● from the initial event when it is submitted, up to the event when it is discarded (because it was traded, it was canceled, etc.)
    2. Order Book Event Logs → each case in the event log refers to all the events of all the orders in a single order book
    ● where many orders interact, based on some priority scheme, to trade the same security.

  8. But… How do we obtain the event logs?
    How does FIX work?
    The Financial Information Exchange (FIX) protocol is divided into two layers:
    ● A session layer → user connection management (logon, reject, heartbeat, logout, etc.).
    ● An application layer → application-specific messages, e.g., management of orders.
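
The two layers can be told apart by the MsgType field (tag 35) of each message. The following is a minimal Python sketch, assuming that only the standard administrative message types count as session-level and that everything else is application-level. The names `SESSION_MSG_TYPES` and `fix_layer` are illustrative and do not come from the talk.

```python
# Standard administrative (session-level) MsgType values (tag 35) in FIX.
SESSION_MSG_TYPES = {
    "0": "Heartbeat",
    "1": "TestRequest",
    "2": "ResendRequest",
    "3": "Reject",
    "4": "SequenceReset",
    "5": "Logout",
    "A": "Logon",
}


def fix_layer(msg_type: str) -> str:
    """Return 'session' or 'application' for a MsgType (tag 35) value.

    Assumption: any type not listed above is treated as application-level,
    e.g. 'D' (NewOrderSingle) or '8' (ExecutionReport).
    """
    return "session" if msg_type in SESSION_MSG_TYPES else "application"


# Keep only the message types that are relevant for order-related event logs.
captured = ["A", "0", "D", "8", "F", "5"]
application_only = [t for t in captured if fix_layer(t) == "application"]
```

Filtering on the layer before building event logs discards connection-management traffic and leaves only the order-related messages.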

  9. FIX Message Format - Example
    A FIX message is a sequence of ASCII-encoded tag-value pairs.
    A user with ID User1 has sent a limit order to buy 40 shares of VTB24 at a price of 100.
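
The message shown on the slide is not preserved in this transcript, so the sketch below uses a hypothetical reconstruction of such an order: it carries only a subset of fields and omits mandatory ones such as BodyLength (9), MsgSeqNum (34) and CheckSum (10). The tag numbers themselves are standard FIX tags; the helper `parse_fix` is an illustrative name.

```python
SOH = "\x01"  # standard FIX field delimiter (ASCII 0x01)

# Hypothetical message: limit order (40=2) to buy (54=1) 40 units (38)
# of VTB24 (55) at price 100 (44), sent by User1 (49) as NewOrderSingle (35=D).
RAW = SOH.join(
    ["8=FIX.4.4", "35=D", "49=User1", "55=VTB24", "54=1", "38=40", "40=2", "44=100"]
) + SOH


def parse_fix(raw: str) -> dict:
    """Split a raw FIX string into a {tag: value} dictionary.

    Simplification: repeating groups reuse tags, so a flat dictionary
    would keep only the last occurrence of a repeated tag.
    """
    fields = {}
    for pair in raw.split(SOH):
        if not pair:
            continue  # skip the empty piece after the trailing delimiter
        tag, _, value = pair.partition("=")
        fields[int(tag)] = value
    return fields


msg = parse_fix(RAW)
```

Here `msg[55]` gives the security, `msg[38]` the quantity and `msg[44]` the price, all as strings that a later step would convert to typed event attributes.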

  10. Order Event Logs
    ● In an order event log L, each case c ∈ L refers to the observed trace of an order. Each event e in a case c is structured as a tuple:
    e = (case_id, activity, state, timestamp, price, qty, side, [trade_id])
    ● Case Example – Trace of a buy order with size = 100 and price = 9.0
    In this way, we can keep track of the evolution of the order and its attributes throughout its lifetime in the system.
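
The event tuple above maps naturally onto a typed record. The sketch below is one possible encoding, not the authors' implementation: the activity and state names, the timestamps, and the reading of `qty` as the remaining quantity are all assumptions made for illustration.

```python
from typing import NamedTuple, Optional


class OrderEvent(NamedTuple):
    case_id: str      # order identifier, used as the case identifier
    activity: str
    state: str
    timestamp: str    # fixed-width time string, so string order = time order
    price: float
    qty: int          # assumption: remaining quantity after the event
    side: str
    trade_id: Optional[str] = None  # optional, only present for trade events


# Hypothetical trace of a buy order with size = 100 and price = 9.0.
trace = [
    OrderEvent("ord-1", "submit", "new", "09:00:00.000", 9.0, 100, "buy"),
    OrderEvent("ord-1", "trade", "partially filled", "09:00:01.250", 9.0, 60, "buy", "t-1"),
    OrderEvent("ord-1", "trade", "filled", "09:00:02.100", 9.0, 0, "buy", "t-2"),
]


def group_by_case(events):
    """Build an order event log: {case_id: [events in timestamp order]}."""
    cases = {}
    for event in sorted(events, key=lambda e: e.timestamp):
        cases.setdefault(event.case_id, []).append(event)
    return cases


log = group_by_case(reversed(trace))
```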

  11. Order Event Logs – An Example
    An Order Event Log consisting of 5 cases (4 buy orders and 1 sell order)

  12. Synthesizing behavior from an Order Event Log into a Directly-Follows Graph (DFG)
    A DFG generated using the tool Disco – observed behavior of 5 orders.
    3 out of 5 orders were filled, whereas 2 orders were canceled.
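
A directly-follows graph counts how often one activity is immediately followed by another within the same case. The sketch below shows that counting step only; it does not reproduce Disco, and the five traces are hypothetical (the activity names are invented, chosen so that 3 orders end filled and 2 end canceled).

```python
from collections import Counter


def directly_follows(traces):
    """Count (a, b) pairs where activity a is directly followed by b."""
    dfg = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            dfg[(a, b)] += 1
    return dfg


# Hypothetical activity sequences for 5 orders.
traces = [
    ["submit", "fill"],
    ["submit", "partial fill", "fill"],
    ["submit", "partial fill", "fill"],
    ["submit", "cancel"],
    ["submit", "partial fill", "cancel"],
]

dfg = directly_follows(traces)
for (a, b), count in sorted(dfg.items()):
    print(f"{a} -> {b}: {count}")
```

Each key of `dfg` is an edge of the graph and each value is its frequency, which is the information a DFG visualization annotates on its arcs.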

  13. Analyzing Order Event Logs – Remarks (1/2)
    ● Synthesis of order behavior into automata.
    ● The obtained model can be compared against automata specifying the allowed behavior.
    ● HOWEVER… A lot of valuable information in the log is lost in the synthesis process, e.g., order attributes (price, size, etc.).

  14. Analyzing Order Event Logs – Remarks (2/2)
    ● So we are working on enhancing the approach to introduce:
    1. Rule-based Checking: Given an order event log L and a set of rules R, e.g., defined in temporal logics, determine whether each case c ∈ L satisfies each rule r ∈ R (van der Aalst, 2005).
    “Do all orders of a given class X (market, limit, etc.) satisfy all their class constraints?”
    2. Model Enhancement: Enrich discovered DFGs with additional event attributes (price, size, type, etc.).
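
The shape of rule-based checking can be sketched with plain Python predicates standing in for temporal-logic formulas. This is not a temporal-logic checker and not the authors' method: the two rules, the state names and the log below are hypothetical, chosen only to show how every case c ∈ L is tested against every rule r ∈ R.

```python
TERMINAL_STATES = {"filled", "canceled"}  # hypothetical state names


def starts_with_submit(case):
    """The first event of an order must be its submission."""
    return bool(case) and case[0]["activity"] == "submit"


def nothing_after_terminal(case):
    """No event may follow an event that put the order in a terminal state."""
    return all(e["state"] not in TERMINAL_STATES for e in case[:-1])


RULES = {
    "starts_with_submit": starts_with_submit,
    "nothing_after_terminal": nothing_after_terminal,
}


def check_log(log, rules):
    """Return {case_id: [names of violated rules]} for non-conforming cases."""
    violations = {}
    for case_id, case in log.items():
        failed = [name for name, rule in rules.items() if not rule(case)]
        if failed:
            violations[case_id] = failed
    return violations


log = {
    "ord-1": [
        {"activity": "submit", "state": "new"},
        {"activity": "trade", "state": "filled"},
    ],
    "ord-2": [
        {"activity": "submit", "state": "new"},
        {"activity": "cancel", "state": "canceled"},
        {"activity": "trade", "state": "filled"},  # trade after cancel
    ],
}

violations = check_log(log, RULES)
```

Because each rule receives the full events of a case, rules over attributes such as price or size fit the same interface, which is the information a plain DFG drops.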

  15. Order Book Event Logs
    ● Order Event Logs are limited to diagnosing the behavior of individual orders; they do not capture how orders interact with each other in an order book.
    ● Order Book Event Logs → each case c ∈ L refers to the sequence of events of all the orders in an order book, from the first to the last transaction during a trading day (or part of it) involving the trading of a specific security.
    Each event e in a case c is structured as a tuple:
    e = (secId, activity, timestamp, o₁, o₂)
    where:
    secId is the security identifier (case identifier).
    o₁ and o₂ are orders of the form o = (orderId, state, qty, price, side)
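
The order book event tuple can be encoded as a nested record, with the security identifier acting as the case key. The sketch below is an illustration under assumptions: it treats o₂ as absent for events that involve a single order (the slides do not say how such events are represented), and all identifiers and values are invented.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class Order(NamedTuple):
    order_id: str
    state: str
    qty: int
    price: float
    side: str


class BookEvent(NamedTuple):
    sec_id: str       # security identifier, used as the case identifier
    activity: str
    timestamp: str    # fixed-width time string, so string order = time order
    o1: Order
    o2: Optional[Order] = None  # assumption: empty for single-order events


def cases_by_security(events):
    """Build an order book event log: {sec_id: [events in timestamp order]}."""
    cases = defaultdict(list)
    for event in sorted(events, key=lambda e: e.timestamp):
        cases[event.sec_id].append(event)
    return dict(cases)


# Hypothetical events for two securities, deliberately out of order.
events = [
    BookEvent("SEC-B", "submit", "09:00:02", Order("o3", "new", 5, 20.0, "sell")),
    BookEvent("SEC-A", "trade", "09:00:01",
              Order("o1", "filled", 0, 9.0, "buy"),
              Order("o2", "filled", 0, 9.0, "sell")),
    BookEvent("SEC-A", "submit", "09:00:00", Order("o1", "new", 100, 9.0, "buy")),
]

cases = cases_by_security(events)
```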

  16. Order Book Event Logs – An Example
    An order book event log consisting of just one case; in other words, it describes the history of a trading session in a single order book, where many orders interact.
    e = (secId, activity, timestamp, o₁, o₂)
    Each event displays the activity executed and the orders involved, along with their new states after the execution of the event.

  17. Order Book Event Logs – Replay
    We developed a graphical prototype to replay cases in an order book event log.
    It provides a convenient visualization of order book dynamics, for example, for understanding the arrival of orders and their crossing.
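
The graphical prototype itself is not described in detail, but the idea of replay can be sketched without a GUI: since each event carries the new states of the orders involved, the book can be rebuilt step by step. The code below is a simplified illustration, not the authors' tool; the state names, the closing rule and the sample case are assumptions.

```python
CLOSED_STATES = {"filled", "canceled"}  # hypothetical: orders leaving the book


def replay(case):
    """Yield (activity, resting orders) after each event of one case.

    Each event is (sec_id, activity, timestamp, o1, o2), where o1 and o2 are
    (order_id, state, qty, price, side) tuples and o2 may be None.
    """
    book = {}
    for sec_id, activity, timestamp, o1, o2 in case:
        for order in (o1, o2):
            if order is None:
                continue
            order_id, state, qty, price, side = order
            if state in CLOSED_STATES:
                book.pop(order_id, None)   # order is no longer in the book
            else:
                book[order_id] = order     # record the order's new state
        yield activity, dict(book)         # copy, so snapshots stay independent


# Hypothetical case: a sell order arrives and crosses with a resting buy order.
case = [
    ("VTB24", "submit", "09:00:00", ("b1", "new", 40, 100.0, "buy"), None),
    ("VTB24", "submit", "09:00:05", ("s1", "new", 30, 100.0, "sell"), None),
    ("VTB24", "trade", "09:00:05",
     ("b1", "partially filled", 10, 100.0, "buy"),
     ("s1", "filled", 0, 100.0, "sell")),
]

snapshots = list(replay(case))
```

A visual replay would render each snapshot in turn; here the last one contains only the partially filled buy order `b1`.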

  18. Concluding Remarks and Future Work
    ● We developed an approach for extracting well-structured event logs for process mining analysis from FIX messages of trading systems.
    ● Order Event Logs – each case is related to an order execution.
    ○ Synthesis of behavior in the log into automata.
    ○ Future work: rule-based checking, model enhancement.
    ● Order Book Event Logs – each case is a trading session of a specific security in an order book.
    ○ Replay of order books supported by a graphical prototype.
    ○ Automata are not a suitable target model for synthesis.
    ○ Future work: relate order book event logs to other models that can describe distributed resources, e.g., Petri nets.
    ● Future work: Market Participant Logs? Extract and analyze trader behavior.

  19. General Research Lines
    This talk was about a single line of our research on the analysis of trading systems with process mining and formal models.
    Our joint research work covers the following research lines:
    1. Data Pre-Processing: Techniques for extracting well-structured event logs related to different processes in trading systems.
    2. Formal Modelling: Use and development of adequate formalisms for describing trading system behavior. Automata are not sufficient!
    3. Simulation: Development of tools for simulating the models. Models can be run to generate artificial behavior!
    4. Conformance Diagnosis: Development of methods for comparing event logs (observed behavior) and formal models (expected behavior).
    https://pais.hse.ru/research/projects/tradingsystems/

  20. Pre-Processing Network Messages of Trading Systems into Event Logs for Process Mining
    Julio Carrasquel¹, Sergey Chuburov², Irina Lomazova¹
    ¹ National Research University Higher School of Economics, Laboratory of Process-Aware Information Systems (PAIS Lab), Moscow, Russia
    ² QA Team Lead, Exactpro Systems, Kostroma, Russia
    Thank you!