Multi-perspective Process Mining with Embedding Configurations into DB-based Event Logs

Exactpro
November 08, 2019

Sergey Shershakov

International Conference on Software Testing, Machine Learning and Complex Process Analysis (TMPA-2019)
7-9 November 2019, Tbilisi

Video: https://youtu.be/MW7KPzogNXA

TMPA Conference website https://tmpaconf.org/
TMPA Conference on Facebook https://www.facebook.com/groups/tmpaconf/

Transcript

  1. Multi-Perspective Process Mining with Embedding
    Configurations into DB-based Event Logs
    Ivane Javakhishvili Tbilisi State University
    pais.hse.ru, cs.hse.ru
    PAIS lab

  2. Agenda
    • About process mining
    • Model and event log samples
    • Real event logs and their preprocessing
    – Event log standards
    • Converting raw event logs to DBs
    – Process perspectives and “flat” event logs
    • DB as an event log
    • Multi-perspective process mining

  3. Process Mining
    Aalst, W.M.P. van der. Process Mining. Data Science in Action. 2nd ed, 2016.
    [Figure: process mining overview. A process-aware information system supports/controls the “world” and records events (e.g., messages, transactions) in event logs. Process models model and analyze the “world” and specify, configure, implement and analyze the system. Discovery, conformance and enhancement connect event logs and process models.]

  4. Model samples: the handling of compensation requests
    Aalst, W.M.P. van der. Process Mining. Data Science in Action. 2nd ed, 2016.

  5. Event log is an entry point for process mining
    • There are many process mining techniques, but the event log is their
    single “entry point”
    • An event is a record of something that happened during
    a process execution
    – it is represented as a set of (event) attributes
    • A trace is an ordered sequence of events
    – it generally corresponds to a process instance (case)
    • An event log (EL) is a multiset of traces
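
The definitions above can be sketched as plain Python data structures. This is only an illustration: the attribute names and the sample traces are assumptions, and traces are reduced to their activity names for brevity.

```python
from collections import Counter

# An event is a set of attributes; here a plain dict (attribute names are illustrative).
event = {"case_id": 1, "activity": "a", "timestamp": "2019-11-08T10:00:00"}

# A trace is an ordered sequence of events; only activity names are kept here.
trace_1 = ("a", "b", "d")
trace_2 = ("a", "b", "d", "g")
trace_3 = ("a", "b", "d")  # a second case that followed the same path

# An event log is a multiset of traces: identical traces are counted, not merged.
event_log = Counter([trace_1, trace_2, trace_3])


def variants(log):
    """Return the distinct traces with their multiplicities, most frequent first."""
    return log.most_common()
```

Using a `Counter` keeps the multiplicity of each trace, which is what distinguishes a multiset of traces from a plain set.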

  6. Event logs: reality and an abstract representation
    abcdef
    abcdeg
    abcdfe
    abcdfg
    abd
    abdg
    abdef
    abdeg
    An event log is a multiset of traces

  7. The model: the handling of compensation requests
    • Case ID is an identifier of a process instance (PI)
    – one PI is normally represented by one trace in an EL
    • Activity determines an action within process execution
    • Timestamp is used to order events within traces
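
A minimal sketch of how the three attributes work together, assuming the events arrive as unordered (case ID, activity, timestamp) records. The records and activity names are hypothetical, and timestamps are ISO 8601 strings so that string order matches time order.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical event records: (case ID, activity, timestamp).
records = [
    (2, "register request", "2019-11-08T11:00:00"),
    (1, "decide",           "2019-11-08T10:30:00"),
    (1, "register request", "2019-11-08T10:00:00"),
    (2, "decide",           "2019-11-08T11:45:00"),
    (1, "check ticket",     "2019-11-08T10:10:00"),
]


def build_traces(rows):
    """Map each case ID to its activities ordered by timestamp."""
    ordered = sorted(rows, key=itemgetter(0, 2))  # by case ID, then timestamp
    return {
        case_id: [activity for _, activity, _ in group]
        for case_id, group in groupby(ordered, key=itemgetter(0))
    }


traces = build_traces(records)
```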

  8. A bit of real life…

  9. Preprocessing of event data
    • Convert a low-level “raw” log into a standardized form
    – Python scripts for fast prototyping
    – a C++ tool for fast conversion, better suited to repeatedly converting
    logs of the same type
    [Figure: technical (raw) log → preprocessing → prepared event log]
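
A prototyping script of the kind mentioned above might look like the following sketch. The raw line layout, the regular expression and the function name are assumptions made for illustration; the talk does not show the actual raw format.

```python
import csv
import io
import re

# Hypothetical raw line layout: "<timestamp> [inv=<case id>] <action>".
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[inv=(\d+)\] (.+)$")


def raw_to_csv(raw_lines):
    """Convert raw log lines to CSV text with CaseID, Activity, Timestamp columns."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["CaseID", "Activity", "Timestamp"])
    for line in raw_lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip lines that are not event records
        timestamp, case_id, action = match.groups()
        writer.writerow([case_id, action, timestamp])
    return out.getvalue()


raw = [
    "2015-05-19 14:40:02 [inv=17] register",
    "-- heartbeat --",
    "2015-05-19 14:45:27 [inv=17] decide",
]
csv_text = raw_to_csv(raw)
```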

  10. Event log standards
    • CSV is a universal format, suitable for many tools
    • XES is a dedicated standard for event data
    • MXML…

  11. eXtensible Event Stream (XES)
    • XML defines the nested “log” / “trace” / “event” structure
    • An event is a set of attributes
    • A different “activity” attribute can be chosen without rebuilding the log
    – this gives another process perspective
    • Remapping “caseid” and “timestamp” changes the log structure
    – the log must be rebuilt
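
The structure can be illustrated with a small XES-like tree built with the standard library. This is a sketch, not a schema-complete XES file: extension declarations are omitted, and the `unit` attribute and sample values are assumptions. `concept:name` and `time:timestamp` are the usual XES attribute keys.

```python
import xml.etree.ElementTree as ET


def build_xes(traces):
    """Build a minimal XES-like tree: log > trace > event, each event a set of attributes."""
    log = ET.Element("log", {"xes.version": "1.0"})
    for case_id, events in traces.items():
        trace = ET.SubElement(log, "trace")
        ET.SubElement(trace, "string", {"key": "concept:name", "value": str(case_id)})
        for attrs in events:
            event = ET.SubElement(trace, "event")
            for key, value in attrs.items():
                tag = "date" if key == "time:timestamp" else "string"
                ET.SubElement(event, tag, {"key": key, "value": value})
    return log


def activities(log, key):
    """Read the activity of every event from the attribute chosen as the activity."""
    return [e.find(f"string[@key='{key}']").get("value") for e in log.iter("event")]


sample = {17: [{"concept:name": "decide", "unit": "back office",
                "time:timestamp": "2015-05-19T14:45:27"}]}
xes = build_xes(sample)
xml_text = ET.tostring(xes, encoding="unicode")
```

Calling `activities(xes, "unit")` instead of `activities(xes, "concept:name")` switches the activity attribute without touching the file, whereas a different case ID would change which events sit under which `trace` element.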

  12. Classic approach for event log representation
    • “Flat” event logs:
    – one event log file (XES, XML) per process perspective
    – one perspective corresponds to one process model
    • varying a mining algorithm's parameters allows the model to be modified

  13. A problem: another process perspective
    • What if we need to look at a process from another
    point of view?

  14. A model for a sample event log: View 1
    Shershakov and Rubin. System Runs Analysis with Process Mining, 2015.

  15. Another model for the same event log: View 2
    Shershakov and Rubin. System Runs Analysis with Process Mining, 2015.

  16. A problem: another process perspective
    • What if we need to look at a process from another
    point of view?
    – The event log then has to be rebuilt from scratch:
    • export the event data as a new event log with the other perspective
    • load the initial log into RAM as an abstract object model and
    save it as a new file

  17. Preprocessing: convert raw data to a database
    [Figure: technical (raw) log → preprocessing → DB]

  18. Exporting a process perspective from a DB as a
    “flat” event log
    • For event data represented as a database:
    1. map the PM attributes (CaseID, Activity, Timestamp) to relational
    attributes of the DB tables holding the event data;
    2. apply SQL queries to obtain the desired view (filtering, projection,
    etc.);
    3. export the data from the view as a “flat” event log (XES);
    4. run mining on the “flat” log.
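
Steps 1 to 3 can be sketched with SQLite from Python. The sample rows, the view name `FlatLog` and the function name are assumptions, and the export writes CSV rather than XES to keep the example short.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Events (Inv_ID INTEGER, Action TEXT, Unit_name TEXT, Start_timestamp TEXT)"
)
conn.executemany(
    "INSERT INTO Events VALUES (?, ?, ?, ?)",
    [
        (17, "register", "front desk", "2015-05-19 14:40:02"),
        (17, "decide", "back office", "2015-05-19 14:45:27"),
        (20, "register", "front desk", "2015-05-19 15:00:00"),
    ],
)

# Steps 1-2: map the PM attributes to table columns and define the desired view.
conn.execute(
    "CREATE VIEW FlatLog AS "
    "SELECT Inv_ID AS CaseID, Action AS Activity, Start_timestamp AS Timestamp "
    "FROM Events"
)


def export_flat_log(connection):
    """Step 3: dump the view as a flat CSV log, ordered by case and time."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["CaseID", "Activity", "Timestamp"])
    writer.writerows(
        connection.execute(
            "SELECT CaseID, Activity, Timestamp FROM FlatLog ORDER BY CaseID, Timestamp"
        )
    )
    return out.getvalue()


flat_log = export_flat_log(conn)
```

Changing the perspective in this workflow means redefining the view and exporting a new file, which is the cost the DB-as-an-event-log approach is meant to avoid.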

  19. Relating specific event log formats with
    the abstract representation
    abcdef
    abcdeg
    abcdfe
    abcdfg
    abd
    abdg
    abdef
    abdeg
    Translation engine

  20. Database as an event log
    Perspective 1: Action → Activity, Start_timestamp → Timestamp
    Perspective 2: Unit_name → Activity, Complete_timestamp → Timestamp

  21. Sample SQL queries for a DB event log
    • Query Q1 (id qryl_traces) extracting a set of traces:
    – SELECT Inv_ID FROM Events GROUP BY Inv_ID
    – result: <17; 20; 21>
    • Query Q2 (id qryl_get_trace_events) extracting a single trace determined by
    a given parameter:
    – SELECT Inv_ID as CaseID, Action as Activity,
    Start_timestamp as Timestamp FROM Events WHERE CaseID = ?1
    ORDER BY Timestamp
    – specific results for ?1 = 17:
    … ‘decide’, 2015-05-19 14:45:27)>

  22. Sample SQL queries for a DB event log
    • Modified version of the query Q2 (only activity names considered):
    – SELECT Action as Activity FROM Events WHERE Inv_ID = ?1
    ORDER BY Start_timestamp
    – specific results for ?1 = 17: .
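
The sample queries can be tried against an in-memory SQLite database. In this sketch the table contents are invented apart from the case IDs 17, 20, 21 and the ‘decide’ event quoted on the slides, and the Python variable names are illustrative. Q2 filters on the alias `CaseID` in its WHERE clause, which SQLite accepts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Events (Inv_ID INTEGER, Action TEXT, Start_timestamp TEXT)")
conn.executemany(
    "INSERT INTO Events VALUES (?, ?, ?)",
    [
        (17, "decide", "2015-05-19 14:45:27"),
        (17, "register", "2015-05-19 14:40:02"),  # hypothetical earlier event
        (20, "register", "2015-05-19 15:00:00"),  # hypothetical
        (21, "register", "2015-05-19 16:00:00"),  # hypothetical
    ],
)

# Q1: the set of traces (case identifiers).
Q1 = "SELECT Inv_ID FROM Events GROUP BY Inv_ID"
# Q2: all events of one trace, ordered by time.
Q2 = (
    "SELECT Inv_ID AS CaseID, Action AS Activity, Start_timestamp AS Timestamp "
    "FROM Events WHERE CaseID = ?1 ORDER BY Timestamp"
)
# Modified Q2: activity names only.
Q2_ACTIVITIES = (
    "SELECT Action AS Activity FROM Events WHERE Inv_ID = ?1 ORDER BY Start_timestamp"
)

case_ids = sorted(row[0] for row in conn.execute(Q1))
trace_17 = conn.execute(Q2, (17,)).fetchall()
activities_17 = [row[0] for row in conn.execute(Q2_ACTIVITIES, (17,))]
```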

  23. Configuring a process perspective
    • To implement all the operations that a process mining algorithm applies to
    an abstract event log, one needs to define only 15 named queries
    and 3 parameters (attribute names for trace, activity and timestamp).

  24. Using a DB event log in a mining tool
    • Implementation: the LDOPA library and VTMine4Visio, a tool for graphical
    modeling
    [Figure: an experiment model for transition systems discovery]
    [Figure: parameters of an SQLite-EventLog DPM block]

  25. [Figure only]

  26. Many process perspectives — many configurations
    • Modified query Q2 for another view, Perspective 2:
    – SELECT Unit_name as Activity FROM Events WHERE Inv_ID = ?1
    ORDER BY Complete_timestamp
    – specific results for ?1 = 17: ;.
    • Obtaining a chosen perspective configuration:
    – SELECT param, value FROM Config WHERE persp = 1
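
One way to picture configurations embedded in the DB is a `Config` table that stores each named query per perspective. The table layout follows the `SELECT param, value FROM Config WHERE persp = 1` query above; storing the SQL text under the query id, the sample rows and the helper names are assumptions, not the actual LDOPA layout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Events (Inv_ID INTEGER, Action TEXT, Unit_name TEXT,
                     Start_timestamp TEXT, Complete_timestamp TEXT);
INSERT INTO Events VALUES
  (17, 'register', 'front desk',  '2015-05-19 14:40:02', '2015-05-19 14:41:00'),
  (17, 'decide',   'back office', '2015-05-19 14:45:27', '2015-05-19 14:50:00');
CREATE TABLE Config (persp INTEGER, param TEXT, value TEXT);
INSERT INTO Config VALUES
  (1, 'qryl_get_trace_events',
      'SELECT Action AS Activity FROM Events WHERE Inv_ID = ?1 ORDER BY Start_timestamp'),
  (2, 'qryl_get_trace_events',
      'SELECT Unit_name AS Activity FROM Events WHERE Inv_ID = ?1 ORDER BY Complete_timestamp');
""")


def load_perspective(connection, persp):
    """Read one perspective configuration as a {param: value} dict."""
    rows = connection.execute(
        "SELECT param, value FROM Config WHERE persp = ?", (persp,)
    )
    return dict(rows)


def get_trace(connection, persp, case_id):
    """Fetch the activities of one case under the chosen perspective."""
    query = load_perspective(connection, persp)["qryl_get_trace_events"]
    return [row[0] for row in connection.execute(query, (case_id,))]
```

With this layout, `get_trace(conn, 1, 17)` and `get_trace(conn, 2, 17)` read the same events under two perspectives without exporting anything.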

  27. DB event logs vs. “flat” event logs
    • DB event logs:
    – Rapid event data extraction thanks to DB indexing
    – No rebuilding when the process perspective changes, a problem
    inherent to “flat” event logs
    – …
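
The indexing point can be checked in SQLite with `EXPLAIN QUERY PLAN`. The index name and helper function are assumptions; the sketch only shows that a composite index on the case and timestamp columns is picked up by the trace-extraction query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Events (Inv_ID INTEGER, Action TEXT, Start_timestamp TEXT)")
# Composite index: look up a case, then read its events already ordered by time.
conn.execute("CREATE INDEX idx_events_case_time ON Events (Inv_ID, Start_timestamp)")


def query_plan(connection, sql, params=()):
    """Return the human-readable steps of SQLite's plan for a query."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description


plan = query_plan(
    conn,
    "SELECT Action FROM Events WHERE Inv_ID = ?1 ORDER BY Start_timestamp",
    (17,),
)
```

The exact wording of the plan differs between SQLite versions, so it is safer to look for the index name in the output than to compare whole strings.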

  28. Multi-perspective process mining
    • A single perspective is represented by a single configuration embedded
    into the DB
    • The DB-as-an-event-log is instrumented with multiple perspective
    configurations
    • The process mining tool switches between prepared perspectives with a
    simple SQL query

  29. References (1)
    1. van der Aalst W.M.P., Process Mining — Data Science in Action, 2nd Edition, Springer, 2016.
    2. van der Aalst, W.M.P.: Extracting Event Data from Databases to Unleash Process Mining, pp. 105–128.
    Springer International Publishing, Cham (2015).
    3. Dijkman, R., Gao, J., Syamsiyah, A., van Dongen, B., Grefen, P., ter Hofstede, A.: Enabling efficient process
    mining on large data sets: realizing an in-database process mining operator. Distributed and Parallel Databases
    (2019).
    4. van Dongen, B.F., Shabani, S.: Relational XES: Data management for process mining. pp. 169–176 (2015).
    5. Gonzalez Lopez de Murillas, E., Reijers, H.A., van der Aalst, W.M.P.: Connecting databases with process
    mining: A meta model and toolset. In: Schmidt, R., Guedria, W., Bider, I., Guerreiro, S. (eds.) Enterprise,
    Business-Process and Information Systems Modeling. pp. 231–249. Springer International Publishing, Cham
    (2016).
    6. de Murillas, E.G.L., van der Aalst, W.M.P., Reijers, H.A.: Process mining on databases: Unearthing historical
    data from redo logs. In: Motahari-Nezhad, H.R., Recker, J., Weidlich, M. (eds.) Business Process Management.
    pp. 367–385. Springer International Publishing, Cham (2015).
    7. Shershakov, S.A., Rubin, V.A.: System runs analysis with process mining. Modeling and Analysis of
    Information Systems 22(6), 818–833 (2015).

  30. References (2)
    8. Shershakov, S.: Enhancing efficiency of process mining algorithms with a tailored library: Design principles
    and performance assessment. Tech. rep., National Research University Higher School of Economics (2018).
    9. Shershakov, S.A.: VTMine framework as applied to process mining modeling. International Journal of
    Computer and Communication Engineering 4(3), 166–179 (2015).
    10. Shershakov, S.A.: Multi-Perspective Process Mining with Embedding Configurations into DB-based Event
    Logs, CCIS (Proceedings of TMPA-2019), Springer. In press.

  31. Thank you!
