Exactpro
November 08, 2019

Multi-perspective Process Mining with Embedding Configurations into DB-based Event Logs

Sergey Shershakov

International Conference on Software Testing, Machine Learning and Complex Process Analysis (TMPA-2019)
7-9 November 2019, Tbilisi

Video: https://youtu.be/MW7KPzogNXA

TMPA Conference website https://tmpaconf.org/
TMPA Conference on Facebook https://www.facebook.com/groups/tmpaconf/

Transcript

  1. Multi-Perspective Process Mining with Embedding Configurations into DB-based Event Logs

    Ivane Javakhishvili Tbilisi State University pais.hse.ru, cs.hse.ru PAIS lab
  2. Agenda

    • About process mining
    • Model and event log samples
    • Real event logs and their preprocessing
      – Event log standards
    • Converting raw event logs to DBs
      – Process perspectives and “flat” event logs
    • DB as an event log
    • Multi-perspective process mining
  3. Process Mining

    [Figure: the process mining framework. Elements: “world”, process-aware information system, event logs, process models. Relations: supports/controls; records events, e.g., messages, transactions, etc.; models; analyzes; specifies; configures; implements. Mining tasks: discovery, conformance, enhancement.]
    Source: Aalst, W.M.P. van der. Process Mining. Data Science in Action. 2nd ed, 2016.
  4. Model samples: the handling of compensation requests

    Source: Aalst, W.M.P. van der. Process Mining. Data Science in Action. 2nd ed, 2016.
  5. Event log is an entry point for process mining

    • There are many process mining techniques, but only a single “entry point”
    • An event is a record of something that happened during a process execution
      – it is represented as a set of (event) attributes
    • A trace is an ordered sequence of events
      – it generally corresponds to a process instance (case)
    • An event log (EL) is a multiset of traces
  6. Event logs: reality and an abstract representation

    • Sample traces: abcdef, abcdeg, abcdfe, abcdfg, abd, abdg, abdef, abdeg
    • Event log is a multiset of traces
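The abstract representation above can be sketched in Python as a multiset of traces. This is only an illustration, assuming each letter stands for one activity; the repeated `abd` trace is an assumption added to show multiplicity, since the slide lists each trace once.

```python
from collections import Counter

# Traces from the slide; each letter stands for one activity.
# The second "abd" is an assumption added to show that a multiset
# records how many times the same trace occurs.
observed = ["abcdef", "abcdeg", "abcdfe", "abcdfg",
            "abd", "abdg", "abdef", "abdeg", "abd"]

# An event log as a multiset: trace (tuple of activities) -> multiplicity.
event_log = Counter(tuple(trace) for trace in observed)

distinct_traces = len(event_log)        # number of different traces
total_traces = sum(event_log.values())  # number of process instances
```

A `Counter` keyed by tuples keeps the order of activities inside a trace while ignoring the order of the traces themselves, which matches the multiset definition.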
  7. The model: the handling of compensation requests

    • Case ID is an identifier of a process instance (PI)
      – one PI is normally represented by one trace in an EL
    • Activity determines an action within process execution
    • Timestamp is used to order events within traces
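The roles of the three attributes can be sketched as follows: Case ID groups events into traces, Timestamp orders them, and Activity names each step. This is a minimal sketch; the function name `build_traces` and the event for case 20 are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Unordered events as (case_id, activity, timestamp) records.
# The record for case 20 is hypothetical.
events = [
    (17, "doc", "2015-05-19 14:20:01"),
    (17, "init", "2015-05-19 14:06:27"),
    (20, "init", "2015-05-20 09:00:00"),
    (17, "decide", "2015-05-19 14:45:27"),
]

def build_traces(events):
    """Group events by case ID and order each group by timestamp."""
    by_case = defaultdict(list)
    for case_id, activity, ts in events:
        parsed = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
        by_case[case_id].append((parsed, activity))
    return {
        case_id: [activity for _, activity in sorted(items, key=lambda e: e[0])]
        for case_id, items in by_case.items()
    }

traces = build_traces(events)
```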
  8. Preprocessing of event data

    • Convert a low-level “raw” log into a standardized form
      – Python scripts for fast prototyping
      – C++ tool for fast conversion, better suited to repeatedly converting logs of the same type
    [Figure: Technical (raw) log → Preprocessing → Prepared event log]
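A prototype preprocessing script might look like the sketch below. The raw line layout (`timestamp [case=N] activity`) and the names `RAW_LINE` and `convert` are assumptions made for illustration; the slides do not show the actual raw log format.

```python
import csv
import io
import re

# Hypothetical raw log line: "2015-05-19 14:06:27 [case=17] init"
RAW_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"\[case=(?P<case>\d+)\] (?P<activity>\w+)$"
)

def convert(raw_lines, out):
    """Write recognised raw lines as CSV rows; return how many were written."""
    writer = csv.writer(out)
    writer.writerow(["CaseID", "Activity", "Timestamp"])
    written = 0
    for line in raw_lines:
        match = RAW_LINE.match(line.strip())
        if match is None:
            continue  # skip lines that do not describe an event
        writer.writerow([match["case"], match["activity"], match["ts"]])
        written += 1
    return written

raw = [
    "2015-05-19 14:06:27 [case=17] init",
    "-- service restarted --",
    "2015-05-19 14:20:01 [case=17] doc",
]
buffer = io.StringIO()
count = convert(raw, buffer)
csv_lines = buffer.getvalue().splitlines()
```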
  9. Event log standards

    • CSV is a universal format, suitable for many tools
    • XES is a special event data-related standard
    • MXML…
  10. eXtensible Event Stream (XES)

    • XML determines the “event” / “trace” / “log” structure
    • An event is a set of attributes
    • Another “activity” attribute can be chosen without rebuilding the log
      – this gives another process perspective
    • Remapping “caseid” and “timestamp” changes the log structure
      – the log must be rebuilt
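The point about choosing another activity attribute can be illustrated with a small XES-like document built with Python's standard library. This is a sketch rather than a complete XES file: extension and classifier declarations are omitted, the helper names are made up, and the pairing of actions with unit names is an assumption.

```python
import xml.etree.ElementTree as ET

def make_event(attributes, timestamp):
    """Build an <event> element from string attributes plus a timestamp."""
    event = ET.Element("event")
    for key, value in attributes.items():
        ET.SubElement(event, "string", key=key, value=value)
    ET.SubElement(event, "date", key="time:timestamp", value=timestamp)
    return event

# log -> trace -> event nesting; the trace carries the case identifier.
log = ET.Element("log")
trace = ET.SubElement(log, "trace")
ET.SubElement(trace, "string", key="concept:name", value="17")
trace.append(make_event({"Action": "init", "Unit_name": "office1"}, "2015-05-19T14:06:27"))
trace.append(make_event({"Action": "doc", "Unit_name": "acc2"}, "2015-05-19T14:20:01"))
trace.append(make_event({"Action": "decide", "Unit_name": "office4"}, "2015-05-19T14:45:27"))

def activity_sequence(trace, activity_key):
    """Read the same trace using a different attribute as the activity."""
    return [
        attr.get("value")
        for event in trace.findall("event")
        for attr in event.findall("string")
        if attr.get("key") == activity_key
    ]

by_action = activity_sequence(trace, "Action")
by_unit = activity_sequence(trace, "Unit_name")
xes_text = ET.tostring(log, encoding="unicode")
```

Switching the activity attribute only changes which key is read. Changing the case identifier or the ordering attribute would instead change how events are nested and ordered, so the file would have to be regenerated.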
  11. Classic approach for event log representation

    • “Flat” event logs:
      – a single event log file (XES, XML) per process perspective
      – a single perspective yields a single process model
        • varying a mining algorithm's parameters allows modification of the model
  12. A problem: another process perspective

    • What if we need to look at a process from another point of view?
  13. A model for a sample event log: View 1

    Source: Shershakov and Rubin. System Runs Analysis with Process Mining, 2015.
  14. Another model for the same event log: View 2

    Source: Shershakov and Rubin. System Runs Analysis with Process Mining, 2015.
  15. A problem: another process perspective

    • What if we need to look at a process from another point of view?
      – Now we have to rebuild the event log from scratch
        • (export the event data as a new event log with another perspective)
        • load the initial log into RAM as an abstracted object model and save it as a new file
  16. Exporting a process perspective from a DB as a “flat” event log

    • For event data represented as a database:
      1. map the PM attributes (CaseID, Activity, Timestamp) to relational attributes of the DB tables holding event data;
      2. apply SQL queries to obtain the desired view (filtering, projection, etc.);
      3. export the data from the view as a “flat” event log (XES);
      4. do mining with the “flat” log.
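Steps 1 to 3 can be sketched with SQLite from the Python standard library. The table layout follows the column names used elsewhere in the deck; the completion timestamps, the row for case 20 and the function name `export_flat_log` are assumptions, and CSV is used here in place of XES only to keep the example short.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Events (Inv_ID INTEGER, Action TEXT, Unit_name TEXT, "
    "Start_timestamp TEXT, Complete_timestamp TEXT)"
)
conn.executemany(
    "INSERT INTO Events VALUES (?, ?, ?, ?, ?)",
    [
        (17, "init", "office1", "2015-05-19 14:06:27", "2015-05-19 14:10:00"),
        (17, "doc", "acc2", "2015-05-19 14:20:01", "2015-05-19 14:30:00"),
        (17, "decide", "office4", "2015-05-19 14:45:27", "2015-05-19 14:50:00"),
        (20, "init", "office1", "2015-05-20 09:00:00", "2015-05-20 09:05:00"),
    ],
)

# Step 1: map PM attributes to relational attributes (trusted identifiers).
mapping = {"CaseID": "Inv_ID", "Activity": "Action", "Timestamp": "Start_timestamp"}

def export_flat_log(conn, mapping, out):
    """Query the mapped view and write it out as a flat CSV event log."""
    case, act, ts = mapping["CaseID"], mapping["Activity"], mapping["Timestamp"]
    # Step 2: column names cannot be bound as SQL parameters,
    # so the query text is assembled from the mapping.
    query = f"SELECT {case}, {act}, {ts} FROM Events ORDER BY {case}, {ts}"
    rows = conn.execute(query).fetchall()
    # Step 3: export the view as a flat log.
    writer = csv.writer(out)
    writer.writerow(["CaseID", "Activity", "Timestamp"])
    writer.writerows(rows)
    return len(rows)

buffer = io.StringIO()
exported = export_flat_log(conn, mapping, buffer)
flat_lines = buffer.getvalue().splitlines()
```

Every change of perspective means another run of this export and another file, which is the cost the DB-based approach in the following slides avoids.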
  17. Relating specific event log formats with the abstract representation

    [Figure: specific event log formats are related to the abstract representation (abcdef, abcdeg, abcdfe, abcdfg, abd, abdg, abdef, abdeg) through a translation engine]
  18. Database as an event log

    • Perspective 1: Action → Activity, Start_timestamp → Timestamp
    • Perspective 2: Unit_name → Activity, Complete_timestamp → Timestamp
  19. Sample SQL queries for a DB event log

    • Query Q1 (id qryl_traces) extracting a set of traces:
      – SELECT Inv_ID FROM Events GROUP BY Inv_ID
      – result: <17; 20; 21>
    • Query Q2 (id qryl_get_trace_events) extracting a single trace determined by a given parameter:
      – SELECT Inv_ID as CaseID, Action as Activity, Start_timestamp as Timestamp FROM Events WHERE CaseID = ?1 ORDER BY Timestamp
      – specific results for ?1 = 17: <(17, ‘init’, 2015-05-19 14:06:27); (17, ‘doc’, 2015-05-19 14:20:01); (17, ‘decide’, 2015-05-19 14:45:27)>
  20. Sample SQL queries for a DB event log

    • Query Q2 (id qryl_get_trace_events) extracting a single trace determined by a given parameter:
      – SELECT Inv_ID as CaseID, Action as Activity, Start_timestamp as Timestamp FROM Events WHERE CaseID = ?1 ORDER BY Timestamp
      – specific results for ?1 = 17: <(17, ‘init’, 2015-05-19 14:06:27); (17, ‘doc’, 2015-05-19 14:20:01); (17, ‘decide’, 2015-05-19 14:45:27)>
    • Modified version of the query Q2 (only activity names considered):
      – SELECT Action as Activity FROM Events WHERE Inv_ID = ?1 ORDER BY Start_timestamp
      – specific results for ?1 = 17: <‘init’; ‘doc’; ‘decide’>
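The queries above can be tried against an in-memory SQLite database. This is a sketch under a few assumptions: the rows for cases 20 and 21 are invented so that Q1 returns three cases, the plain `?` placeholder replaces `?1`, and Q2 filters on the underlying column `Inv_ID` instead of the alias, which is the more portable form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Events (Inv_ID INTEGER, Action TEXT, Start_timestamp TEXT)")
conn.executemany(
    "INSERT INTO Events VALUES (?, ?, ?)",
    [
        (17, "doc", "2015-05-19 14:20:01"),
        (17, "init", "2015-05-19 14:06:27"),
        (17, "decide", "2015-05-19 14:45:27"),
        (20, "init", "2015-05-20 09:00:00"),  # assumed row
        (21, "init", "2015-05-21 09:00:00"),  # assumed row
    ],
)

# Q1: the set of traces (one row per case).
Q1 = "SELECT Inv_ID FROM Events GROUP BY Inv_ID"
# Q2: all events of one trace, ordered by timestamp.
Q2 = (
    "SELECT Inv_ID AS CaseID, Action AS Activity, Start_timestamp AS Timestamp "
    "FROM Events WHERE Inv_ID = ? ORDER BY Timestamp"
)
# Modified Q2: activity names only.
Q2_ACTIVITIES = (
    "SELECT Action AS Activity FROM Events WHERE Inv_ID = ? ORDER BY Start_timestamp"
)

case_ids = sorted(row[0] for row in conn.execute(Q1))
trace_17 = conn.execute(Q2, (17,)).fetchall()
activities_17 = [row[0] for row in conn.execute(Q2_ACTIVITIES, (17,))]
```

The timestamps are stored as ISO-style text, so ordering them as strings gives the same result as ordering them chronologically.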
  21. Configuring a process perspective

    • To implement all the operations that a process mining algorithm applies to an abstract event log, one needs to define only 15 named queries and 3 parameters (the attribute names for trace, activity and timestamp).
  22. Using a DB event log in a mining tool

    • Implementation: LDOPA library and a tool for graphical modeling called VTMine4Visio
    [Figures: an experiment model for transition systems discovery; parameters of an SQLite-EventLog DPM block]
  23.

  24. Many process perspectives, many configurations

    • Modified query Q2 for another view, Perspective 2:
      – SELECT Unit_name as Activity FROM Events WHERE Inv_ID = ?1 ORDER BY Complete_timestamp
      – specific results for ?1 = 17: <office1; acc2; office4>
    • Obtaining a chosen perspective configuration:
      – SELECT param, value FROM Config WHERE persp = 1
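One way to picture a configuration embedded in the database is a `Config` table holding the three attribute names per perspective. The sketch below assumes the parameter names `trace`, `activity` and `timestamp`, the helper names, and the completion timestamps; the slides show only the `Config` query itself, not the table contents.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Events (Inv_ID INTEGER, Action TEXT, Unit_name TEXT,
                     Start_timestamp TEXT, Complete_timestamp TEXT);
CREATE TABLE Config (persp INTEGER, param TEXT, value TEXT);
""")
conn.executemany(
    "INSERT INTO Events VALUES (?, ?, ?, ?, ?)",
    [
        (17, "init", "office1", "2015-05-19 14:06:27", "2015-05-19 14:10:00"),
        (17, "doc", "acc2", "2015-05-19 14:20:01", "2015-05-19 14:30:00"),
        (17, "decide", "office4", "2015-05-19 14:45:27", "2015-05-19 14:50:00"),
    ],
)
# Hypothetical parameter names; one set of three values per perspective.
conn.executemany(
    "INSERT INTO Config VALUES (?, ?, ?)",
    [
        (1, "trace", "Inv_ID"), (1, "activity", "Action"),
        (1, "timestamp", "Start_timestamp"),
        (2, "trace", "Inv_ID"), (2, "activity", "Unit_name"),
        (2, "timestamp", "Complete_timestamp"),
    ],
)

def load_perspective(conn, persp):
    """Read the configuration of one perspective as a param -> value dict."""
    rows = conn.execute("SELECT param, value FROM Config WHERE persp = ?", (persp,))
    return dict(rows)

def trace_activities(conn, persp, case_id):
    """Run the activity-only query using the chosen perspective's attributes."""
    cfg = load_perspective(conn, persp)
    # Identifiers come from the trusted Config table; the case id is bound.
    query = (
        f"SELECT {cfg['activity']} AS Activity FROM Events "
        f"WHERE {cfg['trace']} = ? ORDER BY {cfg['timestamp']}"
    )
    return [row[0] for row in conn.execute(query, (case_id,))]

view_1 = trace_activities(conn, 1, 17)
view_2 = trace_activities(conn, 2, 17)
```

Switching perspectives then amounts to passing a different `persp` value; the event data itself is never copied or rebuilt.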
  25. DB event logs vs. “flat” event logs

    • DB event logs:
      – Rapid event data extraction with DB indexing
      – Avoiding the “flat” event log problem of rebuilding the log whenever the process perspective changes
      – …
  26. Multi-perspective process mining

    • A single perspective is represented by a single configuration embedded into a DB
    • A DB-as-an-event-log is instrumented with multiple perspective configurations
    • Switching between prepared process perspectives takes a simple SQL query issued on the side of the process mining software tool
  27. References (1)

    1. van der Aalst, W.M.P.: Process Mining: Data Science in Action, 2nd edn. Springer (2016).
    2. van der Aalst, W.M.P.: Extracting Event Data from Databases to Unleash Process Mining, pp. 105–128. Springer International Publishing, Cham (2015).
    3. Dijkman, R., Gao, J., Syamsiyah, A., van Dongen, B., Grefen, P., ter Hofstede, A.: Enabling efficient process mining on large data sets: realizing an in-database process mining operator. Distributed and Parallel Databases (2019).
    4. van Dongen, B.F., Shabani, S.: Relational XES: Data management for process mining. pp. 169–176 (2015).
    5. Gonzalez Lopez de Murillas, E., Reijers, H.A., van der Aalst, W.M.P.: Connecting databases with process mining: A meta model and toolset. In: Schmidt, R., Guedria, W., Bider, I., Guerreiro, S. (eds.) Enterprise, Business-Process and Information Systems Modeling. pp. 231–249. Springer International Publishing, Cham (2016).
    6. de Murillas, E.G.L., van der Aalst, W.M.P., Reijers, H.A.: Process mining on databases: Unearthing historical data from redo logs. In: Motahari-Nezhad, H.R., Recker, J., Weidlich, M. (eds.) Business Process Management. pp. 367–385. Springer International Publishing, Cham (2015).
    7. Shershakov, S.A., Rubin, V.A.: System runs analysis with process mining. Modeling and Analysis of Information Systems 22(6), 818–833 (2015).
  28. References (2)

    8. Shershakov, S.: Enhancing efficiency of process mining algorithms with a tailored library: Design principles and performance assessment. Tech. rep., National Research University Higher School of Economics (2018).
    9. Shershakov, S.A.: VTMine framework as applied to process mining modeling. International Journal of Computer and Communication Engineering 4(3), 166–179 (2015).
    10. Shershakov, S.A.: Multi-Perspective Process Mining with Embedding Configurations into DB-based Event Logs. CCIS (Proceedings of TMPA-2019), Springer. In press.