Slide 1

Slide 1 text

Multi-Perspective Process Mining with Embedding Configurations into DB-based Event Logs
Ivane Javakhishvili Tbilisi State University
pais.hse.ru, cs.hse.ru
PAIS lab

Slide 2

Slide 2 text

Agenda
• About process mining
• Model and event log samples
• Real event logs and their preprocessing
  – Event log standards
• Converting raw event logs to DBs
  – Process perspectives and “flat” event logs
• DB as an event log
• Multi-perspective process mining

Slide 3

Slide 3 text

Process Mining
[Figure: process mining overview. Elements: “world”, process-aware information system, event logs, process models. Relations: supports/controls; records events, e.g., messages, transactions, etc.; models, analyzes; specifies, configures, implements, analyzes; discovery, conformance, enhancement.]
Source: Aalst, W.M.P. van der. Process Mining. Data Science in Action. 2nd ed, 2016.

Slide 4

Slide 4 text

Model samples: the handling of compensation requests
Source: Aalst, W.M.P. van der. Process Mining. Data Science in Action. 2nd ed, 2016.

Slide 5

Slide 5 text

An event log is the entry point for process mining
• There are many process mining techniques, but only a single “entry point”
• An event is a record of something that happened during a process execution
  – it is represented as a set of (event) attributes
• A trace is an ordered sequence of events
  – it generally corresponds to a process instance (case)
• An event log (EL) is a multiset of traces

Slide 6

Slide 6 text

Event logs: reality and an abstract representation
Traces: abcdef, abcdeg, abcdfe, abcdfg, abd, abdg, abdef, abdeg
An event log is a multiset of traces

Slide 7

Slide 7 text

The model: the handling of compensation requests
• Case ID is an identifier of a process instance (PI)
  – one PI is normally represented by one trace in an EL
• Activity determines an action within a process execution
• Timestamp is used to order events within traces
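The three attributes above are enough to turn a table of events into traces and then into a multiset. The following is a minimal sketch in Python, not the tooling used in the talk. The events of case 17 are taken from the SQL example later in the deck; cases 20 and 21 and their timestamps are made up for illustration, and `build_event_log` is a hypothetical helper name.

```python
from collections import Counter, defaultdict
from datetime import datetime

# (Case ID, Activity, Timestamp) records in arbitrary order.
# Case 17 comes from the slides; cases 20 and 21 are hypothetical.
events = [
    (20, "init", "2015-05-19 15:00:00"),
    (17, "doc", "2015-05-19 14:20:01"),
    (17, "init", "2015-05-19 14:06:27"),
    (21, "decide", "2015-05-20 10:30:00"),
    (17, "decide", "2015-05-19 14:45:27"),
    (20, "decide", "2015-05-19 15:30:00"),
    (21, "init", "2015-05-20 09:00:00"),
    (21, "doc", "2015-05-20 09:45:00"),
]


def build_event_log(records):
    """Group events by case ID, order them by timestamp, count equal traces."""
    by_case = defaultdict(list)
    for case_id, activity, timestamp in records:
        moment = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
        by_case[case_id].append((moment, activity))

    traces = {}
    for case_id, case_events in by_case.items():
        case_events.sort(key=lambda pair: pair[0])  # order by timestamp only
        traces[case_id] = tuple(activity for _, activity in case_events)

    # A Counter over traces is a multiset: equal traces share one entry.
    return traces, Counter(traces.values())


traces, event_log = build_event_log(events)
```

Here cases 17 and 21 yield the same trace, so the multiset holds two distinct traces for three cases.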

Slide 8

Slide 8 text

A bit of real life…

Slide 9

Slide 9 text

Preprocessing of event data
• Convert a low-level “raw” log into a standardized form
  – Python scripts for fast prototyping
  – a C++ tool for fast conversion, better suited to repeatedly converting logs of the same type
[Diagram: Technical (raw) log → Preprocessing → Prepared event log]
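A prototyping script of the kind mentioned above might look like the sketch below. The raw line format (`<timestamp> [inv=<id>] <action>`), the regular expression, and the function name `raw_to_csv` are assumptions made for illustration; real technical logs will need their own parsing rules.

```python
import csv
import io
import re

# Hypothetical raw format: "2015-05-19 14:06:27 [inv=17] init"
RAW_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"\[inv=(?P<case>\d+)\] (?P<activity>\w+)$"
)


def raw_to_csv(raw_lines):
    """Convert raw log lines into CSV text with a standard header."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["CaseID", "Activity", "Timestamp"])
    for line in raw_lines:
        match = RAW_LINE.match(line.strip())
        if match is None:
            continue  # skip lines that are not event records
        writer.writerow([match["case"], match["activity"], match["ts"]])
    return out.getvalue()


raw = [
    "2015-05-19 14:06:27 [inv=17] init",
    "# heartbeat",
    "2015-05-19 14:20:01 [inv=17] doc",
]
csv_text = raw_to_csv(raw)
```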

Slide 10

Slide 10 text

Event log standards
• CSV is a universal format, suitable for many tools
• XES is a standard designed specifically for event data
• MXML…

Slide 11

Slide 11 text

eXtensible Event Stream (XES)
• XML determines the “event”, “trace”, “log” structure
• An event is a set of attributes
• Another “activity” attribute can be chosen without rebuilding the log
  – this gives another process perspective
• Remapping “caseid” or “timestamp” changes the log structure
  – the log must be rebuilt
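The difference between the two kinds of remapping can be sketched with Python's standard `xml.etree.ElementTree`. The element names (`log`, `trace`, `event`, `string`, `date`) and the keys `concept:name` and `time:timestamp` follow XES; the `unit` attribute, its values, and the helper names are assumptions for illustration, and the fragment omits the extension and classifier declarations a complete XES file would carry.

```python
import xml.etree.ElementTree as ET


def build_xes(cases):
    """Build a minimal log/trace/event tree from {case_id: [attribute dicts]}."""
    log = ET.Element("log")
    for case_id, events in cases.items():
        trace = ET.SubElement(log, "trace")
        ET.SubElement(trace, "string", key="concept:name", value=str(case_id))
        for attrs in events:
            event = ET.SubElement(trace, "event")
            for key, value in attrs.items():
                tag = "date" if key == "time:timestamp" else "string"
                ET.SubElement(event, tag, key=key, value=value)
    return log


def activities(log, activity_key):
    """Read every trace using the given event attribute as the activity."""
    result = []
    for trace in log.iter("trace"):
        names = []
        for event in trace.iter("event"):
            for attr in event:
                if attr.get("key") == activity_key:
                    names.append(attr.get("value"))
        result.append(names)
    return result


# "unit" and its values are hypothetical extra attributes.
cases = {
    17: [
        {"concept:name": "init", "unit": "front_desk",
         "time:timestamp": "2015-05-19T14:06:27"},
        {"concept:name": "doc", "unit": "archive",
         "time:timestamp": "2015-05-19T14:20:01"},
    ]
}
log = build_xes(cases)
by_action = activities(log, "concept:name")
by_unit = activities(log, "unit")
```

Switching the activity attribute only changes which key is read from each event. Grouping events under a different case identifier would change which `trace` element each event belongs to, which is why the log has to be rebuilt in that case.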

Slide 12

Slide 12 text

Classic approach for event log representation
• “Flat” event logs:
  – a single event log file (XES, XML) per process perspective
  – a single perspective yields a single process model
    • varying a mining algorithm's parameters allows modification of the model

Slide 13

Slide 13 text

A problem: another process perspective
• What if we need to look at a process from another point of view?

Slide 14

Slide 14 text

A model for a sample event log: View 1
Source: Shershakov and Rubin. System Runs Analysis with Process Mining, 2015.

Slide 15

Slide 15 text

Another model for the same event log: View 2
Source: Shershakov and Rubin. System Runs Analysis with Process Mining, 2015.

Slide 16

Slide 16 text

A problem: another process perspective
• What if we need to look at a process from another point of view?
  – Now we have to rebuild the event log from scratch:
    • export the event data as a new event log with another perspective
    • load the initial log into RAM as an abstracted object model and save it as a new file

Slide 17

Slide 17 text

Preprocessing: convert raw data to a database
[Diagram: Technical (raw) log → Preprocessing → DB]

Slide 18

Slide 18 text

Exporting a process perspective from a DB as a “flat” event log
• For event data represented as a database:
  1. map the PM attributes (CaseID, Activity, Timestamp) to relational attributes of the DB tables with event data;
  2. apply SQL queries to obtain the desired view (filtering, projection, etc.);
  3. export the data from the view as a “flat” event log (XES);
  4. do mining with the “flat” log.
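Steps 1 to 3 can be sketched with Python's built-in `sqlite3` and `csv` modules. For brevity the sketch exports CSV rather than XES. The table name `Events` and its column names come from the later slides; the `Unit_name` values, the complete timestamps, and the helper `export_flat_log` are assumptions for illustration.

```python
import csv
import io
import sqlite3


def export_flat_log(conn, mapping, table="Events"):
    """Export one perspective as CSV text.

    `mapping` assigns a table column to each of CaseID, Activity, Timestamp
    (step 1). Column names are interpolated into the SQL text, so they must
    come from trusted configuration, never from user input.
    """
    query = (
        f"SELECT {mapping['CaseID']} AS CaseID, "
        f"{mapping['Activity']} AS Activity, "
        f"{mapping['Timestamp']} AS Timestamp "
        f"FROM {table} "
        f"ORDER BY {mapping['CaseID']}, {mapping['Timestamp']}"
    )
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["CaseID", "Activity", "Timestamp"])
    writer.writerows(conn.execute(query))  # steps 2 and 3
    return out.getvalue()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Events (Inv_ID INTEGER, Action TEXT, Unit_name TEXT, "
    "Start_timestamp TEXT, Complete_timestamp TEXT)"
)
conn.executemany(
    "INSERT INTO Events VALUES (?, ?, ?, ?, ?)",
    [
        # Unit names and complete timestamps are hypothetical.
        (17, "doc", "archive", "2015-05-19 14:20:01", "2015-05-19 14:40:00"),
        (17, "init", "front_desk", "2015-05-19 14:06:27", "2015-05-19 14:15:00"),
    ],
)

flat = export_flat_log(
    conn,
    {"CaseID": "Inv_ID", "Activity": "Action", "Timestamp": "Start_timestamp"},
)
```

Every new perspective requires another call and another exported file, which is the overhead the DB-as-an-event-log approach is meant to avoid.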

Slide 19

Slide 19 text

Relating specific event log formats with the abstract representation
[Diagram: specific event log formats → Translation engine → abstract representation]
Traces: abcdef, abcdeg, abcdfe, abcdfg, abd, abdg, abdef, abdeg

Slide 20

Slide 20 text

Database as an event log
• Perspective 1: Action → Activity, Start_timestamp → Timestamp
• Perspective 2: Unit_name → Activity, Complete_timestamp → Timestamp

Slide 21

Slide 21 text

Sample SQL queries for a DB event log
• Query Q1 (id qryl_traces) extracting a set of traces:
  – SELECT Inv_ID FROM Events GROUP BY Inv_ID
  – result: <17; 20; 21>
• Query Q2 (id qryl_get_trace_events) extracting a single trace determined by a given parameter:
  – SELECT Inv_ID as CaseID, Action as Activity, Start_timestamp as Timestamp FROM Events WHERE CaseID = ?1 ORDER BY Timestamp
  – specific results for ?1 = 17: <(17, ‘init’, 2015-05-19 14:06:27); (17, ‘doc’, 2015-05-19 14:20:01); (17, ‘decide’, 2015-05-19 14:45:27)>

Slide 22

Slide 22 text

Sample SQL queries for a DB event log
• Modified version of the query Q2 (id qryl_get_trace_events), where only activity names are considered:
  – SELECT Action as Activity FROM Events WHERE Inv_ID = ?1 ORDER BY Start_timestamp
  – specific results for ?1 = 17: <‘init’; ‘doc’; ‘decide’>
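The sample queries can be tried against an in-memory SQLite database, as in the sketch below. The rows for case 17 reproduce the results shown on the slides; the rows for cases 20 and 21 are invented so that Q1 returns the three case IDs. Q2 is written with the underlying column names in WHERE and ORDER BY, a form that does not depend on how a given engine resolves column aliases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Events (Inv_ID INTEGER, Action TEXT, Start_timestamp TEXT)"
)
conn.executemany(
    "INSERT INTO Events VALUES (?, ?, ?)",
    [
        (17, "doc", "2015-05-19 14:20:01"),
        (17, "init", "2015-05-19 14:06:27"),
        (17, "decide", "2015-05-19 14:45:27"),
        (20, "init", "2015-05-19 15:00:00"),  # hypothetical row
        (21, "init", "2015-05-20 09:00:00"),  # hypothetical row
    ],
)

# Q1: the set of traces (one row per case ID).
Q1 = "SELECT Inv_ID FROM Events GROUP BY Inv_ID"

# Q2: all events of one trace, ordered by timestamp.
Q2 = (
    "SELECT Inv_ID AS CaseID, Action AS Activity, Start_timestamp AS Timestamp "
    "FROM Events WHERE Inv_ID = ?1 ORDER BY Start_timestamp"
)

# Modified Q2: activity names only.
Q2_ACTIVITIES = (
    "SELECT Action AS Activity FROM Events "
    "WHERE Inv_ID = ?1 ORDER BY Start_timestamp"
)

case_ids = sorted(row[0] for row in conn.execute(Q1))
trace_17 = conn.execute(Q2, (17,)).fetchall()
activities_17 = [row[0] for row in conn.execute(Q2_ACTIVITIES, (17,))]
```

A mining algorithm that only needs activity sequences can iterate over `case_ids` and run the modified Q2 per case, without materializing a "flat" log.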

Slide 23

Slide 23 text

Configuring a process perspective
• To implement all the operations that a process mining algorithm applies to an abstract event log, one needs to define only 15 named queries and 3 parameters (the attribute names for trace, activity, and timestamp).
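One way to read this slide is that the named queries can be derived from the three attribute names. The sketch below generates the two queries whose ids appear in the deck (`qryl_traces` and `qryl_get_trace_events`); the remaining named queries are not shown in the slides and are not guessed here. The function `make_queries` and the templating approach are assumptions, not the LDOPA implementation.

```python
def make_queries(trace_attr, activity_attr, timestamp_attr, table="Events"):
    """Build named queries for one perspective from three attribute names.

    Only the two query ids shown in the slides are generated. Attribute
    names are interpolated into SQL text and must come from trusted
    configuration.
    """
    return {
        "qryl_traces": (
            f"SELECT {trace_attr} FROM {table} GROUP BY {trace_attr}"
        ),
        "qryl_get_trace_events": (
            f"SELECT {activity_attr} AS Activity FROM {table} "
            f"WHERE {trace_attr} = ?1 ORDER BY {timestamp_attr}"
        ),
    }


perspective_1 = make_queries("Inv_ID", "Action", "Start_timestamp")
perspective_2 = make_queries("Inv_ID", "Unit_name", "Complete_timestamp")
```

Both dictionaries expose the same query ids, so code written against the ids works unchanged for either perspective.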

Slide 24

Slide 24 text

Using a DB event log in a mining tool
• Implementation: the LDOPA library and VTMine4Visio, a tool for graphical modeling
[Figure: an experiment model for transition systems discovery]
[Figure: parameters of an SQLite-EventLog DPM block]

Slide 26

Slide 26 text

Many process perspectives, many configurations
• Modified query Q2 for another view, Perspective 2:
  – SELECT Unit_name as Activity FROM Events WHERE Inv_ID = ?1 ORDER BY Complete_timestamp
  – specific results for ?1 = 17: <…>
• Obtaining a chosen perspective configuration:
  – SELECT param, value FROM Config WHERE persp = 1
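Embedding configurations into the database can be sketched as follows. The `Config` table with columns `persp`, `param`, and `value` follows the query on the slide; storing whole query texts under their query ids is an assumption about what the rows contain, and the `Unit_name` values and complete timestamps are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Events (Inv_ID INTEGER, Action TEXT, Unit_name TEXT,
                         Start_timestamp TEXT, Complete_timestamp TEXT);
    CREATE TABLE Config (persp INTEGER, param TEXT, value TEXT);
    """
)
conn.executemany(
    "INSERT INTO Events VALUES (?, ?, ?, ?, ?)",
    [
        # Unit names and complete timestamps are hypothetical.
        (17, "init", "front_desk", "2015-05-19 14:06:27", "2015-05-19 14:15:00"),
        (17, "doc", "archive", "2015-05-19 14:20:01", "2015-05-19 14:40:00"),
        (17, "decide", "board", "2015-05-19 14:45:27", "2015-05-19 15:00:00"),
    ],
)
conn.executemany(
    "INSERT INTO Config VALUES (?, ?, ?)",
    [
        (1, "qryl_get_trace_events",
         "SELECT Action AS Activity FROM Events "
         "WHERE Inv_ID = ?1 ORDER BY Start_timestamp"),
        (2, "qryl_get_trace_events",
         "SELECT Unit_name AS Activity FROM Events "
         "WHERE Inv_ID = ?1 ORDER BY Complete_timestamp"),
    ],
)


def load_config(conn, persp):
    """Fetch the configuration of one perspective as a dict."""
    rows = conn.execute(
        "SELECT param, value FROM Config WHERE persp = ?", (persp,)
    )
    return dict(rows.fetchall())


def trace_activities(conn, persp, case_id):
    """Run the perspective's trace query for one case."""
    query = load_config(conn, persp)["qryl_get_trace_events"]
    return [row[0] for row in conn.execute(query, (case_id,))]


view_1 = trace_activities(conn, 1, 17)
view_2 = trace_activities(conn, 2, 17)
```

Switching perspectives changes only the `persp` argument; the event data stay in place and no log is rebuilt.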

Slide 27

Slide 27 text

DB event logs vs. “flat” event logs
• DB event logs:
  – Rapid event data extraction thanks to DB indexing
  – No need to rebuild the log when the process perspective changes, a problem inherent to “flat” event logs
  – …

Slide 28

Slide 28 text

Multi-perspective process mining
• A single perspective is represented by a single configuration embedded into the DB
• A DB-as-an-event-log is instrumented with multiple perspective configurations
• The process mining tool switches between prepared process perspectives with a simple SQL query

Slide 29

Slide 29 text

References (1)
1. van der Aalst, W.M.P.: Process Mining: Data Science in Action, 2nd edn. Springer (2016).
2. van der Aalst, W.M.P.: Extracting Event Data from Databases to Unleash Process Mining, pp. 105–128. Springer International Publishing, Cham (2015).
3. Dijkman, R., Gao, J., Syamsiyah, A., van Dongen, B., Grefen, P., ter Hofstede, A.: Enabling efficient process mining on large data sets: realizing an in-database process mining operator. Distributed and Parallel Databases (2019).
4. van Dongen, B.F., Shabani, S.: Relational XES: Data management for process mining. pp. 169–176 (2015).
5. Gonzalez Lopez de Murillas, E., Reijers, H.A., van der Aalst, W.M.P.: Connecting databases with process mining: A meta model and toolset. In: Schmidt, R., Guedria, W., Bider, I., Guerreiro, S. (eds.) Enterprise, Business-Process and Information Systems Modeling. pp. 231–249. Springer International Publishing, Cham (2016).
6. de Murillas, E.G.L., van der Aalst, W.M.P., Reijers, H.A.: Process mining on databases: Unearthing historical data from redo logs. In: Motahari-Nezhad, H.R., Recker, J., Weidlich, M. (eds.) Business Process Management. pp. 367–385. Springer International Publishing, Cham (2015).
7. Shershakov, S.A., Rubin, V.A.: System runs analysis with process mining. Modeling and Analysis of Information Systems 22(6), 818–833 (2015).

Slide 30

Slide 30 text

References (2)
8. Shershakov, S.: Enhancing efficiency of process mining algorithms with a tailored library: Design principles and performance assessment. Tech. rep., National Research University Higher School of Economics (2018).
9. Shershakov, S.A.: VTMine framework as applied to process mining modeling. International Journal of Computer and Communication Engineering 4(3), 166–179 (2015).
10. Shershakov, S.A.: Multi-Perspective Process Mining with Embedding Configurations into DB-based Event Logs. CCIS (Proceedings of TMPA-2019), Springer. In press.

Slide 31

Slide 31 text

Thank you!