Slide 1

Slide 1 text

Control-flow Discovery from Event Streams
Andrea Burattin (1), Alessandro Sperduti (1), Wil M. P. van der Aalst (2)
(1) University of Padua, Italy
(2) Eindhoven University of Technology, The Netherlands
July 11, 2014

Slide 2

Slide 2 text

Typical Process Mining Scenario
[Figure: process mining overview. Layers: Imagination, Incarnation / Environment, Observation. Elements: Operational Model, Analytical Model, Event Logs, Information System, Operational Incarnation. Process mining tasks: Discovery, Conformance, Extension.]
Image source: Christian Günther. Process Mining in Flexible Environments. PhD thesis, Technische Universiteit Eindhoven, Eindhoven, 2009.

Slide 4

Slide 4 text

Typical Event Log

Log grouped by case:

Case Id: C1
Event #  Activity  Orig.  Time        ...
1        A         U1     2014-01-01  ...
2        B         U1     2014-01-02  ...
3        C         U2     2014-01-03  ...
4        E         U2     2014-01-04  ...

Case Id: C2
Event #  Activity  Orig.  Time        ...
1        A         U1     2014-01-02  ...
2        B         U1     2014-01-03  ...
3        D         U3     2014-01-04  ...
4        E         U3     2014-01-05  ...

The same events as a single time-ordered table:

Event #  Case Id  Activity  Orig.  Time        ...
1        C1       A         U1     2014-01-01  ...
2        C2       A         U1     2014-01-02  ...
3        C1       B         U1     2014-01-02  ...
4        C2       B         U1     2014-01-03  ...
5        C1       C         U2     2014-01-03  ...
6        C1       E         U2     2014-01-04  ...
7        C2       D         U3     2014-01-04  ...
8        C2       E         U3     2014-01-05  ...
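As a rough illustration of how the case-grouped log on this slide turns into a single time-ordered sequence, the sketch below flattens the two cases and sorts the events by date. The `log` dictionary layout and the `to_stream` name are assumptions made for this example only. Events that share a date may come out in a different order than in the slide's table.

```python
from datetime import date

# Case-grouped log from the slide: case id -> list of (activity, originator, date).
log = {
    "C1": [("A", "U1", date(2014, 1, 1)), ("B", "U1", date(2014, 1, 2)),
           ("C", "U2", date(2014, 1, 3)), ("E", "U2", date(2014, 1, 4))],
    "C2": [("A", "U1", date(2014, 1, 2)), ("B", "U1", date(2014, 1, 3)),
           ("D", "U3", date(2014, 1, 4)), ("E", "U3", date(2014, 1, 5))],
}


def to_stream(log):
    """Flatten a case-grouped log into one time-ordered list of events."""
    events = [(case, act, orig, day)
              for case, trace in log.items()
              for act, orig, day in trace]
    # sorted() is stable, so events of one case keep their relative order.
    return sorted(events, key=lambda e: e[3])


stream = to_stream(log)
for case, act, orig, day in stream:
    print(case, act, orig, day.isoformat())
```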

Slide 5

Slide 5 text

Event Stream
Representation of an event stream σ:
[Figure: events E F E G F E F G G H I H H along a time axis, colored by case (Cx, Cy, Cz)]
- Boxes represent events
- Background colors represent the case id
- Letters inside are the activity names

Slide 6

Slide 6 text

Streaming Process Discovery
[Figure: events emitted over time (A, B, B2, C, ...) reach a stream miner instance through network communication]
The stream miner continuously receives events and, using the latest observations, updates the process model (e.g., a Petri net).

Slide 11

Slide 11 text

Stream Mining Peculiarities
Peculiarities of the stream mining problem:
1. Cannot store the entire stream (approximation)
2. Backtracking is not feasible over streams (algorithms are required to make one pass over the data, so they scale linearly w.r.t. the number of processed items)
3. The approach must deal with variable system conditions, such as fluctuating stream rates
4. It is important to quickly adapt the model to cope with unusual data values (concept drifts)

Slide 13

Slide 13 text

Heuristics Miner
Historical Background
Our approaches are based on Heuristics Miner, which is quite old (∼2003) but still one of the most used algorithms.
The fundamental metric is the "dependency measure" between two activities (e.g., a and b):
$$a \Rightarrow b = \frac{|a > b| - |b > a|}{|a > b| + |b > a| + 1} \in [-1, 1]$$
where:
- |a > b| is the number of times that a > b holds in the log
- a > b holds if a is executed at time t and b at time t + 1
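A minimal sketch of the dependency measure, assuming the direct-following counts |x > y| are kept in a dictionary keyed by activity pairs. The `dependency` function name and the sample counts are hypothetical and serve only to illustrate the formula.

```python
def dependency(df, a, b):
    """Dependency measure a => b, given direct-following counts df[(x, y)] = |x > y|."""
    ab = df.get((a, b), 0)
    ba = df.get((b, a), 0)
    return (ab - ba) / (ab + ba + 1)


# Hypothetical counts: a is directly followed by b nine times, never the reverse.
counts = {("a", "b"): 9}
forward = dependency(counts, "a", "b")
backward = dependency(counts, "b", "a")
print(forward, backward)
```

The "+ 1" in the denominator keeps the value strictly inside (−1, 1) and pulls pairs with few observations toward 0, so a relation seen only once scores 0.5 rather than 1.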

Slide 16

Slide 16 text

Heuristics Miner (cont.)
Given the dependency measure for all activity pairs and a threshold τ_dep, the algorithm builds a directed dependency graph.
If both a ⇒ b > τ_dep and a ⇒ c > τ_dep, then:
[Figure: node a with outgoing edges to b and c]
Relation ambiguity between b and c:
- XOR: either b or c is executed
- AND: both b and c are executed
Heuristics Miner proposes the "AND-measure":
$$a \Rightarrow (b \wedge c) = \frac{|b > c| + |c > b|}{|a > b| + |a > c| + 1} \in [0, 1]$$
If a ⇒ (b ∧ c) > τ_and, the relation is AND; otherwise it is XOR.
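The AND/XOR decision can be sketched in the same style. The function names, the threshold value, and the counts below are assumptions chosen for illustration; only the formula comes from the slide.

```python
def and_measure(df, a, b, c):
    """AND-measure a => (b AND c), given direct-following counts df[(x, y)] = |x > y|."""
    bc = df.get((b, c), 0)
    cb = df.get((c, b), 0)
    ab = df.get((a, b), 0)
    ac = df.get((a, c), 0)
    return (bc + cb) / (ab + ac + 1)


def split_type(df, a, b, c, tau_and):
    """Classify the split after a as 'AND' or 'XOR' using the threshold tau_and."""
    return "AND" if and_measure(df, a, b, c) > tau_and else "XOR"


# Hypothetical counts: b and c often follow each other, which suggests parallelism.
parallel = {("a", "b"): 10, ("a", "c"): 9, ("b", "c"): 8, ("c", "b"): 7}
# Hypothetical counts: b and c never follow each other, which suggests a choice.
exclusive = {("a", "b"): 10, ("a", "c"): 9}

print(split_type(parallel, "a", "b", "c", tau_and=0.1))
print(split_type(exclusive, "a", "b", "c", tau_and=0.1))
```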

Slide 17

Slide 17 text

Direct Following Matrix
The basic data structure for HM is the Direct Following Matrix.
Direct Following Matrix: given activities A, B, C, D and a log L, |a > b| is the number of times that a is directly followed by b (within the same process instance) in the log L.

     A    B    C    D
A    0   52   64   91
B   52    0   24   87
C   64   24    0   13
D   91   87   13    0
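One way to build these counts from a finite log is sketched below, assuming each process instance is given as a list of activity names in execution order. The `direct_following` name and the two sample traces (taken from the event log example earlier in the deck) are illustrative choices.

```python
from collections import Counter


def direct_following(traces):
    """Count |a > b|: how often a is directly followed by b within the same trace."""
    counts = Counter()
    for trace in traces:
        # Pair each activity with its immediate successor in the same trace.
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


traces = [["A", "B", "C", "E"], ["A", "B", "D", "E"]]
df = direct_following(traces)
for (a, b), n in sorted(df.items()):
    print(f"|{a} > {b}| = {n}")
```

A `Counter` returns 0 for pairs that were never observed, which matches the empty cells of the matrix without storing them.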

Slide 19

Slide 19 text

Proposed Approaches
Our Proposal
We present three approaches, based on Heuristics Miner, for process discovery from event streams:
- (SW) Heuristics Miner with Sliding Window (as baseline)
- (LC) Heuristics Miner with Lossy Counting
- (LCB) Heuristics Miner with Lossy Counting with Budget
Fundamental Principle
Recent observations are more important than older ones.

Slide 20

Slide 20 text

Heuristics Miner with SW
The basic idea is to iterate these steps:
1. Collect events for a given time span
2. Generate a finite event log
3. Apply the "offline version" of the algorithm
[Figure: timeline showing the time frame considered, the log used for mining, and the mining time]

Slide 22

Slide 22 text

Frequency Counting with Lossy Counting
Given: maximum approximation error ε
Variables: A, B, C
Bucket size is $w = \lceil 1/\epsilon \rceil$
$b_{current} = \lceil N / w \rceil$, where N is the number of observed items
Lossy Counting uses a data structure D = {(var, freq, max error)}
[Figure: data sequence A A B A B C C B C A A B C split into buckets]
For each observed item (e.g., B):
- If B is not present in D, insert (B, f = 1, Δ = b_current − 1); otherwise update the frequency f of B
At the end of bucket b_current:
- Remove all elements such that f + Δ ≤ b_current
- Continue with bucket b_current + 1
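The update and cleanup rules can be sketched as a small class. This is a minimal sketch, not the authors' implementation: the `LossyCounter` class, its attribute names, and the choice ε = 0.25 (bucket size 4) are assumptions for this example, while the data sequence is the one shown on the slide.

```python
import math


class LossyCounter:
    """Approximate frequency counter following the Lossy Counting rules."""

    def __init__(self, epsilon):
        self.width = math.ceil(1 / epsilon)  # bucket size w
        self.n = 0                           # number of observed items N
        self.entries = {}                    # item -> [f, delta]

    def add(self, item):
        self.n += 1
        bucket = math.ceil(self.n / self.width)  # b_current
        if item in self.entries:
            self.entries[item][0] += 1
        else:
            self.entries[item] = [1, bucket - 1]
        if self.n % self.width == 0:
            # End of bucket: drop every entry with f + delta <= b_current.
            self.entries = {k: v for k, v in self.entries.items()
                            if v[0] + v[1] > bucket}


counter = LossyCounter(epsilon=0.25)
for item in "AABABCCBCAABC":
    counter.add(item)
print(counter.entries)
```

With these settings the single B in the first bucket is dropped at the first cleanup, so B ends with f = 3 and Δ = 1 although it occurs 4 times: the true count lies between f and f + Δ.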

Slide 23

Slide 23 text

LC/LCB Demo
Lossy Counting Demo
[Figure: frequencies f and errors Δ of the stored items over time, at the end of bucket 1 (b_current = 1) and the beginning of bucket 2; items are removed if f + Δ ≤ 1]

Slide 24

Slide 24 text

LC/LCB Demo
Lossy Counting Demo
[Figure: frequencies f and errors Δ of the stored items over time, from the beginning of bucket 2 to the end of bucket 2 (b_current = 2) and the beginning of bucket 3; items are removed if f + Δ ≤ 2]

Slide 25

Slide 25 text

LC/LCB Demo
Lossy Counting Demo
[Figure: frequencies f and errors Δ of the stored items over time, from the beginning of bucket 3 to the end of bucket 3 (b_current = 3) and the beginning of bucket 4; items are removed if f + Δ ≤ 3]

Slide 27

Slide 27 text

LC/LCB Demo
Comparison between Lossy Counting frequencies and true frequencies
[Figure: estimated frequencies f, with errors Δ, compared against the true frequencies F]
These inequalities hold:
$$f \le F \le f + \Delta \le f + \epsilon N$$
Lossy Counting with Budget
Idea: start a new bucket when there is no more space; then ε = 1 / bucket size.
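The slide gives only the idea behind the budget variant, so the following is one possible reading rather than the authors' algorithm. It assumes that the budget is the maximum number of stored entries, that a bucket ends whenever a new item arrives and the structure is full, and that cleanup repeats with increasing bucket ids until space is available. The `BudgetLossyCounter` name and the sample stream are hypothetical.

```python
class BudgetLossyCounter:
    """Lossy Counting variant that never stores more than `budget` entries (assumed reading)."""

    def __init__(self, budget):
        self.budget = budget
        self.bucket = 1      # b_current
        self.entries = {}    # item -> [f, delta]

    def add(self, item):
        if item in self.entries:
            self.entries[item][0] += 1
            return
        # No space left: close buckets until at least one entry is dropped.
        while len(self.entries) >= self.budget:
            self.entries = {k: v for k, v in self.entries.items()
                            if v[0] + v[1] > self.bucket}
            self.bucket += 1
        self.entries[item] = [1, self.bucket - 1]


counter = BudgetLossyCounter(budget=2)
sizes = []
for item in ["A", "A", "B", "C"]:
    counter.add(item)
    sizes.append(len(counter.entries))
print(counter.entries, counter.bucket)
```

The loop terminates because every stored f + Δ is finite while the bucket id keeps growing, so some entry eventually satisfies the removal condition.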

Slide 30

Slide 30 text

Adaptation of LC/LCB to HM
To count direct-following relations we need:
- D_rel: actual relation frequencies, tuples (a_s, a_t, f, Δ)
- D_act: latest activity names, tuples (a, f, Δ)
- D_cases: latest activity of a case, tuples (c, a, f, Δ)
With a certain periodicity, the model is updated:
- Activities from D_act
- Dependencies and AND/XOR rules from D_rel
Actually, we show that updates on the D data structures affect only local parts of the model (incremental update of the process model).

Slide 31

Slide 31 text

Evaluation
Datasets
Artificial dataset characteristics:
- Three randomly generated processes (to simulate concept drifts)
- Most complex model has 3 splits (1 AND and 2 XOR)
- Longest process has 16 activities
- Stream with 17 265 events
Real-world dataset (BPI Challenge 2012 log) characteristics:
- Dutch Financial Institute
- 36 activities
- 262 198 events, among 13 087 process instances

Slide 32

Slide 32 text

Evaluation on Artificial Dataset
Model-to-model Metric
Assess the correspondence between the original and the discovered model.
[Figure: model-to-model similarity (0 to 1) versus observed events (0 to 18000). Series: LCB (B = 300), LCB (B = 100), LC (ε = 0.000001), LC (ε = 0.001), SW (W = 300), SW (W = 100)]

Slide 33

Slide 33 text

Evaluation on Artificial Dataset
Space Requirements
Space expressed as number of stored items.
[Figure: number of stored items (0 to 2000) versus observed events (0 to 18000). Series: LCB (B = 300), LCB (B = 100), LC (ε = 0.000001), LC (ε = 0.001), SW (W = 300), SW (W = 100)]

Slide 34

Slide 34 text

Evaluation on Artificial Dataset
Time Requirements
Time required to process each event.
[Figure: time per event in ms (0 to 20) versus observed events (0 to 18000). Series: LCB (B = 300), LCB (B = 100), LC (ε = 0.000001), LC (ε = 0.001), SW (W = 300), SW (W = 100)]

Slide 35

Slide 35 text

Evaluation on BPI Challenge 2012
Precision Metric
Precision of the discovered models.
[Figure: precision (0 to 1) versus observed events (0 to 250 000). Series: SW (W = 1000), LC (ε = 0.00001), LCB (B = 1000)]
Time required to process an event:
- SW: 24.59 ms
- LC: 5.68 ms
- LCB: 2.56 ms

Slide 36

Slide 36 text

Conclusions and Future Work
Conclusions
- We addressed the problem of discovering process models from event streams
- Three approaches proposed, based on Heuristics Miner (with Sliding Window, with Lossy Counting, and with Lossy Counting with Budget)
- Experimental results on both artificial and real datasets, with improvements in the quality of the mined models, execution time, and space requirements
Future Work
- Support the process analyst in mining different perspectives
- Animations to point out the locations of process drifts

Slide 37

Slide 37 text

End.

Slide 38

Slide 38 text

Heuristics Miner with SW
Algorithm 1: Heuristics Miner with SW
Input: S: event stream; M: memory; max_M: maximum memory size; perform_mining: mining update periodicity
forever do
    e ← observe(S)    /* Observe a new event, where e = (c_i, a_i, t_i) */
    /* Memory update */
    if size(M) = max_M then
        shift(M)
    end
    insert(M, e)
    /* Mining update */
    if perform_mining then
        L ← convert(M)    /* Conversion of the memory into an event log that can be used with Heuristics Miner */
        HeuristicsMiner(L)
    end
end
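A runnable sketch of this loop is given below under several assumptions: events are (case, activity) pairs, the mining periodicity is a fixed number of events, and the call to Heuristics Miner is replaced by a stand-in that only computes direct-following counts over the window. The names `sliding_window_miner`, `direct_following`, `max_size`, and `period` are hypothetical.

```python
from collections import Counter, deque


def direct_following(events):
    """Stand-in for the mining step: count |a > b| per case over (case, activity) events."""
    last = {}
    counts = Counter()
    for case, act in events:
        if case in last:
            counts[(last[case], act)] += 1
        last[case] = act
    return counts


def sliding_window_miner(stream, max_size, period):
    """Keep the latest max_size events and mine the window every `period` events."""
    window = deque(maxlen=max_size)  # appending to a full deque drops the oldest event
    for i, event in enumerate(stream, start=1):
        window.append(event)
        if i % period == 0:
            yield direct_following(window)


stream = [("c1", "A"), ("c2", "A"), ("c1", "B"), ("c2", "B"), ("c1", "C")]
results = list(sliding_window_miner(stream, max_size=4, period=5))
print(results[0])
```

In this run the first event of case c1 has already left the window when mining happens, so the relation A > B is counted once (for c2) rather than twice. This is the information loss that a fixed-size window implies.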

Slide 39

Slide 39 text

Heuristics Miner with Lossy Counting
Input: S: event stream; ε: approximation error
Initialize the data structures D_A, D_C, D_R
N ← 1
w ← ⌈1/ε⌉    /* Bucket size */
forever do
    e ← observe(S)    /* Event e = (c_i, a_i, t_i) */
    b_curr ← ⌈N/w⌉    /* Current bucket id */
    /* Update the D_A data structure */
    if ∃(a, f, Δ) ∈ D_A such that a = a_i then
        Remove the entry (a, f, Δ) from D_A
        D_A ← D_A ∪ {(a, f + 1, Δ)}
    else
        D_A ← D_A ∪ {(a_i, 1, b_curr − 1)}
    end
    /* Update the D_C data structure */
    if ∃(c, a_last, f, Δ) ∈ D_C such that c = c_i then
        Remove the entry (c, a_last, f, Δ) from D_C
        D_C ← D_C ∪ {(c, a_i, f + 1, Δ)}
        /* Update the D_R data structure */
        Build relation r_i as a_last → a_i
        if ∃(r, f, Δ) ∈ D_R such that r = r_i then
            Remove the entry (r, f, Δ) from D_R
            D_R ← D_R ∪ {(r, f + 1, Δ)}
        else
            D_R ← D_R ∪ {(r_i, 1, b_curr − 1)}
        end
    else
        D_C ← D_C ∪ {(c_i, a_i, 1, b_curr − 1)}
    end
    /* Periodic cleanup */
    if N ≡ 0 mod w then
        foreach (a, f, Δ) ∈ D_A s.t. f + Δ ≤ b_curr do
            Remove (a, f, Δ) from D_A
        end
        foreach (c, a, f, Δ) ∈ D_C s.t. f + Δ ≤ b_curr do
            Remove (c, a, f, Δ) from D_C
        end
        foreach (r, f, Δ) ∈ D_R s.t. f + Δ ≤ b_curr do
            Remove (r, f, Δ) from D_R
        end
    end
    N ← N + 1
    Update the model as described in Section ??. For the directly follows relations, use the frequencies in D_R.
end
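The bookkeeping part of this algorithm can be sketched in Python as follows. The sketch assumes a finite list of (case, activity) events instead of an endless stream, stores each structure as a dictionary, and leaves out the model update step. The function name `lossy_counting_hm` and the variable names are hypothetical.

```python
import math


def lossy_counting_hm(stream, epsilon):
    """Maintain D_A, D_C and D_R over (case, activity) events; returns the three dicts."""
    d_a = {}  # activity -> [f, delta]
    d_c = {}  # case -> (last activity, f, delta)
    d_r = {}  # (source activity, target activity) -> [f, delta]
    width = math.ceil(1 / epsilon)  # bucket size w
    for n, (case, act) in enumerate(stream, start=1):
        bucket = math.ceil(n / width)  # b_curr
        if act in d_a:
            d_a[act][0] += 1
        else:
            d_a[act] = [1, bucket - 1]
        if case in d_c:
            last, f, delta = d_c[case]
            d_c[case] = (act, f + 1, delta)
            rel = (last, act)  # relation a_last -> a_i
            if rel in d_r:
                d_r[rel][0] += 1
            else:
                d_r[rel] = [1, bucket - 1]
        else:
            d_c[case] = (act, 1, bucket - 1)
        if n % width == 0:
            # Periodic cleanup: drop entries with f + delta <= b_curr.
            for d in (d_a, d_r):
                for key in [k for k, (f, dl) in d.items() if f + dl <= bucket]:
                    del d[key]
            for key in [k for k, (_, f, dl) in d_c.items() if f + dl <= bucket]:
                del d_c[key]
    return d_a, d_c, d_r


stream = [("C1", "A"), ("C2", "A"), ("C1", "B"), ("C2", "B"),
          ("C1", "C"), ("C1", "E"), ("C2", "D"), ("C2", "E")]
d_a, d_c, d_r = lossy_counting_hm(stream, epsilon=0.1)
print(d_r)
```

With ε = 0.1 the bucket size is 10, so no cleanup happens within these eight events and the counts in D_R are exact. A dependency graph could then be derived from D_R with the dependency measure, which is the step this sketch omits.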

Slide 40

Slide 40 text

Evaluation on BPI Challenge 2012
Precision vs Fitness Metric
[Figure: six panels, Precision and Fitness for each of SW, LCB, and LC. In every panel one axis is the evaluation log size (250, 500, 1000, 2000). The other axis is the window size for SW (250, 500, 1000, 2000), the budget for LCB (250, 500, 1000, 2000), and ε for LC (1e-05, 0.001, 0.1). Value scales: Precision (SW) 0.5 to 0.85; Fitness (SW) 0.62 to 0.78; Precision (LCB) 0.74 to 0.86; Fitness (LCB) 0.4 to 0.51; Precision (LC) 0.75 to 1; Fitness (LC) 0.4 to 0.58]

Slide 41

Slide 41 text

Evaluation on Artificial Dataset
Space Distribution over Data Structures
Space required by LCB (with B = 300) to store activities (D_A), relations (D_R) and cases (D_C).
[Figure: number of stored items (0 to 300) versus observed events (0 to 16000). Series: size of D_A, size of D_R, size of D_C]