Heuristics Miner for Time Intervals

Process mining attempts to reconstruct the workflow of a business process from logs of activities. The task matters in business scenarios where the process performed by workers lacks a well-understood, structured definition: activity logs are mined in an attempt to reconstruct the actual business process. In this paper, we propose a generalization of a popular process mining algorithm, Heuristics Miner, to time intervals. We show that using time interval information for the performed activities, when available, allows the algorithm to produce better workflow models.

More info: http://andrea.burattin.net/publications/2010-esann

Andrea Burattin

April 28, 2010

Transcript

  1. Heuristics Miner for Time Intervals Andrea Burattin and Alessandro Sperduti

    Department of Pure and Applied Mathematics University of Padua, Italy April 28th, 2010 Andrea Burattin and Alessandro Sperduti Heuristics Miner for Time Intervals
  2. Slide 2 of 16 What is Process Mining I

    Business Process. From the IEEE Glossary: “a sequence of steps performed for a given purpose; for example, the software development process”, that changes inputs into outputs.

    Example process (diagram): Order received, Goods available, Goods wrapping, Shipping note, Shipping.

    Each performed action is registered in a log.
  3. Slide 3 of 16 What is Process Mining II

    Main process mining areas:

    When the model of the process is not available: control-flow discovery aims to build a model describing the behaviour of the process.

    When the model of the process is available: conformance analysis tries to fit a log to the given process model.

    Independent of the availability of the process model: organizational mining tries to extract a “social network” that establishes relations between the authors of the actions.
  4. Slide 4 of 16 What is Process Mining III

    Extracted models (figure).
  5. Slide 5 of 16 Control–flow discovery example run

    | Instance | # | Activity | Completion time |
    |---|---|---|---|
    | 1 | 1 | Order received | apr 21, 2010 12:00 |
    | 1 | 2 | Payment received | apr 22, 2010 09:00 |
    | 1 | 3 | Goods available | apr 26, 2010 08:30 |
    | 1 | 4 | Shipping | apr 26, 2010 10:15 |
    | 2 | 1 | Order received | apr 23, 2010 15:45 |
    | 2 | 2 | Payment reminder | apr 25, 2010 15:45 |
    | 2 | 3 | Payment received | apr 25, 2010 17:31 |
    | 2 | 4 | Goods available | apr 26, 2010 10:00 |
    | 2 | 5 | Shipping | apr 26, 2010 12:30 |

  6. Slide 5 of 16 Control–flow discovery example run

    Same log as in the first step. Model so far: Order received.
  7. Slide 5 of 16 Control–flow discovery example run

    Instance 1, #1 > #2. Model so far: Order received, Payment received.
  8. Slide 5 of 16 Control–flow discovery example run

    Instance 1, #2 > #3. Model so far: Order received, Payment received, Goods available.
  9. Slide 5 of 16 Control–flow discovery example run

    Instance 1, #3 > #4. Model so far: Order received, Payment received, Goods available, Shipping.
  10. Slide 5 of 16 Control–flow discovery example run

    Model so far: Order received, Payment received, Goods available, Shipping.
  11. Slide 5 of 16 Control–flow discovery example run

    Instance 2, #1 > #2. Model so far: Order received, Payment reminder, Payment received, Goods available, Shipping.
  12. Slide 5 of 16 Control–flow discovery example run

    Instance 2, #2 > #3. Model so far: Order received, Payment reminder, Payment received, Goods available, Shipping.
  13. Slide 5 of 16 Control–flow discovery example run

    Instance 2, #3 > #4. Model so far: Order received, Payment reminder, Payment received, Goods available, Shipping.
  14. Slide 5 of 16 Control–flow discovery example run

    Instance 2, #4 > #5. Model so far: Order received, Payment reminder, Payment received, Goods available, Shipping.
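The example run for slide 5 can be reproduced with a short script. This is only a sketch under assumptions: the function name `directly_follows` and the ISO-style timestamp strings are illustrative and do not come from the slides. It orders each instance by completion time and collects the "#i > #i+1" pairs that the slides annotate.

```python
from datetime import datetime

# Example log from slide 5: (instance, activity, completion time).
# Timestamps are rewritten in ISO style for easy parsing (an assumption).
LOG = [
    (1, "Order received", "2010-04-21 12:00"),
    (1, "Payment received", "2010-04-22 09:00"),
    (1, "Goods available", "2010-04-26 08:30"),
    (1, "Shipping", "2010-04-26 10:15"),
    (2, "Order received", "2010-04-23 15:45"),
    (2, "Payment reminder", "2010-04-25 15:45"),
    (2, "Payment received", "2010-04-25 17:31"),
    (2, "Goods available", "2010-04-26 10:00"),
    (2, "Shipping", "2010-04-26 12:30"),
]


def directly_follows(log):
    """Return the set of (a, b) pairs where b is logged right after a."""
    traces = {}
    for instance, activity, stamp in log:
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        traces.setdefault(instance, []).append((when, activity))

    edges = set()
    for events in traces.values():
        # Sort each instance by completion time, then pair neighbours.
        ordered = [activity for _, activity in sorted(events)]
        edges.update(zip(ordered, ordered[1:]))
    return edges


EDGES = directly_follows(LOG)
for source, target in sorted(EDGES):
    print(f"{source} > {target}")
```

On this log the script yields five distinct pairs, including "Order received > Payment reminder", which only the second instance contributes.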
  15. Slide 6 of 16 Control–flow discovery example run

    | Instance | # | Activity | Completion time |
    |---|---|---|---|
    | 1 | 1 | Order received | apr 21, 2010 12:00 |
    | 1 | 2 | Payment received | apr 22, 2010 09:00 |
    | 1 | 3 | Goods available | apr 26, 2010 08:30 |
    | 1 | 4 | Shipping | apr 26, 2010 10:15 |
    | 2 | 1 | Order received | apr 23, 2010 15:45 |
    | 2 | 2 | Payment reminder | apr 25, 2010 15:45 |
    | 2 | 3 | Payment received | apr 25, 2010 17:31 |
    | 2 | 4 | Goods available | apr 26, 2010 12:30 |
    | 2 | 5 | Shipping | apr 26, 2010 12:30 |

  16. Slide 6 of 16 Control–flow discovery example run

    Same log table as in the previous step.
  17. Slide 6 of 16 Control–flow discovery example run

    | Instance | # | Activity | Completion time |
    |---|---|---|---|
    | 1 | 1 | Order received | apr 21, 2010 12:00 |
    | 1 | 2 | Payment received | apr 22, 2010 09:00 |
    | 1 | 3 | Goods available | apr 26, 2010 08:30 |
    | 1 | 4 | Shipping | apr 26, 2010 10:15 |
    | 2 | 1 | Order received | apr 23, 2010 15:45 |
    | 2 | 2 | Goods available | apr 25, 2010 15:45 |
    | 2 | 3 | Payment received | apr 25, 2010 17:31 |
    | 2 | 4 | Payment reminder | apr 26, 2010 12:30 |
    | 2 | 5 | Shipping | apr 26, 2010 12:30 |
  18. Slide 7 of 16 Heuristics Miner, core behaviour I

    Heuristics Miner evaluates a “dependency function” between two activities (e.g. X, Y) in order to decide whether the relationship holds:

    $$X \Rightarrow Y = \frac{|X > Y| - |Y > X|}{|X > Y| + |Y > X| + 1}$$

    Where:

    X > Y holds if X is executed at time t and Y at time t + 1;

    |X > Y| is the number of times that X > Y holds in the log.

    The algorithm then builds a directed graph from all the dependencies whose value is above a threshold.
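The dependency function from slide 7 is easy to try on a toy log. The sketch below is illustrative rather than the authors' implementation: the names `succession_counts`, `dependency` and `dependency_graph`, the toy traces and the threshold value are all assumptions. Self-loops are skipped for simplicity.

```python
from collections import Counter


def succession_counts(traces):
    """Count |X > Y|: how often Y is logged immediately after X."""
    counts = Counter()
    for trace in traces:
        counts.update(zip(trace, trace[1:]))
    return counts


def dependency(counts, x, y):
    """X => Y = (|X > Y| - |Y > X|) / (|X > Y| + |Y > X| + 1)."""
    xy, yx = counts[(x, y)], counts[(y, x)]
    return (xy - yx) / (xy + yx + 1)


def dependency_graph(traces, threshold):
    """Keep the edges whose dependency value exceeds the threshold."""
    counts = succession_counts(traces)
    activities = {activity for trace in traces for activity in trace}
    return {
        (x, y)
        for x in activities
        for y in activities
        if x != y and dependency(counts, x, y) > threshold
    }


# Hypothetical toy log: nine traces A, B, C and one trace A, C, B.
TRACES = [["A", "B", "C"]] * 9 + [["A", "C", "B"]]
COUNTS = succession_counts(TRACES)
GRAPH = dependency_graph(TRACES, threshold=0.6)
print(dependency(COUNTS, "A", "B"), sorted(GRAPH))
```

With this toy log, A ⇒ B evaluates to 9 / 10 = 0.9 and B ⇒ C to 8 / 11, so both edges pass a threshold of 0.6, while A ⇒ C (1 / 2) does not. The "+ 1" in the denominator keeps rarely observed pairs from reaching a value of 1.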
  19. Slide 8 of 16 Heuristics Miner, core behaviour II

    We can have both X ⇒ Y and X ⇒ Z (figure: X branching to Y and Z).

    Y and Z can be executed in mutual exclusion (XOR) or in parallel (in no specific order; AND):

    $$X \Rightarrow (Y \wedge Z) = \frac{|Y > Z| + |Z > Y|}{|X > Y| + |X > Z| + 1}$$

    If the value is above a threshold, the relation is AND; otherwise it is XOR.
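A small sketch of the AND/XOR decision from slide 8 follows. The function names, the succession counts and the threshold of 0.1 are assumptions chosen for illustration; the slides do not give a threshold value.

```python
from collections import Counter


def and_measure(counts, x, y, z):
    """X => (Y and Z) = (|Y > Z| + |Z > Y|) / (|X > Y| + |X > Z| + 1)."""
    numerator = counts[(y, z)] + counts[(z, y)]
    denominator = counts[(x, y)] + counts[(x, z)] + 1
    return numerator / denominator


def split_type(counts, x, y, z, threshold=0.1):
    """Classify the split after X as AND or XOR (threshold is an assumption)."""
    return "AND" if and_measure(counts, x, y, z) > threshold else "XOR"


# Hypothetical counts: Y and Z follow each other in both orders.
PARALLEL = Counter({("X", "Y"): 5, ("X", "Z"): 5, ("Y", "Z"): 5, ("Z", "Y"): 5})
# Hypothetical counts: Y and Z never appear next to each other.
EXCLUSIVE = Counter({("X", "Y"): 5, ("X", "Z"): 5})

print(split_type(PARALLEL, "X", "Y", "Z"), split_type(EXCLUSIVE, "X", "Y", "Z"))
```

In the first case the measure is 10 / 11 and the split is classified as AND; in the second it is 0 and the split is classified as XOR.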
  20. Slide 9 of 16 Time intervals in the logs

    Many times, activities are stored inside the log in terms of many “sub-activities” that compose the main one.

    Figure: a main activity running from Start to End along the time axis t, composed of Sub-activity 1, Sub-activity 2, ..., Sub-activity n-1, Sub-activity n.

    Considering the first and the last sub-activity, we can build a time interval for the main activity.
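Building the interval of a main activity from its sub-activities might look like the sketch below. The record layout (`SubActivity` with `activity`, `name` and `time` fields) and the sample events are hypothetical; real logs will differ.

```python
from typing import NamedTuple


class SubActivity(NamedTuple):
    activity: str  # main activity the event belongs to
    name: str      # sub-activity label
    time: float    # timestamp, in any consistent unit


def to_intervals(events):
    """Map each main activity to (time of first sub-activity, time of last)."""
    spans = {}
    for event in events:
        start, end = spans.get(event.activity, (event.time, event.time))
        spans[event.activity] = (min(start, event.time), max(end, event.time))
    return spans


# Hypothetical events, deliberately out of order.
EVENTS = [
    SubActivity("Shipping", "pack", 3.0),
    SubActivity("Shipping", "label", 1.0),
    SubActivity("Shipping", "dispatch", 7.0),
    SubActivity("Order received", "register", 0.0),
]
INTERVALS = to_intervals(EVENTS)
print(INTERVALS)
```

An activity with a single logged event collapses to a zero-length interval, which is how an instantaneous event can be represented in the same structure.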
  21. Slide 10 of 16 Information on the intervals vs time spot

    Allen’s Interval Algebra: the “overlap relation”.

    Figure: activities A, B, C and D shown as time intervals (“Events as time intervals”) and as points (“Instantaneous events”).
  22. Slide 11 of 16 New definitions, with time intervals support I

    Heuristics Miner: a > b is the direct succession of points.

    Heuristics Miner++: a > b is the direct succession of intervals.

    Figure: intervals A, B and C, annotated “A > B holds”, “A > B does not hold” and “A > C holds”.
  23. Slide 12 of 16 New definitions, with time intervals support II

    Direct succession, A > B (figure: intervals A, B and C on a time axis marked ti, tk, tj).

    Parallelism (overlap relation), A ∥ B (figure: intervals A and B on a time axis marked ti, tu, tj).
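The transcript does not preserve the formal definitions behind the two figures on slide 12, so the sketch below encodes one plausible reading, stated as assumptions: two intervals are parallel when they share more than a single instant, and B directly succeeds A when B starts no earlier than A ends and no third activity lies entirely in the gap between them. The names `Interval`, `overlaps` and `directly_succeeds` are illustrative.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    name: str
    start: float
    end: float


def overlaps(a, b):
    """Assumed parallelism test: the intervals share more than one instant."""
    return a.start < b.end and b.start < a.end


def directly_succeeds(a, b, trace):
    """Assumed direct succession: b starts after a ends, nothing in between."""
    if a.end > b.start:
        return False
    return not any(
        c is not a and c is not b and a.end <= c.start and c.end <= b.start
        for c in trace
    )


# Hypothetical trace where B overlaps A and C comes afterwards.
A, B, C = Interval("A", 0, 4), Interval("B", 2, 6), Interval("C", 7, 9)
TRACE_OVERLAP = [A, B, C]

# Hypothetical trace where the three activities are strictly sequential.
D, E, F = Interval("D", 0, 2), Interval("E", 3, 5), Interval("F", 6, 8)
TRACE_SEQUENCE = [D, E, F]

print(overlaps(A, B), directly_succeeds(A, C, TRACE_OVERLAP))
```

Under these assumptions, in the first trace A and B are parallel, A > B does not hold and A > C holds; in the second trace D > E holds but D > F does not, because E lies between them.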
  24. Slide 13 of 16 New definitions, with time intervals support III

    Dependency function, for time intervals, where |X ∥ Y| is the number of times X and Y are in the parallelism (overlap) relation:

    $$X \Rightarrow Y = \frac{|X > Y| - |Y > X|}{|X > Y| + |Y > X| + 2 \cdot |X \parallel Y| + 1}$$
  25. Slide 13 of 16 New definitions, with time intervals support III

    AND function, for time intervals:

    $$X \Rightarrow (Y \wedge Z) = \frac{|Y > Z| + |Z > Y| + 2 \cdot |Y \parallel Z|}{|X > Y| + |X > Z| + 1}$$
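To see how the overlap term changes the two measures on slide 13, the sketch below evaluates them directly on counts. The function and parameter names are assumptions, and the counts are made up for illustration.

```python
def dependency_intervals(n_xy, n_yx, n_parallel):
    """X => Y with the extra 2 * |X || Y| term in the denominator."""
    return (n_xy - n_yx) / (n_xy + n_yx + 2 * n_parallel + 1)


def and_intervals(n_yz, n_zy, n_parallel_yz, n_xy, n_xz):
    """X => (Y and Z) with the extra 2 * |Y || Z| term in the numerator."""
    return (n_yz + n_zy + 2 * n_parallel_yz) / (n_xy + n_xz + 1)


# Hypothetical counts: X precedes Y four times and overlaps it three times.
WITH_OVERLAP = dependency_intervals(4, 0, 3)
# The same successions with no overlap observed.
WITHOUT_OVERLAP = dependency_intervals(4, 0, 0)
# Y and Z never follow each other but overlap five times after X.
AND_VALUE = and_intervals(0, 0, 5, 5, 5)

print(WITH_OVERLAP, WITHOUT_OVERLAP, AND_VALUE)
```

Observed overlaps lower the dependency value (4 / 11 instead of 4 / 5 here) and raise the AND value (10 / 11 here). When no overlap is recorded, both expressions reduce to the point-based formulas of the original Heuristics Miner.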
  26. Slide 14 of 16 Results for a real case

    Result for Heuristics Miner (figure: mined model over Activity 0, 4, 10, 11, 20, 21, 22, 23, 30, 31 and 32).

    Log composed of 1465 log traces (in H.M., considering only the starting event).
  27. Slide 14 of 16 Results for a real case

    Result for Heuristics Miner++ (figure: mined model over the same activities, shown next to the Heuristics Miner result).

    Log composed of 1465 log traces (in H.M., considering only the starting event).
  28. Slide 15 of 16 Results for an artificial dataset

    Figure: F1 measure (average, min and max; axis from 0 to 1) against the percentage of activities as intervals (axis from 0 to 100).

    $$F_1 = \frac{2 \cdot p \cdot r}{p + r} \qquad p = \frac{\mathit{tp}}{\mathit{tp} + \mathit{fp}} \qquad r = \frac{\mathit{tp}}{\mathit{tp} + \mathit{fn}}$$

    Test data: logs of 100 random processes with 6 activities.

    tp: correctly mined dependencies

    fp: dependencies present in the original model but not in the mined one

    fn: dependencies present in mined model but not in the original one
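The F1 computation on slide 15 can be sketched over sets of dependency edges. The function name `f1_score` and the two edge sets are hypothetical. The labelling of fp and fn follows the slide as written; F1 is symmetric in fp and fn, so swapping the two labels would change p and r but not the final score.

```python
def f1_score(original, mined):
    """F1 between the dependency sets of the original and the mined model."""
    tp = len(original & mined)   # correctly mined dependencies
    fp = len(original - mined)   # labelled as on the slide
    fn = len(mined - original)   # labelled as on the slide
    if tp == 0:
        return 0.0               # avoid dividing by zero
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    return 2 * p * r / (p + r)


# Hypothetical dependency sets for one process.
ORIGINAL = {("A", "B"), ("B", "C"), ("C", "D")}
MINED = {("A", "B"), ("B", "C"), ("B", "D")}

SCORE = f1_score(ORIGINAL, MINED)
print(SCORE)
```

Here two of the three dependencies are recovered and one spurious edge is mined, so p = r = 2 / 3 and F1 = 2 / 3.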
  29. Slide 16 of 16 Conclusions and future works

    What we achieved:

    We considered each activity as a time interval.

    We added the “notion” of time intervals to the Heuristics Miner algorithm.

    The new version of the algorithm is “backward compatible”.

    Possible future works:

    Test of the algorithm against more (and bigger) processes.

    Autonomous identification of the best parameter values.

    Support for “noise” in the time intervals, example: equal to