
Heuristics Miner for Time Intervals


Process mining attempts to reconstruct the workflow of a business process from logs of activities. The task matters in business scenarios where the process performed by workers has no well-understood, structured definition, so activity logs are mined to reconstruct the process that is actually followed. In this paper, we propose a generalization of a popular process mining algorithm, Heuristics Miner, to time intervals. We show that using time interval information for the performed activities, when it is available, allows the algorithm to produce better workflow models.

More info: http://andrea.burattin.net/publications/2010-esann

Andrea Burattin

April 28, 2010


Transcript

  1. Heuristics Miner for Time Intervals
    Andrea Burattin and Alessandro Sperduti
    Department of Pure and Applied Mathematics
    University of Padua, Italy
    April 28th, 2010
    Andrea Burattin and Alessandro Sperduti Heuristics Miner for Time Intervals


  2. Slide 2 of 16
    What is Process Mining I
    Business Process
    From the IEEE Glossary: “a sequence of steps performed for a
    given purpose; for example, the software development process”.
    Such a process transforms inputs into outputs.
    [Figure: example process with the steps Order received, Goods available, Goods wrapping, Shipping note, Shipping]
    Each performed action is recorded in a log

  3. Slide 3 of 16
    What is Process Mining II
    Main process mining areas
    When the model of the process is not available:
    Control-flow discovery aims to build a model describing the
    behaviour of the process.
    When the model of the process is available:
    Conformance analysis tries to fit a log to the given process model.
    Independently of whether the process model is available:
    Organizational mining tries to extract a “social network” that
    establishes relations between the people who performed the actions.

  4. Slide 4 of 16
    What is Process Mining III
    Extracted models

  5. Slide 5 of 16
    Control–flow discovery example run
    #  Activity          Completion time
    Instance 1
    1  Order received    Apr 21, 2010 12:00
    2  Payment received  Apr 22, 2010 09:00
    3  Goods available   Apr 26, 2010 08:30
    4  Shipping          Apr 26, 2010 10:15
    Instance 2
    1  Order received    Apr 23, 2010 15:45
    2  Payment reminder  Apr 25, 2010 15:45
    3  Payment received  Apr 25, 2010 17:31
    4  Goods available   Apr 26, 2010 10:00
    5  Shipping          Apr 26, 2010 12:30

  6-14. Slide 5 of 16 (animation steps; the log is the same as in the previous slide)
    Control–flow discovery example run
    The model is built by reading the direct successions (#i > #j) of each instance in turn:
    Instance 1
    Start: Order received is added to the model
    #1 > #2 (Order received > Payment received): Payment received is added
    #2 > #3 (Payment received > Goods available): Goods available is added
    #3 > #4 (Goods available > Shipping): Shipping is added
    Instance 2
    #1 > #2 (Order received > Payment reminder): Payment reminder is added
    #2 > #3 (Payment reminder > Payment received)
    #3 > #4 (Payment received > Goods available)
    #4 > #5 (Goods available > Shipping)
    Activities in the final model: Order received, Payment reminder, Payment received, Goods available, Shipping
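The example run can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the `log` list and the `direct_successions` helper are names introduced here, and the traces simply list the activities of the two instances in order of completion time.

```python
from collections import Counter

# The two process instances of the example log, ordered by completion time.
log = [
    ["Order received", "Payment received", "Goods available", "Shipping"],
    ["Order received", "Payment reminder", "Payment received",
     "Goods available", "Shipping"],
]


def direct_successions(traces):
    """Count how often activity a is immediately followed by activity b."""
    counts = Counter()
    for trace in traces:
        # Pair every event with the one that directly follows it.
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


successions = direct_successions(log)
for (a, b), n in successions.items():
    print(f"{a} > {b}: {n}")
```

The resulting counts are the |X > Y| values that a heuristic, frequency-based miner works with.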

  15. Slide 6 of 16
    Control–flow discovery example run
    #  Activity          Completion time
    Instance 1
    1  Order received    Apr 21, 2010 12:00
    2  Payment received  Apr 22, 2010 09:00
    3  Goods available   Apr 26, 2010 08:30
    4  Shipping          Apr 26, 2010 10:15
    Instance 2
    1  Order received    Apr 23, 2010 15:45
    2  Payment reminder  Apr 25, 2010 15:45
    3  Payment received  Apr 25, 2010 17:31
    4  Goods available   Apr 26, 2010 12:30
    5  Shipping          Apr 26, 2010 12:30

  17. Slide 6 of 16
    Control–flow discovery example run
    #  Activity          Completion time
    Instance 1
    1  Order received    Apr 21, 2010 12:00
    2  Payment received  Apr 22, 2010 09:00
    3  Goods available   Apr 26, 2010 08:30
    4  Shipping          Apr 26, 2010 10:15
    Instance 2
    1  Order received    Apr 23, 2010 15:45
    2  Goods available   Apr 25, 2010 15:45
    3  Payment received  Apr 25, 2010 17:31
    4  Payment reminder  Apr 26, 2010 12:30
    5  Shipping          Apr 26, 2010 12:30

  18. Slide 7 of 16
    Heuristics Miner, core behaviour I
    Heuristics Miner evaluates a “dependency function” between two
    activities (e.g. X and Y) in order to decide whether a dependency holds between them:
    $$X \Rightarrow Y = \frac{|X > Y| - |Y > X|}{|X > Y| + |Y > X| + 1}$$
    Where:
    X > Y holds if X is executed at time t and Y at time t + 1
    |X > Y| is the number of times that X > Y holds in the log
    Using all the relations whose value is above a threshold, the algorithm builds a
    directed graph of the dependencies.
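A minimal sketch of the dependency function follows, assuming the direct-succession counts are already available as a dictionary keyed by activity pairs. The counts and the threshold value are hypothetical and chosen only to illustrate the arithmetic.

```python
def dependency(counts, x, y):
    """Dependency measure X => Y computed from direct-succession counts."""
    xy = counts.get((x, y), 0)  # |X > Y|
    yx = counts.get((y, x), 0)  # |Y > X|
    return (xy - yx) / (xy + yx + 1)


# Hypothetical counts, not taken from the slides.
counts = {("A", "B"): 9, ("B", "A"): 1}
threshold = 0.5  # assumed value; the slides do not state one

value = dependency(counts, "A", "B")  # (9 - 1) / (9 + 1 + 1) = 8/11
keep_arc = value > threshold          # the arc A -> B enters the graph
print(round(value, 3), keep_arc)
```

The measure lies strictly between -1 and 1 and is antisymmetric: swapping X and Y flips its sign, so at most one direction of a pair can pass a positive threshold.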

  19. Slide 8 of 16
    Heuristics Miner, core behaviour II
    We can have both X ⇒ Y and X ⇒ Z:
    [Figure: activities X, Y and Z, with X followed by both Y and Z]
    Y and Z can be executed in mutual exclusion (XOR) or in parallel
    (in no specific order; AND)
    $$X \Rightarrow (Y \wedge Z) = \frac{|Y > Z| + |Z > Y|}{|X > Y| + |X > Z| + 1}$$
    If the value is above a threshold, the relation is AND; otherwise it is XOR
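The AND/XOR decision can be sketched in the same way. The counts and the `and_threshold` value below are hypothetical, and `and_measure` is a name introduced here for illustration.

```python
def and_measure(counts, x, y, z):
    """Measure X => (Y and Z) computed from direct-succession counts."""
    yz = counts.get((y, z), 0)  # |Y > Z|
    zy = counts.get((z, y), 0)  # |Z > Y|
    xy = counts.get((x, y), 0)  # |X > Y|
    xz = counts.get((x, z), 0)  # |X > Z|
    return (yz + zy) / (xy + xz + 1)


# Hypothetical counts: Y and Z often follow each other, in both orders.
counts = {("X", "Y"): 5, ("X", "Z"): 5, ("Y", "Z"): 4, ("Z", "Y"): 4}
and_threshold = 0.1  # assumed value; the slides do not state one

value = and_measure(counts, "X", "Y", "Z")  # (4 + 4) / (5 + 5 + 1) = 8/11
split = "AND" if value > and_threshold else "XOR"
print(round(value, 3), split)
```

If Y and Z never follow each other, the numerator is zero and the split is classified as XOR for any positive threshold.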

  20. Slide 9 of 16
    Time intervals in the logs
    Often, an activity is stored in the log as several
    “sub-activities” that together compose the main one:
    [Figure: a main activity on a time axis t, running from Start to End and composed of sub-activity 1, sub-activity 2, ..., sub-activity n-1, sub-activity n]
    Considering the first and the last sub-activity, we can build a time
    interval for the main activity.
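As a minimal sketch of this idea, assume each sub-activity is logged as a (name, timestamp) pair belonging to a known main activity. The `Interval` class, the `to_interval` helper and the sample events are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Interval:
    activity: str
    start: datetime
    end: datetime


def to_interval(activity, sub_events):
    """Build the interval of a main activity from its sub-activity events.

    sub_events is a list of (sub_activity_name, timestamp) pairs; the
    interval runs from the earliest to the latest timestamp.
    """
    times = [timestamp for _, timestamp in sub_events]
    return Interval(activity, min(times), max(times))


# Hypothetical sub-activities of a single main activity.
sub_events = [
    ("Sub-activity 1", datetime(2010, 4, 26, 8, 30)),
    ("Sub-activity 2", datetime(2010, 4, 26, 9, 10)),
    ("Sub-activity 3", datetime(2010, 4, 26, 10, 15)),
]
interval = to_interval("Main activity", sub_events)
print(interval.start, interval.end)
```

An activity logged with a single event degenerates to an interval whose start and end coincide, which corresponds to the instantaneous case.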

  21. Slide 10 of 16
    Information from time intervals vs. time points
    Allen’s Interval Algebra: the “overlap relation”
    [Figure: activities A, B, C and D drawn as time intervals (“Events as time intervals”) and as the sequence A B C D (“Instantaneous events”)]

  22. Slide 11 of 16
    New definitions, with time intervals support I
    Algorithm           Relation  Meaning
    Heuristics Miner    a > b     direct succession of points
    Heuristics Miner++  a > b     direct succession of intervals
    [Figure: three arrangements of intervals labelled A, B and C, captioned “A > B holds”, “A > B does not hold” and “A > C holds”]

  23. Slide 12 of 16
    New definitions, with time intervals support II
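    Direct succession, A > B
    [Figure: intervals A, B and C on a timeline marked t_i, t_k and t_j]
    Parallelism (overlap relation), A ∥ B
    [Figure: intervals A and B on a timeline marked t_i, t_u and t_j]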
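The transcript does not preserve the formal definitions drawn on this slide, so the following is only a sketch under stated assumptions: A > B is taken to hold when B starts after A has ended and no other activity ends in between, and A ∥ B is taken to hold when one activity starts while the other is still running. The `Interval` type and both helper functions are names introduced here.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    name: str
    start: float
    end: float


def directly_follows(a, b, trace):
    """Assumed reading of A > B: B starts once A has ended, and no other
    activity of the trace ends strictly between A's end and B's start."""
    if b.start < a.end:
        return False
    return not any(
        a.end < other.end < b.start
        for other in trace
        if other not in (a, b)
    )


def parallel(a, b):
    """Assumed reading of the overlap relation: one activity starts
    while the other is still running."""
    return a.start < b.start < a.end or b.start < a.start < b.end


# Hypothetical trace: B runs while A is running, C starts after A ends.
A = Interval("A", 0, 3)
B = Interval("B", 1, 2)
C = Interval("C", 4, 5)
trace = [A, B, C]

print(directly_follows(A, B, trace))  # False: B starts before A ends
print(directly_follows(A, C, trace))  # True
print(parallel(A, B))                 # True
```

With instantaneous events (start equal to end) no two activities can overlap under this reading, and only the direct-succession check remains.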

  25. Slide 13 of 16
    New definitions, with time intervals support III
    Dependency function, for time intervals
    $$X \Rightarrow Y = \frac{|X > Y| - |Y > X|}{|X > Y| + |Y > X| + 2 \cdot |X \parallel Y| + 1}$$
    AND function, for time intervals
    $$X \Rightarrow (Y \wedge Z) = \frac{|Y > Z| + |Z > Y| + 2 \cdot |Y \parallel Z|}{|X > Y| + |X > Z| + 1}$$
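The two interval-aware measures can be sketched as below, reading the extra term as twice the number of times the two activities are observed in parallel (the overlap relation). The dictionaries `succ` and `par` and all counts are hypothetical.

```python
def dependency_intervals(succ, par, x, y):
    """X => Y with the parallelism count added to the denominator."""
    xy = succ.get((x, y), 0)
    yx = succ.get((y, x), 0)
    overlaps = par.get(frozenset((x, y)), 0)  # times X and Y overlap
    return (xy - yx) / (xy + yx + 2 * overlaps + 1)


def and_measure_intervals(succ, par, x, y, z):
    """X => (Y and Z) with the parallelism count added to the numerator."""
    yz = succ.get((y, z), 0)
    zy = succ.get((z, y), 0)
    overlaps = par.get(frozenset((y, z)), 0)  # times Y and Z overlap
    xy = succ.get((x, y), 0)
    xz = succ.get((x, z), 0)
    return (yz + zy + 2 * overlaps) / (xy + xz + 1)


# Hypothetical counts, not taken from the slides.
succ = {("X", "Y"): 6, ("X", "Z"): 5}
par = {frozenset(("X", "Y")): 2, frozenset(("Y", "Z")): 4}

with_overlap = dependency_intervals(succ, par, "X", "Y")  # 6 / 11
no_overlap = dependency_intervals(succ, {}, "X", "Y")     # 6 / 7
and_value = and_measure_intervals(succ, par, "X", "Y", "Z")  # 8 / 12
print(with_overlap, no_overlap, and_value)
```

Observed overlaps lower the dependency between two activities and raise the evidence for an AND split. When no overlaps are recorded, both functions reduce to the formulas for instantaneous events.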

  27. Slide 14 of 16
    Results for a real case
    [Figure: result for Heuristics Miner, a model over Activities 0, 4, 10, 11, 20, 21, 22, 23, 30, 31 and 32]
    [Figure: result for Heuristics Miner++, a model over the same activities]
    The log is composed of 1465 traces (for Heuristics Miner, only the starting event is considered)

  28. Slide 15 of 16
    Results for an artificial dataset
    [Plot: F1 measure (average, min and max; y-axis from 0 to 1) against the percentage of activities as intervals (x-axis from 0 to 100)]
    $$F_1 = 2 \cdot \frac{p \cdot r}{p + r}, \qquad p = \frac{tp}{tp + fp}, \qquad r = \frac{tp}{tp + fn}$$
    Test data: logs of 100 random processes with 6 activities
    tp: correctly mined dependencies
    fp: dependencies present in the mined model but not in the original one
    fn: dependencies present in the original model but not in the mined one
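A short sketch of the evaluation measure, assuming each model is represented as a set of (source, target) dependencies and using the usual convention that a false positive is a dependency found only in the mined model. The function name and the two example models are hypothetical.

```python
def f1_score(original, mined):
    """F1 of the mined dependencies with respect to the original model."""
    tp = len(original & mined)  # correctly mined dependencies
    fp = len(mined - original)  # mined, but absent from the original
    fn = len(original - mined)  # in the original, but not mined
    p = tp / (tp + fp) if tp + fp else 0.0  # precision
    r = tp / (tp + fn) if tp + fn else 0.0  # recall
    return 2 * p * r / (p + r) if p + r else 0.0


# Hypothetical models, each a set of (source, target) dependencies.
original = {("A", "B"), ("B", "C"), ("C", "D")}
mined = {("A", "B"), ("B", "C"), ("B", "D")}

score = f1_score(original, mined)  # p = r = 2/3, so F1 = 2/3
print(round(score, 3))
```

Because F1 is symmetric in precision and recall, swapping the roles of fp and fn leaves the score unchanged.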

  29. Slide 16 of 16
    Conclusions and future work
    What we achieved:
    We considered each activity as a time interval
    We added the notion of time intervals to the Heuristics Miner
    algorithm
    The new version of the algorithm is “backward compatible”
    Possible future work:
    Testing the algorithm against more (and bigger) processes
    Automatic identification of the best parameter values
    Support for “noise” in the time intervals; example:
    [Figure: one interval arrangement] equal to [Figure: another interval arrangement]