
A Framework for Online Conformance Checking

Andrea Burattin
September 10, 2017

Conformance checking -- a branch of process mining -- focuses on establishing to what extent actual executions of a process are in line with the expected behavior of a reference model. Current conformance checking techniques only allow for a-posteriori analysis: the amount of (non-)conformant behavior is quantified after the completion of the process instance. In this paper we propose a framework for online conformance checking: not only do we quantify (non-)conformant behavior while the execution is running, we also restrict the computation to constant time complexity per analyzed event, thus enabling the online analysis of a stream of events. The framework is instantiated with ideas coming from the theory of regions and from state similarity. An implementation is available in ProM and promising results have been obtained.

More info: https://andrea.burattin.net/publications/2017-bpi

Transcript

  1. A Framework for Online Conformance Checking

    Andrea Burattin (1) and Josep Carmona (2)
    (1) University of Innsbruck, Austria; now at Technical University of Denmark
    (2) Universitat Politècnica de Catalunya, Spain
    This work was partially funded by the Spanish Ministry for Economy and Competitiveness (MINECO) and the EU (FEDER funds) under grant COMMAS (TIN2013-46181-C2-1-R).
  2. What is conformance checking?

    • Conformance checking
      • Input: a reference model and an execution trace
      • Output: the extent to which the observed behaviour conforms to the given model
    • Conformance is quantified as the distance between the observed trace and the most similar execution allowed by the model (e.g., by using token replay or alignments)

    van der Aalst, W. M. P. (2016). Process Mining (Second Ed.). Springer Berlin Heidelberg.
  3. What is online data/process mining?

    • In online processing, the input is a stream
    • Example of event stream:
      • Boxes represent events
      • Background colour is the case id
      • Letters are activity names

    [Figure: event stream along a time axis, with events E F E G F E F G G H I H H belonging to cases Cx, Cy and Cz]

    Database management systems                                       | Data-stream management systems
    Persistent relations                                              | Transient streams
    One-time queries                                                  | Continuous queries
    Random access                                                     | Sequential access
    Access plan determined by query processor and physical DB design  | Unpredictable data characteristics and arrival patterns

    Gama, J. (2010). Knowledge Discovery from Data Streams. Chapman and Hall/CRC.
  4. Event log and event stream

    Event log (events grouped by case):

    Case id: C1
    Event # | Activity | Originator | Time
    1       | A        | U1         | 2017-09-01 …
    2       | B        | U1         | 2017-09-02 …
    3       | C        | U2         | 2017-09-03 …
    4       | E        | U2         | 2017-09-04 …

    Case id: C2
    Event # | Activity | Originator | Time
    1       | A        | U1         | 2017-09-02 …
    2       | B        | U1         | 2017-09-03 …
    3       | D        | U3         | 2017-09-04 …
    4       | E        | U3         | 2017-09-05 …

    Event stream (events ordered by time):

    Time       | Case id | Activity | Originator
    2017-09-01 | C1      | A        | U1 …
    2017-09-02 | C2      | A        | U1 …
    2017-09-02 | C1      | B        | U1 …
    2017-09-03 | C1      | C        | U2 …
    2017-09-03 | C2      | B        | U1 …
    2017-09-04 | C1      | E        | U2 …
    2017-09-04 | C2      | D        | U3 …
    2017-09-05 | C2      | E        | U3 …
    ...
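The relation between the two tables on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `log_to_stream` and the tuple layout are assumptions, and since the slide elides the time of day ("…"), events sharing a date are ordered here by case id, which may differ from the order shown in the slide.

```python
from datetime import date

# Event log from the slide: case id -> list of (activity, originator, day).
event_log = {
    "C1": [("A", "U1", date(2017, 9, 1)), ("B", "U1", date(2017, 9, 2)),
           ("C", "U2", date(2017, 9, 3)), ("E", "U2", date(2017, 9, 4))],
    "C2": [("A", "U1", date(2017, 9, 2)), ("B", "U1", date(2017, 9, 3)),
           ("D", "U3", date(2017, 9, 4)), ("E", "U3", date(2017, 9, 5))],
}

def log_to_stream(log):
    """Flatten a case-grouped log into one time-ordered sequence of events."""
    events = [(day, case, activity, originator)
              for case, trace in log.items()
              for activity, originator, day in trace]
    # Sort by day; the case id is used only as a deterministic tie-breaker.
    return sorted(events, key=lambda e: (e[0], e[1]))

stream = log_to_stream(event_log)
for day, case, activity, originator in stream:
    print(day.isoformat(), case, activity, originator)
```

In a real online setting the stream is consumed as it arrives rather than built from a stored log; the sketch only shows how the same events look under the two views.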
  5. Online mining peculiarities

    • Peculiarities of the stream mining problem
      • Impossible to store the entire stream (approximation)
      • Unbounded backtracking not feasible over streams (algorithms are required to make one pass over the data and to scale linearly with the number of processed items)
      • Deal with variable system conditions, such as fluctuating stream rates
      • Quickly adapt the model to cope with unusual data values (concept drifts)
    • Stream mining is a good candidate to analyze big data

    van der Aalst, W. M. P. (2016). Process Mining (Second Ed.). Springer Berlin Heidelberg.
  6. Related work

    • Process mining for event streams
      • Some control flow discovery techniques
        • Adaptations of Heuristics Miner
        • Adaptations of Declare discovery
      • Some social network mining approaches
      • Some prediction-based techniques
    • Conformance checking received attention only recently
    • Current conformance checking algorithms cannot be applied in a streaming context because of
      • Computational complexity (search space / backtracking)
      • Need for a complete trace (to have an optimal alignment)
  7. Table of contents

    • Introduction and problem presentation
    • Online conformance checking
      • General approach
      • OCTS properties
      • A possible OCTS construction approach
      • Online conformance
    • Implementation and experiments
    • Conclusion and future work
  8. General idea of the approach

    • Steps 0 and 1 are offline (only once)
    • Step 2 is online (constraints to be enforced here)

    [Figure: pipeline. The process model is converted into a transition system; an enriched model is constructed (with region theory, state distance, and proper parameters configured), yielding the Online Conformance Transition System (OCTS); the OCTS is then used for online conformance checking of the event stream (E F E G F E ...)]
  9. Properties of OCTS

    • An OCTS represents all possible executions of the process's activities
      • Constructed from the transition system of a process (e.g., reachability graph) extended with additional transitions
    • Properties of an OCTS
      • Deterministic transition system
      • From each state, any activity can be followed
      • Each transition has a cost
        • Transitions allowed by the process have cost 0
        • Transitions not allowed by the process have cost > 0
    • Properties of an OCTS replay
      • Sum of costs of transitions followed by a conformant trace = 0
      • Sum of costs of transitions followed by a non-conformant trace > 0

    [Slide annotations: suitable for stream processing; good for conformance checking]
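The OCTS properties listed on this slide can be illustrated with a small Python sketch. The three-state model, the function names, and above all the policy used for the added transitions (a self-loop with cost 1) are placeholders chosen for brevity; they are not the construction proposed in the talk.

```python
# Toy process (an assumption for illustration): activity A, then activity B.
STATES = [1, 2, 3]
ACTIVITIES = ["A", "B"]
MODEL = {(1, "A"): 2, (2, "B"): 3}  # transitions allowed by the process

def build_toy_octs(model, states, activities, deviation_cost=1):
    """Return a total, deterministic map (state, activity) -> (next state, cost)."""
    octs = {}
    for s in states:
        for a in activities:
            if (s, a) in model:
                octs[(s, a)] = (model[(s, a)], 0)   # allowed by the process: cost 0
            else:
                octs[(s, a)] = (s, deviation_cost)  # placeholder policy: stay put, cost > 0
    return octs

def replay_cost(octs, trace, start=1):
    """Sum the costs of the transitions followed while replaying a trace."""
    state, total = start, 0
    for activity in trace:
        state, cost = octs[(state, activity)]
        total += cost
    return total

octs = build_toy_octs(MODEL, STATES, ACTIVITIES)
print(replay_cost(octs, ["A", "B"]))       # conformant trace
print(replay_cost(octs, ["B", "A", "B"]))  # trace with one deviation
```

Because the map is keyed by (state, activity) and contains every pair, each event is handled with a single lookup, which is what makes the structure attractive for stream processing.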
  10. Challenges related to OCTS

    • Main challenge: precompute all deviating paths in advance
    • Consider this model [Figure: example process model]
      • We are in state 1, and we now observe a given activity. What to do now?
        • Ignore
        • Replay
        • Go to state 3
        • Go to state 5
    • Problem: partial information, assumptions needed
  11. Theory of regions

    • Given a transition system, where
      • S′ is a subset of its states
      • t is a transition label
    • We can define the following properties
      • nocross(t, S′)
      • enter(t, S′)
      • exit(t, S′)

    [Figure: the same transition system (states 1 to 6, transitions labelled a, b, c, d) shown three times]
  12. Theory of regions (cont.)

    • Given a transition system, a subset of its states S′ is a region if
      • For all t ∈ Σ (all transition labels)
        1. enter(t, S′) ⇒ ¬ nocross(t, S′) ∧ ¬ exit(t, S′)
        2. exit(t, S′) ⇒ ¬ nocross(t, S′) ∧ ¬ enter(t, S′)
    • Considering the previous example (Σ = {a, b, c, d}), conditions 1 and 2 are both fulfilled for every label
      • For one label: enter and ¬ nocross and ¬ exit
      • For two labels: ¬ enter and nocross and ¬ exit
      • For one label: ¬ enter and ¬ nocross and exit

    [Figure: transition system with states 1 to 6 and transitions labelled a, b, c, d]
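The region conditions can be checked mechanically. The sketch below uses the usual definitions from the theory of regions, which the slides name but do not spell out: a label enters a set of states if some transition with that label goes from outside to inside, exits if some transition goes from inside to outside, and does not cross if some transition has both endpoints inside or both outside. The transition system is a made-up example (not the one drawn in the slides), and the function names are assumptions.

```python
# Hypothetical transition system: a, then b and c in any order, then d.
TRANSITIONS = [
    (1, "a", 2),
    (2, "b", 3), (2, "c", 4),
    (3, "c", 5), (4, "b", 5),
    (5, "d", 6),
]

def classify(transitions, label, subset):
    """Return (enter, exit, nocross) for `label` with respect to `subset`."""
    enter = exit_ = nocross = False
    for src, lab, dst in transitions:
        if lab != label:
            continue
        src_in, dst_in = src in subset, dst in subset
        if not src_in and dst_in:
            enter = True
        elif src_in and not dst_in:
            exit_ = True
        else:  # both endpoints inside, or both outside
            nocross = True
    return enter, exit_, nocross

def is_region(transitions, subset):
    """Check conditions 1 and 2 of the region definition for every label."""
    labels = {lab for _, lab, _ in transitions}
    for label in labels:
        enter, exit_, nocross = classify(transitions, label, subset)
        if enter and (nocross or exit_):
            return False
        if exit_ and (nocross or enter):
            return False
    return True

print(is_region(TRANSITIONS, {2, 3}))  # a enters, c exits, b and d do not cross
print(is_region(TRANSITIONS, {2}))     # b both exits (2 -> 3) and does not cross (4 -> 5)
```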
  13. Construction of an OCTS

    • Given the current state s, and an activity t to execute
    • We select a log move if within any of the regions of s there's a transition labelled t
      • Rationale: if the states of the region interact with t but t does not cross the region, then the local state of the system does not change
      • A log move implies staying in the same state
    • Example
      • From state 3, we want to execute a given activity
  14. Construction of an OCTS (cont.)

    • Given the current state s, and an activity t to execute
    • We select model moves + a synchronous move if there is a transition labelled t in the model
    • To add the new transition
      1. Select the candidate states for the synchronous move
      2. Choose the candidate maximizing the cosine similarity of the vector representations of the corresponding states and target
      3. Add the chosen transition, and set its cost > 0
    • Example: state 1 and a given activity
  15. Construction of an OCTS (cont.)

    • Vector representation v(s, t) of state s with target activity t
      • n dimensions, where n = |Σ| (i.e., number of different transition labels)
      • Each component v_i of the vector refers to a label Σ_i
      • The value of v_i is
        • 1 if the shortest path from the start state to s contains label Σ_i
        • 0 otherwise (including the case Σ_i = t)
    • Example: state 1 and a target activity t, with components ordered as A B C D E F G
      • v(3, t) = [1 1 0 0 0 0 0]
      • v(5, t) = [1 1 0 1 0 0 0]
      • v(1, t) = [1 0 0 0 0 0 0]
      • v(1, t) ⋅ v(3, t) / (‖v(1, t)‖ ‖v(3, t)‖) = 0.71
      • v(1, t) ⋅ v(5, t) / (‖v(1, t)‖ ‖v(5, t)‖) = 0.58
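The two similarity values in the example can be reproduced with a few lines of Python. The vectors are the ones from the slide; the function and variable names are only illustrative.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equally long numeric vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)

# Vector representations from the slide (components ordered as A B C D E F G).
v1 = [1, 0, 0, 0, 0, 0, 0]  # current state 1
v3 = [1, 1, 0, 0, 0, 0, 0]  # candidate state 3
v5 = [1, 1, 0, 1, 0, 0, 0]  # candidate state 5

candidates = {3: v3, 5: v5}
similarities = {s: cosine(v1, vec) for s, vec in candidates.items()}
best = max(similarities, key=similarities.get)

print({s: round(sim, 2) for s, sim in similarities.items()})  # {3: 0.71, 5: 0.58}
print(best)  # 3
```

With these vectors, state 3 is the candidate with the higher similarity, so under the policy of the previous slide it would be the one chosen.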
  16. Given a model, is the OCTS unique?

    • The construction of the OCTS is not unique
      • Different policies for different scenarios
      • Our approach tries to partially mimic the concept of alignments
      • Deviations might be extremely impactful
        • A no-deviations-allowed extension
  17. Online conformance algorithm

    Input: the OCTS; S: the event stream; m: max number of running process instances at the same time

    M ← new map                        // given a case id, returns current state, cost, last update time
    forever do
        read the event (c, a) from S   // c is case id, a is activity name
        (s, k, _) ← M(c)               // process instance status
        (s′, k′) ← OCTS(s, a)          // new state and cost of reaching it
        M(c) ← (s′, k + k′, now)       // replay
        if |M| > m then                // cleanup of M
            remove oldest executions from M
        end
    end
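A minimal Python sketch of this loop follows. It makes several assumptions that the slide does not state: the OCTS is given as a dictionary from (state, activity) to (next state, cost), unseen cases start in a known initial state, and "oldest" means least recently updated, which is tracked here by the insertion order of an `OrderedDict` instead of explicit timestamps. The toy OCTS and stream are made up for the example.

```python
from collections import OrderedDict

def online_conformance(octs, stream, max_cases, initial_state):
    """Yield (case id, new state, cumulative cost) after each event."""
    status = OrderedDict()  # case id -> (state, cost), least recently updated first
    for case_id, activity in stream:
        # Unknown (or previously evicted) cases restart from the initial state.
        state, cost = status.pop(case_id, (initial_state, 0))
        new_state, step_cost = octs[(state, activity)]   # single lookup per event
        status[case_id] = (new_state, cost + step_cost)  # re-inserted as most recent
        if len(status) > max_cases:
            status.popitem(last=False)                   # drop least recently updated case
        yield case_id, new_state, cost + step_cost

# Toy OCTS for the process "A then B"; deviations are self-loops with cost 1.
TOY_OCTS = {
    (1, "A"): (2, 0), (1, "B"): (1, 1),
    (2, "A"): (2, 1), (2, "B"): (3, 0),
    (3, "A"): (3, 1), (3, "B"): (3, 1),
}
events = [("c1", "A"), ("c2", "B"), ("c1", "B"), ("c2", "A"), ("c2", "B")]

for result in online_conformance(TOY_OCTS, events, max_cases=2, initial_state=1):
    print(result)
```

Every step in the loop body is a dictionary operation, so the work per event does not grow with the length of the stream, and memory is bounded by `max_cases`. The price is that an evicted case which later resumes is replayed from the initial state.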
  18. Experimental evaluation

    • We tested our implementation on a realistic process model simulated with PLG2
      • 26 tasks
      • 20 gateways

    [Figure: the process model used in the experiments, a mortgage application process with activities such as Enter Request, Check if all Info is Available, Calculate Available Funds, Make Decision, Prepare Offer, Send Rejection Letter, Make Deposit and Close Application]
  19. Experimental evaluation (cont.)

    • We let PLG2 run for 1 hour and 10 minutes of wall-clock time
    • Emission rate: about 65 events / second
    • In total: 256110 events generated
  20. Conclusion and future work

    • Approach for online conformance checking
      • Offline generation of the OCTS
        • Input is a transition system (e.g., reachability graph)
        • Alignment-based technique to add additional transitions
      • Online conformance checking on the OCTS
    • Future work
      • Understand the quality of the current OCTS generation policy
      • Exploit contextual information (i.e., data)
      • Merge the technique with other online conformance techniques