
Lazy Abstraction for Markov Decision Processes

This presentation introduces our ongoing work on a novel lazy abstraction technique for Markov Decision Process (MDP) analysis based on adaptive simulation graphs. Lazy abstraction combines exploring the abstract state space and refining it into a single process, refining only the parts of the state space that are needed. As this synergizes well with partial state-space exploration techniques, we also propose combining our lazy abstraction algorithm with Bounded Real-Time Dynamic Programming to compute the numerical information on the fly while constructing the abstraction. The talk was presented at the Alpine Verification Meeting 2023.

Critical Systems Research Group

September 26, 2023

Transcript

  1. Lazy Abstraction for Markov Decision Processes
    Dániel Szekeres
    https://ftsrg.mit.bme.hu/

  2. Context: Reliability analysis
    [Diagram: the system under analysis interacts with external systems, user behavior, the physical environment, and component failures: sources of probabilistic and non-deterministic behavior]

  3. Markov Decision Processes (MDP)
    • Discrete set of states
    • Multiple actions available in each state → non-deterministic behavior
    • Resulting state sampled from a distribution → probabilistic behavior
    • Commonly described through higher-level formalisms

  4. Probabilistic Guarded Commands
    • A set of state variables
    • A set of commands, each having:
      – A Boolean guard expression over the state variables
      – A probability distribution over effects changing the variables
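
    To make the formalism concrete, here is a minimal Python sketch of probabilistic guarded commands (illustrative names and structure, not the presenters' implementation):

    import random

    # A command: a Boolean guard over the state plus a distribution over effects.
    # A state is a dict mapping variable names to values.
    class Command:
        def __init__(self, guard, effects):
            self.guard = guard        # function: state -> bool
            self.effects = effects    # list of (probability, function: state -> state)

        def enabled(self, state):
            return self.guard(state)

        def sample(self, state):
            r, acc = random.random(), 0.0
            for prob, effect in self.effects:
                acc += prob
                if r <= acc:
                    return effect(state)
            return self.effects[-1][1](state)   # numerical safety fallback

    # Example: if x < 2, increment x; with probability 0.2 also reset y to 0.
    cmd = Command(
        guard=lambda s: s["x"] < 2,
        effects=[
            (0.8, lambda s: {**s, "x": s["x"] + 1}),
            (0.2, lambda s: {**s, "x": s["x"] + 1, "y": 0}),
        ],
    )
    state = {"x": 0, "y": 5}
    if cmd.enabled(state):
        state = cmd.sample(state)   # {"x": 1, "y": 5} with probability 0.8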

  5. State-space explosion
    • The state space is exponentially large in the size of the description
    • Hinders verifying complex systems in practice
    • Exacerbated by the numerical computations in probabilistic model checking

  6. Counteracting state space explosion
    Partial state space exploration:
    • Stop exploring new states when enough information is available
    Abstraction:
    • Merges similar concrete states into abstract states
    • Needs to be conservative

  7. Counteracting state space explosion: Partial state space exploration + Abstraction
    • Explore only a part of the abstract state space
    • Already used in non-probabilistic abstraction-based model checking
    • Not yet used in probabilistic model checking:
      – Existing MDP abstraction-refinement algorithms rely on the whole abstract state space
      – Lazy abstraction synergizes much better with partial exploration
      → needs to be adapted for MDPs

  8. Partial state-space exploration for MDPs: BRTDP

  9. Bounded Real-Time Dynamic Programming (BRTDP)
    • Maintain both a lower and an upper value approximation
    • Simulate traces → update only simulated states
    • Iterate until convergence: the initial state has a small enough interval
    [Diagram: an example MDP with states annotated by value intervals such as [1.0, 1.0], [0.0, 0.0] and [0.0, 1.0]; one transition branches with p = 0.5]

  11. Bounded Real-Time Dynamic Programming (BRTDP)
    [Diagram: the same example after an update along a simulated trace; some intervals have tightened, e.g. to [0.5, 1.0] and [0.25, 1.0]]

  13. Bounded Real-Time Dynamic Programming (BRTDP)
    [Diagram: further updates; more intervals have converged to [1.0, 1.0] and one has tightened to [0.5, 1.0]]

  15. Bounded Real-Time Dynamic Programming (BRTDP)
    [Diagram: after convergence, the remaining intervals along the trace have collapsed to point intervals such as [1.0, 1.0]]
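
    As a rough illustration of the BRTDP loop sketched on these slides, a minimal Python version for maximal reachability probability (illustrative, not the presenters' implementation; end-component handling is omitted):

    import random

    # mdp: dict mapping each state to {action: [(probability, successor), ...]};
    # every successor must also be a key of mdp. targets: set of target states.
    def brtdp(mdp, init, targets, eps=1e-3, max_len=100):
        L = {s: (1.0 if s in targets else 0.0) for s in mdp}                       # lower bounds
        U = {s: (0.0 if (s not in targets and not mdp[s]) else 1.0) for s in mdp}  # upper bounds

        def q(bound, s, a):   # expected bound value of taking action a in state s
            return sum(p * bound[t] for p, t in mdp[s][a])

        while U[init] - L[init] > eps:
            trace, s = [], init
            # Simulate a trace, following an action that maximises the upper bound.
            while s not in targets and mdp[s] and len(trace) < max_len:
                trace.append(s)
                a = max(mdp[s], key=lambda act: q(U, s, act))
                r, acc = random.random(), 0.0
                for p, t in mdp[s][a]:
                    acc += p
                    if r <= acc:
                        s = t
                        break
            # Update only the simulated states, in reverse order.
            for s in trace[::-1]:
                L[s] = max(q(L, s, act) for act in mdp[s])
                U[s] = max(q(U, s, act) for act in mdp[s])
        return L[init], U[init]

    # Example: action "a" reaches the target "goal" with probability 0.5.
    example = {"s0": {"a": [(0.5, "goal"), (0.5, "sink")]}, "goal": {}, "sink": {}}
    print(brtdp(example, "s0", {"goal"}))   # -> (0.5, 0.5)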

  16. Lazy abstraction for MDPs

  17. CounterExample-Guided Abstraction Refinement (CEGAR)
    [Flowchart: construct the abstract model (starting with a trivial abstraction) → check the abstract model → if the property is satisfied, output "property satisfied"; otherwise concretize the counterexample → if it is concretizable, output "property violated"; otherwise refine the precision based on the counterexample and start over]
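
    A generic Python sketch of this loop (the callables stand for the tool-specific steps; names are illustrative):

    def cegar(build, check, concretize, refine, trivial_precision):
        precision = trivial_precision                      # start with a trivial abstraction
        while True:
            abstract_model = build(precision)              # construct the abstract model
            satisfied, counterexample = check(abstract_model)
            if satisfied:
                return "property satisfied"
            if concretize(counterexample):                 # the counterexample is realizable
                return "property violated"
            precision = refine(precision, counterexample)  # refine based on the counterexample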

  18. Lazy abstraction
    • Builds on the idea of CEGAR
    • Merges abstract exploration and refinement
    • Precision is local to each node in the abstract state graph
    • Refinement is performed locally on the required nodes
    • Better suited for combination with BRTDP than non-lazy probabilistic CEGAR approaches
    [Diagram: an abstract state graph whose nodes carry local precisions Π0 and Π1]

  19. Lazy abstraction for MDPs
    • Several different lazy abstraction implementations (BLAST, Impact, etc.)
      → We use an Adaptive Simulation Graph-based version
    • Abstract model: Probabilistic Adaptive Simulation Graph (PASG)
    • Domain-agnostic in general
    • Currently implemented with Explicit Value Abstraction: some variables are tracked exactly, others are unknown
    [Diagram: concrete states x=0, y=0 and x=0, y=1 are abstracted to x=0, y=?; concrete states x=1, y=0 and x=1, y=1 are abstracted to x=1, y=?]
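
    A minimal sketch of Explicit Value Abstraction as described here (illustrative, not Theta's actual API): a precision is the set of variables tracked exactly, every other variable becomes unknown.

    def abstract(concrete, precision):
        """Map a concrete state to its abstract label; None marks an unknown variable."""
        return {var: (val if var in precision else None) for var, val in concrete.items()}

    # Both concrete states below map to the same abstract state x = 0, y = ?
    assert abstract({"x": 0, "y": 0}, {"x"}) == {"x": 0, "y": None}
    assert abstract({"x": 0, "y": 1}, {"x"}) == {"x": 0, "y": None}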

  20. Probabilistic Adaptive Simulation Graph (PASG)
    • Nodes are labeled by a concrete state and an abstract state (describing a set of concrete states) that contains it
    • The concrete state represents all states in the abstract state w.r.t. available "behaviors" (action sequences)
    • Initial node: the concrete label is the concrete initial state, the abstract label is as coarse as possible
    [Diagram: initial node n0 with concrete label Lc: x = 0, y = 0 and abstract label La: x = 0]

  21. Expansion
    • Select an action enabled in the concrete state
    • Compute the image of the concrete state
    • Overapproximate the image of the abstract state
    [Diagram: n0 (Lc: x = 0, y = 0; La: x = 0) is expanded with command c1 (branch probabilities 0.2 and 0.8) and command c2 (probability 1.0), yielding successors n1 (Lc: x = 0, y = 0; La: x = 0), n2 (Lc: x = 1, y = 0; La: x = 1) and n3 (Lc: x = 1, y = 2; La: x = 1)]

  22. Expansion (continued)
    • If an action is not enabled in any part of the abstract state, it is ignored
    [Diagram: the same PASG; a command c3 that is enabled nowhere in the abstract label is crossed out and ignored]
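
    A small, self-contained sketch of a PASG node and the expansion step (illustrative names, not the actual Theta data structures). Effects are modeled as constant assignments so that the concrete image and the image of the abstract label are easy to compute; the example loosely mirrors the figure.

    class Node:
        def __init__(self, concrete, abstract):
            self.concrete = concrete   # Lc: concrete state, e.g. {"x": 0, "y": 0}
            self.abstract = abstract   # La: abstract label containing it, e.g. {"x": 0, "y": None}
            self.children = []         # outgoing edges: (command name, probability, Node)

    def expand(node, commands):
        """commands: list of (name, guard, [(probability, assignment), ...])."""
        for name, guard, effects in commands:
            if not guard(node.concrete):               # only actions enabled in the concrete state
                continue
            for prob, assignment in effects:
                succ_concrete = {**node.concrete, **assignment}   # image of the concrete state
                succ_abstract = {**node.abstract, **assignment}   # image of the abstract label
                                                                  # (exact for constant assignments)
                node.children.append((name, prob, Node(succ_concrete, succ_abstract)))

    # Example: from Lc: x=0, y=0 with La: x=0, y=?, command c1 sets x to 1 and,
    # with probability 0.8, also sets y to 2.
    root = Node({"x": 0, "y": 0}, {"x": 0, "y": None})
    expand(root, [("c1", lambda s: s["x"] == 0, [(0.2, {"x": 1}), (0.8, {"x": 1, "y": 2})])])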

  23. Covering
    • If the new concrete state after expansion is already contained in another node's abstract state, a cover edge is created
    • Expansion of the covered node can be skipped
    [Diagram: the same PASG with a cover edge from n1 (Lc: x = 0, y = 0; La: x = 0), whose concrete state is contained in another node's abstract label]
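
    A sketch of the basic covering check (illustrative and simplified; in the full algorithm covering also interacts with the local precisions and can later be undone by refinement, as the later slides show):

    def contains(abstract, concrete):
        """Does an abstract label (None = unknown variable) contain a concrete state?"""
        return all(v is None or concrete[var] == v for var, v in abstract.items())

    def find_coverer(new_concrete, other_abstract_labels):
        """Index of a node whose abstract label already contains the new concrete state."""
        for i, abstract in enumerate(other_abstract_labels):
            if contains(abstract, new_concrete):
                return i
        return None

    # Example: concrete state x=0, y=0 is covered by a node with abstract label x=0, y=?
    assert find_coverer({"x": 0, "y": 0}, [{"x": 1, "y": None}, {"x": 0, "y": None}]) == 1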

  24. [Diagram: the example continues: a further expansion with command c1 adds n4 (Lc: x = 1, y = 0; La: x = 1) and n5 (Lc: x = 2, y = 0; La: x = 2) with branch probabilities 0.2 and 0.8]

  25. [Diagram: refinement in the example: several abstract labels are strengthened with y = 0, a cover edge is removed (marked X), and a command c3 is excluded (marked X)]

  26. PASG versions: Upper-cover
    • Direct adaptation of the original ASG for MDPs
    • An action that might be enabled somewhere in the abstract label must be enabled in the concrete state
    • Upper approximation

  27. PASG versions: Lower-cover
    • Inverted representativity requirement
    • An action disabled somewhere in the abstract label must be disabled in the concrete state
    • Lower approximation

  28. PASG versions: Bi-cover
    • Combines the upper- and lower-cover constraints
    • Provides exact numerical results
    • The resulting value is independent of the order of exploration

  29. Quantitative Analysis – Full Exploration
    • Construct the full PASG → analyze it as an MDP
    • Cover edges are deterministic actions
    • Any MDP analysis algorithm can be applied (value iteration variants, policy iteration, linear programming, …)
    • Provable guarantees for the target probability:
    [Number line from 0 to 1: the lower-cover result is at most the original (concrete) result, which is at most the upper-cover result; the bi-cover result coincides with the original]
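
    One way to spell out the guarantee suggested by the number line (notation chosen here for illustration, with P denoting the probability of reaching the target):

    P(lower-cover PASG) ≤ P(original MDP) ≤ P(upper-cover PASG),  and  P(bi-cover PASG) = P(original MDP)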

  30. Lazy abstraction + BRTDP

  31. BRTDP reminder
    • Maintain both a lower and an upper value approximation
    • Simulate traces → update only simulated states
    • Iterate until convergence: the initial state has a small enough interval
    [Diagram: the earlier example MDP with partially converged value intervals]

  32. Quantitative Analysis – On-the-fly
    • Uses BRTDP for analysis
    • Merges PASG construction and numeric computations
    • PASG nodes are constructed during trace simulation
    Fewer states explored → fewer inconsistencies → coarser abstract labels → smaller abstract state space
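
    A rough, self-contained sketch of the combination (illustrative and heavily simplified: no covering or refinement, a uniform successor choice instead of BRTDP's heuristic, end components ignored). Here successors(node) is assumed to lazily build and cache the PASG successors of a node, returning a dict from each enabled command to its (probability, child) branches, and is_target checks the target predicate on a node.

    import random

    def on_the_fly_brtdp(root, successors, is_target, eps=1e-3, max_len=100):
        L, U = {root: 0.0}, {root: 1.0}           # value bounds, only for constructed nodes
        while U[root] - L[root] > eps:
            trace, node = [], root
            while not is_target(node) and len(trace) < max_len:
                branches = successors(node)       # PASG nodes are built during simulation
                for br in branches.values():
                    for _, child in br:
                        L.setdefault(child, 0.0)
                        U.setdefault(child, 1.0)
                if not branches:
                    U[node] = 0.0                 # deadlock: the target is unreachable
                    break
                trace.append(node)
                _, node = random.choice(random.choice(list(branches.values())))
            if is_target(node):
                L[node] = U[node] = 1.0
            for n in reversed(trace):             # BRTDP-style interval update
                per_cmd = successors(n).values()
                L[n] = max(sum(p * L[c] for p, c in br) for br in per_cmd)
                U[n] = max(sum(p * U[c] for p, c in br) for br in per_cmd)
        return L[root], U[root]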

  33. Quantitative Analysis – On-the-fly
    • Provable guarantees:
      – Convergence for finite state spaces: the PASG is finished after a finite number of traces + BRTDP convergence results apply to the finished PASG
      – Guarantees for the target probability:
    [Number line from 0 to 1: lower-cover ≤ original (concrete) ≤ upper-cover; the bi-cover result coincides with the original]

  34. Correctness of the on-the-fly analysis
    [Diagram: the PASG example from before, with nodes n0 to n5, the cover edge, and commands c1 and c2]

  35. Correctness of the on-the-fly analysis
    • Refinement when n5 is expanded → the cover edge is removed
      → the green trace does not exist in the finished PASG
    • But an equivalent one exists!
    [Diagram: the refined PASG: abstract labels strengthened with y = 0, the cover edge removed, and an equivalent n4/n5 branch present in the finished graph]

  36. (Preliminary) Measurements on the QComp benchmarks

  37. Current state:
    • Upper / lower / bi-cover PASG
    • Full construction / BRTDP
    • Only for maximal probability
    • Explicit Value Domain
    → Implemented in the Theta model checker
    Future work: Predicate domain

  38. Thank you for your attention
