
Optimization of probabilistic argumentation with Markov processes

Emmanuel Hadoux
September 29, 2015


Talk given at the International Joint Conference on Artificial Intelligence (IJCAI15) and at the Journées Francophones sur la Planification, la Décision et l'Apprentissage pour la conduite de systèmes (JFPDA15), on 29 September 2015.


Transcript

  1. Optimization of probabilistic argumentation with Markov processes. E. Hadoux (1), A. Beynier (1), N. Maudet (1), P. Weng (2) and A. Hunter (3). Tue., Sept. 29th. (1) Sorbonne Universités, UPMC Univ Paris 6, UMR 7606, LIP6, F-75005, Paris, France; (2) SYSU-CMU Joint Institute of Engineering, Guangzhou, China, and SYSU-CMU Shunde International Joint Research Institute, Shunde, China; (3) Department of Computer Science, University College London, Gower Street, London WC1E 6BT, UK
  2–4. Introduction ∙ Debate argumentation problems between two agents ∙ Probabilistic executable logic to improve expressivity ∙ New class of problems: Argumentation Problem with Probabilistic Strategies (APS) (Hunter, 2014) ∙ Purpose of this work: optimize the sequence of arguments of one agent. There will be abuse of the word predicate! 1
  5–9. Formalization of a debate problem ∙ Turn-based game between two agents ∙ Rules to fire in order to attack arguments of the opponent and revise knowledge. Let us define a debate problem with: ∙ A, the set of arguments ∙ E, the set of attacks ∙ P = 2^A × 2^E, the public space gathering voiced arguments ∙ Two agents: agent 1 and agent 2 3
  10–14. Notation ∙ Arguments: literals (e.g., a, b, c) ∙ Attacks: e(x, y) if x attacks y ∙ Args. in public (resp. private) space: a(x) (resp. hi(x)) ∙ Goals: ∧_k g(xk) (resp. g(¬xk)) if xk is (resp. is not) accepted in the public space (Dung, 1995) ∙ Rules: prem ⇒ Pr(Acts) ∙ Premises: conjunctions of e(·, ·), a(·), hi(·) ∙ Acts: conjunctions of ⊞, ⊟ on e(·, ·), a(·) and ⊕, ⊖ on hi(·) 4
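
To make this notation concrete, here is a minimal sketch of how the predicates and probabilistic rules could be represented; the Python names (a, e, h, Rule) and the set-based state encoding are illustrative assumptions, not the implementation used in the paper.

```python
# Minimal sketch (illustrative, not the paper's implementation): predicates as
# tuples, a state as a set of predicates, and a rule prem => Pr(Acts) as a premise
# plus a distribution over acts (sets of add/delete updates).
from dataclasses import dataclass

def a(x):    return ("a", x)       # argument x is voiced in the public space
def e(x, y): return ("e", x, y)    # attack of x on y is voiced in the public space
def h(i, x): return ("h", i, x)    # argument x is in agent i's private space

@dataclass(frozen=True)
class Rule:
    premise: frozenset   # conjunction of predicates that must hold to fire the rule
    acts: tuple          # ((probability, updates), ...); updates is a frozenset of
                         # ("add", predicate) / ("del", predicate) pairs

    def fireable(self, state):
        return self.premise <= state

# Example: agent 1's first rule, h1(a) => +a(a), fired with probability 1.
rule_1 = Rule(premise=frozenset({h(1, "a")}),
              acts=((1.0, frozenset({("add", a("a"))})),))
```
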
  15. Formalization of an APS An APS is characterized (from the

    point of view of agent 1) by ⟨A, E, G, S1, g1, g2, S2, P, R1, R2⟩: ∙ A, E, P as specified above ∙ G, the set of all possible goals ∙ Si , the set of private states for agent i ∙ gi ∈ G, the given goal for agent i ∙ Ri , the set of rules for agent i 5
  16. Example: Arguments — Is e-sport a sport?
    a: E-sport is a sport
    b: E-sport requires focusing, precision and generates tiredness
    c: Not all sports are physical
    d: Sports not referenced by IOC exist
    e: Chess is a sport
    f: E-sport is not a physical activity
    g: E-sport is not referenced by IOC
    h: Working requires focusing and generates tiredness but is not a sport 6
  17–19. Example: Formalization
    ∙ A = {a, b, c, d, e, f, g, h}
    ∙ E = {e(f, a), e(g, a), e(b, f), e(c, f), e(h, b), e(g, c), e(d, g), e(e, g)}
    ∙ g1 = g(a)
    ∙ R1 = {
        h1(a) ⇒ ⊞a(a),
        h1(b) ∧ a(f) ∧ h1(c) ∧ e(b, f) ∧ e(c, f) ⇒ 0.5 : ⊞a(b) ∧ ⊞e(b, f) ∨ 0.5 : ⊞a(c) ∧ ⊞e(c, f),
        h1(d) ∧ a(g) ∧ h1(e) ∧ e(d, g) ∧ e(e, g) ⇒ 0.8 : ⊞a(e) ∧ ⊞e(e, g) ∨ 0.2 : ⊞a(d) ∧ ⊞e(d, g)
      } 7
  20. Example: Formalization
    ∙ R2 = {
        h2(h) ∧ a(b) ∧ e(h, b) ⇒ ⊞a(h) ∧ ⊞e(h, b),
        h2(g) ∧ a(c) ∧ e(g, c) ⇒ ⊞a(g) ∧ ⊞e(g, c),
        a(a) ∧ h2(f) ∧ h2(g) ∧ e(f, a) ⇒ 0.8 : ⊞a(f) ∧ ⊞e(f, a) ∨ 0.2 : ⊞a(g) ∧ ⊞e(g, a)
      }
    ∙ Initial state: h1(a, b, c, d, e), {}, h2(f, g, h) 8
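
Continuing the sketch above, one of agent 1's probabilistic rules and the initial state of the e-sport example could be encoded as follows; this is again illustrative, and the e(·, ·) premises are assumed to be available to the agent as knowledge of the attack relation E.

```python
# Illustrative encoding of agent 1's second rule (a 0.5/0.5 branch between voicing
# b and voicing c against f) and of the initial state h1(a..e), {}, h2(f, g, h),
# reusing the Rule class and the a/e/h constructors from the earlier sketch.
rule_2 = Rule(
    premise=frozenset({h(1, "b"), h(1, "c"), a("f"), e("b", "f"), e("c", "f")}),
    acts=(
        (0.5, frozenset({("add", a("b")), ("add", e("b", "f"))})),
        (0.5, frozenset({("add", a("c")), ("add", e("c", "f"))})),
    ),
)

s0 = frozenset({h(1, x) for x in "abcde"} | {h(2, x) for x in "fgh"})
```
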
  21. Attacks graph. Figure: Graph of arguments of the e-sport example (arguments a–h). 9
  22. Probabilistic Finite State Machine: Graph. APS → Probabilistic Finite State Machine. Figure: PFSM of the e-sport example (states σ1–σ12 with transition probabilities). 10
  23–26. Probabilistic Finite State Machine. To optimize the sequence of arguments for agent 1, we could optimize the PFSM, but: 1. it depends on the initial state 2. it requires knowledge of the private state of the opponent. Using Markov models, we can relax assumptions 1 and 2. Moreover, the APS formalization can be modified in order to comply with the Markov assumption. 11
  27. Markov Decision Process. A Markov Decision Process (MDP) (Puterman, 1994) is characterized by a tuple ⟨S, A, T, R⟩: ∙ S, a set of states, ∙ A, a set of actions, ∙ T : S × A → Pr(S), a transition function, ∙ R : S × A → ℝ, a reward function. 12
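
For concreteness, a generic value-iteration sketch over such a tuple is shown below; it is standard MDP machinery, not something specific to this work, and the dictionary encodings of T and R are assumptions of the sketch.

```python
# Generic value iteration for a finite MDP <S, A, T, R> (standard textbook method,
# not specific to the APS work). T[s][a] maps successor states to probabilities,
# R[s][a] is the immediate reward, gamma is the discount factor.
def value_iteration(S, A, T, R, gamma=0.95, eps=1e-6):
    V = {s: 0.0 for s in S}
    while True:
        V_new = {
            s: max(R[s][a] + gamma * sum(p * V[s2] for s2, p in T[s][a].items())
                   for a in A)
            for s in S
        }
        if max(abs(V_new[s] - V[s]) for s in S) < eps:
            return V_new
        V = V_new
```
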
  28. Partially-Observable Markov Decision Process. A Partially-Observable MDP (POMDP) (Puterman, 1994) is characterized by a tuple ⟨S, A, T, R, O, Q⟩: ∙ S, a set of states, ∙ A, a set of actions, ∙ T : S × A → Pr(S), a transition function, ∙ R : S × A → ℝ, a reward function, ∙ O, an observation set, ∙ Q : S × A → Pr(O), an observation function. 13
  29. Mixed-Observability Markov Decision Process. A Mixed-Observability MDP (MOMDP) (Ong et al., 2010) is characterized by a tuple ⟨Sv, Sh, A, T, R, Ov, Oh, Q⟩: ∙ Sv, Sh, the visible and hidden parts of the state, ∙ A, a set of actions, ∙ T : Sv × A × Sh → Pr(Sv × Sh), a transition function, ∙ R : Sv × A × Sh → ℝ, a reward function, ∙ Ov = Sv, an observation set on the visible part of the state, ∙ Oh, an observation set on the hidden part of the state, ∙ Q : Sv × A × Sh → Pr(Ov × Oh), an observation function. 14
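
The practical gain of the mixed-observability factoring is that a belief only needs to be maintained over Sh; the small sketch below illustrates this point (the class name and fields are illustrative, not part of the MOMDP definition).

```python
# Illustrative sketch of a MOMDP belief point: the visible part s_v is observed
# directly, so uncertainty (the distribution b_h) only concerns the hidden part
# S_h, instead of the full S_v x S_h as in a plain POMDP.
from dataclasses import dataclass, field

@dataclass
class MOMDPBelief:
    s_v: object                                # visible component, known exactly
    b_h: dict = field(default_factory=dict)    # {s_h: probability}

    def expectation(self, f):
        """Expected value of f(s_v, s_h) under the belief over the hidden part."""
        return sum(p * f(self.s_v, s_h) for s_h, p in self.b_h.items())
```
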
  30. Transformation to a MOMDP. An APS from the point of view of agent 1 can be transformed into a MOMDP: ∙ Sv = S1 × P, Sh = S2 ∙ A = {prem(r) ⇒ m | r ∈ R1 and m ∈ acts(r)} ∙ Ov = Sv and Oh = ∅ ∙ Q(⟨sv, sh⟩, a, ⟨sv⟩) = 1, otherwise 0 ∙ T, defined below 16
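
A sketch of this factoring, reusing the rule representation from the earlier sketch: each MOMDP action commits to one act m of one rule r ∈ R1, and a state splits into agent 2's private predicates (hidden part) and everything else (visible part). The helper names are illustrative assumptions.

```python
# Illustrative sketch of the APS-to-MOMDP mapping above, reusing the Rule class and
# the ("h", i, x) predicate encoding from the earlier sketch.
def momdp_actions(R1):
    # One action per (rule, act) pair: "fire rule r and commit to act m" (prem(r) => m).
    return [(rule, updates) for rule in R1 for _, updates in rule.acts]

def split_state(state):
    # Hidden part: agent 2's private predicates (S2); visible part: agent 1's
    # private predicates and the public space (S1 x P).
    hidden = frozenset(p for p in state if p[0] == "h" and p[1] == 2)
    return frozenset(state - hidden), hidden
```
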
  31. Transformation to a MOMDP: Transition function. Application set: let Cs(Ri) be the set of rules of Ri that can be fired in state s. The application set Fr(m, s) is the set of predicates resulting from the application of act m of a rule r on s. If r cannot be fired in s, Fr(m, s) = s. ∙ s, a state, and r : p ⇒ m, an action s.t. r ∈ A ∙ s′ = Fr(m, s) ∙ r′ ∈ Cs′(R2) s.t. r′ : p′ ⇒ [π1/m1, . . . , πn/mn] ∙ s′′_i = Fr′(mi, s′) ∙ T(s, r, s′′_i) = πi 17
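
The construction above can be sketched as follows, again reusing the earlier rule representation. How the opponent's rule r′ is chosen when several rules of R2 are fireable is not specified on the slide, so the sketch picks among them uniformly; that choice is an assumption of the sketch, not part of the definition.

```python
# Illustrative sketch of T: apply agent 1's chosen act (F_r(m, s)), then apply one
# fireable rule of agent 2, whose acts carry the branching probabilities pi_i.
def apply_act(state, updates):
    out = set(state)
    for op, pred in updates:
        out.add(pred) if op == "add" else out.discard(pred)
    return frozenset(out)

def transition(state, action, R2):
    rule, m = action
    s1 = apply_act(state, m) if rule.fireable(state) else state   # s' = F_r(m, s)
    fireable = [r2 for r2 in R2 if r2.fireable(s1)]               # C_{s'}(R2)
    if not fireable:
        return {s1: 1.0}
    dist = {}
    for r2 in fireable:                    # uniform choice among fireable rules (assumption)
        for pi, m2 in r2.acts:             # branch i has probability pi
            s2 = apply_act(s1, m2)         # s''_i = F_{r'}(m_i, s')
            dist[s2] = dist.get(s2, 0.0) + pi / len(fireable)
    return dist
```
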
  32. Reward function. For the reward function: ∙ with Dung's semantics: positive reward for each part of the goal that holds ∙ can be generalized: General Gradual Valuation (Cayrol and Lagasquie-Schiex, 2005) 18
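
As an illustration of the first option, the sketch below computes the grounded extension (Dung, 1995) of the public space and gives +1 for each goal literal that holds; the +1 weighting and the function names are assumptions of the sketch, not the paper's exact reward.

```python
# Illustrative goal-based reward: accept arguments under grounded semantics
# (Dung, 1995) and reward each satisfied goal literal.
def grounded_extension(args, attacks):
    attackers = {x: {u for (u, v) in attacks if v == x} for x in args}
    IN, OUT = set(), set()
    changed = True
    while changed:
        changed = False
        for x in args:
            if x not in IN and attackers[x] <= OUT:   # all attackers are defeated
                IN.add(x); changed = True
            if x not in OUT and attackers[x] & IN:    # attacked by an accepted argument
                OUT.add(x); changed = True
    return IN

def reward(public_args, public_attacks, goal):
    # goal: iterable of (argument, wanted) pairs; wanted=True for g(x), False for g(not x)
    accepted = grounded_extension(public_args, public_attacks)
    return sum(1 for x, wanted in goal if (x in accepted) == wanted)
```
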
  33. Transformation to a MOMDP. Model sizes: APS: 8 arguments, 8 attacks, 6 rules; POMDP: 4,294,967,296 states; MOMDP: 16,777,216 states. Intractable instances → we need to optimize at the root (reduce the APS itself before conversion). 19
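
The two state counts quoted above are exact powers of two; the slide does not spell out the factorisation into predicates, but the quick check below confirms the raw sizes and the factor between the two encodings.

```python
# Quick check of the raw state counts quoted above; the factor 2**8 = 256 between
# the two encodings is all this verifies, not the exact predicate breakdown.
assert 2 ** 32 == 4_294_967_296   # POMDP state count
assert 2 ** 24 == 16_777_216      # MOMDP state count
print(2 ** 32 // 2 ** 24)          # 256
```
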
  34. Solving an APS. Two algorithms to solve MOMDPs: ∙ MO-IP (Araya-López et al., 2010), Incremental Pruning for POMDPs adapted to MOMDPs (exact method) ∙ MO-SARSOP (Ong et al., 2010), SARSOP for POMDPs adapted to MOMDPs (approximate but very efficient method). Two kinds of optimizations: with or without dependencies on the initial state 21
  35. Optimizations without dependencies. Irr. prunes irrelevant arguments; Enth. infers attacks; Dom. removes dominated arguments. These come with a guarantee on the uniqueness and optimality of the solution. 22
  36. Attacks graph — Argument dominance: if an argument is attacked by an unattacked argument, it is dominated. Figure: Attacks graph of the e-sport example. 23
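
As an illustration of the Dom. optimization, the sketch below removes every argument attacked by an unattacked argument, using plain (attacker, target) pairs; the function name and the single-pass formulation are assumptions of the sketch.

```python
# Illustrative dominance pruning: on the e-sport graph, h, d and e are unattacked,
# so the arguments they attack (b and g) are dominated and can be removed together
# with their incident attacks before building the MOMDP.
def prune_dominated(args, attacks):
    attacked = {v for (_, v) in attacks}
    unattacked = args - attacked
    dominated = {v for (u, v) in attacks if u in unattacked}
    kept = args - dominated
    return kept, {(u, v) for (u, v) in attacks if u in kept and v in kept}

args = set("abcdefgh")
attacks = {("f", "a"), ("g", "a"), ("b", "f"), ("c", "f"),
           ("h", "b"), ("g", "c"), ("d", "g"), ("e", "g")}
print(prune_dominated(args, attacks))   # b and g are pruned
```
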
  37–38. Optimization with dependencies. Irr(s0) has to be reapplied each time the initial state changes.
    1. For each predicate that is never modified but is used as a premise:
      1.1 Remove all the rules that are not compatible with the value of this predicate in the initial state.
      1.2 For all remaining rules, remove the predicate from the premises.
    2. For each remaining action of agent 1, track the rules of agent 2 compatible with the application of this action. If a rule of agent 2 is not compatible with any application of an action of agent 1, remove it. 24
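
Step 1 could look like the sketch below, reusing the rule representation from the earlier sketch; treating every premise as a positive literal, and the function name, are assumptions of the sketch.

```python
# Illustrative sketch of step 1 of Irr(s0): a predicate that no act ever modifies
# keeps the truth value it has in the initial state s0, so rules requiring it while
# it is false can never fire, and it can be dropped from the remaining premises.
def simplify_static_predicates(rules, s0):
    modified = {pred for r in rules for _, updates in r.acts for _, pred in updates}
    kept = []
    for r in rules:
        static = {p for p in r.premise if p not in modified}
        if static <= s0:                                   # compatible with s0
            kept.append(Rule(premise=r.premise - static, acts=r.acts))
        # otherwise the rule is incompatible with s0 and is removed (step 1.1)
    return kept
```
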
  39. Experiments. We computed a solution for the e-sport problem with: ∙ MO-IP, which did not finish after tens of hours ∙ MO-SARSOP without optimizations, which did not finish either ∙ MO-SARSOP with optimizations, which found the optimal solution in 4 seconds 26
  40. Experiments: Policy graph. Figure: Policy graph for the e-sport example (nodes are rules r1_{i,j} fired by agent 1, or ∅; edges are labelled with observations o1–o8). 27
  41. Experiments: More examples

            None     Irr.    Enth.   Dom.    Irr(s0)   All
    Ex 1    —        —       —       —       —         0.56
    Ex 2    3.3      0.3     0.3     0.4     0         0
    Dv.     —        —       —       —       —         32
    6       1313     22      43      7       2.4       0.9
    7       —        180     392     16      20        6.7
    8       —        —       —       —       319       45
    9       —        —       —       —       —         —

    Table: Computation time (in seconds) 28
  42. Conclusion. We presented: 1. A new framework to represent more complex debate problems (APS) 2. A method to transform those problems into a MOMDP 3. Several optimizations that can also be used outside the MOMDP context 4. A method to optimize the actions of an agent in an APS 30
  43. Perspectives We are currently working on using POMCP (Silver and

    Veness, 2010). We are also using HS3MDPs (Hadoux et al., 2014). 31
  44. Bibliography I Araya-López, M., Thomas, V., Buffet, O., and Charpillet,

    F. (2010). A closer look at MOMDPs. In 22nd IEEE International Conference on Tools with Artificial Intelligence (ICTAI). Cayrol, C. and Lagasquie-Schiex, M.-C. (2005). Graduality in argumentation. Journal of Artificial Intelligence Research (JAIR), 23:245–297. Dung, P. M. (1995). On the acceptability of arguments and its fundamental role in nonmonotonic reasoning, logic programming and n-person games. Artificial Intelligence, 77(2):321–358. 33
  45. Bibliography II. Hadoux, E., Beynier, A., and Weng, P. (2014). Solving Hidden-Semi-Markov-Mode Markov Decision Problems. In Straccia, U. and Calì, A., editors, Scalable Uncertainty Management, volume 8720 of Lecture Notes in Computer Science, pages 176–189. Springer International Publishing. Hunter, A. (2014). Probabilistic strategies in dialogical argumentation. In International Conference on Scalable Uncertainty Management (SUM'14), LNCS volume 8720. Ong, S. C., Png, S. W., Hsu, D., and Lee, W. S. (2010). Planning under uncertainty for robotic tasks with mixed observability. The International Journal of Robotics Research. 34
  46. Bibliography III Puterman, M. L. (1994). Markov Decision Processes: discrete

    stochastic dynamic programming. John Wiley & Sons. Silver, D. and Veness, J. (2010). Monte-Carlo planning in large POMDPs. In Proceedings of the 24th Conference on Neural Information Processing Systems (NIPS), pages 2164–2172. 35