
Adversarial Search 2012

Mat Roscoe
November 29, 2012


A lecture for the course Principles of Cognitive Robotics at Bonn-Rhein-Sieg University of Applied Sciences.


Transcript

  1. Slide 1   b-it-bots   Adversarial Search
     Matthew S Roscoe
     Principles of Cognitive Robotics, November 29th, 2012
  2. Slide 2   b-it-bots   Questions
     From you:
     • Politely interrupt me: raise a hand, speak up, etc.
     From me:
     • I will randomly pick someone using the numbers you were given. We will not move on until we can sum up the question (no man/woman left behind).
     • I need someone to record the questions and answers given during the class so that I can verify the answers and redistribute them as review material.
  3. Slide 3   b-it-bots   Material Sources
     • Artificial Intelligence: A Modern Approach (AIMA), 2nd Edition (Ch. 6)
     • AIMA, 3rd Edition (Ch. 5)
     • Game Theory for Applied Economists, Robert Gibbons (Princeton University Press)
       • PDF (pre-print) available on request.
  4. Slide 4   b-it-bots   Adversarial Search
     Examining the problems that occur when we try to plan ahead in a world where other agents are planning against us.
  5. Slide 5   b-it-bots   Requirements
     • Assumed understanding:
       • Viewing the actors in a problem as agents (Ch. 2)
       • Viewing problems as games (game theory)
       • Recursion (basic computer science)
     • New knowledge:
       • Using search to "win" games.
  6. Slide 6   b-it-bots   (REVIEW) What are Agents?
     • The "thing" trying to solve the problem.
     • It perceives and acts in its environment.
     • An "agent function" maps a "percept sequence" to the agent's response.
     • Performance measures determine how "well" the agent is doing (happiness).
     • Reflex vs. model-based.
     • Room for improvement through learning (covered in a later course).
  7. Slide 7   b-it-bots   (REVIEW) What are Games?
     • A representation of a "real" world scenario and all of the possible decisions.
     • Multi-agent!
     • A mapping from decisions (agent functions) to outcomes (performance measures).
     • Trying to pick the group of decisions that leads to the best possible outcome (not always winning).
  8. Slide 8   b-it-bots   EXAMPLE Game: Prisoner's Dilemma
     • Two prisoners in separate rooms.
     • Both are being interrogated and are offered two choices: stay quiet or confess.
     • The payoff for each (see the sketch below):

                              Player 2: Quiet              Player 2: Confess
       Player 1: Quiet        P1: 1 month,  P2: 1 month    P1: 1 year,   P2: goes free
       Player 1: Confess      P1: goes free, P2: 1 year    P1: 3 months, P2: 3 months
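A minimal sketch of how this payoff table could be encoded, assuming we score each outcome as months served (so "goes free" counts as 0 and lower is better); the dictionary layout and the best_response helper are illustrative choices, not part of the slides:

```python
# Hypothetical encoding of the Prisoner's Dilemma payoffs shown on the slide.
# Payoffs are months served (lower is better); "goes free" = 0, "1 year" = 12.
PAYOFFS = {
    # (player1_action, player2_action): (player1_months, player2_months)
    ("quiet",   "quiet"):   (1,  1),
    ("quiet",   "confess"): (12, 0),
    ("confess", "quiet"):   (0, 12),
    ("confess", "confess"): (3,  3),
}

def best_response(opponent_action):
    """Player 1's best reply to a fixed Player 2 action (fewest months served)."""
    return min(("quiet", "confess"),
               key=lambda a: PAYOFFS[(a, opponent_action)][0])

if __name__ == "__main__":
    for opp in ("quiet", "confess"):
        print(f"If Player 2 plays {opp}, Player 1 should {best_response(opp)}")
    # Confessing is the better reply either way, even though (quiet, quiet)
    # would leave both players better off.
```

Running it shows that confessing is the better reply to either opponent choice, which is exactly the tension the example is meant to illustrate.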
  9. Slide 9   b-it-bots   How Do We Use This?
     Problems we have seen so far:
     • Sudoku
     • Traveling Salesman
     • N-Puzzle problems
     • etc.
 10. Slide 11   b-it-bots   What Does This Mean?
     • We can no longer make decisions the same way we did before.
     • We need to consider what the other players (agents) might do.
     • This will affect our searching strategies!
 11. Slide 12   b-it-bots   Adversarial Search
     We will be searching through the state space of a specific type of game, namely:
     • Game theory: two-player, deterministic, co-operative/competitive, turn-taking, zero-sum games of perfect information.
     • Artificial intelligence: two-agent, deterministic, (competitive), turn-taking, zero-sum, fully observable.
 12. Slide 13   b-it-bots   FORMAL DEFINITION
     • An adversarial search game is a search problem that contains the following elements (see the sketch below):
       • Initial state
       • Player(s)
       • Action(s)
       • Result(s)
       • Terminal test
       • Utility function (payoff function, objective function)
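The same elements can be written down as an abstract game interface. This is only a sketch under my own naming (AIMA's pseudocode uses similar functions, but the class below is not taken from the book or the slides):

```python
# Illustrative skeleton of the elements listed above; names are my own choice.
class Game:
    def initial_state(self):
        """The board/position the game starts in."""
        raise NotImplementedError

    def player(self, state):
        """Whose turn it is in this state (e.g. 'MAX' or 'MIN')."""
        raise NotImplementedError

    def actions(self, state):
        """Legal moves available in this state."""
        raise NotImplementedError

    def result(self, state, action):
        """The state that results from playing `action` in `state`."""
        raise NotImplementedError

    def terminal_test(self, state):
        """True when the game is over in this state."""
        raise NotImplementedError

    def utility(self, state, player):
        """Numeric payoff for `player` in a terminal state (e.g. +1 / 0 / -1)."""
        raise NotImplementedError
```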
 13. Slide 14   b-it-bots   Direction
     • Where we are going with this:
       • Min-Max (minimax) search algorithm
       • Alpha-beta search algorithm
         • Heavy focus on parsing with this one!
       • Chance and imperfection
         • If we have time
         • You still need to cover this for the course!
 14. Slide 15   b-it-bots   MiniMax Searching
     • The Prisoner's Dilemma was a "small" game:
       • Two players, two options each
       • 4 possible outcomes
       • A possible outcome is called a "terminal state".
     • AIMA looks at another "small" game, Tic-Tac-Toe:
       • 362,880 terminal nodes
       • Chess has 10^40 terminal nodes!
 15. Slide 16   b-it-bots   Tic-Tac-Toe
     Image courtesy of Bob Felts (http://stablecross.com/files/Mechanics_of_Morality.html)
 16. Slide 17   b-it-bots   Games & Search Trees
     • Before, when we used a search tree, we would examine the "whole" state space.
     • In games we do not have this luxury.
     • For now we will describe a search tree as a tree "superimposed" on our state space that lets us see enough nodes for a player to determine their next move.
 17. Slide 18   b-it-bots   Games & Optimal Decisions
     • Before, an optimal solution was one that leads us to the goal or a "win" condition.
     • With games this is different.
     • But why?
 18. Slide 19   b-it-bots   Min-Max (minimax) Theorem
     For every two-person zero-sum game with finitely many strategies, there exists a value V and a mixed strategy for each player such that:
     ① Given Player 2's strategy, the best payoff possible for Player 1 is V, and
     ② Given Player 1's strategy, the best payoff possible for Player 2 is -V.
 19. Slide 20   b-it-bots   Min-Max (minimax)
     • We have two players: "Min" and "Max" (us).
     • Now we need to generate the state space:
       • Recursively generate all possible moves for each turn until we reach a game ending.
     • Given three possible terminal values (-1, 1, 0):
       • -1: lose, 1: win, 0: draw
     • Max (we) will try to maximize the game score (e.g. push it towards 1).
     • Min will try to minimize the game score (e.g. push it towards -1).
 20. Slide 21   b-it-bots   Min-Max (minimax)
     • We have assigned a "value" to the final possible boards (1, 0, -1).
       • Note: the values do not have to be 1, 0, -1; I simply picked them to demonstrate the point.
     • How do we assign values along the way (e.g. from our initial state to our goal state)?
     • Keep in mind that, for now, both players are perfect.
 21. Slide 22   b-it-bots   Min-Max (minimax)
     • Because each player is perfect, they will always make the best decision for themselves at that point.
     • To deal with this, we simply give each "game" (board) the best possible value that can result from any of the moves available at the current position.
     • We do this recursively, starting at the end positions and propagating values "upwards" (a minimal sketch follows below).
     • Note that the branching factor commonly changes from one move to another, as one player's moves affect the possible moves of the other.
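A minimal sketch of that recursive value propagation, written against the hypothetical Game interface sketched after slide 13 (the function names and the "MAX"/"MIN" labels are my assumptions, not the lecture's code):

```python
def minimax_value(game, state):
    """Value of `state` if both players play perfectly from here on."""
    if game.terminal_test(state):
        return game.utility(state, "MAX")          # scored from MAX's point of view
    values = (minimax_value(game, game.result(state, a))
              for a in game.actions(state))
    # MAX picks the child with the highest value, MIN the lowest.
    return max(values) if game.player(state) == "MAX" else min(values)

def minimax_decision(game, state):
    """The move MAX should make in `state`."""
    return max(game.actions(state),
               key=lambda a: minimax_value(game, game.result(state, a)))
```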
 22. Slide 26   b-it-bots   Min-Max Simple Example
     [Tree diagram: an Initial (Max) node, a layer of Min nodes, and the terminal states 6, 2, 1, 9, 4. Populate the terminal states using the utility function.]
 23. Slide 27   b-it-bots   Min-Max Simple Example
     [Tree diagram: working backward, determine the minimax value of each Min node (2 and 1).]
 24. Slide 28   b-it-bots   Min-Max Simple Example
     [Tree diagram: working backward, determine the minimax value of the Max/initial node (2). A worked check of these numbers follows below.]
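The backing-up can be checked directly on this toy tree. The nesting below is my reading of the slides' figure, assuming a Max root with two Min children whose leaves carry the terminal utilities 6, 2 and 1, 9, 4:

```python
# Assumed layout of the example tree from slides 26-28: the root (Max) has two
# Min children; their leaves hold the terminal utilities 6, 2 and 1, 9, 4.
min_children = [[6, 2], [1, 9, 4]]

min_values = [min(leaves) for leaves in min_children]   # values at the Min nodes -> [2, 1]
root_value = max(min_values)                            # value at the Max/initial node -> 2
print(min_values, root_value)
```

This reproduces the minimax values 2 and 1 at the Min layer and the value 2 at the root shown on the slides.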
 25. Slide 30   b-it-bots   Tic-Tac-Toe (…again)
     Image courtesy of Bob Felts (http://stablecross.com/files/Mechanics_of_Morality.html)
 26. Slide 31   b-it-bots   We Need Optimization!
     • Depth is the enemy of MiniMax (DFS?)
     • Time complexity: O(b^m)
     • Space complexity:
       • O(bm) if you generate all states at once (!)
       • O(m) if you generate one state at a time
     (b = legal moves from a node, m = depth of the search tree)
 27. Slide 32   b-it-bots   α-β Pruning
     • A rather simple optimization.
     • Given the same game, it must reach the same decision as minimax.
     • However, it will not generate states or branches that can have no effect on the end result.
 28. Slide 33   b-it-bots   α-β Pruning
     It is called α-β pruning because of the following parameters:
     • α = the value of the best (i.e., highest-value) choice we have found so far at any choice point along the path for MAX.
     • β = the value of the best (i.e., lowest-value) choice we have found so far at any choice point along the path for MIN.
 29. Slide 34   b-it-bots   α-β Pruning
     • The algorithm searches through the tree, updating the values of α and β as it goes (see the sketch below).
     • If the value of the current node is worse than α or β for Max or Min (respectively), we "prune" the search tree.
     • In this case pruning means we terminate that particular recursive call.
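A sketch of the pruned recursion, again on the hypothetical Game interface used above; it follows the standard α-β scheme rather than copying the lecture's or AIMA's pseudocode verbatim:

```python
def alphabeta_value(game, state, alpha=float("-inf"), beta=float("inf")):
    """Minimax value of `state`, skipping branches that cannot affect the result."""
    if game.terminal_test(state):
        return game.utility(state, "MAX")

    if game.player(state) == "MAX":
        value = float("-inf")
        for a in game.actions(state):
            value = max(value, alphabeta_value(game, game.result(state, a), alpha, beta))
            if value >= beta:          # MIN above us would never allow this: prune
                return value
            alpha = max(alpha, value)  # best option found so far for MAX
        return value
    else:
        value = float("inf")
        for a in game.actions(state):
            value = min(value, alphabeta_value(game, game.result(state, a), alpha, beta))
            if value <= alpha:         # MAX above us would never allow this: prune
                return value
            beta = min(beta, value)    # best option found so far for MIN
        return value
```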
 30. Slide 36   b-it-bots   α-β Pruning vs. MiniMax
     MiniMax:
     • Time complexity: O(b^m)
     • Space complexity: O(bm)
     α-β pruning:
     • Time complexity: O(b^(m/2)) with good move ordering (worst case still O(b^m))
     • Space complexity: O(bm)
 31. Slide 37   b-it-bots   α-β Pruning Optimizations
     • Transposition tables
       • A hash map of all the positions we have been in before. As noted earlier, revisiting nodes can cause an exponential explosion in both time and space complexity. We can use transposition tables to avoid this increase (a sketch follows below).
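One hand-rolled way to add such a table is to memoise the value function on a hashable encoding of the position. The `encode` argument below is a placeholder (real engines typically use something like Zobrist hashing), and the whole function is an illustrative sketch rather than the lecture's implementation:

```python
def minimax_with_table(game, state, table=None, encode=repr):
    """Minimax value with a transposition table, so positions reachable by
    more than one move order are only searched once."""
    if table is None:
        table = {}
    key = encode(state)                # placeholder: any hashable encoding works
    if key in table:
        return table[key]              # position already evaluated: reuse it
    if game.terminal_test(state):
        value = game.utility(state, "MAX")
    else:
        values = [minimax_with_table(game, game.result(state, a), table, encode)
                  for a in game.actions(state)]
        value = max(values) if game.player(state) == "MAX" else min(values)
    table[key] = value
    return value
```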
 32. Slide 38   b-it-bots   Something is still wrong
     • Minimax helps us deal with opponents (but is inefficient in time and space).
     • α-β pruning helps us deal with the space and time costs of MiniMax.
     • But α-β still requires us to search down to the terminal states!
 33. Slide 39   b-it-bots   Imperfect Decisions
     • If we do not have the time or power to search the full tree before a decision must be made, we enter the realm of imperfect decision making.
     • Note: this is not guessing (we are not there yet…).
 34. Slide 40   b-it-bots   Imperfect Decisions
     • Replace the Utility function with an EVAL function.
     • Replace the TerminalTest function with a CutoffTest function.
 35. Slide 41   b-it-bots   The EVAL function
     1. Must evaluate terminal states in the same way that the utility function does.
     2. Computation must be fast (the whole point now is speed!).
     3. For nonterminal states, EVAL should be closely tied to our chances of winning.
        • This is typically done with weighted linear functions (read AIMA for more!).
 36. Slide 42   b-it-bots   The CUTOFF function
     1. This is normally done by checking a certain depth (have we gone too far?). A combined sketch of EVAL and the cutoff test follows below.
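Putting slides 40-42 together, a depth-limited variant might look like the sketch below; `eval_fn`, `max_depth`, and the function name are assumptions about what a concrete implementation would plug in, not something prescribed by the slides:

```python
def h_minimax(game, state, depth, eval_fn, max_depth=4):
    """Minimax with an evaluation function and a depth cutoff instead of
    searching all the way down to terminal states."""
    if game.terminal_test(state):
        return game.utility(state, "MAX")
    if depth >= max_depth:             # CutoffTest: stop searching here
        return eval_fn(state)          # EVAL: heuristic estimate of the utility
    values = (h_minimax(game, game.result(state, a), depth + 1, eval_fn, max_depth)
              for a in game.actions(state))
    return max(values) if game.player(state) == "MAX" else min(values)
```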
 37. Slide 43   b-it-bots   Alternative Optimization Ideas?
     • Forward pruning
       • Get rid of bad ideas before we chase them down?
     • Using iterative deepening search
     • Evaluating loss vs. gain
 38. Slide 44   b-it-bots   WARNING!
     • The more of these techniques you employ, the "smarter" your algorithms become.
     • But be warned: they are more prone to errors! Each method approximates some aspect instead of fully investigating it.
 39. Slide 45   b-it-bots   Games of Chance
     • What if we don't know what is coming, or what if the rules can change as the game progresses?
     • Note: okay, now we are guessing (but in an educated way).
 40. Slide 46   b-it-bots   Games of Chance
     • Remember our Min-Max search trees?
     • Now we will add a new type of node called the chance node.
 41. Slide 48   b-it-bots   Expected MiniMax
     • Essentially, we replace our MiniMax value with an ExpectedMiniMax value.
     • P(s) is the probability of choosing that particular node.
     • (The defining equation is given on the next slide.)
 42. Slide 49   b-it-bots   Expected MiniMax
     • If the node is terminal: Utility(node)
     • If the node is a Max node: max over s ∈ Successors(node) of ExpectedMiniMax(s)
     • If the node is a Min node: min over s ∈ Successors(node) of ExpectedMiniMax(s)
     • If the node is a chance node: Σ over s ∈ Successors(node) of P(s) · ExpectedMiniMax(s)

     $$\mathrm{ExpectedMiniMax}(n) =
       \begin{cases}
         \mathrm{Utility}(n) & \text{if } n \text{ is terminal} \\
         \max_{s \in \mathrm{Successors}(n)} \mathrm{ExpectedMiniMax}(s) & \text{if } n \text{ is a Max node} \\
         \min_{s \in \mathrm{Successors}(n)} \mathrm{ExpectedMiniMax}(s) & \text{if } n \text{ is a Min node} \\
         \sum_{s \in \mathrm{Successors}(n)} P(s)\,\mathrm{ExpectedMiniMax}(s) & \text{if } n \text{ is a chance node}
       \end{cases}$$
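A sketch of the same case analysis in code. On top of the hypothetical Game interface used earlier it assumes a `chance_outcomes(state)` method returning (probability, successor) pairs and a "CHANCE" player label, both of which are my additions:

```python
def expected_minimax(game, state):
    """ExpectedMiniMax value of `state`; chance nodes average over their outcomes."""
    if game.terminal_test(state):
        return game.utility(state, "MAX")
    who = game.player(state)
    if who == "CHANCE":
        # Weighted average over the possible random outcomes.
        # `chance_outcomes` is an assumed extension of the Game interface.
        return sum(p * expected_minimax(game, s)
                   for p, s in game.chance_outcomes(state))
    values = (expected_minimax(game, game.result(state, a))
              for a in game.actions(state))
    return max(values) if who == "MAX" else min(values)
```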
 43. Slide 50   b-it-bots   Expected MiniMax
     • While we can now deal with chance, this comes at a cost.
     • Original MiniMax time complexity: O(b^m)
     • ExpectedMiniMax time complexity: O(b^m n^m)
     • b = branching factor, m = depth, n = number of distinct chance outcomes
 44. Slide 51   b-it-bots   Summary I
     MiniMax searching:
     • An improvement on our DFS algorithms; allows us to deal with multi-agent game scenarios.
     • Time complexity: O(b^m)
     • Space complexity: O(bm)
     α-β pruning search:
     • An improvement on the MiniMax algorithm, designed to alleviate the time and space costs that MiniMax incurs.
     • Time complexity: O(b^(m/2)) with good move ordering
     • Space complexity: O(bm)
 45. Slide 52   b-it-bots   Summary II
     Imperfect decisions:
     • We need to develop the ability to deal with situations where we cannot explore the full search tree.
     • We do this by changing the following:
       • utility_function → EVAL
       • terminal_test → cutoff_function
     Decisions with chance:
     • We need to be able to deal with changing scenarios or random events.
     • We do this by changing the following:
       • utility_function → ExpectedMiniMax
 46. Slide 55   b-it-bots   Google AI Challenge 2011
     • Create an ant that is able to survive, reproduce, and conquer the other ants!
     • We will work on it during the lab session as well as an assignment.
     • You can play against each other very easily!
     • Work in groups of 2!
     • http://www.aichallenge.org
 47. Slide 56   b-it-bots   Objectives
     Beginners (work in groups of 2):
     • Build an ant that can do the following:
       1. Survive a full game.
       2. Hunt for food.
       3. Guard its home nest.
       4. Attack the enemy.
     • Testing map: Tutorial1.map
     Advanced (work alone!):
     • Build an ant that can do the following:
       1. Survive a full game.
       2. Hunt for food.
       3. Guard its home nest.
       4. Attack the enemy.
     • Testing map: random_walk_07.map
 48. Slide 57   b-it-bots   Evaluation
     For marks:
     • Ability to complete the tasks:
       • Survive a game: 2 marks
       • Hunt for food: 2 marks
       • Guard the home base: 1 mark
       • Attack other ants: 1 mark
     • Implementation of algorithms:
       • Is it adversarial: 2 marks
     For fun (bonus):
     • Competition to see which ant is the best!
     • Winner gets an extra mark!
     Total: 8 marks (+1 bonus)
 49. Slide 58   b-it-bots   Please use the rest of the lab time to work in groups on the assignment. ASK QUESTIONS!!