
Adversarial Search 2012

Mat Roscoe
November 29, 2012


A lecture for the course Principles of Cognitive Robotics at Bonn-Rhein-Sieg University of Applied Sciences.


Transcript

  1. Slide 1   b-it-bots   Adversarial Search
     Matthew S Roscoe
     Principles of Cognitive Robotics, November 29th, 2012
  2. Slide 2   b-it-bots   Questions
     From you:
     • Politely interrupt me: raise a hand, speak up, etc.
     From me:
     • I will randomly pick someone using the numbers you were given. We will not move on until we can sum up the question (no man/woman left behind).
     • I need someone to record the questions and answers given during the class so that I can verify the answers and redistribute them as review material.
  3. Slide 3   b-it-bots   Material Sources
     • Artificial Intelligence: A Modern Approach (AIMA), 2nd Edition (Ch. 6)
     • AIMA, 3rd Edition (Ch. 5)
     • Game Theory for Applied Economists, Robert Gibbons (Princeton University Press)
       • PDF (pre-print) available on request.
  4. Slide 4   b-it-bots   Adversarial Search
     Examining the problems that occur when we try to plan ahead in a world where other agents are planning against us.
  5. Slide 5   b-it-bots   Requirements
     • Assumed understanding:
       • Viewing the actors in a problem as agents (Ch. 2)
       • Viewing problems as games (game theory)
       • Recursion (basic computer science)
     • New knowledge:
       • Using search to "win" games.
  6. Slide 6   b-it-bots   (REVIEW) What are Agents?
     • The "thing" trying to solve the problem.
     • It perceives and acts in its environment.
     • An "agent function" maps a "percept sequence" to the agent's response.
     • Performance measures determine how "well" the agent is doing (happiness).
     • Reflex vs. model-based.
     • Room for improvement through learning (covered in a later course).
  7. Slide 7   b-it-bots   (REVIEW) What are Games?
     • A representation of a "real" world scenario and all of the possible decisions.
     • Multi-agent!
     • A mapping from decisions (agent functions) to outcomes (performance measures).
     • Trying to pick the group of decisions that leads to the best possible outcome (not always winning).
  8. Slide 8   b-it-bots   EXAMPLE Game: Prisoner's Dilemma
     • Two prisoners in separate rooms.
     • Both are being interrogated and are offered two choices: stay quiet or confess.
     • The payoff for each (see the sketch below):

                              Player 2: Quiet              Player 2: Confess
       Player 1: Quiet        P1: 1 month,  P2: 1 month    P1: 1 year,   P2: goes free
       Player 1: Confess      P1: goes free, P2: 1 year    P1: 3 months, P2: 3 months
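A minimal sketch of how this payoff table could be encoded, assuming we score each outcome as months served (so "goes free" counts as 0 and lower is better); the dictionary layout and the best_response helper are illustrative choices, not part of the slides:

```python
# Hypothetical encoding of the Prisoner's Dilemma payoffs shown on the slide.
# Payoffs are months served (lower is better); "goes free" = 0, "1 year" = 12.
PAYOFFS = {
    # (player1_action, player2_action): (player1_months, player2_months)
    ("quiet",   "quiet"):   (1,  1),
    ("quiet",   "confess"): (12, 0),
    ("confess", "quiet"):   (0, 12),
    ("confess", "confess"): (3,  3),
}

def best_response(opponent_action):
    """Player 1's best reply to a fixed Player 2 action (fewest months served)."""
    return min(("quiet", "confess"),
               key=lambda a: PAYOFFS[(a, opponent_action)][0])

if __name__ == "__main__":
    for opp in ("quiet", "confess"):
        print(f"If Player 2 plays {opp}, Player 1 should {best_response(opp)}")
    # Confessing is the better reply either way, even though (quiet, quiet)
    # would leave both players better off.
```

Running it shows that confessing is the better reply to either opponent choice, which is exactly the tension the example is meant to illustrate.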
  9. Slide 9   b-it-bots   How Do We Use This?
     Problems we have seen so far:
     • Sudoku
     • Traveling Salesman
     • N-Puzzle problems
     • etc.
 10. Slide 11   b-it-bots   What Does This Mean?
     • We can no longer make decisions the same way we did before.
     • We need to consider what the other players (agents) might do.
     • This will affect our searching strategies!
 11. Slide 12   b-it-bots   Adversarial Search
     We will be searching through the state space of a specific type of game, namely:
     • Game theory: two-player, deterministic, co-operative/competitive, turn-taking, zero-sum games of perfect information.
     • Artificial intelligence: two-agent, deterministic, (competitive), turn-taking, zero-sum, fully observable.
 12. Slide 13   b-it-bots   FORMAL DEFINITION
     • An adversarial search game is a search problem that contains the following elements (see the sketch below):
       • Initial state
       • Player(s)
       • Action(s)
       • Result(s)
       • Terminal test
       • Utility function (payoff function, objective function)
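The same elements can be written down as an abstract game interface. This is only a sketch under my own naming (AIMA's pseudocode uses similar functions, but the class below is not taken from the book or the slides):

```python
# Illustrative skeleton of the elements listed above; names are my own choice.
class Game:
    def initial_state(self):
        """The board/position the game starts in."""
        raise NotImplementedError

    def player(self, state):
        """Whose turn it is in this state (e.g. 'MAX' or 'MIN')."""
        raise NotImplementedError

    def actions(self, state):
        """Legal moves available in this state."""
        raise NotImplementedError

    def result(self, state, action):
        """The state that results from playing `action` in `state`."""
        raise NotImplementedError

    def terminal_test(self, state):
        """True when the game is over in this state."""
        raise NotImplementedError

    def utility(self, state, player):
        """Numeric payoff for `player` in a terminal state (e.g. +1 / 0 / -1)."""
        raise NotImplementedError
```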
 13. Slide 14   b-it-bots   Direction
     • Where we are going with this:
       • Min-Max (minimax) search algorithm
       • Alpha-beta search algorithm
         • Heavy focus on parsing with this one!
       • Chance and imperfection
         • If we have time
         • You still need to cover this for the course!
 14. Slide 15   b-it-bots   MiniMax Searching
     • The Prisoner's Dilemma was a "small" game:
       • Two players, two options each
       • 4 possible outcomes
       • A possible outcome is called a "terminal state".
     • AIMA looks at another "small" game, Tic-Tac-Toe:
       • 362,880 terminal nodes
       • Chess has 10^40 terminal nodes!
 15. Slide 16   b-it-bots   Tic-Tac-Toe
     Image courtesy of Bob Felts (http://stablecross.com/files/Mechanics_of_Morality.html)
 16. Slide 17   b-it-bots   Games & Search Trees
     • Before, when we used a search tree, we would examine the "whole" state space.
     • In games we do not have this luxury.
     • For now we will describe a search tree as a tree "superimposed" on our state space that lets us see enough nodes for a player to determine their next move.
 17. Slide 18   b-it-bots   Games & Optimal Decisions
     • Before, an optimal solution was one that leads us to the goal or a "win" condition.
     • With games this is different.
     • But why?
 18. Slide 19   b-it-bots   Min-Max (minimax) Theorem
     For every two-person zero-sum game with finitely many strategies, there exists a value V and a mixed strategy for each player such that:
     ① Given Player 2's strategy, the best payoff possible for Player 1 is V, and
     ② Given Player 1's strategy, the best payoff possible for Player 2 is -V.
 19. Slide 20   b-it-bots   Min-Max (minimax)
     • We have two players: "Min" and "Max" (us).
     • Now we need to generate the state space:
       • Recursively generate all possible moves for each turn until we reach a game ending.
     • Given three possible terminal values (-1, 1, 0):
       • -1: lose, 1: win, 0: draw
     • Max (we) will try to maximize the game score (e.g. push it towards 1).
     • Min will try to minimize the game score (e.g. push it towards -1).
 20. Slide 21   b-it-bots   Min-Max (minimax)
     • We have assigned a "value" to the final possible boards (1, 0, -1).
       • Note: the values do not have to be 1, 0, -1; I simply picked them to demonstrate the point.
     • How do we assign values along the way (e.g. from our initial state to our goal state)?
     • Keep in mind that, for now, both players are perfect.
 21. Slide 22   b-it-bots   Min-Max (minimax)
     • Because each player is perfect, they will always make the best decision for themselves at that point.
     • To deal with this, we simply give each "game" (board) the best possible value that can result from any of the moves available at the current position.
     • We do this recursively, starting at the end positions and propagating values "upwards" (a minimal sketch follows below).
     • Note that the branching factor commonly changes from one move to another, as one player's moves affect the possible moves of the other.
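A minimal sketch of that recursive value propagation, written against the hypothetical Game interface sketched after slide 13 (the function names and the "MAX"/"MIN" labels are my assumptions, not the lecture's code):

```python
def minimax_value(game, state):
    """Value of `state` if both players play perfectly from here on."""
    if game.terminal_test(state):
        return game.utility(state, "MAX")          # scored from MAX's point of view
    values = (minimax_value(game, game.result(state, a))
              for a in game.actions(state))
    # MAX picks the child with the highest value, MIN the lowest.
    return max(values) if game.player(state) == "MAX" else min(values)

def minimax_decision(game, state):
    """The move MAX should make in `state`."""
    return max(game.actions(state),
               key=lambda a: minimax_value(game, game.result(state, a)))
```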
 22. Slide 26   b-it-bots   Min-Max Simple Example
     [Tree diagram: an Initial (Max) node, a layer of Min nodes, and the terminal states 6, 2, 1, 9, 4. Populate the terminal states using the utility function.]
 23. Slide 27   b-it-bots   Min-Max Simple Example
     [Tree diagram: working backward, determine the minimax value of each Min node (2 and 1).]
 24. Slide 28   b-it-bots   Min-Max Simple Example
     [Tree diagram: working backward, determine the minimax value of the Max/initial node (2). A worked check of these numbers follows below.]
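The backing-up can be checked directly on this toy tree. The nesting below is my reading of the slides' figure, assuming a Max root with two Min children whose leaves carry the terminal utilities 6, 2 and 1, 9, 4:

```python
# Assumed layout of the example tree from slides 26-28: the root (Max) has two
# Min children; their leaves hold the terminal utilities 6, 2 and 1, 9, 4.
min_children = [[6, 2], [1, 9, 4]]

min_values = [min(leaves) for leaves in min_children]   # values at the Min nodes -> [2, 1]
root_value = max(min_values)                            # value at the Max/initial node -> 2
print(min_values, root_value)
```

This reproduces the minimax values 2 and 1 at the Min layer and the value 2 at the root shown on the slides.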
 25. Slide 30   b-it-bots   Tic-Tac-Toe (…again)
     Image courtesy of Bob Felts (http://stablecross.com/files/Mechanics_of_Morality.html)
 26. Slide 31   b-it-bots   We Need Optimization!
     • Depth is the enemy of MiniMax (DFS?)
     • Time complexity: O(b^m)
     • Space complexity:
       • O(bm) if you generate all states at once (!)
       • O(m) if you generate one state at a time
     (b = legal moves from a node, m = depth of the search tree)
 27. Slide 32   b-it-bots   α-β Pruning
     • A rather simple optimization.
     • Given the same game, it must reach the same decision as minimax.
     • However, it will not generate states or branches that can have no effect on the end result.
 28. Slide 33   b-it-bots   α-β Pruning
     It is called α-β pruning because of the following parameters:
     • α = the value of the best (i.e., highest-value) choice we have found so far at any choice point along the path for MAX.
     • β = the value of the best (i.e., lowest-value) choice we have found so far at any choice point along the path for MIN.
 29. Slide 34   b-it-bots   α-β Pruning
     • The algorithm searches through the tree, updating the values of α and β as it goes (see the sketch below).
     • If the value of the current node is worse than α or β for Max or Min (respectively), we "prune" the search tree.
     • In this case pruning means we terminate that particular recursive call.
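A sketch of the pruned recursion, again on the hypothetical Game interface used above; it follows the standard α-β scheme rather than copying the lecture's or AIMA's pseudocode verbatim:

```python
def alphabeta_value(game, state, alpha=float("-inf"), beta=float("inf")):
    """Minimax value of `state`, skipping branches that cannot affect the result."""
    if game.terminal_test(state):
        return game.utility(state, "MAX")

    if game.player(state) == "MAX":
        value = float("-inf")
        for a in game.actions(state):
            value = max(value, alphabeta_value(game, game.result(state, a), alpha, beta))
            if value >= beta:          # MIN above us would never allow this: prune
                return value
            alpha = max(alpha, value)  # best option found so far for MAX
        return value
    else:
        value = float("inf")
        for a in game.actions(state):
            value = min(value, alphabeta_value(game, game.result(state, a), alpha, beta))
            if value <= alpha:         # MAX above us would never allow this: prune
                return value
            beta = min(beta, value)    # best option found so far for MIN
        return value
```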
 30. Slide 36   b-it-bots   α-β Pruning vs. MiniMax
     MiniMax:
     • Time complexity: O(b^m)
     • Space complexity: O(bm)
     α-β pruning:
     • Time complexity: O(b^(m/2)) with good move ordering (worst case still O(b^m))
     • Space complexity: O(bm)
 31. Slide 37   b-it-bots   α-β Pruning Optimizations
     • Transposition tables
       • A hash map of all the positions we have been in before. As noted earlier, revisiting nodes can cause an exponential explosion in both time and space complexity. We can use transposition tables to avoid this increase (a sketch follows below).
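One hand-rolled way to add such a table is to memoise the value function on a hashable encoding of the position. The `encode` argument below is a placeholder (real engines typically use something like Zobrist hashing), and the whole function is an illustrative sketch rather than the lecture's implementation:

```python
def minimax_with_table(game, state, table=None, encode=repr):
    """Minimax value with a transposition table, so positions reachable by
    more than one move order are only searched once."""
    if table is None:
        table = {}
    key = encode(state)                # placeholder: any hashable encoding works
    if key in table:
        return table[key]              # position already evaluated: reuse it
    if game.terminal_test(state):
        value = game.utility(state, "MAX")
    else:
        values = [minimax_with_table(game, game.result(state, a), table, encode)
                  for a in game.actions(state)]
        value = max(values) if game.player(state) == "MAX" else min(values)
    table[key] = value
    return value
```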
 32. Slide 38   b-it-bots   Something is still wrong
     • Minimax helps us deal with opponents (but is inefficient in time and space).
     • α-β pruning helps us deal with the space and time costs of MiniMax.
     • But α-β still requires us to search down to the terminal states!
 33. Slide 39   b-it-bots   Imperfect Decisions
     • If we do not have the time or power to search the full tree before a decision must be made, we enter the realm of imperfect decision making.
     • Note: this is not guessing (we are not there yet…).
 34. Slide 40   b-it-bots   Imperfect Decisions
     • Replace the Utility function with an EVAL function.
     • Replace the TerminalTest function with a CutoffTest function.
 35. Slide 41   b-it-bots   The EVAL function
     1. Must evaluate terminal states in the same way that the utility function does.
     2. Computation must be fast (the whole point now is speed!).
     3. For nonterminal states, EVAL should be closely tied to our chances of winning.
        • This is typically done with weighted linear functions (read AIMA for more!).
 36. Slide 42   b-it-bots   The CUTOFF function
     1. This is normally done by checking a certain depth (have we gone too far?). A combined sketch of EVAL and the cutoff test follows below.
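Putting slides 40-42 together, a depth-limited variant might look like the sketch below; `eval_fn`, `max_depth`, and the function name are assumptions about what a concrete implementation would plug in, not something prescribed by the slides:

```python
def h_minimax(game, state, depth, eval_fn, max_depth=4):
    """Minimax with an evaluation function and a depth cutoff instead of
    searching all the way down to terminal states."""
    if game.terminal_test(state):
        return game.utility(state, "MAX")
    if depth >= max_depth:             # CutoffTest: stop searching here
        return eval_fn(state)          # EVAL: heuristic estimate of the utility
    values = (h_minimax(game, game.result(state, a), depth + 1, eval_fn, max_depth)
              for a in game.actions(state))
    return max(values) if game.player(state) == "MAX" else min(values)
```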
 37. Slide 43   b-it-bots   Alternative Optimization Ideas?
     • Forward pruning
       • Get rid of bad ideas before we chase them down?
     • Using iterative deepening search
     • Evaluating loss vs. gain
 38. Slide 44   b-it-bots   WARNING!
     • The more of these techniques you employ, the "smarter" your algorithms become.
     • But be warned: they are more prone to errors! Each method approximates some aspect instead of fully investigating it.
 39. Slide 45   b-it-bots   Games of Chance
     • What if we don't know what is coming, or what if the rules can change as the game progresses?
     • Note: okay, now we are guessing (but in an educated way).
 40. Slide 46   b-it-bots   Games of Chance
     • Remember our Min-Max search trees?
     • Now we will add a new type of node called the chance node.
 41. Slide 48   b-it-bots   Expected MiniMax
     • Essentially, we replace our MiniMax value with an ExpectedMiniMax value.
     • P(s) is the probability of choosing that particular node.
     • (The defining equation is given on the next slide.)
 42. Slide 49   b-it-bots   Expected MiniMax
     • If the node is terminal: Utility(node)
     • If the node is a Max node: max over s ∈ Successors(node) of ExpectedMiniMax(s)
     • If the node is a Min node: min over s ∈ Successors(node) of ExpectedMiniMax(s)
     • If the node is a chance node: Σ over s ∈ Successors(node) of P(s) · ExpectedMiniMax(s)

     $$\mathrm{ExpectedMiniMax}(n) =
       \begin{cases}
         \mathrm{Utility}(n) & \text{if } n \text{ is terminal} \\
         \max_{s \in \mathrm{Successors}(n)} \mathrm{ExpectedMiniMax}(s) & \text{if } n \text{ is a Max node} \\
         \min_{s \in \mathrm{Successors}(n)} \mathrm{ExpectedMiniMax}(s) & \text{if } n \text{ is a Min node} \\
         \sum_{s \in \mathrm{Successors}(n)} P(s)\,\mathrm{ExpectedMiniMax}(s) & \text{if } n \text{ is a chance node}
       \end{cases}$$
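A sketch of the same case analysis in code. On top of the hypothetical Game interface used earlier it assumes a `chance_outcomes(state)` method returning (probability, successor) pairs and a "CHANCE" player label, both of which are my additions:

```python
def expected_minimax(game, state):
    """ExpectedMiniMax value of `state`; chance nodes average over their outcomes."""
    if game.terminal_test(state):
        return game.utility(state, "MAX")
    who = game.player(state)
    if who == "CHANCE":
        # Weighted average over the possible random outcomes.
        # `chance_outcomes` is an assumed extension of the Game interface.
        return sum(p * expected_minimax(game, s)
                   for p, s in game.chance_outcomes(state))
    values = (expected_minimax(game, game.result(state, a))
              for a in game.actions(state))
    return max(values) if who == "MAX" else min(values)
```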
 43. Slide 50   b-it-bots   Expected MiniMax
     • While we can now deal with chance, this comes at a cost.
     • Original MiniMax time complexity: O(b^m)
     • ExpectedMiniMax time complexity: O(b^m n^m)
     • b = branching factor, m = depth, n = number of distinct chance outcomes
 44. Slide 51   b-it-bots   Summary I
     MiniMax searching:
     • An improvement on our DFS algorithms; allows us to deal with multi-agent game scenarios.
     • Time complexity: O(b^m)
     • Space complexity: O(bm)
     α-β pruning search:
     • An improvement on the MiniMax algorithm, designed to alleviate the time and space costs that MiniMax incurs.
     • Time complexity: O(b^(m/2)) with good move ordering
     • Space complexity: O(bm)
 45. Slide 52   b-it-bots   Summary II
     Imperfect decisions:
     • We need to develop the ability to deal with situations where we cannot explore the full search tree.
     • We do this by changing the following:
       • utility_function → EVAL
       • terminal_test → cutoff_function
     Decisions with chance:
     • We need to be able to deal with changing scenarios or random events.
     • We do this by changing the following:
       • utility_function → ExpectedMiniMax
 46. Slide 55   b-it-bots   Google AI Challenge 2011
     • Create an ant that is able to survive, reproduce, and conquer the other ants!
     • We will work on it during the lab session as well as an assignment.
     • You can play against each other very easily!
     • Work in groups of 2!
     • http://www.aichallenge.org
 47. Slide 56   b-it-bots   Objectives
     Beginners (work in groups of 2):
     • Build an ant that can do the following:
       1. Survive a full game.
       2. Hunt for food.
       3. Guard its home nest.
       4. Attack the enemy.
     • Testing map: Tutorial1.map
     Advanced (work alone!):
     • Build an ant that can do the following:
       1. Survive a full game.
       2. Hunt for food.
       3. Guard its home nest.
       4. Attack the enemy.
     • Testing map: random_walk_07.map
 48. Slide 57   b-it-bots   Evaluation
     For marks:
     • Ability to complete the tasks:
       • Survive a game: 2 marks
       • Hunt for food: 2 marks
       • Guard the home base: 1 mark
       • Attack other ants: 1 mark
     • Implementation of algorithms:
       • Is it adversarial: 2 marks
     For fun (bonus):
     • Competition to see which ant is the best!
     • Winner gets an extra mark!
     Total: 8 marks (+1 bonus)
 49. Slide 58   b-it-bots   Please use the rest of the lab time to work in groups on the assignment. ASK QUESTIONS!!