Shall We Play A Game?

From RubyConf 2015.

Teaching computers to play games has been a pursuit and passion for many programmers. Game playing has led to many advances in computing over the years, and the best computerized game players have gained a lot of attention from the general public (think Deep Blue and Watson).

Using the Ricochet Robots board game as an example, let's talk about what's involved in teaching a computer to play games. Along the way, we'll touch on graph search techniques, data representation, algorithms, heuristics, pruning, and optimization.

Randy Coulman

November 17, 2015

Transcript

  1. 15.

    Moves to get to [target shown in slide image]:

    1. Blue Right
    2. Blue Down
    3. Blue Left
  2. 16.

    Moves to get to [target shown in slide image]:

    1. Blue Right
    2. Blue Down
    3. Blue Left
    4. Green Right
  3. 17.

    Moves to get to [target shown in slide image]:

    1. Blue Right
    2. Blue Down
    3. Blue Left
    4. Green Right
    5. Green Down
  4. 18.

    Moves to get to [target shown in slide image]:

    1. Blue Right
    2. Blue Down
    3. Blue Left
    4. Green Right
    5. Green Down
    6. Green Left
  5. 19.

    Moves to get to [target shown in slide image]:

    1. Blue Right
    2. Blue Down
    3. Blue Left
    4. Green Right
    5. Green Down
    6. Green Left
    7. Green Down
  6. 20.

    Characterizing the Problem

    Possible Board States (size of state space):

    252 * 251 * 250 * 249 * 248 = 976,484,376,000
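
The product on this slide can be reproduced with a few lines of Ruby. This is a sketch under assumptions the slide does not state: that 252 is the 16 x 16 board minus four blocked center cells, and that five distinguishable robots each occupy a different cell. The constant names are made up for illustration.

```ruby
# Assumptions (not stated on the slide): a 16 x 16 board with 4 blocked
# center cells, and 5 distinguishable robots on distinct cells.
SIDE_LENGTH = 16
BLOCKED_CENTER_CELLS = 4
ROBOT_COUNT = 5

USABLE_CELLS = SIDE_LENGTH * SIDE_LENGTH - BLOCKED_CENTER_CELLS

# Ordered placements without repetition: 252 * 251 * 250 * 249 * 248
STATE_SPACE_SIZE = (0...ROBOT_COUNT).reduce(1) do |product, already_placed|
  product * (USABLE_CELLS - already_placed)
end

puts STATE_SPACE_SIZE # prints 976484376000
```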
  7. 23.

    The Board

    • Board (Static: 16 x 16)
    • Walls & Targets (changes each game)
  8. 24.

    The Board

    • Board (Static: 16 x 16)
    • Walls & Targets (changes each game)
    • Goal (changes each turn)
  9. 25.

    The Board

    • Board (Static: 16 x 16)
    • Walls & Targets (changes each game)
    • Goal (changes each turn)
    • Robot Positions (changes each move)
  10. 27.
  11. 43.
  12. 44.
  13. 45.

    Depth-First Search

    [Figure: search tree with nodes numbered 1 through 16]
  14. 46.

    Depth-First Search

    [Figure: search tree with nodes numbered 1 through 16]
  15. 47.

    Depth-First Search

    def solve
      solve_recursively(Path.initial(state))
      candidates.min_by(&:length) || Outcome.no_solution(state)
    end

    def solve_recursively(path)
      return candidates << path.to_outcome if path.solved?
      path.allowable_successors.each do |successor|
        solve_recursively(successor)
      end
    end
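
The slide's code depends on `Path` and `Outcome`, which are not shown in the transcript. As a rough, self-contained illustration of the same idea (explore every branch, collect each solution, then keep the shortest), here is a sketch on a small hypothetical graph. `TOY_GRAPH` and `dfs_shortest_path` are invented names, and the graph is acyclic so that the recursion terminates without a depth limit.

```ruby
# Hypothetical acyclic graph: each node maps to the nodes reachable in one move.
TOY_GRAPH = {
  a: [:b, :c],
  b: [:d],
  c: [:goal],
  d: [:goal],
  goal: []
}.freeze

# Exhaustive depth-first search: record every path that reaches the goal,
# then return the shortest one (nil if the goal is unreachable).
def dfs_shortest_path(graph, start, goal)
  candidates = []
  visit = lambda do |path|
    if path.last == goal
      candidates << path
      next # leave the lambda; a solved path is not extended further
    end
    graph.fetch(path.last).each { |successor| visit.call(path + [successor]) }
  end
  visit.call([start])
  candidates.min_by(&:length)
end

p dfs_shortest_path(TOY_GRAPH, :a, :goal) # prints [:a, :c, :goal]
```

Note that the longer route through `:b` and `:d` is found first and still has to be explored in full; depth-first search only knows it has the shortest path after it has looked everywhere.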
  16. 48.
  17. 49.
  18. 50.
  19. 51.
  20. 52.
  21. 53.
  22. 54.
  23. 55.
  24. 56.
  25. 57.

    Cycles

    class Path
      def allowable_successors
        allowable_moves
          .map { |direction| successor(direction) }
          .compact
      end

      def successor(direction)
        next_robot = robot.moved(direction)
        next_robot == robot ? nil : self.class.new(next_robot, moves + [direction])
      end
    end
  26. 58.

    Cycles

    class Path
      def allowable_successors
        allowable_moves
          .map { |direction| successor(direction) }
          .compact
          .reject(&:cycle?)
      end

      def successor(direction)
        next_robot = robot.moved(direction)
        next_robot == robot ? nil : self.class.new(next_robot, moves + [direction], visited + [robot])
      end

      def cycle?
        visited.include?(robot)
      end
    end
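
The per-path visited list can be tried out in isolation. The following is a minimal sketch, not the talk's actual `Path` class: states are plain symbols, and the hypothetical `TOY_NEIGHBORS` table stands in for robot movement.

```ruby
# Hypothetical stand-in for robot movement: a -> b, b -> a or c.
TOY_NEIGHBORS = { a: [:b], b: [:a, :c], c: [] }.freeze

class ToyPath
  attr_reader :state, :visited

  def initialize(state, visited = [])
    @state = state
    @visited = visited
  end

  # Successors that would return to a state already on this path are dropped.
  def allowable_successors
    TOY_NEIGHBORS.fetch(state)
                 .map { |next_state| self.class.new(next_state, visited + [state]) }
                 .reject(&:cycle?)
  end

  def cycle?
    visited.include?(state)
  end
end

path_at_b = ToyPath.new(:a).allowable_successors.first
p path_at_b.allowable_successors.map(&:state) # prints [:c]
```

Moving from `:b` back to `:a` is rejected because `:a` is already on the path, so only `:c` remains.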
  27. 84.
  28. 85.

    Breadth-First Search

    [Figure: search tree with nodes numbered 1 through 16]
  29. 86.

    Breadth-First Search

    [Figure: search tree with nodes numbered 1 through 16; labels appear in the order 1 2 5 3 7 8 4 9 10 11 6 12 13 14 15 16]
  30. 87.

    Breadth-First Search

    def solve
      paths = [Path.initial(initial_state)]
      until paths.empty?
        path = paths.shift
        return path.to_outcome if path.solved?
        paths += path.allowable_successors
      end
      Outcome.no_solution(initial_state)
    end
  31. 89.

    BFS: Global Visited List

    def solve
      visited = Set.new
      paths = [Path.initial(initial_state)]
      until paths.empty?
        path = paths.shift
        return path.to_outcome if path.solved?
        next if visited.include?(path.state)
        visited << path.state
        paths += path.allowable_successors
      end
      Outcome.no_solution(initial_state)
    end
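
To see the queue and the global visited set working together without the talk's `Path` and `Outcome` classes, here is a self-contained sketch on a hypothetical graph that contains a cycle. `TOY_EDGES` and `bfs_shortest_path` are illustrative names only.

```ruby
require "set"

# Hypothetical graph with a cycle (a -> b -> a) and two routes to the goal.
TOY_EDGES = {
  a: [:b, :c],
  b: [:a, :d],
  c: [:goal],
  d: [:goal],
  goal: []
}.freeze

# Breadth-first search: paths are taken from the front of the queue, so the
# first path that reaches the goal is also a shortest one.
def bfs_shortest_path(edges, start, goal)
  visited = Set.new
  paths = [[start]]
  until paths.empty?
    path = paths.shift
    state = path.last
    return path if state == goal
    next if visited.include?(state) # already expanded via another path

    visited << state
    paths.concat(edges.fetch(state).map { |successor| path + [successor] })
  end
  nil
end

p bfs_shortest_path(TOY_EDGES, :a, :goal) # prints [:a, :c, :goal]
```

Unlike the per-path cycle check, the visited set is shared by every path, so a state reached a second time by any route is skipped.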
  32. 94.

    Heuristic: Move Active Robot First

    initial_state.ensure_goal_robot_first(goal)

    def ensure_goal_robot_first(goal)
      return if goal.color == :any
      goal_index = robots.index { |robot| robot.color == goal.color }
      robots.rotate!(goal_index)
    end
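
The reordering relies on `Array#rotate`, which moves the element at the given index to the front while preserving the cyclic order of the rest. A small sketch with plain color symbols in place of robot objects (the helper name `goal_color_first` and the color list are assumptions for illustration):

```ruby
# Returns the colors reordered so that the goal's color is tried first.
# With the wildcard goal (:any), the order is left alone.
def goal_color_first(colors, goal_color)
  return colors if goal_color == :any

  colors.rotate(colors.index(goal_color))
end

p goal_color_first(%i[red green blue yellow silver], :blue)
# prints [:blue, :yellow, :silver, :red, :green]
```

As with the slide's version, this assumes the goal color is present in the list; `index` would return `nil` otherwise and `rotate` would raise.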
  33. 95.

    Heuristic: Move Active Robot First

    [Chart: States Considered, axis from 0 to 1,400,000; bars: Original Algorithm, Active First]
  34. 96.

    Do Less Things: Check for Solutions at Generation Time

    def solve
      paths = [Path.initial(initial_state)]
      until paths.empty?
        path = paths.shift
        return path.to_outcome if path.solved?
        paths += path.allowable_successors
      end
      Outcome.no_solution(initial_state)
    end
  35. 97.

    Do Less Things: Check for Solutions at Generation Time

    [Figure: search tree with nodes numbered 1 through 16]
  36. 98.

    Do Less Things: Check for Solutions at Generation Time

    def solve
      paths = [Path.initial(initial_state)]
      until paths.empty?
        path = paths.shift
        successors = path.allowable_successors
        solution = successors.find(&:solved?)
        return solution.to_outcome if solution
        paths += successors
      end
      Outcome.no_solution(initial_state)
    end
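
The saving from checking at generation time can be counted on a toy problem. The sketch below is only an illustration, with a hypothetical binary tree in place of the game: node n has children 2n and 2n + 1, and each function reports how many states were taken off the queue before the goal was found.

```ruby
# Hypothetical search tree: node n has children 2n and 2n + 1, up to 15.
LAST_TREE_NODE = 15

def tree_children(node)
  [2 * node, 2 * node + 1].select { |child| child <= LAST_TREE_NODE }
end

# Original approach: test a state for the goal when it leaves the queue.
def dequeued_checking_at_dequeue(goal)
  queue = [1]
  dequeued = 0
  until queue.empty?
    node = queue.shift
    dequeued += 1
    return dequeued if node == goal
    queue.concat(tree_children(node))
  end
  nil
end

# Revised approach: test each successor as soon as it is generated.
def dequeued_checking_at_generation(goal)
  queue = [1]
  dequeued = 0
  until queue.empty?
    node = queue.shift
    dequeued += 1
    successors = tree_children(node)
    return dequeued if successors.include?(goal)
    queue.concat(successors)
  end
  nil
end

p dequeued_checking_at_dequeue(8)    # prints 8
p dequeued_checking_at_generation(8) # prints 4
```

One property carries over from the slide's code: the starting state itself is never tested, so this variant assumes the search does not begin on the goal.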
  37. 100.

    Do Less Things: Check for Solutions at Generation Time

    [Chart: Total States Considered, axis from 0 to 3,200,000; bars: Original Algorithm, Check at Generation]
  38. 102.

    Do Things Faster: Precompute stopping cells

    [Chart: States / second, axis from 0 to 2,700; bars: Original Algorithm, Pre-compute Stops]
  39. 104.

    Do Less Things / Do Things Faster: Treat non-active robots as equivalent

    class BoardState
      def equivalence_class
        @equivalence_class ||= begin
          Set.new(robots.map do |robot|
            robot.position_hash + (robot.active?(goal) ? 1000 : 0)
          end)
        end
      end
    end

    def position_hash
      row * board.size + column
    end
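
To make the effect of this hash concrete, here is a self-contained sketch using a hypothetical `ToyRobot` struct rather than the talk's classes. It assumes a 16 x 16 board, so position hashes stay below 256 and the offset of 1000 cannot collide with a plain position.

```ruby
require "set"

# Hypothetical stand-ins; only what the hash needs.
GRID_SIDE = 16
ACTIVE_OFFSET = 1000 # larger than any position hash on a 16 x 16 board

ToyRobot = Struct.new(:color, :row, :column) do
  def position_hash
    row * GRID_SIDE + column
  end
end

# Two states are equivalent when the active robot is on the same cell and
# the remaining robots cover the same set of cells, whatever their colors.
def toy_equivalence_class(robots, goal_color)
  Set.new(robots.map do |robot|
    robot.position_hash + (robot.color == goal_color ? ACTIVE_OFFSET : 0)
  end)
end

state_a = [ToyRobot.new(:red, 0, 1), ToyRobot.new(:blue, 2, 3), ToyRobot.new(:green, 4, 5)]
# Same cells occupied, but blue and green have swapped places.
state_b = [ToyRobot.new(:red, 0, 1), ToyRobot.new(:green, 2, 3), ToyRobot.new(:blue, 4, 5)]

p toy_equivalence_class(state_a, :red) == toy_equivalence_class(state_b, :red)   # prints true
p toy_equivalence_class(state_a, :blue) == toy_equivalence_class(state_b, :blue) # prints false
```

When red is the active robot, swapping blue and green changes nothing that matters, so both states fall into one class and only one of them needs to be searched.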
  40. 105.

    Do Less Things / Do Things Faster: Treat non-active robots as equivalent

    [Chart: Total States Considered, axis from 0 to 4,000,000; bars: Original Algorithm, Check at Generation, Robot Equivalence]
  41. 106.

    Do Less Things / Do Things Faster: Treat non-active robots as equivalent

    [Chart: States / second, axis from 0 to 3,000; bars: Original Algorithm, Pre-compute Stops, Robot Equiv.]
  42. 107.

    Do Things Faster: Sorted Array vs Set

    class BoardState
      def equivalence_class
        @equivalence_class ||= begin
          # A sorted Array of hashes stands in for the Set (Set has no sort!).
          robots.map do |robot|
            robot.position_hash + (robot.active?(goal) ? 1000 : 0)
          end.sort!
        end
      end
    end
  43. 108.

    Do Things Faster: Sorted Array vs Set

    [Chart: States / second, axis from 0 to 3,200; bars: Original Algorithm, Pre-compute Stops, Robot Equiv., Arrays not Sets]
  44. 109.

    Do Things Faster: Less Object Creation

    class Robot
      def moved(direction, board_state)
        self.class.new(color, cell.next_cell(direction, board_state))
      end
    end

    class BoardState
      def with_robot_moved(robot, direction)
        moved_robots = robots.map do |each_robot|
          each_robot == robot ? each_robot.moved(direction, self) : each_robot
        end
        self.class.new(moved_robots, goal)
      end
    end
  45. 110.

    Do Things Faster: Less Object Creation

    class Robot
      def moved(direction, board_state)
        next_cell = cell.next_cell(direction, board_state)
        next_cell == cell ? self : self.class.new(color, next_cell)
      end
    end

    class BoardState
      def with_robot_moved(robot, direction)
        moved_robot = robot.moved(direction, self)
        return self if moved_robot.equal?(robot)
        moved_robots = robots.map do |each_robot|
          each_robot == robot ? moved_robot : each_robot
        end
        self.class.new(moved_robots, goal)
      end
    end
  46. 111.

    Do Things Faster: Less Object Creation

    [Chart: States / second, axis from 0 to 5,000; bars: Original Algorithm, Pre-compute Stops, Robot Equiv., Arrays not Sets, Less Objects]
  47. 112.

    Do Things Faster: Use Object Identity Instead of Deep Equality

    class BoardState
      def with_robot_moved(robot, direction)
        moved_robot = robot.moved(direction, self)
        return self if moved_robot.equal?(robot)
        moved_robots = robots.map do |each_robot|
          each_robot.equal?(robot) ? moved_robot : each_robot
        end
        self.class.new(moved_robots, goal)
      end
    end
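
The difference between `==` and `equal?` is easy to check in isolation. In this sketch, `Piece` is a hypothetical value object whose `==` compares every field, which is the kind of deep comparison the slide replaces with an identity check.

```ruby
# Hypothetical value object; Struct defines == as a field-by-field comparison.
Piece = Struct.new(:color, :cell)

original = Piece.new(:blue, [3, 4])
copy = Piece.new(:blue, [3, 4])

p original == copy          # prints true: every field matches
p original.equal?(copy)     # prints false: two separate objects
p original.equal?(original) # prints true: the very same object
```

Identity comparison is only safe because of the previous change: a robot that does not move is returned as the same object, and unmoved robots are reused from one board state to the next.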
  48. 113.

    Do Things Faster: Use Object Identity Instead of Deep Equality

    [Chart: States / second, axis from 0 to 7,000; bars: Original Algorithm, Pre-compute Stops, Robot Equiv., Arrays not Sets, Less Objects, Object Identity]
  49. 114.

    Results So Far

    [Chart: Solving time (seconds), axis from 0 to 3,000; bars: Original, Active First, Check at Gen., Pre-compute, Robot Equiv., Arrays not Sets, Less Objects, Robot Identity]
  50. 122.

    Best-First Search

    def solve
      paths = FastContainers::PriorityQueue.new(:min).tap do |queue|
        # Seed the queue with the starting path.
        add_path(queue, Path.initial(initial_state))
      end
      until paths.empty?
        path = paths.top; paths.pop
        successors = path.allowable_successors
        solution = successors.find(&:solved?)
        return solution.to_outcome if solution
        add_paths(paths, successors)
      end
      Outcome.no_solution(initial_state)
    end
  51. 123.

    Best-First Search

    def add_paths(paths, successors)
      successors.each { |path| add_path(paths, path) }
    end

    def add_path(paths, path)
      paths.push(path, score(path))
    end

    def score(path)
      # ...
    end
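
The slide leaves `score` unspecified, so the following is only a sketch of the general shape of best-first search, not the talk's solver. Everything in it is an assumption made for illustration: the problem (walking on an unbounded grid to `GRID_TARGET`), the Manhattan-distance score, and the use of a plain Array scanned with `min_by` in place of `FastContainers::PriorityQueue`.

```ruby
# Hypothetical problem: step around an unbounded grid until GRID_TARGET is reached.
GRID_TARGET = [3, 2].freeze
GRID_STEPS = [[1, 0], [-1, 0], [0, 1], [0, -1]].freeze

# Assumed score for illustration: Manhattan distance from the path's end
# to the target. Lower scores are expanded first.
def toy_score(path)
  x, y = path.last
  (GRID_TARGET[0] - x).abs + (GRID_TARGET[1] - y).abs
end

def toy_successors(path)
  x, y = path.last
  GRID_STEPS.map { |dx, dy| path + [[x + dx, y + dy]] }
end

def best_first_search(start)
  return [start] if start == GRID_TARGET

  paths = [[start]]
  until paths.empty?
    # Linear scan for the lowest score; a real priority queue avoids this scan.
    path = paths.min_by { |candidate| toy_score(candidate) }
    paths.delete_at(paths.index(path))
    successors = toy_successors(path)
    solution = successors.find { |candidate| candidate.last == GRID_TARGET }
    return solution if solution
    paths.concat(successors)
  end
  nil
end

p best_first_search([0, 0]).length # prints 6
```

The only structural change from breadth-first search is which path leaves the collection next: the one with the best score rather than the oldest one.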
  52. 127.
  53. 130.

    A* Algorithm

    [Figure: board grid with cells labeled 0 and 1]
  54. 131.

    A* Algorithm

    [Figure: board grid with cells labeled 0 through 2]
  55. 132.

    A* Algorithm

    [Figure: board grid with cells labeled 0 through 3]
  56. 133.

    A* Algorithm

    [Figure: board grid with cells labeled 0 through 4]
  57. 134.

    A* Algorithm

    [Figure: board grid with cells labeled 0 through 5]
  58. 145.

    Acknowledgements

    • Trever Yarrish of Zeal for the awesome graphics and visualizations
    • My fellow Zeals for ideas, feedback, and pairing on the solver
    • Michael Fogleman for some optimization ideas
    • Trevor Lalish-Menagh for introducing me to the game
    • Screen Captures from War Games (Dir. John Badham. MGM/UA. 1983)