Randy Coulman
November 17, 2015

# Shall We Play A Game?

From RubyConf 2015.

Teaching computers to play games has been a pursuit and passion for many programmers. Game playing has led to many advances in computing over the years, and the best computerized game players have gained a lot of attention from the general public (think Deep Blue and Watson).

Using the Ricochet Robots board game as an example, let's talk about what's involved in teaching a computer to play games. Along the way, we'll touch on graph search techniques, data representation, algorithms, heuristics, pruning, and optimization.


## Transcript

get to
15. ### Moves to get to

(1) Blue Right, (2) Blue Down, (3) Blue Left

16. ### Moves to get to

(1) Blue Right, (2) Blue Down, (3) Blue Left, (4) Green Right

17. ### Moves to get to

(1) Blue Right, (2) Blue Down, (3) Blue Left, (4) Green Right, (5) Green Down

18. ### Moves to get to

(1) Blue Right, (2) Blue Down, (3) Blue Left, (4) Green Right, (5) Green Down, (6) Green Left

19. ### Moves to get to

(1) Blue Right, (2) Blue Down, (3) Blue Left, (4) Green Right, (5) Green Down, (6) Green Left, (7) Green Down
20. ### Characterizing the Problem

Possible board states (size of state space): 252 × 251 × 250 × 249 × 248 = 976,484,376,000

21. ### Characterizing the Problem

Branching factor: 9 to 20 possible moves from each state
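The two figures above can be reproduced with a few lines of Ruby. This is only a sketch: it assumes the 252 comes from a 16 x 16 board with four unusable center cells, that the five factors correspond to five distinguishable robots, and it uses the seven-move example solution as an illustrative depth. The constant and method names are made up for this example.

```ruby
# Assumption: 16 x 16 board minus 4 unusable center cells leaves 252 cells,
# and there are 5 distinguishable robots (matching the five factors above).
CELLS = 16 * 16 - 4
ROBOTS = 5

# Ordered placements of 5 robots on distinct cells: 252 * 251 * 250 * 249 * 248
state_space = (0...ROBOTS).reduce(1) { |product, i| product * (CELLS - i) }

# Number of move sequences of a given depth if every state had exactly
# `branching_factor` successors (a rough bound, ignoring repeated states).
def paths_at_depth(branching_factor, depth)
  branching_factor**depth
end

puts state_space           # => 976484376000
puts paths_at_depth(9, 7)  # => 4782969
puts paths_at_depth(20, 7) # => 1280000000
```

Even at the low end of the branching factor, a seven-move search touches millions of move sequences, which is why the later slides focus on pruning and speed.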

23. ### The Board

• Board (Static: 16 x 16)
• Walls & Targets (changes each game)

24. ### The Board

• Board (Static: 16 x 16)
• Walls & Targets (changes each game)
• Goal (changes each turn)

25. ### The Board

• Board (Static: 16 x 16)
• Walls & Targets (changes each game)
• Goal (changes each turn)
• Robot Positions (changes each move)
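One way to mirror those four rates of change in code is to keep each in its own structure, so that only the robot positions are copied from move to move. The sketch below is an illustration, not the talk's actual classes: `Layout`, `Goal`, `Robot`, `BoardState`, and the wall encoding are all hypothetical.

```ruby
require "set"

# Static for every game: the grid dimensions.
BOARD_SIZE = 16

# Changes each game: where the walls and targets are (hypothetical encoding).
Layout = Struct.new(:walls, :targets)

# Changes each turn: which target is the goal, and for which robot color.
Goal = Struct.new(:color, :cell)

# Changes each move: robot positions, bundled with the current goal.
Robot = Struct.new(:color, :cell)
BoardState = Struct.new(:robots, :goal) do
  def solved?
    robots.any? do |robot|
      robot.cell == goal.cell && (goal.color == :any || robot.color == goal.color)
    end
  end
end

layout = Layout.new(Set[[[0, 3], :east]], { [2, 5] => :blue })
goal = Goal.new(:blue, [2, 5])
start = BoardState.new([Robot.new(:blue, [0, 0]), Robot.new(:green, [7, 7])], goal)
moved = BoardState.new([Robot.new(:blue, [2, 5]), Robot.new(:green, [7, 7])], goal)

p start.solved? # => false
p moved.solved? # => true
```

Because the layout never changes during a search, a solver can share one `layout` object across every `BoardState` it generates.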

Ways Through

42-46. ### Depth-First Search

[Figure: animation of a search tree being filled in, with nodes numbered 1 through 16]
47. ### Depth-First Search

    def solve
      solve_recursively(Path.initial(state))
      candidates.min_by(&:length) || Outcome.no_solution(state)
    end

    def solve_recursively(path)
      return candidates << path.to_outcome if path.solved?
      path.allowable_successors.each do |successor|
        solve_recursively(successor)
      end
    end

57. ### Cycles

    class Path
      def allowable_successors
        allowable_moves
          .map { |direction| successor(direction) }
          .compact
      end

      def successor(direction)
        next_robot = robot.moved(direction)
        next_robot == robot ? nil : self.class.new(next_robot, moves + [direction])
      end
    end

58. ### Cycles

    class Path
      def allowable_successors
        allowable_moves
          .map { |direction| successor(direction) }
          .compact
          .reject(&:cycle?)
      end

      def successor(direction)
        next_robot = robot.moved(direction)
        next_robot == robot ? nil : self.class.new(next_robot, moves + [direction], visited + [robot])
      end

      def cycle?
        visited.include?(robot)
      end
    end
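To see the cycle check at work outside the solver, here is a self-contained sketch on a toy graph. The graph, the `:goal` symbol, and the method signature are invented for illustration; the only idea carried over from the slides is rejecting any successor already on the current path.

```ruby
# Toy graph with a cycle (a -> b -> a); :goal is the state we want to reach.
GRAPH = {
  a: [:b, :c],
  b: [:a, :d],
  c: [:goal],
  d: [:goal],
  goal: []
}.freeze

# Depth-first search that collects every cycle-free path to :goal.
def solve_recursively(path, candidates)
  current = path.last
  return candidates << path if current == :goal

  GRAPH.fetch(current)
       .reject { |successor| path.include?(successor) } # same idea as cycle?
       .each { |successor| solve_recursively(path + [successor], candidates) }
  candidates
end

candidates = solve_recursively([:a], [])
shortest = candidates.min_by(&:length)

p candidates # => [[:a, :b, :d, :goal], [:a, :c, :goal]]
p shortest   # => [:a, :c, :goal]
```

Without the `reject`, the search would bounce between `:a` and `:b` until the stack overflowed. Note that depth-first search finds the longer path first, so every candidate has to be collected before the shortest can be chosen.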

82-86. ### Breadth-First Search

[Figure: animation of a search tree being filled in, with nodes numbered 1 through 16]
87. ### Breadth-First Search

    def solve
      paths = [Path.initial(initial_state)]
      until paths.empty?
        path = paths.shift
        return path.to_outcome if path.solved?
        paths += path.allowable_successors
      end
      Outcome.no_solution(initial_state)
    end

89. ### BFS: Global Visited List

    def solve
      visited = Set.new
      paths = [Path.initial(initial_state)]
      until paths.empty?
        path = paths.shift
        return path.to_outcome if path.solved?
        next if visited.include?(path.state)
        visited << path.state
        paths += path.allowable_successors
      end
      Outcome.no_solution(initial_state)
    end
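The same loop can be tried on a toy problem. The sketch below assumes a single robot on a 4 x 4 board with no interior walls, where a move slides the robot until it reaches the edge; `SIZE`, `DIRECTIONS`, and `slide` are made up for this example and stand in for the talk's `Path` and board classes.

```ruby
require "set"

SIZE = 4
DIRECTIONS = { up: [-1, 0], down: [1, 0], left: [0, -1], right: [0, 1] }.freeze

# Slide from cell in a direction until the edge of the (wall-free) toy board.
def slide(cell, direction)
  d_row, d_col = DIRECTIONS.fetch(direction)
  row, col = cell
  while (row + d_row).between?(0, SIZE - 1) && (col + d_col).between?(0, SIZE - 1)
    row += d_row
    col += d_col
  end
  [row, col]
end

# Breadth-first search with a global visited set; returns the move list or nil.
def solve(start, goal)
  visited = Set.new
  paths = [[start, []]] # each entry: [cell, moves so far]
  until paths.empty?
    cell, moves = paths.shift
    return moves if cell == goal
    next if visited.include?(cell)

    visited << cell
    DIRECTIONS.each_key do |direction|
      next_cell = slide(cell, direction)
      paths << [next_cell, moves + [direction]] unless next_cell == cell
    end
  end
  nil
end

p solve([1, 1], [3, 0]) # => [:down, :left]
p solve([1, 1], [2, 2]) # => nil
```

Because the queue is processed in order of path length, the first solution found is a shortest one, and the visited set guarantees termination even when the goal cannot be reached.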

93. ### Optimization

Heuristics: Rules of Thumb. Less certain; may work in some circumstances, but not others.
94. ### Heuristic: Move Active Robot First

    initial_state.ensure_goal_robot_first(goal)

    def ensure_goal_robot_first(goal)
      return if goal.color == :any
      goal_index = robots.index { |robot| robot.color == goal.color }
      robots.rotate!(goal_index)
    end
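The effect of `rotate!` is easy to check in isolation. In this small sketch the `Robot` struct and the color list are stand-ins chosen for the example; only `Array#index` and `Array#rotate!` come from the slide.

```ruby
Robot = Struct.new(:color)
robots = %i[red green blue yellow].map { |color| Robot.new(color) }

goal_color = :blue
goal_index = robots.index { |robot| robot.color == goal_color }

# Array#rotate! moves the element at goal_index to the front, in place,
# keeping the cyclic order of the other elements.
robots.rotate!(goal_index)

p robots.map(&:color) # => [:blue, :yellow, :red, :green]
```

Since successors are generated robot by robot, putting the goal robot first means its moves are explored before the other robots' moves at every level of the search.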
95. ### Heuristic: Move Active Robot First

[Chart: States Considered, Original Algorithm vs. Active First; axis from 0 to 1,400,000]
96. ### Do Less Things: Check for Solutions at Generation Time

    def solve
      paths = [Path.initial(initial_state)]
      until paths.empty?
        path = paths.shift
        return path.to_outcome if path.solved?
        paths += path.allowable_successors
      end
      Outcome.no_solution(initial_state)
    end
97. ### Do Less Things: Check for Solutions at Generation Time

[Figure: search tree with nodes numbered 1 through 16]
98. ### Do Less Things: Check for Solutions at Generation Time

    def solve
      paths = [Path.initial(initial_state)]
      until paths.empty?
        path = paths.shift
        successors = path.allowable_successors
        solution = successors.find(&:solved?)
        return solution.to_outcome if solution
        paths += successors
      end
      Outcome.no_solution(initial_state)
    end
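A toy search space makes the saving visible. The sketch below is an assumption-laden illustration, not the solver: node `n` has children `2n` and `2n + 1`, so breadth-first order is simply 1, 2, 3, and so on, and the two methods count how many nodes are taken off the queue before the goal is recognized.

```ruby
# Toy search space: node n has children 2n and 2n + 1, up to MAX_NODE.
MAX_NODE = 1_000

def successors(node)
  [2 * node, 2 * node + 1].select { |child| child <= MAX_NODE }
end

# Checks for the goal when a node is taken off the queue.
def dequeues_checking_at_dequeue(goal)
  queue = [1]
  dequeued = 0
  until queue.empty?
    node = queue.shift
    dequeued += 1
    return dequeued if node == goal
    queue.concat(successors(node))
  end
end

# Checks for the goal as soon as successors are generated.
def dequeues_checking_at_generation(goal)
  return 0 if goal == 1 # the start node is never generated, so check it here
  queue = [1]
  dequeued = 0
  until queue.empty?
    node = queue.shift
    dequeued += 1
    children = successors(node)
    return dequeued if children.include?(goal)
    queue.concat(children)
  end
end

puts dequeues_checking_at_dequeue(16)    # => 16
puts dequeues_checking_at_generation(16) # => 8
```

The goal is spotted one level earlier, so the whole last level of the tree never has to be dequeued. With a branching factor of 9 to 20, that last level is where most of the states are.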
99. ### Do Less Things: Check for Solutions at Generation Time

[Figure: search tree with nodes 1 through 6 and 16 numbered, and the nine nodes in between marked X]

100. ### Do Less Things: Check for Solutions at Generation Time

[Chart: Total States Considered, Original Algorithm vs. Check at Generation; axis from 0 to 3,200,000]

102. ### Do Things Faster: Precompute stopping cells

[Chart: States / second, Original Algorithm vs. Pre-compute Stops; axis from 0 to 2700]
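The slide does not show how the stopping cells are precomputed, so the following is only one plausible sketch. It assumes a 4 x 4 toy board, a hypothetical wall encoding where each side of a wall is listed explicitly, and it handles walls only; other robots in the way would still need a check at move time.

```ruby
SIZE = 4
DIRECTIONS = { up: [-1, 0], down: [1, 0], left: [0, -1], right: [0, 1] }.freeze

# Hypothetical wall encoding: [cell, direction] means a robot in `cell`
# cannot leave it in `direction`. Both sides of the one wall are listed.
WALLS = [[[0, 1], :right], [[0, 2], :left]].freeze

def blocked?(cell, direction)
  d_row, d_col = DIRECTIONS.fetch(direction)
  target = [cell[0] + d_row, cell[1] + d_col]
  return true unless target.all? { |index| index.between?(0, SIZE - 1) }

  WALLS.include?([cell, direction])
end

# The slow way: walk one cell at a time until something blocks the robot.
def slow_stop(cell, direction)
  d_row, d_col = DIRECTIONS.fetch(direction)
  cell = [cell[0] + d_row, cell[1] + d_col] until blocked?(cell, direction)
  cell
end

# Built once per game: every (cell, direction) pair mapped to its stopping cell.
CELLS = (0...SIZE).to_a.product((0...SIZE).to_a)
STOPS = CELLS.product(DIRECTIONS.keys).to_h do |cell, direction|
  [[cell, direction], slow_stop(cell, direction)]
end

p STOPS[[[0, 0], :right]] # => [0, 1]
p STOPS[[[0, 3], :left]]  # => [0, 2]
p STOPS[[[3, 0], :right]] # => [3, 3]
```

Walls do not change during a game, so the table is built once and each move during the search becomes a hash lookup instead of a cell-by-cell walk.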
103. ### Do Less Things / Do Things Faster: Treat non-active robots as equivalent
104. ### Do Less Things / Do Things Faster: Treat non-active robots as equivalent

    class BoardState
      def equivalence_class
        @equivalence_class ||= begin
          Set.new(robots.map do |robot|
            robot.position_hash + (robot.active?(goal) ? 1000 : 0)
          end)
        end
      end
    end

    def position_hash
      row * board.size + column
    end
105. ### Do Less Things / Do Things Faster: Treat non-active robots as equivalent

[Chart: Total States Considered, Original Algorithm vs. Check at Generation vs. Robot Equivalence; axis from 0 to 4,000,000]

106. ### Do Less Things / Do Things Faster: Treat non-active robots as equivalent

[Chart: States / second, Original Algorithm vs. Pre-compute Stops vs. Robot Equiv.; axis from 0 to 3000]
107. ### Do Things Faster: Sorted Array vs Set

    class BoardState
      def equivalence_class
        @equivalence_class ||= begin
          robots.map do |robot|
            robot.position_hash + (robot.active?(goal) ? 1000 : 0)
          end.sort!
        end
      end
    end
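The equivalence-class idea can be checked on its own with a small sketch. The `Robot` struct, the `equivalence_key` method, and the constants are hypothetical; the arithmetic follows the slides, where an offset of 1000 is safely larger than any position hash on a 16 x 16 board (at most 255).

```ruby
BOARD_SIZE = 16
ACTIVE_OFFSET = 1000 # larger than any position hash on a 16 x 16 board

Robot = Struct.new(:color, :row, :column) do
  def position_hash
    row * BOARD_SIZE + column
  end
end

# Sorted-array key: ignores which non-active robot is where,
# but keeps the active (goal-colored) robot distinct via the offset.
def equivalence_key(robots, goal_color)
  robots.map do |robot|
    robot.position_hash + (robot.color == goal_color ? ACTIVE_OFFSET : 0)
  end.sort
end

state_a = [Robot.new(:blue, 0, 0), Robot.new(:green, 3, 4), Robot.new(:red, 9, 9)]
state_b = [Robot.new(:blue, 0, 0), Robot.new(:red, 3, 4), Robot.new(:green, 9, 9)] # green/red swapped
state_c = [Robot.new(:green, 0, 0), Robot.new(:blue, 3, 4), Robot.new(:red, 9, 9)] # blue elsewhere

p equivalence_key(state_a, :blue)                                    # => [52, 153, 1000]
p equivalence_key(state_a, :blue) == equivalence_key(state_b, :blue) # => true
p equivalence_key(state_a, :blue) == equivalence_key(state_c, :blue) # => false
```

Swapping two non-active robots produces the same key, so a visited list keyed this way prunes states that differ only in which blocker stands where.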
108. ### Do Things Faster: Sorted Array vs Set

[Chart: States / second, Original Algorithm vs. Pre-compute Stops vs. Robot Equiv. vs. Arrays not Sets; axis from 0 to 3200]
109. ### Do Things Faster: Less Object Creation

    class Robot
      def moved(direction, board_state)
        self.class.new(color, cell.next_cell(direction, board_state))
      end
    end

    class BoardState
      def with_robot_moved(robot, direction)
        moved_robots = robots.map do |each_robot|
          each_robot == robot ? each_robot.moved(direction, self) : each_robot
        end
        self.class.new(moved_robots, goal)
      end
    end

110. ### Do Things Faster: Less Object Creation

    class Robot
      def moved(direction, board_state)
        next_cell = cell.next_cell(direction, board_state)
        next_cell == cell ? self : self.class.new(color, next_cell)
      end
    end

    class BoardState
      def with_robot_moved(robot, direction)
        moved_robot = robot.moved(direction, self)
        return self if moved_robot.equal?(robot)
        moved_robots = robots.map do |each_robot|
          each_robot == robot ? moved_robot : each_robot
        end
        self.class.new(moved_robots, goal)
      end
    end
111. ### Do Things Faster: Less Object Creation

[Chart: States / second, Original Algorithm vs. Pre-compute Stops vs. Robot Equiv. vs. Arrays not Sets vs. Less Objects; axis from 0 to 5000]
112. ### Do Things Faster: Use Object Identity Instead of Deep Equality

    class BoardState
      def with_robot_moved(robot, direction)
        moved_robot = robot.moved(direction, self)
        return self if moved_robot.equal?(robot)
        moved_robots = robots.map do |each_robot|
          each_robot.equal?(robot) ? moved_robot : each_robot
        end
        self.class.new(moved_robots, goal)
      end
    end
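The difference between `==` and `equal?` is worth seeing directly. In this sketch `Robot` is a plain `Struct` and `moved` is a simplified free-standing method, both chosen for illustration rather than taken from the solver.

```ruby
Robot = Struct.new(:color, :cell)

a = Robot.new(:blue, [2, 5])
b = Robot.new(:blue, [2, 5]) # same contents, different object

p a == b      # => true  (Struct#== compares members one by one)
p a.equal?(b) # => false (equal? is object identity)
p a.equal?(a) # => true

# Returning the same object when nothing changed lets callers use the
# cheap identity check instead of comparing every member.
def moved(robot, next_cell)
  next_cell == robot.cell ? robot : Robot.new(robot.color, next_cell)
end

p moved(a, [2, 5]).equal?(a) # => true, no new object allocated
p moved(a, [0, 5]).equal?(a) # => false
```

Identity comparison is only safe here because a robot that did not move is returned as the very same object, which is what the previous optimization arranged.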
113. ### Do Things Faster: Use Object Identity Instead of Deep Equality

[Chart: States / second, Original Algorithm vs. Pre-compute Stops vs. Robot Equiv. vs. Arrays not Sets vs. Less Objects vs. Object Identity; axis from 0 to 7000]

114. ### Results So Far

[Chart: Solving time (seconds) for Original, Active First, Check at Gen., Pre-compute, Robot Equiv., Arrays not Sets, Less Objects, Robot Identity; axis from 0 to 3000]

122. ### Best-First Search

    def solve
      paths = FastContainers::PriorityQueue.new(:min).tap do |paths|
        add_path(paths, Path.initial(initial_state))
      end
      until paths.empty?
        path = paths.top
        paths.pop
        successors = path.allowable_successors
        solution = successors.find(&:solved?)
        return solution.to_outcome if solution
        add_paths(paths, successors)
      end
      Outcome.no_solution(initial_state)
    end
123. ### Best-First Search

    def add_paths(paths, successors)
      successors.each { |path| add_path(paths, path) }
    end

    def add_path(paths, path)
      paths.push(path, score(path))
    end

    def score(path)
      # ...
    end
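To experiment with the idea without any gem, the sketch below swaps in a tiny pure-Ruby queue. `TinyMinQueue` is a stand-in written for this example and is not the FastContainers API; the toy graph and the block-based scoring are likewise assumptions.

```ruby
# Minimal stand-in for a priority queue: keeps [priority, sequence, item]
# triples sorted so the lowest priority pops first (ties pop in FIFO order).
class TinyMinQueue
  def initialize
    @entries = []
    @sequence = 0
  end

  def push(item, priority)
    entry = [priority, @sequence += 1, item]
    index = @entries.bsearch_index { |other| (other.first(2) <=> entry.first(2)) > 0 }
    @entries.insert(index || @entries.size, entry)
  end

  def pop
    @entries.shift&.last
  end

  def empty?
    @entries.empty?
  end
end

GRAPH = { a: %i[b c], b: %i[d], c: %i[goal], d: %i[goal], goal: [] }.freeze

# Best-first search: the block scores a path, and lower scores are explored first.
def best_first(start, goal)
  queue = TinyMinQueue.new
  queue.push([start], yield([start]))
  until queue.empty?
    path = queue.pop
    return path if path.last == goal

    GRAPH.fetch(path.last).each do |successor|
      next_path = path + [successor]
      queue.push(next_path, yield(next_path))
    end
  end
  nil
end

p best_first(:a, :goal) { |path| path.length }  # => [:a, :c, :goal]
p best_first(:a, :goal) { |path| -path.length } # => [:a, :b, :d, :goal]
```

Scoring by path length reproduces breadth-first behavior, while scoring by negative length dives deep first. The search loop stays the same, and only the scoring function decides which path looks "best".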

left
127. ### A* Algorithm

    def score(path)
      path.length + best_estimate
    end

    def best_estimate
      state.active_robots.map { |robot| estimate_at(robot.cell) }.min
    end

130-134. ### A* Algorithm

[Figure: board grid with each cell labeled by an estimate; 0 at the target, then cells labeled 1, 2, 3, 4, and 5 filled in progressively]
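The slides show the estimate grid but not how it is built. One common way to produce such a table, offered here purely as an assumption, is to search outward from the target while pretending a robot may stop on any cell along a line. The sketch below does this on a 4 x 4 board with no walls, so it only ever produces 0, 1, and 2; `estimates_for` and `score` are hypothetical names.

```ruby
SIZE = 4
DIRECTIONS = [[-1, 0], [1, 0], [0, -1], [0, 1]].freeze

# Assumption: no walls, and a robot may stop on any cell along a line.
# Real moves are more restricted, so each estimate never exceeds the true cost.
def estimates_for(target)
  estimates = { target => 0 }
  frontier = [target]
  until frontier.empty?
    cell = frontier.shift
    DIRECTIONS.each do |d_row, d_col|
      row = cell[0] + d_row
      col = cell[1] + d_col
      while row.between?(0, SIZE - 1) && col.between?(0, SIZE - 1)
        unless estimates.key?([row, col])
          estimates[[row, col]] = estimates[cell] + 1
          frontier << [row, col]
        end
        row += d_row
        col += d_col
      end
    end
  end
  estimates
end

# A*-style score: moves made so far plus the estimate of moves remaining.
def score(moves_so_far, robot_cell, estimates)
  moves_so_far + estimates.fetch(robot_cell)
end

estimates = estimates_for([1, 2])
SIZE.times do |row|
  puts((0...SIZE).map { |col| estimates[[row, col]] }.join(" "))
end
# Prints:
# 2 2 1 2
# 1 1 0 1
# 2 2 1 2
# 2 2 1 2

p score(2, [3, 0], estimates) # => 4
```

Because the estimate never overstates the remaining moves under these assumptions, ordering the queue by this score still finds a shortest solution while steering the search toward the target.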

MRU robots

types

145. ### Acknowledgements

• Trever Yarrish of Zeal for the awesome graphics and visualizations
• My fellow Zeals for ideas, feedback, and pairing on the solver
• Michael Fogleman for some optimization ideas
• Trevor Lalish-Menagh for introducing me to the game
• Screen Captures from War Games. (Dir. John Badham. MGM/UA. 1983)