Building Autonomous Agents with gym-retro

Kartones
October 26, 2018

Given at MindCamp X

Transcript

  1. paddle_size = 24
     paddle_safe_margin = paddle_size / 4
     if last_info:
         go_left = 1 if (last_info['player_x_start'] + paddle_safe_margin > last_info['ball_x']) else 0
         go_right = 1 if (last_info['player_x_end'] - paddle_safe_margin < last_info['ball_x']) else 0
         # ["B", null, "SELECT", "START", "UP", "DOWN", "LEFT", "RIGHT", "A"]
         action = [0, 0, 0, 0, 0, 0, go_left, go_right, self.env.current_time % 2]
     # (a full environment loop using this action is sketched after the transcript)
  2. probability = numpy.random.random()
     if probability < EPSILON:
         return self._random_movement()
     else:
         if self.env.current_time < self.best_time:
             action = self.best_actions[self.env.current_time]
         else:
             return self._ai_movement(last_info)
     # (a combined epsilon-greedy agent sketch follows the transcript)
  3. def _epsilon_value(self):
         if self.USE_DECAYING_EPSILON:
             # with 1 becomes practically random
             return 0.1 / (self.env.current_time + 1)
         else:
             return self.EPSILON
  4. Why decaying epsilon = 0.1?
     0.1 / 5    = 0.02
     0.1 / 250  = 0.0004
     0.1 / 500  = 0.0002
     0.1 / 1000 = 0.0001
     i.e. with epsilon = 0.1 / (current_time + 1), exploration quickly becomes negligible as the run progresses.
  5. • Intro documentation: blog.openai.com/gym-retro/
     • Example code and documentation: github.com/openai/retro/
       github.com/openai/retro-baselines/ ("gotta_learn_fast_report.pdf")
     • Game integration guide: github.com/openai/retro/blob/develop/IntegratorsGuide.md
     • Contests and other resources: openai.com
     • JERK sources: www.noob-programmer.com/openai-retro-contest/jerk-agent-algorithm/
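
A minimal sketch of how an action array like the one on slide 1 feeds back into a gym-retro environment loop. The game id ('Arkanoid-Nes') is an assumption, and the info keys (player_x_start, player_x_end, ball_x) only exist if the game integration's data.json maps those RAM addresses, as in the talk's custom integration; the slide's self.env.current_time is replaced here by the plain loop counter.

    import retro

    # Assumed game id; requires the ROM plus an integration exposing the custom variables.
    env = retro.make(game='Arkanoid-Nes')
    env.reset()

    last_info = None
    paddle_size = 24
    paddle_safe_margin = paddle_size / 4

    for step in range(5000):
        # 9 buttons: ["B", null, "SELECT", "START", "UP", "DOWN", "LEFT", "RIGHT", "A"]
        action = [0] * 9
        if last_info:
            # Follow the ball: move only when it leaves the paddle's safe margin.
            go_left = 1 if last_info['player_x_start'] + paddle_safe_margin > last_info['ball_x'] else 0
            go_right = 1 if last_info['player_x_end'] - paddle_safe_margin < last_info['ball_x'] else 0
            # Alternate pressing A (e.g. to launch the ball), as in the slide's current_time % 2.
            action = [0, 0, 0, 0, 0, 0, go_left, go_right, step % 2]
        _obs, _reward, done, last_info = env.step(action)
        if done:
            break

    env.close()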
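
Slides 2-4 combined read as a small epsilon-greedy policy: with probability epsilon take a random action, otherwise replay the best action sequence found so far, and fall back to the rule-based movement once past it. The sketch below is an illustrative reconstruction under those assumptions; the class name, the helper bodies, and env.current_time (a frame counter assumed to be exposed by a wrapper around the retro env) are not the talk's exact code, only the branching and the decaying-epsilon formula mirror the slides.

    import numpy

    class EpsilonGreedyAgent:
        EPSILON = 0.1
        USE_DECAYING_EPSILON = True

        def __init__(self, env, num_buttons=9):
            self.env = env                # assumed wrapper exposing env.current_time
            self.num_buttons = num_buttons
            self.best_time = 0            # length of the best run recorded so far
            self.best_actions = []        # action list of that best run

        def _epsilon_value(self):
            if self.USE_DECAYING_EPSILON:
                # 0.1 / 5 = 0.02, 0.1 / 1000 = 0.0001: exploration fades quickly
                return 0.1 / (self.env.current_time + 1)
            return self.EPSILON

        def _random_movement(self):
            # Assumption: only LEFT (6), RIGHT (7) or A (8) are worth exploring in a paddle game.
            action = [0] * self.num_buttons
            action[numpy.random.choice([6, 7, 8])] = 1
            return action

        def _ai_movement(self, last_info):
            # Placeholder for the rule-based heuristic from slide 1.
            return [0] * self.num_buttons

        def next_action(self, last_info):
            if numpy.random.random() < self._epsilon_value():
                return self._random_movement()
            if self.env.current_time < self.best_time:
                return self.best_actions[self.env.current_time]
            return self._ai_movement(last_info)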