Building Autonomous Agents with gym-retro
Kartones
October 26, 2018
Programming
Given at MindCamp X
Transcript
@Kartones
observation, reward, done, info = environment.step(action)
action = environment.action_space.sample()
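The two calls above combine into the usual Gym-style episode loop. The sketch below is a minimal illustration, not code from the talk: `StubEnvironment` and `StubActionSpace` are hypothetical stand-ins that only mimic the `reset()`/`step()` contract, so the loop runs without a ROM or gym-retro installed, and the reward rule is made up.

```python
import random


class StubActionSpace:
    """Hypothetical stand-in for a 9-button binary action space."""

    def __init__(self, n_buttons):
        self.n_buttons = n_buttons

    def sample(self):
        # One 0/1 flag per controller button, chosen uniformly at random.
        return [random.randint(0, 1) for _ in range(self.n_buttons)]


class StubEnvironment:
    """Hypothetical environment that mimics the reset()/step() contract."""

    def __init__(self, episode_length=10):
        self.action_space = StubActionSpace(9)
        self.episode_length = episode_length
        self.current_time = 0

    def reset(self):
        self.current_time = 0
        return [0.0]  # placeholder observation

    def step(self, action):
        self.current_time += 1
        observation = [float(self.current_time)]
        reward = 1.0 if action[-1] else 0.0  # made-up reward rule
        done = self.current_time >= self.episode_length
        info = {"time": self.current_time}
        return observation, reward, done, info


def run_random_episode(environment):
    """Play one episode with random actions; return (steps, total_reward)."""
    environment.reset()
    total_reward, steps, done = 0.0, 0, False
    while not done:
        action = environment.action_space.sample()
        observation, reward, done, info = environment.step(action)
        total_reward += reward
        steps += 1
    return steps, total_reward
```

With a real gym-retro environment, the same loop shape should apply once `StubEnvironment` is swapped for the environment object.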
paddle_size = 24
paddle_safe_margin = paddle_size / 4
if last_info:
    go_left = 1 if (last_info['player_x_start'] + paddle_safe_margin > last_info['ball_x']) else 0
    go_right = 1 if (last_info['player_x_end'] - paddle_safe_margin < last_info['ball_x']) else 0
    # ["B", null, "SELECT", "START", "UP", "DOWN", "LEFT", "RIGHT", "A"]
    action = [0, 0, 0, 0, 0, 0, go_left, go_right, self.env.current_time % 2]
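The paddle heuristic above can be wrapped in a standalone function to make it easier to try out. This is a sketch under assumptions: the function name `paddle_action` and passing `current_time` as an argument are illustrative choices, while the `info` keys and the button order are taken from the slide.

```python
PADDLE_SIZE = 24
PADDLE_SAFE_MARGIN = PADDLE_SIZE / 4


def paddle_action(last_info, current_time):
    """Return a 9-button action that steers the paddle towards the ball."""
    ball_x = last_info["ball_x"]
    # Press LEFT when the ball is left of the paddle's inner safe zone.
    go_left = 1 if last_info["player_x_start"] + PADDLE_SAFE_MARGIN > ball_x else 0
    # Press RIGHT when the ball is right of the paddle's inner safe zone.
    go_right = 1 if last_info["player_x_end"] - PADDLE_SAFE_MARGIN < ball_x else 0
    # Button order: ["B", null, "SELECT", "START", "UP", "DOWN", "LEFT", "RIGHT", "A"]
    # "A" is pressed on odd time steps only, so it toggles every frame.
    return [0, 0, 0, 0, 0, 0, go_left, go_right, current_time % 2]


# Example: paddle spans x = 100..124 and the ball is far to the left.
example = paddle_action({"player_x_start": 100, "player_x_end": 124, "ball_x": 50}, 3)
```

The safe margin keeps the paddle still while the ball is over its middle section, which avoids jittering left and right on every frame.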
probability = numpy.random.random()
if probability < EPSILON:
    return self._random_movement()
else:
    if self.env.current_time < self.best_time:
        action = self.best_actions[self.env.current_time]
    else:
        return self._ai_movement(last_info)
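The selection logic above mixes three sources of actions: random exploration, a replay of the best run recorded so far, and the heuristic once the replay runs out. A minimal standalone sketch follows; the function `choose_action`, its parameters, and the use of `len(best_actions)` in place of `best_time` are assumptions made for illustration, not the author's implementation.

```python
import random


def choose_action(current_time, best_actions, epsilon,
                  random_movement, ai_movement, rng=random):
    """Explore with probability epsilon, otherwise replay or use the heuristic."""
    if rng.random() < epsilon:
        # Exploration: ignore everything learned so far.
        return random_movement()
    if current_time < len(best_actions):
        # Exploitation: replay the action recorded at this time step.
        return best_actions[current_time]
    # Past the end of the recorded run: fall back to the heuristic.
    return ai_movement()
```

Passing the two movement strategies in as callables keeps the sketch independent of any particular agent class.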
def _epsilon_value(self):
    if self.USE_DECAYING_EPSILON:
        # with 1 becomes practically random
        return 0.1 / (self.env.current_time + 1)
    else:
        return self.EPSILON
Why decaying epsilon = 0.1?
0.1 / 5 = 0.02
0.1 / 250 = 0.0004
0.1 / 500 = 0.0002
0.1 / 1000 = 0.0001
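The decay schedule in `_epsilon_value` divides 0.1 by the number of elapsed steps plus one, so exploration becomes rare very quickly. The sketch below reproduces that schedule as a free function; the name `decaying_epsilon` and the `base` parameter are illustrative assumptions.

```python
def decaying_epsilon(current_time, base=0.1):
    """Exploration probability at a 0-based time step: base / (current_time + 1)."""
    return base / (current_time + 1)


for divisor in (5, 250, 500, 1000):
    # divisor plays the role of current_time + 1 in the formula above.
    print(divisor, decaying_epsilon(divisor - 1))
```

A base of 0.1 caps exploration at 10% on the very first step; a base of 1 would make the first step fully random.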
Soon at: github.com/kartones
github.com/openai/retro-baselines/agents
• Intro documentation: blog.openai.com/gym-retro/
• Example code and documentation: github.com/openai/retro/, github.com/openai/retro-baselines/, "gotta_learn_fast_report.pdf"
• Game integration guide: github.com/openai/retro/blob/develop/IntegratorsGuide.md
• Contests and other resources: openai.com
• JERK sources: www.noob-programmer.com/openai-retro-contest/jerk-agent-algorithm/