Building Autonomous Agents with gym-retro
Kartones
October 26, 2018
Given at MindCamp X
Transcript
@Kartones
observation, reward, done, info = environment.step(action)
action = environment.action_space.sample()
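The two calls above are enough to build a purely random agent. Below is a minimal sketch of that loop, assuming the classic gym `step` contract shown on the slide (four return values). `StubEnvironment` and `StubActionSpace` are hypothetical stand-ins, not part of gym-retro, so that the snippet runs without the library or a game ROM installed; the reward they produce is made up. With gym-retro the environment would instead come from `retro.make(...)`.

```python
import random


class StubActionSpace:
    """Hypothetical stand-in for a 9-button binary action space."""

    def __init__(self, buttons=9, seed=0):
        self.buttons = buttons
        self._rng = random.Random(seed)

    def sample(self):
        # One random 0/1 press per button.
        return [self._rng.randint(0, 1) for _ in range(self.buttons)]


class StubEnvironment:
    """Hypothetical environment that ends after a fixed number of steps."""

    def __init__(self, max_steps=100):
        self.action_space = StubActionSpace()
        self.max_steps = max_steps
        self.current_step = 0

    def reset(self):
        self.current_step = 0
        return None  # a real environment returns the first screen frame

    def step(self, action):
        self.current_step += 1
        # Made-up reward, for illustration only: 1 point when the last button is pressed.
        reward = 1.0 if action[-1] else 0.0
        done = self.current_step >= self.max_steps
        return None, reward, done, {"step": self.current_step}


def run_random_episode(environment):
    """Play one episode with random actions and return (total_reward, steps)."""
    environment.reset()
    total_reward, steps, done = 0.0, 0, False
    while not done:
        action = environment.action_space.sample()
        observation, reward, done, info = environment.step(action)
        total_reward += reward
        steps += 1
    return total_reward, steps


if __name__ == "__main__":
    print(run_random_episode(StubEnvironment(max_steps=50)))
```

A random agent like this is mostly useful as a baseline: any heuristic or learned policy should at least beat its average reward.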
paddle_size = 24
paddle_safe_margin = paddle_size/4
if last_info:
    go_left = 1 if (last_info['player_x_start'] + paddle_safe_margin > last_info['ball_x']) else 0
    go_right = 1 if (last_info['player_x_end'] - paddle_safe_margin < last_info['ball_x']) else 0
    # ["B", null, "SELECT", "START", "UP", "DOWN", "LEFT", "RIGHT", "A"]
    action = [0, 0, 0, 0, 0, 0, go_left, go_right, self.env.current_time % 2]
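The heuristic above can be pulled out into a standalone function, which makes it easy to check by hand. This is a sketch under assumptions: the function name `paddle_action` and the index constants are hypothetical, the `info` keys are the ones used on the slide, and returning "press nothing" when there is no game state yet is a choice made here, not something the slide shows.

```python
PADDLE_SIZE = 24
PADDLE_SAFE_MARGIN = PADDLE_SIZE / 4

# Button order from the slide:
# ["B", null, "SELECT", "START", "UP", "DOWN", "LEFT", "RIGHT", "A"]
LEFT, RIGHT, A = 6, 7, 8


def paddle_action(last_info, current_time):
    """Return a 9-button action that moves the paddle towards the ball."""
    action = [0] * 9
    if not last_info:
        return action  # assumption: no game state yet, so press nothing
    # Ball is left of the paddle's safe zone: move left.
    if last_info["player_x_start"] + PADDLE_SAFE_MARGIN > last_info["ball_x"]:
        action[LEFT] = 1
    # Ball is right of the paddle's safe zone: move right.
    if last_info["player_x_end"] - PADDLE_SAFE_MARGIN < last_info["ball_x"]:
        action[RIGHT] = 1
    # Tap A on every other frame, as in the original snippet.
    action[A] = current_time % 2
    return action


if __name__ == "__main__":
    info = {"player_x_start": 100, "player_x_end": 124, "ball_x": 50}
    print(paddle_action(info, current_time=3))
```

The safe margin keeps the paddle still while the ball is above its central part, which avoids jittering left and right on every frame.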
probability = numpy.random.random()
if probability < EPSILON:
    return self._random_movement()
else:
    if self.env.current_time < self.best_time:
        action = self.best_actions[self.env.current_time]
    else:
        return self._ai_movement(last_info)
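The selection logic above combines epsilon-greedy exploration with replaying the best run recorded so far. A minimal standalone sketch follows, with several assumptions: `choose_action` is a hypothetical name, the random and heuristic movements are passed in as callables, the standard library `random` module replaces `numpy.random`, and `len(best_actions)` stands in for `self.best_time`.

```python
import random

EPSILON = 0.1  # assumed exploration rate


def choose_action(current_time, best_actions, random_movement, ai_movement, rng=random):
    """Pick an action: explore with probability EPSILON, otherwise exploit."""
    if rng.random() < EPSILON:
        # Explore: try something random to discover better runs.
        return random_movement()
    if current_time < len(best_actions):
        # Exploit: replay the action recorded at this time step in the best run.
        return best_actions[current_time]
    # Past the end of the recording: fall back to the heuristic.
    return ai_movement()


if __name__ == "__main__":
    recorded = ["left", "left", "right"]
    for t in range(5):
        print(t, choose_action(t, recorded, lambda: "random", lambda: "heuristic"))
```

Passing the random source as a parameter is only a convenience for testing; it lets a fixed value be injected to force either branch.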
def _epsilon_value(self):
    if self.USE_DECAYING_EPSILON:
        # with 1 becomes practically random
        return 0.1 / (self.env.current_time + 1)
    else:
        return self.EPSILON
Why decaying epsilon = 0.1?
0.1 / 5 = 0.02
0.1 / 250 = 0.0004
0.1 / 500 = 0.0002
0.1 / 1000 = 0.0001
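The figures above follow from dividing the base value by the number of elapsed steps. A small sketch to reproduce them, assuming the same formula as `_epsilon_value`; the function name `decaying_epsilon` and its `base` parameter are hypothetical. Since the formula divides by `current_time + 1`, a divisor of 5 corresponds to `current_time = 4`.

```python
def decaying_epsilon(current_time, base=0.1):
    """Exploration rate that shrinks as the episode advances."""
    return base / (current_time + 1)


if __name__ == "__main__":
    # Divisors used on the slide: 5, 250, 500 and 1000.
    for divisor in (5, 250, 500, 1000):
        print(divisor, decaying_epsilon(divisor - 1))
    # With base=1 the first step is always random, and early steps mostly so.
    print(decaying_epsilon(0, base=1))
```

With a base of 0.1 the agent explores at most 10% of the time on the first step and almost never late in a run, so the replayed prefix of a good run is rarely disturbed.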
Soon at: github.com/kartones
github.com/openai/retro-baselines/agents
• Intro documentation: blog.openai.com/gym-retro/
• Example code and documentation: github.com/openai/retro/, github.com/openai/retro-baselines/, "gotta_learn_fast_report.pdf"
• Game integration guide: github.com/openai/retro/blob/develop/IntegratorsGuide.md
• Contests and other resources: openai.com
• JERK sources: www.noob-programmer.com/openai-retro-contest/jerk-agent-algorithm/