これからの強化学習 2.7 (Reinforcement Learning from Here On, Section 2.7)
moyomot
May 19, 2017
Transcript
これからの強化学習 2.7: Compound Reinforcement Learning. GUNOSY Data Mining Study Group #121
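INTRODUCTION: What this talk covers
▸ Understand the compounding effect of profit through concrete examples
▸ Formulate compounding within the Q-learning framework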
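INTRODUCTION: Contents
▸ 2.7.1 The compounding effect of profit and the investment ratio
▸ A concrete example of the compounding effect using the geometric mean
▸ 2.7.2 The compound reinforcement learning framework
▸ Formulating the action-value function Q
▸ 2.7.3 Compound reinforcement learning algorithms
▸ Compound Q-learning
▸ Compound OnPS (Online Profit Sharing)
▸ 2.7.4 Optimizing the investment ratio
▸ 2.7.5 Application to finance: government bond selection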
2.7.1 The compounding effect of profit and the investment ratio: three-armed bandit
▸ Which machine is the best deal?
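2.7.1 The compounding effect of profit and the investment ratio: which machine is the best deal?
▸ The essence of the problem is whether to judge by the arithmetic mean or the geometric mean
▸ When you keep betting the full amount, including the profit earned so far, you need to consider the geometric mean (choosing A is better)
▸ The geometric mean is used when a quantity changes by ratios, as with compound interest
▸ (If you bet 1 dollar every round, judging by the arithmetic mean should give the better result)
▸ https://www.jstage.jst.go.jp/article/tjsai/26/2/26_2_330/_pdf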
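The difference between the two means can be sketched in a few lines of Python. The two machines below are hypothetical (the slide's actual payoffs are not in the transcript); each lists equally likely per-round rates of return. The risky machine wins on the arithmetic mean but shrinks the bankroll when everything is reinvested.

```python
import math

def arithmetic_mean(returns):
    """Average rate of return per round (relevant for a fixed-size bet)."""
    return sum(returns) / len(returns)

def geometric_mean_growth(returns):
    """Per-round growth factor of the bankroll when the full amount is reinvested."""
    log_sum = sum(math.log(1.0 + r) for r in returns)
    return math.exp(log_sum / len(returns))

# Hypothetical machines: each outcome in the list is equally likely.
machine_steady = [0.10, 0.10]   # always +10%
machine_risky = [1.00, -0.70]   # +100% or -70%

for name, returns in [("steady", machine_steady), ("risky", machine_risky)]:
    print(name, arithmetic_mean(returns), geometric_mean_growth(returns))
```

A growth factor below 1 means the bankroll decays in the long run even though the average rate of return is positive.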
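2.7.1 The compounding effect of profit and the investment ratio: the Kelly criterion for maximizing the compounding effect
▸ Together with Claude Shannon and others, Kelly used information sharing to study the most efficient way to bet in gambling. He never placed bets himself, however. (Wikipedia)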
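As a minimal sketch of the Kelly criterion, assume a simple binary bet that pays net odds b with win probability p (this setup and the numbers are illustrative assumptions, not taken from the slides). The Kelly fraction is then f* = p - (1 - p) / b, and a coarse grid search over the expected log growth lands on the same value.

```python
import math

def kelly_fraction(win_prob, net_odds):
    """Kelly fraction f* = p - (1 - p) / b for a binary bet."""
    return win_prob - (1.0 - win_prob) / net_odds

def expected_log_growth(fraction, win_prob, net_odds):
    """Expected log growth per round when betting `fraction` of the bankroll."""
    return (win_prob * math.log(1.0 + net_odds * fraction)
            + (1.0 - win_prob) * math.log(1.0 - fraction))

win_prob, net_odds = 0.6, 1.0  # hypothetical bet: 60% chance to double the stake
f_star = kelly_fraction(win_prob, net_odds)

# Cross-check: maximize expected log growth over fractions 0.00 .. 0.99.
grid = [i / 100 for i in range(100)]
best_on_grid = max(grid, key=lambda f: expected_log_growth(f, win_prob, net_odds))
print(f_star, best_on_grid)
```

Maximizing expected log growth is the same as maximizing the geometric mean of the per-round growth factor, which is why the criterion fits the compounding setting.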
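2.7.2 The compound reinforcement learning framework: comparing the two forms of discounting
Discounted return vs. discounted compound profit
R: rate of profit, γ: discount rate, f: investment ratio, r: reward
Goal: formulate the action-value function Q in recursive form so that Q-learning can be applied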
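As a hedged sketch of the two quantities being compared, assume the usual discounted return for rewards r and a compound return in which each period's gross return 1 + R f is discounted through the exponent γ^k. The symbols G_t and ρ_t are introduced here for illustration. Under that assumption, taking the logarithm turns the product into a sum with the same recursive structure as the ordinary return.

```latex
% Conventional discounted return (sum of discounted rewards)
G_t = \sum_{k=0}^{\infty} \gamma^{k}\, r_{t+k+1}

% Assumed discounted compound profit (product of discounted gross returns)
\rho_t = \prod_{k=0}^{\infty} \left(1 + R_{t+k+1} f\right)^{\gamma^{k}}

% Taking logs gives a sum and a one-step recursion
\log \rho_t = \sum_{k=0}^{\infty} \gamma^{k} \log\left(1 + R_{t+k+1} f\right)
            = \log\left(1 + R_{t+1} f\right) + \gamma \log \rho_{t+1}
```

The recursion on the last line is what lets value functions be defined and learned in the usual temporal-difference style.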
2.7.2 The compound reinforcement learning framework: formulating the state-value function
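2.7.2 The compound reinforcement learning framework: formulating the action-value function
All that remains is to learn the policy π that maximizes Q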
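2.7.3 Compound reinforcement learning algorithms: compound Q-learning
▸ The idea is the same as in ordinary Q-learning
▸ The reward is replaced by the logarithm of the profit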
2.7.3 Compound reinforcement learning algorithms: the compound Q-learning algorithm
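A minimal tabular sketch of one compound Q-learning update, assuming the reward term of ordinary Q-learning is replaced by log(1 + R f), where R is the observed rate of profit and f the investment ratio. The function name, state and action labels, and hyperparameter values are hypothetical.

```python
import math
from collections import defaultdict

def compound_q_update(q, state, action, rate_of_return, next_state, actions,
                      invest_ratio=0.5, alpha=0.1, gamma=0.9):
    """One Q-learning step whose reward is the log of the gross return 1 + R*f."""
    gross = 1.0 + rate_of_return * invest_ratio
    if gross <= 0.0:
        raise ValueError("1 + R*f must be positive (bankroll wiped out)")
    log_reward = math.log(gross)
    best_next = max(q[(next_state, a)] for a in actions)
    td_error = log_reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return q[(state, action)]

q = defaultdict(float)          # Q-table, all values start at 0
actions = ["A", "B"]            # hypothetical action labels
new_value = compound_q_update(q, "s0", "A", rate_of_return=0.2,
                              next_state="s1", actions=actions)
print(new_value)
```

Apart from the logarithmic reward, the update is the standard temporal-difference rule, which matches the slide's remark that the idea is the same as in ordinary Q-learning.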
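2.7.3 Compound reinforcement learning algorithms: compound OnPS (Online Profit Sharing)
▸ Profit Sharing
▸ Q-learning assumes the Markov property; Profit Sharing also works in non-Markov settings
▸ Each pair of state s and action a is given a priority P; F is the credit assignment function (reinforcement function)
▸ Known issue: it can learn irrational policies that execute many actions unnecessary for obtaining the reward
▸ https://www.jstage.jst.go.jp/article/fss/27/0/27_0_304/_pdf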
2.7.3 Compound reinforcement learning algorithms: the compound OnPS algorithm
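To make the roles of the priority P and the credit assignment function F concrete, here is a sketch of conventional episodic Profit Sharing. It is not the online or compound variant on the slide. The geometrically decaying credit function, its decay value, and the state and action labels are assumptions chosen for illustration.

```python
from collections import defaultdict

def geometric_credit(reward, steps_before_reward, decay=0.5):
    """Credit assignment F: less credit the further a step is from the reward."""
    return reward * decay ** steps_before_reward

def profit_sharing_update(priority, episode, reward, credit=geometric_credit):
    """Reinforce the priority P(s, a) of every pair visited in the episode."""
    for steps_back, (state, action) in enumerate(reversed(episode)):
        priority[(state, action)] += credit(reward, steps_back)
    return priority

priority = defaultdict(float)
episode = [("s0", "right"), ("s1", "right"), ("s2", "up")]  # reward arrives after the last step
profit_sharing_update(priority, episode, reward=1.0)
print(dict(priority))
```

Because credit is handed back along the whole episode rather than bootstrapped from the next state, the update does not rely on the Markov property.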
2.7.4 Optimizing the investment ratio: how do we choose the investment ratio f?
▸ Updating f with an online gradient method is enough
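One way to read the online gradient update is gradient ascent on the per-step log growth log(1 + R f), with f clipped to a safe range. The objective, step size, clipping bounds, and return sequence below are assumptions for illustration, not the slide's exact algorithm. The returns mimic an even-money bet won 60% of the time, for which the Kelly fraction is 0.2.

```python
def online_gradient_step(invest_ratio, rate_of_return, step_size=0.01, max_ratio=0.99):
    """One gradient-ascent step on log(1 + R*f), projected onto [0, max_ratio].

    Assumes rate_of_return >= -1, so that 1 + R*f stays positive after clipping.
    """
    gradient = rate_of_return / (1.0 + rate_of_return * invest_ratio)
    updated = invest_ratio + step_size * gradient
    return min(max(updated, 0.0), max_ratio)

# Hypothetical return stream: three wins (+100%) and two losses (-100%) per cycle.
observed_returns = [1.0, 1.0, 1.0, -1.0, -1.0] * 400

invest_ratio = 0.5  # deliberately poor starting value
for rate_of_return in observed_returns:
    invest_ratio = online_gradient_step(invest_ratio, rate_of_return)
print(invest_ratio)
```

With a constant step size the ratio keeps oscillating slightly, but it settles in the neighborhood of the Kelly fraction instead of staying at its starting value.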
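2.7.5 Example application to finance: government bond selection
Which country's government bonds should be bought?
▸ Compares compound Q-learning with conventional Q-learning
▸ Compound Q-learning, by working with the geometric mean, achieves a larger compounding effect on profit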