Introduction for Reading "Reinforcement Learning"

mocobt
November 04, 2019

Introductory slides for a reading group on the Japanese book "Reinforcement Learning".

The book was written by Tetsuro Morimura and is published by Kodansha Scientific.
More information is available at https://www.kspub.co.jp/book/detail/5155912.html

This activity is unofficial.

Transcript

  1. "Reinforcement Learning" Reading Group: Introduction @mocobt @icebee__ Organizers @lystahi

  2. Agenda

    • Overview (10 min) by @mocobt
    • Self-introductions (10 min)
    • Chapter 1 walkthrough (60 min) by @lystahi
    • Assigning presenters (10 min)
    • Chapter 2 walkthrough (as far as time allows) by @icebee__
    • Pack-up (5 min)
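  3. Overview

    Goal: get a "rough" understanding of reinforcement learning theory (reviewing afterwards is recommended)
    Audience
    • People who want to learn the theory of reinforcement learning
    • People who want fresh insights from talking with beginners (welcome)
    Venue: Shibuya or Shinjuku (negotiable)
    Fee: in principle (venue cost / number of participants) yen; free for students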
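  4. Target book

    • Machine Learning Professional Series, "Reinforcement Learning"
    • Written by Tetsuro Morimura
    • Published by Kodansha Scientific
    • Theory-focused; it does not cover implementation
    • https://www.kspub.co.jp/book/detail/5155912.html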
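  5. How we proceed

    • In principle, we cover one chapter per session
      - If we get ahead we will move the schedule forward, so the following session's presenter should also prepare materials
    • Presenter's TODO
      - Pick out the parts you consider important and make slides
      - For anything you could not follow, saying "I didn't understand this" is fine
      - Of course, it helps a lot if you can follow everything!
      - Unless something serious comes up, please avoid last-minute cancellations
    • Audience's TODO: skim the chapter beforehand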
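  6. Schedule

    Date         Chapter  Topic                                               Presenter
    11/05 (Tue)  1        Preliminaries                                       @lystahi
    11/19 (Tue)  2        Planning                                            @icebee__
    12/03 (Tue)  3        The exploration-exploitation trade-off              •
    12/03 (Tue)  4        Model-free reinforcement learning
    12/17 (Tue)  5        Model-based reinforcement learning
    01/14 (Tue)  6        Reinforcement learning with function approximation
    01/28 (Tue)  7        Partially observable Markov decision processes
    02/11 (Tue)  8        Recent topics
    Note: Chapters 3 and 4 are handled by the same presenter.
    Note: The schedule may be adjusted to fit participants' availability.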
  7. (In case there is time left) A bit of small talk: what do we gain from doing reinforcement learning?

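  8. A rough picture of reinforcement learning https://deepage.net/machine_learning/2017/08/10/reinforcement-learning.html
    1. The agent takes an action and receives a reward
    2. The agent's state transitions according to the reward
    3. The agent learns the actions that maximize the reward it collects
    Repeat until convergence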

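  9. Example applications of reinforcement learning
    • Character animation generation
    • NN architecture search via inverse reinforcement learning
    • Image restoration
    • Ray tracing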

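  10. RL-Restore [Yu++. CVPR 2018]

    A study of a method that restores images in a unified way, regardless of the type of degradation
    Problem with prior work: task-specific architectures generalize poorly (e.g., they can only deblur…)
    Novelty: an RL method that uses multiple CNNs and dynamically changes the architecture according to the type of degradation
    (Figure labels: Deblurring, Denoising)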
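  11. IRLAS [Guo++. CVPR 2019]

    A study of a method for searching for "strong" NN architectures
    Inverse reinforcement learning (Inverse RL): a method that estimates the reward from optimal behavior (handy when the reward is hard to define)
    Novelty: Neural Architecture Search using Inverse RL
    (Figure: left: ResNeXt, center: NASNet, right: NN found by the proposed method)
    https://qiita.com/neka-nat@github/items/aaab6184aea7d285b103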
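  12. DeepMimic [Peng++. TOG 2018]

    A study on generating the most plausible motion for carrying out a task
    Problem with prior work: motion produced with Mocap alone or RL alone looks creepy (affected by noise, or it merely gets the task done)
    Novelty: motion generation that uses a Reference Motion while RL accomplishes the specific task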
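  13. Learning Light Transport the Reinforced Way [Dahm++. SIGGRAPH 2017 Talks]

    A third-party implementation: https://github.com/PalashBansal96/CUDAPathTracerRL
    A study on using RL in Ray Tracing to cast rays efficiently
    Problem with prior work: even after casting many rays, the image converges slowly and stays noisy
    (Figure: a, b, c: conventional methods; d: proposed method, with the same number of rays)
    Novelty: introducing RL into the sampling step of Path Tracing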
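  14. References

    • Coursera Reinforcement Learning Specialization: for those who want proper instruction
    • Prof. Richard S. Sutton's homepage: many reference books on reinforcement learning
    • https://github.com/openai/gym: for those who want to implement things
    • AWS DeepRacer: an autonomous-driving competition that uses reinforcement learning. For those who crave a fight
    • https://github.com/aikorea/awesome-rl: awesome repository (survey)
    • Mathpix Snipping Tool: handy when you want to put equations into slides (though it has become mostly paid)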
  15. Self-introduction time

  16. About me @mocobt

    • R&D engineer in CG
      - Working on lighting and modeling
    • Machine learning is a hobby & Kaggle Expert
    • Reinforcement learning novice
      - I may say fairly sloppy things
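
As a supplement to slide 8, the "act, observe a reward, learn, repeat until convergence" loop can be sketched with tabular Q-learning. This is only a minimal illustration, not code from the slides or the book: the five-cell corridor environment, the function names (`step`, `train`, `greedy_policy`), and all hyperparameters are assumptions made up for this example. Note that in the usual formulation the environment returns both the reward and the next state in response to the action, which is how the sketch is written.

```python
import random

# Hypothetical toy environment (not from the slides): a corridor of 5 cells.
# The agent starts in cell 0; reaching cell 4 gives reward 1 and ends the episode.
N_STATES = 5
ACTIONS = (-1, +1)  # move left, move right


def step(state, action):
    """Return (next_state, reward, done) for one move in the corridor."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon; also pick at random on ties,
            # so the untrained agent still wanders through the corridor.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Move Q(s, a) toward reward + discounted best value of the next state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Greedy action (-1 or +1) for every non-terminal state."""
    return [ACTIONS[0 if row[0] > row[1] else 1] for row in q[:-1]]


if __name__ == "__main__":
    q_table = train()
    print(greedy_policy(q_table))
```

After training, the greedy policy moves right in every cell, which is the reward-maximizing behaviour in this toy corridor. For realistic environments, the openai/gym repository listed in the references is the more natural starting point.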