
Curiosity-driven Exploration: 好奇心駆動探索

Yasunari Ota
October 05, 2018

An explanation of Curiosity-driven Exploration, a reinforcement learning method.

Yasunari Ota, third-year undergraduate, Hokkaido Univ., Faculty of Engineering, Mechanical Engineering Department


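  1. Optimizing the feature embedding $\phi$: inverse dynamics model

    After observing the states $s_t$ and $s_{t+1}$ from the environment, encode them into feature vectors $\phi(s_t)$ and $\phi(s_{t+1})$ with a CNN or similar. The inverse model estimates the action taken between the two states ($s_t, s_{t+1} \to a_t$): $\hat{a}_t = g(s_t, s_{t+1}; \theta_I)$. Training minimizes a loss function that measures the mismatch between the actual action $a_t$ and the estimated action $\hat{a}_t$. The goal is to optimize $\phi$.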
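  2. Introducing curiosity: forward dynamics model

    From the state and action ($s_t, a_t \to s_{t+1}$), the forward model estimates the features of the next state: $\hat{\phi}(s_{t+1}) = f(\phi(s_t), a_t; \theta_F)$, where $\phi$ is the feature vector obtained from the inverse model. Training minimizes an error function that measures the mismatch between the actual next state $\phi(s_{t+1})$ and the estimated one $\hat{\phi}(s_{t+1})$. The intrinsic reward grows the harder the next state is to predict: $r^i_t = \frac{\eta}{2} \| \hat{\phi}(s_{t+1}) - \phi(s_{t+1}) \|^2$.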
  3. Structure of the Intrinsic Curiosity Module (ICM)

    Convolutional layers: 4 layers. Filters: 32 each. Kernel size: 3x3.
    Fully connected layers: 288+1 → 256 → 288
    Fully connected layers: 288x2 → 256 → 4
  4. References

    • Original paper: https://pathak22.github.io/noreward-rl/resources/icml17.pdf
    • Unity Blog: https://blogs.unity3d.com/jp/2018/06/26/solving-sparse-reward-tasks-with-curiosity/
    • Presentation material from the 44th CV Study Group (CV勉強会), "Reinforcement Learning Paper Reading Session": https://www.slideshare.net/takmin/curiosity-driven-exploration
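
The quantities on the slides can be checked with a small standard-library Python sketch. It is an illustration under stated assumptions, not the author's implementation. The 42x42 input size, stride of 2 and padding of 1 are not on the slides and are assumed here because they reproduce the 288-dimensional feature vector. Cross-entropy is assumed as the inverse-model loss for discrete actions, since the slides only say "loss function". All function names and the default value of eta are hypothetical.

```python
import math


def conv_out(size, kernel=3, stride=2, padding=1):
    """Spatial size after one convolution (standard floor formula)."""
    return (size + 2 * padding - kernel) // stride + 1


def feature_dim(input_size=42, layers=4, filters=32):
    """Length of the flattened feature vector phi(s) after the conv stack.

    Assumption: square input, stride 2 and padding 1 in every layer.
    """
    size = input_size
    for _ in range(layers):
        size = conv_out(size)
    return filters * size * size


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def inverse_loss(action_logits, action):
    """Assumed inverse-model loss: cross-entropy between the predicted
    action distribution and the index of the action actually taken."""
    return -math.log(softmax(action_logits)[action])


def intrinsic_reward(phi_pred, phi_next, eta=0.01):
    """r^i_t = (eta / 2) * ||phi_pred - phi_next||^2 (eta is a placeholder)."""
    sq_error = sum((p - q) ** 2 for p, q in zip(phi_pred, phi_next))
    return 0.5 * eta * sq_error


if __name__ == "__main__":
    print(feature_dim())                      # 42 -> 21 -> 11 -> 6 -> 3, so 32 * 3 * 3
    print(inverse_loss([0.0, 0.0, 0.0, 0.0], 2))
    print(intrinsic_reward([1.0, 2.0], [0.0, 0.0], eta=2.0))
```

Under these assumptions the convolution stack yields 288 features, which matches the fully connected layer sizes listed on the ICM slide. A forward-model prediction that matches the true next-state features exactly gives zero intrinsic reward, and the reward grows quadratically with the prediction error.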