Visual Forecasting by Imitating Dynamics in Natural Sequences (ICCV’17): Paper Introduction

jellied_unagi
February 03, 2018

Computer Vision Study Group @ Kanto #44: Reinforcement Learning Paper Reading Session

Transcript

  1. Visual Forecasting by Imitating Dynamics in Natural Sequences (ICCV’17)

    Kuo-Hao Zeng, William B. Shen, De-An Huang, Min Sun, Juan Carlos Niebles. Presenter: @jellied_unagi MIRU2018
  2. • ECCV’18

    • https://twitter.com/rezoolab/status/958173195944669184
  3. Visual Forecasting

    Visual Forecasting by Imitating Dynamics in Natural Sequences (https://arxiv.org/abs/1708.05827)
  4. Visual Forecasting -> imitation learning

    • CMU Deep RL and Control (https://katefvision.github.io/)
    • Generative Adversarial Imitation Learning [Ho & Ermon, NIPS’16] (https://arxiv.org/abs/1606.03476)
    • Visual Forecasting by Imitating Dynamics in Natural Sequences [Zeng+, ICCV’17] (https://arxiv.org/abs/1708.05827)
  5. Imitation Learning (IL) and RL

    Deep RL and Control (https://katefvision.github.io/katefSlides/immitation_learning_I_katef.pdf)
  6. Behavior Cloning

    • Imitation learning as supervised learning
    • NVIDIA [Bojarski+,16]: NN
    Deep RL and Control (https://katefvision.github.io/katefSlides/immitation_learning_I_katef.pdf)
  7. Behavior Cloning

    • Demonstration augmentation [Bojarski+,16; Giusti+,15]
    • Dataset aggregation (DAGGER) [Ross+,11]
    Deep RL and Control (https://katefvision.github.io/katefSlides/immitation_learning_I_katef.pdf)
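The dataset aggregation loop [Ross+,11] named on this slide can be sketched roughly as follows. This is a toy illustration only, not the setup from the lecture slides: the 1-D dynamics `x + a`, the clipped `expert_action`, and the one-parameter linear learner `a = theta * x` are all assumptions made for the example.

```python
def expert_action(x):
    # Hypothetical expert: push the state toward 0, action clipped to [-1, 1].
    return max(-1.0, min(1.0, -0.5 * x))


def rollout(policy, x0, horizon):
    """Run `policy` from x0 under the toy dynamics x' = x + a; return visited states."""
    states, x = [], x0
    for _ in range(horizon):
        states.append(x)
        x = x + policy(x)
    return states


def fit_linear(dataset):
    """Least-squares fit of a = theta * x (behavior cloning as supervised learning)."""
    num = sum(x * a for x, a in dataset)
    den = sum(x * x for x, _ in dataset)
    return num / den


def dagger(starts, horizon, n_iters):
    # Initial dataset: states visited by the expert, labeled with expert actions.
    dataset = [(x, expert_action(x))
               for x0 in starts for x in rollout(expert_action, x0, horizon)]
    theta = fit_linear(dataset)
    for _ in range(n_iters):
        # 1. Run the current learner and record the states it actually visits.
        learner = lambda x, th=theta: th * x
        visited = [x for x0 in starts for x in rollout(learner, x0, horizon)]
        # 2. Ask the expert to label those states.
        # 3. Aggregate the new pairs and retrain on the whole dataset.
        dataset += [(x, expert_action(x)) for x in visited]
        theta = fit_linear(dataset)
    return theta, dataset


theta, dataset = dagger(starts=[4.0, -3.0, 2.0], horizon=5, n_iters=3)
```

The point of the loop is that the training states come from the learner's own rollouts, so the learner is corrected on the states it drifts into rather than only on the expert's trajectories.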
  8. Inverse Reinforcement Learning (IRL)

    • Apprenticeship learning [Abbeel & Ng, 04]: $\tilde{c} = \mathrm{IRL}(\pi_E)$, then $\pi = \mathrm{RL}(\tilde{c})$
    Generative Adversarial Imitation Learning (https://arxiv.org/abs/1606.03476)
  9. Generative Adversarial Imitation Learning (https://arxiv.org/abs/1606.03476)

    $$\mathbb{E}_{\pi_E}[c(s,a)] \le \mathbb{E}_{\pi}[c(s,a)] \quad \forall \pi$$
    $$\mathbb{E}_{\pi}[c(s,a)] = \mathbb{E}\left[\sum_{t=0}^{\infty} \gamma^{t} c(s_t, a_t)\right]$$
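As a small numerical illustration of the discounted expected cost on this slide, the sketch below estimates $\mathbb{E}_{\pi}[c(s,a)]$ by averaging discounted cost sums over sampled trajectories. The per-step cost values and `gamma = 0.9` are made-up numbers, chosen only so that the expert trajectories come out cheaper than the learner's.

```python
def discounted_cost(costs, gamma):
    """Return sum_t gamma^t * c_t for the per-step costs of one trajectory."""
    total = 0.0
    for t, c in enumerate(costs):
        total += (gamma ** t) * c
    return total


def expected_cost(trajectories, gamma):
    """Monte Carlo estimate of E_pi[c(s, a)]: average over sampled trajectories."""
    return sum(discounted_cost(tr, gamma) for tr in trajectories) / len(trajectories)


# Hypothetical per-step costs c(s_t, a_t) along two trajectories per policy.
expert_costs = [[0.0, 0.1, 0.0], [0.1, 0.0, 0.0]]
learner_costs = [[1.0, 0.5, 0.2], [0.8, 0.9, 0.1]]
gamma = 0.9

expert_estimate = expected_cost(expert_costs, gamma)
learner_estimate = expected_cost(learner_costs, gamma)
```

Under a cost function that satisfies the inequality on the slide, `expert_estimate` should not exceed the estimate for any other policy.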
  10. Generative Adversarial Imitation Learning (https://arxiv.org/abs/1606.03476)

    $$\mathrm{IRL}(\pi_E) = \operatorname*{argmax}_{c \in \mathcal{C}} \min_{\pi \in \Pi} -\psi(c) - H(\pi) + \mathbb{E}_{\pi}[c(s,a)] - \mathbb{E}_{\pi_E}[c(s,a)]$$
    $$\mathrm{RL}(c) = \operatorname*{argmin}_{\pi \in \Pi} -H(\pi) + \mathbb{E}_{\pi}[c(s,a)]$$
    • [Abbeel & Ng, ’04]: $\mathcal{C} = \{ w^{\top} f(s,a) \mid \|w\|_2 \le 1 \}$
    • Max-Ent IRL [Ziebart+, 08]: $H(\pi) = \mathbb{E}_{\pi}[-\log \pi(a \mid s)]$
    http://www.singularpoint.org/blog/math/optimization/logit-maximum-entoropy/
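For the linear cost class $\mathcal{C} = \{ w^{\top} f(s,a) \mid \|w\|_2 \le 1 \}$, the inner gap $\mathbb{E}_{\pi}[c] - \mathbb{E}_{\pi_E}[c]$ equals $w^{\top}(\mu_{\pi} - \mu_{E})$, where $\mu$ denotes discounted feature expectations, so by the Cauchy-Schwarz inequality the worst-case cost is attained at $w = (\mu_{\pi} - \mu_{E}) / \|\mu_{\pi} - \mu_{E}\|_2$. The sketch below checks this numerically; it ignores the $\psi$ and entropy terms, and the feature vectors and `gamma` are invented for illustration.

```python
import math


def feature_expectation(trajectories, gamma):
    """Average over trajectories of sum_t gamma^t f(s_t, a_t)."""
    dim = len(trajectories[0][0])
    mu = [0.0] * dim
    for traj in trajectories:
        for t, f in enumerate(traj):
            for i in range(dim):
                mu[i] += (gamma ** t) * f[i] / len(trajectories)
    return mu


def worst_case_cost(mu_pi, mu_e):
    """Maximizer of w^T (mu_pi - mu_e) over ||w||_2 <= 1, plus the maximal gap."""
    diff = [p - e for p, e in zip(mu_pi, mu_e)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        return [0.0] * len(diff), 0.0
    return [d / norm for d in diff], norm


def cost_gap(w, mu_pi, mu_e):
    """E_pi[c] - E_piE[c] for the linear cost c(s, a) = w^T f(s, a)."""
    return sum(wi * (p - e) for wi, p, e in zip(w, mu_pi, mu_e))


# Hypothetical 2-D features f(s_t, a_t) along two trajectories per policy.
learner_trajs = [[[1.0, 0.0], [1.0, 0.0]], [[0.5, 0.5], [1.0, 0.0]]]
expert_trajs = [[[0.0, 1.0], [0.0, 1.0]], [[0.5, 0.5], [0.0, 1.0]]]
gamma = 0.9

mu_pi = feature_expectation(learner_trajs, gamma)
mu_e = feature_expectation(expert_trajs, gamma)
w_star, max_gap = worst_case_cost(mu_pi, mu_e)
```

Any other weight vector in the unit ball, for example `[1.0, 0.0]`, yields a gap no larger than `max_gap`.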
  11. Occupancy Measure

    $$\rho_{\pi}(s,a) = \pi(a \mid s) \sum_{t=0}^{\infty} \gamma^{t} P(s_t = s \mid \pi)$$
    $$\mathbb{E}_{\pi}[c(s,a)] = \sum_{s,a} \rho_{\pi}(s,a)\, c(s,a)$$
    $$H(\pi) = -\sum_{s,a} \rho_{\pi}(s,a) \log\left( \rho_{\pi}(s,a) \Big/ \sum_{a'} \rho_{\pi}(s,a') \right) = \bar{H}(\rho_{\pi})$$
    Generative Adversarial Imitation Learning (https://arxiv.org/abs/1606.03476)
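The identities on this slide can be checked numerically on a small tabular MDP. The sketch below is only an illustration: the two-state, two-action transition table `P`, the policy `pi`, the cost table `c`, and the truncation of the infinite sum at 500 steps are all assumptions made for the example.

```python
import math


def occupancy_measure(p0, P, pi, gamma, horizon=500):
    """rho[s][a] = pi(a|s) * sum_t gamma^t P(s_t = s | pi), truncated at `horizon`."""
    n_s, n_a = len(pi), len(pi[0])
    rho = [[0.0] * n_a for _ in range(n_s)]
    d = list(p0)      # state distribution at time t
    discount = 1.0    # gamma^t
    for _ in range(horizon):
        for s in range(n_s):
            for a in range(n_a):
                rho[s][a] += discount * d[s] * pi[s][a]
        nxt = [0.0] * n_s
        for s in range(n_s):
            for a in range(n_a):
                for s2 in range(n_s):
                    nxt[s2] += d[s] * pi[s][a] * P[s][a][s2]
        d, discount = nxt, discount * gamma
    return rho


def expected_cost_from_rho(rho, c):
    """E_pi[c(s, a)] written as sum_{s,a} rho(s, a) c(s, a)."""
    return sum(rho[s][a] * c[s][a]
               for s in range(len(rho)) for a in range(len(rho[0])))


def expected_cost_bellman(p0, P, pi, c, gamma, iters=500):
    """Same quantity via iterative policy evaluation, as an independent check."""
    n_s, n_a = len(pi), len(pi[0])
    V = [0.0] * n_s
    for _ in range(iters):
        V = [sum(pi[s][a] * (c[s][a] + gamma * sum(P[s][a][s2] * V[s2]
                                                   for s2 in range(n_s)))
                 for a in range(n_a)) for s in range(n_s)]
    return sum(p0[s] * V[s] for s in range(n_s))


def causal_entropy_from_rho(rho):
    """H-bar(rho) = -sum_{s,a} rho(s,a) log(rho(s,a) / sum_a' rho(s,a'))."""
    h = 0.0
    for row in rho:
        total = sum(row)
        for r in row:
            if r > 0.0:
                h -= r * math.log(r / total)
    return h


# Hypothetical MDP: P[s][a][s2] is the probability of moving from s to s2 under a.
p0 = [1.0, 0.0]
P = [[[0.9, 0.1], [0.2, 0.8]], [[0.5, 0.5], [0.0, 1.0]]]
pi = [[0.7, 0.3], [0.4, 0.6]]
c = [[1.0, 0.0], [0.5, 2.0]]
gamma = 0.9

rho = occupancy_measure(p0, P, pi, gamma)
cost_via_rho = expected_cost_from_rho(rho, c)
cost_via_bellman = expected_cost_bellman(p0, P, pi, c, gamma)
```

Because $\rho_{\pi}(s,a) / \sum_{a'} \rho_{\pi}(s,a') = \pi(a \mid s)$, `causal_entropy_from_rho(rho)` agrees with the direct form $\sum_{s,a} \rho_{\pi}(s,a)\,(-\log \pi(a \mid s))$, and the total mass of `rho` is $1/(1-\gamma)$ up to truncation error.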
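  12. Occupancy Measure

    Generative Adversarial Imitation Learning (https://arxiv.org/abs/1606.03476)
    $$\max_{c} \min_{\pi \in \Pi} -\psi(c) - H(\pi) + \mathbb{E}_{\pi}[c(s,a)] - \mathbb{E}_{\pi_E}[c(s,a)]$$
    With $\psi$ constant:
    $$\max_{c} \min_{\rho} -\mathrm{const} - \bar{H}(\rho) + \sum_{s,a} c(s,a)\,\bigl(\rho(s,a) - \rho_E(s,a)\bigr)$$
    $$\Leftrightarrow \quad \min_{\rho} -\bar{H}(\rho) \quad \text{s.t.} \quad \rho(s,a) = \rho_E(s,a)$$
    Lagrangian duality:
    $$\operatorname*{minimize}_{x} f(x) \ \ \text{s.t.} \ \ h_i(x) = 0 \quad \Leftrightarrow \quad \operatorname*{maximize}_{\lambda} \inf_{x} f(x) + \sum_{i} \lambda_i h_i(x)$$
    http://ir5.hatenablog.com/entry/20141214/1418553079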
  13. Occupancy Measure

    • “IRL is the dual of an occupancy measure matching problem”
    $$\min_{\rho} -\bar{H}(\rho) \quad \text{s.t.} \quad \rho(s,a) = \rho_E(s,a)$$
    • -> GAIL
    Generative Adversarial Imitation Learning (https://arxiv.org/abs/1606.03476)
  14. Generative Adversarial IL

    $$\min_{\pi} \psi^{*}(\rho_{\pi} - \rho_E) - H(\pi)$$
    Generative Adversarial Imitation Learning (https://arxiv.org/abs/1606.03476)
  15. Generative Adversarial IL

    • GAN
    • Discriminator: $D(s, a)$
    • Generator
    • adversarial loss:
    $$\psi^{*}_{GA}(\rho_{\pi} - \rho_E) = \max_{D \in (0,1)^{\mathcal{S} \times \mathcal{A}}} \mathbb{E}_{\pi}[\log D(s,a)] + \mathbb{E}_{\pi_E}[\log(1 - D(s,a))]$$
    Generative Adversarial Imitation Learning (https://arxiv.org/abs/1606.03476)
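A minimal sketch of the discriminator side of this adversarial loss, assuming a logistic discriminator on hand-made feature vectors of $(s, a)$ pairs. The features, learning rate, and step count are invented for illustration, and the policy update (done with TRPO in the GAIL paper) is not shown. Following the sign convention of the objective on this slide, $D$ is pushed toward 1 on learner samples and toward 0 on expert samples.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def d_value(w, b, x):
    """Logistic discriminator D(s, a) in (0, 1) on a feature vector x of (s, a)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def gail_objective(w, b, learner, expert):
    """Sample estimate of E_pi[log D(s, a)] + E_piE[log(1 - D(s, a))]."""
    term_pi = sum(math.log(d_value(w, b, x)) for x in learner) / len(learner)
    term_e = sum(math.log(1.0 - d_value(w, b, x)) for x in expert) / len(expert)
    return term_pi + term_e


def ascent_step(w, b, learner, expert, lr):
    """One gradient-ascent step on the objective with respect to (w, b)."""
    gw, gb = [0.0] * len(w), 0.0
    for x in learner:
        g = (1.0 - d_value(w, b, x)) / len(learner)  # d/dz log sigmoid(z)
        gw = [gi + g * xi for gi, xi in zip(gw, x)]
        gb += g
    for x in expert:
        g = -d_value(w, b, x) / len(expert)          # d/dz log(1 - sigmoid(z))
        gw = [gi + g * xi for gi, xi in zip(gw, x)]
        gb += g
    return [wi + lr * gi for wi, gi in zip(w, gw)], b + lr * gb


# Hypothetical features of (s, a) pairs sampled from the learner and the expert.
learner = [[1.0, 0.2], [0.8, 0.1], [0.9, 0.3]]
expert = [[0.1, 0.9], [0.2, 1.0], [0.0, 0.8]]

w, b = [0.0, 0.0], 0.0
before = gail_objective(w, b, learner, expert)
for _ in range(200):
    w, b = ascent_step(w, b, learner, expert, lr=0.5)
after = gail_objective(w, b, learner, expert)

# Surrogate cost log D(s, a) that the policy (generator) would then minimize.
surrogate_costs = [math.log(d_value(w, b, x)) for x in learner]
```

In a full GAIL loop these discriminator steps would alternate with policy updates that lower $\mathbb{E}_{\pi}[\log D(s,a)]$, so that the learner's state-action pairs become harder to tell apart from the expert's.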
  16. Generative Adversarial IL

    [Figure labels: D, RL, G, (s, a)]
    Generative Adversarial Imitation Learning (https://arxiv.org/abs/1606.03476)
    TRPO: https://www.slideshare.net/mooopan/trust-region-policy-optimization
  17. Visual Forecasting by Imitating Dynamics in Natural Sequences (ICCV’17)

    Kuo-Hao Zeng, William B. Shen, De-An Huang, Min Sun, Juan Carlos Niebles. Presenter: @jellied_unagi
  18. Visual Forecasting

    Visual Forecasting by Imitating Dynamics in Natural Sequences (https://arxiv.org/abs/1708.05827)
  19. • occupancy measure matching

    Visual Forecasting by Imitating Dynamics in Natural Sequences (https://arxiv.org/abs/1708.05827)
  20. [Figure labels: generator, discriminator]

    Visual Forecasting by Imitating Dynamics in Natural Sequences (https://arxiv.org/abs/1708.05827)
  21. • deep feature (ResNet)

    • generative adversarial IL
    • technical contribution
    Visual Forecasting by Imitating Dynamics in Natural Sequences (https://arxiv.org/abs/1708.05827)
  22. • Future frame generation

      • Moving MNIST
    • Action prediction
      • TV Human Interaction Dataset
    • Storyline forecasting
      • Short-term (next frame) & long-term (next representative event)
      • Visual Story-telling Dataset
    Visual Forecasting by Imitating Dynamics in Natural Sequences (https://arxiv.org/abs/1708.05827)
  23. Future Frame Generation

    • Adversarial loss, L1 loss
    • 3-steps, concatenate
    • LSTM [Srivastava+, 15]
    Visual Forecasting by Imitating Dynamics in Natural Sequences (https://arxiv.org/abs/1708.05827)
  24. Action prediction, Story forecasting

    Visual Forecasting by Imitating Dynamics in Natural Sequences (https://arxiv.org/abs/1708.05827)
  25. • Visual forecasting: generative adversarial IL

    • RL/IL in CV
    • IL: visual forecasting [Zeng+, ICCV’17], activity forecasting [Ma+, CVPR’17; Rhinehart+, ICCV’17]
    • RL: object tracking [Huang+, ICCV’17], activity recognition [Wu+, ICCV’17]
    • [Huang+, ICCV’17] Object tracking (CNN, forward): Learning Policies for Adaptive Tracking with Deep Feature Cascades (https://arxiv.org/abs/1708.02973)
    • [Wu+, ICCV’17] Anticipating Daily Intention using On-Wrist Motion Triggered Sensing (https://arxiv.org/abs/1710.07477)