Metalearning shared Hierarchy
Wonseok Jung
August 28, 2018
Paper review
Transcript
Meta Learning Shared Hierarchy. Wonseok Jung. Reinforcement Learning
Wonseok Jung. City University of New York, Baruch College, Data Science major. Connexion AI, AI Researcher. Deep Learning College, Reinforcement Learning Researcher. Modulabs CTRL Leader. Reinforcement Learning, Object Detection, Chatbot.
Github: https://github.com/wonseokjung Facebook: https://www.facebook.com/wsjung Blog: https://wonseokjung.github.io
Contents 1. Introduction 2. Problem Statement 3. Algorithm 4. Experiments
META LEARNING SHARED HIERARCHIES
1. INTRODUCTION
1. UTILIZE PRIOR KNOWLEDGE Utilize prior knowledge to master a new task
1.1 BUT REINFORCEMENT… How about Reinforcement Learning?
1.2 SOLVE EACH TASK INDEPENDENTLY AND FROM SCRATCH Super Mario with RL: https://www.youtube.com/watch?v=IjvbhwuCaF0
1.3 ISSUES Sharing information across Task1, Task2, Task3 (θ1, θ2, θ3)
1.4 MASTER POLICY A master policy over Sub1, Sub2, Sub3 (θ1, θ2, θ3)
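The master/sub-policy picture on slide 1.4 can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the class name `HierarchicalPolicy`, the `period` parameter, and the toy policies are hypothetical, and the rule that the master re-selects a sub-policy every fixed number of steps is taken from the MLSH paper rather than from the slide.

```python
from typing import Callable, List, Sequence

State = int
Action = int
SubPolicy = Callable[[State], Action]


class HierarchicalPolicy:
    """A master policy that switches between sub-policies.

    Assumption (not on the slide): the master picks a new sub-policy
    every `period` steps, and the chosen sub-policy acts in between.
    """

    def __init__(
        self,
        master: Callable[[State], int],
        subs: Sequence[SubPolicy],
        period: int,
    ) -> None:
        self.master = master
        self.subs: List[SubPolicy] = list(subs)
        self.period = period
        self.steps = 0
        self.active = 0  # index of the sub-policy currently in control

    def act(self, state: State) -> Action:
        # Re-select the active sub-policy at the start of each period.
        if self.steps % self.period == 0:
            self.active = self.master(state)
        self.steps += 1
        return self.subs[self.active](state)


if __name__ == "__main__":
    # Toy stand-ins: each sub-policy always returns its own index.
    subs = [lambda s: 0, lambda s: 1, lambda s: 2]
    policy = HierarchicalPolicy(master=lambda s: s % 3, subs=subs, period=2)
    print([policy.act(s) for s in [1, 2, 5, 0]])
```

In this sketch the master only sees the state at the start of each period, so the action at the second and fourth steps comes from whichever sub-policy was chosen one step earlier.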
1.5 MLSH Meta Learning Shared Hierarchies
2. PROBLEM STATEMENT
2.1 NOTATION (reinforcement learning)
t: time step
s: state
a: action
r: reward
S: set of states
A: set of actions
R: set of rewards
S0: start state
P(s′, r ∣ s, a): transition function
γ: discount factor
π: policy
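To show how the reward r, the time step t, and the discount factor γ from this notation fit together, here is a small sketch of the standard discounted return, the sum of γ^k times the k-th reward. The function name `discounted_return` is hypothetical and the slide itself does not give this formula.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Return sum_k gamma**k * rewards[k], accumulated from the last reward back."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


if __name__ == "__main__":
    # Three rewards of 1 with gamma = 0.5: 1 + 0.5 + 0.25
    print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))
```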
2.2 NOTATION
P_M: distribution over MDPs
π_{θ,ϕ}(a∣s): the agent's policy, updated through its parameter vectors
ϕ: parameters shared across tasks
θ: task-specific parameters, updated as the agent learns the current task M
2.3 OBJECTIVE MDP [Diagram: agent-environment loop. In state St with reward Rt the agent takes action At; the environment returns reward Rt+1 and state St+1.]
2.4 NEW MDP [Diagram: the same loop with a new environment. Tap the ball: positive reward.] New MDP
2.5 NEW MDP-2 (Super Mario with RL) [Diagram: the same agent-environment loop. Reward: penalty.] Another New MDP
2.6 FIND SHARING PARAMETER
$$\max_{\phi}\ \mathbb{E}_{M \sim P_M,\ t = 0, \dots, T-1}\,[R]$$
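One way to read the objective on slide 2.6 is as an average, over tasks M sampled from P_M, of the reward collected during a lifetime of T steps. The sketch below estimates that quantity by Monte Carlo sampling. It is only an illustration under that reading: `estimate_objective`, `toy_sample_task`, and `toy_run_lifetime` are hypothetical names, and the toy task is simply a number that is paid out as the reward at every step.

```python
import random
from typing import Callable, List


def estimate_objective(
    sample_task: Callable[[], float],
    run_lifetime: Callable[[float, int], List[float]],
    n_tasks: int,
    horizon: int,
) -> float:
    """Average, over sampled tasks, of the total reward in `horizon` steps."""
    total = 0.0
    for _ in range(n_tasks):
        task = sample_task()                   # M ~ P_M
        rewards = run_lifetime(task, horizon)  # rewards for t = 0 .. T-1
        total += sum(rewards)
    return total / n_tasks


def toy_sample_task() -> float:
    """Toy P_M: a task is just its per-step reward, either 0 or 1."""
    return random.choice([0.0, 1.0])


def toy_run_lifetime(task: float, horizon: int) -> List[float]:
    """Toy lifetime: the agent receives the task's reward at every step."""
    return [task] * horizon


if __name__ == "__main__":
    print(estimate_objective(toy_sample_task, toy_run_lifetime, n_tasks=100, horizon=10))
```

In a real setting `run_lifetime` would include the agent's learning of θ on the sampled task, so the estimate depends on ϕ through how quickly that learning pays off.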
2.7 STRUCTURE
3. ALGORITHM
3.1 MLSH ALGORITHM
3.2 MLSH ALGORITHM Two main components
3.3 MLSH ALGORITHM Warmup period, joint update period
3.4 MLSH ALGORITHM Warmup period, joint update period
3.5 MLSH ALGORITHM Warmup period: update θ. Joint update period: update θ, ϕ
3.6 MLSH ALGORITHM-2 Warmup period: update θ. Joint update period: update θ, ϕ
3.7 MLSH ALGORITHM-WARMUP update
3.8 MLSH ALGORITHM-JOINT UPDATE PERIOD update
3.8 MLSH ALGORITHM update
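The two-phase schedule on slides 3.3 to 3.8 can be sketched as a loop over sampled tasks. This is a toy illustration, not the paper's code: parameters are stand-in update counters rather than network weights, the function name `run_mlsh` is hypothetical, and the assumption that θ is re-initialized for every sampled task comes from the MLSH paper, not from the slides.

```python
from typing import List, Tuple

Params = int  # toy stand-in: a parameter "vector" is just an update counter


def run_mlsh(
    n_tasks: int,
    warmup_steps: int,
    joint_steps: int,
    phi: Params = 0,
) -> Tuple[Params, List[Params]]:
    """Run the warmup / joint-update schedule and count parameter updates."""
    final_thetas: List[Params] = []
    for _ in range(n_tasks):
        theta: Params = 0  # assumption: theta restarts for each sampled task
        # Warmup period: only the task-specific parameters theta are updated.
        for _ in range(warmup_steps):
            theta += 1
        # Joint update period: theta and the shared parameters phi are both updated.
        for _ in range(joint_steps):
            theta += 1
            phi += 1
        final_thetas.append(theta)
    return phi, final_thetas


if __name__ == "__main__":
    phi, thetas = run_mlsh(n_tasks=3, warmup_steps=2, joint_steps=4)
    print(phi, thetas)
```

In a real implementation each `+= 1` would be a gradient step on experience collected in the current task; the counters only make visible that ϕ accumulates updates across tasks while θ does not.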
4. EXPERIMENTS
4.1 2D MOVING BANDITS TASK
4.2 RESULT (2D BALL)
4.3 WALKING, CRAWLING
4.4 WALKING, CRAWLING