Metalearning shared Hierarchy
Wonseok Jung
August 28, 2018
Science
Paper review
Transcript
Meta Learning Shared Hierarchy
Wonseok Jung
Reinforcement Learning
About the speaker
Wonseok Jung
City University of New York, Baruch College, Data Science Major
ConnexionAI, AI Researcher
Deep Learning College, Reinforcement Learning Researcher
Modulabs, CTRL Leader
Reinforcement Learning, Object Detection, Chatbot
Github: https://github.com/wonseokjung
Facebook: https://www.facebook.com/wsjung
Blog: https://wonseokjung.github.io
Contents
1. Introduction
2. Problem Statement
3. Algorithm
4. Experiments
META LEARNING SHARED HIERARCHIES
1. INTRODUCTION
1. UTILIZE PRIOR KNOWLEDGE
Utilize prior knowledge
Master new task
1.1 BUT REINFORCEMENT…
How about Reinforcement Learning?
1.2 SOLVE EACH TASK INDEPENDENTLY AND FROM SCRATCH
SUPERMARIO WITH R.L: https://www.youtube.com/watch?v=IjvbhwuCaF0
1.3 ISSUES
Sharing information: Task1, Task2, Task3 (θ1, θ2, θ3)
1.4 MASTER POLICY
Master Policy: Sub1, Sub2, Sub3 (θ1, θ2, θ3)
1.5 MLSH
Metalearning shared hierarchies
2. PROBLEM STATEMENT
2.1 NOTATION
REINFORCEMENT LEARNING
t: Time step
s: State
a: Action
r: Reward
S: Set of states
A: Set of actions
R: Set of rewards
S0: Start state
P(s′, r ∣ s, a): Transition function
γ: Discount factor
π: Policy
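As a small illustration of how the reward r, the time step t, and the discount factor γ from the notation fit together, the sketch below computes a discounted return, the sum of γ^t · r_t over time. The function name `discounted_return` and the example rewards are hypothetical and not taken from the slides.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Sum of gamma**t * r_t over time steps t = 0, 1, ... (illustrative helper)."""
    total = 0.0
    # Accumulate from the last reward backwards: G_t = r_t + gamma * G_{t+1}.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# Hypothetical episode: three rewards of 1.0 with discount factor 0.5.
example = discounted_return([1.0, 1.0, 1.0], gamma=0.5)
```

With γ close to 0 the agent mostly values immediate reward; with γ close to 1 later rewards count almost as much as early ones.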
2.2 NOTATION
P_M: distribution over MDPs
πθ,ϕ(a∣s): the agent's policy, which is updated through its parameter vectors
ϕ: parameters shared among the tasks
θ: per-task parameters, which the agent updates while learning the current task M
2.3 OBJECTIVE MDP
[Figure: agent-environment loop. The agent observes state St and reward Rt and takes action At; the environment returns reward Rt+1 and state St+1.]
2.4 NEW MDP
[Figure: the same loop (At, Rt, St, Rt+1, St+1) for a new MDP, labeled "Tap the ball" and "Positive Reward".]
2.5 NEW MDP-2
[Figure: the same loop for another new MDP, labeled "Reward" and "Penalty".]
2.6 FIND SHARING PARAMETER
$\text{maximize}_{\phi} \; E_{M \sim P_M,\; t = 0 \ldots T-1}[R]$
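One way to read this objective is as an average over sampled tasks: draw an MDP M from P_M, let the agent act in it for T time steps, and average the resulting return R. The sketch below estimates that expectation by Monte Carlo. Everything in it is a hypothetical stand-in: `sample_task` plays the role of P_M, and `lifetime_return` plays the role of running the agent for t = 0 … T−1 on the sampled task and summing its reward.

```python
import random
from typing import Any, Callable


def estimate_objective(
    sample_task: Callable[[random.Random], Any],
    lifetime_return: Callable[[Any], float],
    num_tasks: int,
    seed: int = 0,
) -> float:
    """Monte Carlo estimate of E_{M ~ P_M}[R] from num_tasks sampled tasks."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_tasks):
        task = sample_task(rng)          # M ~ P_M
        total += lifetime_return(task)   # R collected over the T steps on M
    return total / num_tasks


# Toy stand-ins: a "task" is just a number, and the return equals that number.
estimate = estimate_objective(
    sample_task=lambda rng: rng.choice([1.0, 3.0]),
    lifetime_return=lambda task: task,
    num_tasks=1000,
)
```

In this reading, a good ϕ is one for which the average stays high even though θ has to be learned separately on each sampled task.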
2.7 STRUCTURE
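The structure of a master policy over sub-policies can be sketched in a few lines. This is a minimal sketch, not the presenter's implementation: the class name `HierarchicalPolicy` and the `period` parameter are hypothetical, and it assumes (following the MLSH paper rather than the slide text) that the master policy re-selects a sub-policy only every fixed number of steps. The sub-policies stand for the shared parameters ϕ and the master stands for the per-task parameters θ.

```python
from typing import Callable, List, Sequence

State = Sequence[float]
SubPolicy = Callable[[State], int]      # shared parameters (phi) would live here
MasterPolicy = Callable[[State], int]   # per-task parameters (theta) would live here


class HierarchicalPolicy:
    """The master picks a sub-policy; the active sub-policy picks the action."""

    def __init__(self, master: MasterPolicy, subs: List[SubPolicy], period: int):
        self.master = master
        self.subs = subs
        self.period = period   # assumed number of steps between master decisions
        self.t = 0
        self.active = 0

    def act(self, state: State) -> int:
        # Re-select the active sub-policy only at the start of each period.
        if self.t % self.period == 0:
            self.active = self.master(state)
        self.t += 1
        return self.subs[self.active](state)


# Toy usage: two constant sub-policies; the master switches on the sign of state[0].
policy = HierarchicalPolicy(
    master=lambda s: 0 if s[0] < 0 else 1,
    subs=[lambda s: 10, lambda s: 20],
    period=2,
)
actions = [policy.act([x]) for x in (-1.0, 1.0, 1.0, -1.0)]
```

Because the master only acts once per period, the second and fourth states in the toy run do not change which sub-policy is active.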
3. ALGORITHM
3.1 MLSH ALGORITHM
3.2 MLSH ALGORITHM
Two main components
3.3 MLSH ALGORITHM
Warmup period, Joint update period
3.4 MLSH ALGORITHM
Warmup period, Joint update period
3.5 MLSH ALGORITHM
Warmup period: θ update
Joint update period: θ, ϕ update
3.6 MLSH ALGORITHM-2
Warmup period: θ update
Joint update period: θ, ϕ update
3.7 MLSH ALGORITHM-WARMUP
update
3.8 MLSH ALGORITHM- JOINT UPDATE PERIOD
update
3.8 MLSH ALGORITHM
update
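The two periods can be laid out as a training loop: during warmup only θ is updated, and during the joint update period both θ and ϕ are updated. The sketch below shows only that schedule. The function `train_mlsh` and all of its arguments are hypothetical, the update rules are passed in as plain callables rather than real policy-gradient steps, and resetting θ for every sampled task is an assumption taken from the MLSH paper, not from the slide text.

```python
from typing import Any, Callable, List, Tuple


def train_mlsh(
    sample_task: Callable[[], Any],
    init_theta: Callable[[], Any],
    update_theta: Callable[[Any, Any, Any], Any],
    update_phi: Callable[[Any, Any, Any], Any],
    phi: Any,
    iterations: int,
    warmup_steps: int,
    joint_steps: int,
) -> Tuple[Any, List[Any]]:
    """Run the warmup / joint-update schedule; return phi and the final theta per task."""
    final_thetas = []
    for _ in range(iterations):
        task = sample_task()
        theta = init_theta()                  # assumed: theta restarts for each task
        for _ in range(warmup_steps):         # warmup period: theta only
            theta = update_theta(theta, phi, task)
        for _ in range(joint_steps):          # joint update period: theta and phi
            theta = update_theta(theta, phi, task)
            phi = update_phi(theta, phi, task)
        final_thetas.append(theta)
    return phi, final_thetas


# Toy usage: "parameters" are counters, so the result shows how often each was updated.
phi_updates, theta_updates = train_mlsh(
    sample_task=lambda: None,
    init_theta=lambda: 0,
    update_theta=lambda theta, phi, task: theta + 1,
    update_phi=lambda theta, phi, task: phi + 1,
    phi=0,
    iterations=3,
    warmup_steps=2,
    joint_steps=4,
)
```

In the toy run, ϕ accumulates updates across all three tasks, while each task's θ starts from zero and is updated in both periods.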
4. EXPERIMENTS
4.1 2D MOVING BANDITS TASK
4.2 RESULT (2D BALL)
4.3 WALKING, CRAWLING
4.4 WALKING, CRAWLING