Semantic Machine Intelligence Lab., Keio Univ.
May 19, 2022
[Journal club] Unbiased Scene Graph Generation from Biased Training
Transcript
Unbiased Scene Graph Generation from Biased Training
Kaihua Tang (1), Yulei Niu (3), Jianqiang Huang (1, 2), Jiaxin Shi (4), Hanwang Zhang (1)
(1) Nanyang Technological University, (2) Damo Academy, Alibaba Group, (3) Renmin University of China, (4) Tsinghua University
Tang, Kaihua, et al. "Unbiased scene graph generation from biased training." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020.
Keio University, Komei Sugiura Laboratory, Shumpei Hatanaka
2 Overview
• Proposes a method for Scene Graph Generation (SGG) that uses a causal inference framework to reduce the bias introduced by the dataset
  - Counterfactual Thinking
  - Total Direct Effect (TDE)
• Introducing TDE improves scores over the baselines and reduces bias
✓ First work to consider the bias that the dataset introduces in SGG
✓ Can be applied as-is to existing SGG frameworks
3 Background: SGG was proposed as a computer vision component for tasks such as VQA
• Scene Graph (SG): a graph representation of the relationships between objects
  - Triplet <subject (object), predicate, object>
• Scene Graph Generation (SGG) task: predict the scene graph from an image
• Applications of SGs: Visual Question Answering (VQA), Image Retrieval, Caption Generation, etc.
Yang, Jianwei, et al. "Graph R-CNN for scene graph generation." Proceedings of the European Conference on Computer Vision (ECCV). 2018.
4 Existing methods: SGG models have been designed from various angles
SGG model | Characteristics
MOTIFS [Rowan+, IEEE2018] | Designed to capture higher-order motifs in scene graphs
VC-Tree [Kaihua+, IEEE2019] | Designs a score function that computes the validity of the dependency between objects, then builds a binarized tree from the maximum spanning tree of the score matrix
Figures: MOTIFS [Rowan+, IEEE2018], VC-Tree [Kaihua+, IEEE2019]
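5 Problem: the predicted visual relationships are trivial and carry little information
• The predicate distribution of the dataset is skewed to begin with
  - Long-tailed distribution
• Current state of existing SGG models
  ☺ Object detection accuracy
  ☹ Limited expressiveness when describing relationships (predicates)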
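6 Hypothesis: can a machine also eliminate bad bias by taking causality into account?
• A decision is formed from a combination of content and context
  - Content (endogenous reason): visual features of the subject and object
  - Context (exogenous reason): visual features of the union region of the subject and object, and of the paired object classes
• Difference between human and machine decision making
  - Human (causality-based): pursues the main effect and eliminates bad bias (side effects)
  - Machine (likelihood-based): predicts by likelihood, with content and context mixed together
We want the machine to distinguish the main effect from side effects as well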
7 Premise: predict the main effect through counterfactual reasoning
• The idea of counterfactual causality
  - "If I had not seen this content, would I still make the same prediction?"
• Concrete example: the correlation between ice cream sales and the crime rate
  - "When ice cream sales go up, the crime rate goes up" → does ice cream cause crime???
  - Rising temperature makes people go outside → ice cream sales go up → people gather in crowds, so the crime rate goes up
  - Possibility of a hidden variable → ice cream is ruled out as the cause
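The ice cream example can be checked numerically. The sketch below is illustrative only: the variable names, noise levels, and sample size are assumptions and do not come from the slides. Both quantities depend only on a hidden temperature variable, so they correlate strongly, and the correlation nearly vanishes once temperature is regressed out.

```python
import random

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def residuals(y, x):
    """Residuals of y after a least-squares linear fit on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return [b - my - slope * (a - mx) for a, b in zip(x, y)]

rng = random.Random(0)
n = 5000
temperature = [rng.gauss(0.0, 1.0) for _ in range(n)]       # hidden common cause
ice_cream = [t + rng.gauss(0.0, 0.5) for t in temperature]  # driven only by temperature
crime = [t + rng.gauss(0.0, 0.5) for t in temperature]      # driven only by temperature

raw = pearson(ice_cream, crime)
adjusted = pearson(residuals(ice_cream, temperature),
                   residuals(crime, temperature))
print(f"raw correlation: {raw:.2f}, after removing temperature: {adjusted:.2f}")
```

Regressing out the confounder is only one simple way to expose the hidden variable; the deck itself uses counterfactual reasoning rather than regression.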
8 Proposed method: overview of the causal graph in SGG
• SGG model
  - Adopts an existing SGG model as-is
• Causal graph
  - Applies counterfactual thinking to SGG
  - Eliminates bad bias by computing the Total Direct Effect (TDE)
9 SGG pipeline (1/4): Node I, Link I → X
• Node I (Input Image & Backbone)
  - The input image I is fed to Faster R-CNN [Ren+, 2016] (FRCNN)
  - M: image features of I; B: set of bounding boxes (bboxes)
• Link I → X (Object Feature Extractor)
  - Obtains the RoIAlign feature (r_i), bbox (b_i), and label information (l_i) from FRCNN
  - Input: {(r_i, b_i, l_i)} ⟹ Output: {x_i}
10 SGG pipeline (2/4): Node X, Link X → Z, Node Z
• Node X (Object Feature)
  - Subscript e: forms the pair x_e = (x_i, x_j), where i ≠ j
• Link X → Z (Object Classification)
  - Input: {x_i} ⟹ Output: {z_i}
• Node Z (Object Class)
  - Contains z_e = (z_i, z_j)
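The pairing x_e = (x_i, x_j) with i ≠ j can be sketched as an enumeration of index pairs. This is a minimal illustration; treating the pairs as ordered (subject first, object second) and the function name `ordered_pairs` are assumptions made here, not details stated on the slide.

```python
from itertools import permutations

def ordered_pairs(num_objects):
    """All (subject, object) index pairs (i, j) with i != j."""
    return list(permutations(range(num_objects), 2))

pairs = ordered_pairs(3)
print(pairs)  # [(0, 1), (0, 2), (1, 0), (1, 2), (2, 0), (2, 1)]
```

Under this reading, N detected objects yield N(N-1) candidate pairs, each of which later receives one predicate prediction.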
11 SGG pipeline (3/4): Link X → Y, Link Z → Y, Link I → Y
• Link X → Y (Object Feature Input for SGG)
  - Input: {x_e} ⟹ Output: {x'_e}
• Link Z → Y (Object Class Input for SGG)
  - z'_e = W_z [z_i ⊗ z_j]
• Link I → Y (Visual Context Input for SGG)
  - v'_e = Convs(RoIAlign(M, b_i ∪ b_j))
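Two of the inputs to Node Y are built from simple operations: the union of two boxes, b_i ∪ b_j, and the product z_i ⊗ z_j of two class distributions. The sketch below is a toy illustration under stated assumptions: boxes are (x1, y1, x2, y2) tuples, ⊗ is read as a flattened outer product, and `W_z` is a hypothetical 2x4 projection matrix. The RoIAlign and convolution steps are omitted.

```python
def union_box(box_i, box_j):
    """Smallest (x1, y1, x2, y2) box that encloses both input boxes."""
    return (min(box_i[0], box_j[0]), min(box_i[1], box_j[1]),
            max(box_i[2], box_j[2]), max(box_i[3], box_j[3]))

def joint_label_vector(z_i, z_j):
    """Flattened outer product of two class distributions (assumed reading of the product)."""
    return [a * b for a in z_i for b in z_j]

def linear(weights, vector):
    """Matrix-vector product, with the matrix given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]

# Union region of a subject box and an object box.
region = union_box((10, 20, 50, 60), (30, 10, 80, 40))

# Joint label vector for two objects over a toy 2-class vocabulary.
z_i, z_j = [0.9, 0.1], [0.2, 0.8]
joint = joint_label_vector(z_i, z_j)

# Hypothetical projection W_z (values chosen arbitrarily for illustration).
W_z = [[1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 1.0, 1.0]]
z_prime = linear(W_z, joint)
print(region, joint, z_prime)
```

Because each input is a probability distribution, the joint vector also sums to 1, which makes it a distribution over label pairs.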
12 SGG pipeline (4/4): Node Y, loss function
• Node Y (Predicate Classification)
  - Takes the three features x'_e, z'_e, v'_e obtained via Link X → Y, Link Z → Y, and Link I → Y as input and combines them in one of two ways
  - SUM: y_e = W_x x'_e + W_v v'_e + z'_e
  - GATE: y_e = W_r x'_e · σ(W_x x'_e + W_v v'_e + z'_e)
• Loss function: cross-entropy loss
  - Used mainly for the object and predicate predictions
  - Also applied as an auxiliary loss to each of the three branches
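The SUM and GATE fusion rules can be written out directly. This is a minimal pure-Python sketch, not the authors' implementation: the weight matrices are hypothetical identity matrices, the feature values are made up, and the product in GATE is assumed to be element-wise.

```python
import math

def matvec(W, x):
    """Matrix-vector product, with the matrix given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def fuse_sum(x, v, z, W_x, W_v):
    """SUM: y_e = W_x x'_e + W_v v'_e + z'_e."""
    return [a + b + c for a, b, c in zip(matvec(W_x, x), matvec(W_v, v), z)]

def fuse_gate(x, v, z, W_r, W_x, W_v):
    """GATE: y_e = W_r x'_e * sigmoid(W_x x'_e + W_v v'_e + z'_e), element-wise."""
    gate = [sigmoid(s) for s in fuse_sum(x, v, z, W_x, W_v)]
    return [r * g for r, g in zip(matvec(W_r, x), gate)]

# Toy inputs with two predicate classes (all values are illustrative).
identity = [[1.0, 0.0], [0.0, 1.0]]
x_e, v_e, z_e = [1.0, 2.0], [0.5, -1.0], [0.0, 1.0]

y_sum = fuse_sum(x_e, v_e, z_e, identity, identity)
y_gate = fuse_gate(x_e, v_e, z_e, identity, identity, identity)
print(y_sum, y_gate)
```

In SUM all three branches contribute additively to the logits, while in GATE the context terms only scale the object-feature branch through a sigmoid gate.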
13 Causal graph (1/3): adding causal inference to the trained SGG model
• The SGG model is trained with the steps described so far
  - At this point its predictions are still likelihood-based and biased
• The idea of causal inference is applied to this SGG model
  - Counterfactual thinking via intervention
• Example: cut Link I → X and Link X → Z
  - Substitute a dummy x̄ into Node X
  - Node Z, however, keeps its previous value
• The final label is predicted from the difference before and after the counterfactual
  - Total Direct Effect (TDE)
14 Causal graph (2/3): intervention and counterfactual thinking
• Definition of intervention: do(·)
  - do(X = x̄): remove Link I → X (Object Feature Extractor) and substitute a dummy variable
  - As a result, Z = z̄
• Applying the counterfactual
  - do(X = x̄) is applied, but Node Z is left unchanged
  - This is the setting adopted here
Counterfactual thinking: imagining a process or outcome that differs from the actual facts (https://www.dhbr.net/articles/-/4705)
15 Causal graph (3/3): Total Direct Effect
TDE = Y_{x,z}(u) − Y_{x̄,z}(u)
Final predicted label for the predicate: y_e† = y_e(x, z) − y_e(x̄, z)
The model is "thinking twice"
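The subtraction y_e(x, z) − y_e(x̄, z) can be illustrated with a toy predictor. Everything in the sketch below is an assumption for illustration: the two-predicate vocabulary, the additive `predict` function, the `CONTEXT_BIAS` table, and the zero vector used as the dummy x̄. It is not the trained SGG model from the paper.

```python
PREDICATES = ["on", "walking on"]

# Hypothetical context prior: logits contributed by the object labels z alone.
CONTEXT_BIAS = {("person", "street"): [3.0, 0.0]}

def predict(x, z):
    """Toy predicate logits: visual evidence x plus a label-dependent context bias."""
    bias = CONTEXT_BIAS[z]
    return [xi + bi for xi, bi in zip(x, bias)]

def tde(x, z, x_dummy):
    """y_e(x, z) - y_e(x_dummy, z): factual logits minus counterfactual logits."""
    factual = predict(x, z)
    counterfactual = predict(x_dummy, z)  # visual input wiped out, labels z kept
    return [f - c for f, c in zip(factual, counterfactual)]

def argmax_label(logits):
    """Predicate name with the highest logit."""
    return PREDICATES[max(range(len(logits)), key=logits.__getitem__)]

x = [0.5, 2.0]              # visual evidence favours "walking on"
z = ("person", "street")    # object labels, kept fixed in both passes
x_dummy = [0.0, 0.0]        # assumed dummy feature standing in for the dummy input

biased = argmax_label(predict(x, z))
debiased = argmax_label(tde(x, z, x_dummy))
print(biased, debiased)
```

In this toy setting the plain prediction is dominated by the context prior for (person, street), and subtracting the counterfactual pass removes that prior so the visual evidence decides. The two forward passes correspond to the "thinking twice" remark.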
16 Results (1/3): confirming that applying TDE removes the bias
☹ Existing methods: recall for predicates such as "walking on" is close to 0, while recall for "on" is high
☺ Applying TDE removes the bias
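Per-predicate recall makes this kind of bias visible. The sketch below uses made-up labels and a simplified definition (exact-match recall per ground-truth predicate); the Recall@K protocol used in SGG evaluation ranks whole triplets and is more involved.

```python
from collections import defaultdict

def per_predicate_recall(gold, predicted):
    """Fraction of ground-truth instances of each predicate that were predicted correctly."""
    hits, totals = defaultdict(int), defaultdict(int)
    for g, p in zip(gold, predicted):
        totals[g] += 1
        hits[g] += int(g == p)
    return {label: hits[label] / totals[label] for label in totals}

# Toy data: a biased model that always answers the frequent predicate "on".
gold = ["on"] * 8 + ["walking on"] * 2
predicted = ["on"] * 10

recalls = per_predicate_recall(gold, predicted)
accuracy = sum(g == p for g, p in zip(gold, predicted)) / len(gold)
mean_recall = sum(recalls.values()) / len(recalls)
print(recalls, accuracy, mean_recall)
```

Overall accuracy looks good because the frequent predicate dominates, while the per-predicate view shows that the rare predicate is never recovered.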
17 Results (2/3): applying TDE to existing models improves performance
Simply applying TDE, without changing the structure of the SGG model, improved performance on classification-based prediction and on detection
18 Results (3/3): predicate bias is confirmed to be removed or reduced
19 Summary
• Proposes a method for Scene Graph Generation (SGG) that uses a causal inference framework to reduce the bias introduced by the dataset
  - Counterfactual Thinking
  - Total Direct Effect (TDE)
• Introducing TDE improves scores over the baselines and reduces bias
✓ First work to consider the bias that the dataset introduces in SGG
✓ Can be applied as-is to existing SGG frameworks