Self-introduction
Career
• 2017.10 ‒ 2021.09: Research fellow at RIKEN AIP
• 2018.10 ‒ present: Assistant professor at Tohoku University
(Recent) research interests
• Vision and Language
• NAS + image classification [GECCOʼ17 (Best paper)]
• NAS + image restoration [ICMLʼ18, CVPRʼ19]
• V&L [ECCVʼ20, IJCAIʼ21, ECCVʼ22]
[Figure: image-captioning examples comparing ground-truth captions (GT-1, GT-2) with captions generated by M2 and GRIT]
Evolutionary computation + cell-based search: AmoebaNet [Real+, AAAIʼ19]
• One of the most accurate evolutionary-computation-based methods (but computationally expensive)
• Same search space as NASNet [Zoph+, CVPRʼ18]
• New architectures are generated by mutation
  • Hidden state mutation: select one operation in a cell and randomly change its input source
  • Op mutation: select one operation in a cell and randomly replace the operation itself
• Search loop (steps a‒e are repeated; a minimal code sketch follows this list):
  a. Randomly sample S individuals from the population
  b. Select the best individual among the S samples
  c. Apply a mutation to the selected individual
  d. Remove the oldest individual from the population
  e. Add the new individual to the population
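A minimal Python sketch of this regularized (aging) evolution loop. The helpers random_architecture, mutate (standing in for the hidden-state and op mutations), and train_and_evaluate are hypothetical placeholders for the cell encoding and the expensive training step, not the paper's actual implementation:

```python
import random
from collections import deque

def regularized_evolution(population_size, sample_size, num_cycles,
                          random_architecture, mutate, train_and_evaluate):
    population = deque()   # oldest individual sits at the left end
    history = []

    # Initialize the population with random architectures.
    for _ in range(population_size):
        arch = random_architecture()
        acc = train_and_evaluate(arch)
        population.append((arch, acc))
        history.append((arch, acc))

    for _ in range(num_cycles):
        # a. randomly sample S individuals, b. pick the best of them
        samples = random.sample(list(population), sample_size)
        parent = max(samples, key=lambda ind: ind[1])

        # c. apply a mutation (hidden-state or op mutation) to the parent
        child = mutate(parent[0])
        child_acc = train_and_evaluate(child)

        # d. remove the oldest individual, e. add the new one
        population.popleft()
        population.append((child, child_acc))
        history.append((child, child_acc))

    # return the best architecture found during the search
    return max(history, key=lambda ind: ind[1])
```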
Synaptic diversity in multi-head attention
• As layers are stacked, the diversity of the multi-head attention mechanism is gradually lost [1]
• Every head ends up extracting similar features ≈ the rank of the feature map (matrix) approaches 1, which leads to performance degradation
• Hence, the rank of the weight matrices inside the multi-head attention block can likely serve as a performance indicator (a sketch of such a score follows below)
[1] Dong+, Attention is not all you need: pure attention loses rank doubly exponentially with depth, ICMLʼ21
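One hedged way to turn this idea into a score is to sum a soft rank surrogate (here the nuclear norm) of the attention weight matrices over all blocks. The module/attribute names (blocks, attn, qkv) assume a timm-style ViT and are illustrative only, not the metric used in any specific paper:

```python
import torch

def synaptic_diversity_score(model):
    """Sum of nuclear norms of attention projection weights over all blocks."""
    score = 0.0
    for block in model.blocks:
        w = block.attn.qkv.weight   # weight matrix of the attention projection
        # nuclear norm (sum of singular values) as a soft surrogate for rank
        score += torch.linalg.matrix_norm(w, ord="nuc").item()
    return score
```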
Adversarial training + NAS [Guo+, CVPRʼ20]
• Uses one-shot NAS to generate a large number of CNNs and investigates how robust those CNNs are against adversarial attacks
• Pipeline: sample subnets from the supernet, finetune each sampled network with adversarial training, and evaluate it on eval samples (a sketch of this loop follows below)
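A minimal sketch of that sampling/finetuning/evaluation loop, assuming a supernet with a hypothetical sample_subnet() method. Single-step FGSM stands in here for the adversarial training and robustness evaluation; the paper's actual setup differs:

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, eps=8 / 255):
    """Generate single-step FGSM adversarial examples."""
    x = x.clone().requires_grad_(True)
    F.cross_entropy(model(x), y).backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

def evaluate_robustness(supernet, train_loader, eval_loader, num_subnets=10, epochs=1):
    results = []
    for _ in range(num_subnets):
        subnet = supernet.sample_subnet()   # hypothetical one-shot NAS API
        opt = torch.optim.SGD(subnet.parameters(), lr=0.01, momentum=0.9)

        # Finetune the sampled subnet with adversarial training.
        for _ in range(epochs):
            for x, y in train_loader:
                x_adv = fgsm_attack(subnet, x, y)
                opt.zero_grad()
                F.cross_entropy(subnet(x_adv), y).backward()
                opt.step()

        # Evaluate robust accuracy on adversarial eval samples.
        correct = total = 0
        for x, y in eval_loader:
            x_adv = fgsm_attack(subnet, x, y)
            correct += (subnet(x_adv).argmax(1) == y).sum().item()
            total += y.numel()
        results.append(correct / total)
    return results
```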
Primer: Searching for Efficient Transformers for Language Modeling [So+, NeurIPSʼ21]
• Searches the Transformer architecture starting from primitive operations
  • The architecture is searched over basic operations only, such as the four arithmetic operators
  • However, the search space is far too large, so the search starts from the vanilla Transformer
• Searched with evolutionary computation (Regularized Evolution)
• Search cost: 1145.8 TPUv2 days
• Strong results on various language tasks (e.g., Primer improves the original T5 architecture on C4 auto-regressive language modeling, reducing the training cost by 4X)
• Example of a discovered architecture (a sketch of one well-known discovered modification follows below)
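As one concrete example, a well-known modification found by Primer is replacing ReLU in the Transformer feed-forward block with squared ReLU. The sketch below shows this change with illustrative layer sizes, not the configuration from the paper:

```python
import torch
import torch.nn as nn

class SquaredReLU(nn.Module):
    """Squared ReLU activation: relu(x) ** 2."""
    def forward(self, x):
        return torch.relu(x) ** 2

class PrimerFeedForward(nn.Module):
    def __init__(self, d_model=512, d_ff=2048):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_model, d_ff),
            SquaredReLU(),          # vanilla Transformer uses plain ReLU here
            nn.Linear(d_ff, d_model),
        )

    def forward(self, x):
        return self.net(x)
```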
List of NAS papers accepted at CVPR 2022
• Arch-Graph: Acyclic Architecture Relation Predictor for Task-Transferable Neural Architecture Search
• Distribution Consistent Neural Architecture Search
• ISNAS-DIP: Image-Specific Neural Architecture Search for Deep Image Prior
• Performance-Aware Mutual Knowledge Distillation for Improving Neural Architecture Search
• BaLeNAS: Differentiable Architecture Search via the Bayesian Learning Rule
• HyperSegNAS: Bridging One-Shot Neural Architecture Search with 3D Medical Image Segmentation using HyperNet
• Global Convergence of MAML and Theory-Inspired Neural Architecture Search for Few-Shot Learning
• Neural Architecture Search with Representation Mutual Information
• Training-free Transformer Architecture Search
• Demystifying the Neural Tangent Kernel from a Practical Perspective: Can it be trusted for Neural Architecture Search without training?
• β-DARTS: Beta-Decay Regularization for Differentiable Architecture Search
• Shapley-NAS: Discovering Operation Contribution for Neural Architecture Search