A Closer Look at Few-shot Classification

DEEP LEARNING JP [DL Papers]
http://deeplearning.jp/
A Closer Look at Few-shot Classification (ICLR 2019)
Kazuki Fujikawa, DeNA

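Summary
• Bibliographic information
  – A Closer Look at Few-shot Classification
    • ICLR 2019 (to appear)
    • Wei-Yu Chen, Yen-Cheng Liu, Zsolt Kira, Yu-Chiang Frank Wang, Jia-Bin Huang
• Overview
  – Points out problems with the standard evaluation protocol for few-shot classification
    • Under the experimental setting commonly used in few-shot learning, the Baseline++ defined in this paper is comparable to the SOTA
    • With a larger network, Baseline is comparable to the SOTA on both benchmark datasets, CUB and mini-ImageNet
    • When the data has a domain shift, Baseline performs better
  – The experiment code is released so that experiments can be run consistently under properly comparable settings
    • The authors hope this advances the still-unsolved problem of few-shot learning under domain shift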

Outline
• Background
• Compared methods
• Experiments and results

Background
• Research on few-shot classification has been increasing in recent years
  – One direction: when the target task has little labeled data, extract knowledge from large-scale data that can be transferred to other tasks
• However, previous work has the following issues
  – Baseline performance is unfairly underestimated, so the evaluation is not fair
    • For example, no data augmentation is applied
  – Domain shift in the data is not considered
    • Unseen classes are also sampled from the same dataset for evaluation

Meta Learning Models Taxonomy (Oriol Vinyals, NIPS 17)
• Model Based: Santoro et al. '16; Duan et al. '17; Wang et al. '17; Munkhdalai & Yu '17; Mishra et al. '17
• Metric Based: Koch '15; Vinyals et al. '16; Snell et al. '17; Shyam et al. '17; Sung et al. '17
• Optimization Based: Schmidhuber '87, '92; Bengio et al. '90, '92; Hochreiter et al. '01; Li & Malik '16; Andrychowicz et al. '16; Ravi & Larochelle '17; Finn et al. '17
Adapted from Finn '17
Figure source: Vinyals, Oriol. NIPS 2017 Meta-Learning symposium.

Baseline
• Problem setting
  – Training stage
    • Train a feature extractor $f_\theta$ and a classifier $C(\cdot \mid \mathbf{W}_b)$ on base-class data
    • Base class: data from classes other than those to be classified few-shot (assumes abundant labeled data)
  – Fine-tuning stage
    • Keep $f_\theta$ fixed and train a classifier $C(\cdot \mid \mathbf{W}_n)$ on novel-class data
    • Novel class: data from the classes to be classified few-shot (assumes only a few labeled examples)

Figure 1: Baseline and Baseline++ few-shot classification methods. (From Chen et al., published as a conference paper at ICLR 2019.)
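The fine-tuning stage can be illustrated with a minimal sketch, assuming the frozen feature extractor has already turned each novel-class image into a feature vector. The function names (`fine_tune_classifier`, `predict`), the toy features, the learning rate, and the use of full-batch gradient descent are illustrative assumptions, not the authors' implementation. Only the new classifier weights are updated; the features are never touched.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fine_tune_classifier(features, labels, num_classes, lr=0.1, iterations=100):
    """Fit a new linear classifier on frozen features with plain gradient descent."""
    dim = len(features[0])
    # One weight vector per novel class, initialised to zero (an assumption).
    weights = [[0.0] * dim for _ in range(num_classes)]
    for _ in range(iterations):
        grads = [[0.0] * dim for _ in range(num_classes)]
        for x, y in zip(features, labels):
            probs = softmax([sum(w * v for w, v in zip(wc, x)) for wc in weights])
            for c in range(num_classes):
                # Gradient of cross-entropy w.r.t. the class-c score is p_c - 1[y == c].
                err = probs[c] - (1.0 if c == y else 0.0)
                for d in range(dim):
                    grads[c][d] += err * x[d] / len(features)
        for c in range(num_classes):
            for d in range(dim):
                weights[c][d] -= lr * grads[c][d]
    return weights

def predict(weights, x):
    """Return the index of the class with the highest linear score."""
    scores = [sum(w * v for w, v in zip(wc, x)) for wc in weights]
    return scores.index(max(scores))

# Toy stand-ins for features produced by the fixed extractor (hypothetical values).
support_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
support_labels = [0, 0, 1, 1]
w_n = fine_tune_classifier(support_features, support_labels, num_classes=2)
```

With these toy features, `predict(w_n, [0.8, 0.2])` picks class 0 and `predict(w_n, [0.2, 0.8])` picks class 1.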

Baseline
• Models
  – Baseline
    • Minimizes the cross-entropy loss based on the inner product between $f_\theta(\mathbf{x}_i)$ and $\mathbf{W} \in \mathbb{R}^{d \times c}$
  – Baseline++
    • Minimizes the cross-entropy loss based on the cosine distance between $f_\theta(\mathbf{x}_i)$ and each column of $\mathbf{W}_b = [\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_c] \in \mathbb{R}^{d \times c}$
    • Aims to reduce intra-class variation compared with Baseline
    • Also introduced in [Hu+, CVPR2015] and [Gidaris & Komodakis, CVPR2018]
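The difference between the two classifier heads can be sketched in a few lines. This is a minimal illustration under stated assumptions: the weight matrix is stored as one vector per class, the score used for Baseline++ is the cosine similarity, and the function names and toy numbers are hypothetical rather than taken from the released code.

```python
import math

def linear_scores(feature, class_weights):
    """Baseline head: inner product between the feature and each class weight vector."""
    return [sum(f * w for f, w in zip(feature, wj)) for wj in class_weights]

def cosine_scores(feature, class_weights):
    """Baseline++ head: cosine similarity between the feature and each class weight vector."""
    f_norm = math.sqrt(sum(f * f for f in feature))
    scores = []
    for wj in class_weights:
        w_norm = math.sqrt(sum(w * w for w in wj))
        dot = sum(f * w for f, w in zip(feature, wj))
        scores.append(dot / (f_norm * w_norm))
    return scores

def softmax(scores):
    """Numerically stable softmax; both heads feed their scores through it."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    """Cross-entropy loss for a single example with an integer class label."""
    return -math.log(probs[label])

feature = [3.0, 4.0]                       # toy feature vector (hypothetical)
class_weights = [[1.0, 0.0], [0.0, 2.0]]   # one weight vector per class (hypothetical)

lin = linear_scores(feature, class_weights)
cos = cosine_scores(feature, class_weights)
loss_baseline = cross_entropy(softmax(lin), label=1)
loss_baseline_pp = cross_entropy(softmax(cos), label=1)
```

Because the cosine score normalizes both vectors, rescaling a class weight vector changes the linear score but leaves the cosine score unchanged, so only direction matters for Baseline++.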

Few-shot classification
• Problem setting (N-way K-shot)
  – Meta-testing stage
    • Using the support set (K labeled examples for each of N classes) as a reference, classify the query set (unlabeled data) into one of the N classes
  – Meta-training stage
    • Sample support and query sets from the base classes to match the meta-testing condition
    • Train the feature extractor $f_\theta$ so that the sampled query set is classified correctly with reference to the support set

Figure 2: Meta-learning few-shot classification algorithms. The meta-learning classifier $M(\cdot \mid S)$ is conditioned on the support set $S$. (From Chen et al., published as a conference paper at ICLR 2019.)
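Episode sampling for the meta-training stage can be sketched as follows. This is a minimal illustration, assuming the base-class data is held as a dictionary from class name to a list of examples; `sample_episode`, the toy data, and the number of query examples per class are hypothetical choices.

```python
import random

def sample_episode(data_by_class, n_way, k_shot, n_query, rng):
    """Sample one N-way K-shot episode.

    Returns (support, query), each a list of (example, episode_label) pairs,
    where episode_label runs from 0 to n_way - 1 within this episode only.
    """
    classes = rng.sample(sorted(data_by_class), n_way)
    support, query = [], []
    for episode_label, cls in enumerate(classes):
        # Draw disjoint support and query examples from the same class.
        examples = rng.sample(data_by_class[cls], k_shot + n_query)
        support += [(x, episode_label) for x in examples[:k_shot]]
        query += [(x, episode_label) for x in examples[k_shot:]]
    return support, query

# Toy base-class pool: 10 classes with 20 dummy examples each (hypothetical data).
data = {f"class_{c}": [f"img_{c}_{i}" for i in range(20)] for c in range(10)}
support, query = sample_episode(data, n_way=5, k_shot=1, n_query=3, rng=random.Random(0))
```

At meta-testing time the same routine would be applied to the novel classes, with the query labels used only for scoring.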

Few-shot classification
• Models
  – MatchingNet [Vinyals+, NIPS2016]
    • Extracts features with $f_\theta$ from both the support set (N classes x K samples) and the query set
    • Minimizes the cross-entropy loss based on cosine distance
  – ProtoNet [Snell+, NIPS2017]
    • Extracts features with $f_\theta$ from both the support set (N classes x K samples) and the query set
    • Averages the support-set feature vectors per class to form N prototypes (vectors)
    • Minimizes the cross-entropy loss based on the Euclidean distance between the query vectors and the prototypes
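The ProtoNet steps above (average per class, then compare by distance) can be sketched as follows. The sketch assumes features have already been extracted, uses the squared Euclidean distance as the score (an assumption about the exact distance form), and the function names and toy vectors are hypothetical.

```python
import math

def prototypes(support):
    """Average the support features of each class into one prototype per class."""
    sums, counts = {}, {}
    for feat, label in support:
        acc = sums.setdefault(label, [0.0] * len(feat))
        for d, v in enumerate(feat):
            acc[d] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def class_probabilities(query_feat, protos):
    """Softmax over negative squared Euclidean distances to each prototype."""
    labels = sorted(protos)
    logits = [-sum((q - p) ** 2 for q, p in zip(query_feat, protos[l])) for l in labels]
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return {l: e / total for l, e in zip(labels, exps)}

# 2-way 2-shot toy support set of (feature, label) pairs (hypothetical values).
support = [([0.0, 0.0], 0), ([2.0, 0.0], 0), ([0.0, 4.0], 1), ([2.0, 4.0], 1)]
protos = prototypes(support)
probs = class_probabilities([1.0, 1.0], protos)

# Training would minimize this cross-entropy over many sampled episodes.
loss = -math.log(probs[0])
```

Here the query `[1.0, 1.0]` is closer to the class-0 prototype, so class 0 receives most of the probability mass.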

Few-shot classification
• Models
  – RelationNet [Sung+, CVPR2018]
    • Largely the same framework as ProtoNet
    • Minimizes the cross-entropy loss based on scores from a Relation Module parameterized by a neural network
  – MAML [Finn+, ICML2017]
    • Learns initial model parameters such that, after fine-tuning on the support set (a small number of labeled examples), the prediction error on the query set is small
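The MAML idea described above is commonly written as a two-level objective. The form below is a sketch assuming a single inner gradient step; the symbols $\alpha$ (inner step size), $\mathcal{L}_S$ (support-set loss), $\mathcal{L}_Q$ (query-set loss), and $\theta'$ (adapted parameters) are notation introduced here for illustration and do not appear on the slide.

```latex
\theta' = \theta - \alpha \, \nabla_\theta \mathcal{L}_S(\theta),
\qquad
\min_{\theta} \; \mathcal{L}_Q(\theta')
```

The outer minimization is over the initial parameters $\theta$, so its gradient flows through the inner update that produced $\theta'$.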

Outline
• Background
• Related work
• Compared methods
• Experiments and results

Datasets
• mini-ImageNet
  – A dataset built from ImageNet with reduced resolution and fewer classes to cut computation
    • Resolution: 84x84
    • Size (60,000 images in total)
      – train: 64 classes x 600 images
      – valid: 16 classes x 600 images
      – test: 20 classes x 600 images
• CUB
  – An image dataset of birds with fine-grained labels
    • Size (11,788 images in total)
      – train: 100 classes
      – valid: 50 classes
      – test: 50 classes

Russakovsky, Olga, et al. "ImageNet large scale visual recognition challenge." International Journal of Computer Vision 115.3 (2015): 211-252.
Wah C., Branson S., Welinder P., Perona P., Belongie S. "The Caltech-UCSD Birds-200-2011 Dataset." Computation & Neural Systems Technical Report, CNS-TR-2011-001.
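As a quick sanity check, the split sizes listed for the two datasets are consistent with the stated totals. The dictionary names below are illustrative; the numbers are the ones given on the slide.

```python
# Classes per split, as listed on the slide.
mini_imagenet_split = {"train": 64, "valid": 16, "test": 20}
cub_split = {"train": 100, "valid": 50, "test": 50}
images_per_class = 600  # mini-ImageNet only

mini_imagenet_classes = sum(mini_imagenet_split.values())
mini_imagenet_images = mini_imagenet_classes * images_per_class
cub_classes = sum(cub_split.values())

print(mini_imagenet_classes, mini_imagenet_images, cub_classes)  # 100 60000 200
```

The class splits are disjoint, which is what makes the test classes "novel" relative to the training classes.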

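Experiment overview
• Tasks
  – Experiment 1: Re-evaluate each method uniformly under the standard problem setting most often reported in papers
  – Experiment 2: Evaluate each method with deeper feature-extractor networks
  – Experiment 3: Evaluate each method under domain shift (mini-ImageNet → CUB)
• Hyperparameters
  – Method-specific hyperparameters are set as follows
    • baseline, baseline++
      – training stage: batch size 16, 400 epochs
      – testing stage: batch size 4, 100 iterations
    • meta-learning
      – 1-shot: 60,000 episodes, 5-shot: 40,000 episodes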

Experiment 1: Re-evaluation under the standard setting
• Each compared method is re-run with the experimental setting aligned to the standard one
  – Dataset: mini-ImageNet; feature extractor $f_\theta$: 4-layer CNN (Conv-4)
  – Per-method settings
    • Baseline vs. Baseline*: with vs. without data augmentation
    • ProtoNet vs. ProtoNet#: meta-trained with 5-way vs. 30-way (1-shot) / 20-way (5-shot)
• Observations
  – Baseline can be improved by applying data augmentation, so the reported numbers underestimate it
  – Once Baseline++ is included in the comparison, it is competitive with the SOTA methods

Table 1: Validating our re-implementation. We validate our few-shot classification implementation on the mini-ImageNet dataset using a Conv-4 backbone. We report the mean of 600 randomly generated test episodes as well as the 95% confidence intervals. Our reproduced results to all few-shot methods do not fall behind by more than 2% to the reported results in the literature. We attribute the slight discrepancy to different random seeds and minor implementation differences in each method. "Baseline*" denotes the results without applying data augmentation during training. ProtoNet# indicates performing 30-way classification in 1-shot and 20-way in 5-shot during the meta-training stage. (From Chen et al., published as a conference paper at ICLR 2019.)

Method                               1-shot Reported  1-shot Ours     5-shot Reported  5-shot Ours
Baseline                             -                42.11 ± 0.71    -                62.53 ± 0.69
Baseline*³                           41.08 ± 0.70     36.35 ± 0.64    51.04 ± 0.65     54.50 ± 0.66
MatchingNet³ (Vinyals et al., 2016)  43.56 ± 0.84     48.14 ± 0.78    55.31 ± 0.73     63.48 ± 0.66
ProtoNet                             -                44.42 ± 0.84    -                64.24 ± 0.72
ProtoNet# (Snell et al., 2017)       49.42 ± 0.78     47.74 ± 0.84    68.20 ± 0.66     66.68 ± 0.68
MAML (Finn et al., 2017)             48.07 ± 1.75     46.47 ± 0.82    63.15 ± 0.91     62.71 ± 0.71
RelationNet (Sung et al., 2018)      50.44 ± 0.82     49.31 ± 0.85    65.32 ± 0.70     66.60 ± 0.69

Table 2: Few-shot classification results for both the mini-ImageNet and CUB datasets. The Baseline++ consistently improves the Baseline model by a large margin and is competitive with the state-of-the-art meta-learning methods. All experiments are from 5-way classification with a Conv-4 backbone and data augmentation.

Method                               CUB 1-shot      CUB 5-shot      mini-ImageNet 1-shot  mini-ImageNet 5-shot
Baseline                             47.12 ± 0.74    64.16 ± 0.71    42.11 ± 0.71          62.53 ± 0.69
Baseline++                           60.53 ± 0.83    79.34 ± 0.61    48.24 ± 0.75          66.43 ± 0.63
MatchingNet (Vinyals et al., 2016)   61.16 ± 0.89    72.86 ± 0.70    48.14 ± 0.78          63.48 ± 0.66
ProtoNet (Snell et al., 2017)        51.31 ± 0.91    70.77 ± 0.69    44.42 ± 0.84          64.24 ± 0.72
MAML (Finn et al., 2017)             55.92 ± 0.95    72.09 ± 0.76    46.47 ± 0.82          62.71 ± 0.71
RelationNet (Sung et al., 2018)      62.45 ± 0.98    76.11 ± 0.69    49.31 ± 0.85          66.60 ± 0.69
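The tables report a mean over 600 test episodes together with a 95% confidence interval. One common way to compute such an interval is the normal approximation sketched below; the 1.96 multiplier, the use of the sample standard deviation, and the function name `mean_and_ci95` are assumptions for illustration, since the slide does not state the exact formula.

```python
import math
import statistics

def mean_and_ci95(accuracies):
    """Mean accuracy and half-width of a normal-approximation 95% confidence interval."""
    mean = statistics.mean(accuracies)
    std = statistics.stdev(accuracies)  # sample standard deviation
    half_width = 1.96 * std / math.sqrt(len(accuracies))
    return mean, half_width

# Hypothetical per-episode accuracies (in %) for 600 test episodes.
episode_accuracies = [40.0, 60.0] * 300
mean, half_width = mean_and_ci95(episode_accuracies)
print(f"{mean:.2f} ± {half_width:.2f}")
```

Because the half-width shrinks with the square root of the number of episodes, results computed from different episode counts are not directly comparable by interval width alone.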

Experiment 2: Deeper feature-extractor networks
• Compares the performance of each method as the feature-extractor network is made deeper
  – Datasets: CUB, mini-ImageNet
  – Feature extractor $f_\theta$: 4-layer CNN, 6-layer CNN, ResNet-10, ResNet-18, ResNet-34
• Observations
  – On CUB, the gaps between methods shrink as the network gets deeper
  – On mini-ImageNet, some methods fall behind Baseline once the network is deeper

Figure 3: Few-shot classification accuracy vs. backbone depth. In the CUB dataset, gaps among different methods diminish as the backbone gets deeper. In mini-ImageNet 5-shot, some meta-learning methods are even beaten by Baseline with a deeper backbone. (Please refer to Figure A3 and Table A5 for larger figure and detailed statistics.)
[Four panels: CUB 1-shot, CUB 5-shot, mini-ImageNet 1-shot, mini-ImageNet 5-shot. x-axis: backbone (Conv-4, Conv-6, ResNet-10, ResNet-18, ResNet-34); y-axis: accuracy (%); lines: Baseline, Baseline++, MatchingNet, ProtoNet, MAML, RelationNet.]

Experiment 3: Experiments with domain shift
• Compares the performance of each method when a domain shift is present
  – Dataset: mini-ImageNet (meta-training) → CUB (meta-testing)
  – Feature extractor $f_\theta$: ResNet-18
• Observations
  – Baseline outperforms all of the meta-learning methods
  – As the difference between domains grows, the meta-learning methods become relatively less effective

Table 3: 5-shot accuracy under the cross-domain scenario with a ResNet-18 backbone.

Method        mini-ImageNet → CUB
Baseline      65.57 ± 0.70
Baseline++    62.04 ± 0.76
MatchingNet   53.07 ± 0.74
ProtoNet      62.02 ± 0.70
MAML          51.34 ± 0.72
RelationNet   57.71 ± 0.73

Figure 4: 5-shot accuracy in different scenarios with a ResNet-18 backbone.
[Scenarios ordered by domain difference from small to large: CUB, miniImageNet, miniImageNet -> CUB. y-axis: accuracy (%); series: Baseline, Baseline++, MatchingNet, ProtoNet, MAML, RelationNet.]
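The size of Baseline's lead in the cross-domain setting can be read off Table 3 with a few lines of arithmetic. The sketch below uses only the mean accuracies from that table and ignores the confidence intervals; the variable names are illustrative.

```python
# Mean 5-shot accuracies (%) for mini-ImageNet -> CUB, taken from Table 3.
table3 = {
    "Baseline": 65.57,
    "Baseline++": 62.04,
    "MatchingNet": 53.07,
    "ProtoNet": 62.02,
    "MAML": 51.34,
    "RelationNet": 57.71,
}

baseline = table3["Baseline"]
# Gap in percentage points between Baseline and every other method.
gaps = {m: round(baseline - acc, 2) for m, acc in table3.items() if m != "Baseline"}
closest = min(gaps, key=gaps.get)

for method, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{method}: {gap:+.2f} points behind Baseline")
```

The closest competitor by this measure is Baseline++, itself a non-meta-learning method, followed narrowly by ProtoNet.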

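Conclusion
• Points out problems with the standard evaluation protocol for few-shot classification
  – Under the experimental setting commonly used in few-shot learning, Baseline++ is comparable to the SOTA
  – With a larger network, Baseline is comparable to the SOTA on both benchmark datasets, CUB and mini-ImageNet
  – When the data has a domain shift, Baseline performs better
• The experiment code is released so that experiments can be run consistently under properly comparable settings
  – The authors hope this advances the still-unsolved problem of few-shot learning under domain shift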

References
• Wei-Yu Chen, Yen-Cheng Liu, Zsolt Kira, Yu-Chiang Frank Wang, and Jia-Bin Huang. A Closer Look at Few-shot Classification. In ICLR 2019.
• Spyros Gidaris and Nikos Komodakis. Dynamic few-shot visual learning without forgetting. In CVPR 2018.
• Junlin Hu, Jiwen Lu, and Yap-Peng Tan. Deep transfer metric learning. In CVPR 2015.
• Oriol Vinyals, Charles Blundell, Tim Lillicrap, Daan Wierstra, et al. Matching networks for one shot learning. In NIPS 2016.
• Jake Snell, Kevin Swersky, and Richard Zemel. Prototypical networks for few-shot learning. In NIPS 2017.
• Flood Sung, Yongxin Yang, Li Zhang, Tao Xiang, Philip HS Torr, and Timothy M Hospedales. Learning to compare: Relation network for few-shot learning. In CVPR 2018.
• Chelsea Finn, Pieter Abbeel, and Sergey Levine. Model-agnostic meta-learning for fast adaptation of deep networks. In ICML 2017.
