Humpback Whale Identification Challenge (a.k.a. the whale competition): a retrospective

Whale competition retrospective, or a one-week challenge @yu4u

All talk, no action ①

All talk, no action ②

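First submission on the team merger deadline (2/21)
• My first submission, with a model of my own design, died on arrival
• I had already been talking with acquaintances from my previous job about doing the competition together, so they merged teams with me even though my score was minuscule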

• I heard that recent metric learning can apparently be trained as ordinary classification
• I quickly tried an ArcFace-based method and got decent accuracy!
• Read this ↓
Modern deep metric learning methods: SphereFace, CosFace, ArcFace https://qiita.com/yu4u/items/078054dfb5592cbb80cc
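To make "metric learning trained as classification" concrete, here is a minimal NumPy sketch of an ArcFace-style head. It is an illustration under assumptions, not the code used in the competition: the function names and the default values of `s` and `m` are made up here, and the usual safeguard for the case where theta + m exceeds pi is left out.

```python
import numpy as np

def arcface_logits(features, weights, labels, s=30.0, m=0.5):
    """ArcFace-style logits: s*cos(theta + m) for the true class, s*cos(theta) otherwise.

    features: (batch, dim), weights: (classes, dim), labels: (batch,) int class indices.
    Simplified sketch: no special handling when theta + m > pi.
    """
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    cos = np.clip(f @ w.T, -1.0, 1.0)            # cosine between feature and class center
    theta = np.arccos(cos)
    rows = np.arange(len(labels))
    logits = cos.copy()
    logits[rows, labels] = np.cos(theta[rows, labels] + m)  # angular margin on the true class
    return s * logits

def cross_entropy(logits, labels):
    """Plain softmax cross entropy, so the head trains like a classifier."""
    z = logits - logits.max(axis=1, keepdims=True)
    log_prob = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(labels)), labels].mean()
```

At inference time the classification layer is dropped and the normalized feature vectors are compared directly, for example by cosine similarity.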

Main topic

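Summary
• The top teams each solved the problem with quite different approaches
• Each approach is simple in itself, and even a single model can reach high accuracy
• Ensembling different models is also effective
• All of which arguably made it a pretty good competition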

Whale competition overview
• train/test: cute tail images of 🐳
– aligned and cropped to some extent
• Each train image has a whale_id
• whale_id includes "new_whale", meaning the whale has not been identified
• The task is to predict the correct whale_id for each test image
• The metric is MAP@5; "new_whale" is the correct answer for just under 30% of the 🐳
• train: 25361 images (unique id: 5004)
– new_whale 9664
– w_23a388d 73
– w_9b5109b 65
– w_9c506f6 62
– …
• test: 7960 images
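As a reference for the metric, a minimal sketch of MAP@5 follows. Because each test image has exactly one correct whale_id, the average precision of one image reduces to 1/rank of the correct ID within the five guesses, or 0 if it is missing. The function name is made up for this sketch.

```python
def map_at_5(y_true, y_pred):
    """Mean average precision at 5 for single-label ground truth.

    y_true: list of correct ids; y_pred: list of ranked id lists (best guess first).
    """
    total = 0.0
    for truth, ranked in zip(y_true, y_pred):
        for rank, guess in enumerate(ranked[:5], start=1):
            if guess == truth:
                total += 1.0 / rank   # credit decays with the position of the hit
                break
    return total / len(y_true)
```

For example, one image answered correctly at rank 1 and another at rank 2 gives (1 + 0.5) / 2 = 0.75, which is why the position where new_whale is inserted matters so much.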

Traps
• Some images labeled new_whale show 🐳 that also have a whale_id
• Some 🐳 are the same individual but carry different whale_ids (lots of them)

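Whale competition overview
• Task
– As a problem, it is the same as face recognition
– It is actually the same as the Google Landmark Recognition Challenge, too
• Possible approaches
– Metric learning (the de facto standard in face recognition; used in the landmark challenge)
– Solve it as classification (new_whale is the difficulty)
– Solve it with local feature matching (used in the landmark challenge)
So what actually happened?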

God-tier kernel ①
• 🐳 tail detector
• In face recognition, face detection always comes first as preprocessing
• It helps no matter which approach is used
https://www.kaggle.com/martinpiotte/bounding-box-model
There is a mask kernel too: https://www.kaggle.com/c/humpback-whale-identification/discussion/78453

God-tier kernel ②
https://www.kaggle.com/seesee/siamese-pretrained-0-822
• The kernel that got everyone saying "SiameseNet, SiameseNet"
• A SiameseNet normally does metric learning with a contrastive loss
– training is hard and unstable
• The SiameseNet in this kernel takes two images as input and outputs whether they show the same 🐳
– it is classification, so training is easy
– accuracy is probably higher this way, too
[Figure: conventional SiameseNet + contrastive loss vs. the kernel's SiameseNet]
• Conventional SiameseNet: 🐳 🐳 -> weight-shared CNNs -> feature vectors -> d2 -> contrastive loss
• Kernel's SiameseNet: 🐳 🐳 -> weight-shared CNNs (feature extraction network) -> x1, x2 -> x1 + x2, x1 * x2, |x1 - x2|, |x1 - x2|^2 -> CNN (classification network) -> 0~1 -> binary crossentropy
– all of these are pairwise operations, designed so that f(x1, x2) = f(x2, x1)
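The symmetry f(x1, x2) = f(x2, x1) follows from the choice of pairwise operations: each of them is unchanged when the two inputs are swapped. A small NumPy sketch (the function name and the stacking layout are assumptions for illustration, not the kernel's code):

```python
import numpy as np

def symmetric_pair_features(x1, x2):
    """Combine two feature vectors with order-invariant element-wise operations.

    Returns an array of shape (dim, 4): sum, product, |difference|, squared difference.
    Swapping x1 and x2 yields exactly the same array.
    """
    d = np.abs(x1 - x2)
    return np.stack([x1 + x2, x1 * x2, d, d ** 2], axis=-1)
```

Any classifier stacked on top of these features is therefore symmetric in its two inputs without having to see each pair in both orders during training.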

Training of god-tier kernel ②
1. Extract features from the train 🐳
– forward the feature extraction network
2. Extract positive pairs
3. Extract negative pairs
a. Compute the scores (* -1) between all 🐳 feature vectors and use them as the cost matrix (forward the classification network; the number of pairs is "number of images choose 2", which is feasible because the classification network is light)
b. Set the same-🐳 entries of the cost matrix to infinity, and the diagonal as well
c. Solve the linear assignment problem (LAP) on the cost matrix to obtain a list of low-cost pairs, i.e. build combinations that score high even though they are different 🐳
4. Train the whole network on the positive and negative pairs
Notes on the slide:
• Pairs that have been used get infinite cost; the cost matrix is reused for 5 epochs
• Within the same 🐳, make sure an image is never paired with itself
• At the beginning, random numbers are added to the cost matrix to go easy on the network
• A design like [🐳 🐳 -> single CNN -> 0~1] would be far too heavy for this
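Step 3 can be sketched as follows. This is a hedged illustration rather than the kernel's code: `scipy.optimize.linear_sum_assignment` is swapped in as the LAP solver, a large finite constant `BIG` stands in for the infinite cost, and the function and parameter names are made up here.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

BIG = 1e6  # finite stand-in for "infinite" cost (assumption of this sketch)

def mine_hard_negatives(scores, whale_ids, rng=None, noise=0.0):
    """Pick one hard negative partner per image by solving a LAP.

    scores[i, j]: "same whale" score for images i and j (higher = more similar).
    whale_ids: whale id of each image.
    noise: scale of random numbers added to the cost, to soften early mining.
    """
    cost = -np.asarray(scores, dtype=float)      # high score -> low cost
    if rng is not None and noise > 0:
        cost = cost + noise * rng.random(cost.shape)
    ids = np.asarray(whale_ids)
    same = ids[:, None] == ids[None, :]          # same whale, including the diagonal
    cost[same] = BIG                             # forbid positive pairs and self pairs
    rows, cols = linear_sum_assignment(cost)     # minimum-cost one-to-one matching
    return [(int(i), int(j)) for i, j in zip(rows, cols) if cost[i, j] < BIG]
```

Because the assignment is one-to-one, every image gets a different partner, which spreads the hard negatives over the dataset instead of reusing the single most confusing image. Setting the returned entries to `BIG` before the next call would mimic "used pairs get infinite cost".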

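Inference with god-tier kernel ②
1. Extract features from the train 🐳 (forward the feature extraction network)
2. Extract features from the test 🐳 (forward the feature extraction network)
3. Compute the test 🐳 vs. train 🐳 scores (forward the classification network)
4. for each test 🐳: add the 🐳 IDs of the train 🐳 to the answer in descending score order; however, when the score is at or below the threshold and new_whale is not yet in the answer, add new_whale
Note: taking the mean per whale_id might be better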
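Step 4 amounts to the following loop. It is a minimal sketch with a made-up function name; the threshold value and the tie handling are assumptions and are not taken from the kernel.

```python
def top5_with_new_whale(scores, train_ids, threshold):
    """Build the top-5 answer for one test image.

    scores: score of the test image against each train image.
    train_ids: whale id of each train image (same order as scores).
    threshold: once scores fall to this value or below, new_whale is inserted.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    answer = []
    for i in order:
        if len(answer) == 5:
            break
        # Insert new_whale the first time the score drops to the threshold or below.
        if scores[i] <= threshold and "new_whale" not in answer:
            answer.append("new_whale")
            if len(answer) == 5:
                break
        if train_ids[i] not in answer:        # each whale id appears only once
            answer.append(train_ids[i])
    return answer
```

With scores `[0.9, 0.8, 0.3, 0.2]`, IDs `["w_a", "w_a", "w_b", "w_c"]` and a threshold of 0.5, the answer is `["w_a", "new_whale", "w_b", "w_c"]`; raising the threshold above 0.9 moves new_whale to the first position.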

1st Solution
• Flip the 5004 classes to get 10008 classes, and do binary classification for each of them
• At test time, the flipped image is also fed in and the results are averaged (the corresponding class is known)
• Figure labels: 🐳, 512x256, BBOX, RGB+mask, global average pooling, pooling along the channel direction, BCE + lovasz_loss
https://www.kaggle.com/c/humpback-whale-identification/discussion/82366
https://github.com/earhian/Humpback-Whale-Identification-1st-
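One way to read "the corresponding class is known" is sketched below. The index layout (flipped class of whale k stored at index k + 5004) is an assumption of this sketch and may differ from the 1st-place code; the function names are made up.

```python
import numpy as np

N_IDS = 5004  # number of known whale ids, from the slide

def flipped_label(label):
    """Class index assigned to the mirrored image of whale `label` (assumed layout)."""
    return label + N_IDS

def flip_tta(prob_original, prob_flipped):
    """Average the two views of the same test image.

    prob_original: (10008,) per-class probabilities for the image as given.
    prob_flipped:  (10008,) per-class probabilities for its horizontal mirror.
    For whale k, the original view should fire class k and the mirrored view
    class k + N_IDS, so those two entries are averaged.
    """
    return 0.5 * (prob_original[:N_IDS] + prob_flipped[N_IDS:])
```

Treating the mirror image as a separate class doubles the number of training classes without adding labels, since a mirrored tail looks like a different individual.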

3rd Solution
• Train original bbox regressor (5-fold CV, 5 models trained)
• 320x320 input, DenseNet121 + ArcFace (s=65, m=0.5), weight decay 0.0005, dropout 0.5
• Augmentation: average blur, motion blur; add, multiply, grayscale; scale, translate, shear, rotate; align (single bbox) or no-align
• Inference
– train: compute feature vectors for each 🐳 using the 5 BBOXes, then average further per 🐳 ID
– test: compute feature vectors for each 🐳 using the 5 BBOXes and compare with the above
https://www.kaggle.com/c/humpback-whale-identification/discussion/82484
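The inference step can be sketched as below. The slide only says the vectors are averaged and compared; averaging over the 5 BBOXes, the L2 normalization and the use of cosine similarity are assumptions of this sketch, and all names are made up.

```python
import numpy as np

def l2_normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def id_prototypes(train_feats, train_ids):
    """train_feats: (n_images, 5, dim), one feature vector per bbox crop.

    Averages over the 5 crops per image, then over all images of each whale id.
    """
    per_image = l2_normalize(train_feats.mean(axis=1))
    ids = sorted(set(train_ids))
    protos = []
    for w in ids:
        mask = np.array([t == w for t in train_ids])
        protos.append(per_image[mask].mean(axis=0))
    return ids, l2_normalize(np.stack(protos))

def rank_ids(test_feats, ids, protos, k=5):
    """Rank whale ids for each test image by cosine similarity to the id prototypes."""
    q = l2_normalize(test_feats.mean(axis=1))
    sims = q @ protos.T
    top = np.argsort(-sims, axis=1)[:, :k]
    return [[ids[j] for j in row] for row in top]
```

Averaging per ID first means each test image is compared with 5004 prototypes rather than with every train image.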

Lingering regrets

4th Solution
• Brute force over all pairs with SIFT + RANSAC!
1. Loop through all test/train pairs
2. Match keypoints using faiss
3. Double homography filtering of keypoints (LMedS followed by RANSAC)
4. xgboost prediction to validate homography matrix
5. if # of matches > threshold, then use prediction
• The top-1 result is computed with the above; top-2 to top-5 are computed with a SiameseNet
• Notes on the slide: this is what was done in the Landmark competition; full-resolution 🐳; normalization with CLAHE (Contrast Limited Adaptive Histogram Equalization); tail segmentation with UNet
https://www.kaggle.com/c/humpback-whale-identification/discussion/82356

5th Solution
• SiameseNet (DenseNet121 backbone)
• Original BBOX regressor
• Augmentation: shear, rotation, flipping, contrast, Gaussian noise, blurring, color augmentations, greying, random crops
• LAP is solved on sub-blocks; the sub-blocks are generated randomly each time
• 4-fold stratified cross validation + 15-model ensemble
• pseudo label -> update folds (e.g. LB 0.938 -> LB 0.950 -> LB 0.965, etc.)
• Stacking (not much effect)
https://www.kaggle.com/c/humpback-whale-identification/discussion/82352
https://weiminwang.blog/2019/03/01/whale-identification-5th-place-approach-using-siamese-networks-with-adversarial-training/
(about half of it is an explanation of the kernel it was based on)

7th Solution
• SE-ResNeXt-50 -> global concat (max, avg) pool -> BN -> Dropout -> Linear -> ReLU -> BN -> Dropout -> clf (5004)
• 4 head classification
• use bbox
• center loss, ring loss, GeM pooling
• verification by local features (Hessian-AffNet + HardNet) https://github.com/ducha-aiki/mods-light-zmq
• Various backbones were tried, but fine-tuning a teammate's network trained with metric learning (SE-ResNeXt-50) worked best
• new_whale is inserted to softmaxed predictions with constant threshold, which is set on validation set by bruteforce search in range from 0 to 0.95.
https://www.kaggle.com/c/humpback-whale-identification/discussion/82352
https://github.com/ducha-aiki/whale-identification-2018
Comment: since they are doing something like metric learning, maybe a softmax threshold was good enough?
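For reference, GeM (generalized mean) pooling interpolates between average pooling and max pooling through a single exponent p. A minimal NumPy sketch, with a made-up function name and p = 3 as an arbitrary default rather than the team's setting:

```python
import numpy as np

def gem_pool(feature_map, p=3.0, eps=1e-6):
    """Generalized mean pooling over the spatial dimensions.

    feature_map: (channels, H, W) activations -> returns (channels,).
    p = 1 gives average pooling; large p approaches max pooling.
    Values are clipped to eps so the fractional power is well defined.
    """
    x = np.clip(feature_map, eps, None)
    return np.mean(x ** p, axis=(1, 2)) ** (1.0 / p)
```

In practice p is often a learnable parameter, so the network can choose how strongly to focus on the most activated locations.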

7th Solution
https://www.kaggle.com/c/humpback-whale-identification/discussion/82352

7th Solution
• Metric-learning-based approach
– training on RGB images: 256x256, 384x384, 448x448, 360x720
– Augmentations: random erasing, affine transformations (scale, translation, shear), brightness/contrast
– Models: resnet34, resnet50, resnet101, densenet121, densenet162, seresnext50 (backbone architectures that we've tried), followed by GeM pooling layer + L2 + multiplier
– Loss: hard triplet loss
• Not used in the actual submission; it served as the base network for the classification-based method
https://www.kaggle.com/c/humpback-whale-identification/discussion/82502
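"Hard triplet loss" is sketched here as the common batch-hard variant: for every anchor, take the farthest positive and the closest negative in the batch. Whether the team used exactly this mining scheme is an assumption, and the margin value and function name are placeholders.

```python
import numpy as np

def batch_hard_triplet_loss(embeddings, labels, margin=0.3):
    """Batch-hard triplet loss on Euclidean distances.

    embeddings: (batch, dim); labels: (batch,) whale ids.
    Anchors without any positive or negative in the batch are skipped;
    the batch is assumed to contain at least one usable anchor.
    """
    labels = np.asarray(labels)
    diff = embeddings[:, None, :] - embeddings[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))                 # pairwise distances
    same = labels[:, None] == labels[None, :]
    eye = np.eye(len(labels), dtype=bool)
    hardest_pos = np.where(same & ~eye, dist, -np.inf).max(axis=1)  # farthest same-id sample
    hardest_neg = np.where(~same, dist, np.inf).min(axis=1)         # closest other-id sample
    valid = np.isfinite(hardest_pos) & np.isfinite(hardest_neg)
    return np.maximum(hardest_pos[valid] - hardest_neg[valid] + margin, 0.0).mean()
```

This kind of loss needs batches sampled so that each whale ID appears at least twice, which is awkward for the many IDs that have only one or two images.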

9th Solution
• Summary: Adam, Cosine with restarts, CosFace, ArcFace, High-resolution images, Weighted sampling, new_whale distillation, Pseudo labeled test, Resnet34, BNInception, Densenet121, AutoAugment, CoordConv, GAPNet
• 1024x1024 resnet34, 512x152 BNInception, 640x640 DenseNet121
• CosFace: s=32, m=0.35. ArcFace: m1=1.0, m2=0.4, m3=0.15
• Augmentation: Horizontal Flip, Rotate with 16 degree limit, ShiftScaleRotate with 16 degree limit, RandomBrightnessContrast, RandomGamma, Blur, Perspective transform: tile left, right and corner, Shear, MotionBlur, GridDistortion, ElasticTransform, Cutout
• Note on the slide: CosFace + ArcFace
https://www.kaggle.com/c/humpback-whale-identification/discussion/82427
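The three ArcFace parameters m1, m2, m3 suggest the combined-margin form cos(m1 * theta + m2) - m3 for the true-class logit; that reading is an assumption of this sketch, as are the function name and the scale used in the ArcFace example. Under it, CosFace and plain ArcFace are special cases.

```python
import math

def target_logit(cos_theta, s, m1=1.0, m2=0.0, m3=0.0):
    """True-class logit with a combined margin: s * (cos(m1 * theta + m2) - m3).

    m1: multiplicative angular margin (SphereFace-style)
    m2: additive angular margin (ArcFace-style)
    m3: additive cosine margin (CosFace-style)
    """
    theta = math.acos(max(-1.0, min(1.0, cos_theta)))
    return s * (math.cos(m1 * theta + m2) - m3)

# CosFace with the slide's values: only the cosine margin is active.
cosface = target_logit(0.8, s=32.0, m3=0.35)
# Combined margin with the slide's ArcFace values (s=32 is a placeholder here).
combined = target_logit(0.8, s=32.0, m1=1.0, m2=0.4, m3=0.15)
```

All other class logits stay at s * cos(theta), so the margin only makes the true class harder to satisfy during training.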

10th Solution (SiameseNet part)
• Summary
– Siamese architecture
– Metric learning featuring brand-new CVPR 2019 method (will be published soon)
– Classification on features
– Large blend for new whale/not new whale binary classification
• Tricks
– Flip augmentation for both positive and negative pairs
– ResNet-18, ResNet-34, SE-ResNeXt-50, ResNet-50, image size: 299->384->512
– 0.929 LB -> ensemble 0.940
https://www.kaggle.com/c/humpback-whale-identification/discussion/82430

10th Solution
• (Metric learning part) Another solution will be explained later in detail by @asanakoy. In two words, it is metric learning with multiple branches and margin loss, trained on multiple resolution crops using bboxes, grayscale and RGB input images. He also used his brand-new method from CVPR which allowed for 1-2% score boost. (Apparently.)
• (Classification part) concat features from branch models and train classification model
• (Post processing) took their TOP-4 predictions for each whale. Then, for all of our models, we took their predictions on these set of classes. We used a blend of LogReg, SVM, several KNN models, and LightGBM to solve a binary classification problem.
https://www.kaggle.com/c/humpback-whale-identification/discussion/82430

15th Solution
• At the beginning, we used pure softmax to classify 5005 classes. The best result we obtained was around 0.86X using seresnext50.
• Then we resorted to sphereface. To use sphereface, we abandoned new whales, which means we only used around 19K images. This gave us 0.920 using seresnext-50 (multi-layer fusion, 384*384) and 0.919 using resnext50 (multi-layer fusion, 384*384).
• We also tried arcface, which gave us 0.911 using seresnext-50 (multi-layer fusion, 384*384).
https://www.kaggle.com/c/humpback-whale-identification/discussion/82361

My Solution ①
• 768x256 🐳 (BBOX), resnext101_32x4d backbone, ArcFace
• Known 🐳 only; duplicate 🐳 IDs are merged into one during training
• 🐳 with 10 or fewer images are oversampled so that they have at least 10
• Augmentation: grayscale, Gaussian noise, Gaussian blur, rotation, shear, piecewise affine, color shift, contrast, crop
• The cosine similarity of train 🐳 vs. test 🐳 is averaged over each ID
• Network (figure): 768x256 -> ResNeXt101 -> 24x8 -> bn, avepool(4) -> 6x2 -> flatten, dropout -> 24576 -> FC -> 512 (feature vector) -> FC -> 5004 -> ArcFace -> cross entropy
• private LB: 0.92239, public LB: 0.91156
• NO VALIDATION SET ;D due to time constraint
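Averaging the cosine similarity over each ID turns a test-image vs. train-image matrix into a test-image vs. whale-ID matrix. A minimal NumPy sketch with made-up names; the feature vectors are assumed to be the 512-dimensional outputs mentioned above, but any dimension works.

```python
import numpy as np

def image_to_id_scores(test_feats, train_feats, train_ids):
    """Cosine similarity of each test image to each whale id (mean over that id's images).

    test_feats: (n_test, dim), train_feats: (n_train, dim), train_ids: list of n_train ids.
    Returns (ids, scores) with scores of shape (n_test, n_ids).
    """
    t = test_feats / np.linalg.norm(test_feats, axis=1, keepdims=True)
    r = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
    sims = t @ r.T                                   # (n_test, n_train) cosine similarities
    ids = sorted(set(train_ids))
    # Membership matrix: column j marks the train images that belong to ids[j].
    member = np.array([[tid == w for w in ids] for tid in train_ids], dtype=float)
    return ids, (sims @ member) / member.sum(axis=0)  # per-id mean
```

Working at the ID level also makes score matrices from different models directly comparable, since they all share the shape (number of test images, number of IDs).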

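My Solution ②
• Ensemble with 512x512 SiameseNet model
• The test image vs. train image matrix is converted into a test image vs. 🐳 ID matrix
• TTA: the bounding box is used at its original scale plus 2 more scales
• Ensemble (figure): the score matrices P_1, P_2, P_3, ..., P_10, P_11, P_12 (test 🐳 images x train 🐳 "IDs", 7960 x 5004, each value in 0~1) are combined as
P = \sum_{i=1}^{12} w_i P_i^{\alpha}
– α is in 0~1; closer to 0 it behaves like voting, 1 is an ordinary weighted average
– personally, I just set it to 0.5 for now
• Cut at a threshold, insert new_whale, and write the submission file
• Model labels on the slide: 768x256 ArcFace, 512x512 SiameseNet, 768x256 ArcFace, 512x512 ArcFace
• Scores on the slide, in order of appearance:
– private: 0.92239, public: 0.91156
– 512x512 ArcFace: private: 0.92981, public: 0.91558
– private: 0.90242, public: 0.88183
– private: 0.89706, public: 0.86712
– …
• Thresholds were basically left untuned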
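The ensemble rule P = sum_i w_i * P_i^alpha is a one-liner per model. A minimal NumPy sketch with a made-up function name; the weights in the example are placeholders, not the values used in the competition.

```python
import numpy as np

def power_ensemble(score_matrices, weights, alpha=0.5):
    """Combine score matrices as sum_i w_i * P_i ** alpha.

    score_matrices: list of arrays of identical shape (n_test, n_ids), values in 0~1.
    weights: one weight per matrix.
    alpha: 1 is a plain weighted average; smaller values compress the scores
           so that each model's ranking matters more than its confidence.
    """
    total = np.zeros_like(score_matrices[0], dtype=float)
    for w, p in zip(weights, score_matrices):
        total += w * np.power(p, alpha)
    return total

# Example with two tiny 1x2 matrices and equal (placeholder) weights.
combined = power_ensemble([np.array([[0.25, 0.81]]), np.array([[1.0, 0.0]])], [1.0, 1.0])
```

Because the exponent is applied before summing, models whose scores live on different scales (for example cosine similarities and SiameseNet probabilities) can be mixed without one of them dominating.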

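Milestones
• 2/21: first submission, a minuscule score with my own model
• 2/22: learned about ArcFace
• 2/24: 448x448 model 0.786
• 2/26: 768x256 model 0.879
• 2/27: 768x256 model 0.887
• 2/28: 768x256 model 0.910
• 2/28, dead of night: 3 models finished, ensemble implemented; private: 0.95448, public: 0.94632
• Won the ensemble lottery in 3 submissions thanks to psychic hyperparameter tuning
• Score-based ensembling; an ensemble of completely different models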
