timm as a Backbone 2025

Yusuke Uchida (@yu4u), GO Drive Inc.

© GO Drive Inc. 4 Introduction to timm as a backbone
◼ The legendary deck that everyone loves
◼ Looking at it again, I realized it already says everything (The End)
https://www.slideshare.net/TakujiTahara/20210817-lt-introduction-to-pytorch-image-models-as-backbone-tawara-249996209

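© GO Drive Inc. 5 timm
◼ PyTorch Image Models
◼ A large collection of image recognition models, from CNNs to Vision Transformers, that are easy to use together with their pretrained weights
  ▪ It also comes with many convenient features
https://github.com/huggingface/pytorch-image-models/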

© GO Drive Inc. 7 Many kinds of pretrained models
① Pretrained with Fully Convolutional Masked Autoencoder (FCMAE),
② fine-tuned on ImageNet-22k,
③ and finally fine-tuned on ImageNet-1k
④ The input image size is 384
When in doubt, using an in22k_in1k model is a good default.
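As a small illustration of how such a name reads, the sketch below splits an example model name into its architecture and pretrained tag. It relies only on the `<architecture>.<pretrained_tag>` naming convention; the variable names are illustrative.

```python
# Example model name with a pretrained tag.
name = "convnextv2_tiny.fcmae_ft_in22k_in1k_384"

# timm names follow "<architecture>.<pretrained_tag>".
arch, tag = name.split(".", 1)

# The tag is an underscore-separated recipe:
# fcmae (pretraining) -> ft (fine-tuned) -> in22k -> in1k -> 384 (input size).
parts = tag.split("_")

print(arch)   # convnextv2_tiny
print(parts)  # ['fcmae', 'ft', 'in22k', 'in1k', '384']
```

Passing the full name to `timm.create_model(name, pretrained=True)` selects that specific set of weights.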

© GO Drive Inc. 9 Classification model
[Diagram: input (H, W, 3); feature maps at (H/2, W/2), (H/4, W/4), (H/8, W/8), (H/16, W/16), (H/32, W/32) through stem and stage1 to stage4; then pool (GAP) and a (1000,) output]

© GO Drive Inc. 12 When you want to use spatial information
[Diagram: same network as the classification model; the output is taken before pool (GAP), giving a (512, 7, 7) feature map instead of the (1000,) output]

© GO Drive Inc. 13 When you want multi-scale feature maps
[Diagram: same network as the classification model; feature maps are taken from every level: (64, 112, 112), (64, 56, 56), (128, 28, 28), (256, 14, 14), (512, 7, 7)]

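© GO Drive Inc. 15 Other options
◼ img_size
  ▪ Changes the supported input size when creating a model that only accepts a fixed input size
  ▪ Passing it to a model that does not support this argument raises an error
◼ drop_rate
  ▪ Dropout rate of the classification head
◼ drop_path_rate
  ▪ Drop rate of DropPath (stochastic depth)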

© GO Drive Inc. 16 timm is more than backbones
◼ timm itself is a framework for reproducing model training on ImageNet
  ▪ It includes many handy modules

© GO Drive Inc. 17 ModelEmaV3
◼ Exponential Moving Average (EMA)
  ▪ Updates the EMA model at regular intervals (steps) during training
  ▪ Improves generalization
  ▪ I always use it these days

© GO Drive Inc. 18 ModelEmaV3 ◼ Exponential Moving Average
(EMA)

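© GO Drive Inc. 21 create_optimizer_v2
◼ Factory function for optimizers
  ▪ Also includes some optimizers that PyTorch does not provide
  ▪ Quietly handy
◼ (Code annotation) omegaconf Config: handy because settings can be overridden from the command line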

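© GO Drive Inc. 22 create_scheduler_v2
◼ Factory function for schedulers
◼ (Code annotations)
  ▪ Update per step instead of per epoch
  ▪ Boilerplate needed when using it with pytorch-lightning
  ▪ Number of steps that accounts for batch_size, number of GPUs, and grad_acc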

© GO Drive Inc. 24 And more
◼ https://github.com/huggingface/pytorch-image-models/blob/main/timm/layers/__init__.py
◼ Take a look!
◼ It keeps the implementation of your own custom models simple

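© GO Drive Inc. 25 Backbone patterns learned from smp's TimmUniversalEncoder
◼ segmentation_models.pytorch (smp)
  ▪ Implementations of various semantic segmentation models
  ▪ Personally I use almost nothing but UNet, though I also see UPerNet
  ▪ It also provides a variety of loss functions and metrics
◼ TimmUniversalEncoder
  ▪ A wrapper class for using timm as a general-purpose encoder for semantic segmentation models
https://github.com/qubvel-org/segmentation_models.pytorch/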

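© GO Drive Inc. 26 A major update PR for TimmUniversalEncoder
◼ Semantic segmentation needs feature maps at multiple resolutions
◼ timm's output feature maps come in various formats, and TimmUniversalEncoder absorbs the differences
◼ This PR greatly increased the number of timm models that can be used
  ▪ https://github.com/qubvel-org/segmentation_models.pytorch/pull/1004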

© GO Drive Inc. 27 Key points
◼ Whether the feature map tensors are channel-first or channel-last
◼ The pattern of spatial resolutions in the list of feature maps
https://github.com/qubvel-org/segmentation_models.pytorch/blob/main/segmentation_models_pytorch/encoders/timm_universal.py
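The two points can be illustrated with a small sketch in plain PyTorch. Both helpers are hypothetical and use simple shape heuristics; they are not the logic used in smp's `timm_universal.py`.

```python
import torch

def to_channel_first(feat: torch.Tensor, num_channels: int) -> torch.Tensor:
    """Return feat as (B, C, H, W). Hypothetical helper using a shape heuristic.

    Note: the heuristic is ambiguous when H or W equals num_channels.
    """
    if feat.shape[1] == num_channels:   # already (B, C, H, W)
        return feat
    if feat.shape[-1] == num_channels:  # (B, H, W, C) -> (B, C, H, W)
        return feat.permute(0, 3, 1, 2).contiguous()
    raise ValueError(f"cannot find a channel axis of size {num_channels}")

def reductions(feats, input_size: int):
    """Downsampling factor of each (B, C, H, W) feature map."""
    return [input_size // f.shape[-1] for f in feats]

# Channel-first and channel-last versions of the same feature map shape.
nchw = torch.randn(1, 96, 56, 56)
nhwc = torch.randn(1, 56, 56, 96)
a = to_channel_first(nchw, 96)
b = to_channel_first(nhwc, 96)
print(a.shape, b.shape)  # both torch.Size([1, 96, 56, 56])

# A backbone whose first feature map is already at 1/4 resolution.
feats = [torch.randn(1, 8, s, s) for s in (56, 28, 14, 7)]
print(reductions(feats, 224))  # [4, 8, 16, 32]
```

A decoder that expects a 1/2-resolution feature first needs such a backbone to be handled differently from one whose list starts at stride 2, which is why the resolution pattern matters.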

© GO Drive Inc. 32 2.5D model
◼ CZII 4th Solution
  ▪ Forward each stage of the 2D encoder separately, pooling the output of each stage along the depth direction
  ▪ This enlarges the receptive field along depth inside the 2D encoder
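The idea can be sketched with a toy encoder. This is not the CZII solution's code: the conv stages stand in for the stages of a real 2D backbone, and average pooling by a factor of 2 along depth is an assumption made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Toy25DEncoder(nn.Module):
    """Hypothetical 2.5D encoder: 2D stages with depth pooling in between."""

    def __init__(self, in_chans=1, widths=(16, 32, 64)):
        super().__init__()
        stages, prev = [], in_chans
        for w in widths:
            stages.append(nn.Sequential(
                nn.Conv2d(prev, w, 3, stride=2, padding=1),
                nn.BatchNorm2d(w),
                nn.ReLU(inplace=True),
            ))
            prev = w
        self.stages = nn.ModuleList(stages)

    def forward(self, x):
        # x: (B, D, C, H, W), a stack of D slices per sample.
        b, d = x.shape[:2]
        x = x.flatten(0, 1)  # (B*D, C, H, W): every slice is a 2D image
        features = []
        for stage in self.stages:
            x = stage(x)  # run one 2D stage on all slices
            _, c, h, w = x.shape
            vol = x.reshape(b, d, c, h, w).permute(0, 2, 1, 3, 4)  # (B, C, D, H, W)
            vol = F.avg_pool3d(vol, kernel_size=(2, 1, 1))  # halve the depth
            d = vol.shape[2]
            features.append(vol)
            # Back to a stack of 2D slices for the next stage.
            x = vol.permute(0, 2, 1, 3, 4).reshape(b * d, c, h, w)
        return features

enc = Toy25DEncoder()
feats = enc(torch.randn(2, 8, 1, 32, 32))
shapes = [tuple(f.shape) for f in feats]
print(shapes)
# [(2, 16, 4, 16, 16), (2, 32, 2, 8, 8), (2, 64, 1, 4, 4)]
```

Because neighboring slices are merged after every stage, later 2D stages operate on features that already mix several depth positions.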

© GO Drive Inc. 2 I'm interested in model architectures! https://www.slideshare.net/ren4yu/presentations https://speakerdeck.com/yu4u

© GO Drive Inc. 3 Introduction to timm as a backbone ◼ The legendary deck that everyone loves https://www.slideshare.net/TakujiTahara/20210817-lt-introduction-to-pytorch-image-models-as-backbone-tawara-249996209

© GO Drive Inc. 6 What models are available?

© GO Drive Inc. 8 See the model card for details ◼ https://huggingface.co/timm/convnextv2_tiny.fcmae_ft_in22k_in1k_384

© GO Drive Inc. 10 Changing the number of input channels and output classes [Diagram: network with input shape (H, W, 5)]

© GO Drive Inc. 11 When you want to define your own head after feature extraction [Diagram: network with input shape (H, W, 3)]

© GO Drive Inc. 14 Feature map information

© GO Drive Inc. 19 ModelEmaV3 https://github.com/yu4u/kaggle-czii-4th/blob/main/src/pl_module.py The decay parameter is adjusted according to the number of training steps

© GO Drive Inc. 20 ModelEmaV3 ◼ I would also like to give Post-hoc EMA a proper try https://speakerdeck.com/yu4u/post-hoc-ema-emanojian-shuai-parametanoshi-hou-zui-shi-hua

© GO Drive Inc. 23 DropPath ◼ Also known as stochastic depth

© GO Drive Inc. 28 Patterns of spatial resolution in the feature map list ◼ I investigated them https://www.kaggle.com/code/ren4yu/eda-timm-backbone-feature-info/notebook [Figure label: backbones supported by TimmUniversalEncoder]

© GO Drive Inc. 29 FeatureListNet? ◼ Creating a model with features_only=True returns a model wrapped in FeatureListNet or a similar wrapper class https://github.com/huggingface/pytorch-image-models/blob/main/timm/models/_builder.py#L479-L491

© GO Drive Inc. 30 What was the title of my previous talk?

© GO Drive Inc. 31 Now it can be built! ◼ TimmUniversal2.5DEncoder

© GO Drive Inc. 33 Partially forwarding each stage of the 2D encoder ◼ The CZII implementation assumes SwinTransformer ◼ It should be possible to turn it into a general-purpose encoder