RSNA STR Pulmonary Embolism Detection Solution Overview

いのうえ (Inoue)

Data
• Dataset size: close to 1 TB
• Format: DICOM
• Train data
  ◦ Images: 1,790,594
  ◦ Patients: 7,279
  ◦ Average images per patient: 245
• Public data (28%)
  ◦ Images: 146,853
  ◦ Patients: 650
  ◦ Average images per patient: 226
[Figures: Train data, Test data]

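Data
• The number of images per patient is about the same whether or not PE is present.
• When PE is present, the number of PE-positive images is 100 or fewer in more than 80% of cases.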

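Data
• As in last year's competition, slices containing lesions are contiguous.
• PE is likely to be located near the center.
• Exploiting this property alone gives a score of about 0.33. Link to kernel
[Figure: PE present on image]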
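The "PE tends to be near the center" observation can be turned into a position-only baseline. The following is a minimal sketch, not the linked kernel's method: it assumes "center" refers to the relative slice position within an exam, and the function name `center_prior` and the `peak` and `width` values are hypothetical.

```python
import math

def center_prior(num_slices, peak=0.3, width=0.2):
    """Hypothetical position-only prior: the PE probability is highest
    for the middle slice and decays towards both ends of the exam."""
    probs = []
    for i in range(num_slices):
        rel = (i + 0.5) / num_slices  # relative position in (0, 1)
        probs.append(peak * math.exp(-((rel - 0.5) ** 2) / (2 * width ** 2)))
    return probs
```

Such a prior uses no pixel data at all, which is why it only serves as a reference point for real models.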

Evaluation: Label consistency
• Submissions that do not satisfy this condition are not eligible for prizes...
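As an illustration of what a label consistency check can look like, here is a minimal sketch covering only a subset of the requirement. It rests on assumptions not stated in the slide: a 0.5 threshold, and the rule that an exam with any positive image prediction must not be marked negative or indeterminate, while an exam without one must be marked as exactly one of the two. The official rules also constrain other labels, which are omitted here.

```python
def is_consistent(exam, image_probs, thr=0.5):
    """Check a simplified, assumed subset of the label consistency rules.

    exam: dict with exam-level probabilities for 'negative_exam_for_pe'
          and 'indeterminate'.
    image_probs: image-level PE probabilities of the same exam.
    """
    any_positive_image = any(p > thr for p in image_probs)
    negative = exam["negative_exam_for_pe"] > thr
    indeterminate = exam["indeterminate"] > thr
    if any_positive_image:
        # A positive image contradicts a negative or indeterminate exam.
        return not negative and not indeterminate
    # Without positive images, exactly one of the two must be set.
    return negative != indeterminate
```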

Metrics
• Exam-level (patient level)
• Image-level
• The image-level loss is not added when the exam is not PE-positive. Link to Discussion
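One way to see why negative exams contribute nothing at the image level is to weight the per-image log loss by the fraction of PE-positive images in the exam. The sketch below assumes that weighting scheme and leaves out the competition's per-label weights; `image_level_loss` is a hypothetical name.

```python
import math

def binary_log_loss(y, p, eps=1e-7):
    """Binary cross-entropy for one label, with clipping for stability."""
    p = min(max(p, eps), 1 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def image_level_loss(labels, probs):
    """Image-level loss of one exam, weighted by q, the fraction of
    PE-positive images (assumed weighting; label weights omitted)."""
    q = sum(labels) / len(labels)
    return q * sum(binary_log_loss(y, p) for y, p in zip(labels, probs))
```

For an exam without any positive image, `q` is 0, so even badly wrong image predictions add no loss.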

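Image Preprocessing
• As in previous years, CT images were windowed: a window is chosen and the intensity range in which the tissue of interest is well visualized is cut out.
• Reference: Qiita, 《Kaggleコンペ紹介》RSNA Intracranial Hemorrhage Detection ~CT画像から頭蓋内出血のタイプ分類~ (an introduction to the Kaggle competition on classifying intracranial hemorrhage types from CT images)
• Link to kernel, Link to discussion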
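CT windowing can be sketched in a few lines of NumPy. The example assumes the pixel values have already been converted to Hounsfield units; the window values used in the usage note are illustrative and not taken from the slides.

```python
import numpy as np

def apply_window(hu, center, width):
    """Clip a Hounsfield-unit image to a window and rescale it to [0, 1]."""
    lo = center - width / 2
    hi = center + width / 2
    clipped = np.clip(hu, lo, hi)
    return (clipped - lo) / (hi - lo)
```

For example, `apply_window(hu, center=100, width=700)` keeps the range from -250 to 450 HU. Several windows can be stacked as channels to build a 3-channel input.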

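High Scoring Kernel
• Eight days before the deadline, a kernel scoring around the middle of the silver-medal range was published together with a Discussion post describing the method in detail, which caused a stir in the competition.
[Diagram labels: B0 image-level pretrain, B0 exam-level retrain, GRU, image-level prediction, exam-level prediction, post-processing for label consistency]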

High Scoring Kernel
• Removing the post-processing from the high-scoring kernel gives Private 0.208, which is in the silver-medal range... \(^ o ^)/

High Scoring Kernel
• At inference time, the kernel loaded the test data once per model, which wasted a lot of time.
• Because of this, everyone assumed ensembling was not possible. (Someone in the Discussion said, "We are already at the time limit, so no more ensembling is possible.")
• Ensembling is possible if you lower the batch size for data loading and run multiple models on each batch, so that the data is loaded only once.
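The single-load idea can be sketched without any framework. Here `batches` stands for any iterable that yields already-loaded batches and each model is a callable returning one prediction per sample; both are assumptions for illustration, not the kernel's actual code.

```python
def predict_ensemble(batches, models):
    """Average the predictions of several models while reading the data once."""
    averaged = []
    for batch in batches:  # each batch is loaded exactly once
        per_model = [model(batch) for model in models]
        for preds in zip(*per_model):  # all models' predictions for one sample
            averaged.append(sum(preds) / len(models))
    return averaged
```

Compared with one pass over the data per model, the loading cost stays constant as models are added; only the compute per batch grows, which is why a smaller batch size may be needed to fit memory.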

My final models

Top solutions Link to Discussion

1st place solution
• Accuracy improved as the image size increased.
• A lung localizer was built to locate the lung region and zoom in on it, which produces high-resolution images efficiently.
• The bounding boxes were hand-labeled.
• Lung localizer: B0, regression of (x_min, y_min, x_max, y_max)
Discussion: https://www.kaggle.com/c/rsna-str-pulmonary-embolism-detection/discussion/194145
Github: https://github.com/GuanshuoXu/RSNA-STR-Pulmonary-Embolism-Detection
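Cropping to a predicted lung box might look like the sketch below. It assumes the box is given in pixel coordinates as (x_min, y_min, x_max, y_max); the `margin` parameter and the function name are hypothetical, and the subsequent resize to the network input size is left out.

```python
import numpy as np

def crop_to_box(image, box, margin=0):
    """Crop a 2D image to a bounding box given in pixel coordinates."""
    height, width = image.shape[:2]
    x_min, y_min, x_max, y_max = box
    # Expand by the margin and clamp to the image borders.
    x0 = max(int(x_min) - margin, 0)
    y0 = max(int(y_min) - margin, 0)
    x1 = min(int(x_max) + margin, width)
    y1 = min(int(y_max) + margin, height)
    return image[y0:y1, x0:x1]
```

Resizing the crop instead of the full slice spends the available resolution on the lungs rather than on the background.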

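1st place solution
• In the 2nd-stage model, sequences of different lengths were brought to a common length by resizing or padding.
• The vectors of the preceding and following slices were also added, turning each 2048-dimensional vector into a 6144-dimensional one. (Taken from 大越さん's solution last year.)
Discussion: https://www.kaggle.com/c/rsna-str-pulmonary-embolism-detection/discussion/194145
Github: https://github.com/GuanshuoXu/RSNA-STR-Pulmonary-Embolism-Detection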
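The 2048 -> 6144 expansion can be sketched as concatenating each slice's feature vector with those of its neighbors. How the first and last slices are handled is not stated in the slides; reusing the edge slice itself is an assumption made here.

```python
import numpy as np

def add_neighbor_features(feats):
    """feats: (num_slices, dim) array. Returns (num_slices, 3 * dim) with
    [previous, current, next] features concatenated per slice."""
    # Edge slices reuse their own features as the missing neighbor (assumption).
    prev = np.concatenate([feats[:1], feats[:-1]], axis=0)
    nxt = np.concatenate([feats[1:], feats[-1:]], axis=0)
    return np.concatenate([prev, feats, nxt], axis=1)
```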

Step 1: Feature Extraction
• Neighboring images are stacked and the label of the middle image is predicted. (RV/LV ratio is excluded from the targets.)
2nd place solution
Discussion: https://www.kaggle.com/c/rsna-str-pulmonary-embolism-detection/discussion/193401
Github: https://github.com/i-pan/kaggle-rsna-pe
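Stacking neighboring slices as channels can be sketched as follows. The sketch assumes one neighbor on each side (a 3-channel input) and repeats the edge slice at the volume borders; neither detail is confirmed by the slides.

```python
import numpy as np

def stack_neighbors(volume, index):
    """volume: (num_slices, H, W). Returns a (3, H, W) array holding the
    slices index-1, index, index+1; the label belongs to the middle one."""
    last = volume.shape[0] - 1
    ids = [max(index - 1, 0), index, min(index + 1, last)]
    return volume[ids]
```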

Step 2: Sequence Modeling
• A Transformer was trained on the sequences of features extracted by the CNN, with resize (zoom) and padding applied as needed.
• Since negative exams do not add to the score, only positive exams were used for training.
• In addition, the loss was weighted according to the proportion of PE-positive images.
2nd place solution
Discussion: https://www.kaggle.com/c/rsna-str-pulmonary-embolism-detection/discussion/193401
Github: https://github.com/i-pan/kaggle-rsna-pe
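Bringing feature sequences to a fixed length can be sketched like this. The rule used here (zoom down sequences that are too long by linear interpolation, zero-pad shorter ones and return a mask) is an assumption for illustration; the slides only say that resize and padding were used as needed.

```python
import numpy as np

def fit_length(seq, target_len):
    """seq: (n, dim) feature sequence. Returns (target_len, dim) features
    and a boolean mask marking the real (non-padded) positions."""
    n, dim = seq.shape
    if n > target_len:
        # Zoom: resample every feature dimension along the slice axis.
        src = np.arange(n)
        dst = np.linspace(0, n - 1, target_len)
        out = np.stack([np.interp(dst, src, seq[:, d]) for d in range(dim)], axis=1)
        mask = np.ones(target_len, dtype=bool)
    else:
        # Pad: keep the sequence and fill the tail with zeros.
        out = np.zeros((target_len, dim), dtype=seq.dtype)
        out[:n] = seq
        mask = np.arange(target_len) < n
    return out, mask
```

The mask lets a Transformer ignore the padded positions in attention and in the loss.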

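Step 3: Time-Distributed CNN
• To make the predictions more diverse, a model combining CNN + Transformer was trained end to end.
• For the data, the 30% of images with the highest PE probability were used. They were ordered by PE probability, and 32 images were sampled and fed to the CNN.
• The parameters were initialized from those pretrained in Step 1 and Step 2.
2nd place solution
Discussion: https://www.kaggle.com/c/rsna-str-pulmonary-embolism-detection/discussion/193401
Github: https://github.com/i-pan/kaggle-rsna-pe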
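The slice selection for the TD-CNN can be sketched as below. Sampling uniformly at random within the top 30%, and sampling with replacement when fewer than 32 slices are available, are assumptions; the slides only give the 30% and 32 figures and the ordering by PE probability.

```python
import random

def sample_slices(pe_probs, keep_frac=0.3, num_samples=32, seed=None):
    """Return indices of num_samples slices drawn from the top keep_frac
    of slices by PE probability, ordered from most to least probable."""
    rng = random.Random(seed)
    order = sorted(range(len(pe_probs)), key=lambda i: pe_probs[i], reverse=True)
    top = order[:max(1, round(len(order) * keep_frac))]
    if len(top) >= num_samples:
        chosen = rng.sample(top, num_samples)
    else:
        # Too few candidates: sample with replacement (assumption).
        chosen = [rng.choice(top) for _ in range(num_samples)]
    return sorted(chosen, key=lambda i: pe_probs[i], reverse=True)
```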

Step 4: Heart Slice Prediction
• This is the pipeline for predicting the RV/LV ratio.
• 1,000 images were hand-labeled for the presence of the heart. Once the lowest and the highest heart positions are known, everything in between is heart.
• EfficientNetB1 pruned, 256x256 -> crop 224x224 ⇒ AUC 0.998
2nd place solution
Discussion: https://www.kaggle.com/c/rsna-str-pulmonary-embolism-detection/discussion/193401
Github: https://github.com/i-pan/kaggle-rsna-pe
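The labeling shortcut (mark only the lowest and highest heart slice, and treat everything in between as heart) can be sketched as follows; the function and argument names are hypothetical.

```python
def heart_labels(num_slices, first_heart, last_heart):
    """Per-slice heart labels derived from two annotated slice indices.
    Every slice between first_heart and last_heart (inclusive) is heart."""
    return [1 if first_heart <= i <= last_heart else 0 for i in range(num_slices)]
```

For example, `heart_labels(6, 2, 4)` gives `[0, 0, 1, 1, 1, 0]`, so two annotations per exam yield a label for every slice.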

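Step 5: RV/LV 3D CNN
• From the positive exams, the images containing the heart were used to learn the RV/LV ratio labels.
• A 101-layer channel-separated network (arXiv, Github) pretrained on Instagram videos was used.
• At this stage the loss did not go down much (0.44~0.48).
• Right PE and left PE are very important information for predicting the RV/LV ratio, so the PE labels from Step 2 were combined with the intermediate-layer features of the model above and a linear model was trained on them.
• The loss improved to 0.22 ~ 0.25.
2nd place solution
Discussion: https://www.kaggle.com/c/rsna-str-pulmonary-embolism-detection/discussion/193401
Github: https://github.com/i-pan/kaggle-rsna-pe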

Final model
- 2x ResNeSt50 feature extractors
- 6x exam transformers (3 for each extractor)
- 6x slice transformers
- 5x ResNeSt50 TD-CNN
- 1x EfficientNet-B1 pruned heart slice classifier
- 5x ip-CSN-101 3D CNN RV/LV feature extractor
- 5x RV/LV linear model
2nd place solution
Discussion: https://www.kaggle.com/c/rsna-str-pulmonary-embolism-detection/discussion/193401
Github: https://github.com/i-pan/kaggle-rsna-pe


Inoichan
