icebee16
September 23, 2019

# Aptos2019 48th Solution

## Transcript

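3. ### APTOS 2019 in One Page

• Input: fundus image
• Output: subjective severity rating, from 0 (healthy) through 1, 2, 3 to 4 (dangerous)

A competition on classifying the severity of diabetic retinopathy.

References: https://www.kaggle.com/c/aptos2019-blindness-detection https://en.wikipedia.org/wiki/Ophthalmoscopy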


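8. ### Problems & Approaches

Our Solution

• Problem: solve it as classification, or as regression followed by thresholding? → Approach: regression followed by threshold optimization
• Problem: unclear how to ensemble → Approach: average the regression model outputs and average the thresholds as well
• Problem: labels are severely imbalanced → Approach: augment with data built for the same task
• Problem: image size and brightness vary widely → Approach: handled with our Original Crop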
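9. ### Regression Followed by Threshold Optimization

Our Solution: solve it as classification, or as regression followed by thresholding?

In the end, regression + thresholding was the stronger option.

Threshold optimization method: https://www.kaggle.com/abhishek/optimizer-for-quadratic-weighted-kappa

• Classification: class probabilities 0.1, 0.4, 0.3, 0.15, 0.05 for Classes 0 to 4 → Class 1
• Regression + thresholding: output 1.4 on a 0.0 to 4.0 scale split by the thresholds 0.8, 1.2, 2.6, 3.2 → Class 2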
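The thresholding step on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `to_class` is hypothetical, and the threshold values are the ones read from the slide's number line.

```python
from bisect import bisect_right


def to_class(value, thresholds):
    """Map a continuous regression output to an integer class.

    `thresholds` must be sorted in ascending order. A value below the
    first threshold becomes class 0, and every threshold that the value
    reaches or exceeds raises the class by one.
    """
    return bisect_right(thresholds, value)


# Threshold values taken from the slide; the function name is an assumption.
thresholds = [0.8, 1.2, 2.6, 3.2]
print(to_class(1.4, thresholds))  # 2
```

Because the thresholds are free parameters, they can be tuned on validation data after training, which is what the linked optimizer kernel does.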
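10. ### Averaging Regression Outputs and Thresholds

Our Solution: unclear how to ensemble

Run inference with each model, then classify the mean prediction with the thresholds averaged over all models.

• Prediction of model 1: 1.54
• Prediction of model 2: 2.16
• ⋮
• Prediction of model n: 1.87
• Mean prediction: 1.93

| | Threshold 1 | Threshold 2 | Threshold 3 | Threshold 4 |
|---|---|---|---|---|
| Model 1 | 0.90 | 1.62 | 2.64 | 3.31 |
| Model 2 | 0.82 | 1.53 | 2.36 | 3.39 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| Model n | 0.76 | 1.77 | 2.75 | 3.46 |
| Mean | 0.85 | 1.89 | 2.55 | 3.33 |

Result: Class 2

Ensembles that fix a class per model first and then average the classes did not improve the score.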
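The averaging scheme on this slide can be sketched as follows. This is an illustrative sketch, not the authors' implementation: `ensemble_class` is a hypothetical name, and only the three model rows visible on the slide are used. The slide's own means (1.93 and 0.85, 1.89, 2.55, 3.33) include models hidden behind the ellipsis, so the averages computed here differ, although the resulting class is the same.

```python
from bisect import bisect_right
from statistics import mean


def ensemble_class(predictions, thresholds_per_model):
    """Average model outputs and per-model thresholds, then classify once."""
    mean_prediction = mean(predictions)
    # zip(*...) groups the k-th threshold of every model together.
    mean_thresholds = [mean(ts) for ts in zip(*thresholds_per_model)]
    return bisect_right(mean_thresholds, mean_prediction)


# Only the rows shown on the slide; the elided models are not available.
predictions = [1.54, 2.16, 1.87]
thresholds_per_model = [
    [0.90, 1.62, 2.64, 3.31],
    [0.82, 1.53, 2.36, 3.39],
    [0.76, 1.77, 2.75, 3.46],
]
print(ensemble_class(predictions, thresholds_per_model))  # 2
```

Averaging before thresholding keeps the continuous information from every model, whereas fixing a class per model first discards it.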
11. ### Traditional EDA of APTOS Data

Our Solution: labels are severely imbalanced

• Class 0 makes up more than half of the data
• Class 4 is overwhelmingly rare
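12. ### Augmenting with Data for the Same Task

Our Solution: labels are severely imbalanced

We used part of the data from a similar competition held in 2015 (hereafter, DRD 2015).

Reference: https://www.kaggle.com/c/diabetic-retinopathy-detection
Resized version: https://www.kaggle.com/benjaminwarner/resized-2015-2019-blindness-detection-images

How we used it:

• Pattern 1: use all DRD 2015 images except Class 0
• Pattern 2: add as many DRD 2015 images as possible while preserving the class ratios of APTOS

Other handling of the training data:

• Removed all duplicated images from APTOS. Reference: https://www.kaggle.com/maxwell110/duplicated-list-csv-file (thanks to Maxwell)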
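One way to read Pattern 2 is as a small allocation problem: find the largest number of extra images per class that keeps the APTOS proportions unchanged. The sketch below is an assumption about how that could be computed, not the authors' code. The function name `extra_per_class` and all class counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from fractions import Fraction
from math import floor


def extra_per_class(aptos_counts, drd_counts):
    """Largest number of DRD 2015 images to add per class while keeping
    the APTOS class proportions unchanged.

    The class that runs out of DRD 2015 images first limits the common
    scale factor. Fractions avoid floating-point rounding surprises.
    """
    scale = min(
        Fraction(drd_counts[c], n) for c, n in aptos_counts.items() if n > 0
    )
    return {c: floor(scale * n) for c, n in aptos_counts.items()}


# Hypothetical counts for illustration only, not the real dataset sizes.
aptos_counts = {0: 50, 1: 10, 2: 25, 3: 5, 4: 10}
drd_counts = {0: 1000, 1: 100, 2: 200, 3: 40, 4: 30}
print(extra_per_class(aptos_counts, drd_counts))
# {0: 150, 1: 30, 2: 75, 3: 15, 4: 30}
```

In this toy example Class 4 is the bottleneck: only 30 extra images exist, so every other class is scaled by the same factor of 3.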
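13. ### Traditional EDA of APTOS Data

Our Solution: image size and brightness vary widely

• Class 0 makes up more than half of the data
• Class 4 is overwhelmingly rare
• Image sizes also vary widely…
• Possibly a leak…?
• Far too many 480x640 images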
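14. ### Handling It with Our Original Crop

Our Solution: image size and brightness vary widely

Problems with the APTOS fundus images:

• The shape is not stable (circular, or oddly zoomed in)
• Image size and brightness vary widely

Ben's Edge Crop

• Input: fundus image
• Output: image with consistent size, brightness, and (to some extent) shape
• Assumption: the fundus is almost circular
• Pros: mostly solves the variation problem
• Cons: a little slow

Processing from Step 1 (in) to Step 5 (out):

• Edge detection (Canny method)
• From a line segment joining two points on the circle, derive a straight line that passes through the center of the circle
• Obtain such lines several times and compute the circle center from their intersections
• Obtain the radius from the center and the edge
• Place the image at the center and resize
• Unify brightness with Ben's Crop (1st Solution@DRD 2015)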
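The center-finding step relies on a geometric fact: the perpendicular bisector of any chord passes through the circle center. The sketch below is an illustration under stated assumptions, not the authors' implementation. The name `circle_center` is hypothetical, the edge points are synthetic stand-ins for what an edge detector such as Canny would return, and the intersection is computed in the least-squares sense rather than by intersecting lines pairwise.

```python
import math


def circle_center(chords):
    """Estimate a circle center from chords given as ((x1, y1), (x2, y2)).

    Each chord with direction d and midpoint m yields one linear equation
    d . c = d . m for the center c, because the perpendicular bisector of
    the chord passes through the center. The equations are solved through
    the 2x2 normal equations.
    """
    sxx = sxy = syy = bx = by = 0.0
    for (x1, y1), (x2, y2) in chords:
        dx, dy = x2 - x1, y2 - y1
        rhs = dx * (x1 + x2) / 2 + dy * (y1 + y2) / 2
        sxx += dx * dx
        sxy += dx * dy
        syy += dy * dy
        bx += dx * rhs
        by += dy * rhs
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("chords are parallel; the center is not determined")
    return (bx * syy - by * sxy) / det, (by * sxx - bx * sxy) / det


# Synthetic edge points on a circle centered at (160, 120) with radius 100.
angles = [0.3, 1.1, 2.0, 2.9, 4.0, 5.2]
points = [(160 + 100 * math.cos(a), 120 + 100 * math.sin(a)) for a in angles]
chords = list(zip(points[:-1], points[1:]))

cx, cy = circle_center(chords)
radius = math.hypot(points[0][0] - cx, points[0][1] - cy)
print(round(cx, 3), round(cy, 3), round(radius, 3))
```

With the center and radius known, the image can be cropped to the bounding square of the circle and resized, which matches the remaining steps listed on the slide.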
15. ### Our Total Solution

Our Solution, @icebee__ @mocobt

Final result (Mean Average Blending):

• Private: 48th / 2943, LB: 0.9259 (Top 2%)
• Public: 230th / 2943, LB: 0.8078 (Top 8%)
• Numbers shown with the blend: 25, 10, 5

Training data:

• DRD 2015: Remove 0 label
• DRD 2015: Preserve the label ratio of 2019
• APTOS (duplicated removed)

Preprocessing:

• Ben's Edge Crop (Size: 320)
• Ben's Edge Crop (Size: 256)
• Normalization with ImageNet stats (two branches)
• [fast.ai's transform] Resize: squish, Padding: reflection, Only Resizing (Size: 256)

Training (regression), three configurations:

• EfficientNet-b4 (pretrained), Batch size: 16, Adam (lr: 0.001, weight_decay: 1e-5), MSELoss, Augmentation: RandomRotation, Random HorizontalFlip, Threshold optimization
• EfficientNet-b4 (pretrained), Batch size: 32, Adam (lr: 0.001, weight_decay: 1e-5), MSELoss, Augmentation: RandomRotation, Random HorizontalFlip, Threshold optimization
• EfficientNet-b5 (pretrained), Batch size: 64, Adam (lr: 0.001, weight_decay: 1e-5), QWK (early stopping: MSELoss), Augmentation: Defaults of fast.ai, Random VerticalFlip, Threshold optimization

Link: https://www.kaggle.com/muhakabartay/bump-to-0-800-e-2ln-more-efficient (Most stable Kernel)

Validation:

• Stratified 1fold Validation
• Stratified 5fold Validation
• Random split Validation

Prediction:

• Fold: 5, TTA: 5 (RandomRotate, RandomHorizontalFlip)
• Fold: 1, TTA: 10 (RandomRotate, RandomHorizontalFlip)
• Fold: 1, TTA: None, Weight: 5

Scores:

• Private LB: 0.923899, Public LB: 0.806519
• Private LB: 0.917967, Public LB: 0.790956
• Private LB: 0.910…, Public LB: 0.792…
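QWK (quadratic weighted kappa) appears on this slide as a training metric and is the score reported on the leaderboard. For readers unfamiliar with it, a minimal pure-Python sketch follows. It is not taken from the authors' repository, and the function name and the `n_classes` parameter are assumptions made for illustration.

```python
def quadratic_weighted_kappa(y_true, y_pred, n_classes=5):
    """Quadratic weighted kappa between two lists of integer labels.

    Disagreements are penalized by the squared distance between classes,
    and the observed penalty is compared with the penalty expected if the
    two label distributions were independent.
    """
    n = len(y_true)
    observed = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1
    hist_true = [sum(row) for row in observed]
    hist_pred = [
        sum(observed[i][j] for i in range(n_classes)) for j in range(n_classes)
    ]
    numerator = denominator = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            numerator += weight * observed[i][j]
            denominator += weight * hist_true[i] * hist_pred[j] / n
    # Undefined when every label in both lists is the same single class.
    return 1.0 - numerator / denominator


print(quadratic_weighted_kappa([0, 1, 2, 3, 4], [0, 1, 2, 3, 4]))  # 1.0
print(quadratic_weighted_kappa([0, 1, 2, 3, 4], [4, 3, 2, 1, 0]))  # -1.0
```

Because the penalty grows with the squared class distance, predicting Class 4 for a Class 0 image costs far more than predicting Class 1, which is one reason a regression formulation suits this metric.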
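16. ### What We Tried That Did Not Work

Our Solution

• Pre-training on DRD2015 → training on APTOS: no improvement, possibly because our validation was poor
• Using the Messidor Database: no improvement, possibly because the label definitions differ
• Building a model that detects Class 0 → using it in the predictions of other models: on Public it often misjudged Classes 1, 2, 3 as Class 0, so it did not help much. It does appear to have helped on Private.
• Applying a Weighted MSE Loss (weighted toward classes with few samples): almost no improvement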
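The slide does not say how the Weighted MSE Loss was weighted, only that rare classes received more weight. The sketch below shows one common choice, inverse class frequency, purely as an assumption. Both function names are hypothetical and plain Python is used instead of a deep learning framework to keep the arithmetic visible.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n / (k * count), so rare classes weigh more."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * count) for c, count in counts.items()}


def weighted_mse(predictions, labels, class_weights):
    """Mean squared error where each sample is scaled by its class weight."""
    weights = [class_weights[y] for y in labels]
    squared = [w * (p - y) ** 2 for w, p, y in zip(weights, predictions, labels)]
    return sum(squared) / sum(weights)


# Toy data: three Class 0 samples and one Class 4 sample that is off by one.
labels = [0, 0, 0, 4]
predictions = [0.0, 0.0, 0.0, 3.0]
class_weights = inverse_frequency_weights(labels)
print(weighted_mse(predictions, labels, class_weights))
```

In this toy case the plain MSE would be 0.25, while the weighted version is about 0.5, because the single error falls on the rare class.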
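17. ### What We Could Not Try

Our Solution

• Batch size > 32 or Image size > 320: not enough memory. With fast.ai, however, Batch size 64 ran without trouble on a P100… The top teams used larger images. We had given up on this as impossible to implement, which we regret…
• Other models such as Inception, SEResNeXt, DenseNet165: EfficientNet alone kept us fully occupied… (we did try DenseNet, but it did not perform well)
• EDA with visualization methods such as GradCAM: the implementation was not ready in time… We really should have tried this… (Reference: the blog by Nakama>ω</)
• Pseudo labeling: we forgot about it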

22. ### Pipeline Overview

Improving experiment efficiency: automation with a pipeline

• Preprocessing: Config + Train → Preprocessed (✓ Save as caches)
• Training: Config + Preprocessed → Log (✓ Main log ✓ Learning curve ✓ Confusion Matrix) and Model (✓ Pytorch Model ✓ Threshold data)
• Prediction: Test + Config + Model → Submit
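23. ### Pipeline Summary

Improving experiment efficiency: automation with a pipeline

• The Kernel usage restrictions had almost no effect on training. In the later stage of the competition we kept spinning up instances on GCP. The "the Kernel won't start… what is this…" problem at submission time was, unsurprisingly, beyond what we could handle.
• Uploading Datasets from the pipeline through the API reduced manual work. The Kaggle API documentation is lacking, so this took a fair amount of effort. If there is demand, we will write it up as an article.
• However, over-engineering the pipeline will trip you up (from experience). Implementation costs pile up, and the actual work on the competition tends to be neglected.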

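25. ### The Ultimate Experiment Management ~ Grand Excel Master ~

Improving experiment efficiency: experiment management

• Experiment ID: used to manage Kaggle Datasets
• LB & CV
• Hypothesis & implementation details & results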
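26. ### Lessons Learned from Experiment Management

Improving experiment efficiency: experiment management

Spreadsheet columns: Experiment ID (used to manage Kaggle Datasets), LB & CV, hypothesis & implementation details & results.

How we formed hypotheses was unclear:

• Only hypotheses based on findings we had seen elsewhere
• No hypotheses grounded in our own EDA

Many blank cells:

• We did not reflect on the results
• We first looked at the confusion matrix only about 5 days before the deadline
• Well… we got the medal, so…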

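28. ### Summary

• We managed to win a silver medal in APTOS
• Our Original Crop took us reasonably high up the leaderboard
• Image size and Pseudo Labeling were the big differences between us and the top teams
• Differences in architecture are not that important an issue… we think
• A pipeline is handy for improving experiment efficiency, and the Kaggle API is handy too (though flaky)
• Excel makes it easy to look back at results
• A strained back is very painful, both physically and financially

(Bonus) Repository and retrospective blog posts:

• https://github.com/icebee16/kaggle_APTOS2019
• http://icebee.hatenablog.com/entry/2019/09/10/221351
• https://mocobt.hatenablog.com/entry/2019/09/09/013658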
29. ### References

Papers addressing the same task as the competition:

• [Sayres et al. Ophthalmology 2019, Volume 126, Issue 4, Pages 552-564]
• [Krause et al. Ophthalmology 2018, Volume 125, Issue 8, Pages 1264-1272]
• [Poplin et al. Nature Biomedical Engineering 2018]
• [Gulshan et al. JAMA, The Journal of the American Medical Association 2016, 316(22)]

Survey Papers:

• [Qureshi et al. Symmetry 2019, 11(6), 749]
• [Schmidt-Erfurth et al. Progress in Retinal and Eye Research 2018, Volume 67, Pages 1-29]
• [Fenner et al. Ophthalmology and Therapy 2018, Volume 7, Issue 2, Pages 333-346]
• [Almotiri et al. Applied Sciences 2018, 8(2), 155]