Aptos2019 48th Solution

icebee16
September 23, 2019

Transcript

  1. @icebee__ @mocobt
    48th Solution: Competition Retrospective
    A Homemade Pipeline and Other Odds and Ends
    APTOS Blindness Detection

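  2. Self-introduction
    @icebee__
    • Researcher in the finance sector
    • Contribution this time: streamlining experiments by building a pipeline
    • CG R&D
    • Contribution this time: findings from literature review and EDA
    @mocobt
    We take part in Kaggle as a fixed team!!
    Reference: https://mocobt.hatenablog.com/entry/2019/07/14/013922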

  3. APTOS 2019 in One Page
    Input: fundus image  Output: subjective rating
    Healthy 0 1 2 3 4 Dangerous
    A competition on classifying the severity of diabetic retinopathy
    Reference: https://www.kaggle.com/c/aptos2019-blindness-detection
    https://en.wikipedia.org/wiki/Ophthalmoscopy

  4. APTOS Results
    ...Why is the team name "Mikan" (mandarin orange)?
    Private: a real shake-up~ (Top 2%)
    Public: finally got there on deadline day (Top 8%)

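  5. Origin of "Mikan"
    Note: the images are illustrative. From here on, out of consideration for privacy, we use pictures of mandarin oranges.
    The fundus images started to look like mandarin oranges.
    They look tasty... I want to eat a mikan...
    Am I a psycho? (quietly set it as the team name)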

  6. Agenda
    Streamlining Experiments
    Our Solution

  7. First Half: Our Solution

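  8. Problems & Approaches
    Problem: Solve it as classification, or solve it as regression and then threshold?
    Approach: Solve it as regression, then optimize the thresholds
    Problem: How to ensemble is a mystery
    Approach: Average the regression outputs + average the thresholds too
    Problem: The labels are unbelievably imbalanced
    Approach: Pad the training set with data from the same task
    Problem: Image sizes and brightness vary widely
    Approach: Handle it with our Original Crop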

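  9. Solve as Regression, Then Optimize the Thresholds
    Our Solution: Solve it as classification, or solve it as regression and then threshold?
    In the end, regression + thresholding was the stronger option.
    Threshold optimization method: https://www.kaggle.com/abhishek/optimizer-for-quadratic-weighted-kappa
    Classification
    • Probabilities for Class 0 / 1 / 2 / 3 / 4: 0.1 / 0.4 / 0.3 / 0.15 / 0.05 → Class 1
    Regression + thresholding
    • Value axis from 0.0 to 4.0 with thresholds 0.8 / 1.2 / 2.6 / 3.2; output 1.4 → Class 2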
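
The regression + thresholding idea above can be sketched in plain Python. This is a minimal illustration, not the authors' code: `to_class`, `quadratic_weighted_kappa`, and `optimize_thresholds` are hypothetical names, and the simple coordinate search is a stand-in assumption for the optimizer in the linked kernel, which is not reproduced here.

```python
from bisect import bisect_right


def to_class(value, thresholds):
    """Map a regression output to a class: the number of thresholds <= value."""
    return bisect_right(thresholds, value)


def quadratic_weighted_kappa(y_true, y_pred, n_classes=5):
    """Quadratic weighted kappa between two integer label sequences."""
    n = len(y_true)
    observed = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1
    true_hist = [sum(row) for row in observed]
    pred_hist = [sum(col) for col in zip(*observed)]
    disagreement = expected = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            disagreement += weight * observed[i][j]
            expected += weight * true_hist[i] * pred_hist[j] / n
    # Kappa is undefined when both sequences use one identical class; return 0.0 then.
    return 1.0 - disagreement / expected if expected else 0.0


def optimize_thresholds(values, labels, init=(0.5, 1.5, 2.5, 3.5), step=0.05, sweeps=3):
    """Coordinate search: nudge one threshold at a time, keep changes that raise QWK."""
    def score(thresholds):
        return quadratic_weighted_kappa(labels, [to_class(v, thresholds) for v in values])

    best = list(init)
    best_score = score(best)
    for _ in range(sweeps):
        for k in range(len(best)):
            base = best[k]
            for delta in range(-10, 11):
                candidate = list(best)
                candidate[k] = base + delta * step
                if candidate != sorted(candidate):
                    continue  # thresholds must stay in increasing order
                candidate_score = score(candidate)
                if candidate_score > best_score:
                    best, best_score = candidate, candidate_score
    return best, best_score
```

With the thresholds from the slide, `to_class(1.4, [0.8, 1.2, 2.6, 3.2])` lands in Class 2, matching the figure. The search only ever accepts a candidate that strictly improves the score, so the returned score is never below the score of the initial thresholds.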

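  10. Average the Regression Outputs + Average the Thresholds Too
    Our Solution: How to ensemble is a mystery
    Run inference with each model, then classify with the average thresholds of all models

    Predictions
    Model 1      1.54
    Model 2      2.16
    ...
    Model n      1.87
    Average      1.93

    Thresholds   Threshold 1  Threshold 2  Threshold 3  Threshold 4
    Model 1      0.90         1.62         2.64         3.31
    Model 2      0.82         1.53         2.36         3.39
    ...
    Model n      0.76         1.77         2.75         3.46
    Average      0.85         1.89         2.55         3.33

    Average prediction 1.93 against the average thresholds → Class 2
    Ensembles that fix a class per model first and then average did not improve the score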
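
The averaging scheme on this slide can be sketched as follows. This is a minimal illustration with a hypothetical helper name (`ensemble_predict`); it uses only the three models whose numbers are visible on the slide, so its averages differ from the slide's averages, which also cover the models elided in the figure.

```python
from bisect import bisect_right
from statistics import mean


def ensemble_predict(predictions, thresholds_per_model):
    """Average the per-model outputs and thresholds, then threshold once."""
    avg_prediction = mean(predictions)
    avg_thresholds = [mean(column) for column in zip(*thresholds_per_model)]
    return bisect_right(avg_thresholds, avg_prediction), avg_prediction, avg_thresholds


# Values copied from the slide (models 1, 2, and n only).
predictions = [1.54, 2.16, 1.87]
thresholds = [
    [0.90, 1.62, 2.64, 3.31],
    [0.82, 1.53, 2.36, 3.39],
    [0.76, 1.77, 2.75, 3.46],
]
predicted_class, avg_prediction, avg_thresholds = ensemble_predict(predictions, thresholds)
```

Even with only these three models, the averaged output falls between the second and third averaged thresholds, so the sample is still assigned to Class 2.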

  11. Traditional EDA of APTOS Data
    Our Solution: The labels are unbelievably imbalanced
    • Class 0 accounts for more than half
    • Class 4 is overwhelmingly rare

  12. Pad the Training Set with Data from the Same Task
    Our Solution: The labels are unbelievably imbalanced
    We used part of the data from a similar competition held in 2015 (hereafter DRD 2015)
    Reference: https://www.kaggle.com/c/diabetic-retinopathy-detection
    Resized version: https://www.kaggle.com/benjaminwarner/resized-2015-2019-blindness-detection-images
    How we used it
    Pattern 1: use every DRD 2015 image except Class 0
    Pattern 2: pad with as much DRD 2015 data as possible while preserving the per-class ratios of APTOS
    Other handling of the training data
    Removed all duplicate images from APTOS
    Reference: https://www.kaggle.com/maxwell110/duplicated-list-csv-file
    Thank you, Maxwell
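
Pattern 2 amounts to finding the largest uniform scale-up of the APTOS class counts that the external data can support. The sketch below is one way to compute it, assuming every base class has a positive count; the function name and the counts in the usage lines are hypothetical, not the real APTOS or DRD 2015 numbers.

```python
from fractions import Fraction
from math import floor


def ratio_preserving_extra(base_counts, extra_available):
    """Per-class number of external images to add while keeping the base class ratios.

    The common scale factor is limited by the class with the least external data
    relative to its base count. Fractions keep the arithmetic exact.
    """
    scale = min(
        Fraction(n + extra_available.get(c, 0), n) for c, n in base_counts.items()
    )
    return {c: floor((scale - 1) * n) for c, n in base_counts.items()}


# Hypothetical counts for illustration only.
base = {0: 100, 1: 20, 2: 50}
external = {0: 1000, 1: 30, 2: 500}
extra = ratio_preserving_extra(base, external)
```

In this toy case Class 1 is the bottleneck: all 30 of its external images are used, and the other classes are padded by the same factor of 2.5.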

  13. Traditional EDA of APTOS Data
    Our Solution: Image sizes and brightness vary widely
    • Class 0 accounts for more than half
    • Class 4 is overwhelmingly rare
    • Image sizes vary widely too...
    • Is there a leak...?
    • Far too many 480x640 images

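  14. Handle It with Our Original Crop
    Our Solution: Image sizes and brightness vary widely
    Problems with the APTOS fundus images
    • The shape is inconsistent (circular, or oddly zoomed in)
    • Image size and brightness vary widely
    Input: fundus image  Output: image with consistent size, brightness, and (to some degree) shape
    Assumption: the fundus is nearly circular
    Pros: mostly solves the variability problem  Cons: a bit slow
    Ben’s Edge Crop: Step 1 (in) → Step 2 → Step 3 → Step 4 → Step 5 (out)
    • Edge detection (Canny method)
    • Obtain lines several times: from the segment joining two points on the circle, derive a line that passes through the circle's center
    • Compute the circle's center from the intersections
    • Get the radius from the center and the edge
    • Place the image at the center & resize
    • Unify brightness with Ben’s Crop (1st Solution @ DRD 2015)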
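
The center-finding step relies on a standard fact: the perpendicular bisector of any chord passes through the circle's center, so two non-parallel bisectors intersect at the center. The sketch below shows only that geometry, under the assumption that edge points (for example from Canny edge detection) are already available as (x, y) pairs; the function names are hypothetical and this is not the authors' implementation.

```python
from itertools import combinations
from math import hypot


def bisector_intersection(chord_a, chord_b):
    """Intersect the perpendicular bisectors of two chords; None if they are parallel."""
    rows = []
    for (x1, y1), (x2, y2) in (chord_a, chord_b):
        dx, dy = x2 - x1, y2 - y1
        mx, my = (x1 + x2) / 2, (y1 + y2) / 2
        # Points p on the bisector satisfy dx * p.x + dy * p.y = dx * mx + dy * my.
        rows.append((dx, dy, dx * mx + dy * my))
    (a1, b1, c1), (a2, b2, c2) = rows
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-9:
        return None
    # Cramer's rule for the 2x2 linear system.
    return (c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det


def estimate_circle(edge_points):
    """Average the center over all chord pairs; radius is the mean distance to the edge.

    Assumes at least one pair of non-parallel chords exists among consecutive points.
    """
    chords = list(zip(edge_points, edge_points[1:]))
    centers = [
        c for a, b in combinations(chords, 2)
        if (c := bisector_intersection(a, b)) is not None
    ]
    cx = sum(x for x, _ in centers) / len(centers)
    cy = sum(y for _, y in centers) / len(centers)
    radius = sum(hypot(x - cx, y - cy) for x, y in edge_points) / len(edge_points)
    return (cx, cy), radius
```

Averaging over several chord pairs mirrors the "obtain lines several times" bullet: with noisy edge points, a single pair of bisectors can be far off, while the mean of many estimates is steadier.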

  15. Our Total Solution
    Private: 48th / 2943, LB: 0.9259 (Top 2%)
    Public: 230th / 2943, LB: 0.8078 (Top 8%)
    [Diagram: three regression pipelines merged by Mean Average Blending. Boxes are listed per stage in extraction order. Labels: @icebee__, @mocobt; 25, 10, 5]
    Most stable Kernel
    Link: https://www.kaggle.com/muhakabartay/bump-to-0-800-e-2ln-more-efficient
    Train
    • DRD 2015
    • APTOS (duplicated removed)
    • Remove 0 label
    • Preserve the label ratio of 2019
    Preprocessing
    • Ben’s Edge Crop (Size: 320), Normalization with ImageNet stats
    • Ben’s Edge Crop (Size: 256), Normalization with ImageNet stats
    • [fast.ai’s transform] Resize: squish, Padding: reflection, Only Resizing (Size: 256)
    Training (regression)
    • EfficientNet-b4 (pretrained), Batch size: 16, Adam (lr: 0.001, weight_decay: 1e-5), MSELoss
      Augmentation: RandomRotation, Random HorizontalFlip
      Threshold optimization
    • EfficientNet-b4 (pretrained), Batch size: 32, Adam (lr: 0.001, weight_decay: 1e-5), MSELoss
      Augmentation: RandomRotation, Random HorizontalFlip
      Threshold optimization
    • EfficientNet-b5 (pretrained), Batch size: 64, Adam (lr: 0.001, weight_decay: 1e-5), QWK (early stopping: MSELoss)
      Augmentation: Defaults of fast.ai, Random VerticalFlip
      Threshold optimization
    Validation
    • Stratified 1fold
    • Stratified 5fold
    • Random split
    Prediction
    • Fold: 5, TTA: 5 (RandomRotate, RandomHorizontalFlip)
    • Fold: 1, TTA: 10 (RandomRotate, RandomHorizontalFlip)
    • Fold: 1, TTA: None, Weight: 5
    Scores
    • Private LB: 0.923899, Public LB: 0.806519
    • Private LB: 0.917967, Public LB: 0.790956
    • Private LB: 0.910…, Public LB: 0.792…
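
"Normalization with ImageNet stats" usually means scaling pixels to [0, 1] and then standardizing each RGB channel with the widely used ImageNet mean and standard deviation. A minimal per-pixel sketch, assuming 8-bit RGB input; `normalize_pixel` is a hypothetical helper, and a real pipeline would apply the same arithmetic to whole image tensors.

```python
# Per-channel ImageNet statistics (RGB order) for inputs scaled to [0, 1].
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb):
    """Scale an 8-bit RGB pixel to [0, 1], then standardize each channel."""
    return tuple(
        (value / 255.0 - mean) / std
        for value, mean, std in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)
    )
```

A pixel whose channels equal the ImageNet means maps to roughly zero in every channel, which is what a pretrained network such as EfficientNet expects as a "typical" input.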

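  16. Things We Tried That Didn't Work
    Pre-training on DRD2015 → training on APTOS
    No gain, perhaps because our validation was poor
    Using the Messidor Database
    No gain, perhaps because the label definitions differ
    Building a model that discriminates Class 0 → using it in another model's predictions
    On Public it often misjudged Class 1, 2, 3 as Class 0, so it did not help much
    It does appear to have helped on Private, though
    Applying a weighted MSE loss (weighted toward the rare classes)
    Almost no gain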
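
A weighted MSE loss of this kind can be sketched as follows. The slide only says the loss was weighted toward rare classes; the inverse-frequency weights below are an assumption for illustration, and both function names are hypothetical.

```python
from collections import Counter


def inverse_frequency_weights(labels, n_classes=5):
    """Weight each class by n / (n_classes * count); absent classes get 0.0."""
    counts = Counter(labels)
    n = len(labels)
    return [n / (n_classes * counts[c]) if counts[c] else 0.0 for c in range(n_classes)]


def weighted_mse(predictions, labels, weights):
    """Mean of squared errors, each scaled by the weight of its true class."""
    total = sum(weights[y] * (p - y) ** 2 for p, y in zip(predictions, labels))
    return total / len(labels)


labels = [0, 0, 0, 1]
weights = inverse_frequency_weights(labels, n_classes=2)
loss = weighted_mse([1.0, 0.0, 0.0, 0.0], labels, weights)
```

In the toy data the single Class 1 sample gets three times the weight of each Class 0 sample, so an error on the rare class costs three times as much as the same error on the common class.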

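  17. Things We Couldn't Try
    Batch size > 32 or Image size > 320
    Not enough memory. Yet with fast.ai, batch size 64 ran without trouble on a P100...
    The top teams used larger images. We had given up on it as impossible to implement, which we regret...
    Other models such as Inception, SEResNeXt, DenseNet165
    We had our hands full with EfficientNet... (we did try DenseNet, but it did not perform well)
    EDA with visualization methods such as GradCAM
    We did not implement it in time... We really should have tried this... (Reference: the blog by Nakama>ω)
    Pseudo labeling
    We forgot about it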

  18. Second Half: Streamlining Experiments

  19. Streamlining Experiments
    Automation with a pipeline
    Experiment management

  20. Pipeline Overview Diagram
    Streamlining experiments: automation with a pipeline
    Preprocessing → Training → Prediction

  21. Pipeline Overview Diagram
    Streamlining experiments: automation with a pipeline
    Preprocessing → Training → Prediction

  22. Pipeline Overview Diagram
    Streamlining experiments: automation with a pipeline
    Preprocessing
    • In: Config, Train
    • Out: Preprocessed (✓ Save as caches)
    Training
    • In: Config, Preprocessed
    • Out: Log (✓ Main log, ✓ Learning curve, ✓ Confusion Matrix), Model (✓ PyTorch Model, ✓ Threshold data)
    Prediction
    • In: Config, Test, Model
    • Out: Submit
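
The "Save as caches" idea in the diagram can be sketched as a step whose output is stored under a key derived from its config, so a rerun with the same config skips the work. This is an illustrative assumption about how such a cache might look, not the layout of the authors' repository; `config_key` and `cached_step` are hypothetical names.

```python
import hashlib
import json
import pickle
import tempfile
from pathlib import Path


def config_key(config):
    """Stable short hash of a JSON-serializable config dict."""
    encoded = json.dumps(config, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()[:12]


def cached_step(name, config, compute, cache_dir):
    """Return the cached result for (name, config), computing and saving it if absent."""
    path = Path(cache_dir) / f"{name}_{config_key(config)}.pkl"
    if path.exists():
        with path.open("rb") as f:
            return pickle.load(f)
    result = compute(config)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:
        pickle.dump(result, f)
    return result


# Usage: the second call with the same config is served from the cache.
with tempfile.TemporaryDirectory() as cache_dir:
    first = cached_step("preprocess", {"size": 320}, lambda cfg: {"size": cfg["size"]}, cache_dir)
    again = cached_step("preprocess", {"size": 320}, lambda cfg: {"size": -1}, cache_dir)
```

Keying the cache on the config means that changing any preprocessing parameter, such as the crop size, automatically produces a fresh cache entry instead of silently reusing stale data.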

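  23. Pipeline Summary
    Streamlining experiments: automation with a pipeline
    Even with the Kernel usage limits, training was barely affected
    Late in the competition we were spinning up GCP instances one after another
    The "the Kernel won't start up... what is this..." problem at submission time was beyond what we could handle
    Uploading Datasets from the pipeline through the API saved manual work
    The Kaggle API documentation was lacking, so this took a fair amount of effort
    If there is demand, I will write it up as an article
    That said, over-engineering the pipeline will trip you up (speaking from experience)
    Implementation cost keeps piling up, and the actual competition work tends to get neglected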
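
Uploading a folder as a Kaggle Dataset from a script can be sketched with the `kaggle` command-line tool, which reads a `dataset-metadata.json` file from the folder and offers `kaggle datasets create` and `kaggle datasets version`. The helper names, owner, slug, and license below are placeholder assumptions; this is not how the authors' pipeline is necessarily written, and it assumes the CLI is installed and authenticated.

```python
import json
import subprocess
from pathlib import Path


def write_metadata(folder, owner, slug, title):
    """Write the dataset-metadata.json file that the Kaggle CLI expects in the folder."""
    metadata = {
        "title": title,
        "id": f"{owner}/{slug}",
        "licenses": [{"name": "CC0-1.0"}],  # placeholder license
    }
    path = Path(folder) / "dataset-metadata.json"
    path.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return path


def upload_command(folder, message=None):
    """Build the CLI call: create a new dataset, or push a new version with a message."""
    if message is None:
        return ["kaggle", "datasets", "create", "-p", str(folder)]
    return ["kaggle", "datasets", "version", "-p", str(folder), "-m", message]


def upload(folder, message=None):
    """Run the upload; raises CalledProcessError if the CLI reports a failure."""
    subprocess.run(upload_command(folder, message), check=True)
```

Separating `upload_command` from `upload` keeps the command easy to inspect or log before anything is sent, which helps when the pipeline uploads model weights after every training run.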

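  24. Experiment Management with Slack
    Streamlining experiments: experiment management
    ...I see.. I don't really get it...
    We also made a channel for logs, but it was hard to edit and hard to look back over past results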

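  25. The Ultimate Experiment Management
    ~ Grand Excel Master ~
    Streamlining experiments: experiment management
    Columns: Experiment ID (for managing Kaggle Datasets) / LB & CV / Hypothesis & implementation details & result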
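
The same columns can also be kept in a CSV file that a spreadsheet application opens directly. This is a minimal sketch of that alternative, not the authors' workflow; the field names are hypothetical renderings of the columns on the slide.

```python
import csv
from pathlib import Path

FIELDS = ["experiment_id", "cv", "lb", "hypothesis", "implementation", "result"]


def log_experiment(path, **row):
    """Append one experiment to the CSV log, writing the header on first use.

    Missing fields are left blank; unknown field names raise ValueError.
    """
    path = Path(path)
    is_new = not path.exists()
    with path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow(row)
```

A call such as `log_experiment("experiments.csv", experiment_id="exp001", cv=0.91, hypothesis="larger images help")` leaves the other cells empty, so they can be filled in once the result is known.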

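  26. Lessons Visible in the Experiment Log
    Streamlining experiments: experiment management
    Columns: Experiment ID (for managing Kaggle Datasets) / LB & CV / Hypothesis & implementation details & result
    How we formed hypotheses is a mystery
    • Only hypotheses based on findings we saw elsewhere
    • No hypotheses built on our own EDA
    Lots of blank cells
    • We did not chew over the results
    • To begin with, we first looked at the confusion matrix about 5 days before the deadline
    • Well... we got the medal, so...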

  27. What It Actually Cost This Time
    35,214 yen
    Treatment cost for a thrown-out back
    (injured right after the competition ended)
    53,030 yen

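  28. Summary
    We managed to get a silver medal in APTOS
    Our Original Crop took us reasonably high up the leaderboard
    Image size and pseudo labeling were the big gaps between us and the top teams
    Differences in architecture are not that important an issue... I think
    A pipeline is handy for streamlining experiments, and so is the Kaggle API
    (Dubious as it sounds) Excel makes it easy to look back over results
    A thrown-out back is very painful, both physically and financially
    (Bonus) Repository and retrospective blog posts
    • https://github.com/icebee16/kaggle_APTOS2019
    • http://icebee.hatenablog.com/entry/2019/09/10/221351
    • https://mocobt.hatenablog.com/entry/2019/09/09/013658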

  29. References
    Papers on the same task as the competition
    [Sayres et al. Ophthalmology 2019, Volume 126, Issue 4, Pages 552-564]
    [Krause et al. Ophthalmology 2018, Volume 125, Issue 8, Pages 1264-1272]
    [Poplin et al. Nature Biomedical Engineering 2018]
    [Gulshan et al. JAMA: The Journal of the American Medical Association 2016, 316(22)]
    Survey Papers
    [Qureshi et al. Symmetry 2019, 11(6), 749]
    [Schmidt-Erfurth et al. Progress in Retinal and Eye Research 2018, Volume 67, Pages 1-29]
    [Fenner et al. Ophthalmology and Therapy 2018, Volume 7, Issue 2, Pages 333-346]
    [Almotiri et al. Applied Sciences 2018, 8(2), 155]