APTOS2019 Top Solutions

tereka114

March 15, 2022

Transcript

  1. APTOS 2019 Blindness Detection Top Solutions
    Detect diabetic retinopathy to stop blindness before it's too late
    tereka114 https://www.kaggle.com/tereka114
  2. Hiroki Yamamoto
    • Acroquest Technology Co. Ltd
    • Machine Learning/Computer Vision
    • Kaggle Master
    • Write Technical Magazine
      • Techbook fest Elasticsearch NEXT STEP
      • Interface
  3. Asia Pacific Tele-Ophthalmology Society 2019
    Detect diabetic retinopathy to stop blindness before it's too late
    Build a model to help identify diabetic retinopathy automatically
    ✓ Aravind Eye Hospital technicians travel to rural areas to capture images
    ✓ There is a shortage of highly trained doctors to review the images and provide diagnoses in rural areas of India
    (Figure labels: Aravind Eye Hospital, Madurai, Tamil Nadu; rural areas)
  4. 2. Data
    ✓ train: 3,662, public: 1,928, private: ~11,000
    ✓ PNG format images
    ✓ Target label
      0: No DR
      1: Mild
      2: Moderate
      3: Severe
      4: Proliferative DR
  5. 2. External Data
    1. External data can be used
      1. 2015 competition https://www.kaggle.com/c/diabetic-retinopathy-detection
      2. IDRiD https://ieee-dataport.org/open-access/indian-diabetic-retinopathy-image-dataset-idrid
  6. 3. Challenge
    1. Inconsistent performance between CV and Public LB
    2. Kernel limit: 9 hours for prediction
    3. The dataset is noisy
  7. 4. 10th place solution
    1. Model
      1. 2 x EfficientNet-B5 (RMSE), 2 x SE-ResNeXt101 (MSE)
    2. Dataset
      1. 2015 + 2019
      2. 5 folds (training is 4/5 of the 2019 training set + 2015; validation is the remaining 1/5 of the 2019 training set)
    3. Augmentation
      1. Ben's crop, horizontal flip/vertical flip, random rotation, random contrast
    4. Pseudo labeling
      1. Soft label
      2. Hard label (fine-tune only the trained model)
    5. Ensemble
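
The fold layout on this slide can be sketched as follows. This is a minimal illustration, not the 10th place team's code: the function name `make_folds`, the image ids, and the plain (non-stratified) shuffle are all assumptions. The point it shows is that external 2015 images are added to every training split, while validation only ever uses 2019 images.

```python
import random

def make_folds(ids_2019, ids_2015, n_folds=5, seed=0):
    """Return a list of (train_ids, valid_ids) pairs.

    Hypothetical helper: 2019 ids are split into n_folds parts; the
    external 2015 ids are appended to every training split and never
    appear in a validation split.
    """
    shuffled = list(ids_2019)
    random.Random(seed).shuffle(shuffled)
    # Round-robin assignment gives folds whose sizes differ by at most one.
    folds = [shuffled[i::n_folds] for i in range(n_folds)]
    splits = []
    for k in range(n_folds):
        valid = folds[k]
        train = [x for j, fold in enumerate(folds) if j != k for x in fold]
        splits.append((train + list(ids_2015), valid))
    return splits

# Toy ids, for illustration only.
splits = make_folds([f"2019_{i}" for i in range(10)],
                    [f"2015_{i}" for i in range(4)])
```

Validating only on 2019 images keeps the validation distribution close to the competition test data, which is one plausible reason for this layout.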
  8. 4. 8th place solution
    1. Model
      ✓ EfficientNet-B5 (MSE), EfficientNet-B4 (MSE)
    2. Dataset
      ✓ Use 2015 + 2019 for training (for generalization)
      ✓ Remove confusing labels (duplicates, confusing images)
    3. Augmentation
      ✓ Rotate, horizontal/vertical flip, zoom, lighting
    4. Pseudo labeling
      ✓ Model average
      ✓ Add to the training data and retrain the models
    5. Ensemble
      ✓ Average
  9. 4. 7th place solution
    1. Model
      1. SE-ResNeXt50, SE-ResNeXt101, Inception V4
      2. Used GAP (Coord-Conv, GWAP, and Average + Max didn't work)
    2. Dataset
      1. 2015 + 2019, Messidor, Messidor2, IDRiD
    3. Training
      1. Pseudo labeling on the test datasets (2015, Messidor, IDRiD)
      2. Fine-tune the model
    4. Loss
      1. Focal + Kappa (classification)
      2. Wing loss (regression)
      3. Cauchy loss (ordinal regression)
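
As a reference for the regression loss named on this slide, here is a minimal per-sample sketch of Wing loss as defined in the original Wing loss paper (logarithmic near zero, linear for large errors). The defaults `w=10.0` and `eps=2.0` are the values from that paper, where they were tuned for landmark coordinates; they are assumptions here, and the slide does not say which values the 7th place team used.

```python
import math

def wing_loss(pred, target, w=10.0, eps=2.0):
    """Wing loss for a single prediction.

    w   : width of the non-linear (log) region (assumed default)
    eps : curvature of the log region (assumed default)
    """
    x = abs(pred - target)
    if x < w:
        return w * math.log(1.0 + x / eps)
    # Constant that joins the log part and the linear part continuously.
    c = w - w * math.log(1.0 + w / eps)
    return x - c
```

Compared with MSE, the log region amplifies the gradient of small and medium errors, which is why it is sometimes tried for regression on grades that sit close together.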
  10. 4. 5th place solution
    1. Model
      ✓ EfficientNet-B3 (300)
      ✓ EfficientNet-B4 (400)
      ✓ EfficientNet-B5 (456)
    2. Preprocessing
      ✓ Crop from gray (both training and predicting)
      ✓ Apply image type
    3. Augmentation
      ✓ Dihedral, random crop, rotation, contrast/brightness, Cutout, PerspectiveTransform, CLAHE
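
The "crop from gray" preprocessing step removes the dark border around the retina. Below is a dependency-free sketch of the idea on a grayscale image stored as a list of rows. It is a bounding-box variant written for illustration, and the tolerance `tol=7` is an assumed value, not one confirmed by the slide.

```python
def crop_from_gray(gray, tol=7):
    """Crop a grayscale image (list of rows, values 0-255) to the bounding
    box of all pixels brighter than tol. Hypothetical helper."""
    rows = [r for r, row in enumerate(gray) if any(v > tol for v in row)]
    if not rows:
        return gray  # image is entirely dark: leave it untouched
    width = len(gray[0])
    cols = [c for c in range(width) if any(row[c] > tol for row in gray)]
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    return [row[left:right + 1] for row in gray[top:bottom + 1]]

# Toy 4x4 image with a one-pixel dark border.
image = [
    [0, 0, 0, 0],
    [0, 50, 80, 0],
    [0, 60, 0, 0],
    [0, 0, 0, 0],
]
cropped = crop_from_gray(image)
```

In practice the same mask would be computed with an array library and applied to all three color channels; applying it at both training and prediction time, as the slide notes, keeps the two pipelines consistent.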
  11. 4. 4th place solution
    1. Model
      ✓ EfficientNet-B2 to B7
    2. Dataset
      ✓ Remove crop
      ✓ Apply type using image boundary
    3. Augmentations (Albumentations)
      ✓ Dihedral, random crop, rotation, contrast/brightness, Cutout, perspective transform, CLAHE
    4. Simple average ensemble
  12. 4. 2nd place solution
    1. Model
      ✓ EfficientNet-B3 (300), EfficientNet-B4 (400), EfficientNet-B5 (456)
    2. Dataset
      ✓ Remove crop and resize images
    3. Augmentations (Albumentations)
      ✓ Blur, Flip, RandomBrightnessContrast, ShiftScaleRotate, ElasticTransform, Transpose, GridDistortion, HueSaturationValue
    4. Pseudo labeling (repeated)
    5. Simple average ensemble
  13. 4. 1st place solution
    1. Model
      ✓ 2 x Inception-ResNetV2, 2 x Inception V4, 2 x SE-ResNeXt50, 2 x SE-ResNeXt101
      ✓ Loss is Smooth L1
      ✓ Replace average pooling with generalized mean pooling
    2. Dataset
    3. Preprocessing
    4. Pseudo labeling (soft)
    5. Validation is the Public LB!
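
Two of the ingredients on this slide have compact definitions. The sketch below shows generalized mean (GeM) pooling over the activations of one channel and a per-sample Smooth L1 loss. The parameter values `p=3.0`, `eps=1e-6`, and `beta=1.0` are common defaults and are assumptions here; the slide does not give the winner's settings.

```python
def gem_pool(values, p=3.0, eps=1e-6):
    """Generalized mean of one channel's activations.

    p = 1 gives average pooling; larger p moves the result toward max
    pooling. Values are clamped to eps so the power is well defined.
    """
    clamped = [max(v, eps) for v in values]
    return (sum(v ** p for v in clamped) / len(clamped)) ** (1.0 / p)

def smooth_l1(pred, target, beta=1.0):
    """Smooth L1 loss: quadratic for small errors, linear for large ones."""
    diff = abs(pred - target)
    if diff < beta:
        return 0.5 * diff * diff / beta
    return diff - 0.5 * beta
```

Because GeM interpolates between average and max pooling, it can be dropped in wherever a network ends in global average pooling; in a real model `p` is often made a learnable parameter.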
  14. 5. Common techniques
    1. Combinations of models and image sizes
    2. Almost all teams tried a regression task
    3. Heavy augmentations
    4. Pseudo labeling
      1. Hard or soft
    5. Ensemble
      1. Average, blending, etc.
    6. Trust your CV
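
When the grade is predicted as a regression target, the continuous output has to be mapped back to the five labels (0 to 4). A minimal sketch of that step is shown below. The midpoint thresholds are an assumption made for illustration; teams may instead tune the cut points on validation data.

```python
from bisect import bisect_right

def to_grade(score, thresholds=(0.5, 1.5, 2.5, 3.5)):
    """Map a continuous regression score to a grade in 0..4.

    The grade is the number of thresholds that are <= score, so scores
    below 0.5 map to 0 and scores of 3.5 or more map to 4.
    """
    return bisect_right(thresholds, score)

grades = [to_grade(s) for s in (-0.3, 0.2, 0.5, 1.7, 3.49, 9.0)]
```

Treating the task as regression keeps the ordering of the grades: predicting 3 for a true 4 is penalized less than predicting 0.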
  15. 5. Combinations of models and image sizes
    1. Top competitors use many different models and image sizes
      1. e.g. SE-ResNeXt50 + SE-ResNeXt101 + Inception-ResNetV2, each with a different image size
      2. Recently, EfficientNet is widely used
    2. This gives model diversity and a more robust score
    3. We get a high score by ensembling (averaging/stacking/blending)
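
The simplest of the ensembling options listed here, plain averaging, can be sketched in a few lines. The function name and the toy numbers are hypothetical; each inner list stands for one model's regression outputs on the same images, in the same order.

```python
def average_ensemble(per_model_preds):
    """Average the predictions of several models, image by image.

    per_model_preds: list of lists, one inner list per model, all the
    same length and in the same image order.
    """
    n_models = len(per_model_preds)
    return [sum(scores) / n_models for scores in zip(*per_model_preds)]

# Toy outputs from two hypothetical models on two images.
blended = average_ensemble([[0.0, 2.0], [1.0, 4.0]])
```

Averaging helps most when the models make different mistakes, which is why the slide stresses mixing architectures and image sizes.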
  16. 5. Pseudo Labeling
    1. In this competition, almost all teams used pseudo labeling.
    2. Pseudo labeling
      1. Hard
      2. Soft (I have never chosen it)
    3. Many teams train on the training data plus pseudo-labeled test data, with CV
      1. Our team (28th) only used the pretrain (2015 + train) phase, but the Public LB score was low (we got a high score on the Private LB), so we couldn't choose that submission.
      2. The pseudo labels have to be high-scoring (if the pseudo labels are poor, we cannot get a high score).
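
The difference between hard and soft pseudo labels can be sketched as follows for a regression setup. This is an illustration under assumptions: "soft" is taken to mean the clipped continuous prediction and "hard" the prediction rounded to the nearest grade; the function names and ids are hypothetical, and the slides do not define the exact procedure each team used.

```python
def pseudo_labels(test_scores, mode="soft"):
    """Turn model predictions on unlabeled images into training targets."""
    clipped = [min(max(s, 0.0), 4.0) for s in test_scores]
    if mode == "soft":
        return clipped  # keep the continuous prediction as the target
    if mode == "hard":
        # Round half up to the nearest grade 0..4.
        return [float(int(s + 0.5)) for s in clipped]
    raise ValueError(f"unknown mode: {mode}")

def extend_training_set(train_pairs, test_ids, test_scores, mode="soft"):
    """Append (id, pseudo_label) pairs for the test images to the labeled data."""
    labels = pseudo_labels(test_scores, mode)
    return list(train_pairs) + list(zip(test_ids, labels))

extended = extend_training_set([("train_0", 2.0)],
                               ["test_0", "test_1"], [1.3, 4.6], mode="hard")
```

As the last bullet on the slide notes, this only helps when the predictions used as labels are already accurate; otherwise the model is retrained on its own errors.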