APTOS2019 Top Solutions

APTOS 2019 Blindness Detection: Top Solutions
Detect diabetic retinopathy to stop blindness before it's too late
tereka114 (https://www.kaggle.com/tereka114)

Hiroki Yamamoto
• Acroquest Technology Co. Ltd
• Machine Learning / Computer Vision
• Kaggle Master
• Technical writing
  • Techbook fest: Elasticsearch NEXT STEP
  • Interface

Outline
1. Goal
2. Competition Overview
3. Challenge
4. Top Solutions
5. Common techniques

1. Goal
   1. Learn what I was missing.
   2. Learn techniques from the top finishers that can be applied to other competitions.
   3. Reach my target (a gold medal) next time.

Asia Pacific Tele-Ophthalmology Society 2019
Detect Diabetic Retinopathy to stop blindness before it's too late
Build a model to help identify diabetic retinopathy automatically.
• Aravind Eye Hospital technicians travel to rural areas to capture images.
• There is a shortage of highly trained doctors to review the images and provide diagnoses in rural areas of India.
(Map labels: Aravind Eye Hospital, Madurai, Tamil Nadu; rural areas)

2. Data
• train: 3,662 / public test: 1,928 / private test: ~11,000
• PNG format images
• Target label
  0: No DR
  1: Mild
  2: Moderate
  3: Severe
  4: Proliferative DR

2. External Data
1. External data can be used
   1. 2015 competition: https://www.kaggle.com/c/diabetic-retinopathy-detection
   2. IDRiD: https://ieee-dataport.org/open-access/indian-diabetic-retinopathy-image-dataset-idrid

3. Challenge
1. Inconsistent performance between CV and Public LB
2. Kernel limit: 9 hours for prediction
3. The dataset is noisy

4. 10th place solution
1. Model
   1. 2 x EfficientNet B5 (RMSE), 2 x SE-ResNeXt101 (MSE)
2. Dataset
   1. 2015 + 2019
   2. 5 folds (train on 4/5 of the 2019 training set + 2015; validate on the remaining 1/5 of the 2019 training set)
3. Augmentation
   1. Ben's crop, horizontal/vertical flip, random rotation, random contrast
4. Pseudo labeling
   1. Soft label
   2. Hard label (fine-tune the already-trained models only)
5. Ensemble
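The fold scheme in item 2.2 can be sketched as below. This is a minimal illustration, not the team's code: the function name `make_folds`, the id strings, and the unstratified round-robin split are all assumptions made for the example. The point it shows is that 2015 images are added to every training split while validation uses 2019 images only.

```python
import random

def make_folds(ids_2019, ids_2015, n_folds=5, seed=0):
    """Yield (train_ids, valid_ids); validation contains 2019 images only."""
    shuffled = list(ids_2019)
    random.Random(seed).shuffle(shuffled)
    # Deal the shuffled 2019 ids round-robin into n_folds buckets
    # (a real pipeline would probably stratify by label; omitted here).
    buckets = [shuffled[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        valid = buckets[k]
        train = [x for j, b in enumerate(buckets) if j != k for x in b]
        # Every 2015 image goes into training, never into validation.
        yield train + list(ids_2015), valid

# Hypothetical ids, just to exercise the function.
ids_2019 = [f"2019_{i}" for i in range(10)]
ids_2015 = [f"2015_{i}" for i in range(4)]
folds = list(make_folds(ids_2019, ids_2015))
```

Validating only on 2019 images keeps the CV score tied to the competition's own data distribution, which matters given the CV / Public LB inconsistency listed under Challenge.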

4. 8th place solution
1. Model
   • EfficientNet B5 (MSE), EfficientNet B4 (MSE)
2. Dataset
   • Use 2015 + 2019 for training (for generalization)
   • Remove problematic labels (duplicates, confusing images)
3. Augmentation
   • Rotate, horizontal/vertical flip, zoom, lighting
4. Pseudo labeling
   • Label with the model average
   • Add the pseudo-labeled data to the training data and retrain the models
5. Ensemble
   • Average

4. 7th place solution
1. Model
   1. SE-ResNeXt50, SE-ResNeXt101, Inception V4
   2. Used GAP (CoordConv, GWAP, and average + max pooling didn't work)
2. Dataset
   1. 2015 + 2019, Messidor, Messidor2, IDRiD
3. Training
   1. Pseudo labeling of the test datasets (2015, Messidor, IDRiD)
   2. Fine-tune the model
4. Loss
   1. Focal + Kappa (classification)
   2. Wing loss (regression)
   3. Cauchy loss (ordinal regression)
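For reference, the wing loss named under item 4.2 can be written in a few lines. This sketch follows the commonly cited definition (logarithmic near zero, linear for large errors); the parameter values `w=1.0` and `eps=0.5` are placeholders chosen for the example, not values reported by the 7th place team.

```python
import math

def wing_loss(pred, target, w=1.0, eps=0.5):
    """Wing loss for one scalar prediction (w and eps are assumed values)."""
    diff = abs(pred - target)
    # Constant that makes the two pieces meet at diff == w.
    c = w - w * math.log(1.0 + w / eps)
    if diff < w:
        # Log-shaped region: amplifies small and medium errors.
        return w * math.log(1.0 + diff / eps)
    # Linear region: behaves like L1 for large errors.
    return diff - c

def mean_wing_loss(preds, targets, w=1.0, eps=0.5):
    """Average wing loss over a batch of scalar predictions."""
    return sum(wing_loss(p, t, w, eps) for p, t in zip(preds, targets)) / len(preds)
```

Compared with MSE, the linear tail makes the loss less sensitive to large label errors, which is one plausible reason to try it on a noisy dataset.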

4. 5th place solution
1. Model
   • EfficientNet B3 (300)
   • EfficientNet B4 (400)
   • EfficientNet B5 (456)
2. Preprocessing
   • Crop from gray (for both training and prediction)
   • Apply image type
3. Augmentation
   • Dihedral, random crop, rotation, contrast/brightness, Cutout, PerspectiveTransform, CLAHE
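"Crop from gray" usually means trimming the dark border around the retina by thresholding a grayscale copy of the image. The sketch below is one possible reading of that step, not the 5th place code: the function name, the threshold `tol=7`, and the use of plain nested lists instead of an image array are assumptions made to keep the example dependency-free.

```python
def crop_from_gray(gray, tol=7):
    """Crop a grayscale image (list of rows) to the box of pixels brighter than tol."""
    # Rows and columns that contain at least one pixel above the threshold.
    rows = [r for r, row in enumerate(gray) if any(v > tol for v in row)]
    cols = [c for c in range(len(gray[0])) if any(row[c] > tol for row in gray)]
    if not rows or not cols:
        # Entirely dark image: return it unchanged rather than an empty crop.
        return gray
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    return [row[left:right + 1] for row in gray[top:bottom + 1]]

# Tiny hypothetical image: a bright 2x2 region surrounded by a black border.
image = [
    [0, 0, 0, 0],
    [0, 10, 20, 0],
    [0, 30, 0, 0],
    [0, 0, 0, 0],
]
cropped = crop_from_gray(image)
```

Applying the same crop at training and prediction time, as the slide notes, keeps the retina at a consistent scale in both phases.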

4. 4th place solution
1. Model
   • EfficientNet B2 to B7
2. Dataset
   • Remove Crop
   • Apply type using image boundary
3. Augmentations (Albumentations)
   • Dihedral, random crop, rotation, contrast/brightness, Cutout, PerspectiveTransform, CLAHE
4. Simple average ensemble

4. 2nd place solution
1. Model
   • EfficientNet B3 (300), EfficientNet B4 (400), EfficientNet B5 (456)
2. Dataset
   • Remove Crop and resize images
3. Augmentations (Albumentations)
   • Blur, Flip, RandomBrightnessContrast, ShiftScaleRotate, ElasticTransform, Transpose, GridDistortion, HueSaturationValue
4. Pseudo labeling (repeated)
5. Simple average ensemble

4. 1st place solution
1. Model
   • 2 x Inception ResNetV2, 2 x Inception V4, 2 x SE-ResNeXt50, 2 x SE-ResNeXt101
   • Loss is Smooth L1
   • Replace average pooling with generalized mean (GeM) pooling
2. Dataset
3. Preprocessing
4. Pseudo labeling (soft)
5. Validation is the Public LB!
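Generalized mean pooling replaces the plain average over a feature map with a power mean. The sketch below shows the arithmetic on a single channel stored as nested lists; the exponent `p=3.0` and the clamp `eps` are assumed values for illustration, and in a real network `p` is often a learnable parameter.

```python
def gem_pool(feature_map, p=3.0, eps=1e-6):
    """Generalized mean over one channel: (mean(x ** p)) ** (1 / p)."""
    # Clamp to a small positive value so fractional powers stay well defined.
    values = [max(v, eps) for row in feature_map for v in row]
    mean_pow = sum(v ** p for v in values) / len(values)
    return mean_pow ** (1.0 / p)

channel = [[1.0, 2.0], [3.0, 4.0]]
avg_like = gem_pool(channel, p=1.0)   # p = 1 reduces to average pooling
gem = gem_pool(channel, p=3.0)        # larger p weights strong activations more
max_like = gem_pool(channel, p=50.0)  # large p approaches max pooling
```

So GeM sits between average pooling and max pooling, with `p` controlling how much the strongest activations dominate.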

5. Common techniques
1. Combinations of models and image sizes
2. Almost all teams treated the task as regression
3. Heavy augmentations
4. Pseudo labeling
   1. Hard or soft
5. Ensemble
   1. Averaging, blending, etc.
6. Trust your CV
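Treating the task as regression (item 2) means the model outputs a continuous score that still has to be mapped back to one of the five grades. A minimal sketch of that mapping follows; the cut points 0.5, 1.5, 2.5, 3.5 are an assumed default for illustration (they can also be tuned on validation data), and the names are hypothetical.

```python
from bisect import bisect_right

# Assumed cut points between grades 0..4; not taken from any specific solution.
THRESHOLDS = [0.5, 1.5, 2.5, 3.5]

def to_grade(score, thresholds=THRESHOLDS):
    """Map a continuous regression output to a DR grade in 0..4."""
    # bisect_right counts how many thresholds are <= score.
    return bisect_right(thresholds, score)

grades = [to_grade(s) for s in [-0.2, 0.49, 0.5, 1.7, 3.4, 9.0]]
```

Scores below the first cut point map to grade 0 and scores above the last map to grade 4, so out-of-range predictions are handled without extra clipping.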

5. Combinations of models and image sizes
1. Top competitors use many different models and image sizes
   1. e.g. SE-ResNeXt50 + SE-ResNeXt101 + Inception ResNetV2, each with a different image size
   2. Recently, EfficientNet is the common choice
2. This gives model diversity and a more robust score.
3. Ensembling (averaging/stacking/blending) then gives a high score
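The simplest of the ensembling options in item 3 is a plain or weighted average of per-image predictions. The sketch below assumes each model outputs one regression score per image; the function name and the example numbers are made up for illustration.

```python
def average_ensemble(predictions, weights=None):
    """Average per-image scores across models.

    predictions: one list of scores per model, all in the same image order.
    weights: optional per-model weights; equal weights if omitted.
    """
    if weights is None:
        weights = [1.0] * len(predictions)
    total = sum(weights)
    return [
        sum(w * score for w, score in zip(weights, per_image)) / total
        for per_image in zip(*predictions)
    ]

# Two hypothetical models scoring the same two images.
model_a = [0.0, 2.0]
model_b = [1.0, 4.0]
simple = average_ensemble([model_a, model_b])
weighted = average_ensemble([model_a, model_b], weights=[3.0, 1.0])
```

Averaging helps most when the models make different errors, which is why the slide pairs it with diversity in architectures and image sizes.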

5. Pseudo Labeling
1. In this competition, almost all teams used pseudo labeling.
2. Types of pseudo labeling
   1. Hard
   2. Soft (I have never chosen this)
3. Many teams trained with CV on the training data + pseudo-labeled test data
   1. Our team (28th) used it only in the pretraining phase (2015 + train), but the Public LB score was low (the Private LB score was high), so we couldn't choose that submission.
   2. The pseudo labels have to come from high-scoring predictions. (If the pseudo labels are poor, we cannot get a high score.)
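One way to read the hard/soft distinction for a regression setup is sketched below: soft labels keep the averaged continuous prediction as the training target, while hard labels round it to the nearest grade. This is an assumed interpretation for illustration; the slides do not define the terms, and the function and variable names are hypothetical.

```python
def pseudo_labels(test_ids, model_outputs, mode="soft"):
    """Build pseudo labels for unlabeled test images from model predictions.

    model_outputs: one list of regression scores per model, aligned with test_ids.
    mode: "soft" keeps the mean score, "hard" rounds it to a grade in 0..4.
    """
    labels = {}
    for i, image_id in enumerate(test_ids):
        mean_score = sum(out[i] for out in model_outputs) / len(model_outputs)
        if mode == "hard":
            # Round half up, then clamp to the valid grade range.
            labels[image_id] = float(min(4, max(0, int(mean_score + 0.5))))
        else:
            labels[image_id] = mean_score
    return labels

# Hypothetical predictions from two models on two test images.
outputs = [[0.2, 3.4], [0.6, 3.8]]
soft = pseudo_labels(["a", "b"], outputs, mode="soft")
hard = pseudo_labels(["a", "b"], outputs, mode="hard")
```

The resulting dictionary would then be merged with the labeled training set before retraining, which is why the quality of the predictions used for labeling matters so much.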

Thank you for listening
