Kaggle Bengali.AI 6th place solution

Maxwell
March 17, 2020

Higher resolution model pipeline picture

Transcript

  1. Bengali.AI Handwritten Grapheme Classification ইঁদুর এবং ভালুক ("Mouse and Bear")

  2. Train Images
     - 200,840 images
     - 168 / 11 / 7 classes
     - 137 x 236 x 1

     Test Images
     - About 200,000 images
     - 137 x 236 x 1
     - 4 parquets

     Copyright 2020 @ Maxwell_110

     Bengali Model Pipeline

     Prediction 1 (Maxwell): Customized SE-ResNet 50
     - `NOT` pretrained
     - Iterative Stratified 5 folds
     - 137 x 236 x 1 Input Image Size
     - Images divided by 255
     - Adam
     - 3 Stage Learning
       1. 13 - 18 CyclicLR (Triangle, 8 epochs/cycle, 1e-3 ~ 1e-4), Xentropy
       2. LRonP (36 epochs, 3 pat, 6 ES, 1e-5), Reduced Focal Loss
       3. LRonP (36 epochs, 3 pat, 6 ES, 1e-5), Xentropy
     - Batch Size: 64
     - Augmentation: Width Shift (20%), Erosion, CutOut (holes=8, 3 types of Size), GridMask (rotate=15deg)
     - CutMixUp: CutMix (p=1/3, alpha=0.5) / MixUp (p=1/3, alpha=0.2)
     - Diagram labels: Input (1ch), Customized SE-ResNet 50 Block (3 x 3 bottom kernel), GeM 2D, 512 fc, Add (+ + +), 1280 fc, 512 fc x 3, SoftMax heads (Grapheme 168 nodes, Vowel 11 nodes, Consonant 7 nodes)
     - CV (w/ pp): 0.9888, Public: 0.9864, Private: 0.9527 (12th)

     Prediction 2 (Nejumi): Inception ResNet V2
     - Pretrained
     - Iterative Stratified 5 folds (same folds as Maxwell)
     - Resize to (180, 180); 180 x 180 x 3 Input Image Size
     - Images divided by 255
     - Adam
     - 2 Stage Cyclic Learning (4 epochs/cycle, 5e-5 ~ 2e-3)
       1. Weighted Reduced Focal Loss, 40 epochs, weight = 1 / observation counts
       2. LRonPlateau (1e-5), Xentropy
     - Batch Size: 64
     - Augmentation: Rotate (8deg), Zoom (1.2), height/width shift (15%), CutOut (holes=20, max_h=25, max_w=40)
     - MixUp: MixUp (p=1, alpha=0.4)
     - Diagram labels: Input (1ch), Convert to 3 ch image, Input (3ch), Inception ResNet V2 Block, Separable Conv 2D, BN, ReLU, GAP 2D, SoftMax heads (Grapheme 168 nodes, Vowel 11 nodes, Consonant 7 nodes)
     - CV (w/ pp): 0.9845, Public: 0.9810, Private: 0.9449 (19th)

     Blending
     - Prediction 1 (Maxwell) * 0.75 + Prediction 2 (Nejumi) * 0.25 = Blended Prediction

     Post Processing
     - Multiply coefficients to each predicted probability
     - Coefficients for each label (186 types)
     - Use NM solver to calculate optimal coefficients
       grapheme_root       : [c_g_1, c_g_2, ..., c_g_168]
       vowel_diacritic     : [c_v_1, c_v_2, ..., c_v_11]
       consonant_diacritic : [c_c_1, c_c_2, ..., c_c_7]
     - Apply correction coefficients to blended predictions

     Submission with Post Processing
     - Public : 0.9871 / 78th
     - Private : 0.9557 / 6th

     Inference Limitations
     - Inference on kernel
     - GPU inference time <= 2 hours
     - Memory Limit <= 13 GB

     Our Resources
     - Maxwell: TITAN RTX, Geforce 1080Ti x 2, GCP
     - Nejumi: TITAN RTX, Geforce 1080Ti x 2, Vast.ai ( https://vast.ai/ )*1
     *1 Nejumi is cloud addict
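The blending and post-processing steps on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names (`blend`, `macro_recall`, `fit_coefficients`) and the toy data are hypothetical, SciPy's Nelder-Mead is assumed as one possible "NM solver", and the search objective is assumed to be macro-averaged recall on out-of-fold predictions, which the slide does not state. Only one 7-class head (consonant diacritic size) is shown; the same idea would be repeated for the 168- and 11-class heads.

```python
import numpy as np
from scipy.optimize import minimize


def blend(pred_a, pred_b, w_a=0.75):
    """Weighted average of two (n_samples, n_classes) probability arrays."""
    return w_a * pred_a + (1.0 - w_a) * pred_b


def macro_recall(y_true, y_pred, n_classes):
    """Mean per-class recall over the classes that occur in y_true."""
    recalls = []
    for c in range(n_classes):
        mask = y_true == c
        if mask.any():
            recalls.append(float((y_pred[mask] == c).mean()))
    return float(np.mean(recalls))


def fit_coefficients(probs, y_true):
    """Search per-class multipliers with Nelder-Mead (assumed objective: macro recall)."""
    n_classes = probs.shape[1]

    def loss(coef):
        y_pred = (probs * coef).argmax(axis=1)
        return -macro_recall(y_true, y_pred, n_classes)

    res = minimize(loss, x0=np.ones(n_classes), method="Nelder-Mead",
                   options={"maxiter": 2000})
    return res.x


# Synthetic, imbalanced toy data standing in for out-of-fold predictions.
rng = np.random.default_rng(0)
n_samples, n_classes = 600, 7
y = rng.choice(n_classes, size=n_samples,
               p=[0.4, 0.2, 0.15, 0.1, 0.08, 0.05, 0.02])


def toy_probs(noise):
    """Softmax of noisy logits, biased toward the most frequent class."""
    logits = rng.normal(scale=noise, size=(n_samples, n_classes))
    logits[np.arange(n_samples), y] += 1.5
    logits[:, 0] += 1.0
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)


pred_1 = toy_probs(noise=1.0)   # stand-in for Prediction 1
pred_2 = toy_probs(noise=1.5)   # stand-in for Prediction 2

blended = blend(pred_1, pred_2, w_a=0.75)
baseline = macro_recall(y, blended.argmax(axis=1), n_classes)

coef = fit_coefficients(blended, y)
tuned = macro_recall(y, (blended * coef).argmax(axis=1), n_classes)

print(f"macro recall before: {baseline:.4f}, after: {tuned:.4f}")
```

Because macro recall is piecewise constant in the coefficients, a derivative-free method such as Nelder-Mead is a natural fit here; in practice the coefficients would be fitted on held-out predictions to limit overfitting.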