Bengali.AI Handwritten Grapheme Classification Explanation of Summary and Solutions

Kaggle: https://www.kaggle.com/c/bengaliai-cv19
Youtube: https://www.youtube.com/watch?v=5zrWsxaIRPg&feature=youtu.be

tereka114

March 15, 2022
Transcript

  1. Self Introduction • tereka (@tereka114) • https://www.kaggle.com/tereka • 6 years • Computer Vision • Kaggle Competitions Master • Home Credit Default Risk: 2nd place • IEEE's Signal Processing Society: 10th place
  2. Goal • Learn the top solutions and the key factors behind winning • Take the ideas from this competition and apply them to other competitions
  3. Competition Overview • Predict the components of Bengali handwriting images • Grapheme root (168 classes) • Vowel diacritic (11 classes) • Consonant diacritic (7 classes) • The training data contains 1295 grapheme classes • But the test data contains unseen graphemes • Code competition • Runtime limit: 9 hours (CPU) / 2 hours (GPU)
  4. Distributions • Label distribution plots: vowel diacritic, consonant diacritic, grapheme root • Consonant diacritic labels 3 and 6 are rare • Source: https://www.kaggle.com/pestipeti/bengali-quick-eda
  5. Why the huge shake-up? • Predicting unseen graphemes is important. • Presumably the public leaderboard contained only a few unseen graphemes, while the private leaderboard contained many. • Handling them is the key to a good solution.
  6. Interesting points of the competition • How to separate unseen and seen patterns • ArcFace / BCE / individual models • Dealing with unseen graphemes • How did the top competitors predict unseen graphemes? • Is generating characters from fonts better?
  7. Top Solution Summaries • Unseen and seen models: ArcFace, binary cross-entropy • Model: EfficientNet / SE-ResNeXt; 3 targets in one model, or 3 individual models (1 target per model) • Data augmentation: FMix • Character generation: CycleGAN, or from fonts
  8. 5th Place Solution https://www.kaggle.com/c/bengaliai-cv19/discussion/136129 • Model: SE-ResNeXt50 + head • Loss: grapheme root loss (CE) + consonant loss (MLBC) + vowel loss (CE) • Grapheme loss: ArcCos + CE • The consonant diacritic is split into separate components • MLBC = multi-label binary cross-entropy
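The loss combination listed for the 5th place solution can be sketched in plain Python as follows. This is a minimal illustration, not the team's code: the function names, the equal weighting of the three terms, and the number of consonant components are assumptions made for the example.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    """Single-label cross-entropy (CE) for one sample."""
    return -math.log(softmax(logits)[target])

def multilabel_bce(logits, targets):
    """Multi-label binary cross-entropy: one sigmoid per component, averaged.

    Uses the stable form max(z, 0) - z*t + log(1 + exp(-|z|)).
    """
    losses = [
        max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
        for z, t in zip(logits, targets)
    ]
    return sum(losses) / len(losses)

def total_loss(root_logits, root_target,
               vowel_logits, vowel_target,
               consonant_logits, consonant_targets):
    """Root CE + consonant MLBC + vowel CE, equally weighted (assumption)."""
    return (cross_entropy(root_logits, root_target)
            + multilabel_bce(consonant_logits, consonant_targets)
            + cross_entropy(vowel_logits, vowel_target))

# Hypothetical example: 3 binary consonant components, two of them active.
example = total_loss([2.0, 0.1, -1.0], 0,
                     [0.3, 1.5], 1,
                     [1.2, -0.7, 0.4], [1, 0, 1])
```

Treating the consonant diacritic as several independent binary components lets the model output a combination of components that never appeared together in training, which is one reason a multi-label loss suits unseen graphemes.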
  9. 4th Place Solution https://www.kaggle.com/c/bengaliai-cv19/discussion/136982 • ArcFace model: InceptionResNetV2, SE-ResNeXt101; 1295 targets; Cutout • Unseen model: 3 heads for vowel and consonant, 1 head for grapheme root; cross-entropy; Mixup
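Mixup, listed above for the unseen model, blends two training samples and their labels with a coefficient drawn from a Beta distribution. The sketch below shows the idea on flat lists of pixel values and one-hot labels; the function name, the `alpha` default, and the list-based image representation are assumptions for illustration only.

```python
import random

def mixup(x1, y1, x2, y2, alpha=1.0, rng=random):
    """Blend two samples and their one-hot labels.

    lam ~ Beta(alpha, alpha); the same lam is applied to inputs and labels.
    """
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam

# Hypothetical usage with two tiny "images" and 2-class one-hot labels.
mixed_x, mixed_y, lam = mixup([0.0, 1.0], [1.0, 0.0],
                              [1.0, 0.0], [0.0, 1.0],
                              alpha=1.0, rng=random.Random(0))
```

The mixed label stays a valid probability distribution because the two weights sum to 1.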
  10. 3rd Place Solution https://www.kaggle.com/c/bengaliai-cv19/discussion/135982 • Unseen vs. seen graphemes: use ArcFace and cosine similarity • Unseen: predict individual components • Seen: predict three components • Model: pc-softmax • Blending / weighted average
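ArcFace and cosine similarity, as mentioned above, both work on L2-normalized embeddings and class weight vectors. The following sketch computes cosine similarities and ArcFace training logits for a single sample. The `scale` and `margin` values and all function names are assumptions, and the usual safeguard for angles where theta + margin exceeds pi is omitted for brevity.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_similarities(embedding, class_weights):
    """Cosine similarity between one embedding and every class weight vector."""
    e = l2_normalize(embedding)
    return [sum(a * b for a, b in zip(e, l2_normalize(w)))
            for w in class_weights]

def arcface_logits(embedding, class_weights, target, scale=30.0, margin=0.5):
    """ArcFace logits: add an angular margin to the target class only."""
    logits = []
    for j, c in enumerate(cosine_similarities(embedding, class_weights)):
        if j == target:
            theta = math.acos(max(-1.0, min(1.0, c)))  # clamp for safety
            c = math.cos(theta + margin)
        logits.append(scale * c)
    return logits

# Hypothetical 2-D embedding and two class centers.
sims = cosine_similarities([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
logits = arcface_logits([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], target=0)
```

At inference time the margin is dropped and only the cosine similarities are used. One plausible way to flag an unseen grapheme, stated here as an assumption rather than the team's exact rule, is to check whether the largest similarity falls below a threshold.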
  11. 2nd Place Solution https://www.kaggle.com/c/bengaliai-cv19/discussion/135966 • Model: SE-ResNeXt 50 & 101 • FMix works better than CutMix (https://arxiv.org/abs/2002.12047) • Individual prediction (3 targets) • Unseen: train a model on font data • Blending
  12. 1st Place Solution https://www.kaggle.com/c/bengaliai-cv19/discussion/135984 • Out-of-distribution model: predicts 1295 classes (graphemes); if all class predictions are low, the sample is judged to be an unseen class • Seen model: EfficientNet-B7 based, 14784 classes (Grapheme * Consonant * Vowel) • Unseen model: CycleGAN (font and image) + EfficientNet-B0 (2 font models)
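The routing rule described above (if every grapheme class probability is low, treat the sample as unseen) can be sketched as a simple threshold on the maximum probability. The threshold value, the function name, and the callable interface for the two models are assumptions made for this example, not details from the winning solution.

```python
def route_prediction(grapheme_probs, seen_predict, unseen_predict, threshold=0.5):
    """Pick the seen or unseen model based on out-of-distribution confidence.

    grapheme_probs: per-class probabilities from the out-of-distribution model.
    seen_predict / unseen_predict: zero-argument callables returning a prediction.
    threshold: hypothetical cut-off; it would need tuning on validation data.
    """
    if max(grapheme_probs) < threshold:
        # No known grapheme is confident enough: treat the sample as unseen.
        return "unseen", unseen_predict()
    return "seen", seen_predict()

# Hypothetical usage with stand-in models that return fixed component labels.
confident = route_prediction([0.05, 0.9, 0.05],
                             lambda: (10, 2, 0), lambda: (11, 3, 1))
uncertain = route_prediction([0.3, 0.4, 0.3],
                             lambda: (10, 2, 0), lambda: (11, 3, 1))
```

Passing the models as callables means only the selected model is evaluated, which matters under the competition's inference time limit.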
  13. CycleGAN Training https://www.kaggle.com/c/bengaliai-cv19/discussion/135984 • Train on font images and handwritten images • 2 ttf files work better than 1 (a ttf file is a font) • Convert the original image to a font image and predict with EfficientNet-B0 • Captures a narrow portion of the handwriting • Abstracts the character structure • Details are in the interview article: https://medium.com/kaggle-blog/top-marks-for-student-kaggler-in-bengali-ai-a-winners-interview-with-linsho-kaku-dd321b324c74
  14. What I learned • Unseen prediction is very important in this competition • ArcFace / binary cross-entropy • Character generation can yield generalized features • CycleGAN can extract abstract features, which was impressive • Character generation from fonts can also improve generalization • Mixed-sample augmentation (such as FMix) can give better results