model_pipeline_final.pdf

Maxwell
September 18, 2018

Model pipeline and other material from the Home Credit Default Risk competition.
Thanks to my teammates.

Transcript

  1. ikiri_DS Model PipeLine (Home Credit Default Risk)

    [Pipeline diagram; the wiring between nodes is not recoverable from the transcript. Surviving labels:]
    - Feature set labels: 600+1 (LB804); 1000+1 (LB803); meta app; meta bur; Kernel GP; Nejumi features; Tereka features; tosh; Branden features (8); takuoko features (9); Angus features (10); Tam features (11); RK features (12); Ireko DAE (13); Nejumi prediction
    - Names shown against feature sets 1 to 7: ONODERA, Maxwell, Nejumi, Tereka, RK
    - Model types: LGBM, CatBoost, CNN, ExtTree, NN, RNN, RGF
    - Residual models: Res1, Res2, Res3 (corrected with residual regression), using the hidden layer as additional features to correct residuals
    - Blending: Blending CV 0.8094; Adversarial Stochastic Blending CV 0.8096; Adversarial Stochastic Blending CV 0.81050 (model drawn on the next page); Adversarial Stochastic Blending CV 0.8061 (29 Aug 2018)
    - Leaderboard: Public 0.8085 (17th) / Private 0.8017 (18th); Public 0.8093 (10th) / Private 0.8016 (18th); Public 0.8080 (23rd) / Private 0.8028 (14th); Public 0.8110 (3rd) / Private 0.8042 (5th)
    - Giba Post Processing: Public 2nd 0.81241 / Private 2nd 0.80561
  2. ikiri_DS Model PipeLine (Home Credit Default Risk)

    [Pipeline diagram; the wiring between nodes is not recoverable from the transcript. Surviving labels:]
    - Feature set labels: 600+1 (LB804); 1000+1 (LB803); meta app; meta bur; Kernel GP; Nejumi features; Tereka features; tosh; Branden features (8); takuoko features (9); Angus features (10); Tam features (11); RK features (12); Ireko DAE (13); Nejumi prediction
    - Names shown against feature sets 1 to 7: ONODERA, Maxwell, Nejumi, Tereka, RK
    - Model types: LGBM, XGB, CNN, ExtTree, NN, RNN, RGF
    - Residual models: Res1, Res2, Res3, Res4 (corrected with residual regression), using the hidden layer as additional features to correct residuals
    - Blending: Blending CV 0.8085; Adversarial Stochastic Blending CV 0.8085; Adversarial Stochastic Blending CV 0.8097 (model drawn on the next page); Adversarial Stochastic Blending CV 0.8061 (29 Aug 2018)
    - Stacking with LGBM: CV 0.8080; Public 0.8070 / Private 0.8015
    - Leaderboard: Public 0.8071 (26th) / Private 0.8009 (37th); Public 0.8082 (23rd) / Private 0.8022 (18th); Public 0.8080 (23rd) / Private 0.8028 (14th); Public 0.8099 (7th) / Private 0.8040 (6th)
    - Giba Post Processing
  3. Stacking-like Light GBM (Maxwell)

    Inputs:
    - Source tables: application; bureau, bureau balance; previous, inst, POS_CASH, credit
    - XGBoost meta features: XGBoost app meta feature, XGBoost prev meta feature, XGBoost bureau meta feature
    - ONODERA BASIC FEATURES: 600 features
    - NEJUMI FEATURES (interest rate): 1 feature

    Stacking-like Light GBM settings (all data):
    - 5 stratified folds (shuffle = True)
    - 5 / 8 SEEDs rank averaged
    - SEED: 71 for model fit
    - SEED: 710, 711, 712, 713, 714 (715, 716, 717) for OOF prediction
    - Hyperparameters tuned for 603 features (reflected on meta features)

    Scores and feature counts, in slide order:
    - AUC: 0.683 (SEED71), 0.683 (SEEDs avg)
    - AUC 0.772 (SEED71), 0.773 (SEEDs avg)
    - 229 features; 300 features
    - 603 (604) features; Local CV 0.80641; Public LB / Private LB 0.80569 / 0.79853 (100th / 105th)
    - AUC 0.710 (SEED71), 0.712 (SEEDs avg)
    - 952 features; Local CV 0.80646; LB 0.804 (~0.805)
    - Maxwell: 603 (604) selected features based on ONODERA criteria; w/o feature selection
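
    The "SEEDs rank averaged" step on this slide could be sketched roughly as follows. This is a minimal illustration, not the team's code: the function names `to_normalized_ranks` and `rank_average` are hypothetical, the prediction values are made up, and scaling ranks to [0, 1] before averaging is an assumption (the slide does not say how ranks were scaled). The seed numbers 710 to 712 are taken from the slide. Because AUC depends only on the ordering of predictions, averaging ranks rather than raw scores keeps one seed's differently calibrated outputs from dominating the blend.

```python
from statistics import mean


def to_normalized_ranks(scores):
    """Convert scores to average ranks scaled to [0, 1]; tied scores share a rank."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the end of the current group of tied scores.
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0  # 0-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    denom = max(n - 1, 1)  # avoid division by zero for a single row
    return [r / denom for r in ranks]


def rank_average(predictions_by_seed):
    """Average the normalized ranks of each row across all seeds."""
    per_seed = [to_normalized_ranks(p) for p in predictions_by_seed.values()]
    return [mean(row) for row in zip(*per_seed)]


# Illustrative (made-up) predictions for four rows from three seeds.
preds = {
    710: [0.10, 0.40, 0.35, 0.80],
    711: [0.20, 0.30, 0.60, 0.90],
    712: [0.05, 0.50, 0.45, 0.70],
}
blended = rank_average(preds)
print(blended)
```

    Rows that every seed ranks lowest or highest end up at 0 and 1 respectively; rows on which the seeds disagree land in between.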