model_pipeline_final.pdf
Maxwell
September 18, 2018
Model pipeline and other material from the Home Credit Default Risk competition.
Thanks to my teammates.
Transcript
ikiri_DS Model Pipeline (Home Credit Default Risk)
[Pipeline diagram. Only the labels and scores below survive in the extracted text; the arrows connecting them do not.]
Features: 600+1 (LB804); 1000+1 (LB803).
Feature sets, numbered 1 to 13. Labels shown: meta app, meta bur, Kernel GP, Nejumi features, Tereka features, tosh, Branden features (8), takuoko features (9), Angus features (10), Tam features (11), RK features (12), Ireko DAE (13).
Names shown: ONODERA, Maxwell, Nejumi, Tereka, RK, Branden, takuoko, Angus, Tam, Ireko, Giba.
Models: LGBM, CatBoost, CNN, NN, RNN, ExtTree and RGF, each drawn with feature-set numbers beside it (for example "+ LGBM 8", "+ NN 1 3", "+ RNN 7 1", "+ ExtTree 4 3 1").
Residual models: Residual 1 to 3 (Res1, Res2, Res3), corrected with residual regression. Note on the slide: "using hidden layer as additional features to correct residuals."
Blending: Blending, CV 0.8094. Adversarial Stochastic Blending, CV 0.8096. Adversarial Stochastic Blending, CV 0.81050. Adversarial Stochastic Blending, CV 0.8061 (29 Aug 2018). Note on the slide: "* model drawn in next page."
Leaderboard scores shown, in order of appearance: Public 0.8085 (17th) / Private 0.8017 (18th). Public 0.8093 (10th) / Private 0.8016 (18th). Public 0.8080 (23rd) / Private 0.8028 (14th). Public 0.8110 (3rd) / Private 0.8042 (5th).
Giba Post Processing. Public 2nd 0.81241, Private 2nd 0.80561.
ikiri_DS Model Pipeline (Home Credit Default Risk)
[Pipeline diagram, second version. Only the labels and scores below survive in the extracted text; the arrows connecting them do not.]
Features: 600+1 (LB804); 1000+1 (LB803).
Feature sets, numbered 1 to 13. Labels shown: meta app, meta bur, Kernel GP, Nejumi features, Tereka features, tosh, Branden features (8), takuoko features (9), Angus features (10), Tam features (11), RK features (12), Ireko DAE (13).
Names shown: ONODERA, Maxwell, Nejumi, Tereka, RK, Branden, takuoko, Angus, Tam, Ireko, Giba.
Models: LGBM, XGB, CNN, NN, RNN, ExtTree and RGF, each drawn with feature-set numbers beside it (for example "+ LGBM 4 3 2", "+ XGB 4 3 1", "+ NN 1 3", "+ RNN 7 1").
Residual models: Residual 1 to 3 plus Res4, corrected with residual regression. Note on the slide: "using hidden layer as additional features to correct residuals."
Blending: Blending, CV 0.8085. Adversarial Stochastic Blending, CV 0.8085. Adversarial Stochastic Blending, CV 0.8097. Adversarial Stochastic Blending, CV 0.8061 (29 Aug 2018). Note on the slide: "* model drawn in next page."
Stacking: stacking with LGBM, CV 0.8080, Public 0.8070 / Private 0.8015.
Leaderboard scores shown, in order of appearance: Public 0.8071 (26th) / Private 0.8009 (37th). Public 0.8082 (23rd) / Private 0.8022 (18th). Public 0.8080 (23rd) / Private 0.8028 (14th). Public 0.8099 (7th) / Private 0.8040 (6th).
Giba Post Processing.
Stacking-like Light GBM (Maxwell)
[Diagram. Only the labels and scores below survive in the extracted text.]
Input tables: application, bureau, bureau balance, previous, inst, POS_CASH, credit.
XGBoost meta features: app meta feature, bureau meta feature, prev meta feature. AUC values shown, in order of appearance: 0.683 (SEED71) / 0.683 (SEEDs avg); 0.772 (SEED71) / 0.773 (SEEDs avg); 0.710 (SEED71) / 0.712 (SEEDs avg). Feature counts shown: 229 features, 300 features, all data.
Training setup: 5 stratified folds (shuffle = True); 5 / 8 SEEDs rank averaged. SEED 71 for model fit; SEEDs 710, 711, 712, 713, 714 (715, 716, 717) for OOF prediction. Hyperparameters tuned for 603 features (reflected on meta features).
603 (604) features: ONODERA BASIC FEATURES (600 features) plus NEJUMI FEATURES (interest rate, 1 feature), selected based on ONODERA criteria. Local CV 0.80641. Public LB / Private LB 0.80569 / 0.79853 (100th / 105th).
952 features, w/o feature selection: Local CV 0.80646, LB 0.804 (~0.805).
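The slide states that predictions from 5 / 8 seeds were rank averaged but does not show how. The following is a minimal sketch of one common way to do this, not the team's actual code: each seed's predictions are converted to ranks scaled to [0, 1] (ties share their mean rank), and the scaled ranks are averaged per row. The function names `rank_normalize` and `rank_average` and the toy prediction values are assumptions made for illustration; only the seed numbers come from the slide.

```python
def rank_normalize(scores):
    """Convert scores to ranks scaled to [0, 1]; tied scores share their mean rank."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Extend j over the run of scores tied with the score at position i.
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0  # 0-based mean rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    denom = max(n - 1, 1)  # avoid division by zero for a single row
    return [r / denom for r in ranks]


def rank_average(predictions_per_seed):
    """Average the rank-normalized predictions of several models, row by row."""
    ranked = [rank_normalize(p) for p in predictions_per_seed]
    n_models = len(ranked)
    return [sum(column) / n_models for column in zip(*ranked)]


# Toy predictions for four rows from three seeds (values are made up).
seed_predictions = {
    710: [0.10, 0.40, 0.35, 0.80],
    711: [0.20, 0.30, 0.50, 0.90],
    712: [0.05, 0.60, 0.55, 0.70],
}
blended = rank_average(list(seed_predictions.values()))
print(blended)
```

Because AUC depends only on the ordering of predictions, averaging ranks rather than raw probabilities keeps a seed whose outputs happen to be on a different scale from dominating the blend.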