
About Kaggle Diary (Kaggle日記)

fkubota
June 02, 2021


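In this talk I explain things like why I created Kaggle Diary.

For details on Kaggle Diary, see the article below:

Kaggle日記という戦い方 ("Kaggle Diary as a Way of Fighting"): https://zenn.dev/fkubota/articles/3d8afb0e919b555ef068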


Transcript

  1. About Kaggle Diary: its history and background. fkubota

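  2. Self-introduction: fkubota (Twitter, Kaggle)
    - Machine learning engineer at Kanmu (カンム), the company behind Bandle Card (バンドルカード)
    - Kaggle Expert
    - From Okinawa (moved to Tokyo three and a half years ago)
    - Physics graduate (strongly correlated electron systems; used to dunk magnetic materials into liquid helium at 4.2 K)
    - About two and a half years of programming experience
    - Hobbies: getting up early (4:30 a.m.), Kaggle, coffee, beer, whisky, reading, physics, philosophy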
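  3. What I'll talk about today (mainly for beginners!)
    - What is Kaggle Diary? A quick introduction to what it looks like. There is also an article (url).
    - Why competitions were hard for me at first. When I started competing I had no motivation at all, so I'll sort out the problems I was facing back then.
    - So I started Kaggle Diary. Knowing why I started it should make it easier to use. I'll also argue that it doesn't even have to be Kaggle Diary in the first place.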
  4. What is Kaggle Diary?

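  5. It looks like this! Everything produced while taking part in a Kaggle competition goes into the Kaggle Diary: experiment plans, experiment results and analysis, papers and references, Discussion, Kaggle Code, and ideas.

    Example: the Kaggle Diary for the bird competition.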
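  6. What makes Kaggle hard: it is difficult, and it is toooooo long.
    - I forget what I was doing.
    - Naming conventions break down.
    - Motivation doesn't last...
    - Ideas dry up.
    Countermeasures for the "too long" problem.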
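  7. What Kaggle Diary works on
    - Keeping motivation up: a long competition where almost nothing works is hell.
    - Not managing things through file names: it will break down anyway, so give up on managing that way from the start.
    - Not managing things in your brain's memory: the competition lasts three months, so there is no way to remember everything.
    - Traceability (I just wanted to say the word).
    For details, see "Kaggle日記という戦い方" m(_ _)m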
  8. How competitions were hard for me at first

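  9. When I first got interested in Kaggle
    - Interest: "You can learn AI while competing with others!? I have to do this!"
    - Complete understanding: "I understand decision trees and SVMs. What, I can use them right away with sklearn? Easy lol"
    - Silence: "I've tried decision trees and SVMs, so there's nothing left to do, right? Done after one submission! I can't tell what the discussions are saying either. Going any further is too painful!"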
  10. Me, enlightened: "Not yet... now is not the time."

  11. A two-month blank: I studied while waiting for that time to come.

  12. Me, truly enlightened: "that time" will never come.

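  13. Practice must be different from study, after all. Some things can only be learned in practice: EDA, leaks, how to split CV, feature selection, and so on...

    I get that, but hard is still hard. Maybe I'm just not cut out for this?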
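  14. Roughly the curve I get whenever I learn something. (Figure: fun, starting from 0, plotted against amount of commitment, with "ナニコレ?" ("what is this?") written three times on the plot.) It is hard to put into words, but: knowledge starts to become systematic? The dots start to connect? I understand what I don't understand? I understand the rules of the game? ...something like that?

    This is only my own experience, though.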
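  15. What I'm trying to say. (Figure: the same plot of fun, starting from 0, with a stretch labeled the "ナニコレ" ("what is this?") period.) I decided to get through this part somehow first. It would be a waste to conclude here that I'm not cut out for it. That's only how I felt, though!!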

  16. The problem became a little clearer: from "competitions are hard, what do I do?" to "how do I get through the ナニコレ period?"

  17. So I started Kaggle Diary

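  18. Things likely to cause stress
    - Naming conventions breaking down: exp_5fold.ipynb, exp_6fold_3.ipynb, exp_5fold_seed42.ipynb, exp_5fold_nn_v5.ipynb, exp_4fold_nn_prepro.ipynb, exp_5fold_nn_postproc_v2.ipynb, exp_5fold_fix_bug.ipynb
    - Cost of resuming work: "I took two days off and meant to pick it back up, but it's a hassle, so I'll do it tomorrow." (An infinite loop.)
    - Ideas drying up: "I've run out of ideas."
    - Managing references: "I saved papers and articles that looked useful, but I can't tell which ones I've read and which I haven't."
    - Managing output files: "How was this ultra_super_feature.csv created?"
    - No reproducibility: "Run A.py to produce a.csv, then run B.py with it, and the result should reproduce. Well, it didn't :)"
    Todo / Doing / Done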
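  19. Anything works, as long as it deals with the problems. I adopted Kaggle Diary as a way to handle the problems on the previous page with as little strain as possible. A few caveats, though:
    - It doesn't have to be Kaggle Diary.
    - You don't have to force yourself to address every small problem in the first place.
    - If it's really that hard, not doing Kaggle at all is also an option.
    So please take my opinion as just one option among many.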
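  20. Memories of the molecule competition: my first competition!! And my Kaggle Diary debut!! (Figure: public LB score against time, from 0 to 3 months, with the bronze medal line marked.)
    - Early on: "What is pandas?", "groupby... I don't really get it", "What is oof!?", "lightGBM???", "Why are there so many tables!?"
    - Just as the questions were piling up, I happened to come across カレー-san's "Kaggleのチュートリアル" and understood everything.
    - During this period I read all of the code and discussions and simply used whatever I found, and my score went up. It was still below the public notebooks.
    - I started implementing the original ideas I had saved up until then. My implementation speed was just slow. I tried about 15 and roughly 3 of them worked.
    - In the last 7 days I studied ensembling and tried it.
    - With about 2 days left I got into the bronze medal zone for the first time.
    - Finished with a bronze medal (9%)!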
  21. Thanks :)
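
Slides 7 and 18 argue for not managing experiments through file names and for being able to trace where an output file came from. The following is only a minimal sketch of that idea, not the author's actual Kaggle Diary format or tooling: the `Experiment` and `Diary` classes, the JSON Lines file, and the `exp001`-style IDs are all hypothetical choices made for illustration.

```python
import json
import tempfile
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class Experiment:
    exp_id: str                                   # short sequential ID, e.g. "exp001"
    summary: str                                  # what was tried, in plain words
    inputs: list = field(default_factory=list)    # files this run read
    outputs: list = field(default_factory=list)   # files this run wrote
    result: str = ""                              # score and observations


class Diary:
    """Append-only experiment log stored as JSON Lines (hypothetical format)."""

    def __init__(self, path):
        self.path = Path(path)

    def entries(self):
        # Read every logged experiment back from disk, oldest first.
        if not self.path.exists():
            return []
        with self.path.open(encoding="utf-8") as f:
            return [Experiment(**json.loads(line)) for line in f if line.strip()]

    def add(self, summary, inputs=(), outputs=(), result=""):
        # The ID is just a counter; the description lives in the diary,
        # not in the file name.
        exp_id = f"exp{len(self.entries()) + 1:03d}"
        exp = Experiment(exp_id, summary, list(inputs), list(outputs), result)
        with self.path.open("a", encoding="utf-8") as f:
            f.write(json.dumps(asdict(exp), ensure_ascii=False) + "\n")
        return exp

    def producer_of(self, filename):
        # Answer "how was this file created?" by looking up the run that wrote it.
        for exp in self.entries():
            if filename in exp.outputs:
                return exp
        return None


def demo():
    # Illustrative entries only; summaries and file names are made up.
    with tempfile.TemporaryDirectory() as tmp:
        diary = Diary(Path(tmp) / "diary.jsonl")
        diary.add("5-fold baseline", inputs=["train.csv"], outputs=["oof_exp001.csv"])
        diary.add("add groupby features", inputs=["train.csv"],
                  outputs=["ultra_super_feature.csv"])
        return diary.producer_of("ultra_super_feature.csv")


if __name__ == "__main__":
    print(demo())
```

With a log like this, notebooks can be named by ID alone (for example `exp002.ipynb`), and the question from slide 18 about `ultra_super_feature.csv` becomes a lookup instead of a memory test.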