
What I Value When Building Training and Inference Pipelines

What I value when building the training and inference pipelines I use in data analysis competitions such as Kaggle and SIGNATE

Takanobu Nozawa

November 30, 2019

Transcript

  1. What I Value When Building Training and Inference Pipelines. Connehito Inc., Takanobu Nozawa. Data analysis competition LT meetup

  2. Hello!

  3. Today's talk overlaps somewhat with an earlier deck of mine on Speaker Deck (speakerdeck.com/takapy), about feature management in data analysis competitions.

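  4. Agenda
      • Self-introduction
      • The pain of not having a pipeline
      • What I value
        ‣ Training reproducibility
        ‣ A faster PDCA cycle
        ‣ A setup that manages things without conscious effort
      • Implementation details
  5. Self-introduction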

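  6. Self-introduction
      Name: Takanobu Nozawa
      Affiliation: Connehito Inc.
      Handle: Takapai (@takapy)
      • Joined Connehito as an ML engineer
      • Mainly working on machine learning (NLP, recommender systems) while also studying infrastructure (AWS)
      • I do Kaggle, write a blog (https://www.takapy.work), play baseball, and eat ramen
      • If you can recommend a good drum-type washing machine, please let me know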
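  7. Self-introduction (Mamari)
      With "worries" and "empathy" as its axes, Mamari stays close to mothers and offers its service from many angles: app, web, and SNS.
      • App / Web: growing its user base around a Q&A community where mothers consult each other about their worries
      • SNS (Instagram, LINE, Facebook): users are steadily connecting with each other through Mamari
      • Articles: articles useful for mothers' daily lives, delivered across a wide range of genres
      • Selected as the No. 1 app for mothers: ranked first among "apps currently in use" chosen by mothers, in the categories would recommend to other mothers, awareness, usage rate, convenience, and favorability
      • Articles: over 6,000. Cumulative fans: approx. 850,000. Monthly views: approx. 150 million. Monthly users: approx. 6.5 million.
      Notes: views and users are combined totals for media and app (a multi-month average). The No. 1 ranking is from an Intage survey of women who are pregnant or have a child up to 2 years 0 months old. The fan count is the total of Instagram followers, Facebook likes, and LINE friends.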
  8. Self-introduction (Mamari)
      [Chart: monthly growth from 2014/4 to 2018/8, vertical axis 0 to 1,800,000. Callouts: approx. 1.5 million posts per month; weekly active users; TV commercial aired; total app downloads.]
      Notes: the usage ratio is computed from the number of users who set an expected delivery date in Mamari and the number of births in the Ministry of Health, Labour and Welfare's vital statistics.
      Mamari is growing into a brand of one of the largest scales in Japan.
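  9. The pain of not having a pipeline

  10. The pain of not having a pipeline: a notebook with a really good score is finished ↓ "I'll duplicate this notebook and build an even better model!"

  11. Meanwhile, inside the notebook…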

  12. The pain of not having a pipeline: inside the notebook
          [ ] import numpy as np
              import pandas as pd
          ・・・
          [ ] hogehoge
          [ ] hogehoge
          [ ] hogehoge
          ・・・
  13. The pain of not having a pipeline (same notebook as slide 12): the cell execution order is jumbled ↓ there is no reproducibility
  14. The pain of not having a pipeline: inside the notebook
          [ ] import numpy as np
              import pandas as pd
          ・・・
          [ ] submission.to_csv('submission.csv', index=False)
  15. The pain of not having a pipeline (same notebook as slide 14): there are many cells and the same computation has to be run several times ↓ the PDCA cycle tends to slow down
  16. The pain of not having a pipeline
      [File listing under notebook: eight files named LightGBM_score_… Copy….ipynb, eight files named LightGBM_score_….ipynb, and more ……]
  17. The pain of not having a pipeline (same file listing as slide 16): notebooks multiply without limit, and managing them becomes a real mess
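  18. To get rid of these pains, I took on the challenge of moving away from notebooks ↓ I will share what I value in doing so. *I still use notebooks for EDA and similar work.

  19. The foundation is this book.

  20. What I value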

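  21. No. 1: Training reproducibility

  22. No. 1: Training reproducibility. There are three points.

  23. What I value: training reproducibility
      • Which features were used

  24. What I value: training reproducibility
      • Which features were used
      • Which parameters were used

  25. What I value: training reproducibility
      • Which features were used
      • Which parameters were used
      • Which CV was used
      Save all of this for every training run.

  26. What I value: training reproducibility (same points as slide 25). The "DONNA preservation rule" ("donna" is Japanese for "which").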
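The three points above can be sketched as a small helper that writes the settings of a run to a JSON file. This is only an illustration under assumptions: the function names `save_run_config` and `load_run_config` and the `<run_name>_param.json` file name are hypothetical, and the key layout simply mirrors the JSON that the deck shows later for hoge.py.

```python
import json
from pathlib import Path


def save_run_config(out_dir, run_name, features, model_params, cv,
                    feature_directory, target):
    """Write the settings of one training run to <out_dir>/<run_name>_param.json."""
    config = {
        "use_features": features,      # which features were used
        "model_params": model_params,  # which parameters were used
        "cv": cv,                      # which CV was used
        "dataset": {
            "run_name": run_name,
            "feature_directory": feature_directory,
            "target": target,
        },
    }
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{run_name}_param.json"  # hypothetical naming scheme
    with path.open("w", encoding="utf-8") as f:
        json.dump(config, f, indent=4, ensure_ascii=False)
    return path


def load_run_config(path):
    """Read a saved configuration back, for example to rerun an old experiment."""
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)
```

Calling `save_run_config` at the start of every training run means the record exists even if the run later fails, which is one reasonable design choice among several.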
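  27. No. 2: Aiming for a fast PDCA cycle

  28. What I value: aiming for a fast PDCA cycle
      • Control every "DONNA" from a single Python script: the features to use, the parameters to use, the CV to use

  29. What I value: aiming for a fast PDCA cycle
      • Control every "DONNA" from a single Python script: the features to use, the parameters to use, the CV to use
      • Make feature importance visible, to get a feel for what to try in the next experiment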
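As a rough sketch of how a single script can steer the CV choice, the code below turns a `cv` dictionary of the kind hoge.py defines later in the deck (`'method'`, `'n_splits'`, `'shuffle'`, `'random_state'`) into train/validation index lists. Everything here is an assumption for illustration: `make_splits` and its helpers are hypothetical names, the splitting is a standard-library stand-in rather than the deck's actual Runner, and in practice a library splitter such as scikit-learn's `KFold` or `GroupKFold` would normally do this work. The group variant ignores `'shuffle'` and `'random_state'`.

```python
import random
from collections import defaultdict


def kfold_indices(n_samples, n_splits, shuffle=False, random_state=None):
    """Split range(n_samples) into n_splits folds, optionally shuffled with a seed."""
    indices = list(range(n_samples))
    if shuffle:
        random.Random(random_state).shuffle(indices)
    return [indices[i::n_splits] for i in range(n_splits)]


def group_kfold_indices(groups, n_splits):
    """Assign whole groups to folds so that no group is split across folds."""
    members = defaultdict(list)
    for i, g in enumerate(groups):
        members[g].append(i)
    if len(members) < n_splits:
        raise ValueError("need at least as many groups as folds")
    folds = [[] for _ in range(n_splits)]
    # Largest groups first, each into the currently smallest fold.
    for g in sorted(members, key=lambda g: (-len(members[g]), str(g))):
        smallest = min(range(n_splits), key=lambda k: len(folds[k]))
        folds[smallest].extend(members[g])
    return folds


def make_splits(cv, n_samples, groups=None):
    """Return a list of (train_indices, valid_indices) according to cv['method']."""
    method = cv["method"]
    if method == "KFold":
        folds = kfold_indices(n_samples, cv["n_splits"],
                              cv.get("shuffle", False), cv.get("random_state"))
    elif method == "GroupKFold":
        if groups is None:
            raise ValueError("GroupKFold needs group labels")
        folds = group_kfold_indices(groups, cv["n_splits"])
    else:
        raise ValueError(f"unknown CV method: {method}")
    splits = []
    for k, valid in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        splits.append((sorted(train), sorted(valid)))
    return splits
```

With a dispatcher like this, switching the validation scheme is a one-line change to the `cv` dictionary, which is the property the slide is after.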
  30. No. 3: A setup that manages things without conscious effort

  31. What I value: a setup that manages things without conscious effort
      • The "DONNA" items, models, and log files are managed automatically in directories and files that get a datetime suffix on every run: the features used, the parameters, the CV, the trained model, the training log file, and the feature importance

  32. What I value: a setup that manages things without conscious effort
      • Same automatic management as slide 31
      • Putting the Public score in the directory prefix makes the correspondence between Public and local scores easy to see (details later)

  33. What I value: a setup that manages things without conscious effort (same points as slide 32). Let's look at the details.
  34. Implementation details

  35. Example: hoge.py
          features = [
              "age", "pclass", "family_size", "fare", "sibsp",
              "parch", "cabin"
          ]
          params_lgb = {
              'boosting_type': 'gbdt',
              'objective': 'fair',
              'metric': 'fair',
              'num_boost_round': 20000,
              'early_stopping_rounds': 1000,
              'verbose': 1000,
              'random_state': 999
          }
          cv = {
              'method': 'KFold',
              'n_splits': 5,
              'random_state': 42,
              'shuffle': True,
          }
          runner = Runner(run_name, ModelLGB, features, dataset.get('target'),
                          params_lgb, cv, FEATURE_DIR_NAME, MODEL_DIR_NAME)
          runner.run_train_cv()  # training
          runner.run_predict_cv()  # inference
          Submission.create_submission(run_name)  # create the submission file
  36. Example: hoge.py (same code as slide 35). As a rule, hoge.py is the only file I touch when iterating through the PDCA cycle.
  37. Example: hoge.py (same code as slide 35). Annotation on features: the features used for training.
  38. Example: hoge.py (same code as slide 35). Annotation on params_lgb: the hyperparameters.
  39. Example: hoge.py (same code as slide 35). Annotation on cv: the CV settings.
  40. Example: hoge.py (same code as slide 35). Annotation on the runner and Submission calls: training, inference, and submission creation.
  41. Example: hoge.py (same code as slide 35). From here I will talk about:
      • measures to ensure reproducibility
      • measures for running a fast PDCA cycle
      • measures for managing everything without conscious effort
  42. Measures to ensure training reproducibility

  43. Implementation details: training reproducibility. hoge.py (the features, params_lgb, and cv definitions from slide 35)
  44. Implementation details: training reproducibility. hoge.py (same definitions), run with: python hoge.py
  45. Implementation details: training reproducibility. Running python hoge.py produces hoge_param.json:
          {
              "use_features": [
                  "age", "pclass", "family_size", "fare", "sibsp", "parch", "cabin"
              ],
              "model_params": {
                  "boosting_type": "gbdt",
                  "objective": "fair",
                  "metric": "fair",
                  "num_boost_round": 20000,
                  "early_stopping_rounds": 1000,
                  "verbose": 1000,
                  "random_state": 999
              },
              "cv": {
                  "method": "KFold",
                  "n_splits": 5,
                  "random_state": 42,
                  "shuffle": true
              },
              "dataset": {
                  "run_name": "lgb_1128_2003",
                  "feature_directory": "../data/features/remove_outlier/",
                  "target": "salary"
              }
          }
  46. Implementation details: training reproducibility (same hoge.py and hoge_param.json as slide 45). Running hoge.py automatically generates the JSON file, and all of the features, parameters, and other settings that were used are saved.
  47. Implementation details: training reproducibility (same hoge.py and hoge_param.json as slide 45). This is what reproducibility looks like.
  48. Measures for running a fast PDCA cycle

  49. Implementation details: aiming for a fast PDCA cycle. hoge.py (the features, params_lgb, and cv definitions from slide 35)
  50. Implementation details: aiming for a fast PDCA cycle. hoge.py (same definitions). "I want to train with fewer features."
  51. Implementation details: aiming for a fast PDCA cycle. Comment out the target feature in hoge.py (params_lgb and cv unchanged):
          features = [
              "age",
              "pclass",
              # "family_size",
              "fare",
              "sibsp",
              "parch",
              "cabin"
          ]
  52. Implementation details: aiming for a fast PDCA cycle. Comment out the target feature (as on slide 51), then run: python hoge.py
  53. Implementation details: aiming for a fast PDCA cycle. Comment out the target feature, run python hoge.py. Easy.
  54. Implementation details: aiming for a fast PDCA cycle. hoge.py (the features, params_lgb, and cv definitions from slide 35)
  55. Implementation details: aiming for a fast PDCA cycle. hoge.py (same definitions). "I want to change the CV and train."
  56. Implementation details: aiming for a fast PDCA cycle. Change the cv entry in hoge.py (features and params_lgb unchanged):
          cv = {
              'method': 'GroupKFold',
              'n_splits': 5,
              'random_state': 42,
              'shuffle': True,
              'cv_target': 'user'
          }
  57. Implementation details: aiming for a fast PDCA cycle. Change the cv entry (as on slide 56), then run: python hoge.py
  58. Implementation details: aiming for a fast PDCA cycle. Change the cv entry, run python hoge.py. Easy.
  59. Implementation details: aiming for a fast PDCA cycle (same as slide 57). Next: about feature importance.
  60. Implementation details: aiming for a fast PDCA cycle. The feature importance is output as an image file at the same time as training.

  61. Implementation details: aiming for a fast PDCA cycle. The feature importance is output as an image file at the same time as training. I am aware that a line chart is not the right chart for this… For now this is my own homegrown setup, so I have left it as it is. Please go easy on me.

  62. Implementation details: aiming for a fast PDCA cycle. Feature importance image, annotation: the features.

  63. Implementation details: aiming for a fast PDCA cycle. Feature importance image, annotation: mean and standard deviation across folds.

  64. Implementation details: aiming for a fast PDCA cycle. Feature importance image, annotation: coefficient of variation (standard deviation / mean).

  65. Implementation details: aiming for a fast PDCA cycle. Feature importance image: you get a feel for which features can probably be dropped.

  66. Implementation details: aiming for a fast PDCA cycle. Feature importance image: you get a feel for which features can probably be dropped. It feels like the PDCA cycle can turn faster.
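The statistics named on the feature importance slides (mean and standard deviation across folds, and the coefficient of variation) can be computed along these lines. This is a minimal sketch under assumptions: `summarize_importance` is a hypothetical helper, the input is assumed to be one `{feature: importance}` dictionary per fold, and the population standard deviation is used because the deck does not say which variant its chart shows.

```python
from statistics import mean, pstdev


def summarize_importance(importance_per_fold):
    """Summarize per-fold feature importances.

    importance_per_fold: list of {feature_name: importance} dicts, one per fold.
    Returns rows sorted by mean importance, highest first.
    """
    summary = []
    for name in importance_per_fold[0]:
        values = [fold[name] for fold in importance_per_fold]
        m = mean(values)
        s = pstdev(values)  # population standard deviation (an assumption)
        # Coefficient of variation = standard deviation / mean.
        # It is undefined for a zero mean, so report infinity in that case.
        cv = s / m if m != 0 else float("inf")
        summary.append({"feature": name, "mean": m, "std": s, "cv": cv})
    return sorted(summary, key=lambda row: row["mean"], reverse=True)
```

Features with a low mean importance, or with a coefficient of variation that is large relative to the others, are the natural first candidates to comment out in the next experiment.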

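  67. Measures for a setup that manages things without conscious effort

  68. Implementation details: a setup that manages things without conscious effort. The folders and files after running hoge.py.

  69. Implementation details: a setup that manages things without conscious effort. The folders and files after running hoge.py: a directory named p…lgb_…_…

  70. Implementation details: a setup that manages things without conscious effort. Directory p…lgb_…_…, annotation: the Public score (…).

  71. Implementation details: a setup that manages things without conscious effort. Directory p…lgb_…_…, annotation: a name that uniquely identifies the training run.

  72. Implementation details: a setup that manages things without conscious effort. Directory p…lgb_…_…, annotation: the identifier of the algorithm.

  73. Implementation details: a setup that manages things without conscious effort. Directory p…lgb_…_…, annotation: the date and time hoge.py was run.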
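The naming scheme annotated above (algorithm identifier, run datetime, Public score in front) might be implemented roughly as follows. The details are assumptions: the `%m%d_%H%M` timestamp is inferred from the `run_name` value `lgb_1128_2003` in the deck's JSON, the `p<score>_` prefix format is a guess since only the leading `p` is visible, and both function names are hypothetical.

```python
from datetime import datetime
from pathlib import Path


def make_run_name(algorithm, now=None):
    """Build a run name such as 'lgb_1128_2003' from an algorithm id and a timestamp."""
    now = now or datetime.now()
    return f"{algorithm}_{now:%m%d_%H%M}"


def add_public_score_prefix(run_dir, public_score):
    """Rename a run directory so that its name starts with the Public score.

    The 'p<score>_' prefix format is an assumption made for this sketch.
    """
    run_dir = Path(run_dir)
    new_dir = run_dir.with_name(f"p{public_score}_{run_dir.name}")
    run_dir.rename(new_dir)
    return new_dir
```

The prefix can only be added after submitting, so it makes sense as a separate rename step rather than as part of the training run itself.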

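  74. Implementation details: a setup that manages things without conscious effort. The folders and files after running hoge.py.

  75. Implementation details: a setup that manages things without conscious effort. After running hoge.py, the following are saved:
      • features
      • parameters
      • feature importance
      • training log
      • model
      • inference file
  76. Implementation details: a setup that manages things without conscious effort (same list as slide 75). I am not thinking about it, yet everything gets managed on its own.
  77. Implementation details: a setup that manages things without conscious effort (same list as slide 75).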
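One item in that list, the training log, could be kept per run with the standard `logging` module. This is a sketch under assumptions: `get_run_logger`, the `train.log` file name, and the one-directory-per-run layout are illustrative choices, not the deck's actual code.

```python
import logging
from pathlib import Path


def get_run_logger(base_dir, run_name):
    """Return a logger that appends to <base_dir>/<run_name>/train.log."""
    run_dir = Path(base_dir) / run_name
    run_dir.mkdir(parents=True, exist_ok=True)
    logger = logging.getLogger(run_name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep run logs out of the root logger's output
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.FileHandler(run_dir / "train.log", encoding="utf-8")
        handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
        logger.addHandler(handler)
    return logger
```

Because the log lives in the same directory as the model and the parameter file, copying or deleting one run directory moves or removes everything that belongs to that run.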
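  78. Summary

  79. Summary
      • Training and inference pipelines are great! Building a pipeline brings the benefits below, and (so far) I personally find it really good.
        ‣ Reproducibility
        ‣ A fast PDCA cycle
        ‣ Management of everything
      • Because the points above are guaranteed, it also brings psychological safety.
  80. If you have a "here's how I do it!", please tell me at the social gathering!

  81. Thank you for listening.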