Logistic Regression Part 2 - Coefficients, Odds Ratios, and Average Marginal Effects

Kan Nishida
September 26, 2019

Transcript

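  1. Logistic Regression Part 2: Coefficients, Odds Ratios, and Average Marginal Effects. Exploratory Seminar #20

  2. EXPLORATORY

  3. Speaker: Kan Nishida, CEO, Exploratory. Bio: In 2016 he founded Exploratory, Inc. to democratize data science. Alongside serving as CEO of Exploratory, Inc., he works to spread and teach the cutting-edge data science practiced in Silicon Valley through programs such as the Data Science Boot Camp training. At Oracle's U.S. headquarters he led data science development teams for 16 years and shipped many products in machine learning, big data, business intelligence, and databases. @KanAugust

  4. Vision: Using data to make better decisions becomes the norm.

  5. Mission: Democratize data science.

  6. The third wave: Data science, AI, and machine learning are not only for statisticians and developers. Anyone interested in data should be able to analyze business data easily with the world's most advanced algorithms. Exploratory makes that world possible.

  7. Diagram: three waves of data science. First wave (1976): theme monetization; algorithms proprietary (expensive/old); user experience UI & programming; users statisticians. Second wave (2000): theme commoditization; algorithms open source (free/cutting-edge); user experience programming; users data scientists. Third wave (2016): theme democratization of data science; algorithms open source (free/cutting-edge); user experience UI & automation; users business users; tool Exploratory.

  8. Exploratory: modern & simple UI. Question, Communicate, Data Access, Data Wrangling, Visualization, Analytics, Statistics / Machine Learning

  9. Logistic Regression Part 2: Coefficients, Odds Ratios, and Average Marginal Effects. Exploratory Seminar #20

  10. Question, Communicate, Data Access, Data Wrangling, Visualization, Analytics, Statistics / Machine Learning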

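  11. US baby data

  12. Target of interest: numeric; categorical (binary); categorical (multiclass)

  13. Problems • Predict the father's age from the mother's age. • Predict whether the father is older than 35 from the mother's age.

  14. Problems • Predict the father's age from the mother's age. • Predict whether the father is older than 35 from the mother's age.

  15. Target of interest: numeric; categorical (binary); categorical (multiclass)

  16. Linear regression

  17. Linear regression model (formula): Father_Age = a * Mother_Age + b, where a is the coefficient (slope) and b is the intercept.

  18. Father_Age = a * Mother_Age + b. Adjusting the coefficient (slope) and the intercept draws a straight line that matches the actual data.

  19. Coefficient (slope); intercept

  20. Linear regression model (formula): Father_Age = 0.87 * Mother_Age + 6.28, with coefficient (slope) 0.87 and intercept 6.28.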
  21. None
  22. Chart: father's age against mother's age. When the mother's age goes up by 1, the father's age goes up by 0.87.
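
The slope interpretation above can be checked with a few lines of Python. This is only a sketch that plugs in the coefficient and intercept quoted on the slides; the function name `predict_father_age` is made up for illustration.

```python
# Fitted linear model from the slides: Father_Age = 0.87 * Mother_Age + 6.28
SLOPE = 0.87      # coefficient (slope)
INTERCEPT = 6.28  # intercept


def predict_father_age(mother_age: float) -> float:
    """Predicted father's age for a given mother's age (illustrative helper)."""
    return SLOPE * mother_age + INTERCEPT


for age in (25, 26, 30):
    print(age, round(predict_father_age(age), 2))

# One extra year of mother's age always adds exactly the slope to the prediction.
step = predict_father_age(26) - predict_father_age(25)
print("change per year:", round(step, 2))
```

Because the model is a straight line, the change per year is the same wherever on the age range it is measured.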

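  23. Chart: father's age against mother's age. The linear regression model is built to fit the actual data.

  24. Problems • Predict the father's age from the mother's age. • Predict whether the father is older than 35 from the mother's age.

  25. Target of interest: numeric; categorical (binary); categorical (multiclass)

  26. Binary questions • Will this user convert? • Is this transaction fraudulent? • Will this employee quit? • Will this baby be born premature?

  27. Logistic regression model (formula): P(father is 35 or older) = logistic(a * Mother_Age + b), where a is the coefficient (slope) and b is the intercept.

  28. P(father is 35 or older) = logistic(a * Mother_Age + b). Adjusting the coefficient (slope) and the intercept draws a curve that matches the actual data.

  29. Actual data

  30. Use the table calculation "Ratio (% of Total)" to show the share of TRUE and FALSE.

  31. Share of TRUE/FALSE for each mother's age

  32. Click FALSE in the legend to hide the FALSE portion of the bars.

  33. There is another (easier) way to draw the same kind of chart.

  34. Select a logical-type column for the Y axis and choose the "% of TRUE" calculation.

  35. Switch to a line chart.

  36. We want a logistic curve that fits this actual data.

  37. P(father is 35 or older) = logistic(a * Mother_Age + b). Adjusting the coefficient (slope) and the intercept draws a curve that matches the actual data.

  38. Logistic regression model

  39. P(father is 35 or older) = logistic(0.29 * Mother_Age - 10.12), with coefficient 0.29 and intercept -10.12.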
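
As a rough illustration of what this fitted formula produces, the sketch below evaluates it for a few mother's ages. It assumes `logistic` is the standard logistic (sigmoid) function 1 / (1 + exp(-z)); the helper names are invented for this example.

```python
import math

COEF = 0.29         # coefficient from the slides
INTERCEPT = -10.12  # intercept from the slides


def logistic(z: float) -> float:
    """Standard logistic (sigmoid) function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def prob_father_35_or_older(mother_age: float) -> float:
    """Model probability for a given mother's age (illustrative helper)."""
    return logistic(COEF * mother_age + INTERCEPT)


for age in (20, 30, 35, 40):
    print(age, round(prob_father_35_or_older(age), 3))
```

The probability stays between 0 and 1 and rises with the mother's age, crossing 50% near age 35, where 0.29 * Mother_Age is about 10.12.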

  40. None
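  41. Assign the column of logistic regression predictions to the Y axis and choose the "Average" calculation.

  42. Because the logistic regression predictions are values between 0 and 1, assign them to the Y2 axis.

  43. Actual data vs. model (logistic curve): the curve fits the actual data nicely.

  44. So how should we interpret this curve? P(Father > 35) = Logistic(0.29 * Mother_Age - 10.12)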

  45. Logistic regression: interpreting the influence of predictor variables

  46. Logistic regression • Coefficient • Odds Ratio • Average Marginal Effect

  47. Logistic regression • Coefficient • Odds Ratio • Average Marginal Effect

  48. Select Coefficient as the variable metric.

  49. None
  50. None
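  51. y = logistic(0.1 * x). When the coefficient is small, a change in the predictor x has a small effect on the change in the probability of y.

  52. y = logistic(10 * x). When the coefficient is large, a change in the predictor x has a large effect on the change in the probability of y.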
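
The two curves on these slides can be compared numerically. The sketch below measures how much the probability changes as x moves from -1 to 1 under each coefficient; the range of x is an arbitrary choice for illustration.

```python
import math


def logistic(z: float) -> float:
    """Standard logistic (sigmoid) function."""
    return 1.0 / (1.0 + math.exp(-z))


xs = [-1.0, -0.5, 0.0, 0.5, 1.0]  # arbitrary x values for the comparison

small = [logistic(0.1 * x) for x in xs]  # coefficient 0.1: nearly flat curve
large = [logistic(10 * x) for x in xs]   # coefficient 10: steep S-shaped curve

# Total change in probability across the same range of x.
change_small = small[-1] - small[0]
change_large = large[-1] - large[0]

print("coefficient 0.1:", [round(p, 3) for p in small])
print("coefficient 10: ", [round(p, 3) for p in large])
print("change in probability:", round(change_small, 3), "vs", round(change_large, 3))
```

With the small coefficient the probability barely moves away from 0.5, while with the large coefficient it swings from almost 0 to almost 1 over the same range.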

  53. P(Father > 35) = Logistic(0.29 * Mother_Age - 10.12)

  54. P(Father > 35) = Logistic(0.29 * Mother_Age - 10.12); P(Father > 35) = Logit^-1(0.29 * Mother_Age - 10.12)

  55. Logit(P(Father > 35)) = 0.29 * Mother_Age - 10.12; P(Father > 35) = Logistic(0.29 * Mother_Age - 10.12); P(Father > 35) = Logit^-1(0.29 * Mother_Age - 10.12)

  56. The logit function converts a probability into log odds: Logit(P(y)) = Log(Odds(y)). Logit(P(Father > 35)) = 0.29 * Mother_Age - 10.12, so Log(Odds(Father > 35)) = 0.29 * Mother_Age - 10.12.
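
The relationship between the logistic function, the logit function, and log odds can be verified numerically. This is a minimal sketch using the standard definitions; the mother's age of 30 is just an example value.

```python
import math


def logistic(z: float) -> float:
    """Logistic function: linear predictor -> probability."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p: float) -> float:
    """Logit function: probability -> log odds. It is the inverse of logistic()."""
    return math.log(p / (1.0 - p))


z = 0.29 * 30 - 10.12   # linear predictor for a 30-year-old mother (example value)
p = logistic(z)         # probability from the model
odds = p / (1.0 - p)    # odds = P(TRUE) / P(FALSE)

print("probability:", round(p, 4))
print("log odds:   ", round(math.log(odds), 4))
print("logit(p):   ", round(logit(p), 4))  # same number as z
```

Applying logit to the model's probability returns the linear expression 0.29 * Mother_Age - 10.12, which is why the coefficient is read on the log-odds scale.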
  57. Log(Odds(Father > 35)) = 0.29 * Mother_Age - 10.12. Mother is 20: Log(Odds(Father > 35)) = 0.29 * 20 - 10.12 = -4.32

  58. Log(Odds(Father > 35)) = 0.29 * Mother_Age - 10.12. Mother is 20: Log(Odds(Father > 35)) = 0.29 * 20 - 10.12 = -4.32. Mother is 21: Log(Odds(Father > 35)) = 0.29 * 21 - 10.12 = -4.03

  59. Log(Odds(Father > 35)) = 0.29 * Mother_Age - 10.12. Mother is 20: Log(Odds(Father > 35)) = 0.29 * 20 - 10.12 = -4.32. Mother is 21: Log(Odds(Father > 35)) = 0.29 * 21 - 10.12 = -4.03. Difference: 0.29. When the mother's age goes up by one year, the log odds of the father being 35 or older go up by 0.29.
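
The arithmetic on these slides can be reproduced directly. The sketch below also repeats the calculation at other ages (picked arbitrarily) to show that the one-year difference does not depend on where it is measured.

```python
COEF = 0.29         # coefficient from the slides
INTERCEPT = -10.12  # intercept from the slides


def log_odds(mother_age: float) -> float:
    """Log odds of the father being 35 or older, per the fitted model."""
    return COEF * mother_age + INTERCEPT


print(round(log_odds(20), 2))  # mother is 20
print(round(log_odds(21), 2))  # mother is 21

# The one-year difference equals the coefficient at every starting age.
differences = [log_odds(age + 1) - log_odds(age) for age in (20, 30, 40)]
print([round(d, 2) for d in differences])
```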
  60. What were log odds again?

  61. There is a somewhat more human-friendly metric.

  62. Logistic regression • Coefficient • Odds Ratio • Average Marginal Effect
  63. None
  64. The odds ratio is the value obtained by applying the exponential function (the inverse of log) to the coefficient. Odds ratio = exp(coefficient)
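
For the mother's age coefficient of 0.29 used in this deck, the conversion looks like this. It is a small sketch using only Python's standard `math` module.

```python
import math

coefficient = 0.29                  # mother's age coefficient from the slides
odds_ratio = math.exp(coefficient)  # odds ratio = exp(coefficient)

print("odds ratio:", round(odds_ratio, 3))

# Taking the log of the odds ratio recovers the coefficient.
recovered = math.log(odds_ratio)
print("log(odds ratio):", round(recovered, 2))
```

An odds ratio of roughly 1.3 means that each additional year of mother's age multiplies the odds by about 1.3.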

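  65. P(father is 35 or older) = logistic(a * Mother_Age + b). Adjusting the coefficient (slope) and the intercept draws a curve that matches the actual data.

  66. Logistic regression model

  67. P(father is 35 or older) = logistic(0.29 * Mother_Age - 10.12), with coefficient 0.29 and intercept -10.12.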

  68. None
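  69. Chart: probability (Father > 35) against mother's age

  70. Logistic curve

  71. Let's calculate odds from the logistic curve.

  72. Odds = probability of TRUE / probability of FALSE

  73. Odds of a premature birth. Odds = probability of TRUE / probability of FALSE. If the probability of a premature birth is 10% and the probability of no premature birth is 90%, the odds are 10 / 90 = 0.1111…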
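
The odds definition translates into a one-line function. The sketch below also includes the reverse conversion from odds back to probability, which the slides do not show but which follows from the same definition.

```python
def odds_from_probability(p: float) -> float:
    """Odds = probability of TRUE / probability of FALSE."""
    return p / (1.0 - p)


def probability_from_odds(odds: float) -> float:
    """Inverse conversion: probability = odds / (1 + odds)."""
    return odds / (1.0 + odds)


premature_odds = odds_from_probability(0.10)  # 10% TRUE, 90% FALSE
print(round(premature_odds, 4))

# Converting back returns the original probability.
print(round(probability_from_odds(premature_odds), 2))
```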
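  74. When the mother is 34, what are the odds of the father being older than 35? At mother_age (mother's age) 34: TRUE 50%, FALSE 50%.

  75. Odds at mother_age (mother's age) 34: TRUE 50%, FALSE 50%, so 50/50 = 1.

  76. Odds at mother_age (mother's age) 34: TRUE 50%, FALSE 50%, so 50/50 = 1.

  77. Odds by mother_age (mother's age). 34: TRUE 50%, FALSE 50%, odds 1. 35: TRUE 66.7%, FALSE 33.3%, odds 66.7/33.3 = 2.

  78. Odds by mother_age (mother's age). 34: TRUE 50%, FALSE 50%, odds 1. 35: TRUE 66.7%, FALSE 33.3%, odds 2. 36: TRUE 80%, FALSE 20%, odds 80/20 = 4.

  79. Odds by mother_age (mother's age). 34: TRUE 50%, FALSE 50%, odds 1. 35: TRUE 66.7%, FALSE 33.3%, odds 2. 36: TRUE 80%, FALSE 20%, odds 4. 37: TRUE 88.9%, FALSE 11.1%, odds 88.9/11.1 = 8.

  80. Odds ratio: by what factor do the odds change when the variable's value increases by 1?

  81. Odds by mother_age (mother's age). 34: TRUE 50%, FALSE 50%, odds 1. 35: TRUE 66.7%, FALSE 33.3%, odds 2. 36: TRUE 80%, FALSE 20%, odds 4. 37: TRUE 88.9%, FALSE 11.1%, odds 8. Odds ratio: 2x at each step.

  82. Odds by mother_age (mother's age). 34: TRUE 50%, FALSE 50%, odds 1. 35: TRUE 66.7%, FALSE 33.3%, odds 2. 36: TRUE 80%, FALSE 20%, odds 4. 37: TRUE 88.9%, FALSE 11.1%, odds 8. Odds ratio: 2x. When mother_age goes up by 1, the odds of TRUE double.

  83. Odds by mother_age (mother's age). 34: TRUE 50%, FALSE 50%, odds 1. 35: TRUE 66.7%, FALSE 33.3%, odds 2. 36: TRUE 80%, FALSE 20%, odds 4. 37: TRUE 88.9%, FALSE 11.1%, odds 8. Odds ratio: 2x. The logistic curve guarantees that this odds ratio is constant.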
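
The simplified numbers in this walkthrough can be reproduced with a logistic curve whose coefficient is log(2). That coefficient and the 50% crossing point at age 34 are assumptions chosen here to match the slide's rounded figures; they are not the fitted values from the real data.

```python
import math

COEF = math.log(2)  # assumed coefficient, so that exp(COEF) = 2
CENTER_AGE = 34     # assumed age at which the probability is exactly 50%


def probability(mother_age: float) -> float:
    """Logistic curve passing through 50% at CENTER_AGE (illustrative)."""
    return 1.0 / (1.0 + math.exp(-COEF * (mother_age - CENTER_AGE)))


ages = [34, 35, 36, 37]
probs = [probability(a) for a in ages]
odds = [p / (1.0 - p) for p in probs]

# Ratio of the odds at each age to the odds one year earlier.
odds_ratios = [odds[i + 1] / odds[i] for i in range(len(odds) - 1)]

for a, p, o in zip(ages, probs, odds):
    print(a, f"TRUE {100 * p:.1f}%", f"odds {o:.2f}")
print("odds ratios:", [round(r, 2) for r in odds_ratios])
```

The probabilities rise by a different amount at every step, yet the odds are multiplied by the same factor each time, which is the constancy slide 83 refers to.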
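  84. How should the odds ratio be interpreted when the variable is categorical?

  85. The predictor is the mother's race (categorical).

  86. The odds ratio for Chinese mothers is 0.5952.

  87. For a categorical variable, compare against the base level.

  88. Compared with White mothers, Chinese mothers have an odds ratio that is 0.5952 higher.

  89. Compared with White mothers, Chinese mothers have an odds ratio that is 0.5952 higher. ???

  90. Let's build a pivot table and think it through.

  91. Pivot table. Chinese: TRUE 296, FALSE 3,839. White: TRUE 39,221, FALSE 311,954.

  92. Calculate the probabilities

  93. Pivot table. Chinese: TRUE 296, FALSE 3,839. White: TRUE 39,221, FALSE 311,954. What is the probability of TRUE for Chinese mothers? 296 (TRUE) / (296 + 3,839) (total) = 0.072 (7.2%)

  94. Pivot table. Chinese: TRUE 296, FALSE 3,839. White: TRUE 39,221, FALSE 311,954. What is the probability of TRUE for White mothers? 39,221 (TRUE) / (39,221 + 311,954) (total) = 0.112 (11.2%)

  95. Calculate the odds

  96. Odds = probability of TRUE / probability of FALSE

  97. Pivot table. Chinese: TRUE 296, FALSE 3,839. White: TRUE 39,221, FALSE 311,954. What are the odds of TRUE for Chinese mothers? Probability of TRUE: 296 / (296 + 3,839) = 0.072. Probability of FALSE: 1 - 0.072 = 0.928. Odds: 0.072 / 0.928 = 0.077

  98. Pivot table. Chinese: TRUE 296, FALSE 3,839. White: TRUE 39,221, FALSE 311,954. What are the odds of TRUE for White mothers? Probability of TRUE: 39,221 / (39,221 + 311,954) = 0.112. Probability of FALSE: 1 - 0.112 = 0.888. Odds: 0.112 / 0.888 = 0.126

  99. Pivot table. Chinese: TRUE 296, FALSE 3,839, odds 0.077. White: TRUE 39,221, FALSE 311,954, odds 0.126.

  100. Pivot table. Chinese: TRUE 296, FALSE 3,839, odds 0.077. White: TRUE 39,221, FALSE 311,954, odds 0.126. What are the odds of TRUE for Chinese mothers relative to White mothers?

  101. Pivot table. Chinese: TRUE 296, FALSE 3,839, odds 0.077. White: TRUE 39,221, FALSE 311,954, odds 0.126. What are the odds of TRUE for Chinese mothers relative to White mothers? Odds ratio: 0.077 / 0.126 = 0.611

  102. Pivot table. Chinese: TRUE 296, FALSE 3,839, odds 0.077. White: TRUE 39,221, FALSE 311,954, odds 0.126. Are the odds of TRUE for Chinese mothers 0.611 times those of White mothers? Odds ratio: 0.077 / 0.126 = 0.611

  103. Pivot table. Chinese: TRUE 296, FALSE 3,839, odds 0.077. White: TRUE 39,221, FALSE 311,954, odds 0.126. Are the odds of TRUE for Chinese mothers 40% lower than those of White mothers? Odds ratio: 0.077 / 0.126 = 0.611

  104. The odds of Chinese mothers having premature babies are 40% lower than those of White mothers.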
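
The pivot-table steps above can be sketched in Python using the counts from the slides. The dictionary layout and function names are choices made for this example. Because the code does not round the intermediate probabilities to three decimals, its odds ratio comes out near 0.613 rather than the slides' 0.611.

```python
# Counts from the pivot table on the slides.
counts = {
    "Chinese": {"TRUE": 296, "FALSE": 3_839},
    "White": {"TRUE": 39_221, "FALSE": 311_954},
}


def probability_true(group: str) -> float:
    """Share of TRUE within the group."""
    c = counts[group]
    return c["TRUE"] / (c["TRUE"] + c["FALSE"])


def odds_true(group: str) -> float:
    """Odds = probability of TRUE / probability of FALSE."""
    p = probability_true(group)
    return p / (1.0 - p)


for group in counts:
    print(group, round(probability_true(group), 3), round(odds_true(group), 3))

# Odds ratio of Chinese mothers relative to the base level (White mothers).
odds_ratio = odds_true("Chinese") / odds_true("White")
print("odds ratio:", round(odds_ratio, 3))
print("percent lower:", round(100 * (1 - odds_ratio), 1))
```

An odds ratio below 1 means the odds for that category are lower than for the base level, which is how the "40% lower" wording arises.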
  105. In fact, this kind of phrasing shows up often in science-related announcements.

  106. Source: More meat, more problems: Bacon may increase breast cancer risk in Latinas. U of South Carolina News, Zen Vuong, March 3 2016. "The study found that Latina women who eat about 20 grams of bacon every day are 42% more likely to develop breast cancer than Latina women who do not eat bacon."

  107. Source: More meat, more problems: Bacon may increase breast cancer risk in Latinas. U of South Carolina News, Zen Vuong, March 3 2016. "The study found that the odds of developing breast cancer for Latina women who eat about 20 grams of bacon every day are 1.42 times those of Latina women who do not eat bacon."
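  108. Visualizing odds ratios: select Odds Ratio as the variable metric.

  109. Odds Ratio = exp(Coefficient)

  110. When the mother's age goes up by one year, the odds of the father being 35 or older become 1.3 times as high.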

  111. If odds ratios don't quite make sense to you, don't worry.

  112. There is a somewhat more intuitive metric.

  113. Logistic regression • Coefficient • Odds Ratio • Average Marginal Effect

  114. Average Marginal Effect

  115. Average Marginal Effect: a value that shows how much the probability goes up on average when the variable goes up by 1.

  116. Logistic curve

  117. None
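  118. Marginal effect: the slope of the curve at a given point.

  119. Average Marginal Effect • The marginal effect differs at each data point, so on its own it cannot serve as a single metric for a variable. • The average marginal effect averages the marginal effects over all the data to give a single metric for the variable.

  120. Marginal effect: the average of the marginal effects at all data points.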
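
One common way to compute this for a continuous predictor is to take the slope of the logistic curve at each data point, which is coefficient * p * (1 - p), and then average those slopes. The sketch below does that with the deck's fitted coefficient and intercept; the list of mother's ages is made-up sample data, and the exact procedure Exploratory uses may differ in detail.

```python
import math

COEF = 0.29         # coefficient from the slides
INTERCEPT = -10.12  # intercept from the slides


def probability(mother_age: float) -> float:
    """Model probability that the father is 35 or older."""
    return 1.0 / (1.0 + math.exp(-(COEF * mother_age + INTERCEPT)))


def marginal_effect(mother_age: float) -> float:
    """Slope of the logistic curve at this point: COEF * p * (1 - p)."""
    p = probability(mother_age)
    return COEF * p * (1.0 - p)


# Hypothetical sample of mother's ages (not the real data set).
mother_ages = [18, 22, 25, 27, 29, 31, 33, 36, 40]

effects = [marginal_effect(a) for a in mother_ages]
average_marginal_effect = sum(effects) / len(effects)

for a, e in zip(mother_ages, effects):
    print(a, round(e, 4))
print("average marginal effect:", round(average_marginal_effect, 4))
```

The slope is steepest where the probability is near 50% and flattens toward both ends, so the average depends on how the data points are distributed along the curve.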

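  121. Select Average Marginal Effect as the variable metric (the default).

  122. Average Marginal Effect: when the mother's age goes up by one year, the probability of the father being 35 or older goes up by about 3% on average.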

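  123. Statistical tests on variable influence (hypothesis testing)

  124. P Value

  125. P Value • The null hypothesis is: "This variable actually has no relationship with the value we want to predict (any apparent relationship is due to chance)." • The P value is the probability, assuming the null hypothesis holds, that the variable and the outcome would appear at least as strongly related as they do in the observed data. • If the P value is 5% or less, the null hypothesis can be rejected, so we conclude that the variable is related to the outcome.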
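
The slides do not say which test produces the P value. As one illustration only, the sketch below uses a Wald z-test, a common choice for logistic regression coefficients: divide the coefficient by its standard error and look up the two-sided tail probability of a standard normal distribution. The standard error here is a made-up number, since the deck does not report one.

```python
import math


def two_sided_p_value(coefficient: float, std_error: float) -> float:
    """Two-sided P value of a Wald z-test under a standard normal distribution."""
    z = coefficient / std_error
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal CDF Phi.
    return math.erfc(abs(z) / math.sqrt(2.0))


coefficient = 0.29  # coefficient from the slides
std_error = 0.05    # hypothetical standard error, for illustration only

p_value = two_sided_p_value(coefficient, std_error)
print("P value:", p_value)
print("reject null at 5%:", p_value <= 0.05)
```

A coefficient that is large relative to its standard error gives a small P value, which is the situation in which the null hypothesis of "no relationship" is rejected.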
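  126. So far we have looked at the case with only one predictor.

  127. Simple Logistic Regression P(y) = logistic(a * x + b)

  128. With the odds ratio: when the number of babies increases by 1, the odds of a premature birth become 13 times as high.

  129. With the average marginal effect: when the number of babies increases by 1, the probability of a premature birth goes up by 23.67% on average.

  130. With the average marginal effect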

  131. The case with multiple predictors.

  132. Multiple Logistic Regression P(y) = logistic(a1 * x1 + a2 * x2 + b)
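  133. Select multiple columns as predictors.

  134. With the odds ratio: if the values of the other variables are held constant, the odds of a premature birth become 2.68 times as high when the number of babies increases.

  135. With the average marginal effect: if the values of the other variables are held constant, the probability of a premature birth goes up by 7% on average when the number of babies increases.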
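
The phrase "held constant" can be made concrete with a two-predictor model of the form P(y) = logistic(a1 * x1 + a2 * x2 + b). In the sketch below, a1 is chosen so that exp(a1) equals the 2.68 quoted on the slides; the second predictor, its coefficient a2, and the intercept b are hypothetical values, because the deck does not list them.

```python
import math

A1 = math.log(2.68)  # coefficient for number of babies, so that exp(A1) = 2.68
A2 = 0.05            # hypothetical coefficient for a second predictor x2
B = -4.0             # hypothetical intercept


def odds(x1: float, x2: float) -> float:
    """Odds of TRUE under P(y) = logistic(A1 * x1 + A2 * x2 + B)."""
    p = 1.0 / (1.0 + math.exp(-(A1 * x1 + A2 * x2 + B)))
    return p / (1.0 - p)


# Raise x1 from 1 to 2 while holding x2 fixed at several different values.
ratios = [odds(2, x2) / odds(1, x2) for x2 in (20, 30, 40)]
print([round(r, 2) for r in ratios])
```

Whatever value the other predictor is fixed at, the odds ratio for one more baby is the same, which is what makes it a single summary number for that variable.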

  136. With the average marginal effect

  137. Q & A

  138. None
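  139. Features • No programming: because the course uses Exploratory, a UI for the R language, as its analysis tool, participants can focus 100% on learning the data science methods needed to solve business problems. • Learning analysis methods, not tool usage: through hands-on group exercises, participants acquire analysis methods they can use in the field. • Building both thinking and skills: participants acquire not only data science skills but also the thinking needed for data analysis.

  140. Contact. Email: kan@exploratory.io. Website: https://ja.exploratory.io. Boot Camp Training: https://ja.exploratory.io/training-jp. Twitter: @KanAugust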

  141. EXPLORATORY