Normal distribution, simple statistical theory, t-distribution and confidence intervals

Slides used in sessions 5-6 of "Corporate Data Analysis" (winter 2024) at the Graduate School of Business and Finance, Waseda University.

Kenji Saito

December 14, 2024

Transcript

  1. Corporate data analysis — generated by Stable Diffusion XL v1.0

    2024 5-6 t (WBS) 2024 5-6 t — 2024-12-16 – p.1/39
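  2. ( ) Sessions and dates (month/day)

    Session 1: 12/2 •
    Session 2: 12/2 (B A ) •
    Session 3: 12/9 •
    Session 4: 12/9 •
    Session 5: 12/16 •
    Session 6: 12/16, t •
    Session 7: 12/23, 2 ( ) t
    Session 8: 12/23, 2 ( ) t
    Session 9: 1/6, P
    Session 10: 1/6
    Session 11: 1/20
    Session 12: 1/20
    Session 13: 1/27
    Session 14: 1/27
    W-IOI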
  3. ( 20 25 ) 1 (20 ) • 2 R

    ( 55 ) • 3 (32 ) • 4 (14 ) • 5 ( Git) (22 ) • 6 ( ) (24 ) • 7 (1) (25 ) • 8 (2) (25 ) • 9 R ( ) (1) — Welch (17 ) • 10 R ( ) (2) — (21 ) • 11 R ( ) (1) — (15 ) • 12 R ( ) (2) — (19 ) • 13 GPT-4 (19 ) • 14 GPT-4 (29 ) • 15 ( ) LaTeX Overleaf (40 ) • 8 (12/16 ) / (2 ) OK /
  4. 3 1 ( ) 2 ( ) 1 2 4

    $\sigma^2$, $\sigma$, $s^2$, $s$, df
  5. 5 1 : 2 : 3 : 4 : 2

    6 t µ 95% Student t σ 95% 95%
  6. 2. 1 2 (1) 1 2 (2) 2023 12 12

    ( ) 23:59 JST ( ) Waseda Moodle (Q & A ) (1) Discord
  7. . . . . . . 17 19 (12/13( )

    ) ( ) 1 2 ( ) ( ) . . . . . . . . . → . . . 1 → 2 → ( ) RStudio Windows macOS
  8. K (1/2) A B A A 8 C B B

    A ( 5% 1% )
  9. K (1/2) A B 1 A B A B 2

    β A B 1 2 ⇒
  10. F R ⇒ ( ) ( ) ( ) :

    AI ( ) ( ) : ( ) R
  11. H ⇒ − + = − = − ( )

    Wikipedia nearest neighbours to “Lady Gaga” − “American” + “Japanese”: Ayumi Hamasaki.
    Dai, A. M., Olah, C., & Le, Q. V. (2015). Document embedding with paragraph vectors. arXiv. https://arxiv.org/abs/1507.07998
  12. H GPT ( ) ⇒ AI AI AI
  13. K ⇒ GitHub . . . R ( )
  14. (Windows ) . . .

    (1) Tools → Global Options. . .
    (2) In Options, select Graphics.
    (3) Set Graphics Device Backend to “Cairo” and click OK.
    Export → Save as PDF. . . : in Options, “Use cairo_pdf device”.
    Windows, “startup.R” (RStudio, Source): change # par(family="Japan1") to par(family="Japan1").
  15. 5 1 : 2 : 3 : 4 : 2
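  16. Normal distribution $N(\mu, \sigma^2)$ (mean $\mu$, variance $\sigma^2$)

    The vertical axis is the probability density; the total area under the curve over all $x$ is 1. When $x$ follows $N(\mu, \sigma^2)$, $x$ falls between $\mu - \sigma$ and $\mu + \sigma$ with probability 68.3%, between $\mu - 2\sigma$ and $\mu + 2\sigma$ with probability 95.4%, and between $\mu - 3\sigma$ and $\mu + 3\sigma$ with probability 99.7%. $N(0, 1^2)$ is the standard normal distribution (the $z$ distribution); its upper 2.5% point is $z_{0.05} = 1.96$. Any $x$ following $N(\mu, \sigma^2)$ can be converted to $z$ by $z = \frac{x - \mu}{\sigma}$, and $z$ then follows $N(0, 1^2)$.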
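The percentages and the 1.96 critical value on this slide can be checked numerically. The following is a minimal sketch using base R's `pnorm` and `qnorm`; the values of `mu`, `sigma` and `x` are arbitrary illustrations, not taken from the slides.

```r
# Probability that x ~ N(mu, sigma^2) falls within k standard deviations of mu.
# mu and sigma are illustrative; the probabilities do not depend on them.
mu <- 50
sigma <- 10

within_k_sigma <- function(k) {
  pnorm(mu + k * sigma, mean = mu, sd = sigma) -
    pnorm(mu - k * sigma, mean = mu, sd = sigma)
}

coverage <- sapply(1:3, within_k_sigma)
print(round(100 * coverage, 1))   # percentages for 1, 2 and 3 sigma

# Upper 2.5% point of the standard normal distribution (two-sided 5%).
z_crit <- qnorm(0.975)
print(round(z_crit, 2))

# Standardization: z = (x - mu) / sigma for an illustrative x.
x <- 69.6
z <- (x - mu) / sigma
print(z)
```

The same `pnorm` call with `mean = 0, sd = 1` gives identical probabilities, which is why standardizing to $z$ loses nothing.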
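  17. Theory 1: distribution of the sample mean $\bar{x}$

    When $x$ follows $N(\mu, \sigma^2)$, the sample mean $\bar{x}$ follows $N(\mu, \frac{\sigma^2}{n})$ ($n$ is the sample size). The mean stays at the population mean $\mu$, the variance becomes $\frac{1}{n}$ of the population variance, and the standard deviation becomes $\frac{1}{\sqrt{n}}$ of the population standard deviation. The larger $n$ is, the narrower $N(\mu, \frac{\sigma^2}{n})$ becomes; as $n \to \infty$, $\bar{x}$ converges to $\mu$ (law of large numbers). $\frac{\sigma}{\sqrt{n}}$ is called the standard error.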
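A small simulation can illustrate that the sample mean has standard deviation $\frac{\sigma}{\sqrt{n}}$. This is a sketch under assumed values (`mu`, `sigma`, `n`, `n_rep` and the seed are all illustrative choices), not part of the original deck.

```r
# Simulation sketch: the mean of n draws from N(mu, sigma^2) should have
# standard deviation sigma / sqrt(n). All parameter values are illustrative.
set.seed(1)
mu <- 50
sigma <- 10
n <- 25          # sample size
n_rep <- 10000   # number of repeated samples

sample_means <- replicate(n_rep, mean(rnorm(n, mean = mu, sd = sigma)))

se_theory <- sigma / sqrt(n)   # standard error predicted by the theory
se_sim <- sd(sample_means)     # standard deviation of the simulated means

cat("mean of sample means:", mean(sample_means), "\n")
cat("theoretical SE:", se_theory, " simulated SE:", se_sim, "\n")
```

Increasing `n` shrinks both numbers together, which is the law of large numbers seen from the standard-error side.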
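  18. Theory 2: central limit theorem

    Even when $x$ does not follow a normal distribution, the sample mean $\bar{x}$ approximately follows $N(\mu, \frac{\sigma^2}{n})$ ($n$ is the sample size) when $n$ is large. As in Theory 1, the mean is the population mean $\mu$, the variance is $\frac{1}{n}$ of the population variance, and the standard deviation is $\frac{1}{\sqrt{n}}$ of the population standard deviation.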
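The central limit theorem can be illustrated with a clearly non-normal population. The sketch below assumes an exponential population with rate 1 (mean 1, standard deviation 1), chosen only for illustration; the sample size and seed are likewise arbitrary.

```r
# Simulation sketch of the central limit theorem with a skewed population.
# Assumption: population is Exp(rate = 1), which has mean 1 and sd 1.
set.seed(2)
n <- 50          # sample size
n_rep <- 10000   # number of repeated samples
pop_mean <- 1
pop_sd <- 1

sample_means <- replicate(n_rep, mean(rexp(n, rate = 1)))

se_theory <- pop_sd / sqrt(n)

# Standardize the sample means and count how many fall within +/- 1.96,
# which would be about 95% if they were exactly normal.
z <- (sample_means - pop_mean) / se_theory
frac_within <- mean(abs(z) <= qnorm(0.975))

cat("sd of sample means:", sd(sample_means), " theory:", se_theory, "\n")
cat("fraction within 1.96 SE:", frac_within, "\n")
```

A histogram of `sample_means` (for example `hist(sample_means)`) looks close to a bell curve even though the individual draws are strongly skewed.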
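  19. Theory 3: sum and difference of normally distributed values

    Suppose populations A and B follow $N(\mu_A, \sigma_A^2)$ and $N(\mu_B, \sigma_B^2)$ and are independent. For $x_A$ and $x_B$ drawn from them, the sum $(x_A + x_B)$ and the difference $(x_A - x_B)$ also follow normal distributions (reproductive property). The mean of $(x_A + x_B)$ is $(\mu_A + \mu_B)$ and the mean of $(x_A - x_B)$ is $(\mu_A - \mu_B)$. The variance of both $(x_A + x_B)$ and $(x_A - x_B)$ is $(\sigma_A^2 + \sigma_B^2)$, so the standard deviation of both is $\sqrt{\sigma_A^2 + \sigma_B^2}$.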
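That the variances add for both the sum and the difference is easy to get wrong, so a quick simulation may help. This sketch assumes independent draws with illustrative parameters ($\mu_A = 60$, $\sigma_A = 8$, $\mu_B = 50$, $\sigma_B = 6$), none of which come from the slides.

```r
# Simulation sketch: for independent normal x_a and x_b, both the sum and
# the difference have variance sigma_a^2 + sigma_b^2. Values are illustrative.
set.seed(3)
n_draws <- 100000
x_a <- rnorm(n_draws, mean = 60, sd = 8)
x_b <- rnorm(n_draws, mean = 50, sd = 6)

var_theory <- 8^2 + 6^2   # 100 for both sum and difference

results <- data.frame(
  quantity = c("xA + xB", "xA - xB"),
  mean     = c(mean(x_a + x_b), mean(x_a - x_b)),
  variance = c(var(x_a + x_b), var(x_a - x_b))
)
print(results)
cat("theoretical variance:", var_theory, "\n")
```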
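  20. Theory 4: distribution of the difference of sample means $(\bar{x}_A - \bar{x}_B)$, from Theories 1 and 3

    Suppose populations A and B follow $N(\mu_A, \sigma_A^2)$ and $N(\mu_B, \sigma_B^2)$, and $\bar{x}_A$ and $\bar{x}_B$ are the sample means of samples of sizes $n_A$ and $n_B$. Then $(\bar{x}_A - \bar{x}_B)$ follows a normal distribution with mean $(\mu_A - \mu_B)$, variance $\frac{\sigma_A^2}{n_A} + \frac{\sigma_B^2}{n_B}$, and standard deviation $\sqrt{\frac{\sigma_A^2}{n_A} + \frac{\sigma_B^2}{n_B}}$.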
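The standard deviation of the difference of two sample means can be wrapped in a small helper and compared against simulation. The function name `se_diff_means` and all parameter values below are hypothetical, introduced only for this sketch.

```r
# Standard deviation of (mean of A - mean of B) for independent samples.
# se_diff_means is a hypothetical helper name, not from the slides.
se_diff_means <- function(sigma_a, n_a, sigma_b, n_b) {
  sqrt(sigma_a^2 / n_a + sigma_b^2 / n_b)
}

# Illustrative populations and sample sizes.
set.seed(4)
mu_a <- 60; sigma_a <- 8; n_a <- 20
mu_b <- 50; sigma_b <- 6; n_b <- 30
n_rep <- 10000

diffs <- replicate(
  n_rep,
  mean(rnorm(n_a, mean = mu_a, sd = sigma_a)) -
    mean(rnorm(n_b, mean = mu_b, sd = sigma_b))
)

se_theory <- se_diff_means(sigma_a, n_a, sigma_b, n_b)
cat("mean of differences:", mean(diffs), "\n")
cat("theoretical sd:", se_theory, " simulated sd:", sd(diffs), "\n")
```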
  21. J ( p.128) “ J.R” z pnorm (( ) )

    5
  22. 6 t µ 95% Student t σ 95% 95% 95%
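  23. 95% confidence interval for $\mu$

    The 95% confidence interval for $\mu$ is an interval constructed so that it contains $\mu$ 95% of the time (19 times out of 20). If $\sigma$ were known, the standardized sample mean $\bar{x}$ would fall between $-z_{0.05}$ and $+z_{0.05}$ with 95% probability, which bounds $\mu$ (session 5). In practice $\sigma$ is unknown, so the standard deviation $s$ estimated from the sample is substituted for $\sigma$. With $s$ in place of $\sigma$, the statistic no longer follows $N(0, 1^2)$ (unless $n$ is large): $z$ becomes $t$, which follows the $t$ distribution. The shape of the $t$ distribution depends on $n$ through the degrees of freedom df (notation: $t(\mathrm{df})$), and the critical value is written $t_{0.05}(\mathrm{df})$. The 95% confidence interval is $\left[\bar{x} - t_{0.05}(\mathrm{df}) \times \frac{s}{\sqrt{n}},\ \bar{x} + t_{0.05}(\mathrm{df}) \times \frac{s}{\sqrt{n}}\right]$.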
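The interval formula above translates directly into a few lines of R. The sketch below uses a hypothetical helper `ci_mean` with df = n - 1 and simulated data (sample size, mean, standard deviation and seed are illustrative), and compares the result with base R's `t.test`.

```r
# t-based confidence interval for the population mean.
# ci_mean is a hypothetical helper name; df = n - 1.
ci_mean <- function(x, level = 0.95) {
  n <- length(x)
  t_crit <- qt(1 - (1 - level) / 2, df = n - 1)   # two-sided critical value
  half_width <- t_crit * sd(x) / sqrt(n)
  c(lower = mean(x) - half_width, upper = mean(x) + half_width)
}

# Illustrative data: 10 draws from N(50, 10^2).
set.seed(2)
x <- rnorm(10, mean = 50, sd = 10)

ci <- ci_mean(x)
ci_builtin <- as.numeric(t.test(x)$conf.int)   # same interval from t.test

print(ci)
print(ci_builtin)
```

`sd` in R divides by n - 1, which matches the sample-based estimate $s$ used in the formula.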
  24. L ( p.147) µ “ L.R” p.146 t source 95%
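  25. 95% confidence interval for $\mu$ (1/4)

    Population: $N(\mu, \sigma^2)$. STEP 1: draw a sample of size 10. STEP 2: estimate $\mu$ as a range that holds 19 times out of 20. By Theory 1, $\bar{x}$ follows $N(\mu, \frac{\sigma^2}{n})$, here with $n = 10$. Standardizing $\bar{x}$ to $z$, the following holds with 95% probability (19 times out of 20): $-z_{0.05} \le z = \frac{(\bar{x} - \mu)\sqrt{n}}{\sigma} \le +z_{0.05}$.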
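  26. 95% confidence interval for $\mu$: a margin of $1.96 \frac{\sigma}{\sqrt{n}}$

    $-z_{0.05} \le \frac{(\bar{x} - \mu)\sqrt{n}}{\sigma}$ (1)
    ⇒ $-z_{0.05} \times \sigma \le (\bar{x} - \mu)\sqrt{n}$ (2)
    ⇒ $-z_{0.05} \times \frac{\sigma}{\sqrt{n}} \le \bar{x} - \mu$ (3)
    ⇒ $\mu - z_{0.05} \times \frac{\sigma}{\sqrt{n}} \le \bar{x}$ (4)
    ⇒ $\mu \le \bar{x} + z_{0.05} \times \frac{\sigma}{\sqrt{n}}$ (5)
    This bound depends on $\sigma$.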
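The "19 times out of 20" reading of the interval with known $\sigma$ can be checked by simulation. This is a sketch with assumed values (`mu`, `sigma`, the seed and the number of repetitions are illustrative; `n = 10` follows the sample size used in the derivation).

```r
# Coverage sketch: with sigma known, the interval
# x_bar +/- 1.96 * sigma / sqrt(n) should contain mu about 95% of the time.
set.seed(5)
mu <- 50
sigma <- 10
n <- 10
n_rep <- 10000

z_crit <- qnorm(0.975)
margin <- z_crit * sigma / sqrt(n)

covers <- replicate(n_rep, {
  x_bar <- mean(rnorm(n, mean = mu, sd = sigma))
  (x_bar - margin <= mu) && (mu <= x_bar + margin)
})

coverage <- mean(covers)
cat("margin:", margin, " coverage:", coverage, "\n")
```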
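  27. 95% confidence interval for $\mu$ (2/4)

    When $s$ is used in place of $\sigma$, the statistic no longer follows $N(0, 1^2)$ (unless $n$ is very large, such as 1,000 . . . ). It follows the $t$ distribution (Student), whose shape depends on $n$ (through df). Converting $\bar{x}$ to Student's $t$, the following holds with 95% probability (19 times out of 20): $-t_{0.05}(\mathrm{df}) \le t = \frac{(\bar{x} - \mu)\sqrt{n}}{s} \le +t_{0.05}(\mathrm{df})$. Solving for $\mu$:

    $-t_{0.05}(\mathrm{df}) \le \frac{(\bar{x} - \mu)\sqrt{n}}{s}$ (6)
    ⇒ $-t_{0.05}(\mathrm{df}) \times \frac{s}{\sqrt{n}} \le \bar{x} - \mu$ (7)
    ⇒ $\mu \le \bar{x} + t_{0.05}(\mathrm{df}) \times \frac{s}{\sqrt{n}}$ (upper confidence limit) (8)
    $\frac{(\bar{x} - \mu)\sqrt{n}}{s} \le +t_{0.05}(\mathrm{df})$ (9)
    ⇒ $\bar{x} - \mu \le +t_{0.05}(\mathrm{df}) \times \frac{s}{\sqrt{n}}$ (10)
    ⇒ $\bar{x} - t_{0.05}(\mathrm{df}) \times \frac{s}{\sqrt{n}} \le \mu$ (lower confidence limit) (11)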
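To see why the critical value must change when $s$ replaces $\sigma$, one can compute how often an interval built with 1.96 would actually cover $\mu$ when the statistic follows $t(\mathrm{df})$. The sketch below uses only `qnorm`, `qt` and `pt`; the list of degrees of freedom is an illustrative choice.

```r
# Exact comparison (no simulation): if the statistic follows t(df) but the
# normal critical value 1.96 is used, the true coverage is P(|t| <= 1.96).
dfs <- c(4, 9, 29, 99, 999)   # illustrative degrees of freedom

z_crit <- qnorm(0.975)
t_crit <- qt(0.975, df = dfs)
coverage_if_z_used <- 1 - 2 * pt(-z_crit, df = dfs)

comparison <- data.frame(
  df = dfs,
  t_crit = round(t_crit, 3),
  coverage_if_z_used = round(coverage_if_z_used, 3)
)
print(comparison)
```

For small df the coverage falls noticeably below 95%, and it approaches 95% as df grows, which is consistent with the remark that the normal approximation is acceptable only for large $n$.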
  28. 95% confidence interval for $\mu$ (3/4)

    [Figure: $t$ distribution with df = 9, $t_{0.05}(9) = 2.26$, showing the lower and upper confidence limits]
  29. 95% confidence interval for $\mu$ (4/4)

    From the data in “ .txt”, find the 95% confidence interval for $\mu$ (sample size 100):

    > g <- read.table(" .txt", header=TRUE)
    > mean(g$ ) - abs(qt(0.025, 99)) * sd(g$ ) / sqrt(100) # lower limit
    > mean(g$ ) + abs(qt(0.025, 99)) * sd(g$ ) / sqrt(100) # upper limit

    More generally, use qt(0.025, length(g$ ) - 1) and sqrt(length(g$ )). 95% CI [55.3, 62.9]
    “ L.R” sample: t.test(g$ ) 95% CI [47.9, 55.6]
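Because the data file and column names are not recoverable from the transcript, the following sketch reproduces the same workflow on simulated data. The temporary file, the column name `score`, and the population parameters are all hypothetical stand-ins; only the formula (with `qt(0.025, n - 1)` and `sqrt(n)`) follows the slide.

```r
# Workflow sketch with hypothetical data: write 100 simulated values to a
# text file, read them back with read.table, and compute the 95% CI.
set.seed(6)
path <- tempfile(fileext = ".txt")   # hypothetical stand-in for the data file
write.table(
  data.frame(score = rnorm(100, mean = 55, sd = 15)),  # hypothetical column
  file = path, row.names = FALSE, quote = FALSE
)

g <- read.table(path, header = TRUE)

n <- length(g$score)
t_crit <- abs(qt(0.025, n - 1))   # generalizes qt(0.025, 99) for n = 100
lower <- mean(g$score) - t_crit * sd(g$score) / sqrt(n)
upper <- mean(g$score) + t_crit * sd(g$score) / sqrt(n)
ci_manual <- c(lower, upper)

# t.test reports the same interval.
ci_ttest <- as.numeric(t.test(g$score)$conf.int)

print(ci_manual)
print(ci_ttest)
unlink(path)   # remove the temporary file
```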
  30. 3. µ 95% (1) (t ) µ 95% (2) 2024

    12 19 ( ) 23:59 JST ( ) Waseda Moodle (Q & A ) (1) Discord