Regression Analysis, Large Language Models and Statistics

Slides used in sessions 13-14 of "Corporate Data Analysis" (Winter 2024) at the Waseda University Graduate School of Business and Finance.

Kenji Saito

January 25, 2025

Transcript

  1. Corporate data analysis — generated by Stable Diffusion XL v1.0

    2024 13-14 (WBS) 2024 13-14 — 2025-01-27 – p.1/36
  2. Schedule (session: date)

    • 1: 12/2
    • 2: 12/2 (B A )
    • 3: 12/9
    • 4: 12/9
    • 5: 12/16
    • 6: 12/16, t
    • 7: 12/23, 2 ( ) t
    • 8: 12/23, 2 ( ) t
    • 9: 1/6, P
    • 10: 1/6
    • 11: 1/20
    • 12: 1/20
    • 13: 1/27
    • 14: 1/27
    • W-IOI
  3. ( 20 25 )

    • 1 (20 )
    • 2 R ( 55 )
    • 3 (32 )
    • 4 (14 )
    • 5 ( Git) (22 )
    • 6 ( ) (24 )
    • 7 (1) (25 )
    • 8 (2) (25 )
    • 9 R ( ) (1): Welch (17 )
    • 10 R ( ) (2) (21 )
    • 11 R ( ) (1) (15 )
    • 12 R ( ) (2) (19 )
    • 13 GPT-4 (19 )
    • 14 GPT-4 (29 )
    • 15 ( ) LaTeX Overleaf (40 )
    • 8 (12/16 ) / (2 ) OK /
  4. Session 11: 2 t; FWER (Family-Wise Error Rate); Bonferroni ( 2 t ) / Tukey-Kramer, q

    Session 12: r, $s_{xy}$ / vs.
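The FWER and Bonferroni items recapped above can be illustrated with a little arithmetic. The sketch below is in Python rather than the R used in the course, assumes the tests are independent (an assumption made here for illustration), and uses a hypothetical count of six pairwise comparisons among four groups.

```python
def familywise_error_rate(alpha: float, m: int) -> float:
    """Chance of at least one false positive across m independent tests."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_alpha(alpha: float, m: int) -> float:
    """Per-test significance level under the Bonferroni correction."""
    return alpha / m


# Hypothetical setting: all pairwise comparisons among 4 groups (4 * 3 / 2 = 6).
m = 6
fwer_uncorrected = familywise_error_rate(0.05, m)  # each test run at 0.05
per_test_alpha = bonferroni_alpha(0.05, m)         # stricter per-test level
fwer_corrected = familywise_error_rate(per_test_alpha, m)

print(f"uncorrected FWER: {fwer_uncorrected:.4f}")
print(f"Bonferroni per-test alpha: {per_test_alpha:.4f}")
print(f"corrected FWER: {fwer_corrected:.4f}")
```

With six uncorrected tests the family-wise rate is roughly 0.26, while the Bonferroni level keeps it below 0.05.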
  5. Session 13 / ( ): 2; $r^2$; β; 95% interval for β; 95% interval ( ) for $E[y|x]$; 95% interval ( ) for y; GPT-4
  6. 6. (1) ( ) (2) 2025-01-23 ( ) 23:59 JST ( ); Waseda Moodle (Q & A ) (1) Discord
  7. . . . . . . . 17 17 (1/24( ) ) ( ) → 16 ( ) → 10 ( ) ( ) → 1 → 5 :
  8. ( ) r . . . S xy ⇒ r S xy . . .
  9. H (1/n) 2 $H_0$ n 1 $x_1$ $y_1$ $(x_1, y_1)$ ∼ $(x_n, y_n)$ n
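  10. H (2/n) (n = 2) n Pearson r

    $\mathrm{mean}_x$: mean of $x_1, \ldots, x_n$; $\mathrm{mean}_y$ likewise for y.

    $$r = \frac{\sum (x_i - \mathrm{mean}_x)(y_i - \mathrm{mean}_y)}{\sqrt{\sum (x_i - \mathrm{mean}_x)^2 \, \sum (y_i - \mathrm{mean}_y)^2}}$$

    $r$ ranges from $-1$ to $1$: near 1 when $x_i - \mathrm{mean}_x$ and $y_i - \mathrm{mean}_y$ tend to share a sign, near -1 when they tend to have opposite signs, near 0 when neither tendency holds.
    Given $n$ and $\alpha$, Pearson $r$: $r > r_\alpha \Rightarrow$ ( )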
  11. M ⇒ . . . ( ) (25 5 )
  12. K R Excel R R ⇒ Excel 1,048,576 R Excel R
  13. Session 13 / ( ): 2; $r^2$; β; 95% interval for β; 95% interval ( ) for $E[y|x]$; 95% interval ( ) for y
  14. (1/4) x y ( ) x y

    $E[y|x] = \alpha + \beta x$ ($E[y|x]$, $\beta$) x y x y $E[y|x]$ $\sigma$ x 2 $\alpha$ $\beta$ ( )
    $\hat{y} = a + bx$ ($b$); $a$ and $b$ are chosen to minimize $SS_{residual}$:

    $$SS_{residual} = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
  15. (2/4) b

    $$b = r\,\frac{s_y}{s_x} = \frac{s_{xy}}{s_x^2} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$

    $$a = \bar{y} - b\bar{x}$$

    ($\bar{y} = a + b\bar{x}$: the line passes through $(\bar{x}, \bar{y})$)
    $r^2$: ××% △△
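The slope and intercept formulas above can be checked numerically. The following is a minimal Python sketch (the course uses R's lm()); `fit_line` and the data are hypothetical, and $r^2$ is computed here as 1 minus the residual share of the total sum of squares.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for y_hat = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    b = s_xy / ss_x          # slope
    a = mean_y - b * mean_x  # intercept, so the line passes through the means
    return a, b


# Hypothetical data, not taken from the slides.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = fit_line(xs, ys)

mean_y = sum(ys) / len(ys)
ss_residual = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
ss_total = sum((y - mean_y) ** 2 for y in ys)
r_squared = 1 - ss_residual / ss_total

print(f"y_hat = {a:.2f} + {b:.2f} x, r^2 = {r_squared:.2f}")
```

Here b = 6 / 10 = 0.6 and a = 4 - 0.6 * 3 = 2.2, and the fitted line explains 60% of the variation in y.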
  16. (“ .txt” )

    [Figure: scatter plot with fitted line $y = 16.16 - 0.12x$ ($r = -0.352$); x axis "extracurricular exercise time" (0 to 50), y axis "short distance" (10 to 18); regions labeled "interpolation" and "extrapolation".]
    52 10 . . . ( )
  17. (3/4) x

    $$SS_x = \sum_{i=1}^{n} (x_i - \bar{x})^2$$

    $$MS_{residual} = \frac{SS_{residual}}{df_{residual}} = \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n - 2}$$

    b: $H_0: \beta = 0$ ( β ), $H_A: \beta \neq 0$ ( β ); Student t:

    $$t = \frac{b}{\sqrt{MS_{residual} / SS_x}} = r\sqrt{\frac{n - 2}{1 - r^2}}$$
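The slide gives two expressions for the same t statistic, one from the slope and one from r. A short Python sketch (hypothetical data and function name; the course uses R) computes both and shows that they agree.

```python
import math


def slope_t_statistics(xs, ys):
    """Return t for H0: beta = 0, computed from b and again from r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = s_xy / ss_x
    a = mean_y - b * mean_x
    ss_residual = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ms_residual = ss_residual / (n - 2)  # residual df is n - 2
    r = s_xy / math.sqrt(ss_x * ss_y)
    t_from_b = b / math.sqrt(ms_residual / ss_x)
    t_from_r = r * math.sqrt((n - 2) / (1 - r ** 2))
    return t_from_b, t_from_r


# Hypothetical data, not taken from the slides.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
t_from_b, t_from_r = slope_t_statistics(xs, ys)
print(f"t from b: {t_from_b:.4f}, t from r: {t_from_r:.4f}")
```

The statistic is compared against Student's t distribution with n - 2 degrees of freedom, here 3.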
  18. (4/4) β 95%:

    $$b \pm t_{0.05}(n - 2)\sqrt{\frac{MS_{residual}}{SS_x}}$$

    $E[y|x]$ 95% ( ):

    $$\hat{y} \pm t_{0.05}(n - 2)\sqrt{MS_{residual}\left(\frac{1}{n} + \frac{(x - \bar{x})^2}{SS_x}\right)}$$

    y 95% ( ):

    $$\hat{y} \pm t_{0.05}(n - 2)\sqrt{MS_{residual}\left(1 + \frac{1}{n} + \frac{(x - \bar{x})^2}{SS_x}\right)}$$
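The three intervals above differ only in the term under the square root. The Python sketch below (hypothetical data and names; the course uses R) takes the critical value $t_{0.05}(n - 2)$ as an input. The value 3.182 for 3 degrees of freedom is the usual two-sided 5% table entry and is supplied here as an assumption rather than computed.

```python
import math


def regression_intervals(xs, ys, x0, t_crit):
    """Intervals for beta, for E[y|x0], and for a new y at x0.

    t_crit is the two-sided critical value of t with n - 2 degrees of
    freedom, looked up by the caller.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = s_xy / ss_x
    a = mean_y - b * mean_x
    ss_residual = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ms_residual = ss_residual / (n - 2)
    y_hat = a + b * x0
    h = 1 / n + (x0 - mean_x) ** 2 / ss_x  # grows as x0 moves away from mean_x
    half_beta = t_crit * math.sqrt(ms_residual / ss_x)
    half_mean = t_crit * math.sqrt(ms_residual * h)
    half_pred = t_crit * math.sqrt(ms_residual * (1 + h))
    return {
        "beta": (b - half_beta, b + half_beta),
        "mean": (y_hat - half_mean, y_hat + half_mean),
        "prediction": (y_hat - half_pred, y_hat + half_pred),
    }


# Hypothetical data; n = 5, so n - 2 = 3 degrees of freedom.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
T_CRIT_DF3 = 3.182  # assumed table value for a two-sided 5% level, df = 3
intervals = regression_intervals(xs, ys, x0=3, t_crit=T_CRIT_DF3)
for name, (low, high) in intervals.items():
    print(f"{name}: ({low:.3f}, {high:.3f})")
```

The interval for a new y is always wider than the one for $E[y|x]$ because of the extra 1 under the square root, and both widen as x moves away from $\bar{x}$.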
  19. Y ( p.299) “ Y.R” ( cor.test(), lm() )
  20. m $P(w_1, \ldots, w_m)$ (Wikipedia)

    1 (Wikipedia) : ( ) ← (Generative Pre-training) : ( ) ( )
  21. ChatGPT GPT ChatGPT GPT OpenAI (GPT-3.5, GPT-4) GPT Generative Pre-trained Transformer ( ) (deep learning) a GPT ( ) GPT-3.5, GPT-4 a : ( )
  22. ( ) ↓ ELSIE PREPARE TO MEET THY GOD

    ( ( )) e t h ← Wikipedia
    “re” (2 ), “e[ ]” (2 )
    “th”, “the”: “th” 1 “th-e”, “art-ific-ial”
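The counts on this slide (“re” twice, “e[ ]” twice) match the number of times those adjacent character pairs occur in the example sentence. The Python sketch below reproduces that count. Treating this as the first step of a pair-merging tokenizer is an interpretation made here, not something the surviving slide text states, and `count_adjacent_pairs` is a hypothetical helper.

```python
from collections import Counter


def count_adjacent_pairs(text):
    """Count every pair of adjacent characters, case-insensitively."""
    text = text.lower()
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


pairs = count_adjacent_pairs("ELSIE PREPARE TO MEET THY GOD")

# Show the pairs that occur more than once; a space is printed as [ ].
for pair, count in pairs.most_common():
    if count > 1:
        print(pair.replace(" ", "[ ]"), count)
```

A frequent pair such as “th” could then be treated as a single unit, which is how a split like “th-e” would arise.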
  23. (p = 0.xx) (p = 0.yy) . . . GPT ( ) GPT
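  24. ( → → ) OpenAI API

    [Figure: Completions, Chat, Assistants, viewed two ways: as a hierarchy of functions, and as a widening of what can be done.
    Completions: writes the continuation. Chat: can also chat; applies the completion function. Assistants: what kind of counterpart it is can also be programmed in advance; applies the dialogue function. The label "to be discontinued" appears twice.]
    API: Application Programming Interface ( )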
  25. A X M. MIT 2013 LEGO X A A X A X X X (2013 ) ⇒ 10 Open Interpreter A . . . 1
  26. V ( p.248) GPT-4 R 4 19 4 4 ( 1 4) R 2 3 Tukey-Kramer sample1 sample4 "data.txt" R R R ( ) ( ) 3(a) 1(ab) 2(bc) 4(c) ( ) cld() plot()
  27. 7. (1) R LaTeX (Overleaf) 2 A4 2

    (2) 2025-02-06 ( ) 23:59 JST ( ); Waseda Moodle (Q & A ) Overleaf (read-only OK) · https://www.overleaf.com/read/tfbbnvhqfkqm#609f5b
  28. LaTeX Overleaf LaTeX ( or )

    ( ) arXiv.org LaTeX LaTeX ( ) PDF GPT-4 LaTeX PAT (Paper Authoring Tutor; ) Overleaf LaTeX https://www.overleaf.com Google