Slide 1

Slide 1 text

Corporate data analysis — generated by Stable Diffusion XL v1.0 2024 5-6 t (WBS) 2024 5-6 t — 2024-12-16 – p.1/39

Slide 2

Slide 2 text

https://speakerdeck.com/ks91/collections/corporate-data-analysis-2024-winter

Slide 3

Slide 3 text

( )
Session  Date   Topic
1        12/2   •
2        12/2   (B A ) •
3        12/9   •
4        12/9   •
5        12/16  •
6        12/16  t •
7        12/23  2 ( ) t
8        12/23  2 ( ) t
9        1/6    P
10       1/6
11       1/20
12       1/20
13       1/27
14       1/27   W-IOI

Slide 4

Slide 4 text

( 20 25 )
1 (20 ) •
2 R ( 55 ) •
3 (32 ) •
4 (14 ) •
5 ( Git) (22 ) •
6 ( ) (24 ) •
7 (1) (25 ) •
8 (2) (25 ) •
9 R ( ) (1): Welch (17 ) •
10 R ( ) (2): (21 ) •
11 R ( ) (1): (15 ) •
12 R ( ) (2): (19 ) •
13 GPT-4 (19 ) •
14 GPT-4 (29 ) •
15 ( ) LaTeX Overleaf (40 ) •
8 (12/16 ) / (2 ) OK /

Slide 5

Slide 5 text

3 1 ( ) 2 ( ) 1 2 4 σ2 σ s2 s df

Slide 6

Slide 6 text

5 1 : 2 : 3 : 4 : 2 6 t µ 95% Student t σ 95% 95%

Slide 7

Slide 7 text


Slide 8

Slide 8 text

2. 1 2 (1) 1 2 (2) 2023 12 12 ( ) 23:59 JST ( ) Waseda Moodle (Q & A ) (1) Discord

Slide 9

Slide 9 text

. . . . . . 17 19 (12/13( ) ) ( ) 1 2 ( ) ( ) . . . . . . . . . → . . . 1 → 2 → ( ) RStudio Windows macOS

Slide 10

Slide 10 text

K (1/2) A B A A 8 C B B A ( 5% 1% )

Slide 11

Slide 11 text

K (1/2) A B 1 A B A B 2 β A B 1 2 ⇒

Slide 12

Slide 12 text

K ⇒

Slide 13

Slide 13 text

F R ⇒ ( ) ( ) ( ) : AI ( ) ( ) : ( ) R

Slide 14

Slide 14 text

H ⇒ − + = − = − ( )
Dai, A. M., Olah, C., & Le, Q. V. (2015). Document embedding with paragraph vectors. arXiv. https://arxiv.org/abs/1507.07998
Wikipedia nearest neighbours to “Lady Gaga” - “American” + “Japanese”: Ayumi Hamasaki

Slide 15

Slide 15 text

H GPT ( ) ⇒ AI AI AI

Slide 16

Slide 16 text

M ⇒ ( )

Slide 17

Slide 17 text

H Git Git zip ⇒

Slide 18

Slide 18 text

K ⇒ GitHub . . . R ( )

Slide 19

Slide 19 text

(Windows ) ...
(1) Tools → Global Options...
(2) Options: Graphics
(3) Set Graphics Device Backend to “Cairo” and click OK
Export → Save as PDF...: Options “Use cairo_pdf device”
Windows “startup.R” RStudio Source
# par(family="Japan1")
↓
par(family="Japan1")

Slide 20

Slide 20 text

5 1 : 2 : 3 : 4 : 2

Slide 21

Slide 21 text

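Normal distribution $N(\mu, \sigma^2)$ (mean $\mu$, variance $\sigma^2$)
• The curve is a probability density over $x$; the total area under it is 1
• When $x$ follows $N(\mu, \sigma^2)$:
  between $\mu - \sigma$ and $\mu + \sigma$: 68.3%
  between $\mu - 2\sigma$ and $\mu + 2\sigma$: 95.4%
  between $\mu - 3\sigma$ and $\mu + 3\sigma$: 99.7%
• $N(0, 1^2)$ is the standard normal distribution (its value is written $z$); the point that leaves 2.5% in the upper tail is $z_{0.05} \approx 1.96$
• Standardization: if $x$ follows $N(\mu, \sigma^2)$, then $z = \frac{x - \mu}{\sigma}$ follows $N(0, 1^2)$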
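The percentages and the 1.96 critical value quoted on this slide can be checked with base R's `pnorm` and `qnorm`. The following is a minimal sketch; the values of `mu`, `sigma`, and `x` in the standardization step are made up for illustration and do not come from the lecture.

```r
# Probability that x lies within k standard deviations of the mean
within_sd <- function(k) pnorm(k) - pnorm(-k)
coverage <- round(100 * sapply(1:3, within_sd), 1)
print(coverage)  # percentages for 1, 2, and 3 standard deviations

# Critical value that leaves 2.5% in the upper tail of N(0, 1^2)
z_crit <- qnorm(0.975)
print(round(z_crit, 2))

# Standardizing a value from N(mu, sigma^2); mu, sigma, x are hypothetical
mu <- 50
sigma <- 10
x <- 65
z <- (x - mu) / sigma
print(z)
```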

Slide 22

Slide 22 text

R “ t .R”

Slide 23

Slide 23 text

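Property 1: distribution of the sample mean $\bar{x}$
• If $x$ follows $N(\mu, \sigma^2)$, the sample mean $\bar{x}$ follows $N(\mu, \frac{\sigma^2}{n})$ ($n$: sample size)
• The mean stays at $\mu$; the variance becomes $\frac{1}{n}$ of, and the standard deviation $\frac{1}{\sqrt{n}}$ of, the population's
• The larger $n$ is, the narrower $N(\mu, \frac{\sigma^2}{n})$ becomes; as $n \to \infty$, $\bar{x}$ approaches $\mu$ (law of large numbers)
• $\frac{\sigma}{\sqrt{n}}$ is the standard error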
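A small simulation can illustrate the sample-mean distribution $N(\mu, \sigma^2/n)$ and the standard error $\sigma/\sqrt{n}$. This is only a sketch: the population parameters, the sample size, and the number of repetitions are arbitrary assumptions, not values from the lecture.

```r
set.seed(1)

# Hypothetical population N(mu, sigma^2) and sample size
mu <- 50
sigma <- 10
n <- 25

# Draw 10,000 samples of size n and keep the mean of each
xbar <- replicate(10000, mean(rnorm(n, mean = mu, sd = sigma)))

# Theoretical standard error sigma / sqrt(n)
se_theory <- sigma / sqrt(n)

# The simulated mean and sd of xbar should be close to mu and se_theory
print(c(mean = mean(xbar), sd = sd(xbar), se_theory = se_theory))
```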

Slide 24

Slide 24 text

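Property 2: central limit theorem
• Even when $x$ is not normally distributed, the sample mean $\bar{x}$ approximately follows $N(\mu, \frac{\sigma^2}{n})$ ($n$: sample size) when $n$ is large
• As in property 1, the mean is $\mu$, the variance is $\frac{1}{n}$ of, and the standard deviation $\frac{1}{\sqrt{n}}$ of, the population's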
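The central limit theorem can be illustrated by averaging draws from a clearly non-normal population. The sketch below assumes a Uniform(0, 1) population and a sample size of 30, both chosen only for illustration; it checks how often the sample mean lands within 1.96 standard errors of the population mean.

```r
set.seed(42)

n <- 30                  # sample size (arbitrary choice)
pop_mean <- 0.5          # mean of Uniform(0, 1)
pop_sd <- sqrt(1 / 12)   # standard deviation of Uniform(0, 1)

# Means of 10,000 samples drawn from the uniform population
xbar <- replicate(10000, mean(runif(n)))

# Standard error predicted by N(mu, sigma^2 / n)
se <- pop_sd / sqrt(n)

# Share of sample means within 1.96 standard errors of the population mean;
# a normal approximation predicts roughly 0.95
inside <- mean(abs(xbar - pop_mean) <= 1.96 * se)
print(c(sd_xbar = sd(xbar), se = se, inside = inside))
```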

Slide 25

Slide 25 text

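Property 3: sum and difference of normal variables
• Populations A and B follow $N(\mu_A, \sigma_A^2)$ and $N(\mu_B, \sigma_B^2)$
• For values $x_A$ and $x_B$ drawn from them, $(x_A + x_B)$ and $(x_A - x_B)$ also follow normal distributions (reproductive property)
• Mean of $(x_A + x_B)$: $(\mu_A + \mu_B)$
• Mean of $(x_A - x_B)$: $(\mu_A - \mu_B)$
• Variance of both $(x_A + x_B)$ and $(x_A - x_B)$: $(\sigma_A^2 + \sigma_B^2)$
• Standard deviation of both $(x_A + x_B)$ and $(x_A - x_B)$: $\sqrt{\sigma_A^2 + \sigma_B^2}$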
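The means and the common variance $\sigma_A^2 + \sigma_B^2$ of the sum and the difference can be checked by simulation. This sketch assumes two independent populations with made-up parameters (means 60 and 50, standard deviations 3 and 4), so the predicted standard deviation is $\sqrt{3^2 + 4^2} = 5$.

```r
set.seed(2)

# Independent draws from two hypothetical normal populations
xa <- rnorm(100000, mean = 60, sd = 3)
xb <- rnorm(100000, mean = 50, sd = 4)

# Means of the sum and the difference: close to 60 + 50 and 60 - 50
print(c(mean_sum = mean(xa + xb), mean_diff = mean(xa - xb)))

# Standard deviations of both: close to sqrt(3^2 + 4^2)
sd_theory <- sqrt(3^2 + 4^2)
print(c(sd_sum = sd(xa + xb), sd_diff = sd(xa - xb), sd_theory = sd_theory))
```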

Slide 26

Slide 26 text

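Property 4: distribution of the difference of sample means $(\bar{x}_A - \bar{x}_B)$ (from properties 1 and 3)
• Populations A and B follow $N(\mu_A, \sigma_A^2)$ and $N(\mu_B, \sigma_B^2)$
• For the sample means $\bar{x}_A$ and $\bar{x}_B$, the difference $(\bar{x}_A - \bar{x}_B)$ follows a normal distribution
• Mean of $(\bar{x}_A - \bar{x}_B)$: $(\mu_A - \mu_B)$
• Variance of $(\bar{x}_A - \bar{x}_B)$: $\frac{\sigma_A^2}{n_A} + \frac{\sigma_B^2}{n_B}$
• Standard deviation of $(\bar{x}_A - \bar{x}_B)$: $\sqrt{\frac{\sigma_A^2}{n_A} + \frac{\sigma_B^2}{n_B}}$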
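As a worked example of the difference of sample means, the sketch below computes its mean and standard deviation and then uses `pnorm` to find the probability that the difference is negative. All population parameters and sample sizes are hypothetical, chosen so the arithmetic is easy to follow.

```r
# Hypothetical populations and sample sizes
mu_a <- 60; sigma_a <- 3; n_a <- 9
mu_b <- 58; sigma_b <- 4; n_b <- 16

# Mean and standard deviation of (xbar_A - xbar_B)
diff_mean <- mu_a - mu_b                                # 60 - 58
diff_sd <- sqrt(sigma_a^2 / n_a + sigma_b^2 / n_b)      # sqrt(9/9 + 16/16)

# Probability that the sample mean of A falls below that of B
p_negative <- pnorm(0, mean = diff_mean, sd = diff_sd)
print(c(diff_mean = diff_mean, diff_sd = diff_sd, p_negative = p_negative))
```

Even though A has the larger population mean here, small samples leave a noticeable chance that its sample mean comes out lower.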

Slide 27

Slide 27 text

J ( p.128) “ J.R” z pnorm (( ) ) 5

Slide 28

Slide 28 text

6 t µ 95% Student t σ 95% 95% 95%

Slide 29

Slide 29 text

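95% confidence interval of $\mu$
• The 95% confidence interval of $\mu$ is an interval built so that it contains $\mu$ with 95% probability: of 20 intervals built this way, about 19 contain $\mu$
• If $\sigma$ were known, the standardized $\bar{x}$ would lie between $-z_{0.05}$ and $+z_{0.05}$ with 95% probability, and the inequality could be solved for $\mu$
• Since $\sigma$ is not known, the sample standard deviation $s$ is used in place of $\sigma$
• With $s$ in place of $\sigma$, the statistic no longer follows $N(0, 1^2)$ (unless $n$ is large), so it is written $t$ instead of $z$ and follows the t distribution
• The shape of the t distribution depends on $n$ through the degrees of freedom $df$ (notation: $t(df)$); the critical value is $t_{0.05}(df)$
• 95% confidence interval: $\left[\bar{x} - t_{0.05}(df) \times \frac{s}{\sqrt{n}},\ \bar{x} + t_{0.05}(df) \times \frac{s}{\sqrt{n}}\right]$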

Slide 30

Slide 30 text

R t “ t .R”

Slide 31

Slide 31 text

L ( p.147) µ “ L.R” p.146 t source 95%

Slide 32

Slide 32 text

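95% confidence interval of $\mu$ (1/4)
Assumption: the population follows $N(\mu, \sigma^2)$
STEP 1: Draw a sample (example: sample size 10)
STEP 2: Estimate $\mu$ with an interval that contains it 19 times out of 20
• $\bar{x}$ follows $N(\mu, \frac{\sigma^2}{n})$ [property 1] (example: $n = 10$)
• Standardizing $\bar{x}$ gives $z$, which with 95% probability (19 times out of 20) satisfies
$-z_{0.05} \le z = \frac{(\bar{x} - \mu)\sqrt{n}}{\sigma} \le +z_{0.05}$
• This inequality involves $\sigma$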

Slide 33

Slide 33 text

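The half-width of the interval is $1.96 \times \frac{\sigma}{\sqrt{n}}$, which depends on $\sigma$.
$-z_{0.05} \le \frac{(\bar{x} - \mu)\sqrt{n}}{\sigma}$ (1)
$\Rightarrow -z_{0.05} \times \sigma \le (\bar{x} - \mu)\sqrt{n}$ (2)
$\Rightarrow -z_{0.05} \times \frac{\sigma}{\sqrt{n}} \le \bar{x} - \mu$ (3)
$\Rightarrow \mu - z_{0.05} \times \frac{\sigma}{\sqrt{n}} \le \bar{x}$ (4)
$\Rightarrow \mu \le \bar{x} + z_{0.05} \times \frac{\sigma}{\sqrt{n}}$ (5)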
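The "19 times out of 20" reading of the interval $\bar{x} \pm z_{0.05} \times \sigma/\sqrt{n}$ can be illustrated by simulation. This sketch assumes a known $\sigma$ and made-up population values; it builds many such intervals and counts how often they contain $\mu$.

```r
set.seed(3)

# Hypothetical population with known sigma, and sample size 10
mu <- 50
sigma <- 10
n <- 10
z_crit <- qnorm(0.975)   # about 1.96

# For each simulated sample, check whether the interval contains mu
covers <- replicate(10000, {
  xbar <- mean(rnorm(n, mean = mu, sd = sigma))
  half_width <- z_crit * sigma / sqrt(n)
  (xbar - half_width <= mu) && (mu <= xbar + half_width)
})

# Share of intervals that contain mu; expected to be near 0.95
coverage <- mean(covers)
print(coverage)
```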

Slide 34

Slide 34 text

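95% confidence interval of $\mu$ (2/4)
• With $s$ in place of $\sigma$, the statistic does not follow $N(0, 1^2)$ (although it is close when $n$ is around 1,000 ...)
• Its distribution depends on $n$ (through the degrees of freedom $df$): the t distribution (Studentization)
• The Studentized $\bar{x}$, written $t$, with 95% probability (19 times out of 20) satisfies
$-t_{0.05}(df) \le t = \frac{(\bar{x} - \mu)\sqrt{n}}{s} \le +t_{0.05}(df)$
Solving for $\mu$:
$-t_{0.05}(df) \le \frac{(\bar{x} - \mu)\sqrt{n}}{s}$ (6)
$\Rightarrow -t_{0.05}(df) \times \frac{s}{\sqrt{n}} \le \bar{x} - \mu$ (7)
$\Rightarrow \mu \le \bar{x} + t_{0.05}(df) \times \frac{s}{\sqrt{n}}$ (upper confidence limit) (8)
$\frac{(\bar{x} - \mu)\sqrt{n}}{s} \le +t_{0.05}(df)$ (9)
$\Rightarrow \bar{x} - \mu \le +t_{0.05}(df) \times \frac{s}{\sqrt{n}}$ (10)
$\Rightarrow \bar{x} - t_{0.05}(df) \times \frac{s}{\sqrt{n}} \le \mu$ (lower confidence limit) (11)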
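How the critical value $t_{0.05}(df)$ depends on the degrees of freedom, and how it approaches the normal value for large $n$, can be seen with `qt` and `qnorm`. The degrees of freedom below are arbitrary illustrative choices (they correspond to sample sizes 10, 30, 100, and 1,000).

```r
# Degrees of freedom for sample sizes 10, 30, 100, and 1,000
dfs <- c(9, 29, 99, 999)

# Two-sided 5% critical values of the t distribution
t_crit <- qt(0.975, dfs)

# Corresponding critical value of the standard normal distribution
z_crit <- qnorm(0.975)

print(round(t_crit, 3))
print(round(z_crit, 3))
```

The t critical values are always larger than the normal one and shrink toward it as $df$ grows, which is why the t-based interval is wider for small samples.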

Slide 35

Slide 35 text

95% confidence interval of $\mu$ (3/4)
[Figure: t distribution for $df = 9$, with $t_{0.05}(9) = 2.26$; the confidence interval of the Studentized sample mean, with the upper and lower confidence limits marked]

Slide 36

Slide 36 text

95% confidence interval of $\mu$ (4/4)
Data file “ .txt”: 95% confidence interval of $\mu$ (sample size 100)
> g <- read.table(" .txt", header=TRUE)
> mean(g$ ) - abs(qt(0.025, 99)) * sd(g$ ) / sqrt(100) # lower confidence limit
> mean(g$ ) + abs(qt(0.025, 99)) * sd(g$ ) / sqrt(100) # upper confidence limit
More generally, write qt(0.025, length(g$ ) - 1) and sqrt(length(g$ )).
95% CI [55.3, 62.9]
“ L.R” sample: t.test(g$ ) 95% CI [47.9, 55.6]
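The general form using `length()` can be wrapped in a small function and compared with `t.test`. This is a sketch under stated assumptions: the function name `ci95`, the column name `score`, and the simulated data are hypothetical stand-ins, since the lecture's data file is not reproduced here.

```r
# 95% confidence interval of the mean based on the t distribution
# (ci95 is a hypothetical helper name, not from the lecture)
ci95 <- function(x) {
  n <- length(x)
  half_width <- abs(qt(0.025, n - 1)) * sd(x) / sqrt(n)
  c(lower = mean(x) - half_width, upper = mean(x) + half_width)
}

# Simulated stand-in for a data file with 100 observations
set.seed(4)
g <- data.frame(score = rnorm(100, mean = 55, sd = 15))

# Manual interval and the interval reported by t.test
ci <- ci95(g$score)
ci_ttest <- t.test(g$score)$conf.int

print(ci)
print(ci_ttest)
```

Both approaches use the same formula, so the two intervals should agree up to floating-point error.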

Slide 37

Slide 37 text


Slide 38

Slide 38 text

3. µ 95% (1) (t ) µ 95% (2) 2024 12 19 ( ) 23:59 JST ( ) Waseda Moodle (Q & A ) (1) Discord

Slide 39

Slide 39 text
