Slide 1

Slide 1 text

Corporate data analysis — generated by Stable Diffusion XL v1.0 2024 1-2 (WBS) 2024 1-2 — 2024-12-02 – p.1/36

Slide 2

Slide 2 text

https://speakerdeck.com/ks91/collections/corporate-data-analysis-2024-winter

Slide 3

Slide 3 text

( ) ( ) ( ) CSO (Chief Science Officer) 1993 ( ) 2006 ( ) SFC 24 P2P (Peer-to-Peer) 2011 ( ) 2018 2019 VR 2021.9 & VR 2022.3 2023 AI VR&RPG 2023.5 “Don’t Be So Serious” 2023 2024 AI( ) 2024 “ALOHA FROM HAWAII” 2024 2024 AI( ) → ( )

Slide 4

Slide 4 text

Dropbox Dropbox ( )

Slide 5

Slide 5 text

(B A ) 1 ( ) 2 (Wilcoxon-Mann-Whitney )

Slide 6

Slide 6 text

R

Slide 7

Slide 7 text

[ ] , (2022) R R ( ) R

Slide 8

Slide 8 text

No.  Date   Topic
1    12/2   •
2    12/2   (B A ) •
3    12/9
4    12/9
5    12/16
6    12/16  t
7    12/23  2 ( ) t
8    12/23  2 ( ) t
9    1/6    P
10   1/6
11   1/20
12   1/20
13   1/27
14   1/27   W-IOI

Slide 9

Slide 9 text

( 20 25 )
1 (20 ) •
2 R ( 55 ) •
3 (32 ) •
4 (14 ) •
5 ( Git) (22 ) •
6 ( ) (24 ) •
7 (1) (25 ) •
8 (2) (25 ) •
9 R ( ) (1) - Welch (17 ) •
10 R ( ) (2) - (21 ) •
11 R ( ) (1) - (15 ) •
12 R ( ) (2) - (19 ) •
13 GPT-4 (19 ) •
14 GPT-4 (29 ) •
15 ( ) LaTeX Overleaf (40 ) •
8 (12/16 ) / (2 ) OK /

Slide 10

Slide 10 text

. . . . . . ( ) ( 20 ×(14+1) )

Slide 11

Slide 11 text

(2 )(160 ) (10∼20 ) ( ) and/or 1 (80 ) 1 Q & A & (30∼40 ) (30∼40 )

Slide 12

Slide 12 text

Moodle ( Q&A ) ( ) Discord ( ) ← ( )

Slide 13

Slide 13 text

( ) A4 2 2 (Overleaf ) LaTeX PDF ( )

Slide 14

Slide 14 text

+ + [ ] R , (2008) R

Slide 15

Slide 15 text


Slide 16

Slide 16 text

⇒ (1) (2) (3) ⇒ ( ) ( (2)) ⇒ ( ) ( ) AI (← )

Slide 17

Slide 17 text

(observation) (sample) (random variable) (probability distribution) (population) (simple random sampling) ( )( 2 t , , ) 2 ( , )

Slide 18

Slide 18 text

(B A ) 1 ( ) 2 (Wilcoxon-Mann-Whitney )

Slide 19

Slide 19 text

1 ( )
$P(X = x) = {}_n C_x \cdot p^x \cdot (1 - p)^{n - x}$, $E[X] = np$
(1) Null hypothesis $H_0$
(2) Test statistic ( x )
(3) Null distribution ( $H_0$ )
(4) Rejection region ( ; 5% 1%) · significance level
(5) ( $H_0$ )

Slide 20

Slide 20 text

B ( p.47) RStudio
In R, ${}_n C_x$ is ‘choose(n,x)’.
n = 18, x = 0 . . . choose(18,0) × 0.5^0 × 0.5^18 = choose(18,0) × 0.5^18
( ) ⇒ ( ) 3 : : :
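As a quick check of the binomial formula, the following sketch evaluates it with choose() for every x from 0 to 18 and compares the result with R's built-in dbinom(). The variable names (pmf_manual, pmf_builtin, mean_x) are illustrative and do not come from the slides.

```r
n <- 18
p <- 0.5
x <- 0:n

# Binomial probabilities written out with choose(), as in the formula
pmf_manual <- choose(n, x) * p^x * (1 - p)^(n - x)

# The same probabilities from the built-in density function
pmf_builtin <- dbinom(x, size = n, prob = p)

print(all.equal(pmf_manual, pmf_builtin))  # do the two versions agree?
print(sum(pmf_manual))                     # the probabilities should sum to 1
mean_x <- sum(x * pmf_manual)              # expected value E[X]
print(mean_x)                              # should match n * p
```

The first element of pmf_manual is the x = 0 case worked out above, choose(18,0) × 0.5^18.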

Slide 21

Slide 21 text

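R ( B)(1/2): R

# prob and count are stand-in variable names
n <- 18        # number of trials
p <- 0.5       # probability under the null hypothesis
prob <- c()    # vector of probabilities (starts empty)
# Loop over x from 0 to n
for (x in 0:n) {
  # Append P(X = x) to the vector
  prob <- c(prob, choose(n,x)*p^x*(1-p)^(n-x))
}
halfp <- 0     # cumulative probability in the lower tail

Slide 22

Slide 22 text

R ( B)(2/2): R

# Loop over x from 0 upward (lower tail)
for (x in 0:n) {
  # Stop once the cumulative probability would exceed 0.025
  if (halfp + prob[x+1] > 0.025) {
    break
  }
  halfp <- halfp + prob[x+1]   # R indices start at 1, hence x+1
}
# Colors: red for the rejection region, black elsewhere
color <- rep(c("red"), x)   # rep repeats its first argument as many times as the second says
color <- c(color, rep(c("black"), n + 1 - x*2), color)
count <- 0:n                # x-axis values
# Draw vertical bars with plot (lwd sets the line width)
plot(count, prob, type="h", lwd=3, col=color)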
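The loop above finds where the lower 2.5% tail ends. As a cross-check, a minimal sketch using the built-in quantile and distribution functions qbinom() and pbinom() is shown here. The names alpha, lower_max, upper_min and tail_prob are illustrative, and the symmetry shortcut assumes p = 0.5.

```r
n <- 18
p <- 0.5
alpha <- 0.05

# qbinom() returns the smallest x whose cumulative probability reaches alpha / 2,
# so the lower part of the rejection region ends one step before it.
lower_max <- qbinom(alpha / 2, size = n, prob = p) - 1

# With p = 0.5 the distribution is symmetric, so the upper part mirrors the lower part.
upper_min <- n - lower_max

# Probability mass inside the lower part of the rejection region
tail_prob <- pbinom(lower_max, size = n, prob = p)

cat("reject H0 if x <=", lower_max, "or x >=", upper_min, "\n")
cat("probability in each tail:", tail_prob, "\n")
```

For n = 18 and p = 0.5 this gives the same boundary as the loop: the lower red bars cover x = 0 to 4 and the upper red bars cover x = 14 to 18.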

Slide 23

Slide 23 text

[Figure: plot produced by the code for exercise B. x-axis: number of people (ticks 0 to 15); y-axis: probability (ticks 0.00 to 0.15)]

Slide 24

Slide 24 text

R
> binom.test(14, n=18, p=0.5)
p-value (P )( 9 ) 0.05 ↑
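To see where the P value reported by binom.test() comes from, the sketch below adds up the tail probabilities by hand. It relies on the distribution being symmetric, which holds here because p = 0.5. The names observed, p_manual and p_test are illustrative.

```r
n <- 18
p <- 0.5
observed <- 14

# Two-sided P value: twice the upper-tail probability P(X >= 14),
# valid here because the null distribution is symmetric (p = 0.5).
p_manual <- 2 * sum(dbinom(observed:n, size = n, prob = p))

# The same quantity as reported by the built-in test
p_test <- binom.test(observed, n = n, p = p)$p.value

print(p_manual)
print(p_test)
```

Both values are about 0.031, which is below 0.05.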

Slide 25

Slide 25 text

2 (Wilcoxon-Mann-Whitney ) WMW ( ) A B A B ( )
(2) U (U ) · $U = \min\left(n_A n_B + \frac{1}{2} n_A (n_A + 1) - R_A,\; n_A n_B + \frac{1}{2} n_B (n_B + 1) - R_B\right)$
(4) ((3) ) $U_{0.05}$
(5) $U$ $U_{0.05}$

Slide 26

Slide 26 text

D ( p.70) RStudio . . .

Slide 27

Slide 27 text

R ( D)(1/2): GPT ChatGPT (GPT-4) R ( ) 1 ( ) ⇒ GPT-4 (1/2)

# Function that computes the rank sum of each sample
calculate_rank_sum <- function(sample1, sample2) {
  # Combine the samples and record which group each value came from
  combined_samples <- c(sample1, sample2)
  sample_group <- c(rep("sample1", length(sample1)),
                    rep("sample2", length(sample2)))
  # Rank the combined values
  ranks <- rank(combined_samples)

Slide 28

Slide 28 text

R ( D)(2/2): GPT ⇒ GPT-4 (2/2)

  # Put values, group labels and ranks into one data frame
  df <- data.frame(value = combined_samples, group = sample_group, rank = ranks)
  # Sum the ranks within each group
  rank_sum_sample1 <- sum(df[df$group == "sample1", "rank"])
  rank_sum_sample2 <- sum(df[df$group == "sample2", "rank"])
  return(list(sample1_rank_sum = rank_sum_sample1,
              sample2_rank_sum = rank_sum_sample2))
}
# Sample data
sample1 <- c(3, 1, 4)
sample2 <- c(2, 5, 6)
# Call the function
calculate_rank_sum(sample1, sample2)
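For the toy samples c(3, 1, 4) and c(2, 5, 6), the rank sums are small enough to verify by hand and then plug into the U formula. The sketch below repeats the ranking step inline so that it runs on its own. The names r1, r2, n1 and n2 are illustrative.

```r
sample1 <- c(3, 1, 4)
sample2 <- c(2, 5, 6)

# Rank all six values together, then split the ranks back by sample
ranks <- rank(c(sample1, sample2))
r1 <- sum(ranks[seq_along(sample1)])    # ranks 3 + 1 + 4
r2 <- sum(ranks[-seq_along(sample1)])   # ranks 2 + 5 + 6

n1 <- length(sample1)
n2 <- length(sample2)

# U statistic: the smaller of the two candidate values
U <- min(n1 * n2 + n1 * (n1 + 1) / 2 - r1,
         n1 * n2 + n2 * (n2 + 1) / 2 - r2)

print(c(r1 = r1, r2 = r2, U = U))
```

Here the rank sums are 8 and 13, so the two candidates are 9 + 6 - 8 = 7 and 9 + 6 - 13 = 2, and U = 2.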

Slide 29

Slide 29 text

GPT . . . GPT-4 . . . ‘rank(. . .)’ RStudio Help → Search R Help ⇒ GPT GPT 3 (1) (GPT ) (2) (GPT ) (3)

Slide 30

Slide 30 text

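R ( D)(1/2): R

# group_a and group_b are stand-in variable names for samples A and B
group_a <- c(4.6, 5.6, 3.2, 3.2, 3.7, 4.0, 5.0, 4.6)
group_b <- c(4.6, 4.9, 7.1, 6.0, 5.2, 3.9, 5.3, 5.8)
# Combine the samples and record which group each value came from
combined_samples <- c(group_a, group_b)
sample_group <- c(rep("A", length(group_a)), rep("B", length(group_b)))
# Rank the combined values
ranks <- rank(combined_samples)
# Put values, group labels and ranks into one data frame
df <- data.frame(value = combined_samples, group = sample_group, rank = ranks)
# Rank sum of each group
ra <- sum(df[df$group == "A", "rank"])
rb <- sum(df[df$group == "B", "rank"])

Slide 31

Slide 31 text

R ( D)(2/2): R

# Compute U
na <- length(group_a)
nb <- length(group_b)
U <- min(na*nb + na / 2 * (na + 1) - ra, na*nb + nb / 2 * (nb + 1) - rb)
print(paste("U =", U))   # paste joins its arguments into one string
# Data frame with one column per group
sdf <- data.frame(group_a, group_b)
# Box plot of the two groups
boxplot(sdf, ylim=c(0, 8.0), ylab="Annual income (unit: million yen)")

$U$ $U_{0.05}$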
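The slide compares U with a critical value U0.05. Where no printed table is at hand, one way to obtain it is from R's exact distribution function pwilcox(), as in the sketch below. This is an illustration only: the exact distribution assumes no ties, whereas these data contain ties, and the names group_a, group_b, u_values and u_crit are assumptions rather than names from the slides.

```r
group_a <- c(4.6, 5.6, 3.2, 3.2, 3.7, 4.0, 5.0, 4.6)
group_b <- c(4.6, 4.9, 7.1, 6.0, 5.2, 3.9, 5.3, 5.8)
na <- length(group_a)
nb <- length(group_b)

# Rank sums and U, computed as in the exercise
ranks <- rank(c(group_a, group_b))
ra <- sum(ranks[1:na])
rb <- sum(ranks[-(1:na)])
U <- min(na * nb + na * (na + 1) / 2 - ra,
         na * nb + nb * (nb + 1) / 2 - rb)

# Two-sided 5% critical value: the largest u whose lower-tail probability
# under H0 does not exceed alpha / 2 (exact distribution, assumes no ties)
alpha <- 0.05
u_values <- 0:(na * nb)
u_crit <- max(u_values[pwilcox(u_values, na, nb) <= alpha / 2])

print(paste("U =", U))
print(paste("critical value =", u_crit))
print(U <= u_crit)   # TRUE means H0 is rejected at the 5% level
```

The rank sums for these data are 48 and 88, which gives U = 12.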

Slide 32

Slide 32 text

[Figure: box plots of the two groups (labels: satisfied, not satisfied). y-axis: annual income (unit: million yen), ticks 0 to 8]

Slide 33

Slide 33 text

R WMW
> wilcox.test(group_a, group_b)   # the two samples from exercise D
p-value (P )( 9 ) 0.05 P ↑
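A minimal sketch of the same call with the exercise data follows. The names group_a, group_b and result are illustrative. Because the data contain ties, R cannot compute an exact P value, so exact = FALSE is passed to request the normal approximation explicitly.

```r
group_a <- c(4.6, 5.6, 3.2, 3.2, 3.7, 4.0, 5.0, 4.6)
group_b <- c(4.6, 4.9, 7.1, 6.0, 5.2, 3.9, 5.3, 5.8)

# exact = FALSE: use the normal approximation, since the data contain ties
result <- wilcox.test(group_a, group_b, exact = FALSE)

print(result$statistic)   # W: rank sum of the first sample minus na * (na + 1) / 2
print(result$p.value)     # compare with 0.05
```

The statistic W that R reports always equals either U or na * nb - U, where U is the minimum from the slide formula. For these data W coincides with U.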

Slide 34

Slide 34 text


Slide 35

Slide 35 text

1. (1) (2) 2024-12-05 ( ) 23:59 JST ( ) Waseda Moodle (Q & A )

Slide 36

Slide 36 text
