Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
組織とデータ分析/統計的仮説検定 / Organization, data analysis ...
Search
Kenji Saito
PRO
November 28, 2024
Technology
0
160
組織とデータ分析/統計的仮説検定 / Organization, data analysis and statistical hypothesis testing
早稲田大学大学院経営管理研究科「企業データ分析」2024 冬の第1-2回で使用したスライドです。
Kenji Saito
PRO
November 28, 2024
Tweet
Share
More Decks by Kenji Saito
See All by Kenji Saito
We Never Took the Kobayashi Maru Test Until Now. What Do You Think of Our Solutions? — Journeys of the Mind Through a No-Win Game
ks91
PRO
0
12
思いつきが武器になる:研究というゲームを始めよう / Ideas Are Your Equipments : Let the Game of Research Begin!
ks91
PRO
0
65
ロボットを雰囲気(ヴァイブ)でプログラミングするこどもたち / Children Vibe-Programming Robots
ks91
PRO
0
21
アカデミーキャンプ 2025 SuuuuuuMMeR「燃えろ!!ロボコン」 / Academy Camp 2025 SuuuuuuMMeR "Burn the Spirit, Robocon!!" DAY 3
ks91
PRO
0
30
アカデミーキャンプ 2025 SuuuuuuMMeR「燃えろ!!ロボコン」 / Academy Camp 2025 SuuuuuuMMeR "Burn the Spirit, Robocon!!" DAY 2
ks91
PRO
0
32
アカデミーキャンプ 2025 SuuuuuuMMeR「燃えろ!!ロボコン」 / Academy Camp 2025 SuuuuuuMMeR "Burn the Spirit, Robocon!!" DAY 1
ks91
PRO
0
160
未来へのフォワードキャスト / Forward Cast to the Future
ks91
PRO
0
86
発表と総括 / Presentations and Summary
ks91
PRO
0
61
サイバーフィジカル社会、金融の未来とアイデアソン / Cyber Physical Society, Future of Finance, and Ideathon
ks91
PRO
0
78
Other Decks in Technology
See All in Technology
生成AI時代のデータ基盤
shibuiwilliam
1
1.3k
Browser
recruitengineers
PRO
7
2.1k
Goss: New Production-Ready Go Binding for Faiss #coefl_go_jp
bengo4com
1
1.1k
事業価値と Engineering
recruitengineers
PRO
7
5.3k
Kubernetes における cgroup v2 でのOut-Of-Memory 問題の解決
pfn
PRO
0
430
【5分でわかる】セーフィー エンジニア向け会社紹介
safie_recruit
0
30k
ヘブンバーンズレッドにおける、世界観を活かしたミニゲーム企画の作り方
gree_tech
PRO
0
410
知られざるprops命名の慣習 アクション編
uhyo
11
2.8k
実践アプリケーション設計 ①データモデルとドメインモデル
recruitengineers
PRO
5
1.3k
Grafana Meetup Japan Vol. 6
kaedemalu
1
190
努力家なスクラムマスターが陥る「傍観者」という罠と乗り越えた先に信頼があった話 / 20250830 Takahiro Sasaki
shift_evolve
PRO
2
130
ガチな登山用デバイスからこんにちは
halka
1
180
Featured
See All Featured
Templates, Plugins, & Blocks: Oh My! Creating the theme that thinks of everything
marktimemedia
31
2.5k
個人開発の失敗を避けるイケてる考え方 / tips for indie hackers
panda_program
110
20k
A better future with KSS
kneath
239
17k
Visualizing Your Data: Incorporating Mongo into Loggly Infrastructure
mongodb
48
9.7k
RailsConf & Balkan Ruby 2019: The Past, Present, and Future of Rails at GitHub
eileencodes
139
34k
Refactoring Trust on Your Teams (GOTO; Chicago 2020)
rmw
34
3.1k
Unsuck your backbone
ammeep
671
58k
The Cult of Friendly URLs
andyhume
79
6.6k
We Have a Design System, Now What?
morganepeng
53
7.8k
How To Stay Up To Date on Web Technology
chriscoyier
790
250k
実際に使うSQLの書き方 徹底解説 / pgcon21j-tutorial
soudai
PRO
185
54k
Six Lessons from altMBA
skipperchong
28
4k
Transcript
Corporate data analysis — generated by Stable Diffusion XL v1.0
2024 1-2 (WBS) 2024 1-2 — 2024-12-02 – p.1/36
https://speakerdeck.com/ks91/collections/corporate-data-analysis-2024-winter 2024 1-2 — 2024-12-02 – p.2/36
( ) ( ) ( ) CSO (Chief Science Officer)
1993 ( ) 2006 ( ) SFC 24 P2P (Peer-to-Peer) 2011 ( ) 2018 2019 VR 2021.9 & VR 2022.3 2023 AI VR&RPG 2023.5 “Don’t Be So Serious” 2023 2024 AI( ) 2024 “ALOHA FROM HAWAII” 2024 2024 AI( ) → ( ) 2024 1-2 — 2024-12-02 – p.3/36
Dropbox Dropbox ( ) 2024 1-2 — 2024-12-02 – p.4/36
(B A ) 1 ( ) 2 (Wilcoxon-Mann-Whitney ) 2024
1-2 — 2024-12-02 – p.5/36
R 2024 1-2 — 2024-12-02 – p.6/36
[ ] , (2022) R R ( ) R 2024
1-2 — 2024-12-02 – p.7/36
( ) 1 12 2 • 2 12 2 (B
A ) • 3 12 9 4 12 9 5 12 16 6 12 16 t 7 12 23 2 ( ) t 8 12 23 2 ( ) t 9 1 6 P 10 1 6 11 1 20 12 1 20 13 1 27 14 1 27 W-IOI 2024 1-2 — 2024-12-02 – p.8/36
( 20 25 ) 1 (20 ) • 2 R
( 55 ) • 3 (32 ) • 4 (14 ) • 5 ( Git) (22 ) • 6 ( ) (24 ) • 7 (1) (25 ) • 8 (2) (25 ) • 9 R ( ) (1) — Welch (17 ) • 10 R ( ) (2) — (21 ) • 11 R ( ) (1) — (15 ) • 12 R ( ) (2) — (19 ) • 13 GPT-4 (19 ) • 14 GPT-4 (29 ) • 15 ( ) LaTeX Overleaf (40 ) • 8 (12/16 ) / (2 ) OK / 2024 1-2 — 2024-12-02 – p.9/36
. . . . . . ( ) ( 20
×(14+1) ) 2024 1-2 — 2024-12-02 – p.10/36
(2 )(160 ) (10∼20 ) ( ) and/or 1 (80
) 1 Q & A & (30∼40 ) (30∼40 ) 2024 1-2 — 2024-12-02 – p.11/36
Moodle ( Q&A ) ( ) Discord ( ) ←
( ) 2024 1-2 — 2024-12-02 – p.12/36
( ) A4 2 2 (Overleaf ) L ATEX PDF
( ) 2024 1-2 — 2024-12-02 – p.13/36
+ + [ ] R , (2008) R 2024 1-2
— 2024-12-02 – p.14/36
2024 1-2 — 2024-12-02 – p.15/36
= ⇒ (1) (2) (3) = ⇒ ( ) (
(2)) = ⇒ ( ) ( ) AI (← ) 2024 1-2 — 2024-12-02 – p.16/36
(observation) (sample) (random variable) (probability distribution) (population) (simple random sampling)
( )( 2 t , , ) 2 ( , ) 2024 1-2 — 2024-12-02 – p.17/36
(B A ) 1 ( ) 2 (Wilcoxon-Mann-Whitney ) 2024
1-2 — 2024-12-02 – p.18/36
1 ( ) P(X = x) = n C x
· px · (1 − p)n−x E[X] = np (1) (null hypothesis) H0 (2) (test statistic) ( x ) (3) H0 (null distribution) (4) (rejection region) ( ; 5% 1%) · (significance level) (5) ( H0 ) 2024 1-2 — 2024-12-02 – p.19/36
B ( p.47) RStudio R n C x ‘choose(n,x)’ n
= 18, x = 0 . . . choose(18,0)×0.50 × 0.518 = choose(18,0)×0.518 ( ) ⇒ ( ) 3 : : : 2024 1-2 — 2024-12-02 – p.20/36
R ( B)(1/2) — R n <- 18 # p
<- 0.5 # <- c() # ( ) # x 0 for (x in 0:n) { # <- c( , choose(n,x)*p^x*(1-p)^(n-x)) } halfp <- 0 # ( 0 1) ( ) 2024 1-2 — 2024-12-02 – p.21/36
R ( B)(2/2) — R # x 0 ( )
for (x in 0:n) { # 0.025 if (halfp + [x+1] > 0.025) { break } halfp <- halfp + [x+1] # } # color <- rep(c("red"), x) # rep 2 color <- c(color, rep(c("black"), n + 1 - x*2), color) <- 0:n # x # plot (lwd ) plot( , , type="h", lwd=3, col=color) 2024 1-2 — 2024-12-02 – p.22/36
0 5 10 15 0.00 0.05 0.10 0.15 ேᩘ ☜⋡
2024 1-2 — 2024-12-02 – p.23/36
R > binom.test(14, n=18, p=0.5) p-value (P )( 9 )
0.05 ↑ 2024 1-2 — 2024-12-02 – p.24/36
2 (Wilcoxon-Mann-Whitney ) WMW ( ) A B A B
( ) (2) U (U ) · U = min(nAnB + 1 2 nA (nA + 1) − RA, nAnB + 1 2 nB (nB + 1) − RB ) (4) ((3) ) U0.05 (5) U U0.05 2024 1-2 — 2024-12-02 – p.25/36
D ( p.70) RStudio . . . 2024 1-2 —
2024-12-02 – p.26/36
R ( D)(1/2) — GPT ChatGPT (GPT-4) R ( )
1 ( ) ⇒ GPT-4 (1/2) # calculate_rank_sum <- function(sample1, sample2) { # combined_samples <- c(sample1, sample2) sample_group <- c(rep("sample1", length(sample1)), rep("sample2", length(sample2))) # ranks <- rank(combined_samples) 2024 1-2 — 2024-12-02 – p.27/36
R ( D)(2/2) — GPT ⇒ GPT-4 (2/2) # df
<- data.frame(value = combined_samples, group = sample_group, rank = ranks) # rank_sum_sample1 <- sum(df[df$group == "sample1", "rank"]) rank_sum_sample2 <- sum(df[df$group == "sample2", "rank"]) return(list(sample1_rank_sum = rank_sum_sample1, sample2_rank_sum = rank_sum_sample2)) } # sample1 <- c(3, 1, 4) sample2 <- c(2, 5, 6) # calculate_rank_sum(sample1, sample2) 2024 1-2 — 2024-12-02 – p.28/36
GPT . . . GPT-4 . . . ‘rank(. .
.)’ RStudio Help → Search R Help ⇒ GPT GPT 3 (1) (GPT ) (2) (GPT ) (3) 2024 1-2 — 2024-12-02 – p.29/36
R ( D)(1/2) — R <- c(4.6, 5.6, 3.2, 3.2,
3.7, 4.0, 5.0, 4.6) <- c(4.6, 4.9, 7.1, 6.0, 5.2, 3.9, 5.3, 5.8) # combined_samples <- c( , ) sample_group <- c(rep(" ", length( )), rep(" ", length( ))) # ranks <- rank(combined_samples) # df <- data.frame(value = combined_samples, group = sample_group, rank = ranks) # ra <- sum(df[df$group == " ", "rank"]) rb <- sum(df[df$group == " ", "rank"]) 2024 1-2 — 2024-12-02 – p.30/36
R ( D)(2/2) — R # U na <- length(
) nb <- length( ) U <- min(na*nb + na / 2 * (na + 1) - ra, na*nb + nb / 2 * (nb + 1) - rb) print(paste("U =", U)) # paste # sdf <- data.frame( , ) # boxplot(sdf, ylim=c(0, 8.0), ylab=" ( : )") U U0.05 2024 1-2 — 2024-12-02 – p.31/36
⫧‶ ⫧‶࡛ࡣ࡞࠸ 0 2 4 6 8 ᖺ (༢:ⓒ) 2024
1-2 — 2024-12-02 – p.32/36
R WMW > wilcox.test( , ) p-value (P )( 9
) 0.05 P ↑ 2024 1-2 — 2024-12-02 – p.33/36
2024 1-2 — 2024-12-02 – p.34/36
1. (1) (2) 2024 12 5 ( ) 23:59 JST
( ) Waseda Moodle (Q & A ) 2024 1-2 — 2024-12-02 – p.35/36
2024 1-2 — 2024-12-02 – p.36/36