Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
SappoRo.R_roundrobin
Search
kilometer
March 18, 2023
Programming
0
160
SappoRo.R_roundrobin
第10回Sapporo.Rで喋った際のスライドです。
kilometer
March 18, 2023
Tweet
Share
More Decks by kilometer
See All by kilometer
TokyoR#111_ANOVA
kilometer
2
920
TokyoR109.pdf
kilometer
1
500
TokyoR#108_NestedDataHandling
kilometer
0
870
TokyoR#107_R_GeoData
kilometer
0
470
TokyoR#104_DataProcessing
kilometer
1
730
TokyoR#103_DataProcessing
kilometer
0
930
TokyoR#102_RMarkdown
kilometer
1
690
TokyoR#101_RegressionAnalysis
kilometer
0
510
TokyoR#99_Divergence
kilometer
1
430
Other Decks in Programming
See All in Programming
明日から始めるリファクタリング
ryounasso
0
110
Reduxモダナイズ 〜コードのモダン化を通して、将来のライブラリ移行に備える〜
pvcresin
2
680
iOSエンジニア向けの英語学習アプリを作る!
yukawashouhei
0
170
Pythonスレッドとは結局何なのか? CPython実装から見るNoGIL時代の変化
curekoshimizu
4
1.3k
猫と暮らすネットワークカメラ生活🐈 ~Vision frameworkでペットを愛でよう~ / iOSDC Japan 2025
yutailang0119
0
220
Go Conference 2025: Goで体感するMultipath TCP ― Go 1.24 時代の MPTCP Listener を理解する
takehaya
7
1.6k
そのpreloadは必要?見過ごされたpreloadが技術的負債として爆発した日
mugitti9
2
3k
Conquering Massive Traffic Spikes in Ruby Applications with Pitchfork
riseshia
0
150
CSC509 Lecture 04
javiergs
PRO
0
290
Model Pollution
hschwentner
1
180
CSC305 Lecture 01
javiergs
PRO
1
400
Goで実践するドメイン駆動開発 AIと歩み始めた新規プロダクト開発の現在地
imkaoru
4
590
Featured
See All Featured
How to Create Impact in a Changing Tech Landscape [PerfNow 2023]
tammyeverts
54
3k
Making Projects Easy
brettharned
119
6.4k
What’s in a name? Adding method to the madness
productmarketing
PRO
23
3.7k
Building an army of robots
kneath
306
46k
Helping Users Find Their Own Way: Creating Modern Search Experiences
danielanewman
30
2.9k
Fight the Zombie Pattern Library - RWD Summit 2016
marcelosomers
234
17k
A Modern Web Designer's Workflow
chriscoyier
697
190k
Navigating Team Friction
lara
189
15k
Imperfection Machines: The Place of Print at Facebook
scottboms
269
13k
Building Applications with DynamoDB
mza
96
6.6k
Why Our Code Smells
bkeepers
PRO
339
57k
Building Adaptive Systems
keathley
43
2.8k
Transcript
SappoRo.R #10 @kilometer00 2023.03.18 らくらく総当たり組み合わせ
Who!? Who?
Who!? 名前: 三村 @kilometer 職業: ポスドク (こうがくはくし) 専⾨: ⾏動神経科学(霊⻑類) 脳イメージング
医療システム⼯学 R歴: ~ 10年ぐらい 流⾏: アンキロサウルス
宣伝!!(書籍の翻訳に参加しました。) 絶賛販売中!
宣伝2!! R⾔語の地域コミュニティ@東京です。 定期的にR⾔語に関する勉強会を開催しています。 次回は4⽉22⽇!! 初⼼者特集回です!!
総当たり組み合わせ Round-robin そう あ あ く
dat_nest <- palmerpenguins::penguins %>% dplyr::group_nest(species) データを畳み込む > dat_nest # A
tibble: 3 × 2 species data <fct> <list<tibble[,7]>> 1 Adelie [152 × 7] 2 Chinstrap [68 × 7] 3 Gentoo [124 × 7] (息を吐くように)
# A tibble: 9 × 4 species.x species.y data.x data.y
<fct> <fct> <list<tibble[,7]>> <list<tibble[,7]>> 1 Adelie Adelie [152 × 7] [152 × 7] 2 Chinstrap Adelie [68 × 7] [152 × 7] 3 Gentoo Adelie [124 × 7] [152 × 7] 4 Adelie Chinstrap [152 × 7] [68 × 7] 5 Chinstrap Chinstrap [68 × 7] [68 × 7] 6 Gentoo Chinstrap [124 × 7] [68 × 7] 7 Adelie Gentoo [152 × 7] [124 × 7] 8 Chinstrap Gentoo [68 × 7] [124 × 7] 9 Gentoo Gentoo [124 × 7] [124 × 7] 総当たり組み合わせ # A tibble: 3 × 4 species.x species.y data.x data.y <fct> <fct> <list<tibble[,7]>> <list<tibble[,7]>> 1 Adelie Chinstrap [152 × 7] [68 × 7] 2 Adelie Gentoo [152 × 7] [124 × 7] 3 Chinstrap Gentoo [68 × 7] [124 × 7] 組み合わせ(combination) (round-robin)
base::expand.grid()関数 > dat_nest # A tibble: 3 × 2 species
data <fct> <list<tibble[,7]>> 1 Adelie [152 × 7] 2 Chinstrap [68 × 7] 3 Gentoo [124 × 7] dat_nest$species
grid <- dat_nest$species %>% expand.grid(., .) base::expand.grid()関数 > grid Var1
Var2 1 Adelie Adelie 2 Chinstrap Adelie 3 Gentoo Adelie 4 Adelie Chinstrap 5 Chinstrap Chinstrap 6 Gentoo Chinstrap 7 Adelie Gentoo 8 Chinstrap Gentoo 9 Gentoo Gentoo
dplyr::left_join()関数 > grid Var1 Var2 1 Adelie Adelie 2 Chinstrap
Adelie 3 Gentoo Adelie 4 Adelie Chinstrap 5 Chinstrap Chinstrap 6 Gentoo Chinstrap 7 Adelie Gentoo 8 Chinstrap Gentoo 9 Gentoo Gentoo > dat_nest # A tibble: 3 × 2 species data <fct> <list<tibble[,7]>> 1 Adelie [152 × 7] 2 Chinstrap [68 × 7] 3 Gentoo [124 × 7] ①対応づけて結合 ②対応づけて結合
dplyr::left_join()関数 dat_rr <- grid %>% tibble::as_tibble() %>% dplyr::left_join( dat_nest %>%
dplyr::rename(Var1 = "species"), by = "Var1" ) %>% dplyr::left_join( dat_nest %>% dplyr::rename(Var2 = "species"), by = "Var2" )
dplyr::left_join()関数 dat_rr <- grid %>% tibble::as_tibble() %>% dplyr::left_join( dat_nest %>%
dplyr::rename(Var1 = "species"), by = "Var1" ) %>% dplyr::left_join( dat_nest %>% dplyr::rename(Var2 = "species"), by = "Var2" ) ① ②
> dat_rr # A tibble: 9 × 4 Var1 Var2
data.x data.y <fct> <fct> <list<tibble[,7]>> <list<tibble[,7]> 1 Adelie Adelie [152 × 7] [152 × 7] 2 Chinstrap Adelie [68 × 7] [152 × 7] 3 Gentoo Adelie [124 × 7] [152 × 7] 4 Adelie Chinstrap [152 × 7] [68 × 7] 5 Chinstrap Chinstrap [68 × 7] [68 × 7] 6 Gentoo Chinstrap [124 × 7] [68 × 7] 7 Adelie Gentoo [152 × 7] [124 × 7] 8 Chinstrap Gentoo [68 × 7] [124 × 7] 9 Gentoo Gentoo [124 × 7] [124 × 7]
dplyr::rename()関数 dat_rr_rename <- dat_rr %>% rename(species.x = Var1) %>% rename(species.y
= Var2) > dat_rr_rename # A tibble: 9 × 4 species.x species.y data.x data.y <fct> <fct> <list<tibble[,7]>> <list<tibble[,7]>> 1 Adelie Adelie [152 × 7] [152 × 7] 2 Chinstrap Adelie [68 × 7] [152 × 7] 3 Gentoo Adelie [124 × 7] [152 × 7] 4 Adelie Chinstrap [152 × 7] [68 × 7] 5 Chinstrap Chinstrap [68 × 7] [68 × 7] 6 Gentoo Chinstrap [124 × 7] [68 × 7] 7 Adelie Gentoo [152 × 7] [124 × 7] 8 Chinstrap Gentoo [68 × 7] [124 × 7] 9 Gentoo Gentoo [124 × 7] [124 × 7]
dplyr::rename()関数 dat_rr_rename <- dat_rr %>% rename(species.x = Var1) %>% rename(species.y
= Var2) key <- "species" x <- stringr::str_c(key, ".x") y <- stringr::str_c(key, ".y") dat_rr_rename <- dat_rr %>% rename(!!x := Var1) %>% rename(!!y := Var2) 別解 {rlang}パッケージの演算⼦
# A tibble: 9 × 4 species.x species.y data.x data.y
<fct> <fct> <list<tibble[,7]>> <list<tibble[,7]>> 1 Adelie Adelie [152 × 7] [152 × 7] 2 Chinstrap Adelie [68 × 7] [152 × 7] 3 Gentoo Adelie [124 × 7] [152 × 7] 4 Adelie Chinstrap [152 × 7] [68 × 7] 5 Chinstrap Chinstrap [68 × 7] [68 × 7] 6 Gentoo Chinstrap [124 × 7] [68 × 7] 7 Adelie Gentoo [152 × 7] [124 × 7] 8 Chinstrap Gentoo [68 × 7] [124 × 7] 9 Gentoo Gentoo [124 × 7] [124 × 7] 総当たり組み合わせ # A tibble: 3 × 4 species.x species.y data.x data.y <fct> <fct> <list<tibble[,7]>> <list<tibble[,7]>> 1 Adelie Chinstrap [152 × 7] [68 × 7] 2 Adelie Gentoo [152 × 7] [124 × 7] 3 Chinstrap Gentoo [68 × 7] [124 × 7] 組み合わせ(combination) (round-robin)
grid <- dat_nest$species %>% expand.grid(., .) %>% subset(unclass(Var1) < unclass(Var2))
%>% tibble::as_tibble() base::subset()関数 > grid # A tibble: 3 × 2 Var1 Var2 <fct> <fct> 1 Adelie Chinstrap 2 Adelie Gentoo 3 Chinstrap Gentoo
という変換を パッケージにしました。 devtools::install_github( "kilometer0101/roundrobin" ) (4回⼿打ちしたら⾯倒臭くなったので)
roundrobin::roundrobin()関数 # A tibble: 9 × 4 species.x species.y data.x
data.y <fct> <fct> <list<tibble[,7]>> <list<tibble[,7]>> 1 Adelie Adelie [152 × 7] [152 × 7] 2 Chinstrap Adelie [68 × 7] [152 × 7] 3 Gentoo Adelie [124 × 7] [152 × 7] 4 Adelie Chinstrap [152 × 7] [68 × 7] 5 Chinstrap Chinstrap [68 × 7] [68 × 7] 6 Gentoo Chinstrap [124 × 7] [68 × 7] 7 Adelie Gentoo [152 × 7] [124 × 7] 8 Chinstrap Gentoo [68 × 7] [124 × 7] 9 Gentoo Gentoo [124 × 7] [124 × 7] library(roundrobin) palmerpenguins::penguins %>% roundrobin(key = "species")
library(roundrobin) palmerpenguins::penguins %>% roundrobin(key = "species", combination = TRUE) roundrobin::roundrobin()関数
# A tibble: 3 × 4 species.x species.y data.x data.y <fct> <fct> <list<tibble[,7]>> <list<tibble[,7]>> 1 Adelie Chinstrap [152 × 7] [68 × 7] 2 Adelie Gentoo [152 × 7] [124 × 7] 3 Chinstrap Gentoo [68 × 7] [124 × 7]
使ってみますか。
library(tidyverse) library(palmerpenguins) library(roundrobin) dat <- palmerpenguins::penguins %>% na.omit() %>% #
NA除去 mutate_at( vars(c(contains("mm"), contains("g"))), ~ (. - mean(.)) / sd(.) # 標準化 ) %>% select(species, contains("mm"), contains("g")) 前処理
> dat # A tibble: 333 × 5 species bill_length_mm
bill_depth_mm flipper_length_mm body_mass_g <fct> <dbl> <dbl> <dbl> <dbl> 1 Adelie -0.895 0.780 -1.42 -0.568 2 Adelie -0.822 0.119 -1.07 -0.506 3 Adelie -0.675 0.424 -0.426 -1.19 4 Adelie -1.33 1.08 -0.568 -0.940 5 Adelie -0.858 1.74 -0.782 -0.692 6 Adelie -0.931 0.323 -1.42 -0.723 7 Adelie -0.876 1.24 -0.426 0.581 8 Adelie -0.529 0.221 -1.35 -1.25 9 Adelie -0.986 2.05 -0.711 -0.506 10 Adelie -1.72 2.00 -0.212 0.240 # … with 323 more rows # i Use `print(n = ...)` to see more rows 前処理
dat_long <- dat %>% rowid_to_column("id") %>% pivot_longer( cols = !species,
names_to = "parameter", values_to = "value" ) %>% group_by(parameter) %>% ungroup() .y <- dat_long %>% ungroup() %>% group_by(species) %>% summarise( mean_id = mean(id), min_id = min(id) ) dat_long %>% ggplot() + aes(parameter, id) + geom_tile(aes(fill = value)) + geom_hline( yintercept = max(dat_long$id) ) + geom_hline(data = .y, aes(yintercept = min_id)) + scale_y_continuous( breaks = .y$mean_id, labels = .y$species, expand = c(0, 0)) + theme( axis.title = element_blank(), axis.text.x = element_text( angle = 30, hjust = 1 ) ) 可視化コード (ちょちょいのちょい)
可視化
> dat_rr # A tibble: 9 × 4 Var1 Var2
data.x data.y <fct> <fct> <list<tibble[,4]>> <list<tibble[,4]>> 1 Adelie Adelie [146 × 4] [146 × 4] 2 Chinstrap Adelie [68 × 4] [146 × 4] 3 Gentoo Adelie [119 × 4] [146 × 4] 4 Adelie Chinstrap [146 × 4] [68 × 4] 5 Chinstrap Chinstrap [68 × 4] [68 × 4] 6 Gentoo Chinstrap [119 × 4] [68 × 4] 7 Adelie Gentoo [146 × 4] [119 × 4] 8 Chinstrap Gentoo [68 × 4] [119 × 4] 9 Gentoo Gentoo [119 × 4] [119 × 4] 総当たり組み合わせ dat_rr <- dat %>% roundrobin(key = "species", rename = FALSE)
例えばマハラノビス距離 dat_rr_mahaD <- dat_rr %>% mutate(mahaD2 = map2( data.x, data.y,
# yに対するxの距離 ~ mahalanobis(.x, colMeans(.y), cov(.y)) )) %>% mutate(Var2 = str_c("vs. ", Var2)) > dat_rr_mahaD # A tibble: 9 × 5 Var1 Var2 data.x data.y mahaD2 <fct> <chr> <list<tibble[,4]>> <list<tibble[,4]>> <list> 1 Adelie vs. Adelie [146 × 4] [146 × 4] <dbl [146]> 2 Chinstrap vs. Adelie [68 × 4] [146 × 4] <dbl [68]> 3 Gentoo vs. Adelie [119 × 4] [146 × 4] <dbl [119]> 4 Adelie vs. Chinstrap [146 × 4] [68 × 4] <dbl [146]> 5 Chinstrap vs. Chinstrap [68 × 4] [68 × 4] <dbl [68]> 6 Gentoo vs. Chinstrap [119 × 4] [68 × 4] <dbl [119]> 7 Adelie vs. Gentoo [146 × 4] [119 × 4] <dbl [146]> 8 Chinstrap vs. Gentoo [68 × 4] [119 × 4] <dbl [68]> 9 Gentoo vs. Gentoo [119 × 4] [119 × 4] <dbl [119]>
例えばマハラノビス距離 dat_rr_mahaD <- dat_rr %>% mutate(mahaD2 = map2( data.x, data.y,
# yに対するxの距離 ~ mahalanobis(.x, colMeans(.y), cov(.y)) )) %>% mutate(Var2 = str_c("vs. ", Var2)) dat_mahaD <- dat_rr_mahaD %>% select(!data.y) %>% unnest(everything())
> dat_mahaD # A tibble: 999 × 7 Var1 Var2
bill_length_mm bill_…¹ flipp…² body_…³ mahaD2 <fct> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> 1 Adelie vs. Adelie -0.895 0.780 -1.42 -0.568 2.84 2 Adelie vs. Adelie -0.822 0.119 -1.07 -0.506 1.95 3 Adelie vs. Adelie -0.675 0.424 -0.426 -1.19 4.26 4 Adelie vs. Adelie -1.33 1.08 -0.568 -0.940 3.32 5 Adelie vs. Adelie -0.858 1.74 -0.782 -0.692 5.57 6 Adelie vs. Adelie -0.931 0.323 -1.42 -0.723 2.47 7 Adelie vs. Adelie -0.876 1.24 -0.426 0.581 5.94 8 Adelie vs. Adelie -0.529 0.221 -1.35 -1.25 5.27 9 Adelie vs. Adelie -0.986 2.05 -0.711 -0.506 7.75 10 Adelie vs. Adelie -1.72 2.00 -0.212 0.240 15.2 # … with 989 more rows, and abbreviated variable names # ¹bill_depth_mm, ²flipper_length_mm, ³body_mass_g # ℹ Use `print(n = ...)` to see more rows 例えばマハラノビス距離
例えばマハラノビス距離 ggplot(dat_mahaD) + aes(mahaD2, color = Var1, fill = Var1)
+ geom_density(alpha = 0.5) + facet_wrap(~Var2)
総当たり組み合わせ Round-robin そう あ あ く devtools::install_github( "kilometer0101/roundrobin" )
Enjoy!