Slide 1

Slide 1 text

SappoRo.R #10 @kilometer00 2023.03.18 らくらく総当たり組み合わせ

Slide 2

Slide 2 text

Who!? Who?

Slide 3

Slide 3 text

Who!? 名前: 三村 @kilometer 職業: ポスドク (こうがくはくし) 専⾨: ⾏動神経科学(霊⻑類) 脳イメージング 医療システム⼯学 R歴: ~ 10年ぐらい 流⾏: アンキロサウルス

Slide 4

Slide 4 text

宣伝!!(書籍の翻訳に参加しました。) 絶賛販売中!

Slide 5

Slide 5 text

宣伝2!! R⾔語の地域コミュニティ@東京です。 定期的にR⾔語に関する勉強会を開催しています。 次回は4⽉22⽇!! 初⼼者特集回です!!

Slide 6

Slide 6 text

総当たり組み合わせ Round-robin そう あ あ く

Slide 7

Slide 7 text

dat_nest <- palmerpenguins::penguins %>% dplyr::group_nest(species) データを畳み込む > dat_nest # A tibble: 3 × 2 species data > 1 Adelie [152 × 7] 2 Chinstrap [68 × 7] 3 Gentoo [124 × 7] (息を吐くように)

Slide 8

Slide 8 text

# A tibble: 9 × 4 species.x species.y data.x data.y > > 1 Adelie Adelie [152 × 7] [152 × 7] 2 Chinstrap Adelie [68 × 7] [152 × 7] 3 Gentoo Adelie [124 × 7] [152 × 7] 4 Adelie Chinstrap [152 × 7] [68 × 7] 5 Chinstrap Chinstrap [68 × 7] [68 × 7] 6 Gentoo Chinstrap [124 × 7] [68 × 7] 7 Adelie Gentoo [152 × 7] [124 × 7] 8 Chinstrap Gentoo [68 × 7] [124 × 7] 9 Gentoo Gentoo [124 × 7] [124 × 7] 総当たり組み合わせ # A tibble: 3 × 4 species.x species.y data.x data.y > > 1 Adelie Chinstrap [152 × 7] [68 × 7] 2 Adelie Gentoo [152 × 7] [124 × 7] 3 Chinstrap Gentoo [68 × 7] [124 × 7] 組み合わせ(combination) (round-robin)

Slide 9

Slide 9 text

base::expand.grid()関数 > dat_nest # A tibble: 3 × 2 species data > 1 Adelie [152 × 7] 2 Chinstrap [68 × 7] 3 Gentoo [124 × 7] dat_nest$species

Slide 10

Slide 10 text

grid <- dat_nest$species %>% expand.grid(., .) base::expand.grid()関数 > grid Var1 Var2 1 Adelie Adelie 2 Chinstrap Adelie 3 Gentoo Adelie 4 Adelie Chinstrap 5 Chinstrap Chinstrap 6 Gentoo Chinstrap 7 Adelie Gentoo 8 Chinstrap Gentoo 9 Gentoo Gentoo

Slide 11

Slide 11 text

dplyr::left_join()関数 > grid Var1 Var2 1 Adelie Adelie 2 Chinstrap Adelie 3 Gentoo Adelie 4 Adelie Chinstrap 5 Chinstrap Chinstrap 6 Gentoo Chinstrap 7 Adelie Gentoo 8 Chinstrap Gentoo 9 Gentoo Gentoo > dat_nest # A tibble: 3 × 2 species data > 1 Adelie [152 × 7] 2 Chinstrap [68 × 7] 3 Gentoo [124 × 7] ①対応づけて結合 ②対応づけて結合

Slide 12

Slide 12 text

dplyr::left_join()関数 dat_rr <- grid %>% tibble::as_tibble() %>% dplyr::left_join( dat_nest %>% dplyr::rename(Var1 = "species"), by = "Var1" ) %>% dplyr::left_join( dat_nest %>% dplyr::rename(Var2 = "species"), by = "Var2" )

Slide 13

Slide 13 text

dplyr::left_join()関数 dat_rr <- grid %>% tibble::as_tibble() %>% dplyr::left_join( dat_nest %>% dplyr::rename(Var1 = "species"), by = "Var1" ) %>% dplyr::left_join( dat_nest %>% dplyr::rename(Var2 = "species"), by = "Var2" ) ① ②

Slide 14

Slide 14 text

> dat_rr # A tibble: 9 × 4 Var1 Var2 data.x data.y > 1 Adelie Adelie [152 × 7] [152 × 7] 2 Chinstrap Adelie [68 × 7] [152 × 7] 3 Gentoo Adelie [124 × 7] [152 × 7] 4 Adelie Chinstrap [152 × 7] [68 × 7] 5 Chinstrap Chinstrap [68 × 7] [68 × 7] 6 Gentoo Chinstrap [124 × 7] [68 × 7] 7 Adelie Gentoo [152 × 7] [124 × 7] 8 Chinstrap Gentoo [68 × 7] [124 × 7] 9 Gentoo Gentoo [124 × 7] [124 × 7]

Slide 15

Slide 15 text

dplyr::rename()関数 dat_rr_rename <- dat_rr %>% rename(species.x = Var1) %>% rename(species.y = Var2) > dat_rr_rename # A tibble: 9 × 4 species.x species.y data.x data.y > > 1 Adelie Adelie [152 × 7] [152 × 7] 2 Chinstrap Adelie [68 × 7] [152 × 7] 3 Gentoo Adelie [124 × 7] [152 × 7] 4 Adelie Chinstrap [152 × 7] [68 × 7] 5 Chinstrap Chinstrap [68 × 7] [68 × 7] 6 Gentoo Chinstrap [124 × 7] [68 × 7] 7 Adelie Gentoo [152 × 7] [124 × 7] 8 Chinstrap Gentoo [68 × 7] [124 × 7] 9 Gentoo Gentoo [124 × 7] [124 × 7]

Slide 16

Slide 16 text

dplyr::rename()関数 dat_rr_rename <- dat_rr %>% rename(species.x = Var1) %>% rename(species.y = Var2) key <- "species" x <- stringr::str_c(key, ".x") y <- stringr::str_c(key, ".y") dat_rr_rename <- dat_rr %>% rename(!!x := Var1) %>% rename(!!y := Var2) 別解 {rlang}パッケージの演算⼦

Slide 17

Slide 17 text

# A tibble: 9 × 4 species.x species.y data.x data.y > > 1 Adelie Adelie [152 × 7] [152 × 7] 2 Chinstrap Adelie [68 × 7] [152 × 7] 3 Gentoo Adelie [124 × 7] [152 × 7] 4 Adelie Chinstrap [152 × 7] [68 × 7] 5 Chinstrap Chinstrap [68 × 7] [68 × 7] 6 Gentoo Chinstrap [124 × 7] [68 × 7] 7 Adelie Gentoo [152 × 7] [124 × 7] 8 Chinstrap Gentoo [68 × 7] [124 × 7] 9 Gentoo Gentoo [124 × 7] [124 × 7] 総当たり組み合わせ # A tibble: 3 × 4 species.x species.y data.x data.y > > 1 Adelie Chinstrap [152 × 7] [68 × 7] 2 Adelie Gentoo [152 × 7] [124 × 7] 3 Chinstrap Gentoo [68 × 7] [124 × 7] 組み合わせ(combination) (round-robin)

Slide 18

Slide 18 text

grid <- dat_nest$species %>% expand.grid(., .) %>% subset(unclass(Var1) < unclass(Var2)) %>% tibble::as_tibble() base::subset()関数 > grid # A tibble: 3 × 2 Var1 Var2 1 Adelie Chinstrap 2 Adelie Gentoo 3 Chinstrap Gentoo

Slide 19

Slide 19 text

という変換を パッケージにしました。 devtools::install_github( "kilometer0101/roundrobin" ) (4回⼿打ちしたら⾯倒臭くなったので)

Slide 20

Slide 20 text

roundrobin::roundrobin()関数 # A tibble: 9 × 4 species.x species.y data.x data.y > > 1 Adelie Adelie [152 × 7] [152 × 7] 2 Chinstrap Adelie [68 × 7] [152 × 7] 3 Gentoo Adelie [124 × 7] [152 × 7] 4 Adelie Chinstrap [152 × 7] [68 × 7] 5 Chinstrap Chinstrap [68 × 7] [68 × 7] 6 Gentoo Chinstrap [124 × 7] [68 × 7] 7 Adelie Gentoo [152 × 7] [124 × 7] 8 Chinstrap Gentoo [68 × 7] [124 × 7] 9 Gentoo Gentoo [124 × 7] [124 × 7] library(roundrobin) palmerpenguins::penguins %>% roundrobin(key = "species")

Slide 21

Slide 21 text

library(roundrobin) palmerpenguins::penguins %>% roundrobin(key = "species", combination = TRUE) roundrobin::roundrobin()関数 # A tibble: 3 × 4 species.x species.y data.x data.y > > 1 Adelie Chinstrap [152 × 7] [68 × 7] 2 Adelie Gentoo [152 × 7] [124 × 7] 3 Chinstrap Gentoo [68 × 7] [124 × 7]

Slide 22

Slide 22 text

使ってみますか。

Slide 23

Slide 23 text

library(tidyverse) library(palmerpenguins) library(roundrobin) dat <- palmerpenguins::penguins %>% na.omit() %>% # NA除去 mutate_at( vars(c(contains("mm"), contains("g"))), ~ (. - mean(.)) / sd(.) # 標準化 ) %>% select(species, contains("mm"), contains("g")) 前処理

Slide 24

Slide 24 text

> dat # A tibble: 333 × 5 species bill_length_mm bill_depth_mm flipper_length_mm body_mass_g 1 Adelie -0.895 0.780 -1.42 -0.568 2 Adelie -0.822 0.119 -1.07 -0.506 3 Adelie -0.675 0.424 -0.426 -1.19 4 Adelie -1.33 1.08 -0.568 -0.940 5 Adelie -0.858 1.74 -0.782 -0.692 6 Adelie -0.931 0.323 -1.42 -0.723 7 Adelie -0.876 1.24 -0.426 0.581 8 Adelie -0.529 0.221 -1.35 -1.25 9 Adelie -0.986 2.05 -0.711 -0.506 10 Adelie -1.72 2.00 -0.212 0.240 # … with 323 more rows # i Use `print(n = ...)` to see more rows 前処理

Slide 25

Slide 25 text

dat_long <- dat %>% rowid_to_column("id") %>% pivot_longer( cols = !species, names_to = "parameter", values_to = "value" ) %>% group_by(parameter) %>% ungroup() .y <- dat_long %>% ungroup() %>% group_by(species) %>% summarise( mean_id = mean(id), min_id = min(id) ) dat_long %>% ggplot() + aes(parameter, id) + geom_tile(aes(fill = value)) + geom_hline( yintercept = max(dat_long$id) ) + geom_hline(data = .y, aes(yintercept = min_id)) + scale_y_continuous( breaks = .y$mean_id, labels = .y$species, expand = c(0, 0)) + theme( axis.title = element_blank(), axis.text.x = element_text( angle = 30, hjust = 1 ) ) 可視化コード (ちょちょいのちょい)

Slide 26

Slide 26 text

可視化

Slide 27

Slide 27 text

> dat_rr # A tibble: 9 × 4 Var1 Var2 data.x data.y > > 1 Adelie Adelie [146 × 4] [146 × 4] 2 Chinstrap Adelie [68 × 4] [146 × 4] 3 Gentoo Adelie [119 × 4] [146 × 4] 4 Adelie Chinstrap [146 × 4] [68 × 4] 5 Chinstrap Chinstrap [68 × 4] [68 × 4] 6 Gentoo Chinstrap [119 × 4] [68 × 4] 7 Adelie Gentoo [146 × 4] [119 × 4] 8 Chinstrap Gentoo [68 × 4] [119 × 4] 9 Gentoo Gentoo [119 × 4] [119 × 4] 総当たり組み合わせ dat_rr <- dat %>% roundrobin(key = "species", rename = FALSE)

Slide 28

Slide 28 text

例えばマハラノビス距離 dat_rr_mahaD <- dat_rr %>% mutate(mahaD2 = map2( data.x, data.y, # yに対するxの距離 ~ mahalanobis(.x, colMeans(.y), cov(.y)) )) %>% mutate(Var2 = str_c("vs. ", Var2)) > dat_rr_mahaD # A tibble: 9 × 5 Var1 Var2 data.x data.y mahaD2 > > 1 Adelie vs. Adelie [146 × 4] [146 × 4] 2 Chinstrap vs. Adelie [68 × 4] [146 × 4] 3 Gentoo vs. Adelie [119 × 4] [146 × 4] 4 Adelie vs. Chinstrap [146 × 4] [68 × 4] 5 Chinstrap vs. Chinstrap [68 × 4] [68 × 4] 6 Gentoo vs. Chinstrap [119 × 4] [68 × 4] 7 Adelie vs. Gentoo [146 × 4] [119 × 4] 8 Chinstrap vs. Gentoo [68 × 4] [119 × 4] 9 Gentoo vs. Gentoo [119 × 4] [119 × 4]

Slide 29

Slide 29 text

例えばマハラノビス距離 dat_rr_mahaD <- dat_rr %>% mutate(mahaD2 = map2( data.x, data.y, # yに対するxの距離 ~ mahalanobis(.x, colMeans(.y), cov(.y)) )) %>% mutate(Var2 = str_c("vs. ", Var2)) dat_mahaD <- dat_rr_mahaD %>% select(!data.y) %>% unnest(everything())

Slide 30

Slide 30 text

> dat_mahaD # A tibble: 999 × 7 Var1 Var2 bill_length_mm bill_…¹ flipp…² body_…³ mahaD2 1 Adelie vs. Adelie -0.895 0.780 -1.42 -0.568 2.84 2 Adelie vs. Adelie -0.822 0.119 -1.07 -0.506 1.95 3 Adelie vs. Adelie -0.675 0.424 -0.426 -1.19 4.26 4 Adelie vs. Adelie -1.33 1.08 -0.568 -0.940 3.32 5 Adelie vs. Adelie -0.858 1.74 -0.782 -0.692 5.57 6 Adelie vs. Adelie -0.931 0.323 -1.42 -0.723 2.47 7 Adelie vs. Adelie -0.876 1.24 -0.426 0.581 5.94 8 Adelie vs. Adelie -0.529 0.221 -1.35 -1.25 5.27 9 Adelie vs. Adelie -0.986 2.05 -0.711 -0.506 7.75 10 Adelie vs. Adelie -1.72 2.00 -0.212 0.240 15.2 # … with 989 more rows, and abbreviated variable names # ¹bill_depth_mm, ²flipper_length_mm, ³body_mass_g # ℹ Use `print(n = ...)` to see more rows 例えばマハラノビス距離

Slide 31

Slide 31 text

例えばマハラノビス距離 ggplot(dat_mahaD) + aes(mahaD2, color = Var1, fill = Var1) + geom_density(alpha = 0.5) + facet_wrap(~Var2)

Slide 32

Slide 32 text

総当たり組み合わせ Round-robin そう あ あ く devtools::install_github( "kilometer0101/roundrobin" )

Slide 33

Slide 33 text

Enjoy!