Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
SappoRo.R_roundrobin
Search
kilometer
March 18, 2023
Programming
0
140
SappoRo.R_roundrobin
第10回Sapporo.Rで喋った際のスライドです。
kilometer
March 18, 2023
Tweet
Share
More Decks by kilometer
See All by kilometer
TokyoR#111_ANOVA
kilometer
2
870
TokyoR109.pdf
kilometer
1
460
TokyoR#108_NestedDataHandling
kilometer
0
790
TokyoR#107_R_GeoData
kilometer
0
420
TokyoR#104_DataProcessing
kilometer
1
690
TokyoR#103_DataProcessing
kilometer
0
880
TokyoR#102_RMarkdown
kilometer
1
640
TokyoR#101_RegressionAnalysis
kilometer
0
380
TokyoR#99_Divergence
kilometer
1
360
Other Decks in Programming
See All in Programming
AHC041解説
terryu16
0
560
Djangoアプリケーション 運用のリアル 〜問題発生から可視化、最適化への道〜 #pyconshizu
kashewnuts
1
140
【PHP】破壊的バージョンアップと戦った話〜決断と説得
satoshi256kbyte
0
110
chibiccをCILに移植した結果 (NGK2025S版)
kekyo
PRO
0
190
法律の脱レガシーに学ぶフロントエンド刷新
oguemon
5
670
Writing documentation can be fun with plugin system
okuramasafumi
0
110
ASP. NET CoreにおけるWebAPIの最新情報
tomokusaba
0
310
ゼロからの、レトロゲームエンジンの作り方
tokujiros
3
1.2k
Spring gRPC について / About Spring gRPC
mackey0225
0
200
チームリードになって変わったこと
isaka1022
0
160
混沌とした例外処理とエラー監視に秩序をもたらす
morihirok
19
3.4k
Multi Step Form, Decentralized Autonomous Organization
pumpkiinbell
1
180
Featured
See All Featured
Helping Users Find Their Own Way: Creating Modern Search Experiences
danielanewman
29
2.4k
Adopting Sorbet at Scale
ufuk
74
9.2k
Creating an realtime collaboration tool: Agile Flush - .NET Oxford
marcduiker
27
1.9k
A Tale of Four Properties
chriscoyier
158
23k
Dealing with People You Can't Stand - Big Design 2015
cassininazir
365
25k
CoffeeScript is Beautiful & I Never Want to Write Plain JavaScript Again
sstephenson
160
15k
jQuery: Nuts, Bolts and Bling
dougneiner
63
7.6k
Learning to Love Humans: Emotional Interface Design
aarron
274
40k
Sharpening the Axe: The Primacy of Toolmaking
bcantrill
39
1.9k
Art, The Web, and Tiny UX
lynnandtonic
298
20k
Optimizing for Happiness
mojombo
376
70k
The Invisible Side of Design
smashingmag
299
50k
Transcript
SappoRo.R #10 @kilometer00 2023.03.18 らくらく総当たり組み合わせ
Who!? Who?
Who!? 名前: 三村 @kilometer 職業: ポスドク (こうがくはくし) 専⾨: ⾏動神経科学(霊⻑類) 脳イメージング
医療システム⼯学 R歴: ~ 10年ぐらい 流⾏: アンキロサウルス
宣伝!!(書籍の翻訳に参加しました。) 絶賛販売中!
宣伝2!! R⾔語の地域コミュニティ@東京です。 定期的にR⾔語に関する勉強会を開催しています。 次回は4⽉22⽇!! 初⼼者特集回です!!
総当たり組み合わせ Round-robin そう あ あ く
dat_nest <- palmerpenguins::penguins %>% dplyr::group_nest(species) データを畳み込む > dat_nest # A
tibble: 3 × 2 species data <fct> <list<tibble[,7]>> 1 Adelie [152 × 7] 2 Chinstrap [68 × 7] 3 Gentoo [124 × 7] (息を吐くように)
# A tibble: 9 × 4 species.x species.y data.x data.y
<fct> <fct> <list<tibble[,7]>> <list<tibble[,7]>> 1 Adelie Adelie [152 × 7] [152 × 7] 2 Chinstrap Adelie [68 × 7] [152 × 7] 3 Gentoo Adelie [124 × 7] [152 × 7] 4 Adelie Chinstrap [152 × 7] [68 × 7] 5 Chinstrap Chinstrap [68 × 7] [68 × 7] 6 Gentoo Chinstrap [124 × 7] [68 × 7] 7 Adelie Gentoo [152 × 7] [124 × 7] 8 Chinstrap Gentoo [68 × 7] [124 × 7] 9 Gentoo Gentoo [124 × 7] [124 × 7] 総当たり組み合わせ # A tibble: 3 × 4 species.x species.y data.x data.y <fct> <fct> <list<tibble[,7]>> <list<tibble[,7]>> 1 Adelie Chinstrap [152 × 7] [68 × 7] 2 Adelie Gentoo [152 × 7] [124 × 7] 3 Chinstrap Gentoo [68 × 7] [124 × 7] 組み合わせ(combination) (round-robin)
base::expand.grid()関数 > dat_nest # A tibble: 3 × 2 species
data <fct> <list<tibble[,7]>> 1 Adelie [152 × 7] 2 Chinstrap [68 × 7] 3 Gentoo [124 × 7] dat_nest$species
grid <- dat_nest$species %>% expand.grid(., .) base::expand.grid()関数 > grid Var1
Var2 1 Adelie Adelie 2 Chinstrap Adelie 3 Gentoo Adelie 4 Adelie Chinstrap 5 Chinstrap Chinstrap 6 Gentoo Chinstrap 7 Adelie Gentoo 8 Chinstrap Gentoo 9 Gentoo Gentoo
dplyr::left_join()関数 > grid Var1 Var2 1 Adelie Adelie 2 Chinstrap
Adelie 3 Gentoo Adelie 4 Adelie Chinstrap 5 Chinstrap Chinstrap 6 Gentoo Chinstrap 7 Adelie Gentoo 8 Chinstrap Gentoo 9 Gentoo Gentoo > dat_nest # A tibble: 3 × 2 species data <fct> <list<tibble[,7]>> 1 Adelie [152 × 7] 2 Chinstrap [68 × 7] 3 Gentoo [124 × 7] ①対応づけて結合 ②対応づけて結合
dplyr::left_join()関数 dat_rr <- grid %>% tibble::as_tibble() %>% dplyr::left_join( dat_nest %>%
dplyr::rename(Var1 = "species"), by = "Var1" ) %>% dplyr::left_join( dat_nest %>% dplyr::rename(Var2 = "species"), by = "Var2" )
dplyr::left_join()関数 dat_rr <- grid %>% tibble::as_tibble() %>% dplyr::left_join( dat_nest %>%
dplyr::rename(Var1 = "species"), by = "Var1" ) %>% dplyr::left_join( dat_nest %>% dplyr::rename(Var2 = "species"), by = "Var2" ) ① ②
> dat_rr # A tibble: 9 × 4 Var1 Var2
data.x data.y <fct> <fct> <list<tibble[,7]>> <list<tibble[,7]> 1 Adelie Adelie [152 × 7] [152 × 7] 2 Chinstrap Adelie [68 × 7] [152 × 7] 3 Gentoo Adelie [124 × 7] [152 × 7] 4 Adelie Chinstrap [152 × 7] [68 × 7] 5 Chinstrap Chinstrap [68 × 7] [68 × 7] 6 Gentoo Chinstrap [124 × 7] [68 × 7] 7 Adelie Gentoo [152 × 7] [124 × 7] 8 Chinstrap Gentoo [68 × 7] [124 × 7] 9 Gentoo Gentoo [124 × 7] [124 × 7]
dplyr::rename()関数 dat_rr_rename <- dat_rr %>% rename(species.x = Var1) %>% rename(species.y
= Var2) > dat_rr_rename # A tibble: 9 × 4 species.x species.y data.x data.y <fct> <fct> <list<tibble[,7]>> <list<tibble[,7]>> 1 Adelie Adelie [152 × 7] [152 × 7] 2 Chinstrap Adelie [68 × 7] [152 × 7] 3 Gentoo Adelie [124 × 7] [152 × 7] 4 Adelie Chinstrap [152 × 7] [68 × 7] 5 Chinstrap Chinstrap [68 × 7] [68 × 7] 6 Gentoo Chinstrap [124 × 7] [68 × 7] 7 Adelie Gentoo [152 × 7] [124 × 7] 8 Chinstrap Gentoo [68 × 7] [124 × 7] 9 Gentoo Gentoo [124 × 7] [124 × 7]
dplyr::rename()関数 dat_rr_rename <- dat_rr %>% rename(species.x = Var1) %>% rename(species.y
= Var2) key <- "species" x <- stringr::str_c(key, ".x") y <- stringr::str_c(key, ".y") dat_rr_rename <- dat_rr %>% rename(!!x := Var1) %>% rename(!!y := Var2) 別解 {rlang}パッケージの演算⼦
# A tibble: 9 × 4 species.x species.y data.x data.y
<fct> <fct> <list<tibble[,7]>> <list<tibble[,7]>> 1 Adelie Adelie [152 × 7] [152 × 7] 2 Chinstrap Adelie [68 × 7] [152 × 7] 3 Gentoo Adelie [124 × 7] [152 × 7] 4 Adelie Chinstrap [152 × 7] [68 × 7] 5 Chinstrap Chinstrap [68 × 7] [68 × 7] 6 Gentoo Chinstrap [124 × 7] [68 × 7] 7 Adelie Gentoo [152 × 7] [124 × 7] 8 Chinstrap Gentoo [68 × 7] [124 × 7] 9 Gentoo Gentoo [124 × 7] [124 × 7] 総当たり組み合わせ # A tibble: 3 × 4 species.x species.y data.x data.y <fct> <fct> <list<tibble[,7]>> <list<tibble[,7]>> 1 Adelie Chinstrap [152 × 7] [68 × 7] 2 Adelie Gentoo [152 × 7] [124 × 7] 3 Chinstrap Gentoo [68 × 7] [124 × 7] 組み合わせ(combination) (round-robin)
grid <- dat_nest$species %>% expand.grid(., .) %>% subset(unclass(Var1) < unclass(Var2))
%>% tibble::as_tibble() base::subset()関数 > grid # A tibble: 3 × 2 Var1 Var2 <fct> <fct> 1 Adelie Chinstrap 2 Adelie Gentoo 3 Chinstrap Gentoo
という変換を パッケージにしました。 devtools::install_github( "kilometer0101/roundrobin" ) (4回⼿打ちしたら⾯倒臭くなったので)
roundrobin::roundrobin()関数 # A tibble: 9 × 4 species.x species.y data.x
data.y <fct> <fct> <list<tibble[,7]>> <list<tibble[,7]>> 1 Adelie Adelie [152 × 7] [152 × 7] 2 Chinstrap Adelie [68 × 7] [152 × 7] 3 Gentoo Adelie [124 × 7] [152 × 7] 4 Adelie Chinstrap [152 × 7] [68 × 7] 5 Chinstrap Chinstrap [68 × 7] [68 × 7] 6 Gentoo Chinstrap [124 × 7] [68 × 7] 7 Adelie Gentoo [152 × 7] [124 × 7] 8 Chinstrap Gentoo [68 × 7] [124 × 7] 9 Gentoo Gentoo [124 × 7] [124 × 7] library(roundrobin) palmerpenguins::penguins %>% roundrobin(key = "species")
library(roundrobin) palmerpenguins::penguins %>% roundrobin(key = "species", combination = TRUE) roundrobin::roundrobin()関数
# A tibble: 3 × 4 species.x species.y data.x data.y <fct> <fct> <list<tibble[,7]>> <list<tibble[,7]>> 1 Adelie Chinstrap [152 × 7] [68 × 7] 2 Adelie Gentoo [152 × 7] [124 × 7] 3 Chinstrap Gentoo [68 × 7] [124 × 7]
使ってみますか。
library(tidyverse) library(palmerpenguins) library(roundrobin) dat <- palmerpenguins::penguins %>% na.omit() %>% #
NA除去 mutate_at( vars(c(contains("mm"), contains("g"))), ~ (. - mean(.)) / sd(.) # 標準化 ) %>% select(species, contains("mm"), contains("g")) 前処理
> dat # A tibble: 333 × 5 species bill_length_mm
bill_depth_mm flipper_length_mm body_mass_g <fct> <dbl> <dbl> <dbl> <dbl> 1 Adelie -0.895 0.780 -1.42 -0.568 2 Adelie -0.822 0.119 -1.07 -0.506 3 Adelie -0.675 0.424 -0.426 -1.19 4 Adelie -1.33 1.08 -0.568 -0.940 5 Adelie -0.858 1.74 -0.782 -0.692 6 Adelie -0.931 0.323 -1.42 -0.723 7 Adelie -0.876 1.24 -0.426 0.581 8 Adelie -0.529 0.221 -1.35 -1.25 9 Adelie -0.986 2.05 -0.711 -0.506 10 Adelie -1.72 2.00 -0.212 0.240 # … with 323 more rows # i Use `print(n = ...)` to see more rows 前処理
dat_long <- dat %>% rowid_to_column("id") %>% pivot_longer( cols = !species,
names_to = "parameter", values_to = "value" ) %>% group_by(parameter) %>% ungroup() .y <- dat_long %>% ungroup() %>% group_by(species) %>% summarise( mean_id = mean(id), min_id = min(id) ) dat_long %>% ggplot() + aes(parameter, id) + geom_tile(aes(fill = value)) + geom_hline( yintercept = max(dat_long$id) ) + geom_hline(data = .y, aes(yintercept = min_id)) + scale_y_continuous( breaks = .y$mean_id, labels = .y$species, expand = c(0, 0)) + theme( axis.title = element_blank(), axis.text.x = element_text( angle = 30, hjust = 1 ) ) 可視化コード (ちょちょいのちょい)
可視化
> dat_rr # A tibble: 9 × 4 Var1 Var2
data.x data.y <fct> <fct> <list<tibble[,4]>> <list<tibble[,4]>> 1 Adelie Adelie [146 × 4] [146 × 4] 2 Chinstrap Adelie [68 × 4] [146 × 4] 3 Gentoo Adelie [119 × 4] [146 × 4] 4 Adelie Chinstrap [146 × 4] [68 × 4] 5 Chinstrap Chinstrap [68 × 4] [68 × 4] 6 Gentoo Chinstrap [119 × 4] [68 × 4] 7 Adelie Gentoo [146 × 4] [119 × 4] 8 Chinstrap Gentoo [68 × 4] [119 × 4] 9 Gentoo Gentoo [119 × 4] [119 × 4] 総当たり組み合わせ dat_rr <- dat %>% roundrobin(key = "species", rename = FALSE)
例えばマハラノビス距離 dat_rr_mahaD <- dat_rr %>% mutate(mahaD2 = map2( data.x, data.y,
# yに対するxの距離 ~ mahalanobis(.x, colMeans(.y), cov(.y)) )) %>% mutate(Var2 = str_c("vs. ", Var2)) > dat_rr_mahaD # A tibble: 9 × 5 Var1 Var2 data.x data.y mahaD2 <fct> <chr> <list<tibble[,4]>> <list<tibble[,4]>> <list> 1 Adelie vs. Adelie [146 × 4] [146 × 4] <dbl [146]> 2 Chinstrap vs. Adelie [68 × 4] [146 × 4] <dbl [68]> 3 Gentoo vs. Adelie [119 × 4] [146 × 4] <dbl [119]> 4 Adelie vs. Chinstrap [146 × 4] [68 × 4] <dbl [146]> 5 Chinstrap vs. Chinstrap [68 × 4] [68 × 4] <dbl [68]> 6 Gentoo vs. Chinstrap [119 × 4] [68 × 4] <dbl [119]> 7 Adelie vs. Gentoo [146 × 4] [119 × 4] <dbl [146]> 8 Chinstrap vs. Gentoo [68 × 4] [119 × 4] <dbl [68]> 9 Gentoo vs. Gentoo [119 × 4] [119 × 4] <dbl [119]>
例えばマハラノビス距離 dat_rr_mahaD <- dat_rr %>% mutate(mahaD2 = map2( data.x, data.y,
# yに対するxの距離 ~ mahalanobis(.x, colMeans(.y), cov(.y)) )) %>% mutate(Var2 = str_c("vs. ", Var2)) dat_mahaD <- dat_rr_mahaD %>% select(!data.y) %>% unnest(everything())
> dat_mahaD # A tibble: 999 × 7 Var1 Var2
bill_length_mm bill_…¹ flipp…² body_…³ mahaD2 <fct> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> 1 Adelie vs. Adelie -0.895 0.780 -1.42 -0.568 2.84 2 Adelie vs. Adelie -0.822 0.119 -1.07 -0.506 1.95 3 Adelie vs. Adelie -0.675 0.424 -0.426 -1.19 4.26 4 Adelie vs. Adelie -1.33 1.08 -0.568 -0.940 3.32 5 Adelie vs. Adelie -0.858 1.74 -0.782 -0.692 5.57 6 Adelie vs. Adelie -0.931 0.323 -1.42 -0.723 2.47 7 Adelie vs. Adelie -0.876 1.24 -0.426 0.581 5.94 8 Adelie vs. Adelie -0.529 0.221 -1.35 -1.25 5.27 9 Adelie vs. Adelie -0.986 2.05 -0.711 -0.506 7.75 10 Adelie vs. Adelie -1.72 2.00 -0.212 0.240 15.2 # … with 989 more rows, and abbreviated variable names # ¹bill_depth_mm, ²flipper_length_mm, ³body_mass_g # ℹ Use `print(n = ...)` to see more rows 例えばマハラノビス距離
例えばマハラノビス距離 ggplot(dat_mahaD) + aes(mahaD2, color = Var1, fill = Var1)
+ geom_density(alpha = 0.5) + facet_wrap(~Var2)
総当たり組み合わせ Round-robin そう あ あ く devtools::install_github( "kilometer0101/roundrobin" )
Enjoy!