Japan.R 2015: {purrr} による非テーブルデータの処理

{purrr} ʹΑΔ ඇςʔϒϧσʔλͷॲཧ

ࣗݾ঺հ • @sinhrks • ۀ຿: σʔλ෼ੳ (ϝʔΧʔ) • ར༻ݴޠ: Python,
Rͱ͔ • झຯ: OSS ׆ಈ • pandas ※ ίϛολ (Python ύοέʔδ) ※ R ͷ data.frame + dplyr + tidyr + readr + haven + lubridate + stringr + ggplot2… (ҎԼུ ͷΑ͏ͳ΋ͷ

ຊ೔͸ Θ͚Ͱ͸͋Γ·ͤΜ Λ dis Γʹདྷͨ

ͳͥ Japan.R ʹ? • Rͷύοέʔδ࡞ͬͨ (JU)VC"XBSET ݱࡏ • ͜ͷR։ൃऀ͕͍͢͝ (2015)
by @u_ribo બग़ • http://uribo.hatenablog.com/entry/2015/12/02/180004

ͳͥ Λʁ

ͳͥRΛ? • ϓϩάϥϛϯά͕ઐ໳Ͱ͸ͳ͍͕ɺઐ໳෼໺Ͱ ౷ܭ/ػցֶश͕ඞཁͩͬͨͷͰ • ΤϯδχΞ͕ͩɺ౷ܭ/ػցֶश͕ඞཁʹͳͬͨ ͷͰ

૝ఆ͢ΔϨϕϧͱΰʔϧ • લఏ: dplyr, ggplot2 ͸গ͠࢖ͬͨ͜ͱ͕͋Δ • ΰʔϧ: {purrr} ͷجຊతͳ࢖͍ํΛ஌Δ
• ͜ΜͳίʔυΛॻ͔ͳͯ͘΋Α͘ͳΔ glm.fit1 <- glm(y ~ x, …) glm.fit2 <- glm(y ~ x + z, …) … predict(glm.fit1, newdata = newdata) predict(glm.fit2, newdata = newdata) …

ຊ೔ͷ಺༰ {purrr} Ͱσʔλॲཧ

σʔλॲཧͷྲྀΕ σʔλͷ४උ ϞσϦϯά ධՁ

RʹΑΔσʔλॲཧ EBUBGSBNF ϦετϕΫτϧ Ϟσϧ HHQMPU UJEZS EQMZS SFBES SWFTU SMJTU
QVSSS ֤छ౷ܭػցֶश ύοέʔδ CSPPN DBSFU HHGPSUJGZ

data.frameͷॲཧ

data.frameͷॲཧ • σʔλͷॲཧʹ͸ύλʔϯ͕͋Δ • ྫ: data.frame Λ ͋Δྻͷ஋ͰάϧʔϓΘ͚͠ɺ ྻ͝ͱʹฏۉ஋Λܭࢉ͠ɺ݁ՌΛ data.frame
ʹ ͍ͨ͠ɻ ෼ׂ ద༻ ݁߹

plyr • ૊ΈࠐΈ: library(plyr) plyr::ddply(iris, .(Species), plyr::colwise(mean)) ## Species Sepal.Length
Sepal.Width Petal.Length Petal.Width ## 1 setosa 5.006 3.428 1.462 0.246 ## 2 versicolor 5.936 2.770 4.260 1.326 ## 3 virginica 6.588 2.974 5.552 2.026 iris_group <- split(iris, iris$Species) res <- sapply(iris_group, function(x) { sapply(x[, 1:4], mean) }) as.data.frame(t(res)) ## Sepal.Length Sepal.Width Petal.Length Petal.Width ## setosa 5.006 3.428 1.462 0.246 ## versicolor 5.936 2.770 4.260 1.326 ## virginica 6.588 2.974 5.552 2.026 • plyr: ෼ׂ - ద༻ - ݁߹ΛҰͭͷؔ਺Ͱ

dplyr • plyr: ෼ׂ - ద༻ - ݁߹ΛҰͭͷؔ਺Ͱ library(plyr) plyr::ddply(iris,
.(Species), plyr::colwise(mean)) ## Species Sepal.Length Sepal.Width Petal.Length Petal.Width ## 1 setosa 5.006 3.428 1.462 0.246 ## 2 versicolor 5.936 2.770 4.260 1.326 ## 3 virginica 6.588 2.974 5.552 2.026 library(dplyr) dplyr::summarise_each(dplyr::group_by(iris, Species), dplyr::funs(mean)) ## Source: local data frame [3 x 5] ## ## Species Sepal.Length Sepal.Width Petal.Length Petal.Width ## (fctr) (dbl) (dbl) (dbl) (dbl) ## 1 setosa 5.006 3.428 1.462 0.246 ## 2 versicolor 5.936 2.770 4.260 1.326 ## 3 virginica 6.588 2.974 5.552 2.026 • dplyr: ॲཧ͝ͱʹؔ਺Λద༻

ύΠϓԋࢉࢠ %>% • magrittr iris %>% dplyr::group_by(Species) %>% dplyr::summarise_each(dplyr::funs(mean)) ##
Source: local data frame [3 x 5] ## ## Species Sepal.Length Sepal.Width Petal.Length Petal.Width ## (fctr) (dbl) (dbl) (dbl) (dbl) ## 1 setosa 5.006 3.428 1.462 0.246 ## 2 versicolor 5.936 2.770 4.260 1.326 ## 3 virginica 6.588 2.974 5.552 2.026 x %>% f(…) == f(x, …)

data.frameͷॲཧ • ߟ͑ํ • ॲཧΛγϯϓϧͳؔ਺ͷ૊Έ߹ΘͤͰهड़ • ؔ਺͸ύΠϓԋࢉࢠ %>% Ͱ઀ଓ •
ಉ͜͡ͱΛ data.frame Ҏ֎Ͱ΋΍Γ͍ͨ

data.frame Ҏ֎ͷॲཧ

͜Μͳ͜ͱ͋Γ·ͤΜ͔ʁ • ͋ΔॲཧΛɺ • ܁Γฦ͍ͨ͠ • ෳ਺ͷର৅ʹߦ͍͍ͨ • ͦͷ··Ͱ͸ data.frame
ʹͰ͖ͳ͍σʔλ (JSON / XMLͳͲ) Λѻ͍͍ͨ

ྫ͑͹ σʔλΛ8FC"1*͔Β+40/Ͱऔಘͯ͠੔ܗ ͋Δ৚݅ʹ߹க͠ͳ͍σʔλ͸আ֎ EBUBGSBNFΛ࡞੒ ෳ਺ͷϞσϧΛ࡞੒ൺֱϓϩοτ

૊ΈࠐΈؔ਺Ͱ΍Δͱ… • Apply ؔ਺܈ • ߴ֊ؔ਺ ※ ܈ BQQMZ lapply(list(1,
2, 3), function (x) { x + 1 }) Map(function (x) { x + 1 }, list(1, 2, 3)) MBQQMZ TBQQMZ UBQQMZ NBQQMZ 3FEVDF 'JMUFS 'JOE .BQ /FHBUF 1PTJUJPO ແ໊ؔ਺ͷఆ͕ٛΊΜͲ͏ ؔ਺ͷద༻Ҏ֎ͷॲཧ͸Ͱ͖ͳ͍ ؔ਺͕ୈҰҾ਺ͷͨΊύΠϓԋࢉࢠ͕࢖͍ʹ͍͘ ֮͑ΒΕͳ͍ ※ ؔ਺ΛҾ਺΍ฦΓ஋ʹ͢Δؔ਺

$SFBUJWF$PNNPOT$$GSPNUIF1JYBCBZ ͳΜͱ͔ͳΒͳ͍͔ʜ

purrr !!! 1IPUPCZԶ

{purrr} ͱ͸

{purrr} ͱ͸ • RStudio BlogΑΓ • ؔ਺ܕϓϩάϥϛϯάͷͨΊͷπʔϧ • ॲཧΛγϯϓϧͳؔ਺ͷ૊Έ߹ΘͤͰهड़ •
ؔ਺͸ύΠϓԋࢉࢠͰ઀ଓ Purrr is a new package that fills in the missing pieces in R’s functional programming tools: it’s designed to make your pure functions purrr. Like many of my recent packages, it works with magrittr to allow you to express complex operations by combining simple pieces in a standard way.

ؔ਺ܕϓϩάϥϛϯάͱ͸? • (Ұҙͷఆ͕ٛ͋ΔΘ͚Ͱ͸ͳ͍͕) ෳ਺ͷؔ਺ͷ ૊Έ߹ΘͤʹΑͬͯ࠷ऴతʹ΍Γ͍ͨॲཧΛه ड़͍ͯ͘͠ελΠϧɻ • ؔ਺ܕϓϩάϥϛϯάΛ૝ఆͨ͠ݴޠ͸ؔ਺ ܕݴޠͱݺ͹ΕΔ

R ͸ؔ਺ܕݴޠ͔? • Yes (by Hadley Wickham in “Advanced R”)
. • ৄ͘͠͸ “Rݴޠపఈղઆ” (๜༁͕12/23ൃച)

{purrr} ͷಛ௃ • ϥϜμࣜ • ύΠϓԋࢉࢠͰ઀ଓͰ͖Δؔ਺܈ • ஫ҙ఺ • όʔδϣϯ͸0.1ɻࠓޙ
ഁյతͳมߋ͕͋ΔՄೳੑ͕͋Δɻ • ຊ೔͸ओཁͳؔ਺ͷΈ͝঺հ (͜ΕͰ΄ͱΜͲͷ͜ͱ͸Ͱ͖Δ)ɻ

ϥϜμࣜ • ໊લΛ΋ͨͳ͍ؔ਺ (ແ໊ؔ਺) Λγϯϓϧʹهड़ • R ඪ४ • {purrr}
ͷϥϜμࣜ • {purrr} ͷؔ਺ͷதͰ࢖͏ (ߴ֊ؔ਺) function(x) { x + 1 } ~ . + 1 map(x, ~ . + 1) υοτ ͕Ҿ਺ʹରԠ

{purrr} ͷؔ਺܈ (Ұ෦) ඪ४ QVSSS BQQMZܥ NBQ 'JMUFS LFFQ 3FEVDF
SFEVDF TQMJU TQMJU@CZ TPSU TPSU@CZ

map • ؔ਺Λ֤ཁૉʹద༻ map(c(1, 2, 3), ~ . + 1)
## x = list 3 (216 bytes) ## . [[1]] = double 1= 2 ## . [[2]] = double 1= 3 ## . [[3]] = double 1= 4 map(list(a = 1, b = 2, c = 3), ~ . + 1) ## x = list 3 (544 bytes) ## . a = double 1= 2 ## . b = double 1= 3 ## . c = double 1= 4 map(c(1, 2, 3), ~ . + 1) == list(1 + 1, 2 + 1, 3 + 1)

map2 • 2ͭͷҾ਺ʹରͯؔ͠਺ద༻ map2(c(1, 2, 3), c(4, 5, 6), ~
.x * .y) ## x = list 3 (216 bytes) ## . [[1]] = double 1= 4 ## . [[2]] = double 1= 10 ## . [[3]] = double 1= 18 map2(c(1, 2, 3), c(4, 5, 6), ~ .x * .y) == list(1 * 4, 2 * 5, 3 * 6) ͻͱͭΊ͔ΒͷཁૉΛY ;ͨͭΊ͔ΒͷཁૉΛZͰࢀর

map_xxx • ݁ՌΛϕΫτϧͰऔಘɺܕͷࢦఆ͕ඞཁ map_int(list(a = 1L, b = 2L, c
= 3L), ~ . + 1L) ## a b c ## 2 3 4 NBQ@YYY ϕΫτϧͷܕ NBQ@MHM MPHJDBM NBQ@DIS DIBSBDUFS NBQ@JOU JOUFHFS NBQ@ECM OVNSJD

keep • ؔ਺ͷ৚݅ʹ͋ͯ͸·ΔཁૉΛநग़ keep(c(1, 2, 3), ~ . >= 2)
## [1] 2 3 keep(list(a = 1, b = 2, c = 3), ~ . >= 2) ## x = list 2 (416 bytes) ## . b = double 1= 2 ## . c = double 1= 3

reduce • ͨͨΈࠐΈ reduce(c(1, 2, 3), `+`) ## [1] 6
reduce(list(a = 1, b = 2, c = 3), `+`) ## [1] 6 reduce(c(1, 2, 3), `+`) == ((1 + 2) + 3)

split_by • base::split + ϥϜμࣜ split_by(c(1, 2, 3), ~ .
%% 2) ## x = list 2 (424 bytes) ## . 0 = double 1= 2 ## . 1 = double 2= 1 3 split_by(list(a = 1, b = 2, c = 3), ~ . %% 2) ## x = list 2 (1040 bytes) ## . 0 = list 1 ## . . b = double 1= 2 ## . 1 = list 2 ## . . a = double 1= 1 ## . . c = double 1= 3

sort_by • ιʔτ sort_by(c(2, -3, 1), ~ abs(.)) ## [1]
1 2 -3 sort_by(list(a = 2, b = - 3, c = 1), ~ abs(.)) ## x = list 3 (544 bytes) ## . c = double 1= 1 ## . a = double 1= 2 ## . b = double 1= -3

2. αϯϓϧ

{purrr} • 1. μϛʔσʔλͷ࡞੒ • 2. Ϧετ͔Βͷσʔλબ୒ • 3. ෳ਺Ϟσϧͷ࡞੒ͱൺֱ
• 4. ϓϩοτ

1. μϛʔσʔλͷ࡞੒ • ద౰ͳμϛʔσʔλΛ࡞Γ͍ͨ • ҎԼͷଐੑΛ΋ͭϦετͷϦετ • name: ໊લ •
age: ೥ྸ • likes: ޷͖ͳਓ

1. μϛʔσʔλͷ࡞੒ ndata <- c('hoxo-m', 'hoxo-eros', 'hoxo-um', 'hoxo-uri') dummies <-
map(1:5, ~ list(name = sample(ndata, size = 1), age = sample(25:35, size = 1), likes = sample(ndata, size = 2))) dummies ## x = list 5 (3632 bytes) ## . [[1]] = list 3 ## . . name = character 1= hoxo-eros ## . . age = integer 1= 29 ## . . likes = character 2= hoxo-um hoxo-uri ## . [[2]] = list 3 ## . . name = character 1= hoxo-m ## . . age = integer 1= 34 ## . . likes = character 2= hoxo-uri hoxo-eros ## . [[3]] = list 3 ## . . name = character 1= hoxo-um ## . . age = integer 1= 25 ## . . likes = character 2= hoxo-m hoxo-uri ## . [[4]] = list 3 ## . . name = character 1= hoxo-um ## . . age = integer 1= 29 ## . . likes = character 2= hoxo-uri hoxo-eros ## . [[5]] = list 3 ## . . name = character 1= hoxo-um ## . . age = integer 1= 35 ## . . likes = character 2= hoxo-eros hoxo-um ϦετΛฦ͢ϥϜμࣜ

1. μϛʔσʔλͷ࡞੒ dummies <- map(ndata, ~ list(name = ., age
= sample(30:35, size = 1), likes = sample(ndata[ndata != .], size = sample(1:(length(ndata) - 1), size = 1)))) dummies ## x = list 4 (2896 bytes) ## . [[1]] = list 3 ## . . name = character 1= hoxo-m ## . . age = integer 1= 31 ## . . likes = character 2= hoxo-um hoxo-uri ## . [[2]] = list 3 ## . . name = character 1= hoxo-eros ## . . age = integer 1= 31 ## . . likes = character 3= hoxo-uri hoxo-um ... ## . [[3]] = list 3 ## . . name = character 1= hoxo-um ## . . age = integer 1= 30 ## . . likes = character 1= hoxo-m ## . [[4]] = list 3 ## . . name = character 1= hoxo-uri ## . . age = integer 1= 34 ## . . likes = character 2= hoxo-um hoxo-m

1. μϛʔσʔλͷ࡞੒ gen <- function(name) { age = sample(30:35, size
= 1) size = sample(1:(length(ndata) - 1), size = 1) likes = sample(ndata[ndata != name], size = size) return (list(name = name, age = age, likes = likes)) } dummies <- map(ndata, ~ gen(.)) dummies ## x = list 4 (2896 bytes) ## . [[1]] = list 3 ## . . name = character 1= hoxo-m ## . . age = integer 1= 31 ## . . likes = character 2= hoxo-um hoxo-uri ## . [[2]] = list 3 ## . . name = character 1= hoxo-eros ## . . age = integer 1= 31 ## . . likes = character 3= hoxo-uri hoxo-um ... ## . [[3]] = list 3 ## . . name = character 1= hoxo-um ## . . age = integer 1= 30 ## . . likes = character 1= hoxo-m ## . [[4]] = list 3 ## . . name = character 1= hoxo-uri ## . . age = integer 1= 34 ## . . likes = character 2= hoxo-um hoxo-m ී௨ͷؔ਺Λ౉ͯ͠΋Α͍

1. μϛʔσʔλͷ࡞੒ ndata %>% map(~ list(name = ., age =
sample(30:35, size = 1))) %>% dplyr::bind_rows() EBUBGSBNFʹ͚ͨ͠Ε͹ EQMZSCJOE@SPXT ## Source: local data frame [4 x 2] ## ## name age ## (chr) (int) ## 1 hoxo-m 34 ## 2 hoxo-eros 35 ## 3 hoxo-um 32 ## 4 hoxo-uri 34

2. Ϧετ͔Βͷσʔλબ୒ • ؆୯ͳૢ࡞͸ {purrr} Ͱ΋Ͱ͖Δ • Ϩίʔυͷબ୒ • ଐੑͷબ୒
• ࢀߟ: {rlist} ͱͷൺֱ • {purrr} ͰϦετσʔλΛૢ࡞͢Δ <1> • {purrr} ͰϦετσʔλΛૢ࡞͢Δ <2>

2. Ϧετ͔Βͷσʔλબ୒ keep(dummies, ~ .$name == 'hoxo-m') ## x =
list 1 (752 bytes) ## . [[1]] = list 3 ## . . name = character 1= hoxo-m ## . . age = integer 1= 31 ## . . likes = character 2= hoxo-um hoxo-uri OBNFଐੑ͕bIPYPN`ͷ ϨίʔυΛબ୒

2. Ϧετ͔Βͷσʔλબ୒ keep(dummies, ~ .$age > 30) ## x =
list 3 (2256 bytes) ## . [[1]] = list 3 ## . . name = character 1= hoxo-m ## . . age = integer 1= 31 ## . . likes = character 2= hoxo-um hoxo-uri ## . [[2]] = list 3 ## . . name = character 1= hoxo-eros ## . . age = integer 1= 31 ## . . likes = character 3= hoxo-uri hoxo-um ... ## . [[3]] = list 3 ## . . name = character 1= hoxo-uri ## . . age = integer 1= 34 ## . . likes = character 2= hoxo-um hoxo-m BHFଐੑ͕ΑΓେ͖͍ ͓͡͞ΜΛબ୒ ˞࣮ࡍͷ೥ྸͱ͸ҟͳΔՄೳੑ͕͋Γ·͢

2. Ϧετ͔Βͷσʔλબ୒ keep(dummies, ~ 'hoxo-m' %in% .$likes) ## x =
list 3 (2192 bytes) ## . [[1]] = list 3 ## . . name = character 1= hoxo-eros ## . . age = integer 1= 31 ## . . likes = character 3= hoxo-uri hoxo-um ... ## . [[2]] = list 3 ## . . name = character 1= hoxo-um ## . . age = integer 1= 30 ## . . likes = character 1= hoxo-m ## . [[3]] = list 3 ## . . name = character 1= hoxo-uri ## . . age = integer 1= 34 ## . . likes = character 2= hoxo-um hoxo-m bIPYPN`͞Μͷ͜ͱ͕ ޷͖ͳਓΛநग़ ˞ࣄ࣮ͱ͸ҟͳΔՄೳੑ͕͋Γ·͢

2. Ϧετ͔Βͷσʔλબ୒ keep(dummies, ~ 'hoxo-m' %in% .$likes) %>% map(~ .$name)
## x = list 3 (376 bytes) ## . [[1]] = character 1= hoxo-eros ## . [[2]] = character 1= hoxo-um ## . [[3]] = character 1= hoxo-uri ϑΟϧλ݁Ռ͔Β OBNFଐੑͷΈΛநग़ keep(dummies, ~ 'hoxo-m' %in% .$likes) %>% map('name') ## x = list 3 (376 bytes) ## . [[1]] = character 1= hoxo-eros ## . [[2]] = character 1= hoxo-um ## . [[3]] = character 1= hoxo-uri NBQʹจࣈྻΛ ౉ͯ͠΋Α͍

• σʔλղੳͷͨΊͷ౷ܭϞσϦϯάೖ໳ • ୈ3ষΑΓ • y: छࢠ਺ • x: ২෺ͷαΠζ
• f: ࢪං͋Γ(T) / ͳ͠ 3. ෳ਺Ϟσϧͷ࡞੒ͱൺֱ df <- read.csv('data3a.csv') head(df) ## y x f ## 1 6 8.31 C ## 2 6 9.44 C ## 3 6 9.50 C ## 4 12 9.07 C ## 5 10 10.16 C ## 6 4 8.32 C

3. ෳ਺Ϟσϧͷ࡞੒ͱൺֱ formulas <- list(mod1 = y ~ x, mod2
= y ~ f, mod3 = y ~ x + f) results <- purrr::map(formulas, ~ glm(formula = ., family = poisson, data = df)) results[[1]] Call: glm(formula = .x, family = .y, data = df) Coefficients: (Intercept) x 1.29172 0.07566 Degrees of Freedom: 99 Total (i.e. Null); 98 Residual Null Deviance: 89.51 Residual Deviance: 84.99 AIC: 474.8 GPSNVMBͷϦετΛ࡞੒ ֤GPSNVMBͰ ϞσϧΛ࡞੒

3. ෳ਺Ϟσϧͷ࡞੒ͱൺֱ map(results, logLik) ## x = list 3 (2128
bytes) ## . mod1 = double 1( logLik )= -235.39 ## . A nobs = integer 1= 100 ## . A df = integer 1= 2 ## . mod2 = double 1( logLik )= -237.63 ## . A nobs = integer 1= 100 ## . A df = integer 1= 2 ## . mod3 = double 1( logLik )= -235.29 ## . A nobs = integer 1= 100 ## . A df = integer 1= 3 ݁Ռ͢΂ͯʹ MPH-JLؔ਺Λద༻

3. ෳ਺Ϟσϧͷ࡞੒ͱൺֱ map(results, AIC) ## x = list 3 (544
bytes) ## . mod1 = double 1= 474.77 ## . mod2 = double 1= 479.25 ## . mod3 = double 1= 476.59 sort_by(results, AIC) %>% names() ## [1] "mod1" "mod3" "mod2" ݁Ռ͢΂ͯʹ "*$ؔ਺Λద༻ "*$ͷॱʹฒ΂ସ͑ɺ Ϟσϧ໊Λදࣔ

3. ෳ਺Ϟσϧͷ࡞੒ͱൺֱ formulas <- c(y ~ x, y ~ x,
y ~ f) families <- c(poisson, gaussian, poisson) results2 <- purrr::map2(formulas, families, ~ glm(formula = .x, family = .y, data = df)) results2[[1]] Call: glm(formula = .x, family = .y, data = df) Coefficients: (Intercept) x 1.29172 0.07566 Degrees of Freedom: 99 Total (i.e. Null); 98 Residual Null Deviance: 89.51 Residual Deviance: 84.99 AIC: 474.8 ෳ਺ύϥϝʔλͷ৔߹͸ NBQ

4. ϓϩοτ plot(results[[1]], which = 2) map(results, ~ plot(., which
= 2)) ݁Ռͷ͏ͪͻͱͭΛ 22ϓϩοτ ͢΂ͯΛ22ϓϩοτ ॱ൪ʹදࣔ͞ΕΔ αϒϓϩοτ͚ͨ͠Ε͹QBS

͓·͚

{purrr} ͱ૊Έ߹Θͤͯ ಛʹ͏Ε͍͠ύοέʔδ

{caret} • ػցֶशશ෦ೖΓ library(mlbench) library(caret) data(PimaIndiansDiabetes) control <- trainControl(method =
"repeatedcv", number = 10, repeats = 3) methods <- c('gbm', 'rpart', 'svmRadial') trained <- methods %>% purrr::map(~ train(diabetes ~ ., data = PimaIndiansDiabetes, method = ., trControl = control)) resampled <- resamples(trained) summary(resampled) ## Call: ## summary.resamples(object = resampled) ## ## Models: Model1, Model2, Model3 ## Number of resamples: 30 ## ## Accuracy ## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's ## Model1 0.6883 0.7435 0.7662 0.7682 0.7915 0.8442 0 ## Model2 0.6711 0.7273 0.7451 0.7499 0.7785 0.8312 0 ## Model3 0.6883 0.7403 0.7662 0.7678 0.7922 0.8312 0 ## ## Kappa ## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's ## Model1 0.3037 0.4297 0.4763 0.4711 0.5268 0.6578 0 ## Model2 0.2339 0.3657 0.4078 0.4245 0.4733 0.6128 0 ## Model3 0.2518 0.4043 0.4610 0.4607 0.5138 0.6165 0 ൺֱ͍ͨ͠ϞσϧΛϦετʹ͠ɺ ֤ϞσϧΛUSBJO

{broom} • ϞσϧΛ data.frame ʹม׵͢Δ library(broom) glance(results[[1]]) ## null.deviance df.null
logLik AIC BIC deviance df.residual ## 1 89.50694 99 -235.3863 474.7725 479.9828 84.993 98 ؔ਺ ֓ཁ UJEZ Ϟσϧͷཁ໿ BVHNFOU Ϟσϧͷ৘ใΛݩͷEBUBGSBNFʹ෇༩͢Δ HMBODF Ұߦͷཁ໿ • 3. ෳ਺Ϟσϧͷ࡞੒ͱൺֱͷ݁ՌΛྲྀ༻

{broom} • ϞσϧΛ data.frame ʹม׵͢Δ map(results, broom::glance) %>% dplyr::bind_rows() ##
Source: local data frame [3 x 7] ## ## null.deviance df.null logLik AIC BIC deviance df.residual ## (dbl) (int) (dbl) (dbl) (dbl) (dbl) (int) ## 1 89.50694 99 -235.3863 474.7725 479.9828 84.99300 98 ## 2 89.50694 99 -237.6273 479.2545 484.4649 89.47501 98 ## 3 89.50694 99 -235.2937 476.5874 484.4029 84.80793 97 ෳ਺ͷϞσϧͷཁ໿Λ EBUBGSBNFʹ

{ggfortify} • ϞσϧΛ ggplot2::autoplot Λར༻ͯ͠ඳը͢Δ • {purrr} ͰϞσϧͷϦετΛ࡞Δ -> autoplot
library(ggfortify) autoplot(results, which = 2)

·ͱΊ • {purrr} ͸ؔ਺ܕϓϩάϥϛϯάͷͨΊͷπʔϧ • ॲཧΛγϯϓϧͳؔ਺ͷ૊Έ߹ΘͤͰهड़ • ؔ਺͸ύΠϓԋࢉࢠͰ઀ଓ

Enjoy!

Japan.R 2015: {purrr} による非テーブルデータの処理

Japan.R 2015: {purrr} による非テーブルデータの処理

More Decks by Sinhrks

Other Decks in Technology

Featured

Transcript