Slide 1

Slide 1 text

Processing non-tabular data with {purrr}

Slide 2

Slide 2 text

About me
• @sinhrks
• Work: data analysis (at a manufacturer)
• Languages: Python, R, and so on
• Hobby: OSS activities
  • pandas* committer (Python package)
  * Roughly R's data.frame + dplyr + tidyr + readr + haven + lubridate + stringr + ggplot2… (and so on) in one package

Slide 3

Slide 3 text

I am not here today to diss [image].

Slide 4

Slide 4 text

Why Japan.R?
• I made R packages (GitHub Awards, current standing)
• Selected in "These R developers are amazing (2015)" by @u_ribo
  • http://uribo.hatenablog.com/entry/2015/12/02/180004

Slide 5

Slide 5 text

Why [image]?

Slide 6

Slide 6 text

Why R?
• Programming is not my specialty, but my field required statistics / machine learning
• I am an engineer, but came to need statistics / machine learning

Slide 7

Slide 7 text

Assumed level and goal
• Prerequisite: you have used dplyr and ggplot2 a little
• Goal: learn the basic usage of {purrr}
  • You will no longer need to write code like this:
    glm.fit1 <- glm(y ~ x, …)
    glm.fit2 <- glm(y ~ x + z, …)
    …
    predict(glm.fit1, newdata = newdata)
    predict(glm.fit2, newdata = newdata)
    …
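The repetitive pattern above can be collapsed into a list of formulas and two `map` calls. This is only a sketch under stated assumptions: the built-in `mtcars` data, the two formulas, and the `newdata` values are stand-ins chosen for illustration and do not come from the talk.

```r
library(purrr)

# Assumed stand-ins for the talk's y, x, z: columns of the built-in mtcars data.
formulas <- list(fit1 = mpg ~ wt, fit2 = mpg ~ wt + hp)

# Fit one (Gaussian) glm per formula instead of writing glm.fit1, glm.fit2, ...
fits <- map(formulas, ~ glm(.x, data = mtcars))

# Predict from every model with the same made-up new observations.
newdata <- data.frame(wt = c(2.5, 3.0), hp = c(100, 150))
preds <- map(fits, predict, newdata = newdata)
```

Adding a third model then means adding one entry to `formulas`; the fitting and prediction lines stay unchanged.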

Slide 8

Slide 8 text

Today's topic: data processing with {purrr}

Slide 9

Slide 9 text

Data processing flow
Data preparation -> Modeling -> Evaluation

Slide 10

Slide 10 text

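Data processing in R (diagram)
• Data structures: data.frame, list / vector, model
• Packages: ggplot2, tidyr, dplyr, readr, rvest, rlist, purrr, various statistics / machine learning packages, broom, caret, ggfortify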

Slide 11

Slide 11 text

Processing data.frames

Slide 12

Slide 12 text

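Processing data.frames
• Data processing follows patterns
• Example: group a data.frame by the values of one column, compute the mean of each column, and get the result as a data.frame.
Split -> Apply -> Combine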

Slide 13

Slide 13 text

plyr
• Built-in functions:
    iris_group <- split(iris, iris$Species)
    res <- sapply(iris_group, function(x) {
      sapply(x[, 1:4], mean)
    })
    as.data.frame(t(res))
    ##            Sepal.Length Sepal.Width Petal.Length Petal.Width
    ## setosa            5.006       3.428        1.462       0.246
    ## versicolor        5.936       2.770        4.260       1.326
    ## virginica         6.588       2.974        5.552       2.026
• plyr: split - apply - combine in a single function
    library(plyr)
    plyr::ddply(iris, .(Species), plyr::colwise(mean))
    ##      Species Sepal.Length Sepal.Width Petal.Length Petal.Width
    ## 1     setosa        5.006       3.428        1.462       0.246
    ## 2 versicolor        5.936       2.770        4.260       1.326
    ## 3  virginica        6.588       2.974        5.552       2.026

Slide 14

Slide 14 text

dplyr
• plyr: split - apply - combine in a single function
    library(plyr)
    plyr::ddply(iris, .(Species), plyr::colwise(mean))
    ##      Species Sepal.Length Sepal.Width Petal.Length Petal.Width
    ## 1     setosa        5.006       3.428        1.462       0.246
    ## 2 versicolor        5.936       2.770        4.260       1.326
    ## 3  virginica        6.588       2.974        5.552       2.026
• dplyr: apply one function per processing step
    library(dplyr)
    dplyr::summarise_each(dplyr::group_by(iris, Species), dplyr::funs(mean))
    ## Source: local data frame [3 x 5]
    ##
    ##      Species Sepal.Length Sepal.Width Petal.Length Petal.Width
    ##       (fctr)        (dbl)       (dbl)        (dbl)       (dbl)
    ## 1     setosa        5.006       3.428        1.462       0.246
    ## 2 versicolor        5.936       2.770        4.260       1.326
    ## 3  virginica        6.588       2.974        5.552       2.026
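`summarise_each()` and `funs()` were current when this talk was given; newer dplyr releases deprecate them in favour of `across()`. A minimal sketch of the same computation, assuming dplyr 1.0 or later is installed:

```r
library(dplyr)

# Same split-apply-combine: group by Species, then take the mean of every other column.
iris_means <- iris %>%
  group_by(Species) %>%
  summarise(across(everything(), mean))
```

The result has one row per species and the same four mean columns as the output shown on this slide.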

Slide 15

Slide 15 text

The pipe operator %>%
• magrittr
    iris %>%
      dplyr::group_by(Species) %>%
      dplyr::summarise_each(dplyr::funs(mean))
    ## Source: local data frame [3 x 5]
    ##
    ##      Species Sepal.Length Sepal.Width Petal.Length Petal.Width
    ##       (fctr)        (dbl)       (dbl)        (dbl)       (dbl)
    ## 1     setosa        5.006       3.428        1.462       0.246
    ## 2 versicolor        5.936       2.770        4.260       1.326
    ## 3  virginica        6.588       2.974        5.552       2.026
x %>% f(…) == f(x, …)
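The rule `x %>% f(…) == f(x, …)` can be checked directly on a small vector. The values below are arbitrary and chosen only for illustration.

```r
library(magrittr)

x <- c(1, 4, 9)

# x %>% f() is the same call as f(x)
piped  <- x %>% sqrt()
direct <- sqrt(x)

# Extra arguments follow the piped value: x %>% f(y) is f(x, y)
piped_round  <- (x / 7) %>% round(2)
direct_round <- round(x / 7, 2)
```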

Slide 16

Slide 16 text

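Processing data.frames
• The idea
  • Describe the processing as a combination of simple functions
  • Connect the functions with the pipe operator %>%
• We want to do the same with things other than data.frames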

Slide 17

Slide 17 text

Processing things other than data.frames

Slide 18

Slide 18 text

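Does this sound familiar?
• You want to take some operation and
  • repeat it
  • apply it to multiple targets
• You want to handle data that cannot be turned into a data.frame as-is (JSON / XML, etc.)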

Slide 19

Slide 19 text

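For example
• Fetch data as JSON from a Web API and reshape it
• Exclude data that does not match some condition
• Create a data.frame
• Create / compare / plot multiple models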

Slide 20

Slide 20 text

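Doing it with built-in functions…
• Apply family: apply, lapply, sapply, tapply, mapply
    lapply(list(1, 2, 3), function (x) { x + 1 })
• Higher-order functions*: Reduce, Filter, Find, Map, Negate, Position
    Map(function (x) { x + 1 }, list(1, 2, 3))
• Pain points
  • Defining anonymous functions is tedious
  • Nothing other than applying a function is possible
  • The function is the first argument, so the pipe operator is awkward to use
  • Hard to remember
* Functions that take functions as arguments or return them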
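The argument-order mismatch mentioned above is easy to see side by side. This sketch uses only base R; the variable names are illustrative.

```r
x <- list(1, 2, 3)

# Apply family: data first, function second
plus_one <- lapply(x, function(v) v + 1)

# Higher-order functions: function first, data second,
# which is what makes them awkward to use after a pipe
plus_one_map <- Map(function(v) v + 1, x)
at_least_two <- Filter(function(v) v >= 2, x)
total        <- Reduce(`+`, x)
```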

Slide 21

Slide 21 text

Can't something be done about this…
(Image: Creative Commons (CC), from Pixabay)

Slide 22

Slide 22 text

purrr !!!
(Photo by me)

Slide 23

Slide 23 text

What is {purrr}?

Slide 24

Slide 24 text

What is {purrr}?
• From the RStudio Blog:
  "Purrr is a new package that fills in the missing pieces in R's functional programming tools: it's designed to make your pure functions purrr. Like many of my recent packages, it works with magrittr to allow you to express complex operations by combining simple pieces in a standard way."
• A tool for functional programming
  • Describe the processing as a combination of simple functions
  • Connect the functions with the pipe operator

Slide 25

Slide 25 text

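What is functional programming?
• (There is no single agreed definition, but) a style in which the processing you ultimately want is described by combining multiple functions.
• Languages designed with functional programming in mind are called functional languages.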

Slide 26

Slide 26 text

Is R a functional language?
• Yes (by Hadley Wickham in "Advanced R").
• For details, see "R言語徹底解説" (the Japanese translation, on sale 12/23)

Slide 27

Slide 27 text

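Features of {purrr}
• Lambda expressions
• A family of functions that can be connected with the pipe operator
• Caveats
  • The version is 0.1. Breaking changes may come in the future.
  • Today covers only the main functions (these are enough for most tasks).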

Slide 28

Slide 28 text

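Lambda expressions
• A concise way to write a function without a name (an anonymous function)
• Base R:
    function(x) { x + 1 }
• {purrr} lambda expression (the dot corresponds to the argument):
    ~ . + 1
• Used inside {purrr} functions (higher-order functions):
    map(x, ~ . + 1)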
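A short sketch comparing the two spellings. It assumes a purrr version that exports `as_mapper()`, which the 0.1 release described in this talk may not have; the variable names are illustrative.

```r
library(purrr)

x <- c(1, 2, 3)

# Ordinary anonymous function
with_function <- map(x, function(v) v + 1)

# purrr formula lambda: `.` (or `.x`) stands for the current element
with_dot   <- map(x, ~ . + 1)
with_dot_x <- map(x, ~ .x + 1)

# as_mapper() turns the formula into a plain callable function
add_one <- as_mapper(~ .x + 1)
```

All three `map` calls return the same list, so the choice between them is purely about readability.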

Slide 29

Slide 29 text

{purrr} functions (a subset)
Base R         purrr
apply family   map
Filter         keep
Reduce         reduce
split          split_by
sort           sort_by

Slide 30

Slide 30 text

map
• Applies a function to each element
    map(c(1, 2, 3), ~ . + 1)
    ## x = list 3 (216 bytes)
    ## . [[1]] = double 1= 2
    ## . [[2]] = double 1= 3
    ## . [[3]] = double 1= 4
    map(list(a = 1, b = 2, c = 3), ~ . + 1)
    ## x = list 3 (544 bytes)
    ## . a = double 1= 2
    ## . b = double 1= 3
    ## . c = double 1= 4
• Equivalence:
    map(c(1, 2, 3), ~ . + 1) == list(1 + 1, 2 + 1, 3 + 1)

Slide 31

Slide 31 text

map2
• Applies a function to two arguments in parallel
    map2(c(1, 2, 3), c(4, 5, 6), ~ .x * .y)
    ## x = list 3 (216 bytes)
    ## . [[1]] = double 1= 4
    ## . [[2]] = double 1= 10
    ## . [[3]] = double 1= 18
• Elements from the first argument are referenced as .x, elements from the second as .y
• Equivalence:
    map2(c(1, 2, 3), c(4, 5, 6), ~ .x * .y) == list(1 * 4, 2 * 5, 3 * 6)

Slide 32

Slide 32 text

map_xxx
• Returns the result as a vector; the type must be specified
    map_int(list(a = 1L, b = 2L, c = 3L), ~ . + 1L)
    ## a b c
    ## 2 3 4
map_xxx    Vector type
map_lgl    logical
map_chr    character
map_int    integer
map_dbl    numeric
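One call per variant makes the table above concrete. The input list is made up for illustration.

```r
library(purrr)

x <- list(a = 1.5, b = 2, c = 3.25)

halves      <- map_dbl(x, ~ .x / 2)           # double vector
is_big      <- map_lgl(x, ~ .x >= 2)          # logical vector
labels      <- map_chr(x, ~ paste0("v", .x))  # character vector
lengths_int <- map_int(x, length)             # integer vector
```

Each variant keeps the names of the input and, as far as I know, raises an error rather than silently converting when the function returns a value of another type (for example `map_int(x, ~ .x / 2)`).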

Slide 33

Slide 33 text

keep
• Extracts the elements that satisfy the function's condition
    keep(c(1, 2, 3), ~ . >= 2)
    ## [1] 2 3
    keep(list(a = 1, b = 2, c = 3), ~ . >= 2)
    ## x = list 2 (416 bytes)
    ## . b = double 1= 2
    ## . c = double 1= 3

Slide 34

Slide 34 text

reduce
• Fold
    reduce(c(1, 2, 3), `+`)
    ## [1] 6
    reduce(list(a = 1, b = 2, c = 3), `+`)
    ## [1] 6
• Equivalence:
    reduce(c(1, 2, 3), `+`) == ((1 + 2) + 3)

Slide 35

Slide 35 text

split_by
• base::split + lambda expression
    split_by(c(1, 2, 3), ~ . %% 2)
    ## x = list 2 (424 bytes)
    ## . 0 = double 1= 2
    ## . 1 = double 2= 1 3
    split_by(list(a = 1, b = 2, c = 3), ~ . %% 2)
    ## x = list 2 (1040 bytes)
    ## . 0 = list 1
    ## . . b = double 1= 2
    ## . 1 = list 2
    ## . . a = double 1= 1
    ## . . c = double 1= 3

Slide 36

Slide 36 text

sort_by
• Sort
    sort_by(c(2, -3, 1), ~ abs(.))
    ## [1] 1 2 -3
    sort_by(list(a = 2, b = - 3, c = 1), ~ abs(.))
    ## x = list 3 (544 bytes)
    ## . c = double 1= 1
    ## . a = double 1= 2
    ## . b = double 1= -3
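`split_by()` and `sort_by()` belong to the 0.1-era API shown here; as far as I know, later purrr releases no longer provide them. The following is a sketch of equivalents built from base R plus `map_dbl()`, under that assumption:

```r
library(purrr)

# sort_by(x, ~ abs(.)): order the elements by a computed key
x <- c(2, -3, 1)
sorted <- x[order(map_dbl(x, ~ abs(.x)))]

# split_by(y, ~ . %% 2): group the elements by a computed key
y <- c(1, 2, 3)
groups <- split(y, map_dbl(y, ~ .x %% 2))
```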

Slide 37

Slide 37 text

2. Examples

Slide 38

Slide 38 text

{purrr}
• 1. Creating dummy data
• 2. Selecting data from a list
• 3. Creating and comparing multiple models
• 4. Plotting

Slide 39

Slide 39 text

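1. Creating dummy data
• We want some arbitrary dummy data
• A list of lists with the following attributes
  • name: the person's name
  • age: the person's age
  • likes: the people this person likes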

Slide 40

Slide 40 text

1. Creating dummy data
• A lambda expression that returns a list
    ndata <- c('hoxo-m', 'hoxo-eros', 'hoxo-um', 'hoxo-uri')
    dummies <- map(1:5, ~ list(name = sample(ndata, size = 1),
                               age = sample(25:35, size = 1),
                               likes = sample(ndata, size = 2)))
    dummies
    ## x = list 5 (3632 bytes)
    ## . [[1]] = list 3
    ## . . name = character 1= hoxo-eros
    ## . . age = integer 1= 29
    ## . . likes = character 2= hoxo-um hoxo-uri
    ## . [[2]] = list 3
    ## . . name = character 1= hoxo-m
    ## . . age = integer 1= 34
    ## . . likes = character 2= hoxo-uri hoxo-eros
    ## . [[3]] = list 3
    ## . . name = character 1= hoxo-um
    ## . . age = integer 1= 25
    ## . . likes = character 2= hoxo-m hoxo-uri
    ## . [[4]] = list 3
    ## . . name = character 1= hoxo-um
    ## . . age = integer 1= 29
    ## . . likes = character 2= hoxo-uri hoxo-eros
    ## . [[5]] = list 3
    ## . . name = character 1= hoxo-um
    ## . . age = integer 1= 35
    ## . . likes = character 2= hoxo-eros hoxo-um

Slide 41

Slide 41 text

1. Creating dummy data
    dummies <- map(ndata, ~ list(name = .,
                                 age = sample(30:35, size = 1),
                                 likes = sample(ndata[ndata != .],
                                                size = sample(1:(length(ndata) - 1), size = 1))))
    dummies
    ## x = list 4 (2896 bytes)
    ## . [[1]] = list 3
    ## . . name = character 1= hoxo-m
    ## . . age = integer 1= 31
    ## . . likes = character 2= hoxo-um hoxo-uri
    ## . [[2]] = list 3
    ## . . name = character 1= hoxo-eros
    ## . . age = integer 1= 31
    ## . . likes = character 3= hoxo-uri hoxo-um ...
    ## . [[3]] = list 3
    ## . . name = character 1= hoxo-um
    ## . . age = integer 1= 30
    ## . . likes = character 1= hoxo-m
    ## . [[4]] = list 3
    ## . . name = character 1= hoxo-uri
    ## . . age = integer 1= 34
    ## . . likes = character 2= hoxo-um hoxo-m

Slide 42

Slide 42 text

1. Creating dummy data
• You can also pass an ordinary function
    gen <- function(name) {
      age <- sample(30:35, size = 1)
      size <- sample(1:(length(ndata) - 1), size = 1)
      likes <- sample(ndata[ndata != name], size = size)
      return (list(name = name, age = age, likes = likes))
    }
    dummies <- map(ndata, ~ gen(.))
    dummies
    ## x = list 4 (2896 bytes)
    ## . [[1]] = list 3
    ## . . name = character 1= hoxo-m
    ## . . age = integer 1= 31
    ## . . likes = character 2= hoxo-um hoxo-uri
    ## . [[2]] = list 3
    ## . . name = character 1= hoxo-eros
    ## . . age = integer 1= 31
    ## . . likes = character 3= hoxo-uri hoxo-um ...
    ## . [[3]] = list 3
    ## . . name = character 1= hoxo-um
    ## . . age = integer 1= 30
    ## . . likes = character 1= hoxo-m
    ## . [[4]] = list 3
    ## . . name = character 1= hoxo-uri
    ## . . age = integer 1= 34
    ## . . likes = character 2= hoxo-um hoxo-m

Slide 43

Slide 43 text

1. Creating dummy data
• To get a data.frame, use dplyr::bind_rows
    ndata %>%
      map(~ list(name = ., age = sample(30:35, size = 1))) %>%
      dplyr::bind_rows()
    ## Source: local data frame [4 x 2]
    ##
    ##        name   age
    ##       (chr) (int)
    ## 1    hoxo-m    34
    ## 2 hoxo-eros    35
    ## 3   hoxo-um    32
    ## 4  hoxo-uri    34

Slide 44

Slide 44 text

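2. Selecting data from a list
• Simple operations can also be done with {purrr}
  • Selecting records
  • Selecting attributes
• Reference: comparison with {rlist}
  • Manipulating list data with {purrr} <1>
  • Manipulating list data with {purrr} <2>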

Slide 45

Slide 45 text

2. Selecting data from a list
• Select the records whose name attribute is 'hoxo-m'
    keep(dummies, ~ .$name == 'hoxo-m')
    ## x = list 1 (752 bytes)
    ## . [[1]] = list 3
    ## . . name = character 1= hoxo-m
    ## . . age = integer 1= 31
    ## . . likes = character 2= hoxo-um hoxo-uri

Slide 46

Slide 46 text

2. Selecting data from a list
• Select the middle-aged guys whose age attribute is greater than 30
  * May differ from their actual ages
    keep(dummies, ~ .$age > 30)
    ## x = list 3 (2256 bytes)
    ## . [[1]] = list 3
    ## . . name = character 1= hoxo-m
    ## . . age = integer 1= 31
    ## . . likes = character 2= hoxo-um hoxo-uri
    ## . [[2]] = list 3
    ## . . name = character 1= hoxo-eros
    ## . . age = integer 1= 31
    ## . . likes = character 3= hoxo-uri hoxo-um ...
    ## . [[3]] = list 3
    ## . . name = character 1= hoxo-uri
    ## . . age = integer 1= 34
    ## . . likes = character 2= hoxo-um hoxo-m

Slide 47

Slide 47 text

2. Selecting data from a list
• Extract the people who like 'hoxo-m'
  * May differ from the facts
    keep(dummies, ~ 'hoxo-m' %in% .$likes)
    ## x = list 3 (2192 bytes)
    ## . [[1]] = list 3
    ## . . name = character 1= hoxo-eros
    ## . . age = integer 1= 31
    ## . . likes = character 3= hoxo-uri hoxo-um ...
    ## . [[2]] = list 3
    ## . . name = character 1= hoxo-um
    ## . . age = integer 1= 30
    ## . . likes = character 1= hoxo-m
    ## . [[3]] = list 3
    ## . . name = character 1= hoxo-uri
    ## . . age = integer 1= 34
    ## . . likes = character 2= hoxo-um hoxo-m

Slide 48

Slide 48 text

2. Selecting data from a list
• Extract only the name attribute from the filtered result
    keep(dummies, ~ 'hoxo-m' %in% .$likes) %>% map(~ .$name)
    ## x = list 3 (376 bytes)
    ## . [[1]] = character 1= hoxo-eros
    ## . [[2]] = character 1= hoxo-um
    ## . [[3]] = character 1= hoxo-uri
• You can also pass a string to map
    keep(dummies, ~ 'hoxo-m' %in% .$likes) %>% map('name')
    ## x = list 3 (376 bytes)
    ## . [[1]] = character 1= hoxo-eros
    ## . [[2]] = character 1= hoxo-um
    ## . [[3]] = character 1= hoxo-uri
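When a plain vector is more convenient than a list, the typed variants accept the same string shortcut. The records below are hypothetical, made up to mirror the shape of `dummies` so that the result is deterministic.

```r
library(purrr)

# Hypothetical records with the same shape as `dummies` (values made up).
people <- list(
  list(name = "A", age = 31L, likes = c("B", "C")),
  list(name = "B", age = 29L, likes = "C"),
  list(name = "C", age = 34L, likes = c("A", "B"))
)

# Records that like "B", reduced to a character vector of names.
fans_of_b <- people %>%
  keep(~ "B" %in% .x$likes) %>%
  map_chr("name")

# Records older than 30, with their ages as an integer vector.
older_ages <- people %>%
  keep(~ .x$age > 30) %>%
  map_int("age")
```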

Slide 49

Slide 49 text

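3. Creating and comparing multiple models
• From Chapter 3 of "データ解析のための統計モデリング入門" (an introduction to statistical modeling for data analysis)
  • y: number of seeds
  • x: plant size
  • f: fertilized (T) / not fertilized
    df <- read.csv('data3a.csv')
    head(df)
    ##    y     x f
    ## 1  6  8.31 C
    ## 2  6  9.44 C
    ## 3  6  9.50 C
    ## 4 12  9.07 C
    ## 5 10 10.16 C
    ## 6  4  8.32 C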

Slide 50

Slide 50 text

3. Creating and comparing multiple models
• Create a list of formulas
    formulas <- list(mod1 = y ~ x, mod2 = y ~ f, mod3 = y ~ x + f)
• Fit a model for each formula
    results <- purrr::map(formulas, ~ glm(formula = ., family = poisson, data = df))
    results[[1]]
    Call:  glm(formula = ., family = poisson, data = df)
    Coefficients:
    (Intercept)            x
        1.29172      0.07566
    Degrees of Freedom: 99 Total (i.e. Null);  98 Residual
    Null Deviance:      89.51
    Residual Deviance: 84.99    AIC: 474.8

Slide 51

Slide 51 text

3. Creating and comparing multiple models
• Apply the logLik function to every result
    map(results, logLik)
    ## x = list 3 (2128 bytes)
    ## . mod1 = double 1( logLik )= -235.39
    ## . A nobs = integer 1= 100
    ## . A df = integer 1= 2
    ## . mod2 = double 1( logLik )= -237.63
    ## . A nobs = integer 1= 100
    ## . A df = integer 1= 2
    ## . mod3 = double 1( logLik )= -235.29
    ## . A nobs = integer 1= 100
    ## . A df = integer 1= 3

Slide 52

Slide 52 text

3. Creating and comparing multiple models
• Apply the AIC function to every result
    map(results, AIC)
    ## x = list 3 (544 bytes)
    ## . mod1 = double 1= 474.77
    ## . mod2 = double 1= 479.25
    ## . mod3 = double 1= 476.59
• Sort by AIC and show the model names
    sort_by(results, AIC) %>% names()
    ## [1] "mod1" "mod3" "mod2"

Slide 53

Slide 53 text

3. Creating and comparing multiple models
• For multiple parameters, use map2
    formulas <- c(y ~ x, y ~ x, y ~ f)
    families <- c(poisson, gaussian, poisson)
    results2 <- purrr::map2(formulas, families,
                            ~ glm(formula = .x, family = .y, data = df))
    results2[[1]]
    Call:  glm(formula = .x, family = .y, data = df)
    Coefficients:
    (Intercept)            x
        1.29172      0.07566
    Degrees of Freedom: 99 Total (i.e. Null);  98 Residual
    Null Deviance:      89.51
    Residual Deviance: 84.99    AIC: 474.8

Slide 54

Slide 54 text

4. Plotting
• Q-Q plot of a single result
    plot(results[[1]], which = 2)
• Q-Q plots of all results
    map(results, ~ plot(., which = 2))
  • The plots are displayed one after another
  • For subplots, use par
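The `par` hint can be sketched as follows. The models here are assumed stand-ins fitted on the built-in `mtcars` data, because `data3a.csv` is not bundled with R, and `walk()` is used in place of `map()` since plotting is a side effect; `walk()` may not exist in the 0.1 release described in this talk.

```r
library(purrr)

# Assumed stand-in models fitted on the built-in mtcars data.
formulas <- list(mod1 = mpg ~ wt, mod2 = mpg ~ hp, mod3 = mpg ~ wt + hp)
fits <- map(formulas, ~ glm(.x, data = mtcars))

pdf(NULL)                          # null device; drop this line to see the plots
old_par <- par(mfrow = c(1, 3))    # one row, three panels
returned <- walk(fits, ~ plot(.x, which = 2))  # Q-Q plot per model, side effect only
par(old_par)                       # restore the previous layout
dev.off()
```

`walk()` returns its input invisibly, so it can sit in the middle of a pipeline without printing a list of `NULL`s the way `map()` would.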

Slide 55

Slide 55 text

Bonus

Slide 56

Slide 56 text

Packages that work especially well in combination with {purrr}

Slide 57

Slide 57 text

{caret}
• Machine learning, all in one package
• List the models you want to compare and train each one
    library(mlbench)
    library(caret)
    data(PimaIndiansDiabetes)
    control <- trainControl(method = "repeatedcv", number = 10, repeats = 3)
    methods <- c('gbm', 'rpart', 'svmRadial')
    trained <- methods %>%
      purrr::map(~ train(diabetes ~ ., data = PimaIndiansDiabetes,
                         method = ., trControl = control))
    resampled <- resamples(trained)
    summary(resampled)
    ## Call:
    ## summary.resamples(object = resampled)
    ##
    ## Models: Model1, Model2, Model3
    ## Number of resamples: 30
    ##
    ## Accuracy
    ##          Min. 1st Qu. Median   Mean 3rd Qu.   Max. NA's
    ## Model1 0.6883  0.7435 0.7662 0.7682  0.7915 0.8442    0
    ## Model2 0.6711  0.7273 0.7451 0.7499  0.7785 0.8312    0
    ## Model3 0.6883  0.7403 0.7662 0.7678  0.7922 0.8312    0
    ##
    ## Kappa
    ##          Min. 1st Qu. Median   Mean 3rd Qu.   Max. NA's
    ## Model1 0.3037  0.4297 0.4763 0.4711  0.5268 0.6578    0
    ## Model2 0.2339  0.3657 0.4078 0.4245  0.4733 0.6128    0
    ## Model3 0.2518  0.4043 0.4610 0.4607  0.5138 0.6165    0

Slide 58

Slide 58 text

{broom}
• Converts a model into a data.frame
Function   Summary
tidy       Summary of the model
augment    Attaches the model's information to the original data.frame
glance     One-row summary
• Reuses the results from "3. Creating and comparing multiple models"
    library(broom)
    glance(results[[1]])
    ##   null.deviance df.null    logLik      AIC      BIC deviance df.residual
    ## 1      89.50694      99 -235.3863 474.7725 479.9828   84.993          98

Slide 59

Slide 59 text

{broom}
• Converts a model into a data.frame
• Summaries of multiple models in one data.frame
    map(results, broom::glance) %>% dplyr::bind_rows()
    ## Source: local data frame [3 x 7]
    ##
    ##   null.deviance df.null    logLik      AIC      BIC deviance df.residual
    ##           (dbl)   (int)     (dbl)    (dbl)    (dbl)    (dbl)       (int)
    ## 1      89.50694      99 -235.3863 474.7725 479.9828 84.99300          98
    ## 2      89.50694      99 -237.6273 479.2545 484.4649 89.47501          98
    ## 3      89.50694      99 -235.2937 476.5874 484.4029 84.80793          97

Slide 60

Slide 60 text

{ggfortify}
• Draws models using ggplot2::autoplot
• Build a list of models with {purrr} -> autoplot
    library(ggfortify)
    autoplot(results, which = 2)

Slide 61

Slide 61 text

Summary
• {purrr} is a tool for functional programming
  • Describe the processing as a combination of simple functions
  • Connect the functions with the pipe operator

Slide 62

Slide 62 text

Enjoy!