Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Japan.R 2015: {purrr} による非テーブルデータの処理

Sinhrks
December 05, 2015

Japan.R 2015: {purrr} による非テーブルデータの処理

Sinhrks

December 05, 2015
Tweet

More Decks by Sinhrks

Other Decks in Technology

Transcript

  1. {purrr} ʹΑΔ

    ඇςʔϒϧσʔλͷॲཧ

    View Slide

  2. ࣗݾ঺հ
    • @sinhrks

    • ۀ຿: σʔλ෼ੳ (ϝʔΧʔ)

    • ར༻ݴޠ: Python, Rͱ͔

    • झຯ: OSS ׆ಈ

    • pandas ※ ίϛολ (Python ύοέʔδ)

    ※ R ͷ data.frame + dplyr + tidyr + readr + haven + lubridate + stringr
    + ggplot2… (ҎԼུ ͷΑ͏ͳ΋ͷ

    View Slide

  3. ຊ೔͸
    Θ͚Ͱ͸͋Γ·ͤΜ
    Λ dis Γʹདྷͨ

    View Slide

  4. ͳͥ Japan.R ʹ?
    • Rͷύοέʔδ࡞ͬͨ
    (JU)VC"XBSET
    ݱࡏ
    • ͜ͷR։ൃऀ͕͍͢͝ (2015) by @u_ribo બग़

    • http://uribo.hatenablog.com/entry/2015/12/02/180004

    View Slide

  5. ͳͥ Λʁ

    View Slide

  6. ͳͥRΛ?
    • ϓϩάϥϛϯά͕ઐ໳Ͱ͸ͳ͍͕ɺઐ໳෼໺Ͱ
    ౷ܭ/ػցֶश͕ඞཁͩͬͨͷͰ

    • ΤϯδχΞ͕ͩɺ౷ܭ/ػցֶश͕ඞཁʹͳͬͨ
    ͷͰ

    View Slide

  7. ૝ఆ͢ΔϨϕϧͱΰʔϧ
    • લఏ: dplyr, ggplot2 ͸গ͠࢖ͬͨ͜ͱ͕͋Δ

    • ΰʔϧ: {purrr} ͷجຊతͳ࢖͍ํΛ஌Δ

    • ͜ΜͳίʔυΛॻ͔ͳͯ͘΋Α͘ͳΔ
    glm.fit1 glm.fit2 …
    predict(glm.fit1, newdata = newdata)
    predict(glm.fit2, newdata = newdata)

    View Slide

  8. ຊ೔ͷ಺༰

    {purrr} Ͱσʔλॲཧ

    View Slide

  9. σʔλॲཧͷྲྀΕ
    σʔλͷ४උ ϞσϦϯά ධՁ

    View Slide

  10. RʹΑΔσʔλॲཧ
    EBUBGSBNF
    ϦετϕΫτϧ
    Ϟσϧ
    HHQMPU
    UJEZS
    EQMZS
    SFBES
    SWFTU
    SMJTU
    QVSSS
    ֤छ౷ܭػցֶश
    ύοέʔδ
    CSPPN
    DBSFU
    HHGPSUJGZ

    View Slide

  11. data.frameͷॲཧ

    View Slide

  12. data.frameͷॲཧ
    • σʔλͷॲཧʹ͸ύλʔϯ͕͋Δ

    • ྫ: data.frame Λ ͋Δྻͷ஋ͰάϧʔϓΘ͚͠ɺ
    ྻ͝ͱʹฏۉ஋Λܭࢉ͠ɺ݁ՌΛ data.frame ʹ
    ͍ͨ͠ɻ
    ෼ׂ ద༻ ݁߹

    View Slide

  13. plyr
    • ૊ΈࠐΈ:
    library(plyr)
    plyr::ddply(iris, .(Species), plyr::colwise(mean))
    ## Species Sepal.Length Sepal.Width Petal.Length Petal.Width
    ## 1 setosa 5.006 3.428 1.462 0.246
    ## 2 versicolor 5.936 2.770 4.260 1.326
    ## 3 virginica 6.588 2.974 5.552 2.026
    iris_group res as.data.frame(t(res))
    ## Sepal.Length Sepal.Width Petal.Length Petal.Width
    ## setosa 5.006 3.428 1.462 0.246
    ## versicolor 5.936 2.770 4.260 1.326
    ## virginica 6.588 2.974 5.552 2.026
    • plyr: ෼ׂ - ద༻ - ݁߹ΛҰͭͷؔ਺Ͱ

    View Slide

  14. dplyr
    • plyr: ෼ׂ - ద༻ - ݁߹ΛҰͭͷؔ਺Ͱ
    library(plyr)
    plyr::ddply(iris, .(Species), plyr::colwise(mean))
    ## Species Sepal.Length Sepal.Width Petal.Length Petal.Width
    ## 1 setosa 5.006 3.428 1.462 0.246
    ## 2 versicolor 5.936 2.770 4.260 1.326
    ## 3 virginica 6.588 2.974 5.552 2.026
    library(dplyr)
    dplyr::summarise_each(dplyr::group_by(iris, Species),
    dplyr::funs(mean))
    ## Source: local data frame [3 x 5]
    ##
    ## Species Sepal.Length Sepal.Width Petal.Length Petal.Width
    ## (fctr) (dbl) (dbl) (dbl) (dbl)
    ## 1 setosa 5.006 3.428 1.462 0.246
    ## 2 versicolor 5.936 2.770 4.260 1.326
    ## 3 virginica 6.588 2.974 5.552 2.026
    • dplyr: ॲཧ͝ͱʹؔ਺Λద༻

    View Slide

  15. ύΠϓԋࢉࢠ %>%
    • magrittr

    iris %>%
    dplyr::group_by(Species) %>%
    dplyr::summarise_each(dplyr::funs(mean))
    ## Source: local data frame [3 x 5]
    ##
    ## Species Sepal.Length Sepal.Width Petal.Length Petal.Width
    ## (fctr) (dbl) (dbl) (dbl) (dbl)
    ## 1 setosa 5.006 3.428 1.462 0.246
    ## 2 versicolor 5.936 2.770 4.260 1.326
    ## 3 virginica 6.588 2.974 5.552 2.026
    x %>% f(…) == f(x, …)

    View Slide

  16. data.frameͷॲཧ
    • ߟ͑ํ

    • ॲཧΛγϯϓϧͳؔ਺ͷ૊Έ߹ΘͤͰهड़

    • ؔ਺͸ύΠϓԋࢉࢠ %>% Ͱ઀ଓ

    • ಉ͜͡ͱΛ data.frame Ҏ֎Ͱ΋΍Γ͍ͨ

    View Slide

  17. data.frame Ҏ֎ͷॲཧ

    View Slide

  18. ͜Μͳ͜ͱ͋Γ·ͤΜ͔ʁ
    • ͋ΔॲཧΛɺ

    • ܁Γฦ͍ͨ͠

    • ෳ਺ͷର৅ʹߦ͍͍ͨ

    • ͦͷ··Ͱ͸ data.frame ʹͰ͖ͳ͍σʔλ
    (JSON / XMLͳͲ) Λѻ͍͍ͨ

    View Slide

  19. ྫ͑͹
    σʔλΛ8FC"1*͔Β+40/Ͱऔಘͯ͠੔ܗ
    ͋Δ৚݅ʹ߹க͠ͳ͍σʔλ͸আ֎
    EBUBGSBNFΛ࡞੒
    ෳ਺ͷϞσϧΛ࡞੒ൺֱϓϩοτ

    View Slide

  20. ૊ΈࠐΈؔ਺Ͱ΍Δͱ…
    • Apply ؔ਺܈

    • ߴ֊ؔ਺ ※ ܈
    BQQMZ
    lapply(list(1, 2, 3), function (x) { x + 1 })
    Map(function (x) { x + 1 }, list(1, 2, 3))
    MBQQMZ TBQQMZ UBQQMZ NBQQMZ
    3FEVDF 'JMUFS 'JOE .BQ /FHBUF 1PTJUJPO
    ແ໊ؔ਺ͷఆ͕ٛΊΜͲ͏
    ؔ਺ͷద༻Ҏ֎ͷॲཧ͸Ͱ͖ͳ͍
    ؔ਺͕ୈҰҾ਺ͷͨΊύΠϓԋࢉࢠ͕࢖͍ʹ͍͘
    ֮͑ΒΕͳ͍
    ※ ؔ਺ΛҾ਺΍ฦΓ஋ʹ͢Δؔ਺

    View Slide

  21. $SFBUJWF$PNNPOT$$GSPNUIF1JYBCBZ
    ͳΜͱ͔ͳΒͳ͍͔ʜ

    View Slide

  22. purrr !!!
    1IPUPCZԶ

    View Slide

  23. {purrr} ͱ͸

    View Slide

  24. {purrr} ͱ͸
    • RStudio BlogΑΓ

    • ؔ਺ܕϓϩάϥϛϯάͷͨΊͷπʔϧ

    • ॲཧΛγϯϓϧͳؔ਺ͷ૊Έ߹ΘͤͰهड़

    • ؔ਺͸ύΠϓԋࢉࢠͰ઀ଓ
    Purrr is a new package that fills in the missing
    pieces in R’s functional programming tools: it’s
    designed to make your pure functions purrr. Like
    many of my recent packages, it works with magrittr
    to allow you to express complex operations by
    combining simple pieces in a standard way.

    View Slide

  25. ؔ਺ܕϓϩάϥϛϯάͱ͸?
    • (Ұҙͷఆ͕ٛ͋ΔΘ͚Ͱ͸ͳ͍͕) ෳ਺ͷؔ਺ͷ
    ૊Έ߹ΘͤʹΑͬͯ࠷ऴతʹ΍Γ͍ͨॲཧΛه
    ड़͍ͯ͘͠ελΠϧɻ

    • ؔ਺ܕϓϩάϥϛϯάΛ૝ఆͨ͠ݴޠ͸ؔ਺
    ܕݴޠͱݺ͹ΕΔ

    View Slide

  26. R ͸ؔ਺ܕݴޠ͔?
    • Yes (by Hadley Wickham in “Advanced R”) .

    • ৄ͘͠͸ “Rݴޠపఈղઆ” (๜༁͕12/23ൃച)

    View Slide

  27. {purrr} ͷಛ௃
    • ϥϜμࣜ

    • ύΠϓԋࢉࢠͰ઀ଓͰ͖Δؔ਺܈

    • ஫ҙ఺

    • όʔδϣϯ͸0.1ɻࠓޙ ഁյతͳมߋ͕͋ΔՄೳੑ͕͋Δɻ

    • ຊ೔͸ओཁͳؔ਺ͷΈ͝঺հ (͜ΕͰ΄ͱΜͲͷ͜ͱ͸Ͱ͖Δ)ɻ

    View Slide

  28. ϥϜμࣜ
    • ໊લΛ΋ͨͳ͍ؔ਺ (ແ໊ؔ਺) Λγϯϓϧʹهड़

    • R ඪ४

    • {purrr} ͷϥϜμࣜ

    • {purrr} ͷؔ਺ͷதͰ࢖͏ (ߴ֊ؔ਺)
    function(x) { x + 1 }
    ~ . + 1
    map(x, ~ . + 1)
    υοτ
    ͕Ҿ਺ʹରԠ

    View Slide

  29. {purrr} ͷؔ਺܈ (Ұ෦)
    ඪ४ QVSSS
    BQQMZܥ NBQ
    'JMUFS LFFQ
    3FEVDF SFEVDF
    TQMJU [email protected]
    TPSU [email protected]

    View Slide

  30. map
    • ؔ਺Λ֤ཁૉʹద༻
    map(c(1, 2, 3), ~ . + 1)
    ## x = list 3 (216 bytes)
    ## . [[1]] = double 1= 2
    ## . [[2]] = double 1= 3
    ## . [[3]] = double 1= 4
    map(list(a = 1, b = 2, c = 3), ~ . + 1)
    ## x = list 3 (544 bytes)
    ## . a = double 1= 2
    ## . b = double 1= 3
    ## . c = double 1= 4
    map(c(1, 2, 3), ~ . + 1) == list(1 + 1, 2 + 1, 3 + 1)

    View Slide

  31. map2
    • 2ͭͷҾ਺ʹରͯؔ͠਺ద༻
    map2(c(1, 2, 3), c(4, 5, 6), ~ .x * .y)
    ## x = list 3 (216 bytes)
    ## . [[1]] = double 1= 4
    ## . [[2]] = double 1= 10
    ## . [[3]] = double 1= 18
    map2(c(1, 2, 3), c(4, 5, 6), ~ .x * .y)
    == list(1 * 4, 2 * 5, 3 * 6)
    ͻͱͭΊ͔ΒͷཁૉΛY
    ;ͨͭΊ͔ΒͷཁૉΛZͰࢀর

    View Slide

  32. map_xxx
    • ݁ՌΛϕΫτϧͰऔಘɺܕͷࢦఆ͕ඞཁ
    map_int(list(a = 1L, b = 2L, c = 3L), ~ . + 1L)
    ## a b c
    ## 2 3 4
    [email protected] ϕΫτϧͷܕ
    [email protected] MPHJDBM
    [email protected] DIBSBDUFS
    [email protected] JOUFHFS
    [email protected] OVNSJD

    View Slide

  33. keep
    • ؔ਺ͷ৚݅ʹ͋ͯ͸·ΔཁૉΛநग़
    keep(c(1, 2, 3), ~ . >= 2)
    ## [1] 2 3
    keep(list(a = 1, b = 2, c = 3), ~ . >= 2)
    ## x = list 2 (416 bytes)
    ## . b = double 1= 2
    ## . c = double 1= 3

    View Slide

  34. reduce
    • ͨͨΈࠐΈ
    reduce(c(1, 2, 3), `+`)
    ## [1] 6
    reduce(list(a = 1, b = 2, c = 3), `+`)
    ## [1] 6
    reduce(c(1, 2, 3), `+`) == ((1 + 2) + 3)

    View Slide

  35. split_by
    • base::split + ϥϜμࣜ
    split_by(c(1, 2, 3), ~ . %% 2)
    ## x = list 2 (424 bytes)
    ## . 0 = double 1= 2
    ## . 1 = double 2= 1 3
    split_by(list(a = 1, b = 2, c = 3), ~ . %% 2)
    ## x = list 2 (1040 bytes)
    ## . 0 = list 1
    ## . . b = double 1= 2
    ## . 1 = list 2
    ## . . a = double 1= 1
    ## . . c = double 1= 3

    View Slide

  36. sort_by
    • ιʔτ
    sort_by(c(2, -3, 1), ~ abs(.))
    ## [1] 1 2 -3
    sort_by(list(a = 2, b = - 3, c = 1), ~ abs(.))
    ## x = list 3 (544 bytes)
    ## . c = double 1= 1
    ## . a = double 1= 2
    ## . b = double 1= -3

    View Slide

  37. 2. αϯϓϧ

    View Slide

  38. {purrr}
    • 1. μϛʔσʔλͷ࡞੒

    • 2. Ϧετ͔Βͷσʔλબ୒

    • 3. ෳ਺Ϟσϧͷ࡞੒ͱൺֱ

    • 4. ϓϩοτ

    View Slide

  39. 1. μϛʔσʔλͷ࡞੒
    • ద౰ͳμϛʔσʔλΛ࡞Γ͍ͨ

    • ҎԼͷଐੑΛ΋ͭϦετͷϦετ

    • name: ໊લ

    • age: ೥ྸ

    • likes: ޷͖ͳਓ

    View Slide

  40. 1. μϛʔσʔλͷ࡞੒
    ndata dummies age = sample(25:35, size = 1),
    likes = sample(ndata, size = 2)))
    dummies
    ## x = list 5 (3632 bytes)
    ## . [[1]] = list 3
    ## . . name = character 1= hoxo-eros
    ## . . age = integer 1= 29
    ## . . likes = character 2= hoxo-um hoxo-uri
    ## . [[2]] = list 3
    ## . . name = character 1= hoxo-m
    ## . . age = integer 1= 34
    ## . . likes = character 2= hoxo-uri hoxo-eros
    ## . [[3]] = list 3
    ## . . name = character 1= hoxo-um
    ## . . age = integer 1= 25
    ## . . likes = character 2= hoxo-m hoxo-uri
    ## . [[4]] = list 3
    ## . . name = character 1= hoxo-um
    ## . . age = integer 1= 29
    ## . . likes = character 2= hoxo-uri hoxo-eros
    ## . [[5]] = list 3
    ## . . name = character 1= hoxo-um
    ## . . age = integer 1= 35
    ## . . likes = character 2= hoxo-eros hoxo-um
    ϦετΛฦ͢ϥϜμࣜ

    View Slide

  41. 1. μϛʔσʔλͷ࡞੒
    dummies ~ list(name = .,
    age = sample(30:35, size = 1),
    likes = sample(ndata[ndata != .],
    size = sample(1:(length(ndata) - 1),
    size = 1))))
    dummies
    ## x = list 4 (2896 bytes)
    ## . [[1]] = list 3
    ## . . name = character 1= hoxo-m
    ## . . age = integer 1= 31
    ## . . likes = character 2= hoxo-um hoxo-uri
    ## . [[2]] = list 3
    ## . . name = character 1= hoxo-eros
    ## . . age = integer 1= 31
    ## . . likes = character 3= hoxo-uri hoxo-um ...
    ## . [[3]] = list 3
    ## . . name = character 1= hoxo-um
    ## . . age = integer 1= 30
    ## . . likes = character 1= hoxo-m
    ## . [[4]] = list 3
    ## . . name = character 1= hoxo-uri
    ## . . age = integer 1= 34
    ## . . likes = character 2= hoxo-um hoxo-m

    View Slide

  42. 1. μϛʔσʔλͷ࡞੒
    gen age = sample(30:35, size = 1)
    size = sample(1:(length(ndata) - 1), size = 1)
    likes = sample(ndata[ndata != name], size = size)
    return (list(name = name, age = age, likes = likes))
    }
    dummies dummies
    ## x = list 4 (2896 bytes)
    ## . [[1]] = list 3
    ## . . name = character 1= hoxo-m
    ## . . age = integer 1= 31
    ## . . likes = character 2= hoxo-um hoxo-uri
    ## . [[2]] = list 3
    ## . . name = character 1= hoxo-eros
    ## . . age = integer 1= 31
    ## . . likes = character 3= hoxo-uri hoxo-um ...
    ## . [[3]] = list 3
    ## . . name = character 1= hoxo-um
    ## . . age = integer 1= 30
    ## . . likes = character 1= hoxo-m
    ## . [[4]] = list 3
    ## . . name = character 1= hoxo-uri
    ## . . age = integer 1= 34
    ## . . likes = character 2= hoxo-um hoxo-m
    ී௨ͷؔ਺Λ౉ͯ͠΋Α͍

    View Slide

  43. 1. μϛʔσʔλͷ࡞੒
    ndata %>%
    map(~ list(name = ., age = sample(30:35, size = 1))) %>%
    dplyr::bind_rows()
    EBUBGSBNFʹ͚ͨ͠Ε͹
    [email protected]

    ## Source: local data frame [4 x 2]
    ##
    ## name age
    ## (chr) (int)
    ## 1 hoxo-m 34
    ## 2 hoxo-eros 35
    ## 3 hoxo-um 32
    ## 4 hoxo-uri 34

    View Slide

  44. 2. Ϧετ͔Βͷσʔλબ୒
    • ؆୯ͳૢ࡞͸ {purrr} Ͱ΋Ͱ͖Δ

    • Ϩίʔυͷબ୒

    • ଐੑͷબ୒

    • ࢀߟ: {rlist} ͱͷൺֱ

    • {purrr} ͰϦετσʔλΛૢ࡞͢Δ <1>

    • {purrr} ͰϦετσʔλΛૢ࡞͢Δ <2>

    View Slide

  45. 2. Ϧετ͔Βͷσʔλબ୒
    keep(dummies, ~ .$name == 'hoxo-m')
    ## x = list 1 (752 bytes)
    ## . [[1]] = list 3
    ## . . name = character 1= hoxo-m
    ## . . age = integer 1= 31
    ## . . likes = character 2= hoxo-um hoxo-uri
    OBNFଐੑ͕bIPYPN`ͷ
    ϨίʔυΛબ୒

    View Slide

  46. 2. Ϧετ͔Βͷσʔλબ୒
    keep(dummies, ~ .$age > 30)
    ## x = list 3 (2256 bytes)
    ## . [[1]] = list 3
    ## . . name = character 1= hoxo-m
    ## . . age = integer 1= 31
    ## . . likes = character 2= hoxo-um hoxo-uri
    ## . [[2]] = list 3
    ## . . name = character 1= hoxo-eros
    ## . . age = integer 1= 31
    ## . . likes = character 3= hoxo-uri hoxo-um ...
    ## . [[3]] = list 3
    ## . . name = character 1= hoxo-uri
    ## . . age = integer 1= 34
    ## . . likes = character 2= hoxo-um hoxo-m
    BHFଐੑ͕ΑΓେ͖͍
    ͓͡͞ΜΛબ୒
    ˞࣮ࡍͷ೥ྸͱ͸ҟͳΔՄೳੑ͕͋Γ·͢

    View Slide

  47. 2. Ϧετ͔Βͷσʔλબ୒
    keep(dummies, ~ 'hoxo-m' %in% .$likes)
    ## x = list 3 (2192 bytes)
    ## . [[1]] = list 3
    ## . . name = character 1= hoxo-eros
    ## . . age = integer 1= 31
    ## . . likes = character 3= hoxo-uri hoxo-um ...
    ## . [[2]] = list 3
    ## . . name = character 1= hoxo-um
    ## . . age = integer 1= 30
    ## . . likes = character 1= hoxo-m
    ## . [[3]] = list 3
    ## . . name = character 1= hoxo-uri
    ## . . age = integer 1= 34
    ## . . likes = character 2= hoxo-um hoxo-m
    bIPYPN`͞Μͷ͜ͱ͕
    ޷͖ͳਓΛநग़
    ˞ࣄ࣮ͱ͸ҟͳΔՄೳੑ͕͋Γ·͢

    View Slide

  48. 2. Ϧετ͔Βͷσʔλબ୒
    keep(dummies, ~ 'hoxo-m' %in% .$likes) %>%
    map(~ .$name)
    ## x = list 3 (376 bytes)
    ## . [[1]] = character 1= hoxo-eros
    ## . [[2]] = character 1= hoxo-um
    ## . [[3]] = character 1= hoxo-uri
    ϑΟϧλ݁Ռ͔Β
    OBNFଐੑͷΈΛநग़
    keep(dummies, ~ 'hoxo-m' %in% .$likes) %>%
    map('name')
    ## x = list 3 (376 bytes)
    ## . [[1]] = character 1= hoxo-eros
    ## . [[2]] = character 1= hoxo-um
    ## . [[3]] = character 1= hoxo-uri
    NBQʹจࣈྻΛ
    ౉ͯ͠΋Α͍

    View Slide

  49. • σʔλղੳͷͨΊͷ౷ܭϞσϦϯάೖ໳

    • ୈ3ষΑΓ

    • y: छࢠ਺

    • x: ২෺ͷαΠζ

    • f: ࢪං͋Γ(T) / ͳ͠
    3. ෳ਺Ϟσϧͷ࡞੒ͱൺֱ
    df head(df)
    ## y x f
    ## 1 6 8.31 C
    ## 2 6 9.44 C
    ## 3 6 9.50 C
    ## 4 12 9.07 C
    ## 5 10 10.16 C
    ## 6 4 8.32 C

    View Slide

  50. 3. ෳ਺Ϟσϧͷ࡞੒ͱൺֱ
    formulas results ~ glm(formula = ., family = poisson, data =
    df))
    results[[1]]
    Call: glm(formula = .x, family = .y, data = df)
    Coefficients:
    (Intercept) x
    1.29172 0.07566
    Degrees of Freedom: 99 Total (i.e. Null); 98 Residual
    Null Deviance: 89.51
    Residual Deviance: 84.99 AIC: 474.8
    GPSNVMBͷϦετΛ࡞੒
    ֤GPSNVMBͰ
    ϞσϧΛ࡞੒

    View Slide

  51. 3. ෳ਺Ϟσϧͷ࡞੒ͱൺֱ
    map(results, logLik)
    ## x = list 3 (2128 bytes)
    ## . mod1 = double 1( logLik )= -235.39
    ## . A nobs = integer 1= 100
    ## . A df = integer 1= 2
    ## . mod2 = double 1( logLik )= -237.63
    ## . A nobs = integer 1= 100
    ## . A df = integer 1= 2
    ## . mod3 = double 1( logLik )= -235.29
    ## . A nobs = integer 1= 100
    ## . A df = integer 1= 3
    ݁Ռ͢΂ͯʹ
    MPH-JLؔ਺Λద༻

    View Slide

  52. 3. ෳ਺Ϟσϧͷ࡞੒ͱൺֱ
    map(results, AIC)
    ## x = list 3 (544 bytes)
    ## . mod1 = double 1= 474.77
    ## . mod2 = double 1= 479.25
    ## . mod3 = double 1= 476.59
    sort_by(results, AIC) %>% names()
    ## [1] "mod1" "mod3" "mod2"
    ݁Ռ͢΂ͯʹ
    "*$ؔ਺Λద༻
    "*$ͷॱʹฒ΂ସ͑ɺ
    Ϟσϧ໊Λදࣔ

    View Slide

  53. 3. ෳ਺Ϟσϧͷ࡞੒ͱൺֱ
    formulas families results2 ~ glm(formula = .x, family = .y, data = df))
    results2[[1]]
    Call: glm(formula = .x, family = .y, data = df)
    Coefficients:
    (Intercept) x
    1.29172 0.07566
    Degrees of Freedom: 99 Total (i.e. Null); 98 Residual
    Null Deviance: 89.51
    Residual Deviance: 84.99 AIC: 474.8
    ෳ਺ύϥϝʔλͷ৔߹͸
    NBQ

    View Slide

  54. 4. ϓϩοτ
    plot(results[[1]], which = 2)
    map(results, ~ plot(., which = 2))
    ݁Ռͷ͏ͪͻͱͭΛ
    22ϓϩοτ
    ͢΂ͯΛ22ϓϩοτ
    ॱ൪ʹදࣔ͞ΕΔ
    αϒϓϩοτ͚ͨ͠Ε͹QBS

    View Slide

  55. ͓·͚

    View Slide

  56. {purrr} ͱ૊Έ߹Θͤͯ
    ಛʹ͏Ε͍͠ύοέʔδ

    View Slide

  57. {caret}
    • ػցֶशશ෦ೖΓ
    library(mlbench)
    library(caret)
    data(PimaIndiansDiabetes)
    control methods trained %
    purrr::map(~ train(diabetes ~ ., data = PimaIndiansDiabetes,
    method = ., trControl = control))
    resampled summary(resampled)
    ## Call:
    ## summary.resamples(object = resampled)
    ##
    ## Models: Model1, Model2, Model3
    ## Number of resamples: 30
    ##
    ## Accuracy
    ## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
    ## Model1 0.6883 0.7435 0.7662 0.7682 0.7915 0.8442 0
    ## Model2 0.6711 0.7273 0.7451 0.7499 0.7785 0.8312 0
    ## Model3 0.6883 0.7403 0.7662 0.7678 0.7922 0.8312 0
    ##
    ## Kappa
    ## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
    ## Model1 0.3037 0.4297 0.4763 0.4711 0.5268 0.6578 0
    ## Model2 0.2339 0.3657 0.4078 0.4245 0.4733 0.6128 0
    ## Model3 0.2518 0.4043 0.4610 0.4607 0.5138 0.6165 0
    ൺֱ͍ͨ͠ϞσϧΛϦετʹ͠ɺ
    ֤ϞσϧΛUSBJO

    View Slide

  58. {broom}
    • ϞσϧΛ data.frame ʹม׵͢Δ
    library(broom)
    glance(results[[1]])
    ## null.deviance df.null logLik AIC BIC deviance df.residual
    ## 1 89.50694 99 -235.3863 474.7725 479.9828 84.993 98
    ؔ਺ ֓ཁ
    UJEZ Ϟσϧͷཁ໿
    BVHNFOU Ϟσϧͷ৘ใΛݩͷEBUBGSBNFʹ෇༩͢Δ
    HMBODF Ұߦͷཁ໿
    • 3. ෳ਺Ϟσϧͷ࡞੒ͱൺֱͷ݁ՌΛྲྀ༻

    View Slide

  59. {broom}
    • ϞσϧΛ data.frame ʹม׵͢Δ
    map(results, broom::glance) %>%
    dplyr::bind_rows()
    ## Source: local data frame [3 x 7]
    ##
    ## null.deviance df.null logLik AIC BIC deviance df.residual
    ## (dbl) (int) (dbl) (dbl) (dbl) (dbl) (int)
    ## 1 89.50694 99 -235.3863 474.7725 479.9828 84.99300 98
    ## 2 89.50694 99 -237.6273 479.2545 484.4649 89.47501 98
    ## 3 89.50694 99 -235.2937 476.5874 484.4029 84.80793 97
    ෳ਺ͷϞσϧͷཁ໿Λ
    EBUBGSBNFʹ

    View Slide

  60. {ggfortify}
    • ϞσϧΛ ggplot2::autoplot Λར༻ͯ͠ඳը͢Δ

    • {purrr} ͰϞσϧͷϦετΛ࡞Δ -> autoplot
    library(ggfortify)
    autoplot(results, which = 2)

    View Slide

  61. ·ͱΊ
    • {purrr} ͸ؔ਺ܕϓϩάϥϛϯάͷͨΊͷπʔϧ

    • ॲཧΛγϯϓϧͳؔ਺ͷ૊Έ߹ΘͤͰهड़

    • ؔ਺͸ύΠϓԋࢉࢠͰ઀ଓ

    View Slide

  62. Enjoy!

    View Slide