Upgrade to Pro — share decks privately, control downloads, hide ads and more …

TokyoR#108_NestedDataHandling

kilometer
September 02, 2023

 TokyoR#108_NestedDataHandling

第108回Tokyo.Rでトークした際のスライド資料です。

kilometer

September 02, 2023
Tweet

More Decks by kilometer

Other Decks in Programming

Transcript

  1. #108
    @kilometer00
    2021.09.02
    Nested data handling

    View full-size slide

  2. Who!?
    Who?

    View full-size slide

  3. Who!?
    ・ @kilometer
    ・特任教員 (Ph.D. Eng.)
    ・神経科学
    ・⾏動計算論
    ・データ可視化
    ・R: ~ 15 years
    ・近況:京極堂 (鵺でるよ!)

    View full-size slide

  4. 宣伝!!(書籍の翻訳に参加しました。)
    絶賛販売中!

    View full-size slide

  5. 宣伝!!

    Rを使ったデータ科学の基礎を
    統数研 on siteで2⽇間かけて学びます!
    (宿泊・交通費の補助あり!)
    12⽉に開催予定!

    View full-size slide

  6. #108
    @kilometer00
    2021.09.02
    Nested data handling

    View full-size slide

  7. 表データの加⼯と可視化の概観
    Long
    Wide
    Nested
    plot
    Figures
    Data table
    read_csv write_csv
    pivot_longer
    pivot_wider
    group_nest
    unnest
    ggplot
    ggsave
    wrap_plots
    map
    rowwise

    View full-size slide

  8. 基礎知識

    View full-size slide

  9. 1JQFBMHFCSB
    X %>% f
    X %>% f(y)
    X %>% f %>% g
    X %>% f(y, .)
    f(X)
    f(X, y)
    g(f(X))
    f(y, X)
    %>% {magrittr}
    「dplyr再⼊⾨(基本編)」yutanihilation
    https://speakerdeck.com/yutannihilation/dplyrzai-ru-men-ji-ben-bian

    View full-size slide

  10. ?data.frame
    list(
    x = c(1:3),
    y = letters[1:3],
    z = seq(3, 5, by = 1))
    ## $x
    ## [1] 1 2 3
    ##
    ## $y
    ## [1] "a" "b" "c"
    ##
    ## $z
    ## [1] 3 4 5

    View full-size slide

  11. ?data.frame
    data.frame(
    x = c(1:3),
    y = letters[1:3],
    z = seq(3, 5, by = 1))
    ## x y z
    ## 1 1 a 3
    ## 2 2 b 4
    ## 3 3 c 5

    View full-size slide

  12. ?data.frame
    data.frame(
    x = c(1:3),
    y = letters[1:3],
    z = seq(3, 5, by = 1))
    ## x y z
    ## 1 1 a 3
    ## 2 2 b 4
    ## 3 3 c 5
    observation
    variable

    View full-size slide

  13. Nested dataの準備

    View full-size slide

  14. dat <-
    palmerpenguins::penguins %>%
    na.omit() # NAを含む行の除去
    パイプ演算⼦
    パッケージ名
    > dat
    # A tibble: 333 × 8
    species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex year

    1 Adelie Torgersen 39.1 18.7 181 3750 male 2007
    2 Adelie Torgersen 39.5 17.4 186 3800 female 2007
    3 Adelie Torgersen 40.3 18 195 3250 female 2007
    4 Adelie Torgersen 36.7 19.3 193 3450 female 2007
    5 Adelie Torgersen 39.3 20.6 190 3650 male 2007
    6 Adelie Torgersen 38.9 17.8 181 3625 female 2007
    7 Adelie Torgersen 39.2 19.6 195 4675 male 2007
    8 Adelie Torgersen 41.1 17.6 182 3200 female 2007
    9 Adelie Torgersen 38.6 21.2 191 3800 male 2007
    10 Adelie Torgersen 34.6 21.1 198 4400 male 2007
    # … with 323 more rows
    # ℹ Use `print(n = ...)` to see more rows
    データの準備

    View full-size slide

  15. 「 tibble?ああ, あれは美味しいよね.」
    data.frameに⽐べて
    ・⾊々厳密になっている
    → 再利⽤規則, データ型, 変数参照, etc.
    ・遅延評価に対応
    → tibble::tibble()で作成するときなど
    ・list型オブジェクトをカラムに取れる
    ・⾏名は指定できない(数字表記のみ)

    View full-size slide

  16. tibble::tibble(
    chr = letters[1:3],
    num = list(1:2, 2:3, 3:4)
    )
    # A tibble: 3 × 2
    chr num

    1 a
    2 b
    3 c
    「 tibble?ああ, あれは美味しいよね.」
    ・Nested data = listを内包するtibble
    list型
    .$num[[1]]に
    c(1:2)が畳み込まれている

    View full-size slide

  17. データを畳み込む
    dplyr::group_nest()

    View full-size slide

  18. > dat
    # A tibble: 333 × 8
    species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex year

    1 Adelie Torgersen 39.1 18.7 181 3750 male 2007
    2 Adelie Torgersen 39.5 17.4 186 3800 female 2007
    3 Adelie Torgersen 40.3 18 195 3250 female 2007
    4 Adelie Torgersen 36.7 19.3 193 3450 female 2007
    5 Adelie Torgersen 39.3 20.6 190 3650 male 2007
    6 Adelie Torgersen 38.9 17.8 181 3625 female 2007
    7 Adelie Torgersen 39.2 19.6 195 4675 male 2007
    8 Adelie Torgersen 41.1 17.6 182 3200 female 2007
    9 Adelie Torgersen 38.6 21.2 191 3800 male 2007
    10 Adelie Torgersen 34.6 21.1 198 4400 male 2007
    # … with 323 more rows
    # ℹ Use `print(n = ...)` to see more rows
    dat %>%
    tidyr::nest()
    # A tibble: 1 × 1
    data

    1
    データの畳み込み

    View full-size slide

  19. dat %>%
    tidyr::nest() %>%
    .$data
    [[1]]
    # A tibble: 333 × 8
    species island bill_length…¹ bill_…² flipp…³ body_…⁴ sex year

    1 Adelie Torgersen 39.1 18.7 181 3750 male 2007
    2 Adelie Torgersen 39.5 17.4 186 3800 fema… 2007
    3 Adelie Torgersen 40.3 18 195 3250 fema… 2007
    4 Adelie Torgersen 36.7 19.3 193 3450 fema… 2007
    5 Adelie Torgersen 39.3 20.6 190 3650 male 2007
    6 Adelie Torgersen 38.9 17.8 181 3625 fema… 2007
    7 Adelie Torgersen 39.2 19.6 195 4675 male 2007
    8 Adelie Torgersen 41.1 17.6 182 3200 fema… 2007
    9 Adelie Torgersen 38.6 21.2 191 3800 male 2007
    10 Adelie Torgersen 34.6 21.1 198 4400 male 2007
    # … with 323 more rows, and abbreviated variable names
    # ¹bill_length_mm, ²bill_depth_mm, ³flipper_length_mm, ⁴body_mass_g
    # ℹ Use `print(n = ...)` to see more rows

    View full-size slide

  20. dat %>%
    tidyr::nest() %$%
    data
    [[1]]
    # A tibble: 333 × 8
    species island bill_length…¹ bill_…² flipp…³ body_…⁴ sex year

    1 Adelie Torgersen 39.1 18.7 181 3750 male 2007
    2 Adelie Torgersen 39.5 17.4 186 3800 fema… 2007
    3 Adelie Torgersen 40.3 18 195 3250 fema… 2007
    4 Adelie Torgersen 36.7 19.3 193 3450 fema… 2007
    5 Adelie Torgersen 39.3 20.6 190 3650 male 2007
    6 Adelie Torgersen 38.9 17.8 181 3625 fema… 2007
    7 Adelie Torgersen 39.2 19.6 195 4675 male 2007
    8 Adelie Torgersen 41.1 17.6 182 3200 fema… 2007
    9 Adelie Torgersen 38.6 21.2 191 3800 male 2007
    10 Adelie Torgersen 34.6 21.1 198 4400 male 2007
    # … with 323 more rows, and abbreviated variable names
    # ¹bill_length_mm, ²bill_depth_mm, ³flipper_length_mm, ⁴body_mass_g
    # ℹ Use `print(n = ...)` to see more rows
    ドル演算⼦
    (magrittrパッケージ)
    リストになっていることに注意

    View full-size slide

  21. dat %>%
    tidyr::nest() %$%
    data %>% .[[1]]
    # A tibble: 333 × 8
    species island bill_length…¹ bill_…² flipp…³ body_…⁴ sex year

    1 Adelie Torgersen 39.1 18.7 181 3750 male 2007
    2 Adelie Torgersen 39.5 17.4 186 3800 fema… 2007
    3 Adelie Torgersen 40.3 18 195 3250 fema… 2007
    4 Adelie Torgersen 36.7 19.3 193 3450 fema… 2007
    5 Adelie Torgersen 39.3 20.6 190 3650 male 2007
    6 Adelie Torgersen 38.9 17.8 181 3625 fema… 2007
    7 Adelie Torgersen 39.2 19.6 195 4675 male 2007
    8 Adelie Torgersen 41.1 17.6 182 3200 fema… 2007
    9 Adelie Torgersen 38.6 21.2 191 3800 male 2007
    10 Adelie Torgersen 34.6 21.1 198 4400 male 2007
    # … with 323 more rows, and abbreviated variable names
    # ¹bill_length_mm, ²bill_depth_mm, ³flipper_length_mm, ⁴body_mass_g
    # ℹ Use `print(n = ...)` to see more rows
    ドル演算⼦
    (magrittrパッケージ)

    View full-size slide

  22. dat %>%
    dplyr::group_by(island) %>%
    tidyr::nest()
    # A tibble: 3 × 2
    # Groups: island [3]
    island data

    1 Torgersen
    2 Biscoe
    3 Dream
    畳み込みたい⽔準で
    事前にグループ化しておく

    View full-size slide

  23. dat %>%
    dplyr::group_by(island) %>%
    tidyr::nest()
    # A tibble: 3 × 2
    # Groups: island [3]
    island data

    1 Torgersen
    2 Biscoe
    3 Dream
    畳み込みたい⽔準で
    事前にグループ化しておく

    View full-size slide

  24. dat %>%
    dplyr::group_by(island) %>%
    tidyr::nest() 畳み込みたい⽔準で
    事前にグループ化しておく
    dat %>%
    tidyr::nest(.by = island)
    dat %>%
    dplyr::group_nest(island)
    グループ化と畳み込みを
    ⼀括でやってくれる

    View full-size slide

  25. > group_nest
    function (.tbl, ..., .key = "data", keep = FALSE)
    {
    lifecycle::signal_stage("experimental", "group_nest()")
    UseMethod("group_nest")
    }


    畳み込みたいデータ
    (data.frame / tibble)
    畳み込む⽔準
    (カラム名, NSE)
    畳み込み先のカラム名
    (⽂字列)
    畳み込み⽔準も畳み込むか
    (TRUE / FALSE)

    View full-size slide

  26. dat %>%
    dplyr::group_nest()
    # A tibble: 1 × 1
    data

    1
    dat %>%
    dplyr::group_nest(species)
    # A tibble: 3 × 2
    species data
    >
    1 Adelie [146 × 7]
    2 Chinstrap [68 × 7]
    3 Gentoo [119 × 7]
    dat %>%
    dplyr::group_nest(.key = "hoge")
    # A tibble: 1 × 1
    hoge

    1
    指定なし=データ全体を1x1に畳み込む
    tidyr::nest()関数でも同じ結果になる。
    畳み込み⽔準を指定
    畳み込み先のカラム名を⽂字列で指定

    View full-size slide

  27. dat %>%
    dplyr::group_nest(species)
    # A tibble: 3 × 2
    species data
    >
    1 Adelie [146 × 7]
    2 Chinstrap [68 × 7]
    3 Gentoo [119 × 7]
    dat %>%
    dplyr::group_nest("species")
    # A tibble: 1 × 2
    `"species"` data
    >
    1 species [333 × 8]
    畳み込み⽔準を⽂字列で指定
    新しいchrカラムを作ってそれで
    全体を畳み込むという挙動になる
    (⾮推奨)
    畳み込み⽔準をNSEで指定
    (推奨)

    View full-size slide

  28. dat %>% group_nest(species, keep = TRUE)
    # A tibble: 3 × 2
    species data
    >
    1 Adelie [146 × 8]
    2 Chinstrap [68 × 8]
    3 Gentoo [119 × 8]
    dat %>% group_nest(species, keep = FALSE)
    # A tibble: 3 × 2
    species data
    >
    1 Adelie [146 × 7]
    2 Chinstrap [68 × 7]
    3 Gentoo [119 × 7]

    View full-size slide

  29. dat %>%
    group_nest(species, keep = TRUE) %$%
    data %>% .[[1]] %>% names()
    dat %>%
    group_nest(species, keep = FALSE) %$%
    data %>% .[[1]] %>% names()
    [1] "species" "island" "bill_length_mm"
    [4] "bill_depth_mm" "flipper_length_mm" "body_mass_g"
    [7] "sex" "year"
    [1] "island" "bill_length_mm" "bill_depth_mm"
    [4] "flipper_length_mm" "body_mass_g" "sex"
    [7] "year"

    View full-size slide

  30. dat %>%
    group_nest(species, island)
    # A tibble: 5 × 3
    species island data
    >
    1 Adelie Biscoe [44 × 6]
    2 Adelie Dream [55 × 6]
    3 Adelie Torgersen [47 × 6]
    4 Chinstrap Dream [68 × 6]
    5 Gentoo Biscoe [119 × 6]
    複数の⽔準を指定できる

    View full-size slide

  31. 畳み込みデータを展開する
    tidyr::unnest()

    View full-size slide

  32. dat_nest <-
    dat %>%
    dplyr::group_nest(species)
    # A tibble: 3 × 2
    species data
    >
    1 Adelie [146 × 7]
    2 Chinstrap [68 × 7]
    3 Gentoo [119 × 7]
    dat_nest %>%
    tidyr::unnest(cols = data)
    データの展開
    展開するカラム名
    (NSE or ⽂字列)

    View full-size slide

  33. dat_nest_ex <-
    tibble::tibble(
    chr = list(letters[1:3]),
    num = list(1:3)
    )
    # A tibble: 1 × 2
    chr num

    1
    dat_nest_ex %>%
    tidyr::unnest(cols = chr)
    # A tibble: 3 × 2
    chr num

    1 a
    2 b
    3 c
    指定されたchrは展開され、
    展開されなかったnumは複製される

    View full-size slide

  34. dat_nest_ex2 <-
    tibble::tibble(
    chr1 = list(letters[1:3]),
    chr2 = list(letters[4:6]),
    num = list(1:3)
    )
    # A tibble: 1 × 3
    chr1 chr2 num

    1

    View full-size slide

  35. dat_nest_ex2 %>%
    tidyr::unnest(cols = chr1)
    # A tibble: 3 × 3
    chr1 chr2 num

    1 a
    2 b
    3 c

    View full-size slide

  36. dat_nest_ex2 %>%
    tidyr::unnest(cols = chr1) %>%
    tidyr::unnest(cols = chr2)
    # A tibble: 9 × 3
    chr1 chr2 num

    1 a d
    2 a e
    3 a f
    4 b d
    5 b e
    6 b f
    7 c d
    8 c e
    9 c f

    View full-size slide

  37. dat_nest_ex2 %>%
    tidyr::unnest(cols = c(chr1, chr2))
    # A tibble: 3 × 3
    chr1 chr2 num

    1 a d
    2 b e
    3 c f
    ⼀度に展開される列同⼠では対応関係が
    保持される(要素数が等しくないとエラー)

    View full-size slide

  38. dat_nest_ex2 %>%
    tidyr::unnest(
    tidyselect::starts_with("chr")
    )
    # A tibble: 3 × 3
    chr1 chr2 num

    1 a d
    2 b e
    3 c f
    列指定にはselectヘルプ関数が使える

    View full-size slide

  39. dat_nest_ex2 %>%
    tidyr::unnest(
    tidyselect::everything()
    )
    # A tibble: 3 × 3
    chr1 chr2 num

    1 a d 1
    2 b e 2
    3 c f 3
    列指定にはselectヘルプ関数が使える

    View full-size slide

  40. データを畳み込む
    dplyr::group_nest()
    畳み込みデータを展開する
    tidyr::unnest()

    View full-size slide

  41. 畳み込みデータを使う
    purrr::map()

    View full-size slide

  42. a <- c(1:5)
    a * 3
    [1] 3 6 9 12 15
    b <- list(1:2, 3:4)
    b * 3
    Error in b * 3 : non-numeric argument to binary operator
    ベクトルの要素に対する演算
    リストの要素に対する演算

    View full-size slide

  43. a <- c(1:5)
    a * 3
    [1] 3 6 9 12 15
    b <- list(1:2, 3:4)
    b %>% purrr::map(function(x){x * 3})
    [[1]]
    [1] 3 6
    [[2]]
    [1] 9 12
    ベクトルの要素に対する演算
    リストの要素に対する演算

    View full-size slide

  44. b %>% purrr::map(function(x){x * 3})
    f <- function(x){x * 3}
    b %>% purrr::map(f) # b %>% map(f())はエラー
    b %>% purrr::map(~(.x * 3))
    b %>% purrr::map(~(. * 3))
    b %>% purrr::map(∖(x){x * 3})
    表記⽅は⾊々(結果は同じ)
    ↑これが今⾵なんだけどPowerPointでは
    バックスラッシュが表⽰できないので却下
    (Macだけ?)

    View full-size slide

  45. dat <-
    tibble::tibble(
    chr = letters[1:3],
    num = 1:3
    )
    dat %>%
    dplyr::mutate(x = num * 3)
    # A tibble: 3 × 3
    chr num x

    1 a 1 3
    2 b 2 6
    3 c 3 9
    普通のテーブルデータの場合
    ベクトルの演算と⼀緒

    View full-size slide

  46. dat <-
    tibble::tibble(
    chr = letters[1:3] ,
    num = list(1:2, 2:3, 3:4)
    )
    dat %>%
    dplyr::mutate(x = num * 3)
    Error in `dplyr::mutate()`:
    ℹ In argument: `x = num * 3`.
    Caused by error in `num * 3`:
    ! non-numeric argument to binary
    operator
    Run `rlang::last_trace()` to see where
    the error occurred.
    畳み込みデータの場合

    View full-size slide

  47. dat <-
    tibble::tibble(
    chr = letters[1:3],
    num = list(1:2, 2:3, 3:4)
    )
    dat %>%
    dplyr::mutate(x = map(num, ~(. * 3)))
    # A tibble: 3 × 3
    chr num x

    1 a
    2 b
    3 c
    畳み込みデータの場合

    View full-size slide

  48. dat <-
    tibble::tibble(
    chr = letters[1:3],
    num = list(1:2, 2:3, 3:4)
    )
    dat %>%
    dplyr::mutate(x = map(num, mean))
    # A tibble: 3 × 3
    chr num x

    1 a
    2 b
    3 c
    畳み込みデータの場合

    View full-size slide

  49. dat <-
    tibble::tibble(
    chr = letters[1:3],
    num = list(1:2, 2:3, 3:4)
    )
    dat %>%
    dplyr::mutate(x = map(num, mean)) %>%
    dplyr::mutate(x = unlist(x))
    # A tibble: 3 × 3
    chr num x

    1 a 1.5
    2 b 2.5
    3 c 3.5
    畳み込みデータの場合

    View full-size slide

  50. dat <-
    tibble::tibble(
    chr = letters[1:3],
    num = list(1:2, 2:3, 3:4)
    )
    dat %>%
    dplyr::mutate(x = map_dbl(num, mean))
    # A tibble: 3 × 3
    chr num x

    1 a 1.5
    2 b 2.5
    3 c 3.5
    畳み込みデータの場合

    View full-size slide

  51. dat <-
    tibble::tibble(
    chr = list("abc", "cdf", "ghr"),
    num = 1:3
    )
    dat %>%
    dplyr::mutate(x = map_chr(
    chr, ~ stringr::str_sub(., 1, 2)))
    # A tibble: 3 × 3
    chr num x

    1 1 ab
    2 2 de
    3 3 gh
    畳み込みデータの場合

    View full-size slide

  52. dat_nest <-
    palmerpenguins::penguins %>%
    na.omit() %>%
    group_nest(island)
    # A tibble: 3 × 2
    island data
    >
    1 Biscoe [163 × 7]
    2 Dream [123 × 7]
    3 Torgersen [47 × 7]
    実践例1

    View full-size slide

  53. dat_g <- dat_nest$data[[1]]
    dat_g %>%
    ggplot(data = .) +
    aes(x = bill_length_mm,
    y = body_mass_g,
    color = species) +
    geom_point()
    実践例1

    View full-size slide

  54. gg_pen <- function(data){
    data %>%
    ggplot(data = .) +
    aes(x = bill_length_mm,
    y = body_mass_g,
    color = species) +
    geom_point()
    }
    実践例1

    View full-size slide

  55. dat_nest
    # A tibble: 3 × 2
    island data
    >
    1 Biscoe [163 × 7]
    2 Dream [123 × 7]
    3 Torgersen [47 × 7]
    実践例1
    dat_nest_g <-
    dat_nest %>%
    mutate(g = map(data, gg_pen))
    # A tibble: 3 × 3
    island data g
    >
    1 Biscoe [163 × 7]
    2 Dream [123 × 7]
    3 Torgersen [47 × 7]

    View full-size slide

  56. dat_nest_g$g
    実践例1
    dat_nest_g
    # A tibble: 3 × 3
    island data g
    >
    1 Biscoe [163 × 7]
    2 Dream [123 × 7]
    3 Torgersen [47 × 7]
    [[1]] [[2]] [[3]]

    View full-size slide

  57. dat_nest_g %$% g %>%
    patchwork::wrap_plots(nrow = 1)
    実践例1
    dat_nest_g
    # A tibble: 3 × 3
    island data g
    >
    1 Biscoe [163 × 7]
    2 Dream [123 × 7]
    3 Torgersen [47 × 7]

    View full-size slide

  58. dat_nest_g %>%
    mutate(g = map2(
    g, island, ~ .x + ggtitle(.y))
    ) %$%
    g %>%
    patchwork::wrap_plots(nrow = 1)
    実践例1
    dat_nest_g
    # A tibble: 3 × 3
    island data g
    >
    1 Biscoe [163 × 7]
    2 Dream [123 × 7]
    3 Torgersen [47 × 7]

    View full-size slide

  59. dat_nest_g %>%
    mutate(g = map2(
    g, island, ~ .x + ggtitle(.y))
    ) %$%
    g %>%
    patchwork::wrap_plots(nrow = 1)
    実践例1

    View full-size slide

  60. dat_nest_g %>%
    mutate(g = map2(
    g, island, ∖(x, y){x + ggtitle(y)}
    ) %$%
    g %>%
    patchwork::wrap_plots(nrow = 1)
    実践例1
    ナウい書き⽅

    View full-size slide

  61. dat_nest_g %>%
    dplyr::rowwise() %>%
    mutate(g = list(g + ggtitle(island))) %$%
    g %>%
    patchwork::wrap_plots(nrow = 1)
    別解

    View full-size slide

  62. 寄り道
    dat %>%
    dplyr::group_nest(island)
    # A tibble: 3 × 2
    island data
    >
    1 Biscoe [163 × 7]
    2 Dream [123 × 7]
    3 Torgersen [47 × 7]
    dat %>%
    dplyr::nest_by(island)
    # A tibble: 3 × 2
    # Rowwise: island
    island data
    >
    1 Biscoe [163 × 7]
    2 Dream [123 × 7]
    3 Torgersen [47 × 7]

    View full-size slide

  63. 寄り道:結果は⼀緒
    dat %>%
    dplyr::group_nest(island) %>%
    mutate(g = purrr::map(data, gg_pen))
    dat %>%
    dplyr::nest_by(island) %>%
    mutate(g = list(gg_pen(data))
    dat %>%
    dplyr::nest_nest(island) %>%
    dplyr::rowwise() %>%
    mutate(g = list(gg_pen(data))

    View full-size slide

  64. dplyr::rowwise()について
    https://notchained.hatenablog.com/entry/2020/07/04/211806

    View full-size slide

  65. 実践例2
    dat2 <-
    palmerpenguins::penguins %>%
    na.omit() %>%
    filter(island %in% c("Biscoe", "Dream")) %>%
    mutate(x = bill_length_mm,
    y = body_mass_g)

    View full-size slide

  66. 実践例2
    dat2 %>%
    gg_pen() +
    facet_wrap(~island)

    View full-size slide

  67. 実践例2
    dat2_nest <-
    dat2 %>%
    group_nest(island)
    # A tibble: 2 × 2
    island data
    >
    1 Biscoe [163 × 9]
    2 Dream [123 × 9]

    View full-size slide

  68. 実践例2
    models <-
    list(
    function(df){lm(y ~ 1, data = df)},
    function(df){lm(y ~ x, data = df)},
    function(df){lme4::lmer(y ~ x + (x|species), data = df)},
    function(df){lme4::lmer(y ~ x + (1|species), data = df)},
    function(df){lme4::lmer(y ~ x + (0 + x|species), data = df)}
    )
    回帰モデルの関数をlistにまとめる。
    dat2_nest_model <-
    dat2_nest %>%
    nest() %>% # tibble: 1x1になる
    mutate(model = list(models)) %>%
    unnest(model) %>% # modelについて展開
    tibble::rowid_to_column(“id”) %>% # model番号を振る
    unnest(data) # dataについて展開

    View full-size slide

  69. 実践例2
    dat2_nest_model
    # A tibble: 10 × 4
    id island data model
    >
    1 1 Biscoe [163 × 9]
    2 1 Dream [123 × 9]
    3 2 Biscoe [163 × 9]
    4 2 Dream [123 × 9]
    5 3 Biscoe [163 × 9]
    6 3 Dream [123 × 9]
    7 4 Biscoe [163 × 9]
    8 4 Dream [123 × 9]
    9 5 Biscoe [163 × 9]
    10 5 Dream [123 × 9]
    > dat2_nest_model$model[[5]]
    function(data){lme4::lmer(y ~ x + (x|species), data = data)}

    View full-size slide

  70. dat2_nest_model_fit <-
    dat2_nest_model %>%
    mutate(fit = map2(model, data, ~ .x(.y))) %>%
    mutate(AIC = map_dbl(fit, AIC))
    # A tibble: 10 × 6
    id island data model fit AIC
    >
    1 1 Biscoe [163 × 9] 2641.
    2 1 Dream [123 × 9] 1834.
    3 2 Biscoe [163 × 9] 2413.
    4 2 Dream [123 × 9] 1824.
    5 3 Biscoe [163 × 9] 2393.
    6 3 Dream [123 × 9] 1787.
    7 4 Biscoe [163 × 9] 2389.
    8 4 Dream [123 × 9] 1787.
    9 5 Biscoe [163 × 9] 2390.
    10 5 Dream [123 × 9] 1785.
    実践例2
    cAICを使う⼿もある

    View full-size slide

  71. dat2_nest_model_fit <-
    dat2_nest_model %>%
    mutate(fit = map2(model, data, ~ .x(.y))) %>%
    mutate(AIC = map_dbl(fit, AIC))
    実践例2
    dat2_nest_model_fit <-
    dat2_nest_model %>%
    dplyr::rowwise() %>%
    mutate(fit = list(model(data))) %>%
    mutate(AIC = AIC(fit))
    別解

    View full-size slide

  72. dat2_nest_model_minAIC <-
    dat2_nest_model_fit %>%
    group_by(island) %>%
    filter(AIC == min(AIC))
    # A tibble: 2 × 6
    # Groups: island [2]
    id island data model fit AIC
    >
    1 4 Biscoe [163 × 9] 2389.
    2 5 Dream [123 × 9] 1785.
    実践例2

    View full-size slide

  73. dat2_nest_model_minAIC %$%
    rlang::set_names(fit, island)
    $Biscoe
    Linear mixed model fit by REML ['lmerMod']
    Formula: y ~ x + (1 | species)
    Data: data
    ...
    $Dream
    Linear mixed model fit by REML ['lmerMod']
    Formula: y ~ x + (0 + x | species)
    Data: data
    ...
    実践例2
    切⽚の混合効果
    傾きの混合効果

    View full-size slide

  74. 実践例2
    dat2 %>%
    gg_pen() +
    facet_wrap(~island)
    切⽚の混合効果 傾きの混合効果

    View full-size slide

  75. dat2_model_minAIC_pred <-
    dat2_nest_model_minAIC %>%
    mutate(pred = map(fit, predict)) %>%
    select(island, data, pred) %>%
    unnest(everything())
    # A tibble: 286 × 11
    # Groups: island [2]
    island species bill_l…¹ bill_…² flipp…³ body_…⁴ sex year

    1 Biscoe Adelie 37.8 18.3 174 3400 fema… 2007
    2 Biscoe Adelie 37.7 18.7 180 3600 male 2007
    3 Biscoe Adelie 35.9 19.2 189 3800 fema… 2007
    4 Biscoe Adelie 38.2 18.1 185 3950 male 2007
    5 Biscoe Adelie 38.8 17.2 180 3800 male 2007
    6 Biscoe Adelie 35.3 18.9 187 3800 fema… 2007
    7 Biscoe Adelie 40.6 18.6 183 3550 male 2007
    8 Biscoe Adelie 40.5 17.9 187 3200 fema… 2007
    9 Biscoe Adelie 37.9 18.6 172 3150 fema… 2007
    10 Biscoe Adelie 40.5 18.9 180 3950 male 2007
    # … with 276 more rows, 3 more variables: x , y ,
    # pred , and abbreviated variable names
    # ¹bill_length_mm, ²bill_depth_mm, ³flipper_length_mm,
    # ⁴body_mass_g
    # ℹ Use `print(n = ...)` to see more rows, and `colnames()` to see all variable nam
    実践例2

    View full-size slide

  76. dat2_model_minAIC_pred %>%
    gg_pen() +
    geom_path(aes(y = pred)) +
    facet_wrap(~island)
    実践例2
    切⽚の混合効果 傾きの混合効果

    View full-size slide

  77. 表データの加⼯と可視化の概観
    Long
    Wide
    Nested
    plot
    Figures
    Data table
    read_csv write_csv
    pivot_longer
    pivot_wider
    group_nest
    unnest
    ggplot
    ggsave
    wrap_plots
    map
    rowwise

    View full-size slide