
Tokyo.R #97 Data Visualization


Slides from my talk in the beginner session at the 97th Tokyo.R.

kilometer

March 19, 2022


Transcript

  1. #97
    @kilometer00
    2022.03.19
    BeginneR Session
    -- Data Visualization --


  2. Who!?
    Who is this?

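  3. Who!?
    Name: Mimura (三村) @kilometer
    Occupation: Postdoc (Doctor of Engineering)
    Specialty: Behavioral neuroscience (primates)
    Brain imaging
    Medical systems engineering
    R experience: about 10 years
    Current craze: むし社 (Mushi-sha)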

  4. Plug!! (I took part in translating a book.)

  5. BeginneR Session


  6. BeginneR


  7. BeginneR
    Advanced
    Hoxo_m
    If I have seen further it is by standing on the shoulders of Giants.
    -- Sir Isaac Newton, 1676

  8. Before After
    BeginneR Session
    BeginneR BeginneR


  9. Boolean operators (Boolean algebra)
    "a" != "b"
    [1] TRUE
    1 %in% 10:100   # is A in B?
    [1] FALSE

  10. George Boole
    1815 - 1864
    Mathematician & Philosopher
    A Class-Room Introduction to Logic
    https://niyamaklogic.wordpress.com/category/laws-of-thoughts/

  11. Boolean operators (Boolean algebra)
    A == B      # equal to
    A != B      # not equal to
    A | B       # or
    A & B       # and
    A %in% B    # is A in B?
    George Boole, 1815 - 1864 (source: Wikipedia)
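
The operators listed on this slide can be tried directly in the R console. The following is a minimal sketch; the vectors `x` and `y` and the values passed to `%in%` are illustrative assumptions, not taken from the talk.

```r
# Comparison operators return logical values.
is_equal     <- "a" == "b"   # equal to
is_not_equal <- "a" != "b"   # not equal to

# | and & combine logical vectors element by element.
x <- c(TRUE, TRUE, FALSE)
y <- c(TRUE, FALSE, FALSE)
either <- x | y   # TRUE where at least one side is TRUE
both   <- x & y   # TRUE only where both sides are TRUE

# %in% checks, for each element on the left, whether it appears on the right.
in_range <- c(1, 50) %in% 10:100

print(is_equal)      # FALSE
print(is_not_equal)  # TRUE
print(either)        # TRUE TRUE FALSE
print(both)          # TRUE FALSE FALSE
print(in_range)      # FALSE TRUE
```

Note that `%in%` answers one question per element on its left-hand side, so the result has the same length as the left-hand vector.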

  12. Programming

  13. Programming

  14. Programming
    Write
    Run
    Read
    Think
    Write
    Run
    Read
    Think
    Communicate
    Share

  15. Text Image
    Information
    Intention
    Data
    decode
    encode
    Data analysis
    feedback


  16. Text
    Image
    First, A. Next, B.
    Then C. Finally D.
    time
    Intention
    encode
    "Frozen" structure
    A B C D time
    value
    α
    β

  17. Data
    Information that is reusable and suitable for communication, interpretation, or processing
    Definition by the International Electrotechnical Commission (IEC)

  18. Data
    Information that is reusable and suitable for communication, interpretation, or processing
    Information: a representation that encodes reality

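  19. Data
    Information that is reusable and suitable for communication, interpretation, or processing
    Information: a representation that encodes reality
    Reality: the thing itself, which exists whether or not it is observed
    Mapping (encoding)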

  20. Mapping
    Apple (reality)
    Apple (information)

  21. Mapping
    𝑓: 𝑋 → 𝑌
    The process of associating each element of one set of information
    with exactly one element of another set of information
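
In R, one simple way to picture such a mapping is a named vector used as a lookup table. This is only an illustration of the definition above; the fruit names and colors are made up for the example.

```r
# Set X: the elements we start from.
fruits <- c("apple", "banana", "cherry")

# The mapping f: X -> Y, written as a named vector (a lookup table).
# Every element of X is associated with exactly one element of Y.
color_of <- c(apple = "red", banana = "yellow", cherry = "red")

# Apply the mapping to each element of X.
colors <- unname(color_of[fruits])
print(colors)

# Two different inputs can share one output, so the mapping cannot always
# be reversed: knowing only "red" does not tell us which fruit it was.
fruits_that_are_red <- names(color_of)[color_of == "red"]
print(fruits_that_are_red)
```

The second half hints at why encoding can lose information: once only the color is kept, the distinction between the two red fruits is gone.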

  22. Amount of information
    Reality
    Information
    Data: apple
    Encoding

  23. Amount of information
    Reality
    Information
    Data: apple
    Encoding
    Loss of information

  24. Apple
    Mapping
    Fruit
    Red
    Image
    Reality, Information
    Channel

  25. Data visualization
    [Figure: elements x_1, x_2 of X and y_1, y_2 of Y, shown as a mapping between two sets and as points on X and Y axes]
    Mapping

  26. Data visualization
    [Figure: elements x_1, x_2 of X and y_1, y_2 of Y, shown as a mapping between two sets and as points on X and Y axes]
    Mapping
    Aesthetic channels: x axis, y axis, color, fill, shape, linetype, alpha…

  27. Data visualization
    [Figure: elements x_1, x_2 of X and y_1, y_2 of Y, shown as a mapping between two sets and as points on X and Y axes]
    Mapping
    Aesthetic channels: x axis, y axis, color, fill, shape, linetype, alpha…
    Plotting with ggplot
    ggplot(data = my_data) +
      aes(x = X, y = Y) +
      geom_point()
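
To make the idea of aesthetic channels concrete, the sketch below inspects what `aes()` records before anything is drawn. It assumes the ggplot2 package is installed; `my_data` and its `tag` column are toy values chosen for illustration.

```r
library(ggplot2)

# Toy data (assumed for illustration).
my_data <- data.frame(
  X   = c(1, 3, 5, 7),
  Y   = c(3, 9, 4, 2),
  tag = rep(c("a", "b"), each = 2)
)

# aes() only records which variable goes to which aesthetic channel;
# nothing is drawn yet.
mapping <- aes(x = X, y = Y, color = tag)
channels <- names(mapping)
print(channels)  # ggplot2 stores "color" under the spelling "colour"

# The same mapping drives the plot: X -> x axis, Y -> y axis, tag -> color.
p <- ggplot(data = my_data) +
  mapping +
  geom_point()
print(class(p))
```

Printing `p` (or typing `p` at the console) would draw the scatter plot; the point here is only that the mapping is a separate object that can be examined and reused.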

  28. Reality
    Mapping (observation)
    Data
    Mapping (data visualization)
    Graph
    [Figure: elements x_1, x_2 of X and y_1, y_2 of Y, shown as a mapping between two sets and as points on X and Y axes]
    Aesthetic channels

  29. Your first ggplot
    library(tidyverse)
    dat <- data.frame(tag = rep(c("a", "b"), each = 2),
                      X = c(1, 3, 5, 7),
                      Y = c(3, 9, 4, 2))
    ggplot() +
      geom_point(data = dat,
                 mapping = aes(x = X, y = Y))

  30. Your first ggplot

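  31. Your first ggplot
    library(tidyverse)
    dat <- data.frame(tag = rep(c("a", "b"), each = 2),
                      X = c(1, 3, 5, 7),
                      Y = c(3, 9, 4, 2))
    ggplot() +
      geom_point(data = dat,
                 mapping = aes(x = X, y = Y))
    ggplot(): declares the start of drawing
    +: layers are joined with the + sign
    geom_*(): use the function that matches the type of graph
    data: specifies the data.frame
    aes(): specifies, as aesthetic elements, which variable corresponds to which channel
    Argument names of the aes() function
    Variable names in dat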

  32. Your next ggplot after the first
    library(tidyverse)
    dat <- data.frame(tag = rep(c("a", "b"), each = 2),
                      X = c(1, 3, 5, 7),
                      Y = c(3, 9, 4, 2))
    ggplot() +
      geom_point(data = dat,
                 mapping = aes(x = X, y = Y)) +
      geom_path(data = dat,
                mapping = aes(x = X, y = Y))

  33. Your next ggplot after the first

  34. Various ways to write ggplot code
    ggplot() +
      geom_point(data = dat,
                 mapping = aes(x = X, y = Y)) +
      geom_path(data = dat,
                mapping = aes(x = X, y = Y))
    Settings shared by all layers can be given inside ggplot() and omitted afterwards:
    ggplot(data = dat,
           mapping = aes(x = X, y = Y)) +
      geom_point() +
      geom_path()
    The aes() call that carries the mapping can also be placed outside ggplot():
    ggplot(data = dat) +
      aes(x = X, y = Y) +
      geom_point() +
      geom_path()

  35. Various ways to write ggplot code
    Specify a mapping for the color of the points:
    ggplot() +
      geom_point(data = dat,
                 mapping = aes(x = X, y = Y, color = tag)) +
      geom_path(data = dat,
                mapping = aes(x = X, y = Y))
    ggplot(data = dat) +
      aes(x = X, y = Y) + # factor out only what is shared
      geom_point(mapping = aes(color = tag)) +
      geom_path()

  36. Various ways to write ggplot code
    ggplot(data = dat) +
      aes(x = X, y = Y) +
      geom_point(aes(color = tag)) +
      geom_path()
    ggplot(data = dat) +
      aes(x = X, y = Y) +
      geom_path() +
      geom_point(aes(color = tag))
    Elements added later are drawn in front
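
The drawing order can also be checked without looking at the picture. The sketch below assumes ggplot2 is installed and relies on the plot object exposing its layers as `p$layers`, which is a detail of ggplot2's object structure rather than something shown in the talk; `layer_geoms()` is a hypothetical helper written for this example.

```r
library(ggplot2)

dat <- data.frame(tag = rep(c("a", "b"), each = 2),
                  X = c(1, 3, 5, 7),
                  Y = c(3, 9, 4, 2))

# Points first, then the path: the path is drawn over the points.
points_then_path <- ggplot(data = dat) +
  aes(x = X, y = Y) +
  geom_point(aes(color = tag)) +
  geom_path()

# Path first, then the points: the points are drawn over the path.
path_then_points <- ggplot(data = dat) +
  aes(x = X, y = Y) +
  geom_path() +
  geom_point(aes(color = tag))

# Hypothetical helper: list the geom of each layer in the order it was added.
layer_geoms <- function(p) {
  vapply(p$layers, function(layer) class(layer$geom)[1], character(1))
}

order_a <- layer_geoms(points_then_path)
order_b <- layer_geoms(path_then_points)
print(order_a)
print(order_b)
```

Layers are stored in the order they were added with `+`, and the last one in the list ends up in front.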

  37. Saving a ggplot image
    library(tidyverse)
    dat <- data.frame(tag = rep(c("a", "b"), each = 2),
                      X = c(1, 3, 5, 7),
                      Y = c(3, 9, 4, 2))
    g <- ggplot(data = dat) +
      aes(x = X, y = Y) +
      geom_path() +
      geom_point(mapping = aes(color = tag))
    ggsave(filename = "fig/demo01.png",
           plot = g,
           width = 4, height = 3, dpi = 150)

  38. Saving a ggplot image
    library(tidyverse)
    dat <- data.frame(tag = rep(c("a", "b"), each = 2),
                      X = c(1, 3, 5, 7),
                      Y = c(3, 9, 4, 2))
    g <- ggplot(data = dat) +
      aes(x = X, y = Y) +
      geom_path() +
      geom_point(mapping = aes(color = tag))
    ggsave(filename = "fig/demo01.png",
           plot = g,
           width = 4, height = 3, dpi = 150)
    By default, the size is specified in inches

  39. Saving a ggplot image
    library(tidyverse)
    dat <- data.frame(tag = rep(c("a", "b"), each = 2),
                      X = c(1, 3, 5, 7),
                      Y = c(3, 9, 4, 2))
    g <- ggplot(data = dat) +
      aes(x = X, y = Y) +
      geom_path() +
      geom_point(mapping = aes(color = tag))
    ggsave(filename = "fig/demo01.png",
           plot = g,
           width = 10, height = 7.5, dpi = 150,
           units = "cm") # "cm", "mm", or "in" can be specified
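
A practical detail when saving: the target directory has to exist, and depending on the ggplot2 version `ggsave()` may stop or ask for confirmation when it does not. The sketch below, which assumes ggplot2 and a working PNG graphics device, creates the directory first and writes into a temporary location (`out_dir` and `out_file` are names chosen for this example). It also shows how the physical size and `dpi` relate to the approximate pixel size.

```r
library(ggplot2)

dat <- data.frame(tag = rep(c("a", "b"), each = 2),
                  X = c(1, 3, 5, 7),
                  Y = c(3, 9, 4, 2))

g <- ggplot(data = dat) +
  aes(x = X, y = Y) +
  geom_path() +
  geom_point(mapping = aes(color = tag))

# Make sure the output directory exists before saving.
out_dir <- file.path(tempdir(), "fig")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
out_file <- file.path(out_dir, "demo01.png")

ggsave(filename = out_file,
       plot = g,
       width = 10, height = 7.5, dpi = 150,
       units = "cm")

# Approximate pixel size = size in inches * dpi, with 1 inch = 2.54 cm.
width_px  <- round(10  / 2.54 * 150)
height_px <- round(7.5 / 2.54 * 150)
print(c(width_px, height_px))
```

Raising `dpi` while keeping `width` and `height` fixed gives more pixels for the same physical size, which is usually what is wanted for print.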

  40. The geom_*() family of functions
    cf. https://www.rstudio.com/resources/cheatsheets/

  41. Plotting multiple series
    > head(anscombe)
      x1 x2 x3 x4   y1   y2    y3   y4
    1 10 10 10  8 8.04 9.14  7.46 6.58
    2  8  8  8  8 6.95 8.14  6.77 5.76
    3 13 13 13  8 7.58 8.74 12.74 7.71
    4  9  9  9  8 8.81 8.77  7.11 8.84
    5 11 11 11  8 8.33 9.26  7.81 8.47
    6 14 14 14  8 9.96 8.10  8.84 7.04
    ggplot(data = anscombe) +
      geom_point(aes(x = x1, y = y1)) +
      geom_point(aes(x = x2, y = y2), color = "Red") +
      geom_point(aes(x = x3, y = y3), color = "Blue") +
      geom_point(aes(x = x4, y = y4), color = "Green")
    Pushing through with what we know so far, it ends up like this

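  42. Data visualization with ggplot
    Reality
    Mapping (observation)
    Data
    Mapping (data visualization)
    Graph
    [Figure: points (x_1, y_1) and (x_2, y_2) plotted on X and Y axes]
    Raw data
    Transformation
    Data in a form suited to visualization
    Mapping
    Aesthetic channels
    Each aesthetic channel of the figure corresponds to one variable of the data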

  43. > head(anscombe)
      x1 x2 x3 x4   y1   y2    y3   y4
    1 10 10 10  8 8.04 9.14  7.46 6.58
    2  8  8  8  8 6.95 8.14  6.77 5.76
    3 13 13 13  8 7.58 8.74 12.74 7.71
    4  9  9  9  8 8.81 8.77  7.11 8.84
    5 11 11 11  8 8.33 9.26  7.81 8.47
    6 14 14 14  8 9.96 8.10  8.84 7.04
    Reshape so that each variable corresponds to an aesthetic channel (x axis, y axis, color)
    > head(anscombe_long)
      key  x    y
    1   1 10 8.04
    2   2 10 9.14
    3   3 10 7.46
    4   4  8 6.58
    5   1  8 6.95
    6   2  8 8.14
    ggplot(data = anscombe_long) +
      aes(x = x, y = y, color = key) +
      geom_point()
    The visualization becomes simple and easy to follow

  44. > head(anscombe)
      x1 x2 x3 x4   y1   y2    y3   y4
    1 10 10 10  8 8.04 9.14  7.46 6.58
    2  8  8  8  8 6.95 8.14  6.77 5.76
    3 13 13 13  8 7.58 8.74 12.74 7.71
    4  9  9  9  8 8.81 8.77  7.11 8.84
    5 11 11 11  8 8.33 9.26  7.81 8.47
    6 14 14 14  8 9.96 8.10  8.84 7.04
    Wide data (above), long data (below)
    > head(anscombe_long)
      key  x    y
    1   1 10 8.04
    2   2 10 9.14
    3   3 10 7.46
    4   4  8 6.58
    5   1  8 6.95
    6   2  8 8.14
    Reshape so that each variable corresponds to an aesthetic channel (x axis, y axis, color)
    anscombe_long <- pivot_longer(data = anscombe,
                                  cols = everything(),
                                  names_to = c(".value", "key"),
                                  names_pattern = "(.)(.)")
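
The reshaping step can be checked on its own. The following sketch assumes the tidyr package is installed and uses the `anscombe` data set that ships with R; the comments spell out how `names_pattern` and `.value` split the column names.

```r
library(tidyr)

# anscombe has 11 rows and the columns x1..x4, y1..y4.
anscombe_long <- pivot_longer(
  data = anscombe,
  cols = everything(),
  names_to = c(".value", "key"),
  names_pattern = "(.)(.)"
)

# "(.)(.)" splits each two-character name, e.g. "x1" -> "x" and "1".
# ".value" turns the first piece into output columns (x and y);
# the second piece is stored in the new column `key`.
print(dim(anscombe_long))        # 44 rows (11 * 4) and 3 columns
print(names(anscombe_long))
print(table(anscombe_long$key))  # 11 rows for each key
```

After this step each of the three aesthetic channels used in the plot (x axis, y axis, color) has exactly one column to map from.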

  45. Split the figure by level
    ggplot(data = anscombe_long) +
      aes(x = x, y = y, color = key) +
      geom_point()
    ggplot(data = anscombe_long) +
      aes(x = x, y = y, color = key) +
      geom_point() +
      facet_wrap(facets = . ~ key, nrow = 1)

  46. Wide Long
    Nested
    input output
    pivot_longer
    pivot_wider
    group_nest
    unnest
    ggplot
    visualization
    map
    output
    ggsave

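
The functions named on this slide can be chained into one small workflow. The sketch below is one possible reading of the diagram, not the speaker's own code: it assumes tidyr, dplyr, purrr and ggplot2 are installed, and the output file names under `tempdir()` are invented for the example.

```r
library(tidyr)
library(dplyr)
library(purrr)
library(ggplot2)

# Wide -> long
anscombe_long <- pivot_longer(
  data = anscombe,
  cols = everything(),
  names_to = c(".value", "key"),
  names_pattern = "(.)(.)"
)

# Long -> nested: one row per key, with that key's rows stored in `data`.
nested <- group_nest(anscombe_long, key)

# map(): build one plot per nested data frame.
nested$plot <- map(nested$data, function(d) {
  ggplot(data = d) +
    aes(x = x, y = y) +
    geom_point()
})

# ggsave(): write one file per plot (file names are assumptions).
out_files <- file.path(tempdir(), paste0("anscombe_", nested$key, ".png"))
walk2(out_files, nested$plot, function(f, p) {
  ggsave(filename = f, plot = p, width = 4, height = 3, dpi = 150)
})

# Nested -> long again with unnest().
restored <- unnest(nested[c("key", "data")], cols = data)
print(dim(restored))
```

Keeping the plots in a list column next to their data means the key, the data and the figure for each group always travel together.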

  47. Enjoy!!
    KMT©
