
Tutorial materials for the Japanese Joint Statistical Meeting (統計関連学会連合大会)

kilometer
September 05, 2021


Slides from a talk given in the tutorial session of the Japanese Joint Statistical Meeting.

https://confit.atlas.jp/guide/event/jfssa2021/session/1A01-03/tables?pEtzTuuUqj


Transcript

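  1. Data visualization for data analysis with R (Rによるデータ解析のためのデータ可視化)
    Koki Mimura (三村 喬生) (1), Akifumi Eguchi (江口 哲史) (2), Shinya Uryu (瓜生 真也) (3)
    Japanese Joint Statistical Meeting tutorial, 2021.09.05 (online)

    1: Department of Functional Brain Imaging, National Institutes for Quantum Science and Technology
    2: Center for Preventive Medical Sciences, Chiba University
    3: Biodiversity Division, National Institute for Environmental Studies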
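  2. Introduction to data visualization with R and ggplot2: outline
    1. What is data?
    2. Getting started with data visualization
    3. Great figures
    4. The upsides and downsides of figure decoration
    5. Types of graphs
    6. Visual perception
    7. Gestalt laws
    8. Getting started with R
    9. How to use RStudio
    10. How to use R Markdown
    11. Your first ggplot2
    12. Adjusting graphs
    13. Slightly more complex plots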
  3. Data visualization is a mapping.
    [Figure: a mapping from a set X (elements x_1, x_2) to a set Y (elements y_1, y_2).]
  4. Data visualization is a mapping onto aesthetic channels.
    [Figure: a mapping from a set X (elements x_1, x_2) to a set Y (elements y_1, y_2).]
    Aesthetic channels: x axis, y axis, color, fill, shape, linetype, alpha…
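  5. Motion tracking: whole-body movement data
    Data table with columns sec, part, x, y, z. The (sec, part) pairs run:
    (0.0, head), (0.0, trunk), (0.0, hip), (0.1, head), (0.1, trunk), (0.1, hip), (0.2, head), ...
    Mimura et al., iScience, 2021. cf: 3dtrack.org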
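  6. Phenomenon → (tracking) → data → (data visualization) → graph
    Phenomenon: biased rotational movement.
    Data table with columns sec, part, x, y, z. The (sec, part) pairs run:
    (0.0, head), (0.0, trunk), (0.0, hip), (0.1, head), (0.1, trunk), (0.1, hip), (0.2, head), ...
    100 hours × 3600 seconds × 30 frames × 3 body parts × 3 dimensions = 97,200,000 rows
    Mimura et al., iScience, 2021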
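The row count quoted on this slide can be checked with a few lines of R. This is only a sketch of the arithmetic; the variable names are chosen here for illustration and do not come from the slides.

```r
# Reproduce the slide's arithmetic for the size of the tracking data.
hours             <- 100
seconds_per_hour  <- 3600
frames_per_second <- 30
parts             <- 3   # head, trunk, hip
dimensions        <- 3   # x, y, z

n_rows <- hours * seconds_per_hour * frames_per_second * parts * dimensions

# scientific = FALSE keeps the number in plain notation with thousands separators.
format(n_rows, big.mark = ",", scientific = FALSE)
```

At this size the table cannot be read by eye, which is the motivation for turning it into a graph.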
  7. (Repeats slide 6.)
  8. Types of graphs
    Example: the 2016 UK referendum on leaving the EU (Brexit).
    ref. “UK: A Divided Nation”, by Armstrong, M., 2016, https://www.statista.com/chart/5100/uk-chooses-brexit/
  9. Data visualization is also a mapping.
    [Figure: a mapping from a set X (elements x_1, x_2) to a set Y (elements y_1, y_2).]
  10. Mappings and channels in data visualization
    [Figure: a mapping from a set X (elements x_1, x_2) to a set Y (elements y_1, y_2).]
    Aesthetic channels: x axis, y axis, color, fill, shape, linetype, alpha…
  11. Plotting with ggplot2 (covered in the second half)
    [Figure: a mapping from a set X (elements x_1, x_2) to a set Y (elements y_1, y_2).]
    Aesthetic channels: x axis, y axis, color, fill, shape, linetype, alpha…

        ggplot(data = my_data) +
          aes(x = X, y = Y) +
          geom_point()
  12. The very basics of RStudio
    What you write in the script:

        x <- 1
        y <- 2
        x + y

    Console output:

        > x + y
        [1] 3
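  13. The very basics of RStudio
    What you write in the script:

        x <- 1
        y <- 2
        x <- 2

    Console output:

        > x + y
        [1] 4

    Assigning to a variable name that already exists overwrites its value.
    Annotation: the comment-out symbol (#).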
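The two points on this slide, overwriting and commenting out, can be combined into one short script. This is a minimal sketch: the commented-out line and the name `total` are added here for illustration and are not on the slide.

```r
x <- 1
y <- 2
# x <- 100   # commented out with '#', so R never runs this line
x <- 2       # reuses the name x, so the earlier value 1 is overwritten

total <- x + y
total
```

Because only the last assignment to `x` that actually runs counts, the sum uses `x = 2`.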
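  14. Packages: installing packages from CRAN
    CRAN (The Comprehensive R Archive Network): the package repository maintained by the R development team
    https://cran.r-project.org/

        install.packages(pkgs = "tidyverse")

    tidyverse: a package that bundles a set of data-science packages
    ・dplyr: manipulating and summarizing table data
    ・ggplot2: drawing graphs
    ・stringr: string manipulation
    ・tidyr: tidying and reshaping data
    ・purrr: functional programming
    ・magrittr: provides the pipe operator %>%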
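  15. Handling table data in R
    Reality → mapping (observation) → data → mapping (data visualization) → graph
    Data table with columns sec, part, x, y, z. The (sec, part) pairs run:
    (0.0, head), (0.0, trunk), (0.0, hip), (0.1, head), (0.1, trunk), (0.1, hip), (0.2, head), ...
    Annotations: variable, observation.
    data.frame: a table in which the measured variables are tied to one another by observations.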
  16. Creating a data.frame

        my_data <- data.frame(tag = c("a", "b", "c"),
                              x = c(1, 2, 3),
                              y = c(4, 7, 9))

        > my_data
          tag x y
        1   a 1 4
        2   b 2 7
        3   c 3 9

    Annotations: function, variable name, object name, assignment operator, vector, character type, numeric type.
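To see the character and numeric types mentioned in the annotations, the columns of the data.frame can be inspected directly. This is a small sketch using base R only; it assumes R 4.0 or later, where `data.frame()` keeps strings as character vectors rather than converting them to factors, and the name `col_classes` is introduced here for illustration.

```r
my_data <- data.frame(tag = c("a", "b", "c"),
                      x = c(1, 2, 3),
                      y = c(4, 7, 9))

# Class of each column (each column is one vector).
col_classes <- sapply(my_data, class)
col_classes

# Number of observations (rows) and variables (columns).
nrow(my_data)
ncol(my_data)

# A single variable can be pulled out as a vector with `$`.
my_data$y
```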
  17. Creating regular vectors
    Arithmetic-sequence vectors (each of these returns [1] 1 2 3):

        c(1, 2, 3)
        c(1:3)
        seq(from = 1, to = 3, by = 1)

    Repeating vectors:

        rep(x = c(1:2), times = 3)
        [1] 1 2 1 2 1 2

        rep(x = c(1:2), each = 3)
        [1] 1 1 1 2 2 2
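A few variations on `seq()` and `rep()` show how the arguments interact. This is a sketch with values chosen here for illustration; only base R functions are used.

```r
# Step of 2 instead of 1.
odd_numbers <- seq(from = 1, to = 9, by = 2)
odd_numbers

# Fix the number of elements instead of the step size.
quarters <- seq(from = 0, to = 1, length.out = 5)
quarters

# `each` is applied first, then the result is repeated `times` times.
tags <- rep(x = c("a", "b"), each = 2, times = 2)
tags
```

The `each` form is the one used later in the slides to build the `tag` column with `rep(c("a", "b"), each = 2)`.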
  18. Reading and writing a data.frame
    Working folder (working directory): script.R, a data folder holding data.csv, and a fig folder.

        # Check the current working directory
        getwd()
        # Set the working directory
        setwd("directorypath")
        # Create a folder
        dir.create("data")

        # Read
        my_data <- read.csv(file = "data/data.csv")
        # Write
        write.csv(x = my_data, file = "data/data.csv")
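A write-then-read round trip is a quick way to check that a CSV file holds what was intended. The sketch below writes into a temporary directory so it does not touch the working folder; the path under `tempdir()` and the names `out_dir`, `csv_path`, and `reloaded` are assumptions made for this example.

```r
my_data <- data.frame(tag = c("a", "b", "c"),
                      x = c(1, 2, 3),
                      y = c(4, 7, 9))

# Hypothetical output location inside R's per-session temporary directory.
out_dir <- file.path(tempdir(), "data")
dir.create(out_dir, showWarnings = FALSE)
csv_path <- file.path(out_dir, "data.csv")

# row.names = FALSE stops write.csv() from adding a column of row names,
# which read.csv() would otherwise read back as an extra column.
write.csv(x = my_data, file = csv_path, row.names = FALSE)

reloaded <- read.csv(file = csv_path)
reloaded
```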
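  19. Handling table data in R
    Reality → mapping (observation) → data → mapping (data visualization) → graph
    [Figure: a mapping from a set X (elements x_1, x_2) to a set Y (elements y_1, y_2).]
    Labels: data, mapping, aesthetic channels.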
  20. Your first ggplot

        library(tidyverse)

        dat <- data.frame(tag = rep(c("a", "b"), each = 2),
                          X = c(1, 3, 5, 7),
                          Y = c(3, 9, 4, 2))

        ggplot() +
          geom_point(data = dat, mapping = aes(x = X, y = Y))
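  21. Your first ggplot (annotated)

        library(tidyverse)

        dat <- data.frame(tag = rep(c("a", "b"), each = 2),
                          X = c(1, 3, 5, 7),
                          Y = c(3, 9, 4, 2))

        ggplot() +
          geom_point(data = dat, mapping = aes(x = X, y = Y))

    ・ggplot(): declares the start of drawing
    ・+: joins the pieces together
    ・geom_point(): use the geom_*() function that matches the type of graph
    ・data = dat: specifies the data.frame
    ・mapping = aes(...): inside aes(), specify as aesthetics which variable corresponds to which channel
    ・x, y: argument names of aes(); X, Y: variable names in dat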
  22. The next ggplot after your first

        library(tidyverse)

        dat <- data.frame(tag = rep(c("a", "b"), each = 2),
                          X = c(1, 3, 5, 7),
                          Y = c(3, 9, 4, 2))

        ggplot() +
          geom_point(data = dat, mapping = aes(x = X, y = Y)) +
          geom_path(data = dat, mapping = aes(x = X, y = Y))
  23. Various ways to write ggplot code

        ggplot() +
          geom_point(data = dat, mapping = aes(x = X, y = Y)) +
          geom_path(data = dat, mapping = aes(x = X, y = Y))

    Settings shared by all layers can be given inside ggplot() and omitted from the layers that follow:

        ggplot(data = dat, mapping = aes(x = X, y = Y)) +
          geom_point() +
          geom_path()

    The aes() call that holds the mapping can also be placed outside ggplot():

        ggplot(data = dat) +
          aes(x = X, y = Y) +
          geom_point() +
          geom_path()
  24. Various ways to write ggplot code: specifying the mapping for point color

        ggplot() +
          geom_point(data = dat, mapping = aes(x = X, y = Y, color = tag)) +
          geom_path(data = dat, mapping = aes(x = X, y = Y))

        ggplot(data = dat) +
          aes(x = X, y = Y) +   # factor out only what the layers share
          geom_point(mapping = aes(color = tag)) +
          geom_path()
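  25. Various ways to write ggplot code: layer order
    Elements added later with + are drawn in front.

        # The path is added last, so it is drawn over the points
        ggplot(data = dat) +
          aes(x = X, y = Y) +
          geom_point(aes(color = tag)) +
          geom_path()

        # The points are added last, so they are drawn over the path
        ggplot(data = dat) +
          aes(x = X, y = Y) +
          geom_path() +
          geom_point(aes(color = tag))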
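The drawing order can also be read off the plot object itself, since a ggplot object keeps its layers in the order they were added. The sketch below assumes ggplot2 is installed and that the object exposes its layers as `g$layers`, each with a `geom` element; the helper name `layer_geoms` is introduced here for illustration.

```r
library(ggplot2)

dat <- data.frame(tag = rep(c("a", "b"), each = 2),
                  X = c(1, 3, 5, 7),
                  Y = c(3, 9, 4, 2))

g <- ggplot(data = dat) +
  aes(x = X, y = Y) +
  geom_path() +
  geom_point(aes(color = tag))

# First class name of each layer's geom, in the order the layers were added.
# Layers later in this vector are drawn on top of earlier ones.
layer_geoms <- unname(sapply(g$layers, function(layer) class(layer$geom)[1]))
layer_geoms
```

Swapping `geom_path()` and `geom_point()` in the code swaps the order of this vector, and with it which element ends up in front.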
  26. Saving ggplot images

        library(tidyverse)

        dat <- data.frame(tag = rep(c("a", "b"), each = 2),
                          X = c(1, 3, 5, 7),
                          Y = c(3, 9, 4, 2))

        g <- ggplot(data = dat) +
          aes(x = X, y = Y) +
          geom_path() +
          geom_point(mapping = aes(color = tag))

        ggsave(filename = "fig/demo01.png", plot = g,
               width = 4, height = 3, dpi = 150)
  27. Saving ggplot images (same code as slide 26)
    By default, width and height are given in inches.

        ggsave(filename = "fig/demo01.png", plot = g,
               width = 4, height = 3, dpi = 150)
  28. Saving ggplot images (same plot object g as slide 26), with explicit units

        ggsave(filename = "fig/demo01.png", plot = g,
               width = 10, height = 7.5, dpi = 150,
               units = "cm")   # "cm", "mm", or "in" can be specified
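How the size, unit, and dpi settings relate to each other can be sketched with plain arithmetic: a length in centimetres is converted to inches (1 inch = 2.54 cm) and multiplied by the dots per inch. The helper `cm_to_px()` is hypothetical and written only for this example; it is not part of ggplot2, and how ggsave() rounds to whole pixels is not covered here.

```r
# Hypothetical helper: approximate pixel length for a size given in cm.
cm_to_px <- function(cm, dpi) {
  inches <- cm / 2.54   # 1 inch is 2.54 cm
  inches * dpi
}

# The 10 cm x 7.5 cm figure at 150 dpi
cm_to_px(cm = 10,  dpi = 150)
cm_to_px(cm = 7.5, dpi = 150)

# For comparison, the 4 in x 3 in figure at 150 dpi
4 * 150
3 * 150
```

The two settings give images of nearly the same pixel size, so the choice of unit is mostly a matter of convenience.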
  29. Plotting multiple series

        > head(anscombe)
          x1 x2 x3 x4   y1   y2    y3   y4
        1 10 10 10  8 8.04 9.14  7.46 6.58
        2  8  8  8  8 6.95 8.14  6.77 5.76
        3 13 13 13  8 7.58 8.74 12.74 7.71
        4  9  9  9  8 8.81 8.77  7.11 8.84
        5 11 11 11  8 8.33 9.26  7.81 8.47
        6 14 14 14  8 9.96 8.10  8.84 7.04

    Pushing through with only what has been covered so far gives this:

        ggplot(data = anscombe) +
          geom_point(aes(x = x1, y = y1)) +
          geom_point(aes(x = x2, y = y2), color = "Red") +
          geom_point(aes(x = x3, y = y3), color = "Blue") +
          geom_point(aes(x = x4, y = y4), color = "Green")
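  30. Data visualization with ggplot2
    Reality → mapping (observation) → raw data → reshaping → data in a form suited to visualization → mapping (data visualization) → graph
    [Figure: a mapping from X to Y, with the elements x_1, y_1, x_2, y_2, onto aesthetic channels.]
    Each aesthetic channel of the figure corresponds to one variable in the data.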
  31. Reshape the data so that its variables correspond to the aesthetic channels (x axis, y axis, color)

        > head(anscombe)
          x1 x2 x3 x4   y1   y2    y3   y4
        1 10 10 10  8 8.04 9.14  7.46 6.58
        2  8  8  8  8 6.95 8.14  6.77 5.76
        3 13 13 13  8 7.58 8.74 12.74 7.71
        4  9  9  9  8 8.81 8.77  7.11 8.84
        5 11 11 11  8 8.33 9.26  7.81 8.47
        6 14 14 14  8 9.96 8.10  8.84 7.04

        > head(anscombe_long)
          key  x    y
        1   1 10 8.04
        2   2 10 9.14
        3   3 10 7.46
        4   4  8 6.58
        5   1  8 6.95
        6   2  8 8.14

    The visualization then becomes simple and easy to follow:

        ggplot(data = anscombe_long) +
          aes(x = x, y = y, color = key) +
          geom_point()
  32. Reshape the data so that its variables correspond to the aesthetic channels (x axis, y axis, color)
    Wide data: anscombe (columns x1 to x4 and y1 to y4, as printed on slide 31).
    Long data: anscombe_long (columns key, x, y, as printed on slide 31).

        anscombe_long <-
          pivot_longer(data = anscombe,
                       cols = everything(),
                       names_to = c(".value", "key"),
                       names_pattern = "(.)(.)")
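To make the wide-to-long step less of a black box, the same reshaping can be written by hand in base R: for each series k, take columns xk and yk and stack them with a key column. This is an illustrative sketch, not the method used in the slides; the name `anscombe_long_base` is introduced here, and the rows come out grouped by key rather than in the interleaved order that `pivot_longer()` produces.

```r
# `anscombe` ships with R in the datasets package.
anscombe_long_base <- do.call(rbind, lapply(1:4, function(k) {
  data.frame(key = as.character(k),
             x = anscombe[[paste0("x", k)]],
             y = anscombe[[paste0("y", k)]])
}))

head(anscombe_long_base)
nrow(anscombe_long_base)
```

Each of the 11 rows of the wide table contributes one row per series, so the long table has 4 × 11 = 44 rows and one column per aesthetic channel.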
  33. Splitting the figure by level

        ggplot(data = anscombe_long) +
          aes(x = x, y = y, color = key) +
          geom_point()

        ggplot(data = anscombe_long) +
          aes(x = x, y = y, color = key) +
          geom_point() +
          facet_wrap(facets = . ~ key, nrow = 1)
  34. Handling table data in R (recap; repeats slide 15)
  35. Creating a data.frame (recap; repeats slide 16)
  36. Handling table data in R (recap; repeats slide 19)
  37. Basics of plotting with ggplot2

        library(tidyverse)

        dat <- data.frame(tag = rep(c("a", "b"), each = 2),
                          X = c(1, 3, 5, 7),
                          Y = c(3, 9, 4, 2))

        g <- ggplot(data = dat) +
          aes(x = X, y = Y) +
          geom_path() +
          geom_point(mapping = aes(color = tag))

        ggsave(filename = "fig/demo01.png", plot = g,
               width = 4, height = 3, dpi = 150)
  38. Plotting with ggplot2 (reshaping the data)

        library(tidyverse)

        dat <- anscombe

        anscombe_long <-
          pivot_longer(data = dat,
                       cols = everything(),
                       names_to = c(".value", "key"),
                       names_pattern = "(.)(.)")

        ggplot(data = anscombe_long) +
          aes(x = x, y = y, color = key) +
          geom_point() +
          facet_wrap(facets = . ~ key, nrow = 1)
  39. Data visualization with ggplot2 (recap; repeats slide 30)