
Tokyo.R#93 Data processing

Slides from my talk at the 93rd Tokyo.R meetup.

kilometer

July 03, 2021

Transcript

  1. #93 @kilometer00 2021.07.03 BeginneR Session -- Data processing --

  2. Who!? Who?

  3. Who!?

    ・@kilometer
    ・Postdoc Researcher (Ph.D. Eng.)
    ・Neuroscience
    ・Computational Behavior
    ・Functional brain imaging
    ・R: ~ 10 years
  4. Plug!! (I took part in translating a book.) Now on sale!

  5. BeginneR Session

  6. BeginneR

  7. Before / After: BeginneR Session, BeginneR, BeginneR

  8. BeginneR, Advanced, Hoxo_m

    "If I have seen further it is by standing on the shoulders of Giants." -- Sir Isaac Newton, 1676
  9. Import, Tidy, Transform, Visualise, Model, Communicate (modified from "R for Data Science", H. Wickham, 2017)

  10. Import, Tidy, Transform, Visualise, Model, Communicate (modified from "R for Data Science", H. Wickham, 2017)

    Labels: preprocessing, Data processing, Data science
  11. Import, Tidy, Transform, Visualise, Model, Communicate (modified from "R for Data Science", H. Wickham, 2017)

    Labels: preprocessing, Data science, Data, Observation, Hypothesis, feedback, Data processing, Narrative of data
  12. Import, Tidy, Transform, Visualise, Model, Communicate (modified from "R for Data Science", H. Wickham, 2017)

    Labels: preprocessing, Data science, Data, Observation, Hypothesis, Narrative of data, feedback, Data processing
  13. Import, Tidy, Transform, Visualise, Model, Communicate (modified from "R for Data Science", H. Wickham, 2017)

    Label: Data processing
  14. Data processing

    read_csv(), write_csv(), Data, Wide form, Long form, pivot_longer(), pivot_wider(), Nested form, group_nest(), unnest(), Figures, {ggplot2}, {patchwork}
  15. Data processing

    read_csv(), write_csv(), Data, Wide form, Long form, pivot_longer(), pivot_wider(), Nested form, group_nest(), unnest(), Figures, {ggplot2}, {patchwork}, data.frame, tibble
  16. data.frame

  17. vector in Excel

  18. vector: in R, in Excel

    pre <- c(1, 2, 3, 4, 5)
    post <- pre * 5
    > pre
    [1] 1 2 3 4 5
    > post
    [1]  5 10 15 20 25
  19. vector

    vec1 <- c(1, 2, 3, 4, 5)
    vec2 <- 1:5
    vec3 <- seq(from = 1, to = 5, by = 1)
    > vec1
    [1] 1 2 3 4 5
    > vec2
    [1] 1 2 3 4 5
    > vec3
    [1] 1 2 3 4 5
  20. vector

    vec1 <- seq(from = 1, to = 5, by = 1)
    vec2 <- seq(1, 5, 1)
    > vec1
    [1] 1 2 3 4 5
    > vec2
    [1] 1 2 3 4 5
  21. vector

    > ?seq
    seq {base}
    Sequence Generation
    Description
    Generate regular sequences. seq is a standard generic with a default method. …
    Usage
    seq(...)
    ## Default S3 method:
    seq(from = 1, to = 1, by = ((to - from)/(length.out - 1)),
        length.out = NULL, along.with = NULL, ...)
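The help page above shows that `by` and `length.out` are alternative ways to pin down a sequence. A minimal sketch of that idea, using base R only (the variable names `by_step`, `by_length`, and `by_index` are illustrative and not from the slides):

```r
# Three ways to specify a regular sequence in base R.
by_step   <- seq(from = 1, to = 5, by = 2)          # fix the step size
by_length <- seq(from = 0, to = 1, length.out = 5)  # fix the number of elements
by_index  <- seq_along(c("a", "b", "c"))            # one index per element

print(by_step)    # 1 3 5
print(by_length)  # 0.00 0.25 0.50 0.75 1.00
print(by_index)   # 1 2 3
```

When `by` is omitted and `length.out` is given, the step is derived from `(to - from)/(length.out - 1)`, which is the default shown in the Usage section.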
  22. vector

    vec1 <- rep(1:3, times = 2)
    vec2 <- rep(1:3, each = 2)
    vec3 <- rep(1:3, times = 2, each = 2)
    > vec1
    [1] 1 2 3 1 2 3
    > vec2
    [1] 1 1 2 2 3 3
    > vec3
     [1] 1 1 2 2 3 3 1 1 2 2 3 3
  23. vector

    vec1 <- 11:15
    > vec1
    [1] 11 12 13 14 15
    > vec1[1]
    [1] 11
    > vec1[3:5]
    [1] 13 14 15
    > vec1[c(1:2, 5)]
    [1] 11 12 15
  24. list

    list1 <- list(1:6, 11:15, c("a", "b", "c"))
    > list1
    [[1]]
    [1] 1 2 3 4 5 6

    [[2]]
    [1] 11 12 13 14 15

    [[3]]
    [1] "a" "b" "c"
  25. list

    list1 <- list(1:6, 11:15, c("a", "b", "c"))
    > list1[[1]]
    [1] 1 2 3 4 5 6
    > list1[[3]][2:3]
    [1] "b" "c"
    > list1[[2]] * 3
    [1] 33 36 39 42 45
  26. named list

    list2 <- list(A = 1:6, B = 11:15, C = c("a", "b", "c"))
    > list2
    $A
    [1] 1 2 3 4 5 6

    $B
    [1] 11 12 13 14 15

    $C
    [1] "a" "b" "c"
  27. named list

    list2 <- list(A = 1:6, B = 11:15, C = c("a", "b", "c"))
    > list2$A
    [1] 1 2 3 4 5 6
    > list2$C[2:3]
    [1] "b" "c"
    > list2$B * 3
    [1] 33 36 39 42 45
  28. list and named list

    list1 <- list(1:6, 11:15, c("a", "b", "c"))
    > class(list1)
    [1] "list"
    > names(list1)
    NULL
    list2 <- list(A = 1:6, B = 11:15, C = c("a", "b", "c"))
    > class(list2)
    [1] "list"
    > names(list2)
    [1] "A" "B" "C"
  29. named list & data.frame

    list3 <- list(A = 1:3, B = 11:13)
    > class(list3)
    [1] "list"
    > names(list3)
    [1] "A" "B"
    df1 <- data.frame(A = 1:3, B = 11:13)
    > class(df1)
    [1] "data.frame"
    > names(df1)
    [1] "A" "B"
  30. named list & data.frame

    list3 <- list(A = 1:3, B = 11:13)
    df1 <- data.frame(A = 1:3, B = 11:13)
    > str(list3)
    List of 2
     $ A: int [1:3] 1 2 3
     $ B: int [1:3] 11 12 13
    > str(df1)
    'data.frame': 3 obs. of 2 variables:
     $ A: int 1 2 3
     $ B: int 11 12 13
  31. named list & data.frame

    > list3
    $A
    [1] 1 2 3

    $B
    [1] 11 12 13

    > df1
      A  B
    1 1 11
    2 2 12
    3 3 13
  32. named list & data.frame

    > list3
    $A
    [1] 1 2 3

    $B
    [1] 11 12 13

    > df1
      A  B
    1 1 11
    2 2 12
    3 3 13
    Labels on the data.frame: observation, variable
  33. data.frame vs. matrix

    df1 <- data.frame(A = 1:3, B = 11:13)
    > df1
      A  B
    1 1 11
    2 2 12
    3 3 13
    > str(df1)
    'data.frame': 3 obs. of 2 variables:
     $ A: int 1 2 3
     $ B: int 11 12 13
    mat1 <- matrix(c(1:3, 11:13), 3, 2)
    > mat1
         [,1] [,2]
    [1,]    1   11
    [2,]    2   12
    [3,]    3   13
    > str(mat1)
     int [1:3, 1:2] 1 2 3 11 12 13
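The comparison above suggests that a data.frame behaves like a named list of columns, while a matrix is one vector with dimensions. A small base R sketch of the practical consequence (`df2` and `mat2` are illustrative names, not from the slides):

```r
# A data.frame is a list of equal-length columns; a matrix is a single
# vector with a dim attribute.
df1  <- data.frame(A = 1:3, B = 11:13)
mat1 <- matrix(c(1:3, 11:13), nrow = 3, ncol = 2)

is.list(df1)    # TRUE
is.list(mat1)   # FALSE

# Each data.frame column keeps its own type ...
df2 <- data.frame(A = 1:3, label = c("a", "b", "c"), stringsAsFactors = FALSE)
sapply(df2, class)

# ... whereas a matrix coerces all values to one common type.
mat2 <- cbind(1:3, c("a", "b", "c"))
typeof(mat2)    # "character"

# The list accessors from the named-list slides also work on a data.frame.
df1$B * 3
df1[["A"]]
```

This is why a data.frame suits "one row per observation, one column per variable" data, where variables often differ in type.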
  34. Data processing

    read_csv(), write_csv(), Data, Wide form, Long form, pivot_longer(), pivot_wider(), Nested form, group_nest(), unnest(), Figures, {ggplot2}, {patchwork}, data.frame, tibble
  35. Data processing

    read_csv(), write_csv(), Data, Wide form, Long form, pivot_longer(), pivot_wider(), Nested form, group_nest(), unnest(), Figures, {ggplot2}, {patchwork}, data.frame, tibble, Transform (verb functions) {dplyr}
  36. vignette("dplyr")

  37. verbs {dplyr}

    "It (dplyr) provides simple 'verbs', functions that correspond to the most common data manipulation tasks, to help you translate your thoughts into code." -- Introduction to dplyr, https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html
  38. verbs {dplyr}

    dplyr provides "verbs" for translating your thoughts into code: (a set of) functions that let you carry out the very basics of data manipulation simply. -- Introduction to dplyr, https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html ※ a rather loose translation
  39. Data.frame manipulation

    1. mutate() 2. filter() 3. select() 4. group_by() 5. summarize() 6. left_join() 7. arrange()
  40. Data.frame manipulation

    0. %>% 1. mutate() 2. filter() 3. select() 4. group_by() 5. summarize() 6. left_join() 7. arrange()
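  41. Pipe algebra: %>% {magrittr}

    X %>% f          is equivalent to  f(X)
    X %>% f(y)       is equivalent to  f(X, y)
    X %>% f %>% g    is equivalent to  g(f(X))
    X %>% f(y, .)    is equivalent to  f(y, X)
    "Re-introduction to dplyr (basics)" (「dplyr再入門(基本編)」), yutannihilation, https://speakerdeck.com/yutannihilation/dplyrzai-ru-men-ji-ben-bian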
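The four equivalences on the pipe-algebra slide can be checked directly. The sketch below assumes the {magrittr} package is installed; `X`, `y`, `f`, and `g` are hypothetical stand-ins chosen so that argument order matters:

```r
library(magrittr)

# Hypothetical inputs and functions, chosen so that swapping arguments
# changes the result.
X <- c(1, 4, 9)
y <- 2
f <- function(a, b = 1) a - b
g <- function(a) sum(a)

checks <- c(
  identical(X %>% f,        f(X)),      # piped value becomes the first argument
  identical(X %>% f(y),     f(X, y)),   # extra arguments follow it
  identical(X %>% f %>% g,  g(f(X))),   # chains read left to right
  identical(X %>% f(y, .),  f(y, X))    # `.` places the piped value explicitly
)
print(checks)
```

All four comparisons should come out `TRUE`; the last one shows how the `.` placeholder overrides the default "first argument" rule.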
  42. Bring milk from the kitchen!

    ① lift ② take ③ pour ④ put
  43. Bring milk from the kitchen!

    ① lift: lift(Robot, glass, table) -> Robot'
    ② take: take(Robot', fridge, milk) -> Robot''
  44. Bring milk from the kitchen!

    Robot' <- lift(Robot, glass, table)      # ①
    Robot'' <- take(Robot', fridge, milk)    # ②
    Robot''' <- pour(Robot'', milk, glass)   # ③
    result <- put(Robot''', glass, table)    # ④

    by using pipe,

    result <- Robot %>%
      lift(glass, table) %>%    # ①
      take(fridge, milk) %>%    # ②
      pour(milk, glass) %>%     # ③
      put(glass, table)         # ④
  45. The tidyverse style guide, https://style.tidyverse.org/syntax.html#object-names

    "There are only two hard things in Computer Science: cache invalidation and naming things"
  46. Bring milk from the kitchen!

    Robot' <- lift(Robot, glass, table)      # ①
    Robot'' <- take(Robot', fridge, milk)    # ②
    Robot''' <- pour(Robot'', milk, glass)   # ③
    result <- put(Robot''', glass, table)    # ④

    by using pipe,

    result <- Robot %>%
      lift(glass, table) %>%    # ①
      take(fridge, milk) %>%    # ②
      pour(milk, glass) %>%     # ③
      put(glass, table)         # ④
  47. Bring milk from the kitchen! (Thinking, Reading)

    Robot' <- lift(Robot, glass, table)      # ①
    Robot'' <- take(Robot', fridge, milk)    # ②
    Robot''' <- pour(Robot'', milk, glass)   # ③
    result <- put(Robot''', glass, table)    # ④

    by using pipe,

    result <- Robot %>%
      lift(glass, table) %>%    # ①
      take(fridge, milk) %>%    # ②
      pour(milk, glass) %>%     # ③
      put(glass, table)         # ④
  48. Programming

    Write, Run, Read, Think
    Write, Run, Read, Think, Communicate, Share
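  49. Pipe algebra: %>% {magrittr}

    X %>% f          is equivalent to  f(X)
    X %>% f(y)       is equivalent to  f(X, y)
    X %>% f %>% g    is equivalent to  g(f(X))
    X %>% f(y, .)    is equivalent to  f(y, X)
    "Re-introduction to dplyr (basics)" (「dplyr再入門(基本編)」), yutannihilation, https://speakerdeck.com/yutannihilation/dplyrzai-ru-men-ji-ben-bian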
  50. Data.frame manipulation

    0. %>% ✔ 1. mutate() 2. filter() 3. select() 4. group_by() 5. summarize() 6. left_join() 7. arrange()
  51. verbs {dplyr}: mutate  # add a column

    mutate(dat, C = fun(A, B))
  52. verbs {dplyr}: mutate  # add a column

    dat %>% mutate(C = fun(A, B))
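A runnable version of the two `mutate()` forms above, as a sketch that assumes {dplyr} is installed. The toy `dat` and the body of `fun` are made up for illustration; the slides leave both unspecified:

```r
library(dplyr)

# Toy data and a hypothetical column-wise function (not from the slides).
dat <- data.frame(tag = 1:5, A = c(2, 4, 6, 8, 10), B = c(1, 1, 2, 2, 3))
fun <- function(a, b) a / b

res1 <- mutate(dat, C = fun(A, B))       # plain function call
res2 <- dat %>% mutate(C = fun(A, B))    # same call written with the pipe

identical(res1, res2)   # the two forms give the same data.frame
print(res2)
```

`mutate()` returns a new data.frame with column `C` appended; `dat` itself is left unchanged.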
  53. verbs {dplyr}: filter  # filter rows

    dat %>% filter(tag %in% c(1, 3, 5))
  54. Boolean operators (Boolean algebra)

    A == B    # equal to
    A != B    # not equal to
    A | B     # or
    A & B     # and
    A %in% B  # is A in B?
    George Boole, 1815 - 1864 (wikipedia)
  55. Boolean operators (Boolean algebra)

    "a" != "b"     # not equal to
    [1] TRUE
    1 %in% 10:100  # is A in B?
    [1] FALSE
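The Boolean operators above are what `filter()` consumes: each condition is a logical vector with one element per row. A short sketch, assuming {dplyr} is installed and using a made-up `dat` (column names `tag` and `B` follow the slides, the values are illustrative):

```r
library(dplyr)

# Toy data (values are illustrative, not from the slides).
dat <- data.frame(tag = 1:6, B = c("x", "x", "y", "y", "x", "y"))

# A condition is just a logical vector, one element per row.
dat$tag %in% c(1, 3, 5)

odd_tags   <- dat %>% filter(tag %in% c(1, 3, 5))             # is tag in the set?
odd_and_x  <- dat %>% filter(tag %in% c(1, 3, 5) & B == "x")  # and
small_or_y <- dat %>% filter(tag < 2 | B == "y")              # or
not_x      <- dat %>% filter(B != "x")                        # not equal to

print(odd_and_x)
```

Rows are kept where the combined condition is `TRUE`.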
  56. George Boole, 1815 - 1864, Mathematician & Philosopher

    A Class-Room Introduction to Logic, https://niyamaklogic.wordpress.com/category/laws-of-thoughts/
  57. verbs {dplyr}: select  # select columns

    dat %>% select(tag, B)
  58. verbs {dplyr}: select  # select columns

    dat %>% select("tag", "B")
  59. verbs {dplyr}: select  # select columns

    dat %>% select("tag", "B")
    dat %>% select(tag, B)
  60. verbs {dplyr}: select helper functions

    starts_with("s")
    ends_with("s")
    contains("se")
    matches("^.e")
    one_of(c("tag", "B"))
    everything()
    "Notes on using dplyr::select" (「dplyr::selectの活用例メモ」), kazutan, https://kazutan.github.io/blog/2017/04/dplyr-select-memo/
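The helpers listed above can be tried on a small table. This sketch assumes {dplyr} 1.0 or later is installed; the columns `score` and `sex` are invented so that each pattern matches something. It uses `any_of()` where the slide uses `one_of()`, since recent dplyr/tidyselect releases mark `one_of()` as superseded by `any_of()` and `all_of()`:

```r
library(dplyr)

# Toy data; `score` and `sex` are hypothetical columns added for illustration.
dat <- data.frame(tag = 1:3, score = c(10, 20, 30),
                  sex = c("f", "m", "f"), B = 11:13)

s_cols   <- dat %>% select(starts_with("s"))        # score, sex
g_cols   <- dat %>% select(ends_with("g"))          # tag
se_cols  <- dat %>% select(contains("se"))          # sex
re_cols  <- dat %>% select(matches("^.e"))          # regex: second letter is "e"
named    <- dat %>% select(any_of(c("tag", "B")))   # names given as strings
sex_1st  <- dat %>% select(sex, everything())       # move sex to the front

names(sex_1st)
```

`everything()` is handy for reordering: name the columns you want first, then let it pick up the rest.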
  61. Data.frame manipulation

    0. %>% ✔ 1. mutate() ✔ 2. filter() ✔ 3. select() ✔ 4. group_by() 5. summarize() 6. left_join() 7. arrange()
  62. Grammar of data manipulation

    "By constraining your options, it helps you think about your data manipulation challenges." -- Introduction to dplyr, https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html
  63. Grammar of data manipulation

    By limiting your options, you can think about the steps of data analysis simply. (a very loose translation) -- Introduction to dplyr, https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html ※ truly a loose translation
  64. Igor Stravinsky (Игорь Ф. Стравинский), 1882 - 1971

    "The more constraints one imposes, the more one frees one's self of the chains that shackle the spirit."
    By imposing more constraints, one becomes freer from the shackles of the soul. ※ a fairly loose translation
  65. Enjoy!!