Tokyo.R#93 Data processing

Slides from my talk at the 93rd Tokyo.R meetup.

kilometer

July 03, 2021

Transcript

  1. BeginneR Advanced Hoxo_m

    "If I have seen further it is by standing on the shoulders of Giants." -- Sir Isaac Newton, 1676
  2. [Diagram] Import, Tidy, Transform, Visualise, Model, Communicate (modified from “R for Data Science”, H. Wickham, 2017)

    Labels: preprocessing, Data processing, Data science
  3. [Diagram] Import, Tidy, Transform, Visualise, Model, Communicate (modified from “R for Data Science”, H. Wickham, 2017)

    Labels: preprocessing, Data science, Data, Observation, Hypothesis, feedback, Data processing, Narrative of data
  4. [Diagram] Import, Tidy, Transform, Visualise, Model, Communicate (modified from “R for Data Science”, H. Wickham, 2017)

    Labels: preprocessing, Data science, Data, Observation, Hypothesis, Narrative of data, feedback, Data processing
  5. [Diagram] Import, Tidy, Transform, Visualise, Model, Communicate (modified from “R for Data Science”, H. Wickham, 2017)

    Labels: Data processing
  6. [Diagram] Data processing

    Data: read_csv() / write_csv()
    Wide form and Long form: pivot_longer() / pivot_wider()
    Nested form: group_nest() / unnest()
    Figures: {ggplot2}, {patchwork}
  7. [Diagram] Data processing

    Data: read_csv() / write_csv()
    Wide form and Long form: pivot_longer() / pivot_wider()
    Nested form: group_nest() / unnest()
    Figures: {ggplot2}, {patchwork}
    Highlighted: data.frame, tibble
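The reshaping steps named in the diagram can be sketched as one round trip. This is only an illustration: the table `wide` and its columns `id`, `pre` and `post` are made up, and the import/export steps (read_csv(), write_csv()) are left out so that the example needs no files.

```r
library(dplyr)
library(tidyr)

# Made-up example data in wide form: one row per id, one column per time point.
wide <- tibble(id = 1:3, pre = c(10, 20, 30), post = c(15, 18, 33))

# Wide form -> long form: one row per (id, time) pair, 6 rows in total.
long <- wide %>%
  pivot_longer(cols = c(pre, post), names_to = "time", values_to = "value")

# Long form -> nested form: one row per time, the rest packed into a list-column `data`.
nested <- long %>% group_nest(time)

# Nested form -> long form, then long form -> wide form again.
long_again <- nested %>% unnest(data)
wide_again <- long_again %>%
  pivot_wider(names_from = time, values_from = value)
```

After the round trip, `wide_again` holds the same values as `wide`; only the column order may differ, because the columns are rebuilt from the order in which the `time` values appear.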
  8. vector (in R, in Excel)

    pre <- c(1, 2, 3, 4, 5)
    post <- pre * 5
    > pre
    [1] 1 2 3 4 5
    > post
    [1]  5 10 15 20 25
  9. vector

    vec1 <- c(1, 2, 3, 4, 5)
    vec2 <- 1:5
    vec3 <- seq(from = 1, to = 5, by = 1)
    > vec1
    [1] 1 2 3 4 5
    > vec2
    [1] 1 2 3 4 5
    > vec3
    [1] 1 2 3 4 5
  10. vector

    vec1 <- seq(from = 1, to = 5, by = 1)
    vec2 <- seq(1, 5, 1)
    > vec1
    [1] 1 2 3 4 5
    > vec2
    [1] 1 2 3 4 5
  11. vector

    > ?seq
    seq {base}
    Sequence Generation
    Description
    Generate regular sequences. seq is a standard generic with a default method. …
    Usage
    seq(...)
    ## Default S3 method:
    seq(from = 1, to = 1, by = ((to - from)/(length.out - 1)),
        length.out = NULL, along.with = NULL, ...)
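The usage line above lists two arguments that the earlier slides do not exercise, `length.out` and `along.with`. A minimal sketch of both follows; the vector `x` is a hypothetical example.

```r
# When from, to and length.out are given, by is derived from them:
# by = (1 - 0) / (5 - 1) = 0.25
vec_len <- seq(from = 0, to = 1, length.out = 5)
vec_len
# [1] 0.00 0.25 0.50 0.75 1.00

# along.with takes the length from another object (x is a made-up vector).
x <- c("a", "b", "c")
vec_along <- seq(along.with = x)
vec_along
# [1] 1 2 3
```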
  12. vector

    vec1 <- rep(1:3, times = 2)
    vec2 <- rep(1:3, each = 2)
    vec3 <- rep(1:3, times = 2, each = 2)
    > vec1
    [1] 1 2 3 1 2 3
    > vec2
    [1] 1 1 2 2 3 3
    > vec3
    [1] 1 1 2 2 3 3 1 1 2 2 3 3
  13. vector

    vec1 <- 11:15
    > vec1
    [1] 11 12 13 14 15
    > vec1[1]
    [1] 11
    > vec1[3:5]
    [1] 13 14 15
    > vec1[c(1:2, 5)]
    [1] 11 12 15
  14. list

    list1 <- list(1:6, 11:15, c("a", "b", "c"))
    > list1
    [[1]]
    [1] 1 2 3 4 5 6
    [[2]]
    [1] 11 12 13 14 15
    [[3]]
    [1] "a" "b" "c"
  15. list

    list1 <- list(1:6, 11:15, c("a", "b", "c"))
    > list1[[1]]
    [1] 1 2 3 4 5 6
    > list1[[3]][2:3]
    [1] "b" "c"
    > list1[[2]] * 3
    [1] 33 36 39 42 45
  16. named list

    list2 <- list(A = 1:6, B = 11:15, C = c("a", "b", "c"))
    > list2
    $A
    [1] 1 2 3 4 5 6
    $B
    [1] 11 12 13 14 15
    $C
    [1] "a" "b" "c"
  17. named list

    list2 <- list(A = 1:6, B = 11:15, C = c("a", "b", "c"))
    > list2$A
    [1] 1 2 3 4 5 6
    > list2$C[2:3]
    [1] "b" "c"
    > list2$B * 3
    [1] 33 36 39 42 45
  18. list and named list

    list1 <- list(1:6, 11:15, c("a", "b", "c"))
    > class(list1)
    [1] "list"
    > names(list1)
    NULL

    list2 <- list(A = 1:6, B = 11:15, C = c("a", "b", "c"))
    > class(list2)
    [1] "list"
    > names(list2)
    [1] "A" "B" "C"
  19. named list & data.frame

    list3 <- list(A = 1:3, B = 11:13)
    > class(list3)
    [1] "list"
    > names(list3)
    [1] "A" "B"

    df1 <- data.frame(A = 1:3, B = 11:13)
    > class(df1)
    [1] "data.frame"
    > names(df1)
    [1] "A" "B"
  20. named list & data.frame

    list3 <- list(A = 1:3, B = 11:13)
    df1 <- data.frame(A = 1:3, B = 11:13)
    > str(list3)
    List of 2
     $ A: int [1:3] 1 2 3
     $ B: int [1:3] 11 12 13
    > str(df1)
    'data.frame': 3 obs. of 2 variables:
     $ A: int 1 2 3
     $ B: int 11 12 13
  21. named list & data.frame

    > list3
    $A
    [1] 1 2 3
    $B
    [1] 11 12 13
    > df1
      A  B
    1 1 11
    2 2 12
    3 3 13
  22. named list & data.frame

    > list3
    $A
    [1] 1 2 3
    $B
    [1] 11 12 13
    > df1
      A  B
    1 1 11
    2 2 12
    3 3 13
    Annotations: observation (row), variable (column)
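The slides above put a named list and a data.frame side by side. A short check of how closely the two are related, reusing `list3` and `df1` from the slides (`df_from_list` is a name introduced here for illustration):

```r
list3 <- list(A = 1:3, B = 11:13)
df1 <- data.frame(A = 1:3, B = 11:13)

# A data.frame is stored as a list of columns with a class attribute on top.
typeof(df1)   # "list"
is.list(df1)  # TRUE

# List-style access therefore works on a data.frame too.
df1$A         # same values as list3$A
df1[["B"]]    # same values as list3[["B"]]

# A named list of equal-length vectors converts directly into a data.frame.
df_from_list <- as.data.frame(list3)
all.equal(df_from_list, df1)
```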
  23. data.frame vs. matrix

    df1 <- data.frame(A = 1:3, B = 11:13)
    > df1
      A  B
    1 1 11
    2 2 12
    3 3 13
    > str(df1)
    'data.frame': 3 obs. of 2 variables:
     $ A: int 1 2 3
     $ B: int 11 12 13

    mat1 <- matrix(c(1:3, 11:13), 3, 2)
    > mat1
         [,1] [,2]
    [1,]    1   11
    [2,]    2   12
    [3,]    3   13
    > str(mat1)
     int [1:3, 1:2] 1 2 3 11 12 13
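One practical difference behind the comparison above is that a matrix holds a single type, while each data.frame column keeps its own. A minimal sketch, with `mat2` and `df2` as made-up objects:

```r
# A matrix holds one type, so mixing integers and strings coerces everything to character.
mat2 <- cbind(A = 1:3, C = c("a", "b", "c"))
typeof(mat2)    # "character"

# A data.frame keeps the type of each column.
df2 <- data.frame(A = 1:3, C = c("a", "b", "c"))
class(df2$A)    # "integer"

# Both are 3 rows by 2 columns.
dim(mat2)
dim(df2)
```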
  24. [Diagram] Data processing

    Data: read_csv() / write_csv()
    Wide form and Long form: pivot_longer() / pivot_wider()
    Nested form: group_nest() / unnest()
    Figures: {ggplot2}, {patchwork}
    Highlighted: data.frame, tibble
  25. [Diagram] Data processing

    Data: read_csv() / write_csv()
    Wide form and Long form: pivot_longer() / pivot_wider()
    Nested form: group_nest() / unnest()
    Figures: {ggplot2}, {patchwork}
    Highlighted: data.frame, tibble
    Transform (verb functions): {dplyr}
  26. Verbs {dplyr}

    "It (dplyr) provides simple “verbs” to help you translate your thoughts into code."
    Verbs: functions that correspond to the most common data manipulation tasks.
    Source: Introduction to dplyr, https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html
  27. Data.frame manipulation

    1. mutate() 2. filter() 3. select() 4. group_by() 5. summarize() 6. left_join() 7. arrange()
  28. Data.frame manipulation

    0. %>% 1. mutate() 2. filter() 3. select() 4. group_by() 5. summarize() 6. left_join() 7. arrange()
  29. Pipe algebra: %>% {magrittr}

    X %>% f          is   f(X)
    X %>% f(y)       is   f(X, y)
    X %>% f %>% g    is   g(f(X))
    X %>% f(y, .)    is   f(y, X)
    Source: "Re-introduction to dplyr (basics)" by yutannihilation, https://speakerdeck.com/yutannihilation/dplyrzai-ru-men-ji-ben-bian
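The four identities can be checked directly. In this sketch `f`, `g`, `X` and `y` are hypothetical stand-ins; `f` is deliberately not symmetric in its arguments so that argument order shows up in the result.

```r
library(magrittr)

# Hypothetical helpers, defined only to make the identities checkable.
f <- function(a, b = 1) a - b
g <- function(a) a * 10

X <- 5
y <- 2

id1 <- identical(X %>% f,       f(X))     # X becomes the first argument
id2 <- identical(X %>% f(y),    f(X, y))  # X is inserted before y
id3 <- identical(X %>% f %>% g, g(f(X)))  # pipes chain left to right
id4 <- identical(X %>% f(y, .), f(y, X))  # the dot marks where X goes

c(id1, id2, id3, id4)
# [1] TRUE TRUE TRUE TRUE
```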
  30. Bring milk from the kitchen!

    ① lift: lift(Robot, glass, table) -> Robot'
    ② take: take(Robot', fridge, milk) -> Robot''
  31. Bring milk from the kitchen!

    Robot' <- lift(Robot, glass, table)       # ①
    Robot'' <- take(Robot', fridge, milk)     # ②
    Robot''' <- pour(Robot'', milk, glass)    # ③
    result <- put(Robot''', glass, table)     # ④

    by using pipe,

    result <- Robot %>%
      lift(glass, table) %>%    # ①
      take(fridge, milk) %>%    # ②
      pour(milk, glass) %>%     # ③
      put(glass, table)         # ④
  32. The tidyverse style guide

    https://style.tidyverse.org/syntax.html#object-names
    "There are only two hard things in Computer Science: cache invalidation and naming things"
  33. Bring milk from the kitchen!

    Robot' <- lift(Robot, glass, table)       # ①
    Robot'' <- take(Robot', fridge, milk)     # ②
    Robot''' <- pour(Robot'', milk, glass)    # ③
    result <- put(Robot''', glass, table)     # ④

    by using pipe,

    result <- Robot %>%
      lift(glass, table) %>%    # ①
      take(fridge, milk) %>%    # ②
      pour(milk, glass) %>%     # ③
      put(glass, table)         # ④
  34. Bring milk from the kitchen!

    Robot' <- lift(Robot, glass, table)       # ①
    Robot'' <- take(Robot', fridge, milk)     # ②
    Robot''' <- pour(Robot'', milk, glass)    # ③
    result <- put(Robot''', glass, table)     # ④

    by using pipe,

    result <- Robot %>%
      lift(glass, table) %>%    # ①
      take(fridge, milk) %>%    # ②
      pour(milk, glass) %>%     # ③
      put(glass, table)         # ④

    Labels: Thinking, Reading
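The robot code on the slides is pseudocode (names such as Robot' are not valid R identifiers). A runnable sketch of the same comparison follows, under stated assumptions: `lift`, `take`, `pour` and `put` are toy stand-ins built by a hypothetical helper `act()`, and a "robot" is just a list carrying a log of what it has done.

```r
library(magrittr)

# Hypothetical stand-ins for the slide's functions: each one appends a
# description of its step to the robot's log and returns the updated robot.
act <- function(verb) {
  function(robot, a, b) {
    robot$log <- c(robot$log, paste(verb, a, b))
    robot
  }
}
lift <- act("lift")
take <- act("take")
pour <- act("pour")
put  <- act("put")

robot <- list(log = character())

# Step by step: every intermediate state needs a name.
robot1 <- lift(robot, "glass", "table")           # ①
robot2 <- take(robot1, "fridge", "milk")          # ②
robot3 <- pour(robot2, "milk", "glass")           # ③
result_steps <- put(robot3, "glass", "table")     # ④

# With the pipe: the same four steps, no intermediate names.
result_pipe <- robot %>%
  lift("glass", "table") %>%    # ①
  take("fridge", "milk") %>%    # ②
  pour("milk", "glass") %>%     # ③
  put("glass", "table")         # ④

identical(result_steps, result_pipe)
# [1] TRUE
```

Both versions produce the same result; the piped one avoids having to invent names for the three intermediate robots.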
  35. Pipe algebra: %>% {magrittr}

    X %>% f          is   f(X)
    X %>% f(y)       is   f(X, y)
    X %>% f %>% g    is   g(f(X))
    X %>% f(y, .)    is   f(y, X)
    Source: "Re-introduction to dplyr (basics)" by yutannihilation, https://speakerdeck.com/yutannihilation/dplyrzai-ru-men-ji-ben-bian
  36. Data.frame manipulation

    0. %>% ✔ 1. mutate() 2. filter() 3. select() 4. group_by() 5. summarize() 6. left_join() 7. arrange()
  37. Boolean operators (Boolean Algebra)

    A == B      # equal to
    A != B      # not equal to
    A | B       # or
    A & B       # and
    A %in% B    # is A in B?
    George Boole, 1815 - 1864 (wikipedia)
  38. Boolean operators (Boolean Algebra)

    "a" != "b"       # not equal to
    [1] TRUE
    1 %in% 10:100    # is A in B?
    [1] FALSE
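These operators work element by element on vectors, which is what makes them useful for picking rows. A small sketch with made-up vectors `name` and `score`:

```r
# Made-up data for illustration.
name  <- c("a", "b", "c", "d")
score <- c(10, 55, 80, 30)

name != "b"
# [1]  TRUE FALSE  TRUE  TRUE
name %in% c("a", "d")
# [1]  TRUE FALSE FALSE  TRUE

# & and | combine two logical vectors element by element.
keep <- score > 20 & name %in% c("a", "b", "d")
keep
# [1] FALSE  TRUE FALSE  TRUE

# A logical vector can be used to subset.
name[keep]
# [1] "b" "d"
```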
  39. George Boole, 1815 - 1864, Mathematician & Philosopher

    A Class-Room Introduction to Logic
    https://niyamaklogic.wordpress.com/category/laws-of-thoughts/
  40. Verbs {dplyr}

    # Select helper functions
    starts_with("s")
    ends_with("s")
    contains("se")
    matches("^.e")
    one_of(c("tag", "B"))
    everything()
    Source: "Notes on practical uses of dplyr::select" by kazutan, https://kazutan.github.io/blog/2017/04/dplyr-select-memo/
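A sketch of the helpers in use. The slide does not name a dataset, so `iris` is an assumption, chosen because its column names fit the patterns. `one_of()` still works but has been superseded by `any_of()` and `all_of()`, so the sketch uses `any_of()`.

```r
library(dplyr)

# iris is an assumption: the slide does not say which data the helpers were run on.
names(iris)
# [1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width"  "Species"

# Helper matching ignores case by default.
s_start <- iris %>% select(starts_with("s")) %>% names()  # Sepal.Length, Sepal.Width, Species
s_end   <- iris %>% select(ends_with("s")) %>% names()    # Species
has_se  <- iris %>% select(contains("se")) %>% names()    # Sepal.Length, Sepal.Width
e_2nd   <- iris %>% select(matches("^.e")) %>% names()    # every column except Species

# any_of() skips names that are absent ("tag" is not a column of iris).
present <- iris %>% select(any_of(c("tag", "Species"))) %>% names()  # Species

# everything() selects all columns; here it moves Species to the front.
reordered <- iris %>% select(Species, everything()) %>% names()
```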
  41. Data.frame manipulation

    0. %>% 1. mutate() 2. filter() 3. select() 4. group_by() 5. summarize() 6. left_join() 7. arrange()
    ✔ ✔ ✔ ✔
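For orientation, the seven verbs and the pipe can be combined in a single chain. This is a sketch on made-up tables `scores` and `labels`, not an example taken from the talk.

```r
library(dplyr)

# Made-up example tables.
scores <- tibble(
  id    = 1:6,
  group = c("a", "a", "b", "b", "c", "c"),
  pre   = c(10, 20, 30, 40, 50, 60),
  post  = c(12, 25, 29, 44, 50, 66)
)
labels <- tibble(
  group = c("a", "b", "c"),
  label = c("control", "low", "high")
)

result <- scores %>%
  mutate(diff = post - pre) %>%            # 1. add a column
  filter(diff != 0) %>%                    # 2. keep rows that meet a condition
  select(group, diff) %>%                  # 3. keep only some columns
  group_by(group) %>%                      # 4. define groups
  summarize(mean_diff = mean(diff)) %>%    # 5. collapse to one row per group
  left_join(labels, by = "group") %>%      # 6. attach columns from another table
  arrange(desc(mean_diff))                 # 7. sort rows

result
```

The result has one row per group, ordered from the largest mean difference to the smallest, with the matching label attached.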
  42. Grammar of data manipulation

    "By constraining your options, it helps you think about your data manipulation challenges."
    Source: Introduction to dplyr, https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html
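  43. Igor Stravinsky (И́горь Ф. Страви́нский), 1882 - 1971

    "The more constraints one imposes, the more one frees one's self of the chains that shackle the spirit."
    Japanese rendering on the slide: "By imposing more constraints, one becomes freer from the shackles of the soul." (Note: a fairly loose translation.)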