
Rethinking Arrays in R

Davis Vaughan
April 18, 2019


Transcript

  1. dimensionality: The number of dimensions in an array.
    dimensions: The set of lengths describing the shape of the array.
  4. dimensionality VS dimensions
    1 3 5
    2 4 6    (2, 3)
    Number of 1st dim elements (rows): 2
    Number of 2nd dim elements (columns): 3
    The entire set makes up the dimensions.
    The dimensionality is 2 (2D object).
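
In base R the two terms map onto `dim()` and the length of its result. A minimal sketch, assuming the matrix pictured on the slide is `matrix(1:6, nrow = 2)`; the variable names are illustrative only:

```r
# The 2 x 3 matrix assumed to be the one drawn on the slide.
x <- matrix(1:6, nrow = 2)

# The dimensions: the set of lengths describing the shape.
dims <- dim(x)
print(dims)            # 2 3

# The dimensionality: how many dimensions there are.
dimensionality <- length(dims)
print(dimensionality)  # 2
```
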
  5. Oh! Let me fix that for you…
    x[, 1, drop = FALSE] keeps the first column of x as a 2 x 1 matrix:
    1
    2
  8. Let’s go 3D
    y is a 2 x 3 x 2 array: the first slice holds 1 to 6, the second slice holds 7 to 12.
    y[, 1:2]
    Error: incorrect number of dimensions
    http://gph.is/1kA5eNi
  9. Let’s go 3D
    y[, 1:2, ] keeps the first two columns of each slice:
    1 3      7  9
    2 4      8 10
  11. One column?
    y[, 1, ] returns a 2 x 2 matrix:
    1 7
    2 8
  14. Oh! Let me fix that for you…
    y[, 1, , drop = FALSE] keeps all three dimensions, giving a 2 x 1 x 2 array:
    1      7
    2      8
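
The subsetting behaviour walked through on these slides can be reproduced in base R. This is a small sketch that assumes `x` is `matrix(1:6, nrow = 2)` and `y` is `array(1:12, dim = c(2, 3, 2))`, which is how the pictured values read; the result variable names are illustrative:

```r
x <- matrix(1:6, nrow = 2)
y <- array(1:12, dim = c(2, 3, 2))

# 2D: one column silently drops to a plain vector unless drop = FALSE.
dropped_2d <- x[, 1]
kept_2d <- x[, 1, drop = FALSE]
print(dim(dropped_2d))  # NULL
print(dim(kept_2d))     # 2 1

# 3D: the 2D syntax is an error, so capture the message instead of stopping.
err <- tryCatch(y[, 1:2], error = function(e) conditionMessage(e))
print(err)

# 3D: an extra comma is required, and one column drops to a 2D matrix.
two_cols <- y[, 1:2, ]
one_col <- y[, 1, ]
one_col_kept <- y[, 1, , drop = FALSE]
print(dim(two_cols))      # 2 2 2
print(dim(one_col))       # 2 2
print(dim(one_col_kept))  # 2 1 2
```
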
  15. Summary: column selection
    How Many?  Drops?  2D                  3D                    Proposed
    1          No      x[, 1, drop = F]    x[, 1, , drop = F]    x[, 1]
    >1         No      x[, 1:2]            x[, 1:2, ]            x[, 1:2]
    1          Yes     x[, 1]              x[, 1, ]*             extract(x, , 1)
    >1         Yes     x[, 1:2, drop = T]  x[, 1:2, , drop = T]  extract(x, , 1:2)
    * Drops to 2D
  20. Create an rray
    library(rray)
    x <- matrix(1:6, nrow = 2)
    x
    #>      [,1] [,2] [,3]
    #> [1,]    1    3    5
    #> [2,]    2    4    6
    x_rray <- as_rray(x)
    x_rray
    #> <vctrs_rray<integer>[,3][6]>
    #>      [,1] [,2] [,3]
    #> [1,]    1    3    5
    #> [2,]    2    4    6
  21. Column subsetting…round two
    x_rray[, 1] keeps both dimensions, giving (2, 1):
    1
    2
    x_rray[, 1:2] gives (2, 2):
    1 3
    2 4
    y_rray[, 1] keeps all three dimensions, giving (2, 1, 2):
    1      7
    2      8
    y_rray[, 1:2] needs no trailing comma, giving (2, 2, 2):
    1 3      7  9
    2 4      8 10
  22. rray_extract() always drops to 1D
    rray_extract(y_rray, , 1)
    1 2 7 8
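
To make the proposed semantics concrete without depending on rray itself, the two behaviours can be imitated in base R. This is only a sketch: `keep_cols()` and `extract_cols()` are hypothetical helpers written for illustration, not rray functions, and `y` is assumed to be `array(1:12, dim = c(2, 3, 2))`.

```r
# Hypothetical helper (not part of rray): select columns and never drop
# dimensions, whatever the dimensionality of x.
keep_cols <- function(x, j) {
  n <- length(dim(x))
  idx <- rep(list(TRUE), n)  # TRUE selects everything along a dimension
  idx[[2]] <- j              # the 2nd dimension holds the columns
  do.call(`[`, c(list(x), idx, list(drop = FALSE)))
}

# Hypothetical helper (not part of rray): select columns and always
# return a plain 1D vector.
extract_cols <- function(x, j) {
  as.vector(keep_cols(x, j))
}

x <- matrix(1:6, nrow = 2)
y <- array(1:12, dim = c(2, 3, 2))

print(dim(keep_cols(x, 1)))    # 2 1
print(dim(keep_cols(y, 1)))    # 2 1 2
print(dim(keep_cols(y, 1:2)))  # 2 2 2
print(extract_cols(y, 1))      # 1 2 7 8
```

The same call works for 2D and 3D input, which is the point of the "Proposed" column: the caller no longer has to count commas or remember `drop = FALSE`.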
  24. Let’s do some math
    x: (2, 3)   y: (1)      (2, 3)
    1 3 5                   2 4 6
    2 4 6   +   1       =   3 5 7
  25. Step 1 - increase dimensionality
    x: (2, 3)    Dimensionality of 2
    y: (1)       Dimensionality of 1
    ---------
       (?, ?)
    Append 1’s to the dimensions of y until its dimensionality matches the dimensionality of x.
    x: (2, 3)
    y: (1, 1)
    ---------
       (?, ?)
  26. Step 1 - increase dimensionality
    x: (2, 3) + y: (1) becomes x: (2, 3) + y: (1, 1); the values are unchanged.
  27. Step 2 - recycle dimensions
    If the rows of y were recycled to length 2, it would match the length of the rows of x.
    x: (2, 3)        x: (2, 3)
    y: (1, 1)   ->   y: (2, 1)
    ---------        ---------
       (?, ?)           (2, ?)
  28. Step 2 - recycle dimensions
    y: (1, 1), the single value 1, becomes y: (2, 1), a column of two 1’s.
  29. Step 2 - recycle dimensions
    If the columns of y were recycled to length 3, it would match the length of the columns of x.
    x: (2, 3)        x: (2, 3)
    y: (2, 1)   ->   y: (2, 3)
    ---------        ---------
       (2, ?)           (2, 3)
  31. Step 2 - recycle dimensions
    y: (2, 1) becomes y: (2, 3), a matrix of six 1’s, and the sum is taken element by element:
    x: (2, 3)   y: (2, 3)   (2, 3)
    1 3 5       1 1 1       2 4 6
    2 4 6   +   1 1 1   =   3 5 7
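
The two steps can be written down as a small shape calculation. The sketch below is an illustration rather than rray's implementation: `common_dims()` is a hypothetical name, and the rule that only dimensions of length 1 may be recycled is an assumption drawn from the examples on these slides.

```r
# Hypothetical helper: compute the dimensions two arrays would share
# after the two steps described above.
common_dims <- function(x_dims, y_dims) {
  n <- max(length(x_dims), length(y_dims))

  # Step 1 - increase dimensionality: append 1's to the shorter set.
  x_dims <- c(x_dims, rep(1L, n - length(x_dims)))
  y_dims <- c(y_dims, rep(1L, n - length(y_dims)))

  # Step 2 - recycle dimensions: a length-1 dimension is recycled to match
  # the other side. Anything else that differs is treated as an error
  # (an assumption of this sketch).
  compatible <- x_dims == y_dims | x_dims == 1L | y_dims == 1L
  if (!all(compatible)) {
    stop("dimensions cannot be recycled to a common shape")
  }
  pmax(x_dims, y_dims)
}

print(common_dims(c(2L, 3L), 1L))              # 2 3
print(common_dims(c(1L, 3L, 2L), c(3L, 1L)))   # 3 3 2
```
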
  34. What if we started here?
    x: (2, 3) + y: (1, 1)
    Error: non-conformable arrays
    https://tenor.com/view/ronswanson-throw-computer-gif-9550833
  35. We know the result
    y: (1, 1) would be recycled to y: (2, 3), a matrix of six 1’s, so x + y is:
    2 4 6
    3 5 7    (2, 3)
  36. rray broadcasts
    library(rray)
    x <- matrix(1:6, nrow = 2)
    x
    #>      [,1] [,2] [,3]
    #> [1,]    1    3    5
    #> [2,]    2    4    6
    z <- matrix(1)
    z
    #>      [,1]
    #> [1,]    1
    x + z
    #> Error in x + z : non-conformable arrays
    as_rray(x) + z
    #> <vctrs_rray<double>[,3][6]>
    #>      [,1] [,2] [,3]
    #> [1,]    2    4    6
    #> [2,]    3    5    7
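
The base R half of this slide can be checked without rray installed. A small sketch, where capturing the error with `tryCatch()` and stripping the dimensions with `as.vector()` are illustrative choices rather than anything shown in the talk:

```r
x <- matrix(1:6, nrow = 2)
z <- matrix(1)

# A 1 x 1 matrix is not recycled by base R arithmetic.
msg <- tryCatch(x + z, error = function(e) conditionMessage(e))
print(msg)

# Stripping z's dimensions turns it back into a length-1 vector,
# which base R does recycle.
res <- x + as.vector(z)
print(res)
```

The workaround only succeeds because the dimension information is thrown away, which is the inconsistency the broadcasting rules are meant to remove.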
  38. Let’s go 3D
    x: (1, 3, 2), with slices 1 2 3 and 4 5 6
    y: (3, 1), the column 1 2 3
    x + y = ?
    Can you even do that!?
  39. Step 1 - increase dimensionality
    x: (1, 3, 2)    Dimensionality of 3
    y: (3, 1)       Dimensionality of 2
    ------------
       (?, ?, ?)
    Append 1’s to the dimensions of y until its dimensionality matches the dimensionality of x.
    x: (1, 3, 2)
    y: (3, 1, 1)
    ------------
       (?, ?, ?)
  40. Step 1 - increase dimensionality
    x: (1, 3, 2) + y: (3, 1, 1); the values are unchanged.
  41. Step 2 - recycle dimensions
    x: (1, 3, 2)                  x: (3, 3, 2)
    y: (3, 1, 1)   - recycle ->   y: (3, 3, 2)
    ------------                  ------------
       (?, ?, ?)                     (3, 3, 2)
  46. Step 2 - recycle dimensions
    x: (3, 3, 2), each slice repeating its single row three times:
    1 2 3      4 5 6
    1 2 3      4 5 6
    1 2 3      4 5 6
    y: (3, 3, 2), the column 1 2 3 repeated across every column and slice:
    1 1 1      1 1 1
    2 2 2      2 2 2
    3 3 3      3 3 3
    x + y: (3, 3, 2)
    2 3 4      5 6 7
    3 4 5      6 7 8
    4 5 6      7 8 9
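
The 3D example can be reproduced in base R by recycling both inputs by hand. This is a sketch under assumptions: `broadcast_to()` is a hypothetical helper (not rray's code), and the inputs are taken to be `array(1:6, dim = c(1, 3, 2))` and `matrix(1:3, ncol = 1)`, which is how the slide values read.

```r
# Hypothetical helper: recycle x to the target dimensions `dims`.
broadcast_to <- function(x, dims) {
  x_dims <- if (is.null(dim(x))) length(x) else dim(x)
  stopifnot(length(x_dims) <= length(dims))

  # Step 1 - increase dimensionality by appending 1's.
  x_dims <- c(x_dims, rep(1L, length(dims) - length(x_dims)))
  stopifnot(all(x_dims == dims | x_dims == 1L))
  x <- array(x, dim = x_dims)

  # Step 2 - recycle: repeat index 1 along every length-1 dimension
  # that has to grow, and keep the other dimensions as they are.
  idx <- lapply(seq_along(dims), function(k) {
    if (x_dims[k] == dims[k]) seq_len(dims[k]) else rep(1L, dims[k])
  })
  do.call(`[`, c(list(x), idx, list(drop = FALSE)))
}

x <- array(1:6, dim = c(1, 3, 2))
y <- matrix(1:3, ncol = 1)

out <- broadcast_to(x, c(3, 3, 2)) + broadcast_to(y, c(3, 3, 2))
print(dim(out))  # 3 3 2
print(out[, , 1])
print(out[, , 2])
```

Materialising the recycled arrays like this is wasteful for large inputs; it is used here only because it mirrors the pictures on the slides one step at a time.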
  47. rray as a toolkit
    rray_bind() rray_duplicate_any() rray_expand_dims() rray_broadcast() rray_flip() rray_max() rray_sum() rray_mean() rray_reshape() rray_rotate() rray_split() rray_tile() rray_unique() …
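  50. cbind( , ) on a 2 x 1 matrix (1, 2) and a 1 x 2 matrix (3, 4)
    Error: number of rows of matrices must match
  51. rbind( , ) on the same two matrices
    Error: number of columns of matrices must match
  52. rray_bind( , , axis = 2) on the same two matrices recycles the 1 x 2 matrix to 2 rows, then binds along the columns:
    1 3 4
    2 3 4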
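
The base R side of these three slides can be sketched as follows. The names `a` and `b` and their shapes (a 2 x 1 matrix and a 1 x 2 matrix) are assumptions inferred from the pictured values, and the manual recycling stands in for what the slide shows `rray_bind()` producing; it is not rray's implementation.

```r
a <- matrix(1:2, ncol = 1)  # assumed 2 x 1 input
b <- matrix(3:4, nrow = 1)  # assumed 1 x 2 input

# Neither base R binding function accepts the mismatched shapes.
cbind_msg <- tryCatch(cbind(a, b), error = function(e) conditionMessage(e))
rbind_msg <- tryCatch(rbind(a, b), error = function(e) conditionMessage(e))
print(cbind_msg)
print(rbind_msg)

# Recycling b's single row to match a's two rows makes cbind() succeed.
b_recycled <- b[rep(1, nrow(a)), , drop = FALSE]
bound <- cbind(a, b_recycled)
print(bound)
```
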
  54. What if we want to “normalize” by dividing by the max value? Along columns? Along rows?
    x:
    1 3 5
    2 4 6
  55. x / max(x)
    .167 .500 .833
    .333 .667 1
    sweep(x, 1, apply(x, 1, max), "/")
    .200 .600 1
    .333 .667 1
    sweep(x, 2, apply(x, 2, max), "/")
    .500 .750 .833
    1    1    1
  56. x / rray_max(x)
    .167 .500 .833
    .333 .667 1
    x / rray_max(x, axes = 2)
    .200 .600 1
    .333 .667 1
    x / rray_max(x, axes = 1)
    .500 .750 .833
    1    1    1
  58. x / rray_max(x, axes = 1)
    rray_max(x, axes = 1) has dimensions (1, 3) and holds 2 4 6; it is recycled to (2, 3) before dividing:
    1 3 5       2 4 6       .500 .750 .833
    2 4 6   /   2 4 6   =   1    1    1
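
The base R versions from the earlier slide, together with a by-hand version of the recycling pictured here, can be run as below. This is a sketch assuming `x` is `matrix(1:6, nrow = 2)`; the variable names are illustrative and no rray functions are called.

```r
x <- matrix(1:6, nrow = 2)

overall <- x / max(x)                          # divide by the single max
by_row <- sweep(x, 1, apply(x, 1, max), "/")   # divide each row by its max
by_col <- sweep(x, 2, apply(x, 2, max), "/")   # divide each column by its max

# The same column-wise result via explicit recycling: keep the maxima as a
# (1, 3) matrix, repeat its single row to reach (2, 3), then divide.
col_max <- matrix(apply(x, 2, max), nrow = 1)
recycled <- col_max[rep(1, nrow(x)), , drop = FALSE]
by_col_manual <- x / recycled

print(round(overall, 3))
print(round(by_row, 3))
print(round(by_col, 3))
```

Keeping the reduced dimension at length 1, instead of dropping it, is what lets a single division express all three cases once recycling is automatic.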