
# Rethinking Arrays in R

April 18, 2019

## Transcript

April 2019

intuition.
3. ### Arrays are:

   1. frustrating to work with.
   2. difficult to program around.
   3. underpowered.

5. ### dimensionality: The number of dimensions in an array. dimensions: The set of lengths describing the shape of the array.
8. ### dimensionality VS dimensions

   For the 2 x 3 matrix with rows (1, 3, 5) and (2, 4, 6), the dimensions are (2, 3):

   - 2 is the number of 1st dim elements (rows).
   - 3 is the number of 2nd dim elements (columns).
   - The entire set makes up the dimensions.
   - The dimensionality is 2 (2D object).
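
The two terms map onto base R roughly as follows. This is a minimal sketch using only `dim()` and `length()`; the variable names `dims` and `dimensionality` are chosen here for illustration.

```r
# The example matrix: values 1 to 6, filled column by column.
x <- matrix(1:6, nrow = 2)

# dimensions: the set of lengths describing the shape.
dims <- dim(x)

# dimensionality: how many dimensions there are.
dimensionality <- length(dim(x))

print(dims)            # [1] 2 3
print(dimensionality)  # [1] 2
```
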

1:2] 3 1 4 2

1] ?

1] 1 2
14. ### Oh! Let me fix that for you…

    x[, 1, drop = FALSE] keeps the result 2D: a single column holding 1 and 2.
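
The dropping behaviour can be checked directly in base R. A small sketch, assuming `x` is the 2 x 3 matrix of 1 to 6 used throughout the talk:

```r
x <- matrix(1:6, nrow = 2)

one_col  <- x[, 1]                # a single column: dimensions are dropped
kept     <- x[, 1, drop = FALSE]  # same column, dimensions kept
two_cols <- x[, 1:2]              # two columns: nothing to drop

dim(one_col)   # NULL, a plain vector
dim(kept)      # 2 1
dim(two_cols)  # 2 2
```

The result type depends on how many columns are selected, which is what makes code built on `[` hard to program around.
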
17. ### Let’s go 3D

    y is a 3D array with dimensions (2, 3, 2). The first slice holds 1 to 6 and the second holds 7 to 12.

    y[, 1:2] fails with Error: incorrect number of dimensions

    http://gph.is/1kA5eNi
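18. ### Let’s go 3D

    y[, 1:2, ] works once the third index is supplied. It returns the first two columns of each slice: rows (1, 3) and (2, 4) from the first slice, rows (7, 9) and (8, 10) from the second.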
20. ### One column?

    y[, 1, ] drops to a 2D matrix with rows (1, 7) and (2, 8).
23. ### Oh! Let me fix that for you…

    y[, 1, , drop = FALSE] keeps the result 3D: one column per slice, holding (1, 2) and (7, 8).
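
The 3D cases can be reproduced in base R. A sketch, assuming `y` is built with `array(1:12, dim = c(2, 3, 2))`, which matches the values shown on the slides:

```r
y <- array(1:12, dim = c(2, 3, 2))

# Supplying only two indices for a 3D array is an error, so capture the message.
err <- tryCatch(y[, 1:2], error = function(e) conditionMessage(e))

two_cols <- y[, 1:2, ]              # (2, 2, 2)
one_col  <- y[, 1, ]                # drops to 2D: (2, 2)
kept     <- y[, 1, , drop = FALSE]  # stays 3D: (2, 1, 2)

dim(two_cols)
dim(one_col)
dim(kept)
```

Every indexing expression has to know the dimensionality of `y` in advance, because the number of commas changes with it.
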

25. ### Summary: column selection

    | How Many? | Drops? | 2D | 3D | Proposed |
    |---|---|---|---|---|
    | 1 | No | x[, 1, drop = F] | x[, 1, , drop = F] | x[, 1] |
    | >1 | No | x[, 1:2] | x[, 1:2, ] | x[, 1:2] |
    | 1 | Yes | x[, 1] | x[, 1, ]* | extract(x, , 1) |
    | >1 | Yes | x[, 1:2, drop = T] | x[, 1:2, , drop = T] | extract(x, , 1:2) |

    \* Drops to 2D

32. ### Create an rray

        library(rray)

        x <- matrix(1:6, nrow = 2)
        x
        #>      [,1] [,2] [,3]
        #> [1,]    1    3    5
        #> [2,]    2    4    6

        x_rray <- as_rray(x)
        x_rray
        #> <vctrs_rray<integer>[,3]>
        #>      [,1] [,2] [,3]
        #> [1,]    1    3    5
        #> [2,]    2    4    6
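33. ### Column subsetting…round two

    - x_rray[, 1] returns a single column holding (1, 2), still 2D.
    - x_rray[, 1:2] returns the first two columns, with rows (1, 3) and (2, 4).
    - y_rray[, 1] returns the first column of each slice, (1, 2) and (7, 8), still 3D.
    - y_rray[, 1:2] returns the first two columns of each slice, with rows (1, 3), (2, 4) and (7, 9), (8, 10).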
34. ### rray_extract() always drops to 1D

    rray_extract(y_rray, , 1) returns the vector 1, 2, 7, 8.

38. ### Let’s do some math

    x: (2, 3) + y: (1) = (2, 3)

    Adding y = 1 to the matrix x with rows (1, 3, 5) and (2, 4, 6) gives rows (2, 4, 6) and (3, 5, 7).

40. ### Step 1 - increase dimensionality

    x has a dimensionality of 2 and y has a dimensionality of 1. Append 1’s to the dimensions of y until its dimensionality matches the dimensionality of x.

    - Before: x: (2, 3), y: (1), result: (?, ?)
    - After: x: (2, 3), y: (1, 1), result: (?, ?)
41. ### Step 1 - increase dimensionality

    x: (2, 3) + y: (1) becomes x: (2, 3) + y: (1, 1). y is still the single element 1.
42. ### Step 2 - recycle dimensions

    If the rows of y were recycled to length 2, y would have as many rows as x.

    - Before: x: (2, 3), y: (1, 1), result: (?, ?)
    - After: x: (2, 3), y: (2, 1), result: (2, ?)
43. ### Step 2 - recycle dimensions

    x: (2, 3) + y: (1, 1) becomes x: (2, 3) + y: (2, 1). y is now a column of two 1’s.
44. ### Step 2 - recycle dimensions

    If the columns of y were recycled to length 3, y would have as many columns as x.

    - Before: x: (2, 3), y: (2, 1), result: (2, ?)
    - After: x: (2, 3), y: (2, 3), result: (2, 3)
46. ### Step 2 - recycle dimensions

    x: (2, 3) + y: (2, 1) becomes x: (2, 3) + y: (2, 3). y is now a 2 x 3 matrix of 1’s, and the sum is the (2, 3) matrix with rows (2, 4, 6) and (3, 5, 7).
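
The two steps can be sketched as a small function that works on shapes only. `broadcast_dims()` is a hypothetical helper written for this illustration; it is not part of base R or rray, and the error text is made up.

```r
# Hypothetical helper: computes the common dimensions of two shapes
# using the two steps from the slides.
broadcast_dims <- function(x_dims, y_dims) {
  # Step 1: append 1's to the shorter shape until the dimensionalities match.
  n <- max(length(x_dims), length(y_dims))
  x_dims <- c(x_dims, rep(1L, n - length(x_dims)))
  y_dims <- c(y_dims, rep(1L, n - length(y_dims)))

  # Step 2: a dimension of length 1 can be recycled to the other length.
  # Any other mismatch cannot be broadcast.
  ok <- x_dims == y_dims | x_dims == 1L | y_dims == 1L
  if (!all(ok)) {
    stop("dimensions cannot be broadcast")
  }
  as.integer(pmax(x_dims, y_dims))
}

broadcast_dims(c(2L, 3L), 1L)             # 2 3
broadcast_dims(c(1L, 3L, 2L), c(3L, 1L))  # 3 3 2
```

The second call uses the 3D shapes that appear later in the talk. Note that both inputs may need recycling, not only the smaller one.
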
49. ### What if we started here?

    x: (2, 3) + y: (1, 1)

    Error: non-conformable arrays

    https://tenor.com/view/ronswanson-throw-computer-gif-9550833
50. ### R doesn’t broadcast. We just got lucky that it worked with scalars.
51. ### We know the result

    For x: (2, 3) + y: (1, 1), recycling y to a (2, 3) matrix of 1’s gives the (2, 3) matrix with rows (2, 4, 6) and (3, 5, 7).
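
In base R the recycling has to be done by hand before the addition works. A minimal sketch, using repeated indices to expand the (1, 1) matrix; the names `z` and `z_full` are chosen for illustration:

```r
x <- matrix(1:6, nrow = 2)  # dimensions (2, 3)
z <- matrix(1)              # dimensions (1, 1)

# Base R refuses to add a (2, 3) matrix and a (1, 1) matrix.
msg <- tryCatch(x + z, error = function(e) conditionMessage(e))

# Recycle z by hand: repeat its only row 2 times and its only column 3 times.
z_full <- z[rep(1, nrow(x)), rep(1, ncol(x)), drop = FALSE]

result <- x + z_full
result
#>      [,1] [,2] [,3]
#> [1,]    2    4    6
#> [2,]    3    5    7
```
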

53. ### rray broadcasts

        library(rray)

        x <- matrix(1:6, nrow = 2)
        x
        #>      [,1] [,2] [,3]
        #> [1,]    1    3    5
        #> [2,]    2    4    6

        z <- matrix(1)
        z
        #>      [,1]
        #> [1,]    1

        x + z
        #> Error in x + z : non-conformable arrays

        as_rray(x) + z
        #> <vctrs_rray<double>[,3]>
        #>      [,1] [,2] [,3]
        #> [1,]    2    4    6
        #> [2,]    3    5    7
54. ### How? All hail our C++ overlords at QuantStack. Buy them a beer for creating xtensor.
58. ### Let’s go 3D

    x: (1, 3, 2) + y: (3, 1) = ?

    x holds (1, 2, 3) in its first slice and (4, 5, 6) in its second. y is a column holding (1, 2, 3).

    Can you even do that!?
59. ### Step 1 - increase dimensionality

    x has a dimensionality of 3 and y has a dimensionality of 2. Append 1’s to the dimensions of y until its dimensionality matches the dimensionality of x.

    - Before: x: (1, 3, 2), y: (3, 1), result: (?, ?, ?)
    - After: x: (1, 3, 2), y: (3, 1, 1), result: (?, ?, ?)
60. ### Step 1 - increase dimensionality

    x: (1, 3, 2) + y: (3, 1, 1). The values are unchanged; y has only gained a trailing dimension of length 1.
61. ### Step 2 - recycle dimensions

    - Before: x: (1, 3, 2), y: (3, 1, 1), result: (?, ?, ?)
    - After recycling: x: (3, 3, 2), y: (3, 3, 2), result: (3, 3, 2)
66. ### Step 2 - recycle dimensions

    x: (3, 3, 2) + y: (3, 3, 2) = (3, 3, 2)

    After recycling, every row of x is (1, 2, 3) in the first slice and (4, 5, 6) in the second, and every column of y is (1, 2, 3) in both slices. The sum has rows (2, 3, 4), (3, 4, 5), (4, 5, 6) in the first slice and (5, 6, 7), (6, 7, 8), (7, 8, 9) in the second.
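
The same 3D example can be worked through by hand in base R. This is a sketch of the two steps using repeated indices, not how rray implements broadcasting; `y3`, `x_full` and `y_full` are illustrative names.

```r
x <- array(1:6, dim = c(1, 3, 2))  # (1, 3, 2)
y <- matrix(1:3, ncol = 1)         # (3, 1)

# Step 1: append a 1 so y has the same dimensionality as x.
y3 <- array(y, dim = c(3, 1, 1))

# Step 2: recycle every length-1 dimension by repeating its only index.
x_full <- x[rep(1, 3), , , drop = FALSE]            # (3, 3, 2)
y_full <- y3[, rep(1, 3), rep(1, 2), drop = FALSE]  # (3, 3, 2)

result <- x_full + y_full
dim(result)  # 3 3 2
result[, , 1]
#>      [,1] [,2] [,3]
#> [1,]    2    3    4
#> [2,]    3    4    5
#> [3,]    4    5    6
```
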

68. ### rray as a toolkit

    rray_bind() rray_duplicate_any() rray_expand_dims() rray_broadcast() rray_flip() rray_max() rray_sum() rray_mean() rray_reshape() rray_rotate() rray_split() rray_tile() rray_unique() …

75. ### cbind( , )

    Binding the two example matrices, one holding 1 and 2 and the other holding 3 and 4, fails with Error: number of rows of matrices must match

77. ### rbind( , )

    Binding the same two matrices by row also fails, with Error: number of columns of matrices must match

79. ### rray_bind( , , axis = 2)

    The same two matrices are bound along the second axis. The result holds the values 1, 2, 3, 3, 4, 4.
80. ### rray_bind( , , axis = 3)

    The same two matrices are bound along a third axis. The result holds the values 1, 1, 2, 2, 3, 3, 4, 4.
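
To see what binding with broadcasting involves, here is a base R sketch that recycles the inputs by hand before binding. The inputs are an assumption: a (2, 1) column holding 1, 2 and a (1, 2) row holding 3, 4, which is consistent with the error messages and result values on the slides, but the slides' images are not recoverable from the transcript. This is not rray's implementation.

```r
a <- matrix(1:2, ncol = 1)  # (2, 1), assumed input
b <- matrix(3:4, nrow = 1)  # (1, 2), assumed input

# Base R refuses both bindings because the shapes do not line up.
cbind_msg <- tryCatch(cbind(a, b), error = function(e) conditionMessage(e))
rbind_msg <- tryCatch(rbind(a, b), error = function(e) conditionMessage(e))

# Recycle the length-1 dimensions by hand.
a_cols <- a[, rep(1, ncol(b)), drop = FALSE]  # (2, 2)
b_rows <- b[rep(1, nrow(a)), , drop = FALSE]  # (2, 2)

along_rows  <- rbind(a_cols, b)                        # axis 1: (3, 2)
along_cols  <- cbind(a, b_rows)                        # axis 2: (2, 3)
along_third <- array(c(a_cols, b_rows), dim = c(2, 2, 2))  # axis 3: (2, 2, 2)
```

Under that assumption, `along_cols` holds 1, 2, 3, 3, 4, 4 and `along_third` holds 1, 1, 2, 2, 3, 3, 4, 4, matching the values on the two `rray_bind()` slides.
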
81. ### rray as a toolkit

    rray_bind() rray_duplicate_any() rray_expand_dims() rray_broadcast() rray_flip() rray_max() rray_sum() rray_mean() rray_reshape() rray_rotate() rray_split() rray_tile() rray_unique() …
82. ### What if we want to “normalize” by dividing by the max value? Along columns? Along rows?

    x is the (2, 3) matrix with rows (1, 3, 5) and (2, 4, 6).
83. ### x / max(x) and sweep()

    | Expression | Result row 1 | Result row 2 |
    |---|---|---|
    | x / max(x) | .167 .500 .833 | .333 .667 1 |
    | sweep(x, 1, apply(x, 1, max), "/") | .200 .600 1 | .333 .667 1 |
    | sweep(x, 2, apply(x, 2, max), "/") | .500 .750 .833 | 1 1 1 |
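
The base R versions can be run as written. A short sketch, assuming `x` is the 2 x 3 matrix of 1 to 6; the result names are chosen for illustration:

```r
x <- matrix(1:6, nrow = 2)

# Divide everything by the overall maximum.
overall <- x / max(x)

# MARGIN = 1: divide each row by that row's maximum.
by_row <- sweep(x, 1, apply(x, 1, max), "/")

# MARGIN = 2: divide each column by that column's maximum.
by_col <- sweep(x, 2, apply(x, 2, max), "/")

round(by_row, 3)
#>       [,1]  [,2] [,3]
#> [1,] 0.200 0.600    1
#> [2,] 0.333 0.667    1
round(by_col, 3)
#>      [,1] [,2]  [,3]
#> [1,]  0.5 0.75 0.833
#> [2,]  1.0 1.00 1.000
```

The overall case is a plain division, while the row and column cases need `apply()` plus `sweep()` with a matching margin.
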
84. ### x / rray_max(x)

    | Expression | Result row 1 | Result row 2 |
    |---|---|---|
    | x / rray_max(x) | .167 .500 .833 | .333 .667 1 |
    | x / rray_max(x, axes = 2) | .200 .600 1 | .333 .667 1 |
    | x / rray_max(x, axes = 1) | .500 .750 .833 | 1 1 1 |

87. ### x / rray_max(x, axes = 1)

    rray_max(x, axes = 1) takes the maximum over the rows and returns a single row holding (2, 4, 6). Dividing x by it recycles that row across both rows of x, giving rows (.500, .750, .833) and (1, 1, 1).
88. ### Arrays are:

    1. frustrating to work with.
    2. difficult to program around.
    3. underpowered.
89. ### Arrays are:

    1. intuitive to work with.
    2. predictable to program around.
    3. powerful.