Slide 1

Slide 1 text

Rethinking Arrays in R
Davis Vaughan
@dvaughan32
Software Engineer, RStudio
July 2019

Slide 2

Slide 2 text

        green  red  blue
peanut     15    8    12
plain      10    6     9

Slide 3

Slide 3 text

This is a (2, 3) matrix

        green  red  blue
peanut     15    8    12
plain      10    6     9

Slide 5

Slide 5 text

        green  red  blue
peanut     15    8    12
plain      10    6     9

+

        green  red  blue
            3    1     2

=

Slide 6

Slide 6 text

        green  red  blue
peanut     15    8    12
plain      10    6     9

+

        green  red  blue
            3    1     2

=

Error: non-conformable arrays

Slide 7

Slide 7 text

        green  red  blue
peanut     15    8    12
plain      10    6     9

+

        green  red  blue
            3    1     2

=

        green  red  blue
peanut     18    9    14
plain      13    7    11
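The situation on the last three slides can be reproduced in base R. This is a minimal sketch: the values come from the slides, but building the objects with matrix() and the name extra_2x3 are assumptions made for illustration. It shows the error and one manual workaround, which is to repeat the single row of extra until the shapes match.

```r
# Values are taken from the slides; how they were built is an assumption.
bag <- matrix(
  c(15, 10, 8, 6, 12, 9),
  nrow = 2,
  dimnames = list(c("peanut", "plain"), c("green", "red", "blue"))
)
extra <- matrix(
  c(3, 1, 2),
  nrow = 1,
  dimnames = list(NULL, c("green", "red", "blue"))
)

# Base R refuses to add a (2, 3) matrix to a (1, 3) matrix.
result <- tryCatch(bag + extra, error = function(e) conditionMessage(e))
result
# "non-conformable arrays"

# Manual workaround: repeat the single row of `extra` once per row of `bag`.
# `extra_2x3` is a hypothetical name used only in this sketch.
extra_2x3 <- extra[rep(1, nrow(bag)), , drop = FALSE]
total <- bag + extra_2x3
total
# peanut: 18 9 14; plain: 13 7 11
```

The workaround is correct but has to be rewritten for every new pair of shapes, which is the pain point the rest of the talk addresses.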

Slide 8

Slide 8 text

Subsetting
Broadcasting
Manipulation

Slide 9

Slide 9 text

Subsetting

Slide 10

Slide 10 text

bag

        green  red  blue
peanut     15    8    12
plain      10    6     9

bag[,1:2]

        green  red
peanut     15    8
plain      10    6

Slide 11

Slide 11 text

bag

        green  red  blue
peanut     15    8    12
plain      10    6     9

bag[,1:2]

        green  red
peanut     15    8
plain      10    6

bag[,1]

?

Slide 12

Slide 12 text

bag

        green  red  blue
peanut     15    8    12
plain      10    6     9

bag[,1:2]

        green  red
peanut     15    8
plain      10    6

bag[,1]

peanut  plain
    15     10

Slide 13

Slide 13 text

bag

        green  red  blue
peanut     15    8    12
plain      10    6     9

bag[,1:2]

        green  red
peanut     15    8
plain      10    6

bag[, 1, drop = FALSE]

        green
peanut     15
plain      10

Slide 14

Slide 14 text

bags

bag1
        green  red  blue
peanut     15    8    12
plain      10    6     9

bag2
        green  red  blue
peanut     13    6    15
plain       3    5    11

bags[, 1, , drop = FALSE]

bag1
        green
peanut     15
plain      10

bag2
        green
peanut     13
plain       3

Slide 15

Slide 15 text

This is a (2, 3, 2) 3D array

bags

bag1
        green  red  blue
peanut     15    8    12
plain      10    6     9

bag2
        green  red  blue
peanut     13    6    15
plain       3    5    11

bags[, 1, , drop = FALSE]

bag1
        green
peanut     15
plain      10

bag2
        green
peanut     13
plain       3

Slide 17

Slide 17 text

bags

bag1
        green  red  blue
peanut     15    8    12
plain      10    6     9

bag2
        green  red  blue
peanut     13    6    15
plain       3    5    11

bags[, 1, , drop = FALSE]

bag1
        green
peanut     15
plain      10

bag2
        green
peanut     13
plain       3

bag[, 1, drop = FALSE]

Slide 18

Slide 18 text

The confusion? Subsetting is not dimensionality-stable.
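A short base R sketch of what "not dimensionality-stable" means in practice. The bag values come from the slides; the bag2 values are read off the slide pictures and the construction with matrix() and array() is an assumption. The only thing to look at is how many dimensions each result has.

```r
bag <- matrix(
  c(15, 10, 8, 6, 12, 9),
  nrow = 2,
  dimnames = list(c("peanut", "plain"), c("green", "red", "blue"))
)
# bag2 values are inferred from the slide pictures.
bag2 <- matrix(c(13, 3, 6, 5, 15, 11), nrow = 2, dimnames = dimnames(bag))

bags <- array(
  c(bag, bag2),
  dim = c(2, 3, 2),
  dimnames = list(rownames(bag), colnames(bag), c("bag1", "bag2"))
)

# 2D input: the result is 2D or 1D depending on how many columns are selected.
dim(bag[, 1:2])                 # 2 2   still a matrix
dim(bag[, 1])                   # NULL  dropped to a named vector
dim(bag[, 1, drop = FALSE])     # 2 1   still a matrix

# 3D input: the same pattern, one dimension higher.
dim(bags[, 1, ])                # 2 2   3D array dropped to a matrix
dim(bags[, 1, , drop = FALSE])  # 2 1 2 still a 3D array
```

With base R the caller has to remember drop = FALSE every time the number of selected columns might be one.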

Slide 19

Slide 19 text

rray

Slide 20

Slide 20 text

rray is designed to provide a stricter array class.

Slide 21

Slide 21 text

Create an rray

library(rray)

bag
#>        green red blue
#> peanut    15   8   12
#> plain     10   6    9

bag_rray <- as_rray(bag)

bag_rray
#> [,3][2]>
#>        green red blue
#> peanut    15   8   12
#> plain     10   6    9

Slide 22

Slide 22 text

bags_rray

bag1
        green  red  blue
peanut     15    8    12
plain      10    6     9

bag2
        green  red  blue
peanut     13    6    15
plain       3    5    11

bags_rray[, 1]

bag1
        green
peanut     15
plain      10

bag2
        green
peanut     13
plain       3

bags_rray[, 1:2]

bag1
        green  red
peanut     15    8
plain      10    6

bag2
        green  red
peanut     13    6
plain       3    5

bag_rray

        green  red  blue
peanut     15    8    12
plain      10    6     9

bag_rray[, 1:2]

        green  red
peanut     15    8
plain      10    6

bag_rray[, 1]

        green
peanut     15
plain      10

Slide 23

Slide 23 text

Broadcasting

Slide 24

Slide 24 text

Broadcasting has to do with
1) increasing dimensionality
2) recycling dimensions

Slide 25

Slide 25 text

bags (2, 3, 2)

bag1
        green  red  blue
peanut     15    8    12
plain      10    6     9

bag2
        green  red  blue
peanut     13    6    15
plain       3    5    11

+

extra (1, 3)

        green  red  blue
            3    1     2

=

Error: non-conformable arrays

Slide 26

Slide 26 text

bags (2, 3, 2)

bag1
        green  red  blue
peanut     15    8    12
plain      10    6     9

bag2
        green  red  blue
peanut     13    6    15
plain       3    5    11

+

extra (2, 3, 2)

bag1
        green  red  blue
            3    1     2
            3    1     2

bag2
        green  red  blue
            3    1     2
            3    1     2

=

(2, 3, 2)

bag1
        green  red  blue
peanut     18    9    14
plain      13    7    11

bag2
        green  red  blue
peanut     16    7    17
plain       6    6    13

Slide 27

Slide 27 text

How is extra reshaped so that this works?

Slide 28

Slide 28 text

        # row  # col  # frame
bags        2      3        2
extra       1      3
            ?      ?        ?

Slide 30

Slide 30 text

        # row  # col  # frame
bags        2      3        2
extra       1      3        1
            ?      ?        ?

Slide 31

Slide 31 text

bags (2, 3, 2)

bag1
        green  red  blue
peanut     15    8    12
plain      10    6     9

bag2
        green  red  blue
peanut     13    6    15
plain       3    5    11

+

extra (1, 3)

        green  red  blue
            3    1     2

becomes extra (1, 3, 1)

        green  red  blue
            3    1     2

=

Slide 32

Slide 32 text

        # row  # col  # frame
bags        2      3        2
extra       1      3        1
            ?      ?        ?

Slide 33

Slide 33 text

        # row   # col  # frame
bags        2       3        2
extra  1 -> 2       3        1
            2       ?        ?

Slide 34

Slide 34 text

bags (2, 3, 2)

bag1
        green  red  blue
peanut     15    8    12
plain      10    6     9

bag2
        green  red  blue
peanut     13    6    15
plain       3    5    11

+

extra (1, 3, 1)

        green  red  blue
            3    1     2

becomes extra (2, 3, 1)

        green  red  blue
            3    1     2
            3    1     2

=

Slide 35

Slide 35 text

        # row   # col  # frame
bags        2       3        2
extra  1 -> 2       3        1
            2       ?        ?

Slide 36

Slide 36 text

        # row   # col  # frame
bags        2       3        2
extra  1 -> 2       3        1
            2       3        ?

Slide 37

Slide 37 text

        # row   # col  # frame
bags        2       3        2
extra  1 -> 2       3   1 -> 2
            2       3        2

Slide 38

Slide 38 text

bags (2, 3, 2)

bag1
        green  red  blue
peanut     15    8    12
plain      10    6     9

bag2
        green  red  blue
peanut     13    6    15
plain       3    5    11

+

extra (2, 3, 2)

bag1
        green  red  blue
            3    1     2
            3    1     2

bag2
        green  red  blue
            3    1     2
            3    1     2

=

(2, 3, 2)

bag1
        green  red  blue
peanut     18    9    14
plain      13    7    11

bag2
        green  red  blue
peanut     16    7    17
plain       6    6    13

Slide 40

Slide 40 text

Broadcasting rules:
1) Match dimensionality by appending 1’s
2) Match dimensions by recycling them
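The two rules can be sketched in base R. This is an illustration only, not how rray implements broadcasting: broadcast_dims() and broadcast_to() are hypothetical helpers written for this example, and the bag2 values are read off the slide pictures.

```r
# Hypothetical helper: the common shape that two shapes broadcast to.
broadcast_dims <- function(x_dim, y_dim) {
  n <- max(length(x_dim), length(y_dim))
  # Rule 1: match dimensionality by appending 1's.
  x_dim <- c(x_dim, rep(1, n - length(x_dim)))
  y_dim <- c(y_dim, rep(1, n - length(y_dim)))
  # Rule 2: a dimension of size 1 can be recycled to the other size.
  compatible <- x_dim == y_dim | x_dim == 1 | y_dim == 1
  if (!all(compatible)) stop("dimensions cannot be broadcast together")
  ifelse(x_dim == 1, y_dim, x_dim)
}

# Hypothetical helper: physically expand `x` to the shape `to`.
broadcast_to <- function(x, to) {
  from <- c(dim(x), rep(1, length(to) - length(dim(x))))
  stopifnot(all(from == to | from == 1))
  dim(x) <- from  # rule 1 (this also drops dimnames)
  # Rule 2: along each recycled axis, select index 1 repeatedly.
  index <- Map(function(f, t) if (f == t) seq_len(t) else rep(1L, t), from, to)
  do.call(`[`, c(list(x), index, list(drop = FALSE)))
}

# bag1 values are from the slides; bag2 values are inferred from the pictures.
bags <- array(
  c(15, 10, 8, 6, 12, 9, 13, 3, 6, 5, 15, 11),
  dim = c(2, 3, 2)
)
extra <- matrix(c(3, 1, 2), nrow = 1)

shape <- broadcast_dims(dim(bags), dim(extra))
shape
# 2 3 2

total <- bags + broadcast_to(extra, shape)
total[, , 1]
# rows: 18 9 14 and 13 7 11
```

The steps follow the walk-through on the preceding slides: (1, 3) becomes (1, 3, 1), then rows and frames are recycled to give (2, 3, 2).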

Slide 41

Slide 41 text

rray broadcasts

bag
#>        green red blue
#> peanut    15   8   12
#> plain     10   6    9

extra
#>      green red blue
#> [1,]     3   1    2

bag + extra
#> Error in bag + extra: non-conformable arrays

bag_rray + extra
#> [,3][2]>
#>        green red blue
#> peanut    18   9   14
#> plain     13   7   11

Slide 42

Slide 42 text

Manipulation

Slide 43

Slide 43 text

rray as a toolkit

rray_bind()
rray_duplicate_any()
rray_expand_dims()
rray_broadcast()
rray_flip()
rray_max()
rray_sum()
rray_mean()
rray_reshape()
rray_rotate()
rray_split()
rray_tile()
rray_unique()
...

Slide 44

Slide 44 text

The best part? They all work with base R.

Slide 46

Slide 46 text

rray as a toolkit

rray_bind()
rray_duplicate_any()
rray_expand_dims()
rray_broadcast()
rray_flip()
rray_max()
rray_sum()
rray_mean()
rray_reshape()
rray_rotate()
rray_split()
rray_tile()
rray_unique()
...

Slide 48

Slide 48 text

Compute proportions
1) Overall
2) By filling
3) By color

        green  red  blue
peanut     15    8    12
plain      10    6     9

Slide 49

Slide 49 text

bag

        green  red  blue
peanut     15    8    12
plain      10    6     9

Overall
bag / sum(bag)

        green   red  blue
peanut   0.25  0.13  0.20
plain    0.17  0.10  0.15

By Filling
sweep(bag, 1, apply(bag, 1, sum), "/")

        green   red  blue
peanut   0.43  0.23  0.34
plain    0.40  0.24  0.36

By Color
sweep(bag, 2, apply(bag, 2, sum), "/")

        green   red  blue
peanut   0.60  0.57  0.57
plain    0.40  0.43  0.43
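The three base R computations on this slide can be run as follows. The bag values are from the slides; building bag with matrix() is an assumption. The comparison with prop.table() is an addition for illustration and is not on the slide.

```r
bag <- matrix(
  c(15, 10, 8, 6, 12, 9),
  nrow = 2,
  dimnames = list(c("peanut", "plain"), c("green", "red", "blue"))
)

# Overall: every cell divided by the grand total.
overall <- bag / sum(bag)

# By filling: every cell divided by its row total (margin 1).
by_filling <- sweep(bag, 1, apply(bag, 1, sum), "/")

# By color: every cell divided by its column total (margin 2).
by_color <- sweep(bag, 2, apply(bag, 2, sum), "/")

round(by_filling, 2)

# prop.table() is a base R shortcut for the two sweep() calls.
all.equal(by_filling, prop.table(bag, 1))
all.equal(by_color, prop.table(bag, 2))
```

Each of the three cases needs a different idiom in base R, which is the inconsistency the next slide contrasts with rray_sum().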

Slide 50

Slide 50 text

bag

        green  red  blue
peanut     15    8    12
plain      10    6     9

Overall
bag / rray_sum(bag)

        green   red  blue
peanut   0.25  0.13  0.20
plain    0.17  0.10  0.15

By Filling
bag / rray_sum(bag, axes = 2)

        green   red  blue
peanut   0.43  0.23  0.34
plain    0.40  0.24  0.36

By Color
bag / rray_sum(bag, axes = 1)

        green   red  blue
peanut   0.60  0.57  0.57
plain    0.40  0.43  0.43

Slide 51

Slide 51 text

In conclusion...

Slide 52

Slide 52 text

1) Stricter rray class
2) Broadcasting
3) Toolkit

Slide 53

Slide 53 text

GitHub
https://github.com/r-lib/rray

Website
https://rray.r-lib.org

Powered by: xtensor
https://github.com/QuantStack/xtensor

Questions?