useR-2019-rray.pdf

Davis Vaughan
July 05, 2019

Transcript

  1. Rethinking Arrays in R. Davis Vaughan (@dvaughan32), Software Engineer, RStudio. July 2019
  2. [Figure: a bag of candy counted by filling and color. peanut: green 15, red 8, blue 12; plain: green 10, red 6, blue 9.]
  3. This is a (2, 3) matrix. [Figure: the same bag, with fillings (peanut, plain) as rows and colors (green, red, blue) as columns.]
  4. [Figure: the bag matrix again.]
  5. [Figure: bag + extra = ?, where extra holds one count per color: green 3, red 1, blue 2.]
  6. Error: non-conformable arrays. [Figure: bag + extra fails in base R.]
  7. [Figure: the intended result of bag + extra. peanut: green 18, red 9, blue 14; plain: green 13, red 7, blue 11.]
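The failure on slide 6 can be reproduced in base R. The sketch below rebuilds `bag` and `extra` from the values on the slides and captures the error message. The final line is one possible base R workaround (repeating the single row of `extra` by hand); it is an illustration, not something shown in the talk.

```r
# Counts of candies in one bag: rows are fillings, columns are colors
# (values taken from the slides).
bag <- matrix(
  c(15, 10, 8, 6, 12, 9),
  nrow = 2,
  dimnames = list(c("peanut", "plain"), c("green", "red", "blue"))
)

# One extra count per color, stored as a (1, 3) matrix.
extra <- matrix(
  c(3, 1, 2),
  nrow = 1,
  dimnames = list(NULL, c("green", "red", "blue"))
)

dim(bag)    # 2 3
dim(extra)  # 1 3

# Base R refuses to add a (2, 3) matrix to a (1, 3) matrix.
msg <- tryCatch(bag + extra, error = function(e) conditionMessage(e))
msg  # "non-conformable arrays"

# Workaround (illustration only): repeat the row of `extra` once per row of `bag`.
fixed <- bag + extra[rep(1, nrow(bag)), ]
fixed
```

The workaround gives the result shown on slide 7, but the caller has to know which dimension to repeat and how many times.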
  8. Subsetting, Broadcasting, Manipulation

  9. Subsetting

  10. [Figure: bag and bag[, 1:2]. Selecting two columns (green, red) returns a (2, 2) matrix: peanut 15, 8; plain 10, 6.]
  11. [Figure: bag[, 1:2] next to bag[, 1]. What does selecting a single column return?]
  12. [Figure: bag[, 1] returns a 1D vector (peanut 15, plain 10), while bag[, 1:2] is still a matrix.]
  13. [Figure: bag[, 1, drop = FALSE] keeps the result a (2, 1) matrix with a single green column, alongside bag[, 1:2].]
  14. [Figure: bags, two bags stacked as bag1 and bag2, and bags[, 1, , drop = FALSE].]
  15. This is a (2, 3, 2) 3D array. [Figure: bags and bags[, 1, , drop = FALSE].]
  16. [Figure: bags and bags[, 1, , drop = FALSE] again.]
  17. [Figure: bags[, 1, , drop = FALSE] compared with bag[, 1, drop = FALSE].]
  18. The confusion? Subsetting is not dimensionality-stable.
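A short base R sketch of what "not dimensionality-stable" means in practice. The `bag` values come from the slides. The `bag2` values are an assumption: they are my reading of the garbled figure and may assign peanut and plain differently from the original.

```r
bag <- matrix(
  c(15, 10, 8, 6, 12, 9),
  nrow = 2,
  dimnames = list(c("peanut", "plain"), c("green", "red", "blue"))
)

# Assumption: second bag, as read from the figure on the slides.
bag2 <- matrix(c(13, 3, 6, 5, 15, 11), nrow = 2, dimnames = dimnames(bag))

# Stack the two bags into a (2, 3, 2) array.
bags <- array(
  c(bag, bag2),
  dim = c(2, 3, 2),
  dimnames = c(dimnames(bag), list(c("bag1", "bag2")))
)

dim(bag[, 1:2])                 # 2 2   (still a matrix)
dim(bag[, 1])                   # NULL  (dropped to a named vector)
dim(bag[, 1, drop = FALSE])     # 2 1

dim(bags[, 1, ])                # 2 2   (3D array dropped to a matrix)
dim(bags[, 1, , drop = FALSE])  # 2 1 2
```

The dimensionality of the result depends on how many elements the index selects, unless `drop = FALSE` is passed every time.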

  19. rray

  20. rray is designed to provide a stricter array class.

  21. Create an rray

      library(rray)

      bag
      #>        green red blue
      #> peanut    15   8   12
      #> plain     10   6    9

      bag_rray <- as_rray(bag)

      bag_rray
      #> <rray<dbl>[,3][2]>
      #>        green red blue
      #> peanut    15   8   12
      #> plain     10   6    9
  22. [Figure: rray subsetting. bag_rray[, 1:2] and bag_rray[, 1] both stay 2D; bags_rray[, 1:2] and bags_rray[, 1] both stay 3D.]
  23. Broadcasting

  24. Broadcasting has to do with 1) increasing dimensionality 2) recycling dimensions
  25. [Figure: bags (2, 3, 2) + extra (1, 3) = Error: non-conformable arrays.]
  26. [Figure: bags (2, 3, 2) + extra (2, 3, 2) = a (2, 3, 2) result; extra (green 3, red 1, blue 2) is repeated across both rows and both bags.]
  27. How is extra reshaped so that this works?

  28. [Table: # row, # col, # frame. bags: 2, 3, 2. extra: 1, 3, (none). result: ?, ?, ?.]
  29. [Same table as slide 28.]
  30. [Table: bags: 2, 3, 2. extra: 1, 3, 1. result: ?, ?, ?. A frame dimension of 1 is appended to extra.]
  31. [Figure: bags (2, 3, 2) + extra (1, 3) becomes bags (2, 3, 2) + extra (1, 3, 1).]
  32. [Same table as slide 30.]
  33. [Table: bags: 2, 3, 2. extra: 1 -> 2, 3, 1. result: 2, ?, ?. The row dimension of extra is recycled from 1 to 2.]
  34. [Figure: extra (1, 3, 1) becomes extra (2, 3, 1).]
  35. [Same table as slide 33.]
  36. [Table: bags: 2, 3, 2. extra: 1 -> 2, 3, 1. result: 2, 3, ?. The column dimensions already match.]
  37. [Table: bags: 2, 3, 2. extra: 1 -> 2, 3, 1 -> 2. result: 2, 3, 2. The frame dimension of extra is recycled from 1 to 2.]
  38. [Figure: bags (2, 3, 2) + extra (2, 3, 2) = a (2, 3, 2) result.]
  39. [Same figure as slide 38.]
  40. Broadcasting rules: 1) Match dimensionality by appending 1’s 2) Match dimensions by recycling them
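The two rules can be sketched in base R. `broadcast_to()` below is a hypothetical helper written for this illustration; it is not part of rray or base R, and it ignores dimnames for brevity. The `bags` values are my reading of the figures on the slides.

```r
# Hypothetical helper: broadcast array `x` to the dimensions `to`.
broadcast_to <- function(x, to) {
  from <- dim(x)
  if (is.null(from)) from <- length(x)
  if (length(from) > length(to)) stop("`x` has more dimensions than `to`")

  # Rule 1: match dimensionality by appending 1s.
  from <- c(from, rep(1L, length(to) - length(from)))

  # Rule 2: an axis can be recycled only if its size is 1.
  if (any(from != to & from != 1L)) stop("dimensions cannot be broadcast")

  dim(x) <- from  # note: this discards any dimnames
  # Index a size-1 axis repeatedly to recycle it; keep other axes as they are.
  idx <- Map(function(f, t) if (f == 1L) rep(1L, t) else seq_len(t), from, to)
  do.call(`[`, c(list(x), idx, list(drop = FALSE)))
}

# Two bags as a (2, 3, 2) array; columns are green, red, blue.
bags <- array(c(15, 10, 8, 6, 12, 9, 13, 3, 6, 5, 15, 11), dim = c(2, 3, 2))
extra <- matrix(c(3, 1, 2), nrow = 1)  # (1, 3)

extra_b <- broadcast_to(extra, dim(bags))
dim(extra_b)  # 2 3 2

total <- bags + extra_b
total[, , 1]  # first bag: rows 18 9 14 and 13 7 11
```

The walk from (1, 3) to (1, 3, 1) to (2, 3, 2) in the preceding slides corresponds to the two commented steps inside the helper.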
  41. rray broadcasts

      bag
      #>        green red blue
      #> peanut    15   8   12
      #> plain     10   6    9

      extra
      #>      green red blue
      #> [1,]     3   1    2

      bag + extra
      #> Error in bag + extra: non-conformable arrays

      bag_rray + extra
      #> <rray<dbl>[,3][2]>
      #>        green red blue
      #> peanut    18   9   14
      #> plain     13   7   11
  42. Manipulation

  43. rray as a toolkit: rray_bind(), rray_duplicate_any(), rray_expand_dims(), rray_broadcast(), rray_flip(), rray_max(), rray_sum(), rray_mean(), rray_reshape(), rray_rotate(), rray_split(), rray_tile(), rray_unique(), ...
  44. The best part? They all work with base R.

  48. Compute proportions 1) Overall 2) By filling 3) By color. [Figure: the bag matrix.]
  49. Overall: bag / sum(bag)
      By filling: sweep(bag, 1, apply(bag, 1, sum), "/")
      By color: sweep(bag, 2, apply(bag, 2, sum), "/")
      [Figure: resulting proportions (green, red, blue). Overall: peanut 0.25, 0.13, 0.20; plain 0.17, 0.10, 0.15. By filling: peanut 0.43, 0.23, 0.34; plain 0.40, 0.24, 0.36. By color: peanut 0.60, 0.57, 0.57; plain 0.40, 0.43, 0.43.]
  50. Overall: bag / rray_sum(bag)
      By filling: bag / rray_sum(bag, axes = 2)
      By color: bag / rray_sum(bag, axes = 1)
      [Figure: the same three tables of proportions as on slide 49.]
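The base R expressions from slide 49 can be run as written. The sketch below rebuilds `bag` from the slide values and uses `rowSums()` and `colSums()` in place of `apply(bag, 1, sum)` and `apply(bag, 2, sum)`, which compute the same totals. The comparison with `prop.table()` is an addition for illustration, not something from the talk.

```r
bag <- matrix(
  c(15, 10, 8, 6, 12, 9),
  nrow = 2,
  dimnames = list(c("peanut", "plain"), c("green", "red", "blue"))
)

# 1) Overall: every cell divided by the grand total.
overall <- bag / sum(bag)

# 2) By filling: each row divided by its row total.
by_filling <- sweep(bag, 1, rowSums(bag), "/")

# 3) By color: each column divided by its column total.
by_color <- sweep(bag, 2, colSums(bag), "/")

round(overall, 2)
round(by_filling, 2)
round(by_color, 2)

# Sanity checks: proportions sum to 1 along the normalised margin.
rowSums(by_filling)  # peanut 1, plain 1
colSums(by_color)    # green 1, red 1, blue 1

# prop.table() is a base R shortcut for the margin-wise cases.
all.equal(by_filling, prop.table(bag, 1))  # TRUE
```

In base R the margin has to be spelled out through `sweep()`, whereas the rray version on slide 50 relies on broadcasting to line up the reduced dimension.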
  51. In conclusion...

  52. 1) Stricter rray class 2) Broadcasting 3) Toolkit

  53. GitHub: https://github.com/r-lib/rray
      Website: https://rray.r-lib.org
      Powered by xtensor: https://github.com/QuantStack/xtensor
      Questions?