Jennifer (Jenny) Bryan
February 02, 2018
# Data Rectangling

Talk about data rectangling and list-columns at RStudio Conf 2018 in San Diego
https://www.rstudio.com/conference/
Gist of code I showed:
https://gist.github.com/jennybc/3afafce0a06fde314b5c9844912d6bd7

## Transcript

1. Data Wrangling → Data Rectangling
@JennyBryan
@jennybc

3. atomic vectors
logical, factor
integer, double

4. vectors of same length? DATA FRAME!

5. vectors don’t have to be atomic
works for lists too! list column
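
A minimal base R sketch of slides 4 and 5, not code from the talk: same-length atomic vectors form a data frame, and a list with one element per row can sit alongside them as a list column. The data frame `df` and its values are made up for illustration.

```r
# Two atomic vectors of the same length form an ordinary data frame.
# (All values here are invented for illustration.)
df <- data.frame(
  name = c("Ana", "Bo", "Cy"),
  age  = c(31L, 28L, 45L),
  stringsAsFactors = FALSE
)

# A list is also a vector, so it can be a column too,
# as long as it has exactly one element per row.
df$pets <- list(
  c("cat", "dog"),  # two values in a single cell
  character(0),     # an empty cell
  "fish"
)

is.list(df$pets)  # the new column is a list
lengths(df$pets)  # number of pets in each row
```

Tibbles make this more direct, since `tibble::tibble()` accepts a list as a column at construction time.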

7. a list

8. a homogeneous list

9. Why work with lists?
You have no choice.
• String processing, e.g., splitting
• JSON or XML, e.g., web APIs
• Models, plots, & collections thereof
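
As a small illustration of the first bullet (a sketch with made-up input, not code from the talk), base R's `strsplit()` hands back a list because each string can split into a different number of pieces:

```r
# Made-up input strings with differing numbers of comma-separated pieces.
x <- c("a,b,c", "d", "e,f")

# strsplit() returns one character vector per input string,
# so the overall result has to be a list.
pieces <- strsplit(x, ",", fixed = TRUE)

is.list(pieces)  # TRUE
lengths(pieces)  # 3 1 2
```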

10. An API Of Ice And Fire
https://anapioficeandfire.com
https://cran.r-project.org/package=repurrrsive

11. "Combines the excitement of iris and mtcars,
with the complexity of recursive lists.
W00t!"
install.packages("repurrrsive")

12. https://blog.rstudio.com/2017/08/22/rstudio-v1-1-preview-object-explorer/
View(YOUR_HAIRY_LIST)

13. got_chars[[9]][["name"]]
got_chars[[9]][["titles"]]

14. x[[i]]
x[i]
x
from
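
A minimal sketch of the difference between the two subsetting forms on this slide, using a small made-up list `x`:

```r
# A small made-up list with three named elements.
x <- list(a = 1:3, b = "text", c = TRUE)

inner <- x[["a"]]  # [[ pulls out the element itself: an integer vector
outer <- x["a"]    # [ keeps the container: a list of length 1

class(inner)   # "integer"
class(outer)   # "list"
length(outer)  # 1
```

`got_chars[[9]][["name"]]` on the previous slide uses `[[` twice for the same reason: each step reaches inside one level of the list.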

15. http://blog.codinghorror.com/falling-into-the-pit-o
pit of success

16. https://shibumo.wordpress.com
gentle hill of striving

17. purrr::map(.x, .f, ...)

18. map(.x, .f, ...)
for every element of .x
apply .f

19. map(.x, .f, ...)
.f has some special shortcuts
map(.x, "TEXT")
map(.x, i)
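
A minimal sketch of the two shortcuts, assuming the purrr package is installed. The list `people` is a hypothetical stand-in for `minis` or `got_chars`.

```r
library(purrr)

# Hypothetical stand-in for a list such as `minis` or `got_chars`.
people <- list(
  list(name = "Ana", pets = c("cat", "dog")),
  list(name = "Bo",  pets = "fish")
)

# A string extracts the element with that name from every component.
by_name <- map(people, "name")

# An integer extracts the element at that position from every component.
by_position <- map(people, 2)

str(by_name)
str(by_position)
```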

20. .x = minis

21. map(minis, "pants")

22. go to R

23. map_lgl(.x, .f, ...)
map_int(.x, .f, ...)
map_dbl(.x, .f, ...)
map_chr(.x, .f, ...)
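
A minimal sketch of the type-specific variants, assuming purrr is installed; `people` and its fields are made up. Each variant returns an atomic vector of the named type instead of a list, and fails if a result does not fit that type.

```r
library(purrr)

# Hypothetical records; every field holds exactly one value.
people <- list(
  list(name = "Ana", age = 31L, height = 1.70),
  list(name = "Bo",  age = 28L, height = 1.82)
)

person_names   <- map_chr(people, "name")         # character vector
person_ages    <- map_int(people, "age")          # integer vector
person_heights <- map_dbl(people, "height")       # double vector
over_thirty    <- map_lgl(people, ~ .x$age > 30)  # logical vector
```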

24. map_dfr(minis, `[`,

25. If everything is equally easy,
everything is equally hard.
paraphrasing David Heinemeier Hansson re: Ruby on Rails

26. map(.x, .f, ...)
.f can take many forms
• existing function
• anonymous function
• formula
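
The three forms can be compared side by side in a short sketch, assuming purrr is installed and using a made-up list `x`. The last two calls are equivalent.

```r
library(purrr)

x <- list(1:3, 4:6, 7:9)  # made-up input

# Existing function: pass it by name.
means <- map_dbl(x, mean)

# Anonymous function: write the function inline.
doubled_fn <- map_dbl(x, function(v) mean(v) * 2)

# Formula: ~ starts the formula and .x stands for the current element.
doubled_formula <- map_dbl(x, ~ mean(.x) * 2)
```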

27. .x = minis

28. map(minis, antennate)

29. library(glue)

glue_data(
list(name = "Jenny", born = "in Atlanta"),
"{name} was born {born}."
)
#> Jenny was born in Atlanta.

glue_data(got_chars[[2]], "{name} was born {born}.")
#> Tyrion Lannister was born In 273 AC, at Casterly Rock.

glue_data(got_chars[[9]], "{name} was born {born}.")
#> Daenerys Targaryen was born In 284 AC, at Dragonstone.

30. glue_data(got_chars[[9]], "{name} was born {born}.")
~ glue_data( .x , "{name} was born {born}.")
replace your example with .x
prefix with ~ to say "it's a formula!"

31. map_chr(got_chars, ~ glue_data(.x, "{name} was born {born}."))
#> [1] "Theon Greyjoy was born In 278 AC or 279 AC, at Pyke."
#> [2] "Tyrion Lannister was born In 273 AC, at Casterly Rock."
#> [3] "Victarion Greyjoy was born In 268 AC or before, at Pyke."
#> [4] "Will was born ."
#> [5] "Areo Hotah was born In 257 AC or before, at Norvos."
#> [6] "Chett was born At Hag's Mire."
#> [7] "Cressen was born In 219 AC or 220 AC."
#> [8] "Arianne Martell was born In 276 AC, at Sunspear."
#> [9] "Daenerys Targaryen was born In 284 AC, at Dragonstone."
drop-in to any member of the map_*() family

33. Why put a list into a data frame?
safety & convenience
• Manage multiple vectors holistically
• Use existing toolkit for filter, select, etc.

34. What happens in the data frame stays in the data frame

35. last R example:
list in a data frame = list-column
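
A minimal base R sketch of why that is convenient, using made-up data rather than the example shown in the talk: once the list lives in a data frame, ordinary row filtering keeps it aligned with the other columns.

```r
# Made-up data: an id column plus a list-column of tags.
df <- data.frame(id = 1:3, stringsAsFactors = FALSE)
df$tags <- list(c("x", "y"), character(0), "z")

# Derive an ordinary column from the list-column.
df$n_tags <- lengths(df$tags)

# Filter rows; the list-column is subset along with everything else.
kept <- df[df$n_tags > 0, ]

kept$id    # 1 3
kept$tags  # the matching list elements, still one per row
```

The same pattern carries over to `dplyr::filter()` and `dplyr::select()` on a tibble.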

36. lists are part of life
RStudio Object viewer helps
tibbles are list-friendly