Jennifer (Jenny) Bryan
February 02, 2018

# Data Rectangling

Talk about data rectangling and list-columns at RStudio Conf 2018 in San Diego
https://www.rstudio.com/conference/
Gist of code I showed:
https://gist.github.com/jennybc/3afafce0a06fde314b5c9844912d6bd7

## Transcript

1. Data Wrangling
@JennyBryan
@jennybc

2. Data Rectangling ("Rect" replaces "Wrang" in "Wrangling")
@JennyBryan
@jennybc

3. atomic vectors
logical factor
integer, double

4. vectors of same length? DATA FRAME!

5. vectors don’t have to be atomic
works for lists too! list-column

6. name

stuff

this is a data frame!
a tibble, specifically
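
A minimal sketch of the picture on this slide, assuming the tibble package is installed. The column names `name` and `stuff` come from the slide; the values are made up for illustration.

```r
library(tibble)

# Toy values (assumed, not from the talk): `name` is an atomic character
# vector, `stuff` is a list whose elements differ in type and length.
df <- tibble(
  name  = c("a", "b", "c"),
  stuff = list(1:3, c("x", "y"), list(flag = TRUE))
)

is.data.frame(df)  # a tibble is still a data frame
is.list(df$stuff)  # the second column is a list, i.e. a list-column
lengths(df$stuff)  # one list element per row, each with its own length
```

The only requirement is that both columns have the same length (three rows here); the list elements themselves can be anything.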

7. a list

8. a homogeneous list

9. Why work with lists?
You have no choice.
• String processing, e.g., splitting
• JSON or XML, e.g., web APIs
• Models, plots, & collections thereof
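
To make the first bullet concrete, here is a small base R sketch (the input strings are invented for illustration). Splitting a character vector cannot return an atomic vector, because each string may yield a different number of pieces, so the result is a list.

```r
# Hypothetical input: comma-separated values of varying length
x <- c("a,b,c", "d", "e,f")

# strsplit() returns one character vector per input string, wrapped in a list
pieces <- strsplit(x, ",", fixed = TRUE)

is.list(pieces)  # the result is a list, whether you wanted one or not
lengths(pieces)  # number of pieces per input string
pieces[[1]]      # the pieces of the first string
```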

10. An API Of Ice And Fire
https://anapioficeandfire.com
https://cran.r-project.org/package=repurrrsive

11. "Combines the excitement of iris and mtcars,
with the complexity of recursive lists.
W00t!"
install.packages("repurrrsive")

12. https://blog.rstudio.com/2017/08/22/rstudio-v1-1-preview-object-explorer/
View(YOUR_HAIRY_LIST)

13. got_chars[[9]][["name"]]
got_chars[[9]][["titles"]]

14. x[[i]]
x[i]
x
from
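
The difference between `x[i]` and `x[[i]]` can be checked on a toy list in base R. The lists `x` and `nested` below are hypothetical stand-ins, not data from the talk.

```r
# A small named list with elements of different types
x <- list(a = 1:3, b = "text")

# Single brackets keep the container: the result is still a list
sub <- x["a"]
is.list(sub)     # TRUE

# Double brackets pull out the element itself
elt <- x[["a"]]
is.integer(elt)  # TRUE

# Chained double brackets drill into nested lists,
# in the same way as got_chars[[9]][["name"]]
nested <- list(list(name = "first"), list(name = "second"))
nested[[2]][["name"]]
```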

15. http://blog.codinghorror.com/falling-into-the-pit-o
pit of success

16. https://shibumo.wordpress.com
gentle hill of striving

17. map(.x, .f, ...)
purrr::

18. map(.x, .f, ...)
for every element of .x
apply .f

19. map(.x, .f, ...)
.f has some special shortcuts
map(.x, "TEXT")
map(.x, i)
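
A short sketch of the two shortcuts, assuming the purrr package is installed. The list `people` is a made-up stand-in for a list of records.

```r
library(purrr)

# Hypothetical records: each element is itself a list with the same fields
people <- list(
  list(name = "Ann", age = 31),
  list(name = "Bob", age = 45)
)

# A string extracts the element with that name from every record
by_name <- map(people, "name")

# A number extracts the element at that position from every record
by_position <- map(people, 2)
```

Both calls return a list with one element per record, which is what plain `map()` always does.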

20. .x = minis

21. map(minis, "pants")

22. go to R

23. map_lgl(.x, .f, ...)
map_int(.x, .f, ...)
map_dbl(.x, .f, ...)
map_chr(.x, .f, ...)
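
The typed variants return an atomic vector of the stated type instead of a list. A minimal sketch, assuming purrr is installed and using a made-up `people` list:

```r
library(purrr)

# Hypothetical records, not data from the talk
people <- list(
  list(name = "Ann", age = 31),
  list(name = "Bob", age = 45)
)

nms     <- map_chr(people, "name")          # character vector
ages    <- map_dbl(people, "age")           # double vector
sizes   <- map_int(people, length)          # integer vector: fields per record
over_40 <- map_lgl(people, ~ .x$age > 40)   # logical vector
```

Each variant raises an error if a result cannot be represented as a single value of the requested type.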

24. map_dfr(minis, `[`,

25. If everything is equally easy,
everything is equally hard.
paraphrasing David Heinemeier Hansson re: Ruby on Rails

26. map(.x, .f, ...)
.f can take many forms
• existing function
• anonymous function
• formula
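
The three forms can be compared side by side. This is a sketch assuming purrr is installed; the input list `x` is invented for illustration.

```r
library(purrr)

# Hypothetical input: a list of integer vectors
x <- list(1:3, 4:6)

# 1. An existing function, passed by name
means <- map_dbl(x, mean)

# 2. An anonymous function
doubled_fn <- map_dbl(x, function(v) mean(v) * 2)

# 3. A formula, where .x stands for the current element
doubled_formula <- map_dbl(x, ~ mean(.x) * 2)
```

The anonymous function and the formula compute the same thing here; the formula is simply the more compact spelling.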

27. .x = minis

28. map(minis, antennate)

29. library(glue)

glue_data(
list(name = "Jenny", born = "in Atlanta"),
"{name} was born {born}."
)
#> Jenny was born in Atlanta.

glue_data(got_chars[[2]], "{name} was born {born}.")
#> Tyrion Lannister was born In 273 AC, at Casterly Rock.

glue_data(got_chars[[9]], "{name} was born {born}.")
#> Daenerys Targaryen was born In 284 AC, at Dragonstone.

30. glue_data(got_chars[[9]], "{name} was born {born}.")
~ glue_data(.x, "{name} was born {born}.")
replace your example with .x
prefix with ~ to say "it's a formula!"

31. map_chr(got_chars, ~ glue_data(.x, "{name} was born {born}."))
#> [1] "Theon Greyjoy was born In 278 AC or 279 AC, at Pyke."
#> [2] "Tyrion Lannister was born In 273 AC, at Casterly Rock."
#> [3] "Victarion Greyjoy was born In 268 AC or before, at Pyke."
#> [4] "Will was born ."
#> [5] "Areo Hotah was born In 257 AC or before, at Norvos."
#> [6] "Chett was born At Hag's Mire."
#> [7] "Cressen was born In 219 AC or 220 AC."
#> [8] "Arianne Martell was born In 276 AC, at Sunspear."
#> [9] "Daenerys Targaryen was born In 284 AC, at Dragonstone."
drop-in to any member of the map_*() family

32. name

stuff

this is a data frame!
a tibble, specifically

33. Why put a list into a data frame?
safety & convenience
• Manage multiple vectors holistically
• Use existing toolkit for filter, select, etc.
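
One way to see both bullets at work, assuming the tibble, purrr, and dplyr packages are installed. The list `people` and its fields are hypothetical; the talk's own example uses `got_chars`.

```r
library(tibble)
library(purrr)
library(dplyr)

# Hypothetical records with a variable-length field
people <- list(
  list(name = "Ann", titles = c("Dr", "Prof")),
  list(name = "Bob", titles = "Mr")
)

# Build the data frame: an atomic column plus a list-column
df <- tibble(
  name   = map_chr(people, "name"),
  titles = map(people, "titles")
)

# Ordinary dplyr verbs keep name and titles in sync, row by row
kept <- filter(df, lengths(titles) > 1)
names_only <- select(kept, name)
```

Because the list sits inside the data frame, filtering rows cannot leave `name` and `titles` out of step with each other.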

34. What happens in the data frame
stays in the data frame

35. last R example:
list in a data frame = list-column

36. lists are part of life
RStudio Object viewer helps
tibbles are list-friendly