Slide 1

Slide 1 text

Data Wrangling @JennyBryan @jennybc  

Slide 2

Slide 2 text

Data Wrangling @JennyBryan @jennybc

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

No content

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

atomic vectors: logical, factor, integer, double

Slide 7

Slide 7 text

vectors of same length? DATA FRAME!
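
As a minimal sketch of the idea on this slide (the vector names and values are made up for illustration), atomic vectors of the same length can be placed side by side as the columns of a data frame:

```r
# Three atomic vectors of the same length (hypothetical values).
name  <- c("a", "b", "c")
n     <- c(1L, 2L, 3L)
is_ok <- c(TRUE, FALSE, TRUE)

# Same length, so they can sit side by side as columns.
df <- data.frame(name = name, n = n, is_ok = is_ok)

str(df)
```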

Slide 8

Slide 8 text

vectors don’t have to be atomic
works for lists too!
list column

Slide 9

Slide 9 text

name
 stuff
 this is a data frame! a tibble, specifically
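
A small sketch of what such a tibble might look like, assuming the tibble package is installed. The column names follow the slide; the values are invented for illustration:

```r
library(tibble)

# "name" is an atomic character vector.
# "stuff" is a list with one element per row; elements may differ in type and length.
df <- tibble(
  name  = c("a", "b", "c"),
  stuff = list(1:3, letters[1:2], list(x = 1, y = "z"))
)

df
is.list(df$stuff)
```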

Slide 10

Slide 10 text

a list

Slide 11

Slide 11 text

a homogeneous list

Slide 12

Slide 12 text

Why work with lists? You have no choice.
• String processing, e.g., splitting
• JSON or XML, e.g., web APIs
• Models, plots, & collections thereof
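
The string-processing case can be illustrated with base R alone. In this sketch (the input strings are hypothetical), splitting yields a different number of pieces per input, so the result has to be a list:

```r
# Each string splits into a different number of pieces.
x <- c("a,b", "c", "d,e,f")
pieces <- strsplit(x, ",")

class(pieces)    # a list, one element per input string
lengths(pieces)  # number of pieces per element
```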

Slide 13

Slide 13 text

An API Of Ice And Fire
https://anapioficeandfire.com
https://cran.r-project.org/package=repurrrsive

Slide 14

Slide 14 text

"Combines the excitement of iris and mtcars, with the complexity of recursive lists. W00t!"
install.packages("repurrrsive")

Slide 15

Slide 15 text

https://blog.rstudio.com/2017/08/22/rstudio-v1-1-preview-object-explorer/
View(YOUR_HAIRY_LIST)

Slide 16

Slide 16 text

got_chars[[9]][["name"]]
got_chars[[9]][["titles"]]

Slide 17

Slide 17 text

x[[i]]   x[i]   x
from http://r4ds.had.co.nz/vectors.html#lists-of-condiments
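
The difference between the two bracket forms can be sketched with a small hypothetical list (the contents of `x` here are made up, not taken from the talk):

```r
x <- list(a = 1:3, b = "text", c = list(TRUE, 2))

# Single brackets keep the container: the result is a shorter list.
sub <- x["a"]
# Double brackets pull out the element itself.
elt <- x[["a"]]

str(sub)
str(elt)
```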

Slide 18

Slide 18 text

http://blog.codinghorror.com/falling-into-the-pit-o pit of success

Slide 19

Slide 19 text

https://shibumo.wordpress.com gentle hill of striving

Slide 20

Slide 20 text

No content

Slide 21

Slide 21 text

map(.x, .f, ...) purrr::

Slide 22

Slide 22 text

map(.x, .f, ...) for every element of .x apply .f

Slide 23

Slide 23 text

map(.x, .f, ...)
.f has some special shortcuts to make common tasks easy
map(.x, "TEXT")
map(.x, i)
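
A minimal sketch of the two shortcuts, assuming purrr is installed. The `people` list is a hypothetical stand-in for any list of records:

```r
library(purrr)

# Hypothetical list of records.
people <- list(
  list(name = "Ann", age = 31),
  list(name = "Bob", age = 45)
)

# A string extracts the element with that name from each record.
by_name <- map(people, "name")
# A positive integer extracts the element at that position from each record.
by_pos <- map(people, 2)

str(by_name)
str(by_pos)
```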

Slide 24

Slide 24 text

.x = minis

Slide 25

Slide 25 text

map(minis, "pants")

Slide 26

Slide 26 text

go to R

Slide 27

Slide 27 text

map_lgl(.x, .f, ...)
map_int(.x, .f, ...)
map_dbl(.x, .f, ...)
map_chr(.x, .f, ...)
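
These type-specific variants return an atomic vector of the stated type instead of a list. A short sketch, assuming purrr is installed and using a made-up `people` list:

```r
library(purrr)

# Hypothetical list of records.
people <- list(
  list(name = "Ann", age = 31, staff = TRUE),
  list(name = "Bob", age = 45, staff = FALSE)
)

nms   <- map_chr(people, "name")   # character vector
ages  <- map_dbl(people, "age")    # double vector
staff <- map_lgl(people, "staff")  # logical vector
sizes <- map_int(people, length)   # integer vector: fields per record
```

Each variant raises an error if a result cannot be represented as the requested type, which makes surprises visible early.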

Slide 28

Slide 28 text

map_dfr(minis, `[`, c("pants", "torso", "head"))
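
The `minis` object from the talk is not reproduced in this transcript, so the sketch below builds a hypothetical stand-in with the same three fields plus one extra. It assumes purrr and dplyr are installed (`map_dfr()` row-binds via dplyr) and shows `[` being used to keep selected fields from each element before the results are stacked into one data frame:

```r
library(purrr)

# Hypothetical stand-in for the talk's `minis`; the values are invented.
minis <- list(
  list(pants = "blue",  torso = "red",   head = "yellow", antenna = FALSE),
  list(pants = "black", torso = "green", head = "yellow", antenna = FALSE)
)

# `[` keeps only the named fields of each element;
# map_dfr() then binds the per-element results by row.
df <- map_dfr(minis, `[`, c("pants", "torso", "head"))

df
```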

Slide 29

Slide 29 text

If everything is equally easy, everything is equally hard. paraphrasing David Heinemeier Hansson re: Ruby on Rails

Slide 30

Slide 30 text

map(.x, .f, ...)
.f can take many forms
• existing function
• anonymous function
• formula
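
The three forms can be compared side by side. In this sketch (assuming purrr is installed; the input list is made up), all three calls compute the same thing:

```r
library(purrr)

x <- list(1:3, 4:6)

# Existing function, passed by name.
a <- map_dbl(x, mean)
# Anonymous function, written inline.
b <- map_dbl(x, function(v) mean(v))
# Formula: ~ marks it as a function body, .x refers to the current element.
c3 <- map_dbl(x, ~ mean(.x))

a
```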

Slide 31

Slide 31 text

.x = minis

Slide 32

Slide 32 text

map(minis, antennate)

Slide 33

Slide 33 text

library(glue) 
 
 glue_data(
 list(name = "Jenny", born = "in Atlanta"),
 "{name} was born {born}."
 )
 #> Jenny was born in Atlanta. 
 
 glue_data(got_chars[[2]], "{name} was born {born}.")
 #> Tyrion Lannister was born In 273 AC, at Casterly Rock. 
 glue_data(got_chars[[9]], "{name} was born {born}.")
 #> Daenerys Targaryen was born In 284 AC, at Dragonstone.

Slide 34

Slide 34 text

glue_data(got_chars[[9]], "{name} was born {born}.")
~ glue_data(.x, "{name} was born {born}.")
replace your example with .x
prefix with ~ to say "it's a formula!"

Slide 35

Slide 35 text

map_chr(got_chars, ~ glue_data(.x, "{name} was born {born}."))
 #> [1] "Theon Greyjoy was born In 278 AC or 279 AC, at Pyke."
 #> [2] "Tyrion Lannister was born In 273 AC, at Casterly Rock."
 #> [3] "Victarion Greyjoy was born In 268 AC or before, at Pyke."
 #> [4] "Will was born ."
 #> [5] "Areo Hotah was born In 257 AC or before, at Norvos."
 #> [6] "Chett was born At Hag's Mire."
 #> [7] "Cressen was born In 219 AC or 220 AC."
 #> [8] "Arianne Martell was born In 276 AC, at Sunspear."
 #> [9] "Daenerys Targaryen was born In 284 AC, at Dragonstone."
drop-in to any member of the map_*() family

Slide 36

Slide 36 text

name
 stuff
 this is a data frame! a tibble, specifically

Slide 37

Slide 37 text

Why put a list into a data frame? safety & convenience
• Manage multiple vectors holistically
• Use existing toolkit for filter, select, etc.

Slide 38

Slide 38 text

What happens in the data frame
Stays in the data frame

Slide 39

Slide 39 text

last R example: list in a data frame = list-column
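
The R example itself is not part of this transcript. As a rough sketch of what working with a list-column can look like (assuming tibble, dplyr, and purrr are installed; the data are invented), the list travels with its row through the usual data frame verbs:

```r
library(tibble)
library(dplyr)
library(purrr)

# Hypothetical data: `titles` is a list-column with zero or more titles per row.
df <- tibble(
  name   = c("Ann", "Bob", "Cy"),
  titles = list(c("Dr", "Prof"), character(0), "Ser")
)

out <- df %>%
  mutate(n_titles = map_int(titles, length)) %>%  # compute on the list-column
  filter(n_titles > 0)                            # filter rows as usual

out
```

Because the list stays inside the data frame, filtering rows keeps each name aligned with its own list element.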

Slide 40

Slide 40 text

lists are part of life
RStudio Object viewer helps
tibbles are list-friendly
map() functions help you compute on & simplify lists