dplyr episode 9: summarise() of the vctrs

Lightning talk given Karaoke style as part of the #tidyverse team meetup, co-hosted by RLadies Boston and Boston useR group. https://www.meetup.com/fr-FR/rladies-boston/events/265179803/

Romain François

October 29, 2019

Transcript

  1. 2.
  2. Packing

    quantibble <- function(x, ...) {
      tibble(!!!quantile(x, ...))
    }

    quantibble(iris$Sepal.Length)
    #> # A tibble: 1 x 5
    #>    `0%` `25%` `50%` `75%` `100%`
    #>   <dbl> <dbl> <dbl> <dbl>  <dbl>
    #> 1   4.3   5.1   5.8   6.4    7.9

    iris %>%
      group_by(Species) %>%
      summarise(q = quantibble(Sepal.Length))
    #> # A tibble: 3 x 2
    #>   Species    q$`0%` $`25%` $`50%` $`75%` $`100%`
    #>   <fct>       <dbl>  <dbl>  <dbl>  <dbl>   <dbl>
    #> 1 setosa        4.3   4.8     5      5.2     5.8
    #> 2 versicolor    4.9   5.6     5.9    6.3     7
    #> 3 virginica     4.9   6.22    6.5    6.9     7.9
  3. Auto splice

    quantibble <- function(x, ...) {
      tibble(!!!quantile(x, ...))
    }

    iris %>%
      group_by(Species) %>%
      summarise(quantibble(Sepal.Length))
    #> # A tibble: 3 x 6
    #>   Species     `0%` `25%` `50%` `75%` `100%`
    #>   <fct>      <dbl> <dbl> <dbl> <dbl>  <dbl>
    #> 1 setosa       4.3  4.8    5     5.2    5.8
    #> 2 versicolor   4.9  5.6    5.9   6.3    7
    #> 3 virginica    4.9  6.22   6.5   6.9    7.9
  4. across(): 1 function

    iris %>%
      group_by(Species) %>%
      summarise(across(starts_with("Sepal"), mean))
    #> # A tibble: 3 x 3
    #>   Species    Sepal.Length Sepal.Width
    #>   <fct>             <dbl>       <dbl>
    #> 1 setosa             5.01        3.43
    #> 2 versicolor         5.94        2.77
    #> 3 virginica          6.59        2.97
  5. across(): function list, "packed" by function

    iris %>%
      group_by(Species) %>%
      summarise(
        across(starts_with("Sepal"), list(mean = mean, sd = sd))
      )
    #> # A tibble: 3 x 3
    #>   Species    mean$Sepal.Length $Sepal.Width sd$Sepal.Length $Sepal.Width
    #>   <fct>                  <dbl>        <dbl>           <dbl>        <dbl>
    #> 1 setosa                  5.01         3.43           0.352        0.379
    #> 2 versicolor              5.94         2.77           0.516        0.314
    #> 3 virginica               6.59         2.97           0.636        0.322
  6. across() + tidyr::unpack()

    iris %>%
      group_by(Species) %>%
      summarise(
        across(starts_with("Sepal"), list(mean = mean, sd = sd))
      ) %>%
      tidyr::unpack(c(mean, sd), names_sep = "_")
    #> # A tibble: 3 x 5
    #>   Species  mean_Sepal.Leng… mean_Sepal.Width sd_Sepal.Length sd_Sepal.Width
    #>   <fct>               <dbl>            <dbl>           <dbl>          <dbl>
    #> 1 setosa               5.01             3.43           0.352          0.379
    #> 2 versico…             5.94             2.77           0.516          0.314
    #> 3 virgini…             6.59             2.97           0.636          0.322
  7. across(): manual packing, single function returning a data frame

    iris %>%
      group_by(Species) %>%
      summarise(
        across(starts_with("Sepal"), ~ tibble(mean = mean(.x), sd = sd(.x)))
      )
    #> # A tibble: 3 x 3
    #>   Species    Sepal.Length$mean   $sd Sepal.Width$mean   $sd
    #>   <fct>                  <dbl> <dbl>            <dbl> <dbl>
    #> 1 setosa                  5.01 0.352             3.43 0.379
    #> 2 versicolor              5.94 0.516             2.77 0.314
    #> 3 virginica               6.59 0.636             2.97 0.322
  8. across(): single function

    iris %>%
      group_by(Species) %>%
      summarise(
        across(starts_with("Sepal"), ~ quantibble(.x, probs = c(.25, .5, .75)))
      )
    #> # A tibble: 3 x 3
    #>   Species   Sepal.Length$`25%` $`50%` $`75%` Sepal.Width$`25… $`50%` $`75%`
    #>   <fct>                  <dbl>  <dbl>  <dbl>            <dbl>  <dbl>  <dbl>
    #> 1 setosa                  4.8     5      5.2             3.2     3.4   3.68
    #> 2 versicol…               5.6     5.9    6.3             2.52    2.8   3
    #> 3 virginica               6.22    6.5    6.9             2.8     3     3.18
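
The slides use "packed" for a data frame stored as a single column of another data frame, which is why the packed outputs above report fewer columns than they display. The following is a minimal sketch of that idea outside of summarise(), assuming the tibble and tidyr (1.0.0 or later) packages are installed. The names `packed` and `flat` are illustrative only and do not come from the talk; the values are the setosa and versicolor 25% and 75% quantiles from the output above.

```r
library(tibble)
library(tidyr)

# Build a packed column by hand: `q` is itself a tibble with two columns.
# (Illustrative names; values copied from the quantile output on the slides.)
packed <- tibble(
  Species = c("setosa", "versicolor"),
  q = tibble(`25%` = c(4.8, 5.6), `75%` = c(5.2, 6.3))
)

# The packed column counts as one column of the outer tibble.
ncol(packed)
names(packed$q)

# tidyr::unpack() flattens it; names_sep prefixes the outer column name.
flat <- unpack(packed, q, names_sep = "_")
names(flat)
```

Keeping the columns packed preserves the grouping by outer name, while unpack() is convenient once a flat table is needed, as on the across() + tidyr::unpack() slide.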