Glue strings to data with glue

String interpolation, evaluating a variable name to a value within a string, is a feature of many programming languages, including Python, Julia, JavaScript, Rust, and most Unix shells. R's `sprintf()` and `paste()` functions provide some of this functionality, but have limitations that make them cumbersome to use. There are also some existing add-on packages with similar functionality; however, each has drawbacks.

The glue package performs robust string interpolation for R. This includes evaluation of variables and arbitrary R code, with a clean and simple syntax. Because it is dependency-free, it is easy to incorporate into packages. In addition, glue provides an extensible interface to perform more complex transformations, such as `glue_sql()` to construct SQL queries with automatically quoted variables.

This talk will show how to use glue to write beautiful code that is easy to read, write, and maintain. We will also discuss how best to use glue when performance is a concern. Finally, we will create custom glue functions tailored to specific use cases, such as JSON construction, colored messages, emoji interpolation, and more.

Jim Hester

July 12, 2018

Transcript

  1. GLUE STRINGS TO DATA WITH GLUE

    rstd.io/glue
    Jim Hester    @jimhester_ jimhester jim.hester@rstudio.com
  2. GLUING STRINGS IS EVERYWHERE

    EXCEL
    =CONCATENATE(A1, " ", B1)

    BASH
    dir=/tmp
    for file in $dir/*; do
      cat $dir/$file
    done

    PYTHON
    apples = 4
    print("I have {a} apples".format(a=apples))
    # Python 3.6+
    print(f"I have {apples} apples")
  3. GLUING STRINGS IN R IS PAINFUL

    apples <- 3
    bananas <- 2
    paste0("Inventory", "\n",
      " Apples: ", apples, "\n",
      " Bananas: ", bananas, "\n",
      "Total: ", apples + bananas)
    ## Inventory
    ## Apples: 3
    ## Bananas: 2
    ## Total: 5

    sprintf("Inventory\n Apples: %i\n Bananas: %i\nTotal: %i",
      apples, bananas, apples + bananas)
    ## Inventory
    ## Apples: 3
    ## Bananas: 2
    ## Total: 5

    if (!file.exists(file)) {
      stop("'", file, "' not found")
    }
  4. GLUE MAKES GLUING STRINGS EASY!

    apples <- 3
    bananas <- 2
    glue("
      Inventory
       Apples: {apples}
       Bananas: {bananas}
      Total: {apples + bananas}")
    ## Inventory
    ## Apples: 3
    ## Bananas: 2
    ## Total: 5
  5. GLUE IS CONVENIENT

    if (TRUE) {
      glue("
        You can indent naturally \\
        and break up long lines \\
        if needed.
        ")
    }
    ## You can indent naturally and break up long lines if needed.
  6. GLUE IS SAFE

    glue("{1:3} will recycle with a length of 1 {letters[1]}")
    ## 1 will recycle with a length of 1 a
    ## 2 will recycle with a length of 1 a
    ## 3 will recycle with a length of 1 a

    glue("{1:3} will recycle with the same length {letters[1:3]}")
    ## 1 will recycle with the same length a
    ## 2 will recycle with the same length b
    ## 3 will recycle with the same length c

    glue("{1:3} will not recycle with inconsistent length {letters[1:2]}")
    ## Error: Variables must be length 1 or 3
  7. GLUE HANDLES MISSINGS

    footies %>%
      mutate(glue("{first} {middle} {last}")) %>%
      pull()
    ## Tim Filiga Cahill
    ## Harry NA Kewell
    ## Mark NA Schwarzer

    footies %>%
      mutate(glue("{first} {middle} {last}", .na = NULL)) %>%
      pull()
    ## Tim Filiga Cahill
    ## NA
    ## NA

    footies %>%
      mutate(glue("{first} {middle} {last}", .na = "")) %>%
      pull()
    ## Tim Filiga Cahill
    ## Harry Kewell
    ## Mark Schwarzer
  8. GLUE IS FOR PACKAGES

    Zero dependencies, tested to R 3.1
    Customizable
    Fast

    abort <- function(..., .sep = "", .envir = parent.frame()) {
      stop(glue(..., .envir = .envir), call. = FALSE)
    }

    if (actual != expected) {
      abort("
        Expected content-type:
          * {expected}
        Actual content-type:
          * {actual}
        ")
    }
  9. GLUE IS CUSTOMIZABLE

  10. .OPEN AND .CLOSE CHANGE DELIMITERS

    j_glue <- function(..., .envir = parent.frame()) {
      glue(..., .open = "<", .close = ">", .envir = .envir)
    }

    apples <- 1; bananas <- 2
    json <- j_glue('
      {
        "fruits": {
          "apples": <apples>,
          "bananas": <bananas>
        }
      }')
    jsonlite::fromJSON(json)
    ## $fruits
    ## $fruits$apples
    ## [1] 1
    ##
    ## $fruits$bananas
    ## [1] 2
  11. .TRANSFORMERS ARE ROBUST

    shell_transformer <- function(code, envir) {
      shQuote(eval(parse(text = code), envir))
    }

    glue_sh <- function(..., .envir = parent.frame()) {
      glue(..., .envir = .envir, .transformer = shell_transformer)
    }

    filename <- "test"
    writeLines("hello!", filename)
    cmd <- glue_sh("cat {filename}")
    cmd
    ## cat 'test'
  12. .TRANSFORMERS ARE USEFUL

    glue_fmt("π: {pi:.7}")
    ## π: 3.1415927

    ratio <- sum(mtcars$gear == 3) / nrow(mtcars)
    glue_fmt("{ratio * 100:.2}% of mtcars have 3 gears")
    ## 46.88% of mtcars have 3 gears
  13. .TRANSFORMERS ARE FUN

    glue_ji("If life gives you :lemon: make :tropical_drink:")
    ## If life gives you make

    glue_ji("Many :hands*: make :light: :work:")
    ## Many make
  14. GLUE IS FAST

  15. TYPES OF SPEED

    Speed of writing
    Speed of recall
    Speed of execution
  16. GLUE IS FAST ON SINGLE STRINGS

    bar <- "bar"
    glue("foo{bar}")
  17. GLUE IS FAST ON LOTS OF STRINGS

    bar <- rep("bar", 100000)
    glue("foo{bar}")
  18. GLUE IS FAST

    { } parsing in C
    performance dominated by parse(), eval() and paste0()
    still slow? Vectorize!
    https://glue.tidyverse.org/articles/speed.html
  19. GLUE CONTAINS MORE THAN GLUE()
  20. GLUE_DATA() BINDS STRINGS TO ANY LIST / ENVIRONMENT

    library(magrittr)
    head(mtcars) %>%
      glue_data("{rownames(.)} has {cyl} cylinders and {hp}hp")
    ## Mazda RX4 has 6 cylinders and 110hp
    ## Mazda RX4 Wag has 6 cylinders and 110hp
    ## Datsun 710 has 4 cylinders and 93hp
    ## Hornet 4 Drive has 6 cylinders and 110hp
    ## Hornet Sportabout has 8 cylinders and 175hp
    ## Valiant has 6 cylinders and 105hp
  21. GLUE_COLLAPSE() COLLAPSES VECTORS

    glue_collapse(1:10, sep = " ")
    ## 1 2 3 4 5 6 7 8 9 10

    glue_collapse(1:10, sep = " ", width = 10)
    ## 1 2 3 4...

    glue_collapse(backtick(1:10), sep = ", ", last = " and ")
    ## `1`, `2`, `3`, `4`, `5`, `6`, `7`, `8`, `9` and `10`
  22. GLUE_SQL() QUOTES SQL STRINGS

    con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
    tbl <- DBI::Id(schema = "xyz", table = "iris")
    glue_sql(
      "SELECT * FROM {`tbl`} WHERE species IN ({vals*})",
      vals = c("setosa", "versicolor"), .con = con)
    ## <SQL> SELECT * FROM `xyz`.`iris` WHERE species IN ('setosa', 'versicol
  23. STR_GLUE() IS IN tidyverse

    library(tidyverse)
    str_glue("
      glue version: {packageVersion('glue')}
      tidyverse version: {packageVersion('tidyverse')}
      ")
    ## glue version: 1.2.0.9000
    ## tidyverse version: 1.2.1
  24. GLUE IS POPULAR???

  25. 60 reverse dependencies

    Monthly downloads: 419K/month
    dplyr, stringr, tidyr dependency, maybe already installed
  26. install.packages("glue")

    glue.tidyverse.org

    adjective <- c("luminous", "stylish", "super", "striking", "impressive", "fantastic")
    glue("Have a {sample(adjective, 1)} day!")
    ## Have a impressive day!

    @jimhester_ jimhester jim.hester@rstudio.com
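
Slide 12 calls `glue_fmt()` without showing how it is defined. The following is a minimal sketch of how such a function could be built on the `.transformer` argument, following the same pattern as `glue_sh()` on slide 11. The helper name `fmt_transformer` is hypothetical, and the choice of the `sprintf()` `f` conversion is an assumption inferred from the output on the slide. It may not match the definition used in the talk.

```r
library(glue)

# Hypothetical transformer: treat "expr:fmt" as "evaluate expr, then format
# the value with sprintf('%<fmt>f')". The trailing "f" is an assumption.
fmt_transformer <- function(text, envir) {
  # Naive split on the last ":" that is not followed by another ":";
  # expressions that themselves end in something like "pkg::name" will confuse it.
  m <- regexpr(":[^:]+$", text)
  pos <- as.integer(m)
  if (pos == -1L) {
    # No format suffix: evaluate the expression unchanged.
    return(eval(parse(text = text, keep.source = FALSE), envir))
  }
  fmt <- substring(regmatches(text, m), 2)   # drop the leading ":"
  expr <- substring(text, 1, pos - 1L)       # everything before the ":"
  value <- eval(parse(text = expr, keep.source = FALSE), envir)
  sprintf(paste0("%", fmt, "f"), value)
}

# Wrapper in the style of glue_sh() from slide 11.
glue_fmt <- function(..., .envir = parent.frame()) {
  glue(..., .envir = .envir, .transformer = fmt_transformer)
}

glue_fmt("pi: {pi:.7}")
## pi: 3.1415927
```

Because the transformer receives the raw text between the delimiters, the format suffix never reaches the R parser. This is what lets `{pi:.7}` work even though `pi:.7` on its own is not the expression being evaluated.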