Glue strings to data with glue

String interpolation, evaluating a variable name to a value within a string, is a feature of many programming languages including Python, Julia, JavaScript, Rust, and most Unix shells. R's `sprintf()` and `paste()` functions provide some of this functionality, but have limitations which make them cumbersome to use. There are also some existing add-on packages with similar functionality; however, each has drawbacks.

The glue package performs robust string interpolation for R. This includes evaluation of variables and arbitrary R code, with a clean and simple syntax. Because it is dependency-free, it is easy to incorporate into packages. In addition, glue provides an extensible interface to perform more complex transformations, such as `glue_sql()` to construct SQL queries with automatically quoted variables.

This talk will show how to use glue to write beautiful code which is easy to read, write and maintain. We will also discuss ways to best use glue when performance is a concern. Finally, we will create custom glue functions tailored towards specific use cases, such as JSON construction, colored messages, emoji interpolation and more.

Jim Hester

July 12, 2018

Transcript

  1. rstd.io/glue GLUE STRINGS TO DATA WITH GLUE

    Jim Hester    @jimhester_ jimhester
  2. GLUING STRINGS IS EVERYWHERE

    EXCEL

    =CONCATENATE(A1, " ", B1)

    BASH

    dir=/tmp
    for file in $dir/*; do
      cat $dir/$file
    done

    PYTHON

    apples = 4
    print("I have {a} apples".format(a=apples))
    # Python 3.6+
    print(f"I have {apples} apples")
  3. GLUING STRINGS IN R IS PAINFUL

    apples <- 3
    bananas <- 2
    paste0("Inventory", "\n",
      "  Apples: ", apples, "\n",
      "  Bananas: ", bananas, "\n",
      "Total: ", apples + bananas)
    ## Inventory
    ##   Apples: 3
    ##   Bananas: 2
    ## Total: 5

    sprintf("Inventory\n  Apples: %i\n  Bananas: %i\nTotal: %i",
      apples, bananas, apples + bananas)
    ## Inventory
    ##   Apples: 3
    ##   Bananas: 2
    ## Total: 5

    if (!file.exists(file)) {
      stop("'", file, "' not found")
    }
  4. GLUE MAKES GLUING STRINGS EASY!

    apples <- 3
    bananas <- 2
    glue("
      Inventory
        Apples: {apples}
        Bananas: {bananas}
      Total: {apples + bananas}")
    ## Inventory
    ##   Apples: 3
    ##   Bananas: 2
    ## Total: 5
  5. GLUE IS CONVENIENT

    if (TRUE) {
      glue("
        You can indent naturally \\
        and break up long lines \\
        if needed.
        ")
    }
    ## You can indent naturally and break up long lines if needed.
  6. GLUE IS SAFE

    glue("{1:3} will recycle with a length of 1 {letters[1]}")
    ## 1 will recycle with a length of 1 a
    ## 2 will recycle with a length of 1 a
    ## 3 will recycle with a length of 1 a

    glue("{1:3} will recycle with the same length {letters[1:3]}")
    ## 1 will recycle with the same length a
    ## 2 will recycle with the same length b
    ## 3 will recycle with the same length c

    glue("{1:3} will not recycle with inconsistent length {letters[1:2]}")
    ## Error: Variables must be length 1 or 3
  7. GLUE HANDLES MISSINGS

    footies %>%
      mutate(glue("{first} {middle} {last}")) %>%
      pull()
    ## Tim Filiga Cahill
    ## Harry NA Kewell
    ## Mark NA Schwarzer

    footies %>%
      mutate(glue("{first} {middle} {last}", .na = NULL)) %>%
      pull()
    ## Tim Filiga Cahill
    ## NA
    ## NA

    footies %>%
      mutate(glue("{first} {middle} {last}", .na = "")) %>%
      pull()
    ## Tim Filiga Cahill
    ## Harry  Kewell
    ## Mark  Schwarzer
  8. GLUE IS FOR PACKAGES

    Zero dependencies, tested to R 3.1
    Customizable
    Fast

    abort <- function(..., .sep = "", .envir = parent.frame()) {
      stop(glue(..., .envir = .envir), call. = FALSE)
    }

    if (actual != expected) {
      abort("
        Expected content-type:
          * {expected}
        Actual content-type:
          * {actual}
        ")
    }
  9. .OPEN AND .CLOSE CHANGE DELIMITERS

    j_glue <- function(..., .envir = parent.frame()) {
      glue(..., .open = "<", .close = ">", .envir = .envir)
    }

    apples <- 1; bananas <- 2
    json <- j_glue('
      {
        "fruits": {
          "apples": <apples>,
          "bananas": <bananas>
        }
      }')
    jsonlite::fromJSON(json)
    ## $fruits
    ## $fruits$apples
    ## [1] 1
    ##
    ## $fruits$bananas
    ## [1] 2
  10. .TRANSFORMERS ARE ROBUST

    shell_transformer <- function(code, envir) {
      shQuote(eval(parse(text = code), envir))
    }

    glue_sh <- function(..., .envir = parent.frame()) {
      glue(..., .envir = .envir, .transformer = shell_transformer)
    }

    filename <- "test"
    writeLines("hello!", filename)
    cmd <- glue_sh("cat {filename}")
    cmd
    ## cat 'test'
  11. .TRANSFORMERS ARE USEFUL

    glue_fmt("π: {pi:.7}")
    ## π: 3.1415927

    ratio <- sum(mtcars$gear == 3) / nrow(mtcars)
    glue_fmt("{ratio * 100:.2}% of mtcars have 3 gears")
    ## 46.88% of mtcars have 3 gears
  12. .TRANSFORMERS ARE FUN

    glue_ji("If life gives you :lemon: make :tropical_drink:")
    ## If life gives you 🍋 make 🍹

    glue_ji("Many :hands*: make :light: :work:")
    ## Many make
  13. GLUE IS FAST ON SINGLE STRINGS

    bar <- "bar"
    glue("foo{bar}")
  14. GLUE IS FAST ON LOTS OF STRINGS

    bar <- rep("bar", 100000)
    glue("foo{bar}")
  15. GLUE IS FAST

    { } parsing is done in C.
    Performance is dominated by parse(), eval() and paste0().
    Still slow? Vectorize!
    https://glue.tidyverse.org/articles/speed.html
  16. GLUE_DATA() BINDS STRINGS TO ANY LIST / ENVIRONMENT

    library(magrittr)
    head(mtcars) %>% glue_data("{rownames(.)} has {cyl} cylinders and {hp}hp")
    ## Mazda RX4 has 6 cylinders and 110hp
    ## Mazda RX4 Wag has 6 cylinders and 110hp
    ## Datsun 710 has 4 cylinders and 93hp
    ## Hornet 4 Drive has 6 cylinders and 110hp
    ## Hornet Sportabout has 8 cylinders and 175hp
    ## Valiant has 6 cylinders and 105hp
  17. GLUE_COLLAPSE() COLLAPSES VECTORS

    glue_collapse(1:10, sep = " ")
    ## 1 2 3 4 5 6 7 8 9 10

    glue_collapse(1:10, sep = " ", width = 10)
    ## 1 2 3 4...

    glue_collapse(backtick(1:10), sep = ", ", last = " and ")
    ## `1`, `2`, `3`, `4`, `5`, `6`, `7`, `8`, `9` and `10`
  18. GLUE_SQL() QUOTES SQL STRINGS

    con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
    tbl <- DBI::Id(schema = "xyz", table = "iris")
    glue_sql(
      "SELECT * FROM {`tbl`} WHERE species IN ({vals*})",
      vals = c("setosa", "versicolor"), .con = con)
    ## <SQL> SELECT * FROM `xyz`.`iris` WHERE species IN ('setosa', 'versicol
  19. STR_GLUE() IS IN tidyverse

    library(tidyverse)
    str_glue("
      glue version: {packageVersion('glue')}
      tidyverse version: {packageVersion('tidyverse')}
      ")
    ## glue version: 1.2.0.9000
    ## tidyverse version: 1.2.1
  20. install.packages("glue")

    glue.tidyverse.org

    adjective <- c("luminous", "stylish", "super", "striking", "impressive", "fantastic")
    glue("Have a {sample(adjective, 1)} day!")
    ## Have a impressive day!

    @jimhester_ jimhester
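
Slide 11 calls glue_fmt() without showing its definition. Following the same pattern as glue_sh() on slide 10, one plausible way to build it is a transformer that splits off a trailing ":.N" suffix and passes the evaluated value to sprintf(). This is only a sketch, not necessarily the definition used in the talk: the name sprintf_transformer and the exact suffix pattern are assumptions made for illustration.

```r
library(glue)

# Hypothetical transformer: treats a trailing ":.N" as a request for
# N decimal places and formats the evaluated expression with sprintf().
sprintf_transformer <- function(text, envir) {
  m <- regexpr(":\\.[0-9]+$", text)
  if (m != -1) {
    # Drop the leading ":" to get the precision part, e.g. ".7"
    format <- substring(regmatches(text, m), 2)
    # Remove the suffix so only the R expression is left
    regmatches(text, m) <- ""
    res <- eval(parse(text = text, keep.source = FALSE), envir)
    sprintf(paste0("%", format, "f"), res)
  } else {
    # No format suffix: evaluate the expression as glue() normally would
    eval(parse(text = text, keep.source = FALSE), envir)
  }
}

# Assumed wrapper, mirroring glue_sh() from slide 10
glue_fmt <- function(..., .envir = parent.frame()) {
  glue(..., .envir = .envir, .transformer = sprintf_transformer)
}

glue_fmt("pi: {pi:.7}")
#> pi: 3.1415927
```

Because the pattern requires a dot after the colon, an ordinary range such as {1:3} is evaluated unchanged. Expressions that themselves end in something like ":.5" would be misread as a format request, which is a limitation of this sketch.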