Glue strings to data with glue

String interpolation, evaluating a variable name to a value within a string, is a feature of many programming languages including Python, Julia, JavaScript, Rust, and most Unix shells. R's `sprintf()` and `paste()` functions provide some of this functionality, but have limitations which make them cumbersome to use. There are also some existing add-on packages with similar functionality; however, each has drawbacks.

The glue package performs robust string interpolation for R. This includes evaluation of variables and arbitrary R code, with a clean and simple syntax. Because it is dependency-free, it is easy to incorporate into packages. In addition, glue provides an extensible interface to perform more complex transformations, such as `glue_sql()` to construct SQL queries with automatically quoted variables.

This talk will show how to use glue to write beautiful code which is easy to read, write, and maintain. We will also discuss how best to use glue when performance is a concern. Finally, we will create custom glue functions tailored to specific use cases, such as JSON construction, colored messages, emoji interpolation, and more.

Jim Hester

July 12, 2018

Transcript

  1. GLUE STRINGS TO DATA WITH GLUE
    rstd.io/glue
    Jim Hester
    @jimhester_
    jimhester
    [email protected]

  2. GLUING STRINGS IS EVERYWHERE
    EXCEL
    =CONCATENATE(A1, " ", B1)
    BASH
    dir=/tmp
    for file in "$dir"/*; do
      cat "$file"
    done
    PYTHON
    apples = 4
    print("I have {a} apples".format(a=apples))
    # Python 3.6+
    print(f"I have {apples} apples")

  3. GLUING STRINGS IN R IS PAINFUL
    apples <- 3
    bananas <- 2
    paste0("Inventory", "\n",
           " Apples: ", apples, "\n",
           " Bananas: ", bananas, "\n",
           "Total: ", apples + bananas)
    ## Inventory
    ## Apples: 3
    ## Bananas: 2
    ## Total: 5
    sprintf("Inventory\n Apples: %i\n Bananas: %i\nTotal: %i",
            apples, bananas, apples + bananas)
    ## Inventory
    ## Apples: 3
    ## Bananas: 2
    ## Total: 5
    if (!file.exists(file)) {
      stop("'", file, "' not found")
    }

  4. GLUE MAKES GLUING STRINGS EASY!
    apples <- 3
    bananas <- 2
    glue("
    Inventory
    Apples: {apples}
    Bananas: {bananas}
    Total: {apples + bananas}")
    ## Inventory
    ## Apples: 3
    ## Bananas: 2
    ## Total: 5

  5. GLUE IS CONVENIENT
    if (TRUE) {
      glue("
        You can indent naturally \\
        and break up long lines \\
        if needed.
        ")
    }
    ## You can indent naturally and break up long lines if needed.

  6. GLUE IS SAFE
    glue("{1:3} will recycle with a length of 1 {letters[1]}")
    ## 1 will recycle with a length of 1 a
    ## 2 will recycle with a length of 1 a
    ## 3 will recycle with a length of 1 a
    glue("{1:3} will recycle with the same length {letters[1:3]}")
    ## 1 will recycle with the same length a
    ## 2 will recycle with the same length b
    ## 3 will recycle with the same length c
    glue("{1:3} will not recycle with inconsistent length {letters[1:2]}")
    ## Error: Variables must be length 1 or 3

  7. GLUE HANDLES MISSINGS
    footies %>% mutate(glue("{first} {middle} {last}")) %>% pull()
    ## Tim Filiga Cahill
    ## Harry NA Kewell
    ## Mark NA Schwarzer
    footies %>% mutate(glue("{first} {middle} {last}", .na = NULL)) %>%
    pull()
    ## Tim Filiga Cahill
    ## NA
    ## NA
    footies %>% mutate(glue("{first} {middle} {last}", .na = "")) %>%
    pull()
    ## Tim Filiga Cahill
    ## Harry Kewell
    ## Mark Schwarzer
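
The slide does not show how `footies` is built. The sketch below is a hypothetical reconstruction: only the names come from the output above, while the column layout and the use of a base `data.frame` with `glue_data()` (instead of `mutate()` and `pull()`) are assumptions made to keep the example dependency-light.

```r
library(glue)

# Hypothetical reconstruction of `footies`; only the names come from the slide.
footies <- data.frame(
  first  = c("Tim", "Harry", "Mark"),
  middle = c("Filiga", NA, NA),
  last   = c("Cahill", "Kewell", "Schwarzer"),
  stringsAsFactors = FALSE
)

# Default: missing values are rendered as the string "NA".
default_na <- glue_data(footies, "{first} {middle} {last}")

# .na = NULL: a missing input makes the whole result missing.
propagated <- glue_data(footies, "{first} {middle} {last}", .na = NULL)

# .na = "": missing values are replaced by an empty string.
blank_na <- glue_data(footies, "{first} {middle} {last}", .na = "")
```

Note that with `.na = ""` the two spaces around the missing middle name both remain in the result, so a follow-up whitespace cleanup may be wanted.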

  8. GLUE IS FOR PACKAGES
    Zero dependencies, tested to R 3.1
    Customizable
    Fast
    abort <- function(..., .sep = "", .envir = parent.frame()) {
      stop(glue(..., .sep = .sep, .envir = .envir), call. = FALSE)
    }
    if (actual != expected) {
    abort("
    Expected content-type:
    * {expected}
    Actual content-type:
    * {actual}
    ")
    }

  9. GLUE IS CUSTOMIZABLE

  10. .OPEN AND .CLOSE CHANGE DELIMITERS
    j_glue <- function(..., .envir = parent.frame()) {
    glue(..., .open = "<", .close = ">", .envir = .envir)
    }
    apples <- 1; bananas <- 2
    json <- j_glue('
    {
      "fruits": {
        "apples": <apples>,
        "bananas": <bananas>
      }
    }')
    jsonlite::fromJSON(json)
    ## $fruits
    ## $fruits$apples
    ## [1] 1
    ##
    ## $fruits$bananas
    ## [1] 2

  11. .TRANSFORMER ARE ROBUST
    shell_transformer <- function(code, envir) {
    shQuote(eval(parse(text = code), envir))
    }
    glue_sh <- function(..., .envir = parent.frame()) {
    glue(..., .envir = .envir, .transformer = shell_transformer)
    }
    filename <- "test"
    writeLines("hello!", filename)
    cmd <- glue_sh("cat {filename}")
    cmd
    ## cat 'test'

  12. .TRANSFORMER ARE USEFUL
    glue_fmt("π: {pi:.7}")
    ## π: 3.1415927
    ratio <- sum(mtcars$gear == 3) / nrow(mtcars)
    glue_fmt("{ratio * 100:.2}% of mtcars have 3 gears")
    ## 46.88% of mtcars have 3 gears
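
The slide does not show how `glue_fmt()` is defined. One possible definition is sketched below, assuming a transformer that treats everything after the last-matched `:` as a `sprintf()` precision for a floating-point value. The helper name `sprintf_transformer` and this `:` convention are assumptions for illustration, and the sketch does not handle expressions that themselves contain a colon (such as `1:3`).

```r
library(glue)

# Assumed helper: split "expr:fmt" into the expression and a sprintf() precision.
sprintf_transformer <- function(text, envir) {
  m <- regexpr(":.+$", text)
  if (m != -1) {
    # Text after the colon, e.g. ".7" from "pi:.7".
    format <- substring(regmatches(text, m), 2)
    # Remove the ":fmt" suffix so only the R expression remains.
    regmatches(text, m) <- ""
    res <- eval(parse(text = text, keep.source = FALSE), envir)
    sprintf(paste0("%", format, "f"), res)
  } else {
    # No format suffix: evaluate the expression unchanged.
    eval(parse(text = text, keep.source = FALSE), envir)
  }
}

glue_fmt <- function(..., .envir = parent.frame()) {
  glue(..., .transformer = sprintf_transformer, .envir = .envir)
}

glue_fmt("pi: {pi:.7}")
## pi: 3.1415927
```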

  13. .TRANSFORMER ARE FUN
    glue_ji("If life gives you :lemon: make :tropical_drink:")
    ## If life gives you make
    glue_ji("Many :hands*: make :light: :work:")
    ## Many make

  14. GLUE IS FAST

  15. TYPES OF SPEED
    Speed of writing
    Speed of recall
    Speed of execution

  16. GLUE IS FAST ON SINGLE STRINGS
    bar <- "bar"
    glue("foo{bar}")

  17. GLUE IS FAST ON LOTS OF STRINGS
    bar <- rep("bar", 100000)
    glue("foo{bar}")

  18. GLUE IS FAST
    { } parsing in C
    Performance dominated by parse(), eval() and paste0()
    Still slow? Vectorize!
    https://glue.tidyverse.org/articles/speed.html
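
As a rough illustration of the "Vectorize!" advice, the sketch below builds the same strings twice: once with one `glue()` call per element and once with a single call over the whole vector. The vector length and the variable names are assumptions for illustration; no timings are claimed here, only that both approaches give the same result.

```r
library(glue)

bar <- rep("bar", 1000)

# One glue() call per element: the template is parsed and evaluated
# 1000 times.
looped <- vapply(
  bar,
  function(b) as.character(glue("foo{b}")),
  character(1),
  USE.NAMES = FALSE
)

# A single vectorized call: the template is parsed once and the
# expression is evaluated once over the whole vector.
vectorized <- as.character(glue("foo{bar}"))

identical(looped, vectorized)
## [1] TRUE
```

Wrapping each version in `system.time()` is one way to compare them on your own machine.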

  19. GLUE CONTAINS MORE THAN GLUE()

  20. GLUE_DATA() BINDS STRINGS TO ANY LIST / ENVIRONMENT
    library(magrittr)
    head(mtcars) %>%
    glue_data("{rownames(.)} has {cyl} cylinders and {hp}hp")
    ## Mazda RX4 has 6 cylinders and 110hp
    ## Mazda RX4 Wag has 6 cylinders and 110hp
    ## Datsun 710 has 4 cylinders and 93hp
    ## Hornet 4 Drive has 6 cylinders and 110hp
    ## Hornet Sportabout has 8 cylinders and 175hp
    ## Valiant has 6 cylinders and 105hp

  21. GLUE_COLLAPSE() COLLAPSES VECTORS
    glue_collapse(1:10, sep = " ")
    ## 1 2 3 4 5 6 7 8 9 10
    glue_collapse(1:10, sep = " ", width = 10)
    ## 1 2 3 4...
    glue_collapse(backtick(1:10), sep = ", ", last = " and ")
    ## `1`, `2`, `3`, `4`, `5`, `6`, `7`, `8`, `9` and `10`

  22. GLUE_SQL() QUOTES SQL STRINGS
    con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
    tbl <- DBI::Id(schema = "xyz", table = "iris")
    glue_sql(
    "SELECT * FROM {`tbl`} WHERE species IN ({vals*})",
    vals = c("setosa", "versicolor"), .con = con)
    ## SELECT * FROM `xyz`.`iris` WHERE species IN ('setosa', 'versicol

  23. STR_GLUE() IS IN tidyverse
    www.rstudio.com
    library(tidyverse)
    str_glue("
    glue version: {packageVersion('glue')}
    tidyverse version: {packageVersion('tidyverse')}
    ")
    ## glue version: 1.2.0.9000
    ## tidyverse version: 1.2.1

  24. GLUE IS POPULAR???

  25. 60 reverse dependencies
    Monthly downloads: 419K/month
    A dependency of dplyr, stringr, and tidyr, so it may already be installed

  26. install.packages("glue")
    glue.tidyverse.org
    adjective <- c("luminous", "stylish", "super", "striking",
    "impressive", "fantastic")
    glue("Have a {sample(adjective, 1)} day!")
    ## Have a impressive day!
      
    @jimhester_ jimhester [email protected]
