The DESIGN
of EVERYDAY
FUNCTIONS
Hadley Wickham
@hadleywickham
RStudio
September 2018
Slide 2
Practitioner Programmer
Slide 3
Interactive
Easily detect & resolve problems
Packaged
In production
You hear your code Things break and
people at you
Practitioner Programmer
Slide 4
Notebook IDE
https://yihui.name/en/2018/09/notebook-war/
The First Notebook War
Data analyst Engineer
Slide 5
Me
You
Practitioner Programmer
Slide 6
Code is a conversation
Ambiguity can be tolerated
early and often
Practitioner Programmer
Implicit Explicit
Slide 7
Practitioner Programmer
Implicit Explicit
Slide 8
What makes a
good door?
https://99percentinvisible.org/article/norman-doors
Slide 10
“When you have trouble with things—whether
it’s figuring out whether to push or pull a door or
the arbitrary vagaries of the modern computer
and electronics industries—it’s not your fault.
Don't blame yourself: blame the designer...”
— Donald A. Norman
“Rule of thumb: if you think
something is clever and
sophisticated, beware — it is
probably self-indulgence.”
— Donald A. Norman
Slide 19
WAT makes a bad
function?
https://www.destroyallsoftware.com/talks/wat
Slide 20
c(factor("a"), factor("b"))
What happens when you combine two factors?
Slide 21
c(factor("a"), factor("b"))
#> [1] 1 1
What happens when you combine two factors?
WAT!
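One way to sidestep this surprise, sketched with base R only: convert each factor to character before combining, then rebuild the factor. The variable names are illustrative. Note that R 4.1.0 and later give c() a method for factors, so the integer result on the slide reflects the older versions of R in use at the time.

```r
a <- factor("a")
b <- factor("b")

# Combine the labels rather than the underlying integer codes,
# then rebuild a factor from the combined labels
combined <- factor(c(as.character(a), as.character(b)))

levels(combined)        # "a" "b"
as.character(combined)  # "a" "b"
```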
Slide 22
today <- as.Date("2018-09-18")
lunch <- as.POSIXct("2018-09-18 12:00",
tz = "Europe/Belgrade")
c(today, lunch)
What happens when you combine a date and a date-time?
Slide 23
today <- as.Date("2018-09-18")
lunch <- as.POSIXct("2018-09-18 12:00",
tz = "Europe/Belgrade")
c(today, lunch)
#> [1] "2018-09-18"
#> [2] "4210927-01-24"
What happens when you combine a date and a date-time?
WAT!
Slide 24
today <- as.Date("2018-09-18")
lunch <- as.POSIXct("2018-09-18 12:00",
tz = "Europe/Belgrade")
c(lunch, today)
What happens when you combine a date and a date-time?
Slide 25
today <- as.Date("2018-09-18")
lunch <- as.POSIXct("2018-09-18 12:00",
tz = "Europe/Belgrade")
c(lunch, today)
#> [1] "2018-09-18 05:00:00 CDT"
#> [2] "1969-12-31 22:56:32 CST"
What happens if you touch a date-time the wrong way?
WAT!
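Both orderings go wrong because c() reuses the underlying number (days for a date, seconds for a date-time) under the class of the first argument. A minimal sketch of one workaround, assuming you are willing to convert to a single class by hand before combining; treating the date as midnight in Belgrade is an arbitrary choice made for this illustration.

```r
today <- as.Date("2018-09-18")
lunch <- as.POSIXct("2018-09-18 12:00", tz = "Europe/Belgrade")

# Option 1: treat everything as dates, stating the time zone
# used to decide which calendar day the date-time falls on
as_dates <- c(today, as.Date(lunch, tz = "Europe/Belgrade"))
format(as_dates)

# Option 2: treat everything as date-times, stating which instant
# the date stands for (here: midnight in Belgrade, an assumption)
today_dttm <- as.POSIXct(format(today), tz = "Europe/Belgrade")
as_dttms <- c(today_dttm, lunch)
as.numeric(as_dttms)  # seconds since 1970-01-01 UTC
```

Either way, the conversion is explicit in the code instead of depending on argument order.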
Slide 26
lunch <- as.POSIXct("2018-09-18 12:00",
tz = "Europe/Belgrade")
lunch
#> [1] "2018-09-18 12:00:00 CEST"
c(lunch)
c(NULL, lunch)
What happens if you look at a date-time the wrong way?
Slide 27
lunch <- as.POSIXct("2018-09-18 12:00",
tz = "Europe/Belgrade")
lunch
#> [1] "2018-09-18 12:00:00 CEST"
c(lunch)
#> [1] "2018-09-18 05:00:00 CDT"
c(NULL, lunch)
#> [1] 1537264800
What happens if you look at a date-time the wrong way?
WAT!
WAT!!
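The bare number returned by c(NULL, lunch) is the count of seconds since 1970-01-01 UTC, so nothing is lost except the class and the time zone. A small base R sketch of putting both back by hand; the names secs and restored are illustrative.

```r
lunch <- as.POSIXct("2018-09-18 12:00", tz = "Europe/Belgrade")

# The underlying representation: seconds since 1970-01-01 UTC
secs <- as.numeric(lunch)

# Rebuild the date-time, stating the time zone explicitly
restored <- as.POSIXct(secs, origin = "1970-01-01", tz = "Europe/Belgrade")
format(restored, "%Y-%m-%d %H:%M", usetz = TRUE)
```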
Slide 28
What makes (one aspect of) a good
function?
Slide 29
c(<factor>, <factor>) -> <integer>
c(<POSIXct<tz>>) -> <POSIXct>
c(NULL, <POSIXct<tz>>) -> <double>
Types can give us a high-level overview of a function
I’ll put types in angle brackets to make
clear that this is not real R code
Slide 30
To do that we need to review some foundations
Atomic
Numeric
Logical Integer Double Character
For atomic vectors, the rules are simple
           Logical    Integer    Double     Character
Logical    logical    integer    double     character
Integer    integer    integer    double     character
Double     double     double     double     character
Character  character  character  character  character
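The table above can be reproduced directly in R. This sketch combines one example value of each atomic type with every other and records typeof() of the result; the names protos and tbl are illustrative.

```r
# One example value per atomic type
protos <- list(logical = TRUE, integer = 1L, double = 1.5, character = "a")

# For every pair (x, y), record the type that c(x, y) produces
tbl <- sapply(protos, function(y) {
  sapply(protos, function(x) typeof(c(x, y)))
})

tbl["logical", "integer"]   # "integer"
tbl["integer", "double"]    # "double"
tbl["double", "character"]  # "character"
```

The result is symmetric: for atomic vectors the output type does not depend on argument order.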
Slide 33
For atomic vectors, the rules are simple
Logical
Integer
Double
Character
Even if you’ve never explicitly
learned this, I think you
internalise it quickly.
Slide 34
Unfortunately c() breaks down when we get to S3 vectors
Atomic
Numeric
Logical Integer Double Character
factor POSIXct Date
S3 vectors
Slide 35
Figuring out the rules is the goal of the vctrs package
http://vctrs.r-lib.org
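As a rough illustration of where the package ended up, released versions of vctrs provide vec_c(), which combines two factors into a factor rather than an integer vector. The sketch assumes vctrs is installed and skips itself otherwise; the behaviour shown is that of the released package, which may differ from the early version discussed in this talk.

```r
# Runs only if vctrs is available (an assumption of this sketch)
if (requireNamespace("vctrs", quietly = TRUE)) {
  combined <- vctrs::vec_c(factor("a"), factor("b"))
  print(class(combined))   # a factor, not an integer vector
  print(levels(combined))  # the union of both sets of levels
}
```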
Slide 36
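mutate(<data frame>) -> <data frame>
filter(<data frame>) -> <data frame>
select(<data frame>) -> <data frame>
arrange(<data frame>) -> <data frame>
summarise(<data frame>) -> <data frame>
group_by(<data frame>) -> <data frame>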
The types of the primary dplyr functions are simple
Slide 37
if_else(, , ) ->
if_else(, , ) ->
if_else(, , ) -> ???
if_else(, , ) -> ???
if_else(, , ) -> ???
if_else(, , ) -> ???
But there are a few that are more complex
Slide 38
x <- runif(6)
if_else(x > 0.5, x, NA)
#> Error: `false` must be type double,
#> not logical
Which leads to this annoyance
Slide 39
x <- runif(6)
if_else(x > 0.5, x, NA)
#> Error: `false` must be type double,
#> not logical
if_else(x > 0.5, x, NA_real_)
#> [1] NA 0.700 0.557 NA NA NA
Which leads to this annoyance
You’re currently forced to learn
about the “typed” NAs
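For reference, a short base R sketch of what the "typed" NAs are: a plain NA is logical, and each other atomic type has its own missing-value constant. The last lines show that base ifelse() coerces silently where the if_else() call on the slide raised an error; x and relaxed are illustrative names.

```r
typeof(NA)             # "logical"
typeof(NA_integer_)    # "integer"
typeof(NA_real_)       # "double"
typeof(NA_character_)  # "character"

x <- c(0.2, 0.7, 0.9)

# Base ifelse() accepts the logical NA and coerces it to double
relaxed <- ifelse(x > 0.5, x, NA)
typeof(relaxed)        # "double"
```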
Slide 40
if_else(, , ) ->
vec_c(, ) ->
if_else(, , ) ->
vec_c(, ) ->
I think I'm starting to see the principles
Slide 41
Fin
Slide 42
Practitioner Programmer
Implicit Explicit
Interactive
Easily detect & resolve problems
Packaged
In production
Code is a conversation
Ambiguity can be tolerated early and often