Davis Vaughan
October 08, 2019
490

# slide-almanac.pdf

A tour of two new R packages, {slide} and {almanac}.

{slide} empowers you to perform arbitrary rolling computations, like cumulative functions, rolling averages, and rolling regressions.

{almanac} provides the tools to construct recurrence rules and schedules, and then adjust dates relative to them. This allows you to shift dates by "3 business days".

Together, you can perform computations that are highly relevant in a business setting, such as "a rolling average over the past 20 business days".

October 08, 2019

## Transcript

1. Moving Averages and Calendars
Davis Vaughan
@dvaughan32
Software Engineer, RStudio
October 2019

2. Window Functions
Schedules

3. Window What?

4. speakerdeck.com/davisvaughan/slide-almanac

5. speakerdeck.com/davisvaughan/slide-almanac
function applied
=
mean()

6. speakerdeck.com/davisvaughan/slide-almanac
function applied
=
mean()
function applied
=
sd()

7. Function applied
is arbitrary

8. Function applied
is arbitrary
rolling regression = lm()

9. Types of windows:
1) Sliding
2) Expanding

10. speakerdeck.com/davisvaughan/slide-almanac Idea from: https://eng.uber.com/forecasting-introduction/

11. speakerdeck.com/davisvaughan/slide-almanac
Moving averages, rolling regressions…
Idea from: https://eng.uber.com/forecasting-introduction/

12. speakerdeck.com/davisvaughan/slide-almanac Idea from: https://eng.uber.com/forecasting-introduction/

13. speakerdeck.com/davisvaughan/slide-almanac
Cumulative sums, expanding window regression…
Idea from: https://eng.uber.com/forecasting-introduction/

14. In R?

15. speakerdeck.com/davisvaughan/slide-almanac
So many attempts:
- zoo::rollapply()
- tibbletime::rollify()
- tsibble::slide() / stretch()
- data.table::frollapply() (2019-10-03)

16. {slide}

17. speakerdeck.com/davisvaughan/slide-almanac
slide(1:4, ~.x, .before = 2)
3
2
1
[[3]]
2
1
[[2]]
4
3
2
[[4]]
1
[[1]]

18. speakerdeck.com/davisvaughan/slide-almanac
3
2
1
[[3]]
NULL
[[2]]
4
3
2
[[4]]
NULL
[[1]]
slide(1:4, ~.x, .before = 2, .complete = TRUE)
Ignore partial results

19. speakerdeck.com/davisvaughan/slide-almanac
slide(1:4, ~.x, .before = 1, .after = 1)
4
3
2
[[3]]
4
3
[[4]]
3
2
1
[[2]]
2
1
[[1]]
Center alignment

20. speakerdeck.com/davisvaughan/slide-almanac
slide(1:4, ~.x, .before = Inf)
3
2
1
[[3]]
2
1
[[2]]
4
3
2
1
[[4]]
1
[[1]]
Cumulative sliding

21. speakerdeck.com/davisvaughan/slide-almanac
slide()
slide_dbl()
slide_int()
...
Type Stability

22. speakerdeck.com/davisvaughan/slide-almanac
sales_vec
[1] 2 4 6 2
slide_dbl(sales_vec, mean, .before = 2)
[1] 2 3 4 4

23. speakerdeck.com/davisvaughan/slide-almanac
sales_vec
[1] 2 4 6 2
slide_dbl(sales_vec, mean, .before = 2)
[1] 2 3 4 4

24. speakerdeck.com/davisvaughan/slide-almanac
sales_vec
[1] 2 4 6 2
slide_dbl(sales_vec, mean, .before = 2)
[1] 2 3 4 4

25. speakerdeck.com/davisvaughan/slide-almanac
sales_vec
[1] 2 4 6 2
slide_dbl(sales_vec, mean, .before = 2)
[1] 2 3 4 4

26. speakerdeck.com/davisvaughan/slide-almanac
sales_vec
[1] 2 4 6 2
slide_dbl(sales_vec, mean, .before = 2)
[1] 2 3 4 4

27. speakerdeck.com/davisvaughan/slide-almanac
index_vec <- as.Date("2019-08-29") + c(0, 1, 5, 6)
wday_vec <- wday(index_vec, label = TRUE)
company <- tibble(
sales = sales_vec,
index = index_vec,
wday = wday_vec
)
4 2 Wed
2019-09-04
1
3
2
2019-09-03 Tue
6
2019-08-30 Fri
4
2 Thu
2019-08-29
sales wday
index

28. speakerdeck.com/davisvaughan/slide-almanac
“3 day rolling average?”
4 2 Wed
2019-09-04
1
3
2
2019-09-03 Tue
6
2019-08-30 Fri
4
2 Thu
2019-08-29
sales wday
index

29. speakerdeck.com/davisvaughan/slide-almanac
“3 day rolling average?”

3
6
4
roll_day
2
4 2 Wed
2019-09-04
1
3
2
2019-09-03 Tue
6
2019-08-30 Fri
4
2 Thu
2019-08-29
sales wday
index

30. speakerdeck.com/davisvaughan/slide-almanac
“3 day rolling average?”

3
6
4
roll_day
2
4 2 Wed
2019-09-04
1
3
2
2019-09-03 Tue
6
2019-08-30 Fri
4
2 Thu
2019-08-29
sales wday
index

31. speakerdeck.com/davisvaughan/slide-almanac
“3 day rolling average?”

3
6
4
roll_day
2
4 2 Wed
2019-09-04
1
3
2
2019-09-03 Tue
6
2019-08-30 Fri
4
2 Thu
2019-08-29
sales wday
index

32. speakerdeck.com/davisvaughan/slide-almanac
“3 day rolling average?”

3
6
4
roll_day
2
4 2 Wed
2019-09-04
1
3
2
2019-09-03 Tue
6
2019-08-30 Fri
4
2 Thu
2019-08-29
sales wday
index

33. speakerdeck.com/davisvaughan/slide-almanac
“3 day rolling average?”

3
6
4
roll_day
2
4 2 Wed
2019-09-04
1
3
2
2019-09-03 Tue
6
2019-08-30 Fri
4
2 Thu
2019-08-29
sales wday
index

34. speakerdeck.com/davisvaughan/slide-almanac
company <- company %>%
mutate(
roll_row = slide_dbl(sales, mean, .before = 2)
)
roll_row

3
4
4
2

3
6
4
roll_day
2
4 2 Wed
2019-09-04
1
3
2
2019-09-03 Tue
6
2019-08-30 Fri
4
2 Thu
2019-08-29
sales wday
index

35. speakerdeck.com/davisvaughan/slide-almanac
# Construct a regular index
full_index <- expand(
company,
index = full_seq(index, 1)
)
# Join with original data
company_full_raw <- left_join(
full_index,
company
)
# Slide over this, then filter back down
company_three_day <- company_full_raw %>%
mutate(
roll_day = slide_dbl(
sales,
mean,
na.rm = TRUE,
.before = 2
)
) %>%
filter(
index %in% company\$index
)
Solution?
7 2019-09-04
2
6 6 2019-09-03
2019-09-02
5 NA
4 NA 2019-09-01
1
3
2
2019-08-31
NA
2019-08-30
4
2 2019-08-29
sales index

3
6
4
roll_day
2
4 2 2019-09-04
1
3
2
2019-09-03
6
2019-08-30
4
2 2019-08-29
sales index

company_full_raw
company_three_day

36. speakerdeck.com/davisvaughan/slide-almanac
# Construct a regular index
full_index <- expand(
company,
index = full_seq(index, 1)
)
# Join with original data
company_full_raw <- left_join(
full_index,
company
)
# Slide over this, then filter back down
company_three_day <- company_full_raw %>%
mutate(
roll_day = slide_dbl(
sales,
mean,
na.rm = TRUE,
.before = 2
)
) %>%
filter(
index %in% company\$index
)
Solution?
7 2019-09-04
2
6 6 2019-09-03
2019-09-02
5 NA
4 NA 2019-09-01
1
3
2
2019-08-31
NA
2019-08-30
4
2 2019-08-29
sales index

3
6
4
roll_day
2
4 2 2019-09-04
1
3
2
2019-09-03
6
2019-08-30
4
2 2019-08-29
sales index

company_full_raw
company_three_day
I
JUST
WANT
A
3
DAY
AVERAGE

37. slide(.x, .f, …)
slide_index(.x, .i, .f, …)

38. speakerdeck.com/davisvaughan/slide-almanac
slide_index(
.x = wday_vec,
.i = index_vec,
.f = ~.x,
.before = days(2)
)
Tue
[[3]]
Fri
Thu
[[2]]
Wed
Tue
[[4]]
Thu
[[1]]
slide(
.x = wday_vec,
.f = ~.x,
.before = 2
)
Tue
Fri
Thu
[[3]]
Fri
Thu
[[2]]
Wed
Tue
Fri
[[4]]
Thu
[[1]]

39. speakerdeck.com/davisvaughan/slide-almanac
company <- company %>%
mutate(
roll_day = slide_index_dbl(sales, index, mean, .before = days(2))
)

3
6
4
roll_day
2
4 2 Wed
2019-09-04
1
3
2
2019-09-03 Tue
6
2019-08-30 Fri
4
2 Thu
2019-08-29
sales wday
index

40. speakerdeck.com/davisvaughan/slide-almanac
“3 day rolling average?”

41. speakerdeck.com/davisvaughan/slide-almanac
“3 day rolling average?”

42. speakerdeck.com/davisvaughan/slide-almanac

43. speakerdeck.com/davisvaughan/slide-almanac
roll_day

3
4
6
2

3
5
4
roll_bday
2
4 2 Wed
2019-09-04
1
3
2
2019-09-03 Tue
6
2019-08-30 Fri
4
2 Thu
2019-08-29
sales wday
index

3 bday = [Fri, Mon, Tue]
3 day = [Sun, Mon, Tue]

44. speakerdeck.com/davisvaughan/slide-almanac
calendar <- (weekends + holidays)
company <- company %>%
mutate(
roll_day = slide_index_dbl(sales, index, mean, .before = days(2))
)
company <- company %>%
mutate(
roll_bday = slide_index_dbl(sales, index, mean, .before = bdays(2, calendar))
)
Ideally

45. speakerdeck.com/davisvaughan/slide-almanac
calendar <- (weekends + holidays)
company <- company %>%
mutate(
roll_day = slide_index_dbl(sales, index, mean, .before = days(2))
)
company <- company %>%
mutate(
roll_bday = slide_index_dbl(sales, index, mean, .before = bdays(2, calendar))
)
Ideally
“knows” about custom holidays and weekends

46. speakerdeck.com/davisvaughan/slide-almanac
calendar <- (weekends + holidays)
company <- company %>%
mutate(
roll_day = slide_index_dbl(sales, index, mean, .before = days(2))
)
company <- company %>%
mutate(
roll_bday = slide_index_dbl(sales, index, mean, .before = bdays(2, calendar))
)
Ideally
“knows” about custom holidays and weekends
“adjusts” dates relative to the calendar

47. {almanac}

48. Recurrence rule:
A set of conditions that define a
recurring event, such as a weekend
or holiday.

49. speakerdeck.com/davisvaughan/slide-almanac
on_labor_day <- yearly() %>%
recur_on_ymonth(“September”) %>%
recur_on_wday(“Monday”, nth = 1)

50. speakerdeck.com/davisvaughan/slide-almanac
on_labor_day <- yearly() %>%
recur_on_ymonth(“September”) %>%
recur_on_wday(“Monday”, nth = 1)
Base frequency of the event

51. speakerdeck.com/davisvaughan/slide-almanac
on_labor_day <- yearly() %>%
recur_on_ymonth(“September”) %>%
recur_on_wday(“Monday”, nth = 1)
Base frequency of the event
Recurrence conditions

52. speakerdeck.com/davisvaughan/slide-almanac
on_labor_day <- yearly() %>%
recur_on_ymonth(“September”) %>%
recur_on_wday(“Monday”, nth = 1)
Base frequency of the event
Recurrence conditions
sch_in(c("2019-09-02", "2019-09-03"), on_labor_day)
#> [1] TRUE FALSE

53. speakerdeck.com/davisvaughan/slide-almanac
on_labor_day <- yearly() %>%
recur_on_ymonth(“September”) %>%
recur_on_wday(“Monday”, nth = 1)
Base frequency of the event
Recurrence conditions
sch_in(c("2019-09-02", "2019-09-03"), on_labor_day)
#> [1] TRUE FALSE
sch_seq("2017-01-01", "2019-12-31", on_labor_day)
#> [1] "2017-09-04" "2018-09-03" "2019-09-02"

54. Schedule:
A collection of recurrence rules,
required dates, and exclusion
dates.

55. speakerdeck.com/davisvaughan/slide-almanac
on_labor_day <- yearly() %>%
recur_on_ymonth(“September”) %>%
recur_on_wday(“Monday”, nth = 1)
on_christmas <- yearly() %>%
recur_on_ymonth(“December”) %>%
recur_on_mday(25)
on_weekends <- weekly() %>%
recur_on_weekends()
on_weekends_or_holidays <- schedule() %>%

56. speakerdeck.com/davisvaughan/slide-almanac
sch_seq("2019-09-01", "2019-12-31", on_weekends_or_holidays)
#> [1] "2019-09-01" "2019-09-02" "2019-09-07" "2019-09-08" "2019-09-14"
#> [6] "2019-09-15" "2019-09-21" "2019-09-22" "2019-09-28" "2019-09-29"
#> ...
#> [31] "2019-12-14" "2019-12-15" "2019-12-21" "2019-12-22" "2019-12-25"
#> [36] "2019-12-28" "2019-12-29"

57. speakerdeck.com/davisvaughan/slide-almanac
* These will probably move to their own
hldy_christmas()
hldy_easter()
hldy_thanksgiving()
...
calendar_us_federal()
calendar_us_nyse()
Prebuilt holidays and calendars

58. speakerdeck.com/davisvaughan/slide-almanac
* These will probably move to their own
hldy_christmas()
hldy_easter()
hldy_thanksgiving()
...
calendar_us_federal()
calendar_us_nyse()
Prebuilt holidays and calendars
Particularly
challenging!

59. speakerdeck.com/davisvaughan/slide-almanac
# A Monday
labor_day <- “2019-09-02"
# Find the next business day?
# - Sees labor day, adjust by 1 day
# - Lands on 2019-09-03, done!
#> [1] "2019-09-03"
# - Sees labor day, adjust by -1 day
# - Lands on 2019-09-01, a Sunday, adjust by -1 day
# - Lands on 2019-08-31, a Saturday, adjust by -1 day
# - Lands on 2019-08-30, done!
#> [1] “2019-08-30"

60. speakerdeck.com/davisvaughan/slide-almanac
# A Monday
labor_day <- “2019-09-02"
# Find the next business day?
# - Sees labor day, adjust by 1 day
# - Lands on 2019-09-03, done!
#> [1] "2019-09-03"
# - Sees labor day, adjust by -1 day
# - Lands on 2019-09-01, a Sunday, adjust by -1 day
# - Lands on 2019-08-31, a Saturday, adjust by -1 day
# - Lands on 2019-08-30, done!
#> [1] “2019-08-30"

61. speakerdeck.com/davisvaughan/slide-almanac
# A Monday
labor_day <- “2019-09-02"
# Find the next business day?
# - Sees labor day, adjust by 1 day
# - Lands on 2019-09-03, done!
#> [1] "2019-09-03"
# - Sees labor day, adjust by -1 day
# - Lands on 2019-09-01, a Sunday, adjust by -1 day
# - Lands on 2019-08-31, a Saturday, adjust by -1 day
# - Lands on 2019-08-30, done!
#> [1] “2019-08-30"

62. speakerdeck.com/davisvaughan/slide-almanac
# A Monday
labor_day <- “2019-09-02"
# Find the next business day?
# - Sees labor day, adjust by 1 day
# - Lands on 2019-09-03, done!
#> [1] "2019-09-03"
# - Sees labor day, adjust by -1 day
# - Lands on 2019-09-01, a Sunday, adjust by -1 day
# - Lands on 2019-08-31, a Saturday, adjust by -1 day
# - Lands on 2019-08-30, done!
#> [1] “2019-08-30"
This can also be a function

63. Modified following:
after x, unless it falls in a
different month, in which case the
first business day before x is

64. speakerdeck.com/davisvaughan/slide-almanac
on_15th_and_last <- monthly() %>%
recur_on_mday(c(15, -1))
payments <- tibble(
dates = sch_seq("2019-09-01", "2019-12-31", on_15th_and_last),
wday = wday(dates, label = TRUE)
)
2019-12-31 2019-12-31 Tue
8 Tue
2019-12-15
7 Sun Mon
2019-12-16
6 2019-11-30 Sat Fri
2019-11-29
2019-11-15 Fri 2019-11-15 Fri
5

Mon
Thu
Tue
Mon

2019-09-30
2019-10-15
2019-10-31
2019-09-16
4 Thu
2019-10-31
1
3
2
2019-10-15 Tue
2019-09-30 Mon
Sun
2019-09-15
wday
dates

65. speakerdeck.com/davisvaughan/slide-almanac
2019-12-31 2019-12-31 Tue
8 Tue
2019-12-15
7 Sun Mon
2019-12-16
6 2019-11-30 Sat Fri
2019-11-29
2019-11-15 Fri 2019-11-15 Fri
5

Mon
Thu
Tue
Mon

2019-09-30
2019-10-15
2019-10-31
2019-09-16
4 Thu
2019-10-31
1
3
2
2019-10-15 Tue
2019-09-30 Mon
Sun
2019-09-15
wday
dates

on_weekends <- weekly() %>% recur_on_weekends()
payments %>%
mutate(
)

66. speakerdeck.com/davisvaughan/slide-almanac
friday_before_labor_day <- “2019-08-30"
# Move forward two business days?
# - Steps forward 1 day to Saturday 2019-08-31
# - Steps forward 1 day to Wednesday 2019-09-04
sch_step(
friday_before_labor_day,
n = 2,
schedule = on_weekends_or_holidays
)
#> [1] “2019-09-04"

67. speakerdeck.com/davisvaughan/slide-almanac
friday_before_labor_day <- “2019-08-30"
# Move forward two business days?
# - Steps forward 1 day to Saturday 2019-08-31
# - Steps forward 1 day to Wednesday 2019-09-04
sch_step(
friday_before_labor_day,
n = 2,
schedule = on_weekends_or_holidays
)
#> [1] “2019-09-04"

68. {slide} + {almanac}

69. speakerdeck.com/davisvaughan/slide-almanac

70. speakerdeck.com/davisvaughan/slide-almanac
calendar <- (weekends + holidays)
company <- company %>%
mutate(
roll_day = slide_index_dbl(sales, index, mean, .before = days(2))
)
company <- company %>%
mutate(
roll_bday = slide_index_dbl(sales, index, mean, .before = bdays(2, calendar))
)
Ideally

71. speakerdeck.com/davisvaughan/slide-almanac
calendar <- (weekends + holidays)
company <- company %>%
mutate(
roll_day = slide_index_dbl(sales, index, mean, .before = days(2))
)
company <- company %>%
mutate(
roll_bday = slide_index_dbl(sales, index, mean, .before = bdays(2, calendar))
)
Ideally
We can make this with {almanac}

72. speakerdeck.com/davisvaughan/slide-almanac
calendar <- (weekends + holidays)
company <- company %>%
mutate(
roll_day = slide_index_dbl(sales, index, mean, .before = days(2))
)
company <- company %>%
mutate(
roll_bday = slide_index_dbl(sales, index, mean, .before = bdays(2, calendar))
)
Ideally
We can make this with {almanac}
This doesn’t exist yet, but would use sch_step()

73. slide_index(.x, .i, .f, …)
slide_between(.x, .i, .starts, .stops, .f, …)

74. speakerdeck.com/davisvaughan/slide-almanac
company <- company %>%
mutate(
roll_day = slide_index_dbl(sales, index, mean, .before = days(2))
)
company <- company %>%
mutate(
starts = index - days(2),
stops = index,
roll_day = slide_between_dbl(sales, index, mean, .starts = starts, .stops = stops)
)

3
6
4
roll_day
2
4 2 Wed
2019-09-04
1
3
2
2019-09-03 Tue
6
2019-08-30 Fri
4
2 Thu
2019-08-29
sales wday
index

75. speakerdeck.com/davisvaughan/slide-almanac
company <- company %>%
mutate(
roll_day = slide_index_dbl(sales, index, mean, .before = days(2))
)
company <- company %>%
mutate(
starts = index - days(2),
stops = index,
roll_day = slide_between_dbl(sales, index, mean, .starts = starts, .stops = stops)
)

3
6
4
roll_day
2
4 2 Wed
2019-09-04
1
3
2
2019-09-03 Tue
6
2019-08-30 Fri
4
2 Thu
2019-08-29
sales wday
index

This is where we solve our problem

76. speakerdeck.com/davisvaughan/slide-almanac
company <- company %>%
mutate(
starts = sch_step(index, n = -2, schedule = on_weekends),
stops = index,
roll_bday = slide_between_dbl(sales, index, mean, .starts = starts, .stops = stops)
)
roll_day

3
4
6
2

3
5
4
roll_bday
2
4 2 Wed
2019-09-04
1
3
2
2019-09-03 Tue
6
2019-08-30 Fri
4
2 Thu
2019-08-29
sales wday
index

77. speakerdeck.com/davisvaughan/slide-almanac

78. speakerdeck.com/davisvaughan/slide-almanac

79. In conclusion…

80. {slide} for window functions
slide_index() to roll relative to an index

81. {slide} for window functions
slide_index() to roll relative to an index
{almanac} to build schedules and adjust dates

82. {slide} for window functions
slide_index() to roll relative to an index
{slide} + {almanac}
=
Flexible rolling computations!
{almanac} to build schedules and adjust dates

83. Special Thanks
JavaScript: rrule
https://github.com/jakubroztocil/rrule
James Laird-Smith: gs
https://github.com/jameslairdsmith/gs
Jeroen Ooms: V8
https://github.com/jeroen/V8

84. Questions?
{almanac}
GitHub
https://github.com/DavisVaughan/almanac
Website
https://davisvaughan.github.io/almanac
{slide}
GitHub
https://github.com/DavisVaughan/slide
Website
https://davisvaughan.github.io/slide