Slide 1

Slide 1 text

it depends: a dialog about dependencies
Jim Hester
@jimhester  @jimhester_

Slide 2

Slide 2 text

all R code has dependencies
  R script
  R packages
  external libraries
  R
  system libraries
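As a rough illustration of the "R packages" layer, the sketch below lists the dependencies an installed package declares in its DESCRIPTION file. The helper name `declared_deps` is hypothetical and not part of the talk or of itdepends. It reads only the declared fields, so it says nothing about system libraries or about recursive dependencies.

```r
# Hypothetical helper: list the packages an installed package declares
# in the Depends, Imports and LinkingTo fields of its DESCRIPTION.
declared_deps <- function(pkg) {
  fields <- c("Depends", "Imports", "LinkingTo")
  desc <- unclass(utils::packageDescription(pkg, fields = fields))
  raw <- as.character(unlist(desc[fields], use.names = FALSE))
  raw <- raw[!is.na(raw)]
  deps <- unlist(strsplit(raw, ","))
  # Drop version requirements such as "(>= 3.5.0)" and surrounding whitespace.
  deps <- trimws(sub("\\(.*\\)", "", deps))
  # R itself is listed in Depends but is not a package.
  setdiff(unique(deps), c("R", ""))
}

# "stats" ships with R, so this runs without installing anything.
print(declared_deps("stats"))
```

For a recursive view, `tools::package_dependencies()` can walk a package database, at the cost of needing repository metadata.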

Slide 3

Slide 3 text

dependencies break
  left-pad
  event-stream
  bitrot

Slide 4

Slide 4 text

https://xkcd.com/1987/

Slide 5

Slide 5 text

dependency hell
https://xkcd.com/1987/

Slide 6

Slide 6 text

No content

Slide 7

Slide 7 text

No content

Slide 8

Slide 8 text

not all dependencies are equal

Slide 9

Slide 9 text

No content

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

No content

Slide 12

Slide 12 text

No content

Slide 13

Slide 13 text

No content

Slide 14

Slide 14 text

library(magrittr)
library(httr)

HEAD("https://bioconductor.org/packages/3.7/data/annotation/src/contrib/MafDb.gnomAD.r2.0.1.hs37d5_3.7.0.tar.gz") %>%
  headers() %>%
  {.[["content-length"]]} %>%
  as.numeric() %>%
  prettyunits::pretty_bytes()
#> [1] "4.16 GB"

Slide 15

Slide 15 text

No content

Slide 16

Slide 16 text

No content

Slide 17

Slide 17 text

not all dependencies are equal

Slide 18

Slide 18 text

features bugfixes testing time installation diskspace breakage generality more less

Slide 19

Slide 19 text

consider your users
package developers?
  install time costly
  smaller, limited packages easier to depend on
  stability more important than features
data scientists / statisticians?
  install time cheap
  top packages already installed
  features most important

Slide 20

Slide 20 text

illusory superiority
teaching ability (Cross 1977)
  68% of the surveyed faculty at the University of Nebraska–Lincoln ranked themselves in the top 25%, and more than 90% rated themselves in the top 50%.
driving skill (Svenson 1981)
  93% of the U.S. respondents and 69% of the Swedish respondents put themselves in the top 50%.

Slide 21

Slide 21 text

pitfalls of dependency removal
  overestimation of abilities
  underestimation of new bugs
  widely used == free tests
  less is not always more

Slide 22

Slide 22 text

quantification of dependency weight critical

Slide 23

Slide 23 text

itdepends github.com/jimhester/itdepends

Slide 24

Slide 24 text

itdepends
  Assess usage
  Measure weights
  Visualize proportions
  Assist removal

Slide 25

Slide 25 text

determine usage
  itdepends::dep_usage_proj()
  itdepends::dep_usage_pkg()

Slide 26

Slide 26 text

itdepends::dep_usage_proj("~/p/tidyversedashboard") %>%
  count(pkg, sort = TRUE)
#> # A tibble: 13 x 2
#>    pkg             n
#>
#>  1 base          558
#>  2                82
#>  3 purrr          44
#>  4 glue           10
#>  5 utils           5
#>  6 tibble          4
#>  7 htmlwidgets     3
#>  8 magick          3
#>  9 gh              2
#> 10 cranlogs        1
#> 11 desc            1
#> 12 stats           1
#> 13 tools           1
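The slides do not show how itdepends gathers these counts. As a rough approximation in base R, the sketch below parses a code snippet, collects every symbol used in call position, and looks up which namespace provides it. The snippet and the helper `pkg_of` are hypothetical, and the lookup assumes the functions are visible on the search path. Operators such as `$` are not counted here.

```r
# Hypothetical snippet to analyse; not taken from the slides.
code <- '
x <- c(1, 2, 3)
y <- sum(x) + mean(x)
z <- paste("total:", sum(x))
'

# Parse with source references so token-level parse data is available.
exprs <- parse(text = code, keep.source = TRUE)
tokens <- utils::getParseData(exprs)

# Keep only the symbols that appear in call position, e.g. sum in sum(x).
calls <- tokens$text[tokens$token == "SYMBOL_FUNCTION_CALL"]

# Find the package whose namespace defines a function on the search path.
pkg_of <- function(fn) {
  obj <- get0(fn, mode = "function")
  if (is.null(obj)) return(NA_character_)
  if (is.primitive(obj)) return("base")  # primitives carry no environment
  environmentName(environment(obj))
}

usage <- data.frame(
  pkg = vapply(calls, pkg_of, character(1), USE.NAMES = FALSE),
  fun = calls,
  stringsAsFactors = FALSE
)

# Calls per function, most used first (similar in spirit to count(fun, sort = TRUE)).
fun_counts <- sort(table(usage$fun), decreasing = TRUE)
print(fun_counts)
```

A static scan like this can only attribute a call to whatever definition is visible when the scan runs, so masked or dynamically constructed calls would be misattributed.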

Slide 27

Slide 27 text

itdepends::dep_usage_proj("~/p/tidyversedashboard") %>%
  group_by(pkg) %>%
  count(fun) %>%
  top_n(1) %>%
  arrange(desc(n)) %>%
  head()
#> Selecting by n
#> # A tibble: 6 x 3
#> # Groups:   pkg [5]
#>   pkg    fun                     n
#>
#> 1 base   $                     118
#> 2        parse_datetime_8601    14
#> 3 purrr  %||%                   13
#> 4 purrr  map_dfr                13
#> 5 glue   glue                    7
#> 6 tibble tibble                  3

Slide 28

Slide 28 text

itdepends::dep_usage_pkg("devtools") %>%
  count(pkg, sort = TRUE)
#> # A tibble: 36 x 2
#>    pkg          n
#>
#>  1 base      3699
#>  2 devtools   362
#>  3 git2r       34
#>  4 usethis     33
#>  5 pkgload     31
#>  6 httr        25
#>  7 withr       25
#>  8 utils       24
#>  9 cli         16
#> 10 tools       15
#> # ... with 26 more rows

Slide 29

Slide 29 text

measure weights itdepends::dep_weight()

Slide 30

Slide 30 text

weights <- itdepends::dep_weight(c("dplyr", "data.table"))
weights
#> # A tibble: 2 x 25
#>   package num_user bin_self bin_user install_self install_user funs downloads last_release
#>
#> 1 dplyr         19  1692925 21738385         375.         538.  240     89826 2018-11-10 02:30:06
#> 2 data.t…        0  5720340  5720340         27.0         27.0  107     51658 2018-09-30 09:30:08
#> # ... with 16 more variables: open_issues, last_updated, stars, forks,
#> #   first_release, total_releases, releases_last_52, num_dev,
#> #   install_dev, bin_dev, src_size, user_deps, dev_deps,
#> #   self_timings, user_timings, dev_timings

Slide 31

Slide 31 text

weights[c("package", "num_user", "num_dev", "bin_self", "bin_user", "bin_dev",
          "install_self", "install_user", "install_dev")]
#> # A tibble: 2 x 9
#>   package    num_user num_dev bin_self bin_user  bin_dev install_self install_user install_dev
#>
#> 1 dplyr            19      78  1692925 21738385 94327415         375.         538.       1989.
#> 2 data.table        0      23  5720340  5720340 33679072         27.0         27.0        628.
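To make these columns easier to compare, the sketch below works out how much of each package's user-facing weight comes from the package itself. It copies the values printed above into a plain data frame named `w` (a name chosen here for illustration). It assumes the `bin_*` columns are sizes in bytes and that the `*_user` totals include the package itself, which the data.table row suggests: with zero user dependencies its self and user values are equal. The install times are the rounded values as displayed.

```r
# Values copied from the dep_weight() output shown on the slide.
w <- data.frame(
  package      = c("dplyr", "data.table"),
  bin_self     = c(1692925, 5720340),
  bin_user     = c(21738385, 5720340),
  install_self = c(375, 27.0),
  install_user = c(538, 27.0),
  stringsAsFactors = FALSE
)

# Assumption: bin_* columns are bytes; convert to megabytes (10^6 bytes).
w$bin_user_mb <- round(w$bin_user / 1e6, 1)

# Share of the user-facing total that comes from the package itself.
w$size_self_pct <- round(100 * w$bin_self / w$bin_user, 1)
w$time_self_pct <- round(100 * w$install_self / w$install_user, 1)

print(w[c("package", "bin_user_mb", "size_self_pct", "time_self_pct")])
```

Under those assumptions, most of dplyr's installed size comes from its dependencies, while most of its install time comes from dplyr itself.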

Slide 32

Slide 32 text

weights[c("package", "funs", "downloads", "first_release", "last_release", "releases_last_52")]
#> # A tibble: 2 x 6
#>   package     funs downloads first_release       last_release        releases_last_52
#>
#> 1 dplyr        240     89826 2014-01-16 16:53:37 2018-11-10 02:30:06                4
#> 2 data.table   107     51658 2006-04-14 18:03:15 2018-09-30 09:30:08                5

Slide 33

Slide 33 text

weights[c("package", "open_issues", "stars", "forks", "last_updated")]
#> # A tibble: 2 x 5
#>   package    open_issues stars forks last_updated
#>
#> 1 dplyr              113  2757  1011 2019-01-08 14:25:09
#> 2 data.table         765  1696   725 2019-01-08 11:48:45

Slide 34

Slide 34 text

visualize proportions
  itdepends::dep_plot_time()
  itdepends::dep_plot_size()

Slide 35

Slide 35 text

itdepends::dep_plot_time("dplyr")

Slide 36

Slide 36 text

itdepends::dep_plot_size("dplyr")

Slide 37

Slide 37 text

assist removal
  first write tests
  then replace
  itdepends::dep_locate()

Slide 38

Slide 38 text

itdepends::dep_locate("purrr", path = "~/p/tidyversedashboard")
#> R/dashboard.R:11:3: warning: purrr::map_int
#>   map_int(res, ~ if (is.null(.x)) NA_integer_ else length(.x))
#>   ^~~~~~~
#> R/dashboard.R:21:5: warning: purrr::map_int
#>     map_int(gh::gh("/repos/:org/:package/stats/commit_activity", org = org, package = package), "total"),
#>     ^~~~~~~
#> R/dashboard.R:49:3: warning: purrr::map_int
#>   map_int(description,
#>   ^~~~~~~
#> R/dashboard.R:69:10: warning: purrr::map_chr
#>   res <- map_chr(description,
#>          ^~~~~~~
#> R/dashboard.R:70:5: warning: purrr::possibly
#>     possibly(function(.x) { .x$get_maintainer() %|||% NA_character_}, otherwise = NA_character_))
#>     ^~~~~~~~
#> R/issue_progress.R:18:39: warning: purrr::walk
#>         if (is.list(x[[i]]) && isTRUE(walk(x[[i]], depth + 1))) {
#>                                       ^~~~
#> R/issue_progress.R:25:3: warning: purrr::walk
#>   walk(x, 1)
#>   ^~~~
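One way to follow "first write tests, then replace" for the first reported call, `map_int(res, ~ if (is.null(.x)) NA_integer_ else length(.x))`, is sketched below. The wrapper name `count_or_na` and the test inputs are hypothetical. The base R `vapply()` version is a stand-in written for illustration, not the change made in the actual project.

```r
# Hypothetical base R stand-in for the purrr::map_int() call at R/dashboard.R:11.
count_or_na <- function(res) {
  vapply(
    res,
    function(x) if (is.null(x)) NA_integer_ else length(x),
    integer(1)
  )
}

# Write the expectations first, then swap implementations underneath them.
res <- list(a = 1:3, b = NULL, c = character())
expected <- c(a = 3L, b = NA_integer_, c = 0L)
stopifnot(identical(count_or_na(res), expected))

# Edge case worth pinning down: empty input gives an empty integer vector.
stopifnot(identical(count_or_na(list()), integer(0)))
```

Running the same expectations against the original purrr version before the swap guards against the "underestimation of new bugs" pitfall: the checks describe the behaviour to preserve, not the implementation.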

Slide 39

Slide 39 text

itdepends::dep_locate("purrr", path = "~/p/tidyversedashboard")

Slide 40

Slide 40 text

dependencies are not equal
must measure and balance
beware overconfidence
less is not always more
itdepends
  dep_usage
  dep_weight
  dep_plot
  dep_locate
@jimhester  @jimhester_
speakerdeck.com/jimhester/it-depends