The essentials to work with object-oriented systems in R

All R users have used S3, the oldest and most prominent object-oriented system in R, even if they were unaware of it, for example by calling the summary function on both data frames and linear models. The two main building blocks of an object-oriented system are objects with a specific type (class) and functions (methods) that behave differently depending on the class of their arguments. Most R users have probably also run into unexpected results that would have been easier to understand with a foundation in R's object-oriented systems. This talk aims to fill some of those gaps so that you can work confidently with existing code that uses S3 or S4.

The three widely used object-oriented systems in R are S3, S4 and R6. This talk focuses on S3, the most widely used of the three, and assumes no prior knowledge of object-oriented systems. I will start with a visual explanation of the most important concepts and then show how understanding the basics can help in your day-to-day work. I will guide you through examples and show hands-on tricks to understand, debug and find the documentation of existing code that uses S3 or S4.

Ildikó Czeller

May 15, 2018

Transcript

  1. object-oriented
    systems in R
    Ildi Czeller : @czeildi

  2. Me:
    - Data Scientist @Emarsys
    - 3 years R
    - started with C++, Python

  3. You:
    - R user without CS background
    - understand core concepts
    - explore & debug more effectively

  4. summary(lm(y~x))
    ...
    Coefficients:
    ...
    Signif codes: 0 ‘***’
    Multiple R-squared: 0.7262
    summary(c(1:99, 10^6))
    Min. : 1.0
    1st Qu.: 25.8
    Median : 50.5
    Mean : 10049.5
    3rd Qu.: 75.2
    Max. :1000000.0

  5. object = data + behavior
    data:
    date: 2018-05-15
    venue: Budapest
    # participants: 450
    behavior:
    attend -> learn
    talk at -> feedback
    organize -> proud
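
The "data + behavior" idea on this slide can be sketched in S3 as a list that carries a class attribute, plus a method written for that class. The constructor name `new_r_conference` and the field names are assumptions made for illustration; only the class name `r_conference` and the values come from the slides.

```r
# Hypothetical constructor: the data part of the object is a plain list,
# and the class attribute is what makes it an S3 object.
new_r_conference <- function(date, venue, participants) {
  structure(
    list(date = as.Date(date), venue = venue, participants = participants),
    class = "r_conference"
  )
}

conf <- new_r_conference("2018-05-15", "Budapest", 450)

# The behavior part: a method for the existing generic print().
print.r_conference <- function(x, ...) {
  cat("R conference in", x$venue, "on", format(x$date), "\n")
  invisible(x)
}

print(conf)   # dispatches to print.r_conference
```

Nothing here is special syntax: the object is an ordinary list, and the link between data and behavior is made purely through the class name.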

  6. +36 1 333-3333

  7. lm(y~x)
    summary
    summary.lm Coef
    R^2

  8. lm(y~x)
    summary
    summary.lm
    generic + class -> dispatch -> method

  9. class / object type

  10. class/type in R
    base types: integer, character, list
    S3 types: Date, data.frame, r_conference
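
One way to see the difference between base types and S3 types is to check whether a class attribute is actually stored on the object. This is a small sketch using only base R; the example values are arbitrary.

```r
x  <- 1:3
d  <- as.Date("2018-05-15")
df <- data.frame(a = 1)

# Base type: class() reports an implicit class, but no attribute is stored.
class(x)
attr(x, "class")   # NULL
is.object(x)       # FALSE

# S3 types: a class attribute sits on top of an underlying base type.
class(d)
typeof(d)          # the Date is stored as a double
is.object(d)       # TRUE

class(df)
typeof(df)         # a data frame is a list underneath
unclass(d)         # the bare number without the class attribute
```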

  11. method
    summary.lm: generic = summary, class = lm
    as.factor

  12. method
    summary.data.frame: generic = summary, class = data.frame
    as.Date.numeric: generic = as.Date, class = numeric

  13. generic
    summary <- function(object, ...)
    UseMethod("summary")
    sum <- function(..., na.rm = FALSE)
    .Primitive("sum")
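
The same pattern works for a generic of your own. The sketch below defines a made-up generic called `describe` (not from the talk) with one method for data frames and a default method, and also checks that `sum` is a primitive, which is why its definition shows no `UseMethod()` call.

```r
# Hypothetical generic: its only job is to call UseMethod().
describe <- function(x, ...) {
  UseMethod("describe")
}

# Method for data frames, named generic.class.
describe.data.frame <- function(x, ...) {
  paste("a data frame with", nrow(x), "rows")
}

# Fallback used when no class-specific method exists.
describe.default <- function(x, ...) {
  "some R object"
}

describe(iris)   # uses describe.data.frame
describe(1:10)   # uses describe.default

# sum is implemented in C; dispatch happens internally.
is.primitive(sum)
is.primitive(summary)
```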

  14. summary(lm(y~x))
    Coefficients: …
    Signif codes: 0 ‘***’
    Multiple R-squared: 0.7262
    summary.lm(lm(y~x))
    dispatch
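
To check that calling the generic and calling the method directly give the same result, something like the following can be run. The data is simulated purely for illustration, and the method is fetched with `getS3method()` so the sketch does not depend on the method being visible by name.

```r
# Made-up data, only to have a model to summarize.
set.seed(42)
x <- 1:30
y <- 3 + 2 * x + rnorm(30)
fit <- lm(y ~ x)

class(fit)   # the class that drives dispatch

# Route 1: call the generic and let R dispatch on class(fit).
via_generic <- summary(fit)

# Route 2: look up the method for class "lm" and call it directly.
summary_lm <- getS3method("summary", "lm")
via_method <- summary_lm(fit)

# Both routes produce the same coefficient table.
all.equal(coef(via_generic), coef(via_method))
```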

  15. why so
    powerful?

  16. flexible &
    extensible

  17. base R +
    different packages
    work together

  18. complex types
    can inherit behavior
    from simpler types

  19. class is a vector
    c("r_conference", "conference", "event")
    most specific -> least specific
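
A class vector like this can be inspected with `class()` and `inherits()`. The object below is a minimal stand-in for the conference example; its contents are an assumption, only the class names come from the slide.

```r
conf <- structure(
  list(venue = "Budapest"),
  class = c("r_conference", "conference", "event")
)

class(conf)                    # ordered from most to least specific

inherits(conf, "event")        # TRUE: "event" is somewhere in the vector
inherits(conf, "data.frame")   # FALSE

# which = TRUE reports the position of each class in the vector.
inherits(conf, c("conference", "event"), which = TRUE)

# Base R uses the same mechanism, for example for date-times.
class(Sys.time())
```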

  20. specialize
    • print(data.table())
    • print.data.table(data.table())
    Sepal.Length Sepal.Width
    1: 5.1 3.5
    2: 4.9 3.0
    3: 4.7 3.2
    4: 4.6 3.1
    5: 5.0 3.6
    ---
    146: 6.7 3.0
    147: 6.3 2.5
    148: 6.5 3.0
    149: 6.2 3.4
    150: 5.9 3.0
    • print(data.frame())
    • print.data.frame(data.frame())
    Sepal.Length Sepal.Width
    1 5.1 3.5
    2 4.9 3.0
    3 4.7 3.2
    4 4.6 3.1
    5 5.0 3.6
    6 5.4 3.9
    7 4.6 3.4
    8 5.0 3.4
    9 4.4 2.9
    10 4.9 3.1

  21. fallback
    • summary(data.table())
    • summary.data.table(data.table())
    • summary.data.frame(data.table())
    Sepal.Length
    Min. :4.300
    1st Qu.:5.100
    Median :5.800
    Mean :5.843
    3rd Qu.:6.400
    Max. :7.900
    • summary(data.frame())
    • summary.data.frame(data.frame())
    Sepal.Length
    Min. :4.300
    1st Qu.:5.100
    Median :5.800
    Mean :5.843
    3rd Qu.:6.400
    Max. :7.900
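
The specialize and fallback behavior can be reproduced without data.table by using a made-up class, here called `my_table`, that sits in front of `data.frame` in the class vector. It gets its own print method but no summary method, so `summary()` falls back to the data frame method.

```r
# Hypothetical class that extends data.frame.
tbl <- structure(head(iris[, 1:2]), class = c("my_table", "data.frame"))

# Specialize: a print method for the more specific class.
print.my_table <- function(x, ...) {
  cat("<my_table with", nrow(x), "rows>\n")
  invisible(x)
}

print(tbl)          # uses print.my_table

# Fallback: no summary.my_table exists, so dispatch moves on to
# the next class in the vector and uses summary.data.frame.
s <- summary(tbl)
s
```

This mirrors what the slides show for data.table: printing is specialized, while summary is inherited from data.frame.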

  22. extend
    gift.conference gift.r_conference
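
The slide names two methods, `gift.conference` and `gift.r_conference`, without showing their bodies. A possible sketch follows, where the generic `gift` and the returned values are invented for illustration. The more specific method extends the general one through `NextMethod()`.

```r
# Hypothetical generic and method bodies; only the method names are from the slide.
gift <- function(x, ...) {
  UseMethod("gift")
}

gift.conference <- function(x, ...) {
  "sticker"
}

gift.r_conference <- function(x, ...) {
  # Ask the next class in the vector ("conference") for its result,
  # then add something specific to R conferences.
  inherited <- NextMethod()
  c("hex sticker", inherited)
}

r_conf <- structure(list(), class = c("r_conference", "conference", "event"))
other  <- structure(list(), class = c("conference", "event"))

gift(r_conf)   # specific gift plus the inherited one
gift(other)    # only the general conference gift
```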

  23. explore
    • seq.Date
    • data.table:::print.data.table
    • lookup::lookup("sum") – Jim Hester
    • https://github.com/wch/r-source
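
Besides the tools listed on this slide, base R and the utils package ship a few helpers for locating methods and their source. This is a sketch of calls that may be useful while exploring; it is not taken from the talk.

```r
# Fetch the method for a given generic and class, exported or not.
seq_date <- getS3method("seq", "Date")
is.function(seq_date)

# List the methods that exist for a generic ...
summary_methods <- as.character(methods("summary"))
head(summary_methods)

# ... or for a class.
date_methods <- as.character(methods(class = "Date"))
head(date_methods)

# Find an object by name wherever it lives, including non-exported ones.
found <- getAnywhere("print.data.frame")
found$where
```

Typing a method name such as `seq.Date` at the console prints its source when the method is visible; `getS3method()` and `getAnywhere()` cover the cases where it is not.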

  24. explore
    • sloop – R package by Hadley Wickham
    • s3_class, ftype
    • s3_dispatch
    • s3_methods_class, s3_methods_generic
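
The sloop functions named on this slide can be tried on a small model. The sketch assumes the sloop package is installed and skips the calls otherwise; the `cars` dataset is used only as convenient example data.

```r
has_sloop <- requireNamespace("sloop", quietly = TRUE)

if (has_sloop) {
  fit <- lm(dist ~ speed, data = cars)

  # Class vector used for dispatch.
  print(sloop::s3_class(fit))

  # Is this function a generic, a method, or a plain function?
  print(sloop::ftype(summary))
  print(sloop::ftype(t.data.frame))

  # Which methods were considered and which one was called?
  print(sloop::s3_dispatch(summary(fit)))

  # All methods for a class, and all methods for a generic.
  print(sloop::s3_methods_class("lm"))
  print(sloop::s3_methods_generic("summary"))
}
```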

  25. Advanced R by Hadley Wickham
    https://www.ildiczeller.com/2018/04/02/investigating-difftime-behavior/

  26. use
    understand
    (create)
