P8105: Linear Models

Jeff Goldsmith
November 07, 2018

Transcript

  1. 1
    LINEAR REGRESSION
    Jeff Goldsmith, PhD
    Department of Biostatistics

  2. 2
    Modeling
    • Linear regression is one approach to modeling
    R for Data Science

  3. 3
    Regression is my favorite
    • Like … seriously. I use regression for everything
    • Regression covers simple stuff (t-tests) to complex stuff (automated variable
    selection via penalization)
    – Yes, I use regression for t-tests
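
The claim that regression covers t-tests can be checked directly. The following is a minimal sketch on simulated data; the data frame `df`, its columns `group` and `y`, and the simulated values are assumptions made for illustration and do not come from the slides. It fits an equal-variance two-sample t-test and a linear model with a group indicator, then compares the two.

```r
# Simulated two-group data (hypothetical values, for illustration only)
set.seed(1)
df <- data.frame(
  group = rep(c("a", "b"), each = 20),
  y = c(rnorm(20, mean = 0), rnorm(20, mean = 1))
)

# Two-sample t-test assuming equal variances in the two groups
tt <- t.test(y ~ group, data = df, var.equal = TRUE)

# The same comparison as a linear model; "a" is the reference group
fit <- lm(y ~ group, data = df)
coefs <- summary(fit)$coefficients

# The groupb coefficient is mean(b) - mean(a), while t.test() reports
# mean(a) - mean(b), so the t statistics differ only in sign
c(t_test = unname(tt$statistic), regression = coefs["groupb", "t value"])

# The p-values agree
c(t_test = tt$p.value, regression = coefs["groupb", "Pr(>|t|)"])
```

The equivalence relies on `var.equal = TRUE`; the default Welch test in `t.test()` does not pool variances and will generally give a slightly different answer.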

  4. 4

    Linear models

  5. 5
    Predictors
    • Outcome is continuous; predictors can be anything
    • Continuous predictors are added directly
    • Categorical predictors require dummy indicator variables
    – For each non-reference group, a binary (0 / 1) variable indicating group
    membership for each subject is created and used in the model
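
To make the dummy-variable idea concrete, here is a small sketch using `model.matrix()`, which returns the design matrix R builds from a formula. The data frame `df` and its columns `age` and `group` are hypothetical, invented for this example.

```r
# Hypothetical data: one continuous and one three-level categorical predictor
df <- data.frame(
  age = c(31, 45, 52, 38, 60, 27),
  group = factor(
    c("control", "low", "high", "control", "high", "low"),
    levels = c("control", "low", "high")  # first level is the reference group
  )
)

# Design matrix for the formula ~ age + group
X <- model.matrix(~ age + group, data = df)

# age enters as-is; group becomes two 0/1 columns, one per non-reference level
colnames(X)
X
```

The reference group ("control" here) gets no column of its own: a subject with 0 in both indicator columns belongs to it.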

  6. 6

    Testing

  7. 7
    Diagnostics
    • Many model assumptions (constant variance, model specification, etc.) can be
    examined using residuals
    – Look at the overall distribution (centered at 0? Skewed? Outliers?)
    – Look at residuals vs predictors (any non-linearity? Trends? Non-constant
    residual variance?)
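
The two residual checks above might look like the following in base R. This is a sketch on simulated data; the data frame `df`, the variables `x` and `y`, and the true coefficients are assumptions chosen for illustration.

```r
# Simulated data from a correctly specified linear model (illustration only)
set.seed(2)
df <- data.frame(x = runif(100, min = 0, max = 10))
df$y <- 1 + 2 * df$x + rnorm(100)

fit <- lm(y ~ x, data = df)
df$resid <- resid(fit)

# Overall distribution: centered at 0? skewed? outliers?
summary(df$resid)
hist(df$resid, main = "Residuals", xlab = "Residual")

# Residuals vs the predictor: non-linearity, trends, non-constant variance?
plot(df$x, df$resid, xlab = "x", ylab = "Residual")
abline(h = 0, lty = 2)
```

Because the model includes an intercept, the residuals average to zero by construction, so the more informative checks are the shape of the histogram and any pattern in the scatterplot.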

  8. 8

    Generalized linear models

  9. 9
    Linear models in R
    • lm for linear models
    • glm for generalized linear models
    • Arguments include
    – Formula: y ~ x1 + x2
    – Data
    • Output is complex, and also kind of a mess
    – Use the broom package!
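
A short sketch of these pieces together, using the built-in `mtcars` dataset. The choice of dataset, variables, and the logistic family are assumptions for illustration, and the example assumes the broom package is installed.

```r
library(broom)  # assumed to be installed; provides tidy() and glance()

# Linear model: formula plus data
fit_lm <- lm(mpg ~ wt + hp, data = mtcars)

# Generalized linear model: same interface, plus a family (logistic here)
fit_glm <- glm(am ~ wt, data = mtcars, family = binomial())

# summary() prints a lot, but the result is awkward to work with
summary(fit_lm)

# broom returns data frames instead
tidy(fit_lm)    # one row per coefficient: estimate, std.error, statistic, p.value
glance(fit_lm)  # one row of model-level summaries
tidy(fit_glm)
```

Since `tidy()` and `glance()` return data frames, their output can be filtered, joined, or plotted like any other data.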
