P8105: Linear Models

Jeff Goldsmith
November 07, 2018

Transcript

  1. LINEAR REGRESSION: Jeff Goldsmith, PhD, Department of Biostatistics

  2. Modeling

    • Linear regression is one approach to modeling
    R for Data Science
  3. Regression is my favorite

    • Like … seriously. I use regression for everything
    • Regression covers simple stuff (t-tests) to complex stuff (automated variable selection via penalization)
      – Yes, I use regression for t-tests
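
The "regression for t-tests" point can be checked directly. The sketch below is not from the slides: the data are simulated, and the names `sim_df`, `group`, and `y` are made up for illustration. It assumes equal variances in both groups, which is the case where a two-sample t-test and a regression on a binary group indicator coincide.

```r
# Simulated data; the group labels, sample sizes, and effect size are made up.
set.seed(1)
sim_df <- data.frame(
  group = rep(c("a", "b"), each = 20),
  y     = c(rnorm(20, mean = 0), rnorm(20, mean = 1))
)

# Classic two-sample t-test, assuming equal variances in both groups.
t_res <- t.test(y ~ group, data = sim_df, var.equal = TRUE)

# The same comparison as a regression of y on the group indicator.
fit <- lm(y ~ group, data = sim_df)
lm_coefs <- summary(fit)$coefficients

# The slope's t statistic matches the t-test up to sign, and the p-values agree.
abs(unname(t_res$statistic))
abs(lm_coefs["groupb", "t value"])
t_res$p.value
lm_coefs["groupb", "Pr(>|t|)"]
```

The sign differs only because `t.test` compares group a to group b, while the regression slope is the mean of b minus the mean of a.
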
  4. • Linear models

  5. Predictors

    • Outcome is continuous; predictors can be anything
    • Continuous predictors are added directly
    • Categorical predictors require dummy indicator variables
      – For each non-reference group, a binary (0 / 1) variable indicating group membership for each subject is created and used in the model
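
To see the dummy indicator variables described above, one option is to inspect the design matrix R builds from a formula. This is a minimal sketch on made-up data: `toy_df`, `age`, and the three `group` levels are hypothetical, and it relies on R's default treatment contrasts, where the first factor level is the reference group.

```r
# Hypothetical data: a three-level categorical predictor and a continuous one.
toy_df <- data.frame(
  y     = c(1.2, 2.3, 2.9, 4.1, 5.0, 5.8),
  age   = c(30, 35, 41, 46, 52, 60),
  group = factor(c("low", "mid", "high", "low", "mid", "high"),
                 levels = c("low", "mid", "high"))
)

# model.matrix() shows the design matrix that lm() builds from the formula.
# "low" is the first factor level, so it is the reference group and gets no column.
X <- model.matrix(y ~ age + group, data = toy_df)
colnames(X)
X[, c("groupmid", "grouphigh")]
```

The continuous predictor `age` enters as a single column, while the three-level factor contributes two 0 / 1 columns, one per non-reference group.
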
  6. • Testing

  7. Diagnostics

    • Many model assumptions (constant variance, model specification, etc.) can be examined using residuals
      – Look at overall distribution (centered at 0? Skewed? Outliers?)
      – Look at residuals vs predictors (any non-linearity? Trends? Non-constant residual variance?)
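
The two residual checks above might look like the following in base R. This is only a sketch on simulated data; `diag_df`, `x`, and `y` are invented names, and the true relationship is linear by construction, so the plots should show nothing alarming.

```r
# Simulated data with a true linear relationship; all values here are made up.
set.seed(2)
diag_df <- data.frame(x = runif(100, min = 0, max = 10))
diag_df$y <- 1 + 2 * diag_df$x + rnorm(100)

fit <- lm(y ~ x, data = diag_df)
diag_df$resid <- resid(fit)

# Overall distribution: centered at 0? Skewed? Outliers?
summary(diag_df$resid)
hist(diag_df$resid, main = "Residuals", xlab = "Residual")

# Residuals vs predictor: look for curvature, trends, or a funnel shape.
plot(diag_df$x, diag_df$resid, xlab = "x", ylab = "Residual")
abline(h = 0, lty = 2)
```

With an intercept in the model, least squares residuals always average to zero, so the informative part of the first check is the shape of the distribution rather than its mean.
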
  8. • Generalized linear models

  9. Linear models in R

    • lm for linear models
    • glm for generalized linear models
    • Arguments include
      – Formula: y ~ x1 + x2
      – Data
    • Output is complex, and also kind of a mess
      – Use the broom package!
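
As a rough illustration of the calls above, the sketch below fits both kinds of model on simulated data. The data frame `ex_df` and the outcome names `y` and `y_bin` are made up, and the logistic `family = binomial()` choice for `glm` is an assumption for the example, not something the slides specify. The broom step is skipped if the package is not installed.

```r
# Simulated data; variable names x1, x2, y, and y_bin are illustrative only.
set.seed(3)
n <- 200
ex_df <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
ex_df$y <- 1 + 0.5 * ex_df$x1 - 0.25 * ex_df$x2 + rnorm(n)
ex_df$y_bin <- rbinom(n, size = 1, prob = plogis(0.5 * ex_df$x1))

# Linear model: formula and data are the two main arguments.
fit_lm <- lm(y ~ x1 + x2, data = ex_df)

# Generalized linear model: here, logistic regression via family = binomial().
fit_glm <- glm(y_bin ~ x1 + x2, data = ex_df, family = binomial())

# summary() prints a lot; broom::tidy() returns the coefficient table as a data frame.
if (requireNamespace("broom", quietly = TRUE)) {
  print(broom::tidy(fit_lm))
  print(broom::tidy(fit_glm))
  print(broom::glance(fit_lm))
}
```

`broom::tidy()` gives one row per coefficient and `broom::glance()` gives one row of model-level summaries, which is easier to work with downstream than the printed `summary()` output.
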