
When BLUE Is Not Best: Non-Normal Errors and the Linear Model

Presented on January 9 at the 2016 Annual Meeting of the Southern Political Science Association in San Juan, Puerto Rico.

Carlisle Rainey

January 09, 2016

Transcript

  1. When BLUE Is Not Best
    Non-Normal Errors and the Linear Model
    Carlisle Rainey
    Assistant Professor
    Texas A&M University
    Daniel K. Baissa
    Ph.D. Student
    Harvard University
    Paper, code, and data at
    carlislerainey.com/research

  2. Key Point
    The Gauss-Markov theorem is an elegant result,
    but it’s not useful for applied researchers.

  3. Background

  4. yᵢ = Xᵢβ + εᵢ

  5. Technical assumptions:
    1. The design matrix is full rank.
    2. The model is correct.

  6. Additional assumptions:
    1. Errors have mean zero.
    2. Errors have constant, finite variance.
    3. Errors are independent.
    4. Errors follow a normal distribution.

  7. Additional assumptions:
    1. Errors have mean zero.
    2. Errors have constant, finite variance.
    3. Errors are independent.
    4. Errors follow a normal distribution.
    A1 → consistency

  8. Additional assumptions:
    1. Errors have mean zero.
    2. Errors have constant, finite variance.
    3. Errors are independent.
    4. Errors follow a normal distribution.
    A1-A4 → BUE

  9. Additional assumptions:
    1. Errors have mean zero.
    2. Errors have constant, finite variance.
    3. Errors are independent.
    4. Errors follow a normal distribution.
    A1-A3 → BLUE
    (Gauss-Markov Theorem)

  10. But this is not a powerful result.
    Key Point

  11. Linearity in BLUE
    linear model
    or
    linear in the parameters

  12. Linearity in BLUE
    linear model
    or
    linear in the parameters

  13. Linearity in BLUE
    linear estimator
    or
    β̂ = m₁y₁ + m₂y₂ + … + mₙyₙ
    β̂ = My

  14. Linearity in BLUE
    linearity ≅ easy
    β̂ = My = (X′X)⁻¹X′y
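The closed form above can be sketched in a few lines of NumPy. This is an illustration on simulated data (the design matrix and coefficients here are mine, not the authors'); the point is that the weighting matrix M = (X′X)⁻¹X′ does not depend on y, so the least-squares estimator is linear in y.

```python
import numpy as np

# Simulated data for illustration: intercept plus one covariate.
rng = np.random.default_rng(0)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta = np.array([1.0, 2.0])
y = X @ beta + rng.normal(size=n)

# OLS is a linear estimator: beta_hat = M y, with M = (X'X)^{-1} X'.
M = np.linalg.solve(X.T @ X, X.T)
beta_hat = M @ y

print(beta_hat)  # close to [1, 2]
```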

  15. [Figure: density of the error distribution εᵢ over −4 to 4]

  16. [Figure: density of the error distribution εᵢ over −4 to 4]

  17. Practical Importance

  18. –Berry (1993)
    “[Even without normally distributed errors]
    OLS coefficient estimators remain
    unbiased and efficient.”

  19. –Wooldridge (2013)
    “[The Gauss-Markov theorem] justifies the
    use of the OLS method rather than using
    a variety of competing estimators.”

  20. –Gujarati (2004)
    “We need not look for another linear
    unbiased estimator, for we will not find
    such an estimator whose variance is
    smaller than the OLS estimator.”

  21. –Berry and Feldman (1993)
    “An important result in multiple regression is
    the Gauss-Markov theorem, which proves
    that when the assumptions are met, the
    least squares estimators of regression
    parameters are unbiased and efficient.”

  22. –Berry and Feldman (1993)
    “The Gauss-Markov theorem allows us to
    have considerable confidence in the least
    squares estimators.”

  23. Alternatives

  24. Skewness

  25. Heavy Tails

  26. Clark and Golder (2006)

  27. [Figure: histogram of standardized residuals; Shapiro−Wilk p−value: 2.8 × 10⁻¹⁸]
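The Shapiro−Wilk diagnostic reported in the figure can be reproduced in outline with SciPy. The data below are simulated with deliberately right-skewed errors as a stand-in for the Clark and Golder (2006) replication data; this is a sketch, not the authors' code.

```python
import numpy as np
from scipy import stats

# Simulate a linear model with right-skewed (non-normal) errors.
rng = np.random.default_rng(1)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
errors = stats.skewnorm.rvs(a=10, size=n, random_state=rng)
y = X @ [0.5, 1.5] + errors

# Fit by least squares, then test the standardized residuals for normality.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat
std_resid = (resid - resid.mean()) / resid.std(ddof=2)

stat, p = stats.shapiro(std_resid)
print(f"Shapiro-Wilk p-value: {p:.2g}")  # a small p-value rejects normality
```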

  28. [Figure: histogram of standardized residuals; Shapiro−Wilk p−value: 0.002]

  29. [Figure: estimated effect of ENEG by district magnitude; left panel: Least Squares, No Transformation; right panel: Biweight, Box−Cox Transformation]
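A rough sketch of the "Biweight, Box−Cox" strategy named in the right panel, assuming a positive, skewed outcome: transform y with Box−Cox, then fit Tukey's biweight M-estimator by iteratively reweighted least squares. The helper function and simulated data here are mine; the authors' actual code and data are at carlislerainey.com/research.

```python
import numpy as np
from scipy import stats

def tukey_biweight_irls(X, y, c=4.685, n_iter=50):
    """M-estimation with Tukey's biweight via iteratively reweighted least squares."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # OLS starting values
    for _ in range(n_iter):
        r = y - X @ beta
        s = np.median(np.abs(r - np.median(r))) / 0.6745  # robust scale (MAD)
        u = r / (c * s)
        w = np.where(np.abs(u) < 1, (1 - u**2) ** 2, 0.0)  # biweight weights
        sw = np.sqrt(w)
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return beta

# Simulated skewed, positive outcome for illustration.
rng = np.random.default_rng(2)
n = 300
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y_raw = np.exp(X @ [0.2, 0.5] + rng.normal(scale=0.5, size=n))

y_bc, lam = stats.boxcox(y_raw)          # Box-Cox transformation (lambda estimated by MLE)
beta_robust = tukey_biweight_irls(X, y_bc)
print(lam, beta_robust)
```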

  30. [Figure: estimated effect of ENEG by district magnitude; left panel: Least Squares, No Transformation; right panel: Biweight, Box−Cox Transformation]

  31. Key Points

  32. Without normality, it is easy to
    find an unbiased estimator
    better than least squares.
    Point #1
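Point #1 is easy to verify by simulation in the simplest linear model, the location model yᵢ = μ + εᵢ, where least squares reduces to the sample mean. With heavy-tailed (Laplace) errors the sample median, a nonlinear but unbiased estimator, has noticeably smaller variance. A minimal sketch with simulated data:

```python
import numpy as np

# Monte Carlo: compare the sample mean (least squares) with the sample
# median under symmetric, heavy-tailed Laplace errors.
rng = np.random.default_rng(3)
mu, n, n_sims = 1.0, 50, 5000

means = np.empty(n_sims)
medians = np.empty(n_sims)
for s in range(n_sims):
    y = mu + rng.laplace(scale=1.0, size=n)
    means[s] = y.mean()        # the least-squares estimator of mu
    medians[s] = np.median(y)  # nonlinear, but unbiased for symmetric errors

print(f"var(mean)   = {means.var():.4f}")    # about 2/n = 0.04
print(f"var(median) = {medians.var():.4f}")  # smaller: median beats least squares
```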

  33. Researchers can learn a lot
    from unusual cases.
    Point #2
