When BLUE Is Not Best: Non-Normal Errors and the Linear Model

Presented on January 9 at the 2016 Annual Meeting of the Southern Political Science Association in San Juan, Puerto Rico.

Carlisle Rainey

January 09, 2016

Transcript

  1. When BLUE Is Not Best: Non-Normal Errors and the Linear Model. Carlisle Rainey, Assistant Professor, Texas A&M University. Daniel K. Baissa, Ph.D. Student, Harvard University. Paper, code, and data at carlislerainey.com/research

  2. Key Point: The Gauss-Markov theorem is an elegant result, but it’s not useful for applied researchers.

  3. Background

  4. y_i = X_i β + ε_i

  5. Technical assumptions: 1. The design matrix is full rank. 2. The model is correct.

  6. Additional assumptions: 1. Errors have mean zero. 2. Errors have constant, finite variance. 3. Errors are independent. 4. Errors follow a normal distribution.

  7. Additional assumptions: 1. Errors have mean zero. 2. Errors have constant, finite variance. 3. Errors are independent. 4. Errors follow a normal distribution. A1 → consistency

  8. Additional assumptions: 1. Errors have mean zero. 2. Errors have constant, finite variance. 3. Errors are independent. 4. Errors follow a normal distribution. A1–A4 → BUE

  9. Additional assumptions: 1. Errors have mean zero. 2. Errors have constant, finite variance. 3. Errors are independent. 4. Errors follow a normal distribution. A1–A3 → BLUE (Gauss-Markov Theorem)

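As a quick illustration of the A1 result, a minimal Monte Carlo sketch; the skewed error distribution and all parameter values here are arbitrary choices for illustration, not from the paper:

    # Monte Carlo sketch: under A1 alone, OLS remains unbiased even
    # when the errors are heavily skewed rather than normal.
    import numpy as np

    rng = np.random.default_rng(42)
    n, reps = 100, 5000
    beta_true = np.array([1.0, 2.0])           # intercept and slope (arbitrary)
    X = np.column_stack([np.ones(n), rng.uniform(0, 1, n)])

    estimates = np.empty((reps, 2))
    for r in range(reps):
        eps = rng.exponential(1.0, n) - 1.0    # skewed errors with mean zero (A1)
        y = X @ beta_true + eps
        estimates[r] = np.linalg.lstsq(X, y, rcond=None)[0]

    print(estimates.mean(axis=0))              # close to [1.0, 2.0]: unbiased
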
  10. Key Point: But this is not a powerful result.

  11–12. Linearity in BLUE: linear model, or linear in the parameters?

  13. Linearity in BLUE: linear estimator, β̂ = w₁y₁ + w₂y₂ + … + wₙyₙ, i.e., β̂ = My

  14. Linearity in BLUE: linearity ≅ easy. β̂ = My = (X′X)⁻¹X′y

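A minimal sketch of slide 14’s point, using nothing beyond NumPy: the OLS estimator is a fixed matrix M = (X′X)⁻¹X′ applied to y, so it is linear in the outcomes.

    # OLS as a linear estimator: beta_hat = M y, where M = (X'X)^{-1} X'
    # depends only on the design matrix, not on the outcomes.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 50
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)

    M = np.linalg.solve(X.T @ X, X.T)    # M = (X'X)^{-1} X'
    beta_hat = M @ y                     # linear in y: fixed weights times outcomes
    assert np.allclose(beta_hat, np.linalg.lstsq(X, y, rcond=None)[0])
    print(beta_hat)
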
  15–16. [Figure: density of the error distribution ε_i]

  17. Practical Importance

  18. “[Even without normally distributed errors] OLS coefficient estimators remain unbiased and efficient.” –Berry (1993)

  19. “[The Gauss-Markov theorem] justifies the use of the OLS method rather than using a variety of competing estimators.” –Wooldridge (2013)

  20. “We need not look for another linear unbiased estimator, for we will not find such an estimator whose variance is smaller than the OLS estimator.” –Gujarati (2004)

  21. “An important result in multiple regression is the Gauss-Markov theorem, which proves that when the assumptions are met, the least squares estimators of regression parameters are unbiased and efficient.” –Berry and Feldman (1993)

  22. “The Gauss-Markov theorem allows us to have considerable confidence in the least squares estimators.” –Berry and Feldman (1993)

  23. Alternatives

  24. Skewness

  25. Heavy Tails

  26. Clark and Golder (2006)

  27. [Figure: histogram of standardized residuals; Shapiro–Wilk p-value: 2.8 × 10⁻¹⁸]

  28. [Figure: histogram of standardized residuals; Shapiro–Wilk p-value: 0.002]

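The residual check reported on slides 27–28 can be sketched as follows; the data are simulated stand-ins, not Clark and Golder’s:

    # Sketch of a residual normality check: fit OLS, standardize the
    # residuals, and run a Shapiro-Wilk test on them.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    n = 500
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    eps = rng.exponential(1.0, n) - 1.0        # skewed, mean-zero errors
    y = X @ np.array([1.0, 2.0]) + eps

    beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta_hat
    std_resid = (resid - resid.mean()) / resid.std()

    stat, p = stats.shapiro(std_resid)
    print(f"Shapiro-Wilk p-value: {p:.2g}")    # small p: evidence against normality
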
  29–30. [Figure: effect of ENEG by district magnitude (1–150); left panel: Least Squares, No Transformation; right panel: Biweight, Box–Cox Transformation]

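A sketch of the estimator pair named in the figure, assuming “biweight” means an M-estimator with Tukey’s biweight loss (statsmodels’ RLM) fit after a Box–Cox transformation of the outcome; the data and settings are illustrative, not the paper’s replication code:

    # Sketch: Box-Cox transform a positive outcome, then fit a robust
    # regression with Tukey's biweight loss in place of least squares.
    import numpy as np
    import statsmodels.api as sm
    from scipy import stats

    rng = np.random.default_rng(2)
    n = 200
    x = rng.uniform(0, 1, n)
    y = np.exp(1.0 + 2.0 * x + 0.5 * rng.standard_t(df=3, size=n))  # positive, heavy-tailed

    y_bc, lam = stats.boxcox(y)              # lambda chosen by maximum likelihood
    X = sm.add_constant(x)
    fit = sm.RLM(y_bc, X, M=sm.robust.norms.TukeyBiweight()).fit()

    print(f"Box-Cox lambda: {lam:.2f}")
    print(fit.params)                        # biweight estimates, transformed scale
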
  31. Key Points

  32. Point #1: Without normality, it is easy to find an unbiased estimator better than least squares.

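Point #1 is easy to check by simulation; a minimal sketch assuming symmetric, heavy-tailed t (3 df) errors, so both estimators are unbiased and the finite-variance condition still holds:

    # Sketch of Point #1: with heavy-tailed errors, a biweight M-estimator
    # matches OLS on bias but beats it on sampling variance.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(3)
    n, reps = 100, 1000
    X = sm.add_constant(rng.uniform(0, 1, n))

    ols, biw = [], []
    for _ in range(reps):
        y = X @ [1.0, 2.0] + rng.standard_t(df=3, size=n)   # heavy tails, mean zero
        ols.append(sm.OLS(y, X).fit().params[1])
        biw.append(sm.RLM(y, X, M=sm.robust.norms.TukeyBiweight()).fit().params[1])

    for name, s in [("least squares", ols), ("biweight", biw)]:
        s = np.asarray(s)
        print(f"{name}: mean = {s.mean():.3f}, sd = {s.std():.3f}")
    # Both means land near 2.0 (unbiased); the biweight sd is clearly smaller.
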
  33. Point #2: Researchers can learn a lot from unusual cases.