
# When BLUE Is Not Best: Non-Normal Errors and the Linear Model

Presented on January 9, 2016, at the Annual Meeting of the Southern Political Science Association in San Juan, Puerto Rico.

## Transcript

1. When BLUE Is Not Best
Non-Normal Errors and the Linear Model
Carlisle Rainey
Assistant Professor
Texas A&M University
Daniel K. Baissa
Ph.D. Student
Harvard University
Paper, code, and data at
carlislerainey.com/research

2. Key Point
The Gauss-Markov theorem is an elegant result,
but it’s not useful for applied researchers.

3. Background

4. yi = Xiβ + εi

5. Technical assumptions:
1. The design matrix is full rank.
2. The model is correct.

6. 1. Errors have mean zero.
2. Errors have constant, finite variance.
3. Errors are independent.
4. Errors follow a normal distribution.

7. 1. Errors have mean zero.
2. Errors have constant, finite variance.
3. Errors are independent.
4. Errors follow a normal distribution.
A1 → consistency

8. 1. Errors have mean zero.
2. Errors have constant, finite variance.
3. Errors are independent.
4. Errors follow a normal distribution.
A1-A4 → BUE

9. 1. Errors have mean zero.
2. Errors have constant, finite variance.
3. Errors are independent.
4. Errors follow a normal distribution.
A1-A3 → BLUE
(Gauss-Markov Theorem)

10. Key Point
But this is not a powerful result.

11. Linearity in BLUE
linear model
or
linear in the parameters

13. Linearity in BLUE
linear estimator
β̂ = w1y1 + w2y2 + … + wnyn
or
β̂ = My

14. Linearity in BLUE
linearity ≅ easy
β̂ = My = (X′X)⁻¹X′y
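The slide's point, that least squares is a linear estimator β̂ = My, can be checked numerically. A minimal numpy sketch with simulated data (all variable names and values here are illustrative, not from the authors' code):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # design matrix with intercept
beta = np.array([1.0, 2.0])                            # true coefficients (made up)
y = X @ beta + rng.normal(size=n)                      # linear model: y = X beta + errors

# Least squares is a linear estimator: beta_hat = M y, with M = (X'X)^(-1) X'
M = np.linalg.inv(X.T @ X) @ X.T
beta_hat = M @ y

# Identical to the usual least-squares solution
beta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(beta_hat, beta_ls))  # True
```

The matrix M depends only on X, so β̂ is a fixed linear combination of the outcomes y, which is exactly the "L" in BLUE.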

15. [Figure: density of the error distribution εi]

16. [Figure: density of the error distribution εi]

17. Practical Importance

18. –Berry (1993)
“[Even without normally distributed errors]
OLS coefficient estimators remain
unbiased and efficient.”

19. –Wooldridge (2013)
“[The Gauss-Markov theorem] justifies the
use of the OLS method rather than using
a variety of competing estimators.”

20. –Gujarati (2004)
“We need not look for another linear
unbiased estimator, for we will not find
such an estimator whose variance is
smaller than the OLS estimator.”

21. –Berry and Feldman (1993)
“An important result in multiple regression is
the Gauss-Markov theorem, which proves
that when the assumptions are met, the
least squares estimators of regression
parameters are unbiased and efficient.”

22. –Berry and Feldman (1993)
“The Gauss-Markov theorem allows us to
have considerable confidence in the least
squares estimators.”

23. Alternatives

24. Skewness

25. Heavy Tails

26. Clark and Golder (2006)

27. [Figure: histogram of standardized residuals; Shapiro-Wilk p-value: 2.8 × 10⁻¹⁸]

28. [Figure: histogram of standardized residuals; Shapiro-Wilk p-value: 0.002]
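The Shapiro-Wilk p-values reported on these slides come from testing the regression residuals for normality. A hedged sketch of the same procedure using scipy, with simulated skewed errors rather than Clark and Golder's data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 300
x = rng.normal(size=n)
# Simulated regression with skewed (centered exponential) errors
y = 1.0 + 2.0 * x + rng.exponential(scale=1.0, size=n) - 1.0

# Fit by least squares, then test the residuals for normality
fit = stats.linregress(x, y)
residuals = y - (fit.intercept + fit.slope * x)
w, p = stats.shapiro(residuals)
print(f"Shapiro-Wilk W = {w:.3f}, p-value = {p:.1e}")
```

A small p-value, as in both figures above, is evidence against normally distributed errors.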

29. [Figure: effect of ENEG by district magnitude (1-150); left panel: least squares, no transformation; right panel: biweight, Box-Cox transformation]

30. [Figure: effect of ENEG by district magnitude (1-150); left panel: least squares, no transformation; right panel: biweight, Box-Cox transformation]
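The biweight estimator in the right panel can be computed by iteratively reweighted least squares with Tukey's biweight function. A sketch under common default choices (tuning constant c = 4.685, MAD scale estimate), not the authors' implementation, and omitting the Box-Cox step:

```python
import numpy as np

def biweight_regression(X, y, c=4.685, tol=1e-8, max_iter=100):
    """Tukey biweight regression via iteratively reweighted least squares."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]          # start from least squares
    for _ in range(max_iter):
        r = y - X @ beta
        s = np.median(np.abs(r)) / 0.6745                # robust scale via the MAD
        u = r / (c * s)
        w = np.where(np.abs(u) < 1, (1 - u**2)**2, 0.0)  # biweight downweights outliers
        Xw = X * w[:, None]
        beta_new = np.linalg.solve(X.T @ Xw, Xw.T @ y)   # weighted least-squares step
        if np.max(np.abs(beta_new - beta)) < tol:
            break
        beta = beta_new
    return beta_new

# Heavy-tailed simulated example (t errors with 2 df), not the Clark and Golder data
rng = np.random.default_rng(2)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.standard_t(df=2, size=n)
est = biweight_regression(X, y)
print(est)
```

Because observations with large residuals get weight near zero, single extreme errors move the fit far less than they move least squares.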

31. Key Points

32. Point #1
Without normality, it is easy to
find an unbiased estimator
better than least squares.
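Point #1 can be illustrated in the simplest linear model, an intercept-only regression, where the least-squares estimate is the sample mean. With heavy-tailed Laplace errors, the sample median is also unbiased but has smaller variance. A small simulation sketch (all numbers chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
mu = 5.0            # true intercept (made up)
n, reps = 100, 2000

# Laplace errors are symmetric, so the sample mean (the least-squares
# estimate here) and the sample median are both unbiased for mu
samples = mu + rng.laplace(scale=1.0, size=(reps, n))
means = samples.mean(axis=1)
medians = np.median(samples, axis=1)

print(f"variance of mean:   {means.var():.4f}")
print(f"variance of median: {medians.var():.4f}")  # smaller: the median wins here
```

The median is a nonlinear function of y, so this does not contradict Gauss-Markov; it shows why restricting attention to linear estimators is costly.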

33. Point #2
Researchers can learn a lot
from unusual cases.