Modeling Over-Reports in Survey Data

Carlisle Rainey

April 11, 2012

Transcript

  1. Modeling Over-Reports in Survey Data: Using Split Population Models to Correct Self-Reported Turn Out Data

    Carlisle Rainey and Robert Jackson
    Florida State University
    April 11, 2012

    Over-Reports | Rainey and Jackson | Introduction | Theory | Empirical Model | Results | Conclusion
  2. Outline

    1. The Problem of Misreports and Three Solutions
    2. A Source Monitoring Framework
    3. An Application Specific Empirical Model
    4. Predicted Probabilities and Marginal Effects
    5. Conclusion
  3. The Problem

    Observational studies of turnout that rely on self-reported data have an often recognized but rarely addressed problem: survey respondents who actually abstained often report turning out to vote.

    "vote over-reporting...certainly ranks among the big annoyances of survey-based electoral research, as it threatens both the general credibility of survey data and the validity of conclusions drawn from studies of individual political behavior and attitudes" (Selb and Munzert 2011, p. 2).
  4. Three Solutions

    There are three solutions to the problem.
    1. Rely instead on validated turnout data.
       • ANES (64, 76, 88; 78, 86, 90): too expensive
       • Ansolabehere and Hersh (2011) offer a more reliable and less expensive method.
    2. Improve measurement strategy.
       • better question wording (Belli et al. 1999)
       • Item Count Technique (Holbrook and Krosnick 2010)
    3. Directly model the process leading to the misreports.
       • methodologically weakest approach
       • most accessible approach
       • sometimes the only viable approach
  5. The Purpose of Our Project

    The purpose of our project is twofold.
    1. Investigate whether split-population models offer a viable research strategy for dealing specifically with over-reports.
    2. Offer a more general assessment of split-population models for dealing with measurement error.
  6. Our Approach

    To evaluate the effectiveness of split-population models, we adopt the following approach.
    1. Develop a compelling theory explaining over-reports.
    2. Derive and estimate a split-population model suitable for our specific application.
    3. Compare the inferences from the split-population model to the inferences based on validated data.
  7. A Theory of Misreport

    A source monitoring framework
    • social pressure
    • memory failure
      1. might have participated in previous elections
      2. might have thought about participating
    • These pressures push respondents to over-report, and respondents become more likely to over-report as the time between the election and the interview increases.
  8. The Self-Report Process

    respondent
    ├── turn out
    └── abstain
        ├── misreport
        └── correctly report
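The self-report process on this slide can be simulated directly. What follows is a minimal sketch, not the authors' code: the function name `simulate_self_reports` and the probabilities 0.6 and 0.3 are illustrative assumptions, and the sketch assumes (as the diagram suggests) that respondents who turned out always report correctly. It illustrates that the share of respondents reporting turnout approaches p + m(1 − p), where p is the turnout probability and m is the misreport probability among abstainers.

```python
import random

def simulate_self_reports(n, p_turnout, p_misreport, seed=0):
    """Simulate n respondents; return (actual, reported) lists of 0/1 values."""
    rng = random.Random(seed)
    actual, reported = [], []
    for _ in range(n):
        voted = 1 if rng.random() < p_turnout else 0
        if voted:
            report = 1  # assumption: actual voters never under-report
        else:
            # abstainers misreport (claim to have voted) with probability p_misreport
            report = 1 if rng.random() < p_misreport else 0
        actual.append(voted)
        reported.append(report)
    return actual, reported

# Hypothetical probabilities chosen only for illustration.
actual, reported = simulate_self_reports(100_000, p_turnout=0.6, p_misreport=0.3)
actual_rate = sum(actual) / len(actual)
reported_rate = sum(reported) / len(reported)
expected_reported = 0.6 + 0.3 * (1 - 0.6)
print(f"actual {actual_rate:.3f}, reported {reported_rate:.3f}, "
      f"expected reported {expected_reported:.3f}")
```

In the simulated data the reported rate exceeds the actual rate, which is the gap a naive logit on self-reports would absorb into its estimates.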
  9. An Empirical Model

    Let $y_i$ be respondent $i$'s self-reported turnout. Let $q^1_i$ and $q^2_i$ represent unobserved indicators of actual turnout and misreport, respectively, and let $\Delta(x)$ represent the function

    $$\Delta(x) = \operatorname{logit}^{-1}(x) = \frac{1}{1 + e^{-x}}.$$

    Then

    $$P(y_i = 1) = P(q^1_i = 1) + P(q^2_i = 1 \mid q^1_i = 0)\,[1 - P(q^1_i = 1)] = \Delta(X\beta) + \Delta(Z\gamma)\,[1 - \Delta(X\beta)].$$

    Estimated using maximum likelihood in Stata.
    Identifying the model requires at least one "non-overlapping" variable, that is, a variable that appears in one equation but not the other.

    Quick observations
    • some convergence problems
    • some separation problems
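The probability on this slide translates into a Bernoulli log-likelihood. The following is a minimal plain-Python sketch for illustration only; the slides state that the model was estimated in Stata, and this is not that code. The function names, the toy data, and the coefficient values are all hypothetical.

```python
import math

def inv_logit(x):
    """Delta(x) = 1 / (1 + exp(-x)), computed in a numerically stable way."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def dot(row, coefs):
    """Linear predictor for one respondent."""
    return sum(r * c for r, c in zip(row, coefs))

def prob_report_turnout(x_row, z_row, beta, gamma):
    """P(y_i = 1) = Delta(x'beta) + Delta(z'gamma) * (1 - Delta(x'beta))."""
    p_vote = inv_logit(dot(x_row, beta))         # P(actual turnout)
    p_misreport = inv_logit(dot(z_row, gamma))   # P(misreport | abstained)
    return p_vote + p_misreport * (1.0 - p_vote)

def log_likelihood(y, X, Z, beta, gamma):
    """Sum of Bernoulli log-likelihood contributions for self-reports y (0/1)."""
    total = 0.0
    for y_i, x_row, z_row in zip(y, X, Z):
        p = prob_report_turnout(x_row, z_row, beta, gamma)
        total += math.log(p) if y_i == 1 else math.log(1.0 - p)
    return total

# Toy data (hypothetical): X = [intercept, education];
# Z = [intercept, education, days between election and interview].
X = [[1.0, 12.0], [1.0, 16.0]]
Z = [[1.0, 12.0, 5.0], [1.0, 16.0, 30.0]]
y = [1, 0]
beta = [-2.0, 0.2]           # made-up coefficients
gamma = [-3.0, 0.05, 0.02]   # made-up coefficients
print(log_likelihood(y, X, Z, beta, gamma))
```

In the toy data the third column of `Z` has no counterpart in `X`, so it plays the role of a variable that enters only one equation. A maximum likelihood routine would search over `beta` and `gamma` to maximize `log_likelihood`.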
  10. Data

    We rely on the 1988 ANES.
    • accurate validation effort
    • richest set of identifying variables

    We look specifically at the effect of education on turning out.
    • one of the most studied relationships in political science
    • simple model specification
  11. Variables

    In the equation modeling actual turnout...
    • years of education (key variable)
    • family income
    • age
    • gender
    • African-American

    In the equation modeling over-reports...
    • all of the above
    • number of days between the election and the interview

    We drop all respondents who required more than two calls.
  12. [Figure] Estimated Pr(Vote). x-axis: Years of Education (8 to 16); y-axis: Estimated Pr(Turn Out) (0.2 to 1.0). Series: Logit (Self-Report).
  13. [Figure] Estimated Pr(Vote). Same axes. Series: Logit (Validated), Logit (Self-Report).
  14. [Figure] Estimated Pr(Vote). Same axes. Series: Over-Report Model, Logit (Validated), Logit (Self-Report).
  15. [Figure] Estimated ME of Education on Pr(Vote). x-axis: Years of Education (8 to 16); y-axis: Estimated ME of Education (0.00 to 0.12). Series: Logit (Self-Report).
  16. [Figure] Estimated ME of Education on Pr(Vote). Same axes. Series: Logit (Validated), Logit (Self-Report).
  17. [Figure] Estimated ME of Education on Pr(Vote). Same axes. Series: Over-Report Model, Logit (Validated), Logit (Self-Report).
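Slides 15 to 17 plot the marginal effect (ME) of education on Pr(Vote). The slides do not say how the ME was computed. One common choice for a logit turnout equation, treating education as continuous, is the derivative of Δ(Xβ) with respect to education: β_educ · Δ(Xβ) · [1 − Δ(Xβ)]. The sketch below assumes that choice and holds all other covariates inside a single intercept term. The coefficients are made up for illustration and are not the authors' estimates.

```python
import math

def inv_logit(x):
    """Delta(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def marginal_effect_of_education(educ, intercept, b_educ):
    """d Pr(Vote) / d educ for a logit with linear predictor intercept + b_educ * educ."""
    p = inv_logit(intercept + b_educ * educ)
    return b_educ * p * (1.0 - p)

# Hypothetical coefficients; other covariates are folded into the intercept.
intercept, b_educ = -3.0, 0.3

# Evaluate over the same education range shown on the figures' x-axis.
for educ in range(8, 17, 2):
    p = inv_logit(intercept + b_educ * educ)
    me = marginal_effect_of_education(educ, intercept, b_educ)
    print(f"educ={educ:2d}  Pr(Vote)={p:.3f}  ME={me:.3f}")
```

Under this formula the ME peaks where Pr(Vote) is 0.5 and shrinks toward zero as the probability approaches 0 or 1, so the ME curve is hump-shaped or monotone depending on where the education range sits on the logit curve.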
  18. Conclusion

    We began with two questions.
    1. Do split-population models offer a viable option for dealing with misreports?
    2. Can this specific application say something more general about split-population models?

    Based on our preliminary results, we have reached two conclusions.
    1. In an over-reporting application, we find that our over-report model leads us to make more biased inferences than the procedure we intended to correct.
    2. Our findings suggest that future research should cast a more critical eye toward applications relying on split-population models without very strong theoretical guidance for model specification.