Extracting political information from legislative speech

Getting policy positions from word frequency data. Paper given at St. Catherine's College, Oxford at the ESRC Methods Festival.

Will Lowe

July 20, 2012

Transcript

  1. Political information from legislative speech

    proposing and examining legislation, constituency issues, explaining policy, holding government to account...
    policy agenda: issue framing and reframing in debate and questioning...
    policy positions: taking and inferring positions on issues
    ESRC Methods Festival, June 2012
  2. Political information from legislative speech

    votes: mostly uninformative about individual positions on a potentially biased set of issues
    manifestos: intermittently produced and uninformative about individual positions...
    legislative speech: more expressive, more frequent, less constrained...
  3. Spatial Models for Text

    Statistical models of position depend on relative emphasis across discrete indicators
    Word counts (association models): $\log E[C_{ij}] = \alpha_i + \psi_j + \theta_i \beta_j$
    (Monroe & Maeda 2004; Slapin and Proksch 2007. But basically Goodman, 1979)
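
The association model on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it reads the slide's formula as log E[C_ij] = alpha_i + psi_j + theta_i * beta_j, and the function names and toy parameter values are hypothetical. It also shows what "relative emphasis" means here: the log ratio of two words' expected counts in a document does not depend on the document term alpha_i.

```python
import math

def expected_count(alpha_i, psi_j, theta_i, beta_j):
    """Expected count of word j in document i under the association model."""
    return math.exp(alpha_i + psi_j + theta_i * beta_j)

def log_emphasis_ratio(psi, beta, theta_i, j, k):
    """log(E[C_ij] / E[C_ik]); the document term alpha_i cancels out."""
    return (psi[j] - psi[k]) + theta_i * (beta[j] - beta[k])

def poisson_loglik(counts, alpha, psi, theta, beta):
    """Poisson log-likelihood of a document-by-word count matrix."""
    total = 0.0
    for i, row in enumerate(counts):
        for j, c in enumerate(row):
            mu = expected_count(alpha[i], psi[j], theta[i], beta[j])
            total += c * math.log(mu) - mu - math.lgamma(c + 1)
    return total

# Hypothetical toy values: two documents, three words.
alpha = [0.0, 0.2]          # document effects (length)
psi = [1.0, 0.5, 0.0]       # word effects (overall frequency)
theta = [-1.0, 1.0]         # document positions
beta = [0.8, 0.0, -0.8]     # word discrimination parameters
counts = [[1, 2, 3], [4, 1, 0]]

print(log_emphasis_ratio(psi, beta, theta[0], 0, 2))
print(poisson_loglik(counts, alpha, psi, theta, beta))
```
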
  4. Spatial Models for Text

    Statistical models of position depend on relative emphasis across discrete indicators
    Word counts (correspondence analysis): $C_{ij}/n = r_i c_j (1 + \theta_i \beta_j)$
    (Laver et al. 2003. But basically Hirschfeld 1935, Benzécri 1973, Greenacre 1983)
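
A short worked step suggests why the correspondence analysis form and the association model are closely related. It reads the slide's formula as C_ij/n = r_i c_j (1 + theta_i beta_j) and assumes the product theta_i beta_j is small, so that log(1 + x) is approximately x. Treating log r_i and log c_j as playing the roles of a document term and a word term is an illustrative reading, not something stated on the slides.

```latex
\begin{align*}
\log \frac{C_{ij}}{n}
  &= \log r_i + \log c_j + \log\left(1 + \theta_i \beta_j\right) \\
  &\approx \log r_i + \log c_j + \theta_i \beta_j
  \qquad \text{when } |\theta_i \beta_j| \ll 1 .
\end{align*}
```
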
  5. Model assumptions

    Unidimensionality: e.g. left-right, pro/anti-EU
    Poisson distributed word counts: variance = mean; words generated independently
    Local/conditional independence: non-positional information is noise
  6. ... can nevertheless work well (Slapin & Proksch, 2007)

    [Figure 1: Left-Right Party Positions in Germany, 1990-2005, including 95% confidence intervals. Axes: Year (1990, 1994, 1998, 2002, 2005) against Party Position (−2 to 2). Parties: PDS, Greens, SPD, CDU-CSU, FDP.]
  7. Certainty: An aside

    “Doubt is to certainty as neurosis is to psychosis. The neurotic is in doubt and has fears about persons and things; the psychotic has convictions and makes claims about them. In short, the neurotic has problems, the psychotic has solutions.” Thomas Szasz
    We may doubt the model and (independently) fear that its inferences are too certain
  8. Certainty in text scaling

    A good position uncertainty measure answers the question: How could this speech have been different? (while still expressing the same position)
  9. Three ways to be uncertain

    Bayes: assume the model is correct (Monroe & Maeda 2004, Lo et al. 2011)
    ML: assume the model is correct & word parameters are well estimated (Lowe & Benoit 2010)...
    Bootstrap: only expected word rate is correct (Lowe & Benoit 2011)
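
One way to read the ML option is sketched below. It assumes the association model log E[C_ij] = alpha_i + psi_j + theta_i * beta_j, treats the word parameters psi and beta as known, estimates one document's alpha and theta by Newton-Raphson, and takes the standard error of theta from the inverse Fisher information. This is an illustrative sketch, not the procedure from the cited papers; the function names and toy numbers are hypothetical.

```python
import math

def information(alpha, theta, psi, beta):
    """Fitted means and the 2x2 Fisher information for (alpha, theta)."""
    mu = [math.exp(alpha + p + theta * b) for p, b in zip(psi, beta)]
    i_aa = sum(mu)
    i_at = sum(b * m for m, b in zip(mu, beta))
    i_tt = sum(b * b * m for m, b in zip(mu, beta))
    return mu, i_aa, i_at, i_tt

def fit_document(counts, psi, beta, iters=50):
    """Newton-Raphson for one document with word parameters held fixed.

    Returns (alpha, theta, se_theta).
    """
    alpha, theta = 0.0, 0.0
    for _ in range(iters):
        mu, i_aa, i_at, i_tt = information(alpha, theta, psi, beta)
        # Score: gradient of the Poisson log-likelihood.
        g_a = sum(c - m for c, m in zip(counts, mu))
        g_t = sum(b * (c - m) for c, m, b in zip(counts, mu, beta))
        # Newton step: solve the 2x2 system (information) * step = score.
        det = i_aa * i_tt - i_at * i_at
        alpha += (i_tt * g_a - i_at * g_t) / det
        theta += (i_aa * g_t - i_at * g_a) / det
    # Standard error of theta from the inverse information at the estimate.
    _, i_aa, i_at, i_tt = information(alpha, theta, psi, beta)
    se_theta = math.sqrt(i_aa / (i_aa * i_tt - i_at * i_at))
    return alpha, theta, se_theta

# Hypothetical word parameters and one document's word counts.
psi = [2.0, 1.5, 1.0, 1.5, 2.0]
beta = [-1.0, -0.5, 0.0, 0.5, 1.0]
counts = [6, 5, 4, 8, 16]

alpha_hat, theta_hat, se = fit_document(counts, psi, beta)
print(theta_hat, se)
```

Because the word parameters are treated as fixed, this standard error ignores their estimation error, which is one reason such model-based intervals can come out too narrow.
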
  10. Bootstrapping Text

    Resampling ‘writes’ the speeches that could have been given, but weren’t...
    parametric: Resample from the fitted word counts and refit
    word: Resample individual words and refit
    sentence: Resample natural sentences and refit
    block: Resample overlapping length K word sequences
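
The three non-parametric resampling schemes listed on this slide can be sketched as follows. This is a minimal illustration, not the authors' Java code: the function names are hypothetical, and `estimate` stands in for refitting the scaling model to a resampled speech (here a toy word-share statistic). The parametric variant is omitted because it needs a fitted model to sample from.

```python
import math
import random

def word_bootstrap(tokens, rng):
    """Resample individual words with replacement, keeping speech length."""
    return [rng.choice(tokens) for _ in tokens]

def sentence_bootstrap(sentences, rng):
    """Resample whole sentences (lists of tokens) with replacement."""
    return [rng.choice(sentences) for _ in sentences]

def block_bootstrap(tokens, k, rng):
    """Resample overlapping length-k word sequences up to the original length."""
    starts = range(len(tokens) - k + 1)
    out = []
    while len(out) < len(tokens):
        s = rng.choice(starts)
        out.extend(tokens[s:s + k])
    return out[:len(tokens)]

def bootstrap_se(tokens, estimate, n_reps=200, seed=0):
    """Standard deviation of `estimate` over word-bootstrap replicates."""
    rng = random.Random(seed)
    reps = [estimate(word_bootstrap(tokens, rng)) for _ in range(n_reps)]
    mean = sum(reps) / n_reps
    return math.sqrt(sum((r - mean) ** 2 for r in reps) / (n_reps - 1))

# Toy speech and a stand-in "position" statistic (share of the word "cuts").
speech = "we oppose these cuts because these cuts hurt working people".split()
share_of_cuts = lambda toks: toks.count("cuts") / len(toks)

print(bootstrap_se(speech, share_of_cuts))
```

In practice `estimate` would rebuild the word count vector from the resampled speech and refit the position, so the spread of the refitted positions serves as the uncertainty measure.
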
  11. Qualitative Validation

    Data: 14 speeches from the debate on Ireland’s 2010 budget (FF+Greens vs FG+Lab+SF)
    Subjects: 18 PhD students (LSE and TCD)
    Task: Identify speaker positions, directly and by pairwise comparison, and indicate uncertainty
    Questions: Does the model recover human positioning? What is appropriate certainty?
  12. Respondents’ positions

    [Figure: respondents’ positions for the 14 speakers on a scale from −1.5 to 2.0, with a party legend (FF, Green, FG, LAB, SF). Speakers, in the order listed on the slide: OCaolain, Morgan, Gilmore, Burton, Higgins, Quinn, Bruton, ODonnell, Kenny, Gormley, Ryan, Cuffe, Lenihan, Cowen.]
  13. Model’s positions (ML errors)

    [Figure: model positions for the 14 speakers on a scale from −1.5 to 2.0, with a party legend (FF, Green, SF, FG, LAB). Speakers, in the order listed on the slide: Burton, Higgins, Quinn, Gilmore, Kenny, Bruton, ODonnell, OCaolain, Morgan, Ryan, Cuffe, Gormley, Lenihan, Cowen.]
  14. Results: Doubt

    The model
      distinguishes government and opposition, but less strongly than respondents
      correctly orders parties, except for Sinn Féin
      identifies more variation in the Green party
    Individual speaker positions are apparently difficult to assign
  15. Uncertainty from five methods

    [Figure: method uncertainty − human uncertainty (−0.1 to 0.2) plotted against human position (−1.0 to 1.5).]
  16. Results: Uncertainty

    Model-based uncertainty measures (ML, parametric bootstrap) are indeed too small
    Other bootstrap measures seem to be too large, except for... Sinn Féin and Green positions that the model gets wrong
    Word-based bootstrap is probably the best option for getting positions from text
  17. Conclusions

    Strong statistical model assumptions do not prevent effective extraction of positions from legislative speech
    However, model-based uncertainty measures are over-confident and can be usefully replaced by a bootstrap approach
    Qualitative validation is essential for quantitative text analysis methods
  18. Replication

    Position estimation code is in the R package austin, available from R-Forge.
    Correspondence analysis code is in the R packages MASS and ca.
    Bootstrap code in Java and survey materials are available from the authors on request
    ... but you’ll have to find your own PhD students
  19. Sinn Féin again

    SF rejected the proposed and alternative budget:
    “replacing Fianna Fáil and the Green Party with Fine Gael and the Labour Party will make no difference to economic recovery. [...] we are the only party that stands up for working people... We are unique because we are the only party with an alternative analysis of the situation.”
    A second dimension? ... perhaps not
  20. Is Sinn Féin just special?

    Test: How do parties in opposition differentiate themselves?
  21. [Figure: model position with word bootstrap uncertainty plotted against human positions with uncertainty, both axes from −2 to 3. Opposition speakers shown: Gilmore (LAB), Kenny (FG), Bruton (FG), Quinn (LAB), Burton (LAB), ODonnell (FG), Morgan (SF), OCaolain (SF), Higgins (LAB).]
  22. Some possible solutions

    Two-dimensional
      doubtful (but covers 90% of human position variance)
    Conditionally one-dimensional
    Two-and-a-bit dimensional
      government / opposition is a weakly dimensional