Vine Pruning for Efficient Multi-Pass Dependency Parsing

Coarse-to-fine inference has been shown to be a robust approximate method for improving the efficiency of structured prediction models while preserving their accuracy. We propose a multi-pass coarse-to-fine architecture for dependency parsing using linear-time vine pruning and structured prediction cascades. Our first-, second-, and third-order models achieve accuracies comparable to those of their unpruned counterparts, while exploring only a fraction of the search space. We observe speed-ups of up to two orders of magnitude compared to exhaustive search. Our pruned third-order model is twice as fast as an unpruned first-order model and also compares favorably to a state-of-the-art transition-based parser for multiple languages.

Alexander Rush

October 16, 2012

Transcript

  1. Dependency Parsing

    [Figure: dependency parse of the example sentence "* As McGwire neared , fans went wild".]
  2. Styles of Dependency Parsing (axes: speed, accuracy)

    • transition-based parsers (Nivre, 2004): greedy O(n), k-best O(kn)
    • graph-based parsers (Eisner, 2000), (McDonald, 2005): first-order O(n^3), second-order O(n^3), third-order O(n^4)
  3. Styles of Dependency Parsing (axes: speed, accuracy)

    • transition-based parsers (Nivre, 2004): greedy O(n), k-best O(kn)
    • graph-based parsers (Eisner, 2000), (McDonald, 2005): first-order O(n^3), second-order O(n^3), third-order O(n^4)
    • this work
  4. Preview: Coarse-to-Fine Cascades

    [Figure: the example sentence "* As McGwire neared , fans went wild" shown at three successive stages labeled vine, first, second.]
  5. Representation

    [Figure: the example sentence "* As McGwire neared , fans went wild" with its words laid out along two axes labeled Heads and Modifiers.]
  13. First-Order Feature Calculation

    [Figure: the example sentence "* As McGwire neared , fans went wild".]
  14. First-Order Feature Calculation

    Features for the arc between "went" and "As" (direction left, length 5):

    [went] [VBD] [As] [ADP] [went] [VERB] [As] [IN] [went, VBD] [As, ADP] [went, As] [VBD, ADP] [went, VERB] [As, IN] [went, As] [VERB, IN] [VBD, As, ADP] [went, As, ADP] [went, VBD, ADP] [went, VBD, As] [ADJ, *, ADP] [VBD, *, ADP] [VBD, ADJ, ADP] [VBD, ADJ, *] [NNS, *, ADP] [NNS, VBD, ADP] [NNS, VBD, *] [ADJ, ADP, NNP] [VBD, ADP, NNP] [VBD, ADJ, NNP] [NNS, ADP, NNP] [NNS, VBD, NNP] [went, left, 5] [VBD, left, 5] [As, left, 5] [ADP, left, 5] [VERB, As, IN] [went, As, IN] [went, VERB, IN] [went, VERB, As] [JJ, *, IN] [VERB, *, IN] [VERB, JJ, IN] [VERB, JJ, *] [NOUN, *, IN] [NOUN, VERB, IN] [NOUN, VERB, *] [JJ, IN, NOUN] [VERB, IN, NOUN] [VERB, JJ, NOUN] [NOUN, IN, NOUN] [NOUN, VERB, NOUN] [went, left, 5] [VERB, left, 5] [As, left, 5] [IN, left, 5] [went, VBD, As, ADP] [VBD, ADJ, *, ADP] [NNS, VBD, *, ADP] [VBD, ADJ, ADP, NNP] [NNS, VBD, ADP, NNP] [went, VBD, left, 5] [As, ADP, left, 5] [went, As, left, 5] [VBD, ADP, left, 5] [went, VERB, As, IN] [VERB, JJ, *, IN] [NOUN, VERB, *, IN] [VERB, JJ, IN, NOUN] [NOUN, VERB, IN, NOUN] [went, VERB, left, 5] [As, IN, left, 5] [went, As, left, 5] [VERB, IN, left, 5] [VBD, As, ADP, left, 5] [went, As, ADP, left, 5] [went, VBD, ADP, left, 5] [went, VBD, As, left, 5] [ADJ, *, ADP, left, 5] [VBD, *, ADP, left, 5] [VBD, ADJ, ADP, left, 5] [VBD, ADJ, *, left, 5] [NNS, *, ADP, left, 5] [NNS, VBD, ADP, left, 5] [NNS, VBD, *, left, 5] [ADJ, ADP, NNP, left, 5] [VBD, ADP, NNP, left, 5] [VBD, ADJ, NNP, left, 5] [NNS, ADP, NNP, left, 5] [NNS, VBD, NNP, left, 5] [VERB, As, IN, left, 5] [went, As, IN, left, 5] [went, VERB, IN, left, 5] [went, VERB, As, left, 5] [JJ, *, IN, left, 5] [VERB, *, IN, left, 5] [VERB, JJ, IN, left, 5] [VERB, JJ, *, left, 5] [NOUN, *, IN, left, 5] [NOUN, VERB, IN, left, 5]
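
Many of the templates listed on this slide read as conjunctions of the head and modifier words and tags, optionally extended with the arc direction and length. A minimal Python sketch of that idea follows. `arc_features` is a hypothetical helper, the tags for "neared" and "," are assumed, position 0 is taken to be the root symbol *, and the neighbouring-tag templates (such as [VBD, ADJ, ADP, NNP]) are left out.

```python
from itertools import combinations

def arc_features(words, tags, head, mod):
    """Conjoin head/modifier word and tag atoms, with and without direction and length.

    Hypothetical helper: it covers only the word/tag conjunctions, not the
    neighbouring-tag templates listed on the slide.
    """
    direction = "left" if mod < head else "right"
    length = abs(head - mod)
    atoms = [words[head], tags[head], words[mod], tags[mod]]
    feats = []
    for k in range(1, len(atoms) + 1):
        for combo in combinations(atoms, k):
            feats.append(list(combo))                        # e.g. [went, As]
            feats.append(list(combo) + [direction, length])  # e.g. [went, As, left, 5]
    return feats

# Position 0 is the root symbol "*". Tags for "neared" and "," are assumptions.
words = ["*", "As", "McGwire", "neared", ",", "fans", "went", "wild"]
tags = ["*", "ADP", "NNP", "VBD", ",", "NNS", "VBD", "ADJ"]
feats = arc_features(words, tags, head=6, mod=1)
```

For the arc between "went" (position 6) and "As" (position 1), the sketch produces entries such as [went, As, left, 5] and [went, VBD, As, ADP], both of which appear on the slide.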
  15. Arc Length By Part-of-Speech

    [Plot: counts (ticks 0.0 to 0.5) against arc length (ticks 1 to 6), with one series each for NOUN, ADP, DET, VERB, and ADJ.]
  18. Arc Length Examples

    * The bill intends to restrict the RTC to Treasury borrowings only , unless the agency receives specific congressional authorization .
  19. Arc Length Examples

    * This financing system was created in the new law in order to keep the bailout spending from swelling the budget deficit .
  20. Arc Length Examples

    * But the RTC also requires “ working ” capital to maintain the bad assets of thrifts that are sold , until the assets can be sold separately .
  21. Arc Length Examples

    * “ It ’s a problem that clearly has to be resolved , ” said David Cooke , executive director of the RTC .
  22. Arc Length Examples

    * “ We would have to wait until we have collected on those assets before we can move forward , ” he said .
  23. Arc Length Examples

    * The complicated language in the huge new law has muddied the fight .
  24. Arc Length Examples

    * “ That secrecy leads to a proposal like the one from Ways and Means , which seems to me sort of draconian , ” he said .
  25. Arc Length Examples

    * “ The RTC is going to have to pay a price of prior consultation on the Hill if they want that kind of flexibility . ”
  26. Arc Length Heat Map

    [Heat map with ticks 1 through 9 on both axes.]
  28. Banded Matrix

    [Figure: banded matrix with the example sentence "* As McGwire neared , fans went wild" along both axes.]
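
A banded matrix keeps only the arcs whose head and modifier lie within a fixed distance of each other. As a rough Python sketch, with a hypothetical band width `b` (the slides do not give a value) and position 0 standing for the root symbol *, which can act as a head but never as a modifier:

```python
def banded_arcs(n, b):
    """Return (head, modifier) pairs with |head - modifier| <= b.

    n counts positions including the root at index 0; b is a hypothetical
    band width chosen for illustration.
    """
    return [
        (h, m)
        for m in range(1, n)   # the root (index 0) is never a modifier
        for h in range(n)
        if h != m and abs(h - m) <= b
    ]

short = banded_arcs(8, 2)  # example sentence: root plus seven tokens
full = banded_arcs(8, 7)   # band wide enough to keep every arc
```

For a fixed `b`, the number of surviving arcs grows linearly with sentence length, while the full matrix grows quadratically.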
  30. Outer Arc

    [Figure: outer arc illustrated on the example sentence "* As McGwire neared , fans went wild".]
  34. Coarse-to-Fine

    [Figure: the example sentence "* As McGwire neared , fans went wild" at the stage labeled vine.]
  35. Coarse-to-Fine

    [Figure: the example sentence "* As McGwire neared , fans went wild" at the stages labeled vine and first.]
  36. Coarse-to-Fine

    [Figure: the example sentence "* As McGwire neared , fans went wild" at the stages labeled vine, first, and second.]
  37. Inference Questions

    questions:
    • How do we reduce inference time to O(n)?
    • How do we decide which arcs to prune?

    Vine Parsing (Eisner and Smith, 2005)
  38. First-Order Parsing

    [Figure sequence: first-order parsing of the example sentence "* As McGwire neared , fans went wild".]
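  47. Vine Parsing Rules

    [Diagram: five deduction rules over chart items, listed here by their endpoint indices only.]
    • (0, e) ← (0, e − 1) + (e, e − 1)
    • (0, e) ← (0, m) + (e, m)
    • (0, e) ← (0, e)
    • (0, e) ← (0, m) + (e, m)
    • (0, e) ← (0, e − 1) + (e − 1, e)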
  48. Vine Parsing

    [Figure sequence: vine parsing of the example sentence "* As McGwire neared , fans went wild".]
  60. Arc Pruning

    • Prune arcs based on max-marginals.

      $\mathrm{maxmarginal}(a) = \max_{y : a \in y} (y \cdot w)$

    • Can compute using inside-outside algorithm.
    • Generic algorithm using hypergraph parsing.
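
To make the definition concrete, the sketch below computes max-marginals by brute force on a toy two-word sentence. It is not the inside-outside algorithm named on the slide: it enumerates every directed tree rooted at position 0 (without restricting to projective trees), scores each tree as the sum of its arc scores, and records the best total for each arc. The score matrix and the helper names are illustrative assumptions.

```python
from itertools import product

def is_tree(heads):
    """heads[m - 1] is the head of word m; 0 is the root.

    True when every word reaches the root without revisiting a node,
    which also rules out self-loops.
    """
    for m in range(1, len(heads) + 1):
        seen, node = set(), m
        while node != 0:
            if node in seen:
                return False
            seen.add(node)
            node = heads[node - 1]
    return True

def max_marginals(score):
    """score[h][m] is the score of arc h -> m.

    Returns {(h, m): score of the best tree that contains arc h -> m}.
    Brute force: exponential in sentence length, for illustration only.
    """
    n = len(score) - 1
    best = {}
    for heads in product(range(n + 1), repeat=n):
        if not is_tree(heads):
            continue
        total = sum(score[h][m] for m, h in enumerate(heads, start=1))
        for m, h in enumerate(heads, start=1):
            if total > best.get((h, m), float("-inf")):
                best[(h, m)] = total
    return best

# Toy arc scores (assumed values): rows are heads, columns are modifiers.
score = [
    [0, 3, 1],
    [0, 0, 2],
    [0, 1, 0],
]
mm = max_marginals(score)
```

In this toy example the best tree uses arcs 0 → 1 and 1 → 2, so those two arcs share the highest max-marginal, while arc 2 → 1 can only appear in a lower-scoring tree.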
  61. Max-Marginals for First-Order Arcs

    maxmarginal(neared → fans) > threshold ?

    [Figure: the example sentence "* As McGwire neared , fans went wild".]
  62. Max-Marginals for Outer Arcs

    maxmarginal(LEFT → fans) > threshold ?

    [Figure: the example sentence "* As McGwire neared , fans went wild".]
  63. Max-Marginal Pruning

    goal: Define a threshold on max-marginal score.
    • Validation parameter α trades off between speed and accuracy.

      $t_\alpha(w) = \alpha \max_y (y \cdot w) + (1 - \alpha) \frac{1}{|A|} \sum_{a \in A} \mathrm{maxmarginal}(a, w)$

    • Highest scoring parse upper bounds any max-marginal.
    • Assume average of max-marginals is lower than gold.
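
The threshold can be worked through on a handful of numbers. The Python sketch below uses made-up max-marginal values and takes the best parse score to be the largest max-marginal, which holds when the arc set A contains the arcs of the best parse. It keeps arcs at or above the threshold; that is an assumption (the slides write a strict >), made so that the arcs of the best parse survive even at α = 1.

```python
def threshold(max_marginals, alpha):
    """t_alpha = alpha * best parse score + (1 - alpha) * mean max-marginal."""
    values = list(max_marginals.values())
    best = max(values)  # equals max_y (y . w) when A covers the best parse
    mean = sum(values) / len(values)
    return alpha * best + (1 - alpha) * mean

def prune(max_marginals, alpha):
    """Keep arcs whose max-marginal is at or above the threshold (assumed >=)."""
    t = threshold(max_marginals, alpha)
    return {arc for arc, value in max_marginals.items() if value >= t}

# Illustrative max-marginals for four candidate arcs (assumed values).
mm = {(0, 1): 5, (0, 2): 4, (1, 2): 5, (2, 1): 2}
t_half = threshold(mm, 0.5)
kept_half = prune(mm, 0.5)
kept_zero = prune(mm, 0.0)
```

With these numbers the mean max-marginal is 4 and the best score is 5, so α = 0.5 gives a threshold of 4.5 and keeps two arcs, while α = 0 lowers the threshold to 4 and keeps three. Larger α prunes more aggressively.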
  64. Structured Cascade Training (Weiss and Taskar, 2011)

    • Train a linear model with a loss function for pruning.
    • Regularized risk minimization with loss based on threshold

      $\min_w \; \lambda \|w\|^2 + \frac{1}{P} \sum_{p=1}^{P} \left[ 1 - y^{(p)} \cdot w + t_\alpha^{(p)}(w) \right]_+$

    • Can use a simple variant of perceptron/pegasos to train.
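
As a small worked instance of this objective, the Python sketch below evaluates it for fixed inputs. It assumes the gold score y(p) · w and the threshold for each training example have already been computed for the current w; the function names and all numbers are illustrative, and no optimizer is shown.

```python
def cascade_loss(gold_score, thresh):
    """Hinge term [1 - gold_score + thresh]_+ for one training example."""
    return max(0.0, 1.0 - gold_score + thresh)

def objective(w, examples, lam):
    """lam * ||w||^2 plus the average hinge term over the examples.

    examples is a list of (gold_score, thresh) pairs, assumed to have been
    computed with the same weight vector w.
    """
    reg = lam * sum(x * x for x in w)
    hinge = sum(cascade_loss(g, t) for g, t in examples) / len(examples)
    return reg + hinge

# Assumed values: two training sentences and a two-dimensional weight vector.
examples = [(5.0, 4.5), (3.0, 4.0)]
value = objective([1.0, 2.0], examples, lam=0.1)
```

The first example has its gold score only 0.5 above the threshold, so it still pays a loss of 0.5. The second has its gold score below the threshold and pays 2. The loss reaches zero only when the gold parse clears the threshold by a margin of at least 1.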
  65. Implementation

    Inference
    • Experiments use a highly-optimized C++ implementation.
    • Baseline first-order parser processes 2000 tokens/sec.
    • Hypergraph parsing framework with shared inference.

    Model
    • Final models trained with Hamming-loss MIRA.
    • Full collection of dependency parsing features (Koo, 2010).
    • First-, second-, and third-order models match state-of-the-art.
  66. Baselines

    • NoPrune: exhaustive parsing model with no pruning
    • LocalShort: unstructured classifier over O(n) short arcs (Bergsma and Cherry, 2010)
    • Local: unstructured classifier over O(n^2) arcs (Bergsma and Cherry, 2010)
    • FirstOnly: structured first-order model in cascade (Koo, 2010)
    • VinePosterior: posterior pruning cascade trained with L-BFGS
    • ZhangNivre: reimplementation of state-of-the-art, k-best, transition-based parser (Zhang and Nivre, 2011)
  67. Speed/Accuracy Experiments: First-Order Parsing

    [Plot: Accuracy (ticks 90 to 94) against Relative Speed (ticks 0 to 6) for NoPrune, Local, FirstOnly, VinePosterior, VineCascade, and ZhangNivre(8).]
  68. Speed/Accuracy Experiments: Second-Order Parsing

    [Plot: Accuracy (ticks 90 to 94) against Relative Speed (ticks 0 to 4) for NoPrune, Local, FirstOnly, VinePosterior, VineCascade, and ZhangNivre(16).]
  69. Speed/Accuracy Experiments: Third-Order Parsing

    [Plot: Accuracy (ticks 90 to 94) against Relative Speed (ticks 0 to 2) for NoPrune, Local, FirstOnly, VinePosterior, VineCascade, and ZhangNivre(64).]
  70. Empirical Complexity: First-Order Parsing

    [Plot: time against sentence length (ticks 10 to 50). Legend: NoPrune [2.8], VineCascade [1.4].]
  71. Empirical Complexity: Second-Order Parsing

    [Plot: time against sentence length (ticks 10 to 50). Legend: NoPrune [2.8], VineCascade [1.8].]
  72. Empirical Complexity: Third-Order Parsing

    [Plot: time against sentence length (ticks 10 to 50). Legend: NoPrune [3.8], VineCascade [1.9].]
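
If the bracketed numbers in these legends are read as fitted exponents of running time in sentence length (an assumption; the transcript does not say), such an exponent can be estimated as the slope of a least-squares line in log-log space. The sketch below shows that calculation in Python. The timings are synthetic, generated from an exact cubic, and are not measurements from the experiments.

```python
import math

def fit_exponent(lengths, times):
    """Least-squares slope of log(time) against log(length)."""
    xs = [math.log(n) for n in lengths]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

lengths = [10, 20, 30, 40, 50]
synthetic_times = [2e-6 * n ** 3 for n in lengths]  # exact cubic, illustrative only
exponent = fit_exponent(lengths, synthetic_times)
```

On the synthetic cubic the fitted slope comes out at 3, up to floating-point error. Real timings would give a noisier estimate.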
  73. Multilingual Experiments: First-Order Parsing

    [Plot: Relative Speed (ticks 0 to 7) of NoPrune and VineCascade for En, Bg, De, Pt, Sw, and Zh.]
  74. Multilingual Experiments: Second-Order Parsing

    [Plot: Relative Speed (ticks 0 to 6) of NoPrune and VineCascade for En, Bg, De, Pt, Sw, and Zh.]
  75. Special thanks to:

    Ryan McDonald, Hao Zhang, Michael Ringgaard, Terry Koo, Keith Hall, Kuzman Ganchev, Yoav Goldberg, Andre Martins, and the rest of the Google NLP team