ACL-EMNLP 2012 review

I will introduce five papers from ACL 2012 and three papers from EMNLP 2012, mainly related to educational applications of natural language processing.

Slides were presented at the Educational NLP research group regular meeting (NAIST, Japan).

Mamoru Komachi

July 24, 2012

Transcript

  1. ACL/EMNLP 2012 review (eNLP version)
    Mamoru Komachi
    2012/07/17
    Educational NLP research group
    Computational Linguistics Lab
    Nara Institute of Science and Technology, Japan
  2. Today’s agenda
    • Introduce several papers presented at the ACL/EMNLP conferences
    • This is not a complete list, so please take a look at the accepted papers yourself!
    • There are more papers in related areas such as spelling correction and text normalization (especially for microblogs like Twitter)
    • Disclaimer: I haven’t read any of the papers yet. I will talk about my impressions from the presentations (oral, poster, demo) of the work. Please refer to the papers themselves if you are interested :)
  3. ACL
    • Native Language Detection with Tree Substitution Grammars
    • A Corpus of Textual Revisions in Second Language Writing
    • A Meta Learning Approach to Grammatical Error Correction
    • FLOW: A First-Language-Oriented Writing Assistant System
    • Grammar Error Correction Using Pseudo-Error Sentences and Domain Adaptation
  4. Native Language Detection with Tree Substitution Grammars (Short Paper)
    Ben Swanson and Eugene Charniak (Brown University, USA)
    Problem: Though syntactic features are known to be useful for native language detection, CFG rules cannot capture long-range dependencies
    Idea: Use Tree Substitution Grammars to extract tree fragments for native language identification
    • Use tree fragments as features for a MaxEnt classifier
    • Tested on ICLE and outperformed baselines (CFG and frequent-based tree mining)
  5. A Corpus of Textual Revisions in Second Language Writing
    John Lee and Jonathan Webster (City University of Hong Kong)
    Problem: There is no ESL corpus containing sentence-aligned revision logs
    Idea: Collected a corpus with (possibly multiple) revision logs of ESL learners
    • Errors are identified by language teachers (not necessarily the same person for each revision)
    • Email the authors to get a copy for research purposes
  6. A Meta Learning Approach to Grammatical Error Correction (Short Paper)
    Hongsuck Seo, Jonghoon Lee, Seokhwan Kim, Kyusong Lee, Sechun Kang, and Gary Geunbae Lee (PosTech, Korea)
    Problem: There are many ESL corpora, each with different characteristics
    Idea: Train several classifiers using different corpora, and combine them with a meta-classifier
    • Base classifiers use ASO (Ando and Zhang, 2005) to train a model from both a native corpus and an error-tagged corpus
    • The meta-learner improves precision and F1 on the article error correction task
  7. FLOW: A First-Language-Oriented Writing Assistant System
    Mei-Hua Chen, Shih-Ting Huang, Hung-Ting Hsieh, Ting-Hui Kao, and Jason S. Chang (National Tsing Hua University, Taiwan)
    http://www.youtube.com/watch?v=uhH55fEPiqI
    Problem: Previous ESL assistance tools do not take context and native language into account
    Idea: Developed a browser-based ESL writing assistance system for Chinese speakers
    • Accepts Chinese input given an English context, and shows predictive text based on N-grams
    • Suggests paraphrases by round-trip translation (En->Ch->En)
  8. Grammar Error Correction Using Pseudo-Error Sentences and Domain Adaptation (Short Paper)
    Kenji Imamura, Kuniko Saito, Kugatsu Sadamitsu, and Hitoshi Nishikawa (NTT)
    Problem: Error-tagged corpora of language learners are hard to obtain
    Idea: Automatically generate error-tagged corpora using a confusion set (derived from a manually tagged corpus)
    • Applied frustratingly easy domain adaptation
    • Domain adaptation gives a stable improvement
  9. EMNLP
    • A Beam-Search Decoder for Grammatical Error Correction
    • Assessment of ESL Learners’ Syntactic Competence Based on Similarity Measures
    • Exploring Adaptor Grammars for Native Language Identification
  10. A Beam-Search Decoder for Grammatical Error Correction
    Daniel Dahlmeier and Hwee Tou Ng (NUS, Singapore)
    Problem: The traditional approach uses multi-class pointwise prediction, which does not correct a sentence as a whole
    Idea: Build a beam-search decoder that combines the classification approach and SMT
    • Pipeline: proposers generate candidates and experts rank the generated candidates
    • Tested on spelling, article, preposition, punctuation insertion, and noun number tasks and achieved state-of-the-art results
  11. Assessment of ESL Learners’ Syntactic Competence Based on Similarity Measures
    Su-Youn Yoon and Suma Bhat (UIUC, USA)
    Problem: Previous studies focus on the length of the output, such as the mean length of clauses
    Idea: Focus on morpho-syntactic features for measuring English proficiency
    • Constructed a POS-based vector space model for each proficiency level
    • POS tag sequences are robust and correlate highly with human evaluation
  12. Exploring Adaptor Grammars for Native Language Identification
    Sze-Meng Jojo Wong, Mark Dras, and Mark Johnson (Macquarie University, Australia)
    Problem: {word,character,POS} N-gram features for native language identification do not consider long-range contextual information
    Idea: Use Adaptor Grammars (a non-parametric extension to PCFGs) to capture long n-grams
    • Built a MaxEnt classifier to combine a syntactic language model and n-gram collocations
    • Experimental results are not stable, but show better accuracy overall
  13. Summary
    • Introduced eNLP-related papers presented at ACL/EMNLP
    • For M2/D students: eNLP can exploit sophisticated methods explored in sequence labeling, parsing, and SMT (e.g. string-to-tree, tree substitution grammar, etc.)
    • For M1 students: Find a good problem and think hard to solve it!
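
Slide 8 mentions frustratingly easy domain adaptation. The following is a minimal sketch of the feature augmentation idea behind that method, not the NTT system itself: every feature is copied into a shared version and a domain-specific version, so that a linear classifier can learn which weights transfer across domains and which do not. The function name `augment`, the feature names, and the domain labels `pseudo` and `learner` are hypothetical and chosen only for illustration. Treating pseudo-error sentences and real learner sentences as the two domains is an assumption based on the slide.

```python
from typing import Dict


def augment(features: Dict[str, float], domain: str) -> Dict[str, float]:
    """Copy each feature into a shared version and a domain-specific version.

    `domain` is a hypothetical label such as "pseudo" or "learner".
    """
    augmented: Dict[str, float] = {}
    for name, value in features.items():
        augmented[f"general:{name}"] = value   # shared across all domains
        augmented[f"{domain}:{name}"] = value  # visible only in this domain
    return augmented


# Hypothetical features for one instance from each domain.
pseudo = augment({"prev_word=in": 1.0, "pos=NN": 1.0}, "pseudo")
learner = augment({"prev_word=in": 1.0}, "learner")

# Only the shared copies overlap between the two domains.
shared = sorted(set(pseudo) & set(learner))
print(shared)  # ['general:prev_word=in']
```

The augmented dictionaries can then be fed to any linear classifier (for example a MaxEnt model); the learning algorithm itself does not need to change.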