onizuka laboratory
December 18, 2018
A Word-Complexity Lexicon and A Neural Readability Ranking Model for Lexical Simplification
These are the presentation slides from the EMNLP2018 reading group held in our laboratory.
Transcript
EMNLP A Word-Complexity Lexicon and A Neural Readability Ranking Model
2018/12/18 M1
• 2 • 15000 • SimplePPDB++
• Complex Sentence: The cat perched on the mat. • Complex Word Identification: The cat perched on the mat. • Substitution Generation: perched : rested, sat • Substitution Ranking: #1 : sat, #2 : rested • Simplified Sentence: The cat sat on the mat.
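The stages on this slide (Complex Word Identification, Substitution Generation, Substitution Ranking) can be sketched roughly as below. This is only a toy illustration: the complexity scores, the threshold, and the substitution table are made-up values, and the function names are hypothetical rather than taken from the paper or the slides.

```python
# Toy lexical simplification pipeline. The complexity scores and the
# substitution table are made-up illustration values, not lexicon entries.
TOY_COMPLEXITY = {"the": 1.0, "cat": 1.2, "perched": 4.0, "on": 1.0,
                  "mat": 1.8, "rested": 2.5, "sat": 1.5}
TOY_SUBSTITUTIONS = {"perched": ["rested", "sat"]}


def identify_complex_words(tokens, threshold=3.0):
    """Complex Word Identification: flag tokens scored above the threshold."""
    return [t for t in tokens if TOY_COMPLEXITY.get(t.lower(), 0.0) > threshold]


def generate_substitutions(word):
    """Substitution Generation: look up candidate replacements."""
    return TOY_SUBSTITUTIONS.get(word.lower(), [])


def rank_substitutions(candidates):
    """Substitution Ranking: simplest candidate (lowest score) first."""
    return sorted(candidates, key=lambda c: TOY_COMPLEXITY.get(c, float("inf")))


def simplify(sentence):
    """Replace each flagged word with its top-ranked substitution, if any."""
    tokens = sentence.rstrip(".").split()
    complex_words = set(identify_complex_words(tokens))
    out = []
    for tok in tokens:
        if tok in complex_words:
            ranked = rank_substitutions(generate_substitutions(tok))
            out.append(ranked[0] if ranked else tok)
        else:
            out.append(tok)
    return " ".join(out) + "."


print(simplify("The cat perched on the mat."))
```

With these toy values the call reproduces the slide's example, turning "perched" into "sat".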
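• foolishness vs folly • foolishness is the simpler word even though it is longer • Google Ngram Corpus: foolishness is also the less frequent word • 2272 word pairs sampled from PPDB • 21%: the simpler word is longer • 14%: the simpler word is less frequent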
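The 21% and 14% figures on this slide read as shares of word pairs in which a length or frequency heuristic picks the wrong word. A minimal sketch of how such shares could be computed follows, assuming each pair is given as (simpler word, more complex word) and that foolishness is the simpler member of its pair. The function name, the pairs, and the frequency counts are all hypothetical and are not Google Ngram or PPDB data.

```python
def heuristic_failure_rates(pairs, freq):
    """pairs: list of (simpler, more_complex) words; freq: word -> corpus count.

    Returns (share of pairs where the simpler word is longer,
             share of pairs where the simpler word is less frequent).
    """
    if not pairs:
        raise ValueError("need at least one pair")
    longer = sum(len(s) > len(c) for s, c in pairs)
    rarer = sum(freq[s] < freq[c] for s, c in pairs)
    return longer / len(pairs), rarer / len(pairs)


# Hypothetical counts for illustration only (not Google Ngram values).
toy_freq = {"foolishness": 900, "folly": 1500, "sat": 50000, "perched": 800}
toy_pairs = [("foolishness", "folly"), ("sat", "perched")]

longer_rate, rarer_rate = heuristic_failure_rates(toy_pairs, toy_freq)
print(longer_rate, rarer_rate)
```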
• Google Ngram Corpus • 15000 words • 11 annotators
• 6-point scale • 1000 words take 2-2.5h • each word rated by 5-7 annotators
• 3% of ratings removed • agreement: 0.55 → 0.64
• share of ratings by distance from the mean rating • ≤0.5: 47% • ≤1.0: 78% • ≤1.5: 93%
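The three thresholds on this slide (0.5, 1.0, 1.5) can be read as cumulative shares of annotator ratings that fall within a given distance of a word's mean rating. The sketch below assumes that reading and uses the mean over all ratings of a word. The function name and the toy ratings are hypothetical, not data from the lexicon.

```python
from statistics import mean


def share_within(ratings_by_word, thresholds=(0.5, 1.0, 1.5)):
    """ratings_by_word: word -> list of numeric ratings from different annotators.

    Returns {threshold: share of ratings whose absolute distance from that
    word's mean rating is <= threshold}.
    """
    distances = []
    for ratings in ratings_by_word.values():
        m = mean(ratings)
        distances.extend(abs(r - m) for r in ratings)
    if not distances:
        raise ValueError("no ratings given")
    return {t: sum(d <= t for d in distances) / len(distances) for t in thresholds}


# Made-up ratings for two words, five annotators each.
toy = {"cat": [1, 1, 2, 1, 1], "perched": [4, 5, 3, 4, 4]}
shares = share_within(toy)
print(shares)
```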
SemEval2012 • 30 Target, 300 Candidate • 171 Target, 1710 Candidate • TEXT: When you think about it, that’s pretty terrible. • Target: terrible • Candidates: bad, awful, deplorable
• P@1 • all • binning • WC • 15000
• PPDB • Ranking model
• PPDB
• SimplePPDB++
• Target, Candidate • 100 Target
• SimplePPDB++
• n Target • PPs Candidate • MAP Candidate • P@1 Top1
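The two metrics named on this slide, MAP and P@1, can be sketched as follows under their standard ranking definitions. How the slides define the relevant (gold) candidates is not recoverable, so the gold sets below are made up and the function names are hypothetical.

```python
def precision_at_1(ranked, relevant):
    """1.0 if the top-ranked candidate is relevant, else 0.0."""
    return 1.0 if ranked and ranked[0] in relevant else 0.0


def average_precision(ranked, relevant):
    """Sum of precision@k at each rank k holding a relevant candidate,
    divided by the number of relevant candidates."""
    hits, total = 0, 0.0
    for k, cand in enumerate(ranked, start=1):
        if cand in relevant:
            hits += 1
            total += hits / k
    return total / len(relevant) if relevant else 0.0


def evaluate(instances):
    """instances: list of (ranked_candidates, set_of_relevant_candidates).

    Returns (P@1, MAP), each averaged over all instances.
    """
    if not instances:
        raise ValueError("need at least one instance")
    n = len(instances)
    p1 = sum(precision_at_1(r, g) for r, g in instances) / n
    mean_ap = sum(average_precision(r, g) for r, g in instances) / n
    return p1, mean_ap


# Made-up gold sets for illustration only.
toy_instances = [
    (["bad", "awful", "deplorable"], {"bad"}),
    (["rested", "sat"], {"sat"}),
]
p_at_1, mean_ap = evaluate(toy_instances)
print(p_at_1, mean_ap)
```

P@1 only looks at the top candidate, while MAP also gives partial credit when a relevant candidate appears lower in the ranking, as in the second toy instance.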
• SemEval2016 CWI • WC
• SOTA • 15000
• CWI • SimplePPDB++