onizuka laboratory
December 18, 2018
A Word-Complexity Lexicon and A Neural Readability Ranking Model for Lexical Simplification
Presentation slides from the EMNLP 2018 reading group held in our laboratory.
Transcript
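EMNLP 2018 reading group: A Word-Complexity Lexicon and A Neural Readability Ranking Model for Lexical Simplification
2018/12/18, M1
Overview
• Two proposals: a word-complexity lexicon and a neural readability ranking model
• A lexicon of 15,000 words
• SimplePPDB++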
Lexical simplification pipeline
• Complex sentence: The cat perched on the mat.
• Complex Word Identification: The cat perched on the mat. (target: perched)
• Substitution Generation: perched : rested, sat
• Substitution Ranking: #1 : sat, #2 : rested
• Simplified sentence: The cat sat on the mat.
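The three pipeline stages on this slide (Complex Word Identification, Substitution Generation, Substitution Ranking) can be sketched as follows. This is a minimal illustration, not the paper's model: the complexity scores, the paraphrase table, the threshold, and all function names are hypothetical stand-ins for the lexicon and ranking model.

```python
# Toy resources. Every value here is an invented placeholder, not data from the paper.
COMPLEXITY = {"the": 1.0, "cat": 1.0, "on": 1.0, "mat": 1.5,
              "perched": 4.0, "rested": 2.5, "sat": 1.0}
PARAPHRASES = {"perched": ["rested", "sat"]}
THRESHOLD = 3.0  # assumed cut-off above which a word counts as complex


def identify_complex_words(tokens):
    """Complex Word Identification: indices of tokens scored at or above the threshold."""
    return [i for i, t in enumerate(tokens)
            if COMPLEXITY.get(t.lower(), 0.0) >= THRESHOLD]


def generate_substitutions(word):
    """Substitution Generation: look up candidate paraphrases for a word."""
    return PARAPHRASES.get(word.lower(), [])


def rank_substitutions(candidates):
    """Substitution Ranking: simplest candidate first (lowest complexity score)."""
    return sorted(candidates, key=lambda w: COMPLEXITY.get(w, float("inf")))


def simplify(sentence):
    """Run the three stages and replace each complex word with its top candidate."""
    end = "." if sentence.endswith(".") else ""
    tokens = sentence.rstrip(".").split()
    for i in identify_complex_words(tokens):
        ranked = rank_substitutions(generate_substitutions(tokens[i]))
        if ranked:
            tokens[i] = ranked[0]
    return " ".join(tokens) + end


print(simplify("The cat perched on the mat."))  # The cat sat on the mat.
```

In the paper the ranking step is done by a neural model; sorting by a fixed score is only the simplest stand-in for it.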
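Motivation
• Word length and word frequency are common proxies for word complexity, but they are unreliable
• foolishness vs folly: foolishness is the longer word and, in the Google Ngram Corpus, the less frequent one, yet it is the simpler of the two
• Of 2,272 word pairs sampled from PPDB, the simpler word is longer in 21% and less frequent in 14%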
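The kind of count behind the 21% and 14% figures can be sketched as below: given (simpler, complex) word pairs, measure how often the simpler word is nonetheless longer or rarer. The pairs and frequency counts are invented for illustration; they are not PPDB or Google Ngram data, and `fraction_where` is a hypothetical helper.

```python
# (simpler, more complex) word pairs: toy examples only.
PAIRS = [("foolishness", "folly"), ("sat", "perched"),
         ("bad", "deplorable"), ("help", "assist")]

# Invented corpus counts, standing in for real n-gram frequencies.
FREQ = {"foolishness": 10, "folly": 40, "sat": 900, "perched": 30,
        "bad": 5000, "deplorable": 20, "help": 8000, "assist": 700}


def fraction_where(pairs, predicate):
    """Share of pairs for which predicate(simpler, complex) holds."""
    return sum(1 for simple, hard in pairs if predicate(simple, hard)) / len(pairs)


# Cases where the length heuristic fails: the simpler word is longer.
simpler_is_longer = fraction_where(PAIRS, lambda s, c: len(s) > len(c))
# Cases where the frequency heuristic fails: the simpler word is rarer.
simpler_is_rarer = fraction_where(PAIRS, lambda s, c: FREQ[s] < FREQ[c])

print(simpler_is_longer, simpler_is_rarer)  # 0.25 0.25
```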
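Building the word-complexity lexicon
• The 15,000 most frequent words in the Google Ngram Corpus
• Rated by 11 annotators
Annotation
• 6-point Likert scale (chosen over a 5-point scale)
• For words with several capitalizations (e.g. bug), one form is shown to the annotators
• About 2 to 2.5 hours to rate 1,000 words
• Each word is rated by 5 to 7 annotators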
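Annotation quality
• Ratings that differ by 2 or more from the mean of the other ratings are discarded
• About 3% of the ratings are discarded
• Inter-annotator agreement: 0.55 → 0.64
Difference between a rating and the mean of the other ratings
• ≦0.5: 47%
• ≦1.0: 78%
• ≦1.5: 93%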
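A minimal sketch of outlier filtering for one word's ratings, assuming the rule is "drop a rating when it differs from the mean of that word's other ratings by 2 or more" and that the final score is the mean of the surviving ratings. The function names and sample ratings are hypothetical.

```python
from statistics import mean


def filter_ratings(ratings, max_diff=2.0):
    """Keep each rating whose distance from the mean of the OTHER ratings is below max_diff.

    Every rating is compared against the original list, so removing one
    outlier does not change the reference mean used for the others.
    """
    kept = []
    for i, rating in enumerate(ratings):
        others = ratings[:i] + ratings[i + 1:]
        if not others or abs(rating - mean(others)) < max_diff:
            kept.append(rating)
    return kept


def word_score(ratings):
    """Lexicon score for a word: average of the ratings that survive filtering."""
    return mean(filter_ratings(ratings))


ratings = [1, 2, 1, 2, 6]          # five annotators on a 6-point scale (made-up values)
print(filter_ratings(ratings))     # [1, 2, 1, 2]
print(word_score(ratings))         # 1.5
```

The rating of 6 is 4.5 away from the mean of the other four (1.5), so it is dropped; the remaining ratings are each within 2 of their reference mean.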
SemEval 2012 dataset
• Task: rank the Candidates for a Target word in context by simplicity
• Training: 30 Target, 300 Candidate
• Test: 171 Target, 1710 Candidate
Example
• TEXT: When you think about it, that’s pretty terrible.
• Target: terrible
• Candidates: bad, awful, deplorable
Results on SemEval 2012 (P@1) [table not recoverable; surviving labels: all, binning, WC, 15000]
SimplePPDB++
• The ranking model is applied to PPDB paraphrase pairs
• [remaining text of these slides not recoverable; surviving labels: Target, Candidate, 100]
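Evaluation metrics
• #PPs: number of Candidates (paraphrases) generated per Target
• MAP: mean average precision of the ranked Candidates
• P@1: precision of the top-1 Candidate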
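The two ranking metrics named on this slide can be sketched as follows, using their standard information-retrieval definitions. The exact variants used in the paper may differ; the gold sets and rankings below are invented examples, and the function names are hypothetical.

```python
def precision_at_1(ranked, gold):
    """1.0 if the top-ranked candidate is an accepted (gold) substitution, else 0.0."""
    return 1.0 if ranked and ranked[0] in gold else 0.0


def average_precision(ranked, gold):
    """Average of precision@k over the ranks k at which a gold candidate appears."""
    hits, total = 0, 0.0
    for k, candidate in enumerate(ranked, start=1):
        if candidate in gold:
            hits += 1
            total += hits / k
    return total / len(gold) if gold else 0.0


def evaluate(examples):
    """Return (P@1, MAP) averaged over (ranked candidates, gold set) examples."""
    n = len(examples)
    p_at_1 = sum(precision_at_1(r, g) for r, g in examples) / n
    mean_ap = sum(average_precision(r, g) for r, g in examples) / n
    return p_at_1, mean_ap


examples = [
    (["bad", "awful", "deplorable"], {"bad", "awful"}),  # target: terrible
    (["rested", "sat"], {"sat"}),                        # target: perched
]
print(evaluate(examples))  # (0.5, 0.75)
```

In the first example both gold candidates are ranked first, giving an average precision of 1.0; in the second the only gold candidate sits at rank 2, giving 0.5.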
Complex Word Identification
• Datasets: SemEval2016 and CWIG3G2
• Uses the word-complexity (WC) lexicon
Summary
• State-of-the-art (SOTA) results
• A lexicon of 15,000 words
• CWI
• SimplePPDB++