A Word-Complexity Lexicon and A Neural Readability Ranking Model for Lexical Simplification
onizuka laboratory
December 18, 2018
Research
Presentation slides from the EMNLP 2018 paper-reading session held in our laboratory.
Transcript
EMNLP 2018 reading session: A Word-Complexity Lexicon and A Neural Readability Ranking Model for Lexical Simplification
2018/12/18 M1
• 2 • 15000 • SimplePPDB++
Complex Sentence: The cat perched on the mat.
Complex Word Identification: The cat perched on the mat.
Substitution Generation: perched : rested, sat
Substitution Ranking: #1 : sat, #2 : rested
Simplified Sentence: The cat sat on the mat.
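The three stages on this slide can be sketched as a toy pipeline. This is only an illustration of the data flow, not the paper's model: the complexity scores, the paraphrase table, the threshold, and all function names are hypothetical, invented for this example.

```python
# Toy lexical simplification pipeline:
# complex word identification -> substitution generation -> substitution ranking.
# Every score and paraphrase below is invented for illustration only.

# Hypothetical word-complexity scores (lower = simpler); NOT the paper's lexicon.
COMPLEXITY = {
    "the": 1.0, "cat": 1.2, "perched": 4.0, "on": 1.0,
    "mat": 1.8, "sat": 1.3, "rested": 2.5,
}

# Hypothetical paraphrase table standing in for a resource such as PPDB.
PARAPHRASES = {"perched": ["rested", "sat"]}

THRESHOLD = 3.0  # assumed cut-off above which a word is treated as complex


def identify_complex_words(tokens):
    """Return the tokens whose (toy) complexity score exceeds the threshold."""
    return [t for t in tokens if COMPLEXITY.get(t.lower(), 0.0) > THRESHOLD]


def generate_substitutions(word):
    """Look up candidate replacements for a word in the toy paraphrase table."""
    return list(PARAPHRASES.get(word.lower(), []))


def rank_substitutions(candidates):
    """Order candidates from simplest to most complex; unknown words go last."""
    return sorted(candidates, key=lambda w: COMPLEXITY.get(w, float("inf")))


def simplify(sentence):
    """Replace each complex word with its top-ranked candidate, if any."""
    tokens = sentence.rstrip(".").split()
    complex_words = set(identify_complex_words(tokens))
    out = []
    for tok in tokens:
        if tok in complex_words:
            ranked = rank_substitutions(generate_substitutions(tok))
            out.append(ranked[0] if ranked else tok)
        else:
            out.append(tok)
    return " ".join(out) + "."


print(simplify("The cat perched on the mat."))
```

With these invented scores, "perched" is the only word above the threshold, "sat" outranks "rested", and the slide's example sentence is reproduced.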
• foolishness vs. folly • Google Ngram Corpus • PPDB • 21% • 14%
• Google Ngram Corpus • 15,000 words • 11 annotators
• 6-point scale • 1,000 words: 2-2.5 h • 5-7 annotators per word
• 3% • 0.55 → 0.64
• ≦0.5: 47% • ≦1.0: 78% • ≦1.5: 93%
SemEval 2012
• 30 Target, 300 Candidate
• 171 Target, 1710 Candidate
TEXT: When you think about it, that’s pretty terrible.
Target: terrible
Candidates: bad, awful, deplorable
• P@1 • all • binning • WC • 15000
• PPDB • Ranking model • SimplePPDB++
• Target, Candidate • 100 Target • SimplePPDB++
• Target • PPs: Candidate • MAP • P@1: Top 1
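The two metrics named on this slide, P@1 and MAP, can be computed for ranked candidate lists roughly as follows. This is a minimal sketch using the common textbook definitions; the exact variants used in the paper or in the SemEval task may differ, and the function names and sample data are hypothetical.

```python
# Sketch of P@1 and MAP for ranked candidate lists.
# Each instance is a pair: (system ranking, set of gold-standard candidates).


def precision_at_1(ranked, gold):
    """1.0 if the top-ranked candidate is a gold candidate, else 0.0."""
    return 1.0 if ranked and ranked[0] in gold else 0.0


def average_precision(ranked, gold):
    """Mean of precision@k over the ranks k where a gold candidate appears,
    normalised by the number of gold candidates (an assumed convention)."""
    if not gold:
        return 0.0
    hits = 0
    total = 0.0
    for k, cand in enumerate(ranked, start=1):
        if cand in gold:
            hits += 1
            total += hits / k
    return total / len(gold)


def evaluate(instances):
    """Return (P@1, MAP) averaged over all instances."""
    n = len(instances)
    p1 = sum(precision_at_1(r, g) for r, g in instances) / n
    mean_ap = sum(average_precision(r, g) for r, g in instances) / n
    return p1, mean_ap


# Invented example data: the gold sets are illustrative, not real annotations.
instances = [
    (["bad", "awful", "deplorable"], {"bad"}),
    (["rested", "sat"], {"sat"}),
]
p1, mean_ap = evaluate(instances)
print(p1, mean_ap)
```

In the invented data the first ranking puts its gold candidate first and the second puts it at rank 2, so the first instance scores 1.0 on both metrics while the second scores 0.0 on P@1 and 0.5 on average precision.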
• SemEval2016 • CWIG3G2 • WC
• SOTA • 15000 • CWI • SimplePPDB++