Scatter Lab Inc.
July 17, 2019
Real-Time Open-Domain Question Answering with Dense-Sparse Phrase Index
Transcript
Real-Time Open-Domain Question Answering With Dense-Sparse Phrase Index
ScatterLab, Machine Learning Engineer 김준성
Everyday-conversation AI
Motivation
Query: When is Christmas?
Document Retrieval (TF-IDF search): Christmas, Jesus, Christmas Tree, Christmas Eve, …
Reading Comprehension (SQuAD): "The celebration of the birth of Jesus began around the 4th century. Because the date of Christmas had not been firmly fixed at the time, one of January 6, March 21, or December 25 was chosen. December 25 was settled on as Christmas after AD 354." → December 25
Error Propagation: a wrong article retrieved in the first stage (…) leads to a wrong answer (X).
Super Slow: the document collection is Wikipedia-scale, the model is BERT-Large, and a query takes 10-100 sec, so it cannot be served in production.
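Motivation: Super Slow
SQuAD-BERT input pairs:
When is Christmas? | The celebration of the birth of Jesus..
When is Christmas? | For Christians, the most…
When is Christmas? | For couples, the most….
→ December 25
Because the referenced Articles cannot be cached, everything has to be recomputed for each Query.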
Models: Query Encoding
Input: [CLS] What day is Christmas?
Output: Query Embedding
Models: Phrase Encoding
Context: Christmas is an annual festival commemorating the birth of Jesus Christ, observed on December 25.
Each candidate phrase is represented by a Start Span Embed and an End Span Embed. Example phrases:
December 25
annual festival commemorating the birth of Jesus Christ
Models: Sparse Encoding
Unlike the usual Document Retrieval + Reading Comprehension style, the Query is matched directly against Spans (Query -> Span) by Distribution Matching (similar to RMM?).
Models: Sparse Encoding
START EMBED / END EMBED / COHERENCE EMBED = DENSE EMBEDDING
Models: Sparse Encoding
Why is Coherence included?
Models: Dense Encoding
Split the Token Embed dimension into four equal parts (1, 2, 3, 4), then Concat: Start_embed1, End_embed2, dot[Start_embed3, End_embed4]
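The split-and-concatenate recipe on this slide can be sketched as follows. This is a minimal illustration, not the speaker's implementation: the function names (`split_four`, `dense_phrase_vector`) and the toy 8-dimensional token embeddings are assumptions made for the example, and plain Python lists stand in for real encoder outputs.

```python
from typing import List, Sequence


def split_four(vec: Sequence[float]) -> List[List[float]]:
    """Split a token embedding into four equal-sized chunks (parts 1-4)."""
    if len(vec) % 4 != 0:
        raise ValueError("embedding dimension must be divisible by 4")
    q = len(vec) // 4
    return [list(vec[i * q:(i + 1) * q]) for i in range(4)]


def dense_phrase_vector(start_tok: Sequence[float],
                        end_tok: Sequence[float]) -> List[float]:
    """Concat[start part 1, end part 2, dot(start part 3, end part 4)]."""
    s = split_four(start_tok)  # chunks of the phrase's first token
    e = split_four(end_tok)    # chunks of the phrase's last token
    # Coherence is a single scalar: the dot product of two chunks.
    coherence = sum(a * b for a, b in zip(s[2], e[3]))
    return s[0] + e[1] + [coherence]


# Hypothetical toy embeddings (dimension 8, so each chunk has 2 values).
start_token = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
end_token = [0.5, 0.5, 1.0, 1.0, 2.0, 2.0, 3.0, 3.0]
phrase_vec = dense_phrase_vector(start_token, end_token)
print(phrase_vec)
```

With a token dimension of d, the resulting phrase vector has d/4 + d/4 + 1 entries, because the coherence term collapses two chunks into one number.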
Models: Sparse Encoding
Build the Sparse Embedding with 2-Gram TF-IDF, and also add a Paragraph-Level Sparse Embedding.
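A 2-gram TF-IDF sparse embedding can be sketched with the standard library alone. The slide does not say which TF-IDF weighting or tokenizer is used, so the whitespace tokenizer and the smoothed IDF formula `ln((1 + N) / (1 + df)) + 1` below are assumptions, as are the function names and the two toy texts. The same function could be applied to whole paragraphs to get a paragraph-level vector.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

Bigram = Tuple[str, str]


def bigrams(text: str) -> List[Bigram]:
    """Lowercase, split on whitespace, and pair up adjacent tokens."""
    tokens = text.lower().split()
    return list(zip(tokens, tokens[1:]))


def bigram_tfidf(docs: List[str]) -> List[Dict[Bigram, float]]:
    """Return one sparse {bigram: weight} vector per input text."""
    counts = [Counter(bigrams(d)) for d in docs]
    # Document frequency: in how many texts each bigram occurs.
    df = Counter(bg for c in counts for bg in c)
    n = len(docs)
    vectors = []
    for c in counts:
        total = sum(c.values())
        vectors.append({
            bg: (tf / total) * (math.log((1 + n) / (1 + df[bg])) + 1.0)
            for bg, tf in c.items()
        })
    return vectors


def sparse_dot(a: Dict[Bigram, float], b: Dict[Bigram, float]) -> float:
    """Inner product of two sparse vectors stored as dicts."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller dict
    return sum(w * b[k] for k, w in a.items() if k in b)


# Hypothetical toy texts.
texts = [
    "christmas is observed on december 25",
    "christmas eve is on december 24",
]
vecs = bigram_tfidf(texts)
print(sparse_dot(vecs[0], vecs[1]))
```

Storing only non-zero entries keeps each vector small even though the vocabulary of possible bigrams is very large.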
Training: Training Efficiency
The usual approach is to compare against every Document and take the argmax.
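The exhaustive compare-then-argmax step can be sketched as a brute-force scan over inner-product scores. The scoring function, the helper names, and the toy 3-dimensional vectors are assumptions for illustration only; the slide does not specify them.

```python
from typing import List, Sequence, Tuple


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def best_candidate(query_vec: Sequence[float],
                   candidate_vecs: List[Sequence[float]]) -> Tuple[int, float]:
    """Score every candidate against the query and return (argmax, score)."""
    if not candidate_vecs:
        raise ValueError("need at least one candidate")
    scores = [inner_product(query_vec, c) for c in candidate_vecs]
    best = max(range(len(scores)), key=lambda i: scores[i])
    return best, scores[best]


# Hypothetical toy vectors for three candidate answers.
candidates = ["December 25", "annual festival", "Jesus Christ"]
candidate_vecs = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.2], [0.2, 0.1, 0.9]]
query_vec = [1.0, 0.0, 0.1]

idx, score = best_candidate(query_vec, candidate_vecs)
print(candidates[idx], score)
```

The cost of this scan grows linearly with the number of candidates, which is why comparing against everything is expensive.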
Training: Training Efficiency
However, it is enough to train positively on the matching query (the Positive Sample in RMM). Rearranging the formula a little allows the Loss to be computed more efficiently.
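Result
On SQuAD 1.1, performance is admittedly about 8% lower than BERT. However, in the model built for actual service, performance is 4% higher than the previous SOTA (DrQA), and it is 58 times faster than DrQA and about 6000 times faster than BERT.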