Real-Time Open-Domain Question Answering with Dense-Sparse Phrase Index
Scatter Lab Inc.
July 17, 2019
Transcript
Real-Time Open-Domain Question Answering With Dense-Sparse Phrase Index
스캐터랩 (ScatterLab), Everyday Conversation AI
Machine Learning Engineer 김준성
Motivation
Query: When is Christmas?
Document Retrieval (TF-IDF Searching): Christmas, Jesus, Christmas Tree, Christmas Eve, …
Reading Comprehension (SQuAD): "Celebrations of the birth of Jesus began around the 4th century. At that time the date of Christmas had not been settled, so one of January 6, March 21, and December 25 was chosen. December 25 became fixed as Christmas after AD 354."
Answer: December 25
Error Propagation: when retrieval returns an off-topic Christmas article …, the reader cannot recover the answer (X)
Super Slow: Document Size is hundreds of millions of Wiki entries, Model Size is BERT-Large, 10-100 sec/query, so it cannot be served
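Motivation
Super Slow: SQuAD-BERT
When is Christmas? | Celebrations of the birth of Jesus ..
When is Christmas? | For Christians, …
When is Christmas? | For couples, ….
Because the encoded Article cannot be cached, it has to be recomputed for every Query.
Answer: December 25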
Models: Query Encoding
[CLS] What day is Christmas?
Query Embedding
Models: Phrase Encoding
Christmas is an annual festival commemorating the birth of Jesus Christ, observed on December 25.
Each candidate phrase gets a Start Span Embed and an End Span Embed, for example:
December 25
annual festival commemorating the birth of Jesus Christ
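Once queries and phrases are both encoded as vectors, answering can be sketched as a search for the phrase whose vector has the largest inner product with the query vector. The snippet below is only an illustration under that assumption: the toy three-dimensional vectors, the `phrase_index` list, and the `search` helper are hypothetical and do not come from the slides, and a real system would use an approximate nearest-neighbor index rather than a linear scan.

```python
from typing import List, Sequence, Tuple


def inner_product(u: Sequence[float], v: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def search(
    query_vec: Sequence[float],
    phrase_index: List[Tuple[str, Sequence[float]]],
) -> Tuple[str, float]:
    """Return the (phrase, score) pair with the highest inner product."""
    best_phrase, best_score = "", float("-inf")
    for phrase, vec in phrase_index:
        score = inner_product(query_vec, vec)
        if score > best_score:
            best_phrase, best_score = phrase, score
    return best_phrase, best_score


# Hypothetical precomputed phrase vectors (toy values, not model output).
phrase_index = [
    ("December 25", [0.9, 0.1, 0.0]),
    ("annual festival", [0.1, 0.8, 0.2]),
    ("Jesus Christ", [0.2, 0.3, 0.9]),
]

# Hypothetical query vector for "What day is Christmas?".
query_vec = [1.0, 0.0, 0.1]

answer, score = search(query_vec, phrase_index)
print(answer, round(score, 2))
```

Because the phrase vectors do not depend on the query, they can be computed once and stored, which is what removes the per-query document encoding cost described in the Motivation slides.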
Models
Unlike the usual Document Retrieval + Reading Comprehension pipeline, the Query is matched directly against Spans by Distribution Matching (similar to RMM?)
Models
START EMBED / END EMBED / COHERENCE EMBED = DENSE EMBEDDING
Why is Coherence included??
Models: Dense Encoding
The Token Embed dimension is split into four equal parts (1, 2, 3, 4). The phrase vector is the Concat of Start_embed1, End_embed2, and dot[Start_embed3, End_embed4].
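The dense phrase vector described on this slide can be sketched in a few lines. This is a minimal illustration, assuming each token embedding is a plain list of floats whose length is divisible by four; the helper names `split_four` and `dense_phrase_vector` and the toy embeddings are hypothetical.

```python
from typing import List, Sequence, Tuple


def split_four(vec: Sequence[float]) -> Tuple[Sequence[float], ...]:
    """Split a token embedding into four equal-sized parts."""
    if len(vec) % 4 != 0:
        raise ValueError("embedding dimension must be divisible by 4")
    q = len(vec) // 4
    return vec[:q], vec[q:2 * q], vec[2 * q:3 * q], vec[3 * q:]


def dense_phrase_vector(
    token_vecs: Sequence[Sequence[float]], start: int, end: int
) -> List[float]:
    """Concat of Start_embed1, End_embed2, and dot[Start_embed3, End_embed4]."""
    s1, _, s3, _ = split_four(token_vecs[start])
    _, e2, _, e4 = split_four(token_vecs[end])
    # The coherence term is a single scalar tying the start and end tokens together.
    coherence = sum(a * b for a, b in zip(s3, e4))
    return list(s1) + list(e2) + [coherence]


# Toy 8-dimensional token embeddings (hypothetical values).
token_vecs = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    [8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0],
]

phrase_vec = dense_phrase_vector(token_vecs, start=0, end=1)
print(phrase_vec)  # [1.0, 2.0, 6.0, 5.0, 16.0]
```

With a token dimension of d, the resulting phrase vector has d/2 + 1 entries, so every (start, end) pair can be represented from per-token vectors without running the encoder again for each span.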
Models: Sparse Encoding
A Sparse Embedding is built with 2-Gram TF-IDF, and a Paragraph-Level Sparse Embedding is added as well.
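A 2-gram TF-IDF sparse embedding can be sketched with the standard library alone. The code below is an assumption-laden illustration, not the presenter's implementation: whitespace tokenization, the smoothed IDF formula, and the helper names (`build_idf`, `sparse_vector`, `sparse_dot`) are all choices made here for the example, and each vector is computed at the paragraph level.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

Bigram = Tuple[str, str]


def bigrams(text: str) -> List[Bigram]:
    """Lowercase, split on whitespace, and return adjacent token pairs."""
    tokens = text.lower().split()
    return list(zip(tokens, tokens[1:]))


def build_idf(paragraphs: List[str]) -> Dict[Bigram, float]:
    """Smoothed inverse document frequency for every bigram in the corpus."""
    n = len(paragraphs)
    df = Counter(g for p in paragraphs for g in set(bigrams(p)))
    return {g: math.log((1 + n) / (1 + c)) + 1.0 for g, c in df.items()}


def sparse_vector(text: str, idf: Dict[Bigram, float]) -> Dict[Bigram, float]:
    """TF-IDF weights stored as a dict, so only non-zero entries are kept."""
    tf = Counter(bigrams(text))
    return {g: count * idf[g] for g, count in tf.items() if g in idf}


def sparse_dot(u: Dict[Bigram, float], v: Dict[Bigram, float]) -> float:
    """Inner product of two sparse vectors; iterate over the smaller one."""
    if len(u) > len(v):
        u, v = v, u
    return sum(w * v[g] for g, w in u.items() if g in v)


# Toy paragraphs (hypothetical data).
paragraphs = [
    "christmas is observed on december 25",
    "a christmas tree is decorated with lights",
    "new year is observed on january 1",
]

idf = build_idf(paragraphs)
index = [sparse_vector(p, idf) for p in paragraphs]

query = sparse_vector("observed on december 25", idf)
scores = [sparse_dot(query, vec) for vec in index]
best = max(range(len(paragraphs)), key=scores.__getitem__)
print(best, paragraphs[best])
```

The sparse score captures exact lexical overlap that a dense vector may blur, which is a plausible reason for combining the two kinds of embedding in one index.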
Training: Training Efficiency
The usual approach compares the query against every Document and takes the argmax.
Training: Training Efficiency
However, it is enough for us to train positively on the matching query (the Positive Sample in RMM). Writing this out a little more formally lets the Loss be computed more efficiently.
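Result
On SQuAD 1.1, accuracy is admittedly about 8% lower than BERT.
However, as a model built for a real service, it is 4% more accurate than the previous SOTA (DrQA), 58 times faster than DrQA, and about 6000 times faster than BERT.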