Misspelling_Oblivious_Word_Embedding.pdf
MARUYAMA
January 22, 2020
Transcript
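Misspelling Oblivious Word Embeddings (literature review)
Bora Edizel, Aleksandra Piktus, Piotr Bojanowski, Rui Ferreira, Edouard Grave, Fabrizio Silvestri
NAACL-HLT, pp. …
Abstract
✦ Word embeddings that are robust to misspellings
・Trained so that the embedding of a misspelled word moves close to the embedding of the correctly spelled word
✦ The proposed method is shown to be effective in both intrinsic and extrinsic evaluation
・Intrinsic evaluation: word similarity, word analogy, neighborhood similarity
・Extrinsic evaluation: POS tagging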
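Introduction
✦ Out-of-vocabulary (OOV) words are very common
・Misspellings in web search queries
✦ Goal: embeddings that can cope with OOV words
・Introducing misspelled words into the training corpus: sparsity of misspelled words
・FastText plus supervised learning of misspelling patterns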
Misspelling Oblivious Embedding
✦ Skip-gram with negative sampling
・Terms in the loss: corpus, context words, set of negative examples
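The formula on this slide did not survive extraction. As a rough illustration of skip-gram with negative sampling, the sketch below computes the loss for one (word, context) pair using the usual logistic loss l(x) = log(1 + exp(-x)). The function names and toy vectors are assumptions made for this example; they are not taken from the slides.

```python
import math


def logistic_loss(x: float) -> float:
    """l(x) = log(1 + exp(-x)), written to avoid overflow for large |x|."""
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def sgns_pair_loss(word_vec, context_vec, negative_vecs):
    """Loss for one (word, context) pair plus its sampled negatives.

    The true context word should score high; each negative example
    should score low, hence the flipped sign inside the loss.
    """
    positive = logistic_loss(dot(word_vec, context_vec))
    negative = sum(logistic_loss(-dot(word_vec, n)) for n in negative_vecs)
    return positive + negative


# Toy 2-dimensional vectors (hypothetical values).
word = [0.5, -0.2]
context = [0.4, 0.1]
negatives = [[-0.3, 0.8], [0.1, -0.9]]
loss = sgns_pair_loss(word, context, negatives)
```

Summing this quantity over every word in the corpus and every context word around it gives the corpus-level loss.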
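Misspelling Oblivious Embedding
✦ FastText
・Terms in the loss L_FT: corpus, context words, set of negative examples
・The scoring function s_FT(ω_i, ω_c) uses the character n-grams of the word ω_i
・e.g. banana, 3 ≤ n ≤ 5: {ban, ana, nan, bana, anan, nana, banan, anana}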
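The n-gram set in the banana example can be reproduced with a few lines of Python. This is a minimal sketch: the function name and defaults are chosen for illustration, and it follows the slide in omitting the word-boundary markers that the fastText library adds around each word.

```python
def char_ngrams(word: str, n_min: int = 3, n_max: int = 5) -> list[str]:
    """Return the distinct character n-grams of `word`, shortest first."""
    grams: list[str] = []
    for n in range(n_min, n_max + 1):
        for start in range(len(word) - n + 1):
            gram = word[start:start + n]
            if gram not in grams:  # "ana" occurs twice in "banana"
                grams.append(gram)
    return grams


ngrams = char_ngrams("banana")
```

Because a misspelling such as "bannana" shares most of these n-grams with "banana", a model that sums n-gram vectors can still produce a vector for a word it never saw in training.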
Misspelling Oblivious Embedding
✦ MOE model
・Terms in the spell correction loss: corpus, misspelling pairs (ω_m, ω_e) ∈ M, set of negative examples
・ω_m: misspelled word
・ω_e: correctly spelled word
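The formula itself was lost in extraction. One plausible way to write a spell correction loss over the pairs in M, and to combine it with the FastText loss, is sketched below. The score \hat{s}, the negative set N_{m,e}, and the mixing weight \alpha are notation assumed for this sketch, and the original formulation may scale the two terms differently.

```latex
\ell(x) = \log\left(1 + e^{-x}\right)

L_{SC} = \sum_{(\omega_m, \omega_e) \in M}
  \Big[ \ell\big(\hat{s}(\omega_m, \omega_e)\big)
  + \sum_{\omega_n \in N_{m,e}} \ell\big(-\hat{s}(\omega_m, \omega_n)\big) \Big]

L_{MOE} = (1 - \alpha)\, L_{FT} + \alpha\, L_{SC}
```

The first term inside the brackets pulls the misspelled word ω_m toward its correction ω_e, and the negative examples push it away from unrelated words.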
Misspelling Oblivious Embedding
✦ MOE model
https://ai.facebook.com/blog/-a-new-model-for-word-embeddings-that-are-resilient-to-misspellings-/
Data
✦ English Wikipedia
・Used to optimize the FastText loss
✦ Misspellings dataset
・Generated from Facebook search queries
・… pairs
・https://bitbucket.org/bedizel/moe
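Misspelled data generation
✦ Error model
・The triplets below are built from search query histories (cases where the user corrected a misspelling)
・During MOE training, misspellings are sampled according to the misspelling probabilities
・The dataset is searched using the longest matching context c
・Triplet ⟨c, p_m, p_e⟩: c is the preceding string, p_m the character before the edit, p_e the character after the edit
・e.g. hello worjd → hello world: ⟨wor, j, l⟩, ⟨or, j, l⟩, ⟨r, j, l⟩, ⟨ε, j, l⟩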
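The triplet extraction and the longest-match lookup can be sketched as follows. This is a simplified illustration that handles only a single substituted character; the function names, the context cap of three characters, and the use of an empty string for the empty context are assumptions read off the slide's example, not details confirmed by the source.

```python
def edit_triplets(misspelled: str, correct: str, max_context: int = 3):
    """Return (context, p_m, p_e) triplets for a one-character substitution.

    context: up to `max_context` characters preceding the edit,
             shortened one character at a time down to the empty string.
    p_m:     the character the user originally typed.
    p_e:     the character after the user's correction.
    """
    if len(misspelled) != len(correct):
        raise ValueError("this sketch only handles substitutions")
    diffs = [i for i, (a, b) in enumerate(zip(misspelled, correct)) if a != b]
    if len(diffs) != 1:
        raise ValueError("expected exactly one differing character")
    pos = diffs[0]
    p_m, p_e = misspelled[pos], correct[pos]
    context = correct[max(0, pos - max_context):pos]
    return [(context[k:], p_m, p_e) for k in range(len(context) + 1)]


def longest_matching_context(prefix: str, known_contexts, max_context: int = 3):
    """Return the longest suffix of `prefix` found in `known_contexts`, or None."""
    for k in range(min(max_context, len(prefix)), -1, -1):
        candidate = prefix[len(prefix) - k:]
        if candidate in known_contexts:
            return candidate
    return None


triplets = edit_triplets("hello worjd", "hello world")
match = longest_matching_context("hello wor", {"or", "r", ""})
```

For the slide's example this yields the contexts "wor", "or", "r" and the empty string, each paired with the edit from j to l. Backing off to shorter contexts lets the error model still propose an edit when the full context was never observed.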
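Experiments
✦ Generating test data that contains misspellings
・Misspellings are added with the same method used to generate the training data
・A parameter r controls the edit distance in relation to the word length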
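Experiments
✦ Intrinsic task
・Word similarity: similarity between words, evaluated by correlation with human judgments
・Word analogy: e.g. Berlin − Germany + France = Paris, evaluated by accuracy
・Neighborhood similarity: how close the embedding of a misspelling is to the embedding of the correct word, evaluated by mean reciprocal rank (MRR) and coverage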
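Mean reciprocal rank can be computed as in the sketch below. It assumes that, for each misspelling, the nearest neighbors of its embedding have already been retrieved and ranked; the function name and the toy data are hypothetical.

```python
def mean_reciprocal_rank(ranked_neighbors, correct_words):
    """Average of 1/rank of the correct word in each ranked neighbor list.

    A query whose correct word is missing from its list contributes 0.
    """
    total = 0.0
    for neighbors, target in zip(ranked_neighbors, correct_words):
        if target in neighbors:
            total += 1.0 / (neighbors.index(target) + 1)
    return total / len(correct_words)


# Ranked neighbors of three misspelled queries (toy data).
ranked = [
    ["world", "word", "would"],  # correct word at rank 1
    ["teh", "tea", "the"],       # correct word at rank 3
    ["cat", "dog"],              # correct word not retrieved
]
targets = ["world", "the", "bird"]
mrr = mean_reciprocal_rank(ranked, targets)
```

An MRR close to 1 means the correct spelling is usually the nearest neighbor of the misspelling's embedding.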
Intrinsic evaluation
✦ Word similarity
Intrinsic evaluation
✦ Word analogy
Intrinsic evaluation
✦ Neighborhood similarity
Experiments
✦ Extrinsic task
・POS tagging: BiLSTM + CRF, with FastText or MOE as the word embedding layer
Extrinsic evaluation
✦ POS tagging
・Performance on misspelled data improves without degrading the results on the original data
Extrinsic evaluation
✦ POS tagging
・Works effectively in settings where train and test differ drastically
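Conclusion
✦ Word embeddings that are robust to misspellings
・Trained so that the embedding of a misspelled word moves close to the embedding of the correctly spelled word
✦ Confirmed that the embeddings of misspellings do move close to the embeddings of the correct words
✦ Improved performance on a range of tasks that contain misspellings (word similarity, analogy, POS tagging)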