Misspelling_Oblivious_Word_Embedding.pdf
MARUYAMA
January 22, 2020
Transcript
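Misspelling Oblivious Word Embeddings
Literature review
Bora Edizel, Aleksandra Piktus, Piotr Bojanowski, Rui Ferreira, Edouard Grave, Fabrizio Silvestri
NAACL-HLT
Abstract
✦ Word embeddings that are robust to misspellings
  ・Trained so that the embedding of a misspelled word moves close to the embedding of its correct spelling
✦ The proposed method is shown to be effective in both intrinsic and extrinsic evaluation
  ・Intrinsic evaluation: word similarity, word analogy, neighborhood similarity
  ・Extrinsic evaluation: POS tagging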
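Introduction
✦ Out-of-vocabulary (OOV) words are extremely common
  ・Misspellings in Web search queries
✦ Goal: embeddings that can cope with OOV words
  ・Introducing misspelled words into the training corpus runs into the sparsity of misspelled words
  ・FastText combined with supervised learning of misspelling patterns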
Misspelling Oblivious Embedding
✦ Skip-gram with negative sampling
  ・Loss annotated with: corpus, context words, negative sample set
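For reference, the skip-gram objective with negative sampling is commonly written as below. This is the standard formulation rather than a transcription of the slide's equation, which did not survive extraction. The symbols T (corpus), C_i (context words of ω_i), N_{i,c} (negative sample set), and the input/output vectors v and u are notation assumed here.

```latex
\mathcal{L} = \sum_{i=1}^{|T|} \sum_{\omega_c \in C_i}
  \Big[ \ell\big(s(\omega_i, \omega_c)\big)
      + \sum_{\omega_n \in N_{i,c}} \ell\big(-s(\omega_i, \omega_n)\big) \Big],
\qquad
\ell(x) = \log\big(1 + e^{-x}\big),
\qquad
s(\omega_i, \omega_c) = \mathbf{v}_{\omega_i}^{\top} \mathbf{u}_{\omega_c}
```

The three sums correspond to the three labels on the slide: the outer sum runs over the corpus, the middle one over context words, and the inner one over the negative samples.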
Misspelling Oblivious Embedding
✦ FastText
  ・Loss L_FT with score s_FT(ω_i, ω_c), annotated with: corpus, context words, negative sample set, character n-grams of the word ω_i
  ・e.g. banana, 3 ≤ n ≤ 5: {ban, ana, nan, bana, anan, nana, banan, anana}
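The banana example can be reproduced with a few lines of Python. This is a minimal sketch: the function name `char_ngrams` and its parameters are hypothetical, and it follows the slide's simplified example. The actual fastText library additionally wraps each word in the boundary markers `<` and `>` before extracting n-grams, which is omitted here.

```python
def char_ngrams(word, n_min=3, n_max=5):
    """Return the distinct character n-grams of `word` with n_min <= n <= n_max,
    ordered by n-gram length and then by first position in the word."""
    seen = []
    for n in range(n_min, n_max + 1):
        for start in range(len(word) - n + 1):
            gram = word[start:start + n]
            if gram not in seen:  # "ana" occurs twice in "banana"; keep it once
                seen.append(gram)
    return seen


print(char_ngrams("banana"))
# ['ban', 'ana', 'nan', 'bana', 'anan', 'nana', 'banan', 'anana']
```

Because a misspelling such as "bananna" shares most of these n-grams with "banana", a subword model already places the two words near each other, which is the property MOE builds on.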
Misspelling Oblivious Embedding
✦ MOE model
  ・Spell correction loss, annotated with: corpus, misspelling pairs (ω_m, ω_e) ∈ M, negative sample set
  ・ω_m: misspelled word
  ・ω_e: correctly spelled word
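One way to read the MOE objective is as a weighted sum of the FastText loss and the spell correction loss. The sketch below is an assumed reconstruction, not a transcription of the slide: the weight α, the scaling factor |T|/|M|, the negative sample set N_{m,e}, the n-gram set G_{ω_m}, and the exact form of the score ŝ are notation introduced here and should be checked against the paper.

```latex
\mathcal{L}_{MOE} = (1 - \alpha)\, \mathcal{L}_{FT}
                  + \alpha\, \frac{|T|}{|M|}\, \mathcal{L}_{SC},
\qquad
\mathcal{L}_{SC} = \sum_{(\omega_m, \omega_e) \in M}
  \Big[ \ell\big(\hat{s}(\omega_m, \omega_e)\big)
      + \sum_{\omega_n \in N_{m,e}} \ell\big(-\hat{s}(\omega_m, \omega_n)\big) \Big],
\qquad
\hat{s}(\omega_m, \omega_e) = \sum_{g \in G_{\omega_m}} \mathbf{v}_g^{\top} \mathbf{v}_{\omega_e},
\qquad
\ell(x) = \log\big(1 + e^{-x}\big)
```

Under this reading, the spell correction term has the same shape as a skip-gram loss, except that the "context" of a misspelling ω_m is its correct spelling ω_e, so minimizing it pulls the two embeddings together.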
Misspelling Oblivious Embedding
✦ MOE model
  ・Source: https://ai.facebook.com/blog/-a-new-model-for-word-embeddings-that-are-resilient-to-misspellings-/
Data
✦ English Wikipedia
  ・Used to optimize the FastText loss
✦ Misspellings dataset
  ・Generated from Facebook search queries
  ・Consists of misspelling pairs
  ・https://bitbucket.org/bedizel/moe
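Misspelled data generation
✦ Error model
  ・Triplets <c, p_m, p_e> are built from the search query history (cases where users corrected their own misspellings)
      c: preceding string, p_m: character before the edit, p_e: character after the edit
      e.g. hello worjd → hello world: <wor, j, l>, <or, j, l>, <r, j, l>, <ε, j, l> (ε: empty string)
  ・During MOE training, misspellings are sampled according to the misspelling probability
  ・The dataset is searched using the longest matching c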
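The triplet construction and the longest-match lookup can be sketched as follows. The function names are hypothetical, the maximum context length of 3 is an assumption inferred from the hello worjd example, and only single-character substitutions are handled (insertions, deletions, and the probability estimation itself are left out).

```python
def edit_triplets(misspelled, corrected, max_context=3):
    """Build <c, p_m, p_e> triplets for a single-character substitution.

    c   : up to `max_context` characters preceding the edit, shrinking to ""
    p_m : character before the edit (taken from the misspelled string)
    p_e : character after the edit (taken from the corrected string)
    """
    if len(misspelled) != len(corrected):
        raise ValueError("this sketch only handles substitutions")
    diffs = [i for i, (a, b) in enumerate(zip(misspelled, corrected)) if a != b]
    if len(diffs) != 1:
        raise ValueError("this sketch expects exactly one substituted character")
    pos = diffs[0]
    longest = min(max_context, pos)
    return [
        (misspelled[pos - k:pos], misspelled[pos], corrected[pos])
        for k in range(longest, -1, -1)
    ]


def longest_matching_context(prefix, contexts, max_context=3):
    """Return the longest suffix of `prefix` (at most `max_context` characters)
    that is present in `contexts`, or None if not even "" is present."""
    for k in range(min(max_context, len(prefix)), -1, -1):
        candidate = prefix[len(prefix) - k:]
        if candidate in contexts:
            return candidate
    return None


print(edit_triplets("hello worjd", "hello world"))
# [('wor', 'j', 'l'), ('or', 'j', 'l'), ('r', 'j', 'l'), ('', 'j', 'l')]
```

Storing every shortened context means a lookup can back off from a long, rare context to a shorter, more frequent one, which is what searching by the longest matching c relies on.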
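Experiments
✦ Generating test data that contains misspellings
  ・Misspellings are injected with the same method used to generate the training data
  ・A parameter r controls the edit distance and the word length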
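Experiments
✦ Intrinsic tasks
  ・Word similarity: similarity between words, evaluated by correlation with human judgments
  ・Word analogy: Berlin - Germany + France = Paris, evaluated by accuracy
  ・Neighborhood similarity: how close the embedding of a misspelling is to the embedding of the correct word, evaluated by mean reciprocal rank (MRR) and coverage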
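To make the neighborhood similarity metric concrete, the following sketch computes MRR over toy vectors. It assumes cosine similarity as the ranking criterion and ranks the correct word among all vocabulary words; the function names and the two-dimensional vectors are made up for illustration and are not taken from the paper. Coverage is not computed here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_of(target, query_vec, vocab_vecs):
    """1-based rank of `target` when the vocabulary is sorted by similarity to the query."""
    ranked = sorted(vocab_vecs, key=lambda w: cosine(query_vec, vocab_vecs[w]), reverse=True)
    return ranked.index(target) + 1


def mean_reciprocal_rank(pairs, misspelling_vecs, vocab_vecs):
    """Average of 1/rank of the correct word over all (misspelling, correct) pairs."""
    reciprocal_ranks = [
        1.0 / rank_of(correct, misspelling_vecs[m], vocab_vecs) for m, correct in pairs
    ]
    return sum(reciprocal_ranks) / len(reciprocal_ranks)


# Toy embeddings, invented for this example.
vocab_vecs = {"world": [1.0, 0.0], "word": [0.8, 0.6], "hello": [0.0, 1.0]}
misspelling_vecs = {"worjd": [0.9, 0.1], "helo": [0.6, 0.7]}
pairs = [("worjd", "world"), ("helo", "hello")]

mrr = mean_reciprocal_rank(pairs, misspelling_vecs, vocab_vecs)
print(round(mrr, 2))  # 0.75: "world" is ranked 1st for "worjd", "hello" 2nd for "helo"
```

An MRR of 1.0 would mean every misspelling has its correct spelling as the nearest vocabulary neighbor.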
Intrinsic evaluation
✦ Word similarity
Intrinsic evaluation
✦ Word analogy
Intrinsic evaluation
✦ Neighborhood similarity
Experiments
✦ Extrinsic task
  ・POS tagging
  ・Model: BiLSTM-CRF
  ・Word embedding layer: FastText or MOE
Extrinsic evaluation
✦ POS tagging
  ・Performance on misspelled data improves without degrading results on the original data
Extrinsic evaluation
✦ POS tagging
  ・The method works well in the setting where train and test data differ drastically
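Conclusion
✦ Word embeddings that are robust to misspellings
  ・Trained so that the embedding of a misspelled word moves close to the embedding of its correct spelling
✦ Confirmed that the embeddings of misspellings do in fact move close to those of the correct words
✦ Improved performance on a range of tasks that involve misspellings
  ・Word similarity, analogy, POS tagging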