state (words in vocabulary)
Non-accepting state
word → wor / d
Initial state
Read “w”
Read “o”
Read “r”
No transition to “d”, yield “wor”
Back at the initial state
Read “d”
Input ends, yield “d”
2022/12/14 NLPコロキウム (NLP Colloquium)
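The walk above can be sketched in code. This is a minimal illustration of longest-match tokenization over a character trie, not the speaker's implementation. The vocabulary `{"w", "o", "r", "d", "or", "wor"}` and the names `build_trie` and `longest_match_tokenize` are assumptions chosen so that "word" comes out as wor / d, as on the slide.

```python
def build_trie(vocab):
    """Build a character trie; a node is accepting if its path spells a vocabulary word."""
    root = {"children": {}, "accepting": False}
    for word in vocab:
        node = root
        for ch in word:
            node = node["children"].setdefault(
                ch, {"children": {}, "accepting": False}
            )
        node["accepting"] = True
    return root


def longest_match_tokenize(text, vocab):
    """Greedy longest-match tokenization (illustrative sketch)."""
    root = build_trie(vocab)
    tokens = []
    start = 0
    while start < len(text):
        node = root          # initial state
        pos = start
        last_end = None      # end of the last accepted token seen so far
        # Follow transitions while the next character has one.
        while pos < len(text) and text[pos] in node["children"]:
            node = node["children"][text[pos]]
            pos += 1
            if node["accepting"]:
                last_end = pos
        if last_end is None:
            # Assumption: an unmatched character is emitted on its own.
            last_end = start + 1
        tokens.append(text[start:last_end])
        start = last_end     # back to the initial state
    return tokens


# Hypothetical vocabulary: "wo" is a non-accepting state, "word" is absent.
vocab = {"w", "o", "r", "d", "or", "wor"}
print(longest_match_tokenize("word", vocab))  # ['wor', 'd']
```

After reading "w", "o", "r" the walk sits in the accepting state "wor"; there is no transition for "d", so "wor" is yielded and the search restarts from the initial state at "d".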
state (words in vocabulary)
Non-accepting state
Dropped state: randomly removed state
word → w / or / d
Initial state
Read “w”
Read “o”
Read “r”
No transition to “word”, yield the last accepted token “w”
Back at the initial state
Read “o”
Read “r”
Yield “or”
Back at the initial state
Read “d”
Input ends, yield “d”
Randomly drop vocabulary entries during longest-match search
→ Tokenizations can be sampled!
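The dropout walk can be sketched the same way. This is a hedged illustration, not the speaker's implementation: the vocabulary, the names `sample_dropped` and `tokenize_with_dropout`, and the `END` marker are hypothetical. Two further assumptions are labeled in the code: the dropped set is drawn once per tokenization, and single-character tokens are never dropped so that the input can always be covered.

```python
import random

END = ""  # hypothetical marker key stored on accepting trie nodes


def build_trie(vocab):
    """Character trie as nested dicts; accepting nodes carry the END key."""
    root = {}
    for word in vocab:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def sample_dropped(vocab, p, rng):
    """Pick the vocabulary entries to remove, each with probability p.

    Assumption: single-character tokens are kept so tokenization cannot fail.
    """
    return {w for w in sorted(vocab) if len(w) > 1 and rng.random() < p}


def tokenize_with_dropout(text, vocab, dropped):
    """Longest-match search in which dropped states count as non-accepting."""
    root = build_trie(vocab)
    tokens, start = [], 0
    while start < len(text):
        node, pos, last_end = root, start, None
        while pos < len(text) and text[pos] in node:
            node = node[text[pos]]
            pos += 1
            # Accept only if the state is in the vocabulary and not dropped.
            if END in node and text[start:pos] not in dropped:
                last_end = pos
        if last_end is None:
            last_end = start + 1  # assumption: emit an unmatched character alone
        tokens.append(text[start:last_end])
        start = last_end
    return tokens


vocab = {"w", "o", "r", "d", "or", "wor"}  # hypothetical vocabulary

# Reproduce the slide: with "wor" dropped, the last accepted token is "w".
print(tokenize_with_dropout("word", vocab, {"wor"}))  # ['w', 'or', 'd']

# Sampling: each draw of the dropped set may give a different tokenization.
rng = random.Random(0)
for _ in range(3):
    print(tokenize_with_dropout("word", vocab, sample_dropped(vocab, 0.5, rng)))
```

With an empty dropped set the function reduces to plain longest-match search and returns wor / d; every sampled tokenization still concatenates back to the original input.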