Slide 13
Tokenize
- The match query analyzes any provided text before performing a search
Query: match (full text query)
Tokenize: n-gram
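As a concrete illustration of the point above, a match query in Elasticsearch-style search DSL might look like the sketch below; the index name `places` and field name `name` are assumptions for illustration, and the query text is analyzed with the field's tokenizer before matching:

```json
GET /places/_search
{
  "query": {
    "match": {
      "name": "tokyo"
    }
  }
}
```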
- The tokenizer breaks a word into every contiguous substring of n characters (its n-grams)
1-gram: abcde → [a, b, c, d, e]
2-gram: abcde → [ab, bc, cd, de]
3-gram: abcde → [abc, bcd, cde]
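The n-gram examples above can be sketched in a few lines of Python (a minimal illustration, not the tokenizer a search engine actually ships):

```python
def ngrams(text, n):
    """Return every contiguous n-character substring of text, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

print(ngrams("abcde", 1))  # ['a', 'b', 'c', 'd', 'e']
print(ngrams("abcde", 2))  # ['ab', 'bc', 'cd', 'de']
print(ngrams("abcde", 3))  # ['abc', 'bcd', 'cde']
```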
Query: tokyo
Analyze (Tokenize: 1-gram) → [t, o, k, y, o]
Match each token against the inverted index:
  t → ID 1, 2, 4
  o → ID 1, 2, 3
  k → ID 1, 2
  y → ID 1, 2
Inverted index
Documents:
  ID 1 (tokyo) → [t, o, k, y, o]
  ID 2 (kyoto) → [k, y, o, t, o]
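The whole flow on this slide can be sketched in Python: build an inverted index from 1-gram tokens to document IDs, analyze the query with the same tokenizer, and intersect the posting lists. This is a minimal sketch with made-up helper names, not Elasticsearch's implementation (and it uses AND semantics, whereas a real match query defaults to OR across tokens):

```python
def ngrams(text, n):
    """Return every contiguous n-character substring of text, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def build_index(docs, n=1):
    """Map each n-gram token to the set of document IDs that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for token in ngrams(text, n):
            index.setdefault(token, set()).add(doc_id)
    return index

def match(index, query, n=1):
    """Analyze the query with the same tokenizer, then intersect posting lists."""
    posting_lists = [index.get(token, set()) for token in ngrams(query, n)]
    return set.intersection(*posting_lists) if posting_lists else set()

docs = {1: "tokyo", 2: "kyoto"}
index = build_index(docs)
# Both documents contain the tokens t, o, k, y, so both match the query.
print(sorted(match(index, "tokyo")))  # [1, 2]
```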