Slide 14
Storage mechanism - Inverted Index
An inverted index is an index data structure that stores a mapping
from content, such as words or numbers, to its locations in a
document or a set of documents.
D1 : "This is a dog"
D2 : "This is a cat"
D3 : "Dog eats cat"
"this" => {D1, D2}
"is" => {D1, D2}
"a" => {D1, D2}
"dog" => {D1, D3}
"cat" => {D2, D3}
"eats" => {D3}
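The mapping above can be reproduced with a short sketch. It assumes the simplest possible tokenizer (lowercase, then split on whitespace); the names `tokenize` and `build_inverted_index` are illustrative and do not come from any particular search engine.

```python
from collections import defaultdict


def tokenize(text):
    # Assumed tokenizer: lowercase the text and split on whitespace.
    return text.lower().split()


def build_inverted_index(docs):
    # Map each token to the set of document IDs that contain it.
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return dict(index)


docs = {
    "D1": "This is a dog",
    "D2": "This is a cat",
    "D3": "Dog eats cat",
}

index = build_inverted_index(docs)
for term, doc_ids in index.items():
    print(f'"{term}" => {sorted(doc_ids)}')
```

Lowercasing is what lets "Dog" in D3 and "dog" in D1 end up under the same key. Real engines usually apply further analysis steps (stemming, stop-word removal), which this sketch leaves out.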
Suppose we need to find the documents that contain every term of the query:
this dog
"this" {D1, D2} ∩ "dog" {D1, D3} = {D1}
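The lookup can be sketched as a set intersection over the posting sets of the query terms. The index is written out by hand here so the example stands alone, and `search` is a hypothetical helper name. It assumes AND semantics, meaning every term must be present in a matching document.

```python
# Inverted index from the slide, written out as term -> set of document IDs.
index = {
    "this": {"D1", "D2"},
    "is": {"D1", "D2"},
    "a": {"D1", "D2"},
    "dog": {"D1", "D3"},
    "cat": {"D2", "D3"},
    "eats": {"D3"},
}


def search(index, query):
    # Tokenize the query the same way the documents were tokenized (assumed).
    terms = query.lower().split()
    if not terms:
        return set()
    # A term that is missing from the index contributes an empty set.
    postings = [index.get(term, set()) for term in terms]
    # Start from the smallest set so the intermediate result stays small.
    postings.sort(key=len)
    result = set(postings[0])
    for doc_ids in postings[1:]:
        result &= doc_ids
    return result


print(search(index, "this dog"))  # {'D1'}
```

A query containing a term that is not in the index, such as "this bird", returns an empty set, because the intersection with an empty posting set is empty.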
[Diagram: Documents -> Tokenize -> Inverted Index]