put in an inverted index

doc1 = “no limit, no boundaries”
doc2 = “no limit made music”

$q=boundaries => doc1
$q=limit => doc1, doc2

term          doc IDs
no            1,1  2
limit         1  2
boundaries    1
made          2
music         2
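
A minimal sketch of the index above, in Python. The names `build_index` and `search` are hypothetical, and splitting terms with the regular expression `\w+` plus lowercasing is an assumption made for illustration, not something the slide specifies. Each occurrence of a term appends its doc ID, which is why "no" lists doc 1 twice.

```python
import re
from collections import defaultdict

def build_index(docs):
    """Map each term to the doc IDs it occurs in, one entry per occurrence."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        # Assumption: terms are runs of word characters, compared in lowercase.
        for term in re.findall(r"\w+", text.lower()):
            index[term].append(doc_id)
    return index

def search(index, q):
    """Return the sorted, de-duplicated doc IDs that contain the term q."""
    return sorted(set(index.get(q.lower(), [])))

docs = {
    1: "no limit, no boundaries",
    2: "no limit made music",
}
index = build_index(docs)

print(index["no"])                   # [1, 1, 2]
print(search(index, "boundaries"))   # [1]
print(search(index, "limit"))        # [1, 2]
```

A lookup is a single dictionary access on the term, so answering a query does not require scanning the documents themselves.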
a stream of terms or tokens. A simple tokenizer might split the string up into terms wherever it encounters whitespace or punctuation.

“The|quick|brown|fox…”
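
A simple tokenizer of this kind could be sketched in Python as follows. The function name `tokenize` is hypothetical, and treating every non-word character (`\W`) as a separator is an assumption. Note that `\W` does not match underscores, so they stay inside terms.

```python
import re

def tokenize(text):
    """Split text into terms at runs of whitespace or punctuation."""
    # re.split can yield empty strings at the edges (e.g. after a
    # trailing period), so those are filtered out.
    return [term for term in re.split(r"\W+", text) if term]

terms = tokenize("The quick, brown fox.")
print("|".join(terms))  # The|quick|brown|fox
```

This sketch only splits the string. It does not lowercase or otherwise normalize the terms.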