Slide 11
CHOOSING K
GOAL: Minimize the amount of difference within each cluster, but not
so far that each cluster shrinks to a single hotel.
K is the number of groups
to classify hotels into. How
do we choose it? Plot the
within-group differences for
K up to at least 30-50 clusters.
Here, K is fine anywhere
between 10 and 23 clusters.
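A minimal sketch of that comparison, assuming k-means clustering and the within-cluster sum of squares as the "within-group difference" (the slide names neither). The toy 2-D points stand in for hotel vectors, and the helper names `kmeans` and `within_cluster_ss` are illustrative only.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns the k cluster centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            groups[j].append(p)
        # Update step: each center moves to the mean of its group.
        # An empty group keeps its previous center.
        new_centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers

def within_cluster_ss(points, centers):
    """Total squared distance from each point to its nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

# Toy data (an assumption): three separated blobs of 30 points each.
rng = random.Random(42)
blob_centers = [(0, 0), (10, 10), (20, 0)]
points = [
    [cx + rng.gauss(0, 1), cy + rng.gauss(0, 1)]
    for cx, cy in blob_centers
    for _ in range(30)
]

# Within-cluster differences for K = 1..10; plot these against K and
# look for the range where the curve flattens.
wcss = {k: within_cluster_ss(points, kmeans(points, k)) for k in range(1, 11)}
for k, value in wcss.items():
    print(k, round(value, 1))
```

With real hotel data the same loop would run over a wider range of K, as the slide suggests, and the flat stretch of the curve marks the acceptable values.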
We can assign new hotels
to a cluster by feeding their
reviews into this matrix.
The matrix can learn new
words from new reviews
and re-classify hotels based
on updates.
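A rough sketch of the assignment step, assuming the matrix is a hotel-by-word count matrix and each cluster is summarized by a centroid over the same word columns. The vocabulary, the centroid values, and the function names below are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and keep alphabetic words only."""
    return re.findall(r"[a-z]+", text.lower())

def vectorize(text, vocab):
    """Count how often each vocabulary word appears in the review text."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocab]

def nearest_cluster(vec, centroids):
    """Index of the centroid closest to vec (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(centroids)), key=lambda i: sq_dist(vec, centroids[i]))

def extend_vocab(vocab, centroids, text):
    """Add unseen words as new columns; existing centroids get zeros there."""
    new_words = [w for w in dict.fromkeys(tokenize(text)) if w not in vocab]
    grown_vocab = vocab + new_words
    grown_centroids = [c + [0.0] * len(new_words) for c in centroids]
    return grown_vocab, grown_centroids

# Hypothetical vocabulary and two hypothetical cluster centroids.
vocab = ["beach", "pool", "quiet", "conference", "wifi"]
centroids = [
    [2.0, 2.0, 1.0, 0.0, 0.0],  # leisure-style cluster
    [0.0, 0.0, 0.0, 2.0, 2.0],  # business-style cluster
]

# Assign a new hotel from its review text.
review = "Great beach and a lovely pool, fast wifi"
cluster = nearest_cluster(vectorize(review, vocab), centroids)

# A later review introduces new words; grow the columns, then assign again.
review2 = "Quiet spa near the beach"
vocab2, centroids2 = extend_vocab(vocab, centroids, review2)
cluster2 = nearest_cluster(vectorize(review2, vocab2), centroids2)
print(cluster, cluster2, vocab2)
```

Padding centroids with zeros only keeps the columns aligned. Re-classifying hotels after the vocabulary grows would mean re-running the clustering on the updated matrix.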