We are addressing anonymous, disengaged readers
• At a large scale in challenging environments
• In 40 different languages
• We need the solution to be universal
  ◦ Work with little information about the user
  ◦ Work fast (scale and kick start)
  ◦ Work based on language independent techniques
as in a commercial organisation. We need to start now and start simple. But where?
• Content to content
  ◦ Diversity: Low
  ◦ Novelty: High
  ◦ Surprisal: High
• User profile to content
  ◦ Diversity: Low
  ◦ Novelty: Low
  ◦ Surprisal: High
• Based on mass user (most viewed)
  ◦ Diversity: Low
  ◦ Novelty: Low
  ◦ Surprisal: Low
• User to user
  ◦ Diversity: High
  ◦ Novelty: High
  ◦ Surprisal: High
Allocation?
Documents are modelled as a mixture of topics. It’s basically like doing a soft clustering of the articles (each document can exhibit several topics at once) and projecting them into a smaller topic space.
Topic 1
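A minimal sketch of the "mixture of topics" idea, assuming the model in question is Latent Dirichlet Allocation (LDA). It runs the generative story forwards: each document draws its own topic mixture, then each word is drawn from one of the topics according to that mixture. The two topics, their words, and the `alpha` values are hypothetical and exist only for illustration; a real system would fit the model to articles, which means inferring the mixtures and topics from the observed words.

```python
import random

# Hypothetical topics: each topic is a probability distribution over words.
TOPICS = [
    {"election": 0.5, "minister": 0.3, "vote": 0.2},   # illustrative "politics" topic
    {"match": 0.5, "goal": 0.3, "league": 0.2},        # illustrative "sport" topic
]


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalising independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(topics, alpha, length, rng):
    """Generate one document as a mixture of topics.

    Returns the document's topic mixture (one weight per topic, summing to 1)
    and the list of generated words.
    """
    theta = sample_dirichlet(alpha, rng)  # soft assignment: a weight for every topic
    words = []
    for _ in range(length):
        # Pick a topic for this word according to the document's mixture ...
        k = rng.choices(range(len(topics)), weights=theta)[0]
        # ... then pick a word according to that topic's word distribution.
        vocab, probs = zip(*topics[k].items())
        words.append(rng.choices(vocab, weights=probs)[0])
    return theta, words


if __name__ == "__main__":
    rng = random.Random(0)
    theta, words = generate_document(TOPICS, alpha=[0.5, 0.5], length=12, rng=rng)
    print("topic mixture:", [round(t, 2) for t in theta])
    print("words:", " ".join(words))
```

The mixture `theta` is the document's position in the smaller topic space: it has one entry per topic rather than one per vocabulary word, and because every topic gets a weight, a document is not forced into a single cluster.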