
Diversity from Unsupervised Mapping of Learned Features

As an Insight Data Science Fellow, I recently worked with Whirlscape, Inc. to increase the diversity of results displayed by their emoji recommender app, Dango (https://play.google.com/store/apps/details?id=co.dango.emoji.gif&hl=en).

Dango suggests emoji in real-time based on what a user is writing. The recommender is informed by a recurrent neural network that has been trained on millions of tweets containing text and emoji (check out http://karpathy.github.io/2015/05/21/rnn-effectiveness/ for a brief introduction to RNNs). When given a new text input, the trained model uses a softmax classifier to assign a probability to each emoji based on the text, and only the most probable (highest ranked) emoji are displayed to the user by the app.
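As a rough sketch of that final step, here is how a softmax layer turns final-layer scores into per-emoji probabilities, and how a top-k cut selects the emoji to display. The logit values and `k` below are hypothetical, not taken from Dango:

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over final-layer emoji scores."""
    z = logits - logits.max()  # shift for numerical stability
    p = np.exp(z)
    return p / p.sum()

# hypothetical final-layer scores for a tiny emoji vocabulary
logits = np.array([2.0, 1.0, 0.1, -1.0])
probs = softmax(logits)  # probabilities summing to 1

# display only the k most probable emoji, as the app does
k = 2
top_k = np.argsort(probs)[::-1][:k]  # indices of the top-k emoji
```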

It's a great app, and the performance of the RNN is very impressive, but sometimes the most probable emoji for a given text input are very similar to one another, and so the menu displayed to the user is too redundant (e.g., all hearts, all clocks, or all smiley faces). To address this problem, I sought to establish a useful measure of similarity between recommended items (emoji), and then use this similarity measure to improve the diversity of items that are displayed to the user.

Importantly, in the high-dimensional feature space of the trained RNN's final dense layer, similar emoji are located near one another. I observed this by visualizing 2-dimensional t-SNE renderings of the final dense layer feature space. The t-SNE algorithm estimates pairwise joint probabilities between all points in a dataset using Gaussian kernel functions centered on each point. To quantify the degree of similarity between all pairs of emoji, I used this same approach to estimate their joint probabilities, and used the negative log of the joint probability as a convenient and effective similarity score (similarity S=0 for joint probability P=1; S=1 for P=0.1; S=2 for P=0.01; and so on).
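A minimal sketch of this scoring scheme, assuming a single fixed Gaussian bandwidth `sigma` for all points (full t-SNE instead tunes a per-point bandwidth via its perplexity parameter) and a base-10 log to match the S=1 for P=0.1 scale above:

```python
import numpy as np

def similarity_matrix(features, sigma=1.0):
    """Pairwise similarity scores S = -log10(p_ij), where p_ij are
    Gaussian-kernel joint probabilities over feature vectors
    (a simplified, fixed-bandwidth version of the t-SNE affinities).

    features: (n_items, n_features) array of final-layer features.
    """
    # squared Euclidean distances between all pairs of items
    sq_norms = (features ** 2).sum(axis=1)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * features @ features.T
    np.fill_diagonal(d2, np.inf)  # exclude self-pairs from the kernel

    # Gaussian affinities, normalized so the joint probabilities sum to 1
    aff = np.exp(-d2 / (2.0 * sigma ** 2))
    p = aff / aff.sum()

    np.fill_diagonal(p, 1.0)   # define self-similarity score as 0
    return -np.log10(p)        # S = 0 for p = 1, S = 1 for p = 0.1, ...
```

Low scores mean a pair is very similar; the matrix is symmetric because the underlying distances are.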

I next used this similarity score to modify the original ranking by demoting any emoji that was too similar to those preceding it in the original ranking. This reranking reduces the model's accuracy in the sense that it is no longer optimized to predict the most probable emoji for a given user input. However, increasing the diversity of the displayed items can improve user experience by providing the user with a more interesting set of options to choose from.
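The reranking step can be sketched as a greedy pass over the model's ranking, using the threshold T and window M mentioned in the slides below; the parameter values here are illustrative assumptions, not Dango's actual settings:

```python
def rerank(ranked_ids, S, threshold=1.0, window=3):
    """Greedy diversification: walk the model's ranking (best first) and
    demote any item whose similarity score to the `window` most recently
    kept items falls at or below `threshold` (low S = very similar).

    ranked_ids: item indices sorted by model probability, best first
    S: pairwise similarity-score matrix (S = -log of joint probability)
    """
    kept, demoted = [], []
    for item in ranked_ids:
        recent = kept[-window:]  # the M most recently kept items
        if all(S[item, prev] > threshold for prev in recent):
            kept.append(item)
        else:
            demoted.append(item)  # push too-similar items to the end
    return kept + demoted
```

A demoted item is not discarded, only moved after the diverse set, so nothing the model ranked highly is ever lost outright.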

The modified ranking will ultimately need to be validated with A/B tests. It will be particularly important to measure how frequently users presented with the "reranked" menu select emoji that would not have been displayed in the original menu, and how frequently users presented with the original menu select emoji that would not have been displayed in the "reranked" menu.

The approach described here, for measuring similarity between items in a recommender system's inventory, should generalize to arbitrarily large, unlabeled inventories, provided that a suitable model (such as a neural network) has already been trained to predict items in the inventory. This circumvents the need for additional supervised learning or specialized knowledge to differentiate between similar vs. diverse sets of items and may prove useful for diversification of other recommender systems.

elliottmerriam

February 09, 2016

Transcript

  1. Adding variety to a menu or, "How to make a model less accurate on purpose" (Elliott Merriam, Insight Data Science)
  2. Problem: Emoji sometimes too similar. "I love . . ." Dango, by Whirlscape, Inc. Emoji Recommender
  3. Existing App: Millions of tweets; RNN assigns P(emoji) to text input; Suggested emoji displayed. Project: Generate a useful measure of similarity; Compute emoji similarity; Rerank emoji to be more diverse
  4. [t-SNE plot of emoji in the final dense layer of the trained RNN: similar emoji are neighbors; 998 emoji, 768 features]
  5. Similarity score based on joint probabilities: S = -log(p_i,j) [matrix of pairwise emoji similarity scores]
  6. Impact: Company will A/B test and will use if UX is enhanced. Reranked results less accurate but more diverse. Neighboring emoji can't be too similar: similarity S must be > threshold T for the jth emoji and the previous M emoji. [reranked "I love . . ." example]
  7. About me: Postdoc (University of Washington); Certificate in Data Science (University of Washington); Ph.D. in Neuroscience (University of Wisconsin); Registered Patent Agent
  8. Original vs. reranked emoji for sample tweet inputs: "didn't help relax me too much but i'm oiled up smelling like coconuts"; "who wants to hangout next weekend lol"; "looks like i'm relying on study halls to finish homework again"; "mom is making me eat dinner before having pancakes"; "so looks like i can go to the tour after all! oh the things i do for boybands"; "so bloated that i have no choice but to wear sweats, leggings, or yoga pants"