"…you are interested in will be a cartoon model, and you will see a cartoon version of the world through the agent's eyes. It is therefore a self-reinforcing model. This will recreate the lowest-common-denominator approach to content that plagues TV."
- Jaron Lanier (1995)
• … one dimension
• A spreadsheet: two dimensions
• Add another axis and we have 3-D (Euclidean) space
• Defining a space: distance metric, inner product
• Moving between spaces with simple operations, e.g. matrix multiplication (see the sketch below)
• Input spaces, weight space, loss surfaces
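A minimal sketch of "moving between spaces", using the gonum/mat package (not part of the talk's own code; the matrix and vector values are purely illustrative): a point in a 3-D input space is projected into a 2-D feature space by a weight matrix, and the inner product then gives a similarity in that new space.

```go
package main

import (
	"fmt"

	"gonum.org/v1/gonum/mat"
)

func main() {
	// A point in a 3-D input space.
	x := mat.NewVecDense(3, []float64{1.0, 0.5, -2.0})

	// A 2x3 weight matrix: one linear map from the 3-D input space
	// into a 2-D feature space.
	w := mat.NewDense(2, 3, []float64{
		0.2, -0.4, 0.1,
		0.7, 0.3, -0.5,
	})

	// Moving between spaces is just a matrix multiplication: f = Wx.
	var f mat.VecDense
	f.MulVec(w, x)
	fmt.Printf("feature-space representation:\n%v\n", mat.Formatted(&f))

	// The inner product defines the geometry of the space,
	// e.g. as a similarity between two feature vectors.
	g := mat.NewVecDense(2, []float64{0.1, 0.9})
	fmt.Println("inner product:", mat.Dot(&f, g))
}
```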
• … an input space
• The X, Y coordinates of a still image, plus the pixel intensity/RGB value at each coordinate, hence a 3-D matrix
• This space is defined in such a way that we can perform operations on it, like multiplication and addition (of the elements in our 3-D matrix)
• These operations induce a common feature representation of the input image, and then a categorical prediction of the label ‘cat’ (see the sketch below)
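A small sketch of what "X, Y coordinates plus an RGB value" looks like as data, using only the Go standard library (the file name cat1.jpg is a placeholder): the image is decoded into a height x width x 3 matrix of floats, ready for the multiplications and additions described above.

```go
package main

import (
	"fmt"
	"image"
	_ "image/jpeg" // register the JPEG decoder
	"log"
	"os"
)

// imageToTensor flattens an image into a [height][width][3] array of
// RGB values, i.e. the 3-D matrix the slide describes.
func imageToTensor(img image.Image) [][][3]float64 {
	b := img.Bounds()
	t := make([][][3]float64, b.Dy())
	for y := range t {
		t[y] = make([][3]float64, b.Dx())
		for x := range t[y] {
			r, g, bl, _ := img.At(b.Min.X+x, b.Min.Y+y).RGBA()
			// RGBA returns 16-bit channels; scale to [0, 1].
			t[y][x] = [3]float64{float64(r) / 65535, float64(g) / 65535, float64(bl) / 65535}
		}
	}
	return t
}

func main() {
	f, err := os.Open("cat1.jpg") // hypothetical input file
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	img, _, err := image.Decode(f)
	if err != nil {
		log.Fatal(err)
	}
	t := imageToTensor(img)
	fmt.Printf("tensor shape: %d x %d x 3\n", len(t), len(t[0]))
}
```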
• cat1.jpg, cat2.jpg, cat3.jpg: they embed in the space of animals*.jpg
• While we may not be able to predict the label ‘cat’, we can identify ‘clusters’ of features common to cat*.jpg, as distinct from dog*.jpg and goat*.jpg (see the sketch below)
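One way to see those clusters without a ‘cat’ label: compare embeddings directly with cosine similarity. The 4-D embedding values below are made up for illustration; real ones would come from a trained model.

```go
package main

import (
	"fmt"
	"math"
)

// cosine returns the cosine similarity of two embedding vectors.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

func main() {
	// Hypothetical embeddings in the space of animals*.jpg.
	embeddings := map[string][]float64{
		"cat1.jpg": {0.9, 0.1, 0.8, 0.0},
		"cat2.jpg": {0.8, 0.2, 0.9, 0.1},
		"dog1.jpg": {0.1, 0.9, 0.2, 0.7},
	}

	// Cats sit close together; the dog sits further away.
	fmt.Println("cat1 vs cat2:", cosine(embeddings["cat1.jpg"], embeddings["cat2.jpg"]))
	fmt.Println("cat1 vs dog1:", cosine(embeddings["cat1.jpg"], embeddings["dog1.jpg"]))
}
```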
• … cat.jpg to ‘cat’, or identifying cat-ness in the space of animals*.jpg, we can instead think of a cat as a self-directed agent, exploring a world in search of tasty mice
• The cat finds a mouse: great success
• The cat learns that that corner of the basement is a good place to look for mice
• Repeat until all roads lead to mice (see the sketch below)
• Markov the happy cat →
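A toy value-learning sketch of that loop (a multi-armed-bandit simplification, not the DQN discussed later): the cat tries basement corners, is rewarded when it finds a mouse, and the estimated value of the lucky corner grows until exploitation dominates. The corner count, reward probability, and learning rate are all made-up numbers.

```go
package main

import (
	"fmt"
	"math/rand"
)

func argmax(xs []float64) int {
	best := 0
	for i, x := range xs {
		if x > xs[best] {
			best = i
		}
	}
	return best
}

func main() {
	const (
		nCorners = 4
		alpha    = 0.1 // learning rate
		epsilon  = 0.2 // exploration rate
		episodes = 500
	)

	q := make([]float64, nCorners) // estimated value of searching each corner
	luckyCorner := 2               // hypothetical corner where mice tend to be

	for ep := 0; ep < episodes; ep++ {
		// Epsilon-greedy: mostly exploit the best-known corner, sometimes explore.
		corner := argmax(q)
		if rand.Float64() < epsilon {
			corner = rand.Intn(nCorners)
		}

		// Reward: the lucky corner yields a mouse 80% of the time.
		reward := 0.0
		if corner == luckyCorner && rand.Float64() < 0.8 {
			reward = 1.0
		}

		// One-step value update: no next state in this toy world.
		q[corner] += alpha * (reward - q[corner])
	}

	fmt.Println("learned value of each corner:", q)
	fmt.Println("all roads lead to corner", argmax(q))
}
```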
• … paper – a blank slate!
• Easy training at scale, minimal dependencies/container fun
• Have written a DQN recently for a book
• Want to integrate with the existing ABC Recommendations platform & API
• Curiosity!
• … of data science applications
• ‘lib/pq’ just works (see the sketch below)
• Integrate the query into the pipeline, enabling:
  • Offline training on ~6 months of data from 10k sampled users
  • Online training on recent content interactions
• Data pipeline is 71% of total LOC (pls no)
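A minimal sketch of the "‘lib/pq’ just works" point: pulling the offline training window straight out of Postgres with database/sql. The connection string, table names (interactions, sampled_users) and columns are assumptions for illustration, not the real ABC schema.

```go
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/lib/pq" // registers the "postgres" driver
)

// Interaction is one user/content event used as a training example.
type Interaction struct {
	UserID    string
	ContentID string
	Clicked   bool
}

func main() {
	// Hypothetical DSN.
	db, err := sql.Open("postgres", "postgres://user:pass@localhost/recs?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Pull ~6 months of interactions for the sampled user cohort.
	rows, err := db.Query(`
		SELECT user_id, content_id, clicked
		FROM interactions
		WHERE event_time > now() - interval '6 months'
		  AND user_id IN (SELECT user_id FROM sampled_users)`)
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()

	var batch []Interaction
	for rows.Next() {
		var i Interaction
		if err := rows.Scan(&i.UserID, &i.ContentID, &i.Clicked); err != nil {
			log.Fatal(err)
		}
		batch = append(batch, i)
	}
	if err := rows.Err(); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("loaded %d interactions for offline training\n", len(batch))
}
```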
• … metric!
• Measuring the cosine similarity of each recommended item
• A single dial for controlling the tradeoff between novelty and relevance (see the sketch below)
• Simulation! The original DRN system was run in a live environment
• We don't have that luxury, so we are developing a simulator based on real data
• This ensures that we only present our audience with a finished system
• We then have high confidence in the quality of the recommendations
• Keeping in line with editorial standards
• ‘Oh these recs are terrible, guess the system is still learning’ = NOPE
• Must retain audience trust!
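One way the "single dial" could look (a sketch under assumptions, not the production scoring function): blend a model relevance score with a novelty term derived from cosine similarity against the items a user has recently seen, controlled by one parameter lambda. The embeddings, relevance scores, and lambda values are illustrative.

```go
package main

import (
	"fmt"
	"math"
)

// cosine similarity of two content embedding vectors.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// score blends relevance and novelty with a single dial lambda in [0, 1]:
// lambda = 1 is pure relevance, lambda = 0 is pure novelty. Novelty is
// 1 minus the candidate's highest cosine similarity to recently seen items.
func score(candidate []float64, relevance float64, seen [][]float64, lambda float64) float64 {
	maxSim := 0.0
	for _, s := range seen {
		if sim := cosine(candidate, s); sim > maxSim {
			maxSim = sim
		}
	}
	novelty := 1 - maxSim
	return lambda*relevance + (1-lambda)*novelty
}

func main() {
	seen := [][]float64{{0.9, 0.1, 0.0}, {0.8, 0.2, 0.1}} // recently read items
	similarItem := []float64{0.85, 0.15, 0.05}
	novelItem := []float64{0.1, 0.2, 0.9}

	for _, lambda := range []float64{0.9, 0.5, 0.1} {
		fmt.Printf("lambda=%.1f  similar=%.3f  novel=%.3f\n",
			lambda,
			score(similarItem, 0.8, seen, lambda),
			score(novelItem, 0.6, seen, lambda))
	}
}
```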
• … to avoid feedback loops
• Successful searches that indicate novel content preferences
• Adding textual features to our News vectors
• Performance metrics and evaluation against the current recommendations system
• More generally, a greater push for a personalized experience in 2020 and beyond!
• The new MD has a mandate to “engage new audiences online and creating a personalised and connected online network”