Slide 1

Slide 1 text

Building a MovieLens Recommender System. Jill Cates, November 16th 2020

Slide 2

Slide 2 text

• Amazon: Customers who bought this item also bought…
• Online Shopping: Similar items based on your browsing history…
• Medium: Based on your reading history…
• Netflix: Because you watched Narcos…
• OkCupid: “Finding your best match”
• Linkedin: Jobs recommended for you…

Slide 3

Slide 3 text

Brick-and-mortar: limited inventory, mainstream products. E-commerce: unlimited inventory, niche products.

Slide 4

Slide 4 text

Brick-and-mortar: limited inventory, mainstream products. E-commerce: unlimited inventory, niche products.

Slide 5

Slide 5 text

“A physical store cannot be reconfigured on the fly to cater to each customer based on his or her particular interests.” - Chris Anderson

Slide 6

Slide 6 text

The Tasting Booth Experiment: 6 jam samples vs. 24 jam samples

Slide 7

Slide 7 text

The Tasting Booth Experiment, Initial Interest: 40% of customers stopped at the limited-choice booth (6 jam samples) vs. 60% of customers stopped at the extensive-choice booth (24 jam samples)

Slide 8

Slide 8 text

The Tasting Booth Experiment, Subsequent Purchase: 30% conversion rate (6 jam samples) vs. 3% conversion rate (24 jam samples)

Slide 9

Slide 9 text

The Tasting Booth Experiment. The Paradox of Choice: less is more. Too much choice = stressful.

Slide 10

Slide 10 text

What is a Recommender System? An application of machine learning. Data → Machine Learning Model → Predictions

Slide 11

Slide 11 text

What is a Recommender System? An application of machine learning. User preferences → Recommender System → Recommendations

Slide 12

Slide 12 text

What is a Recommender System? An application of machine learning. User preferences (explicit feedback, implicit feedback) → Recommender System → Recommendations (predicting future behaviour)

Slide 13

Slide 13 text

What is a Recommender System? An application of machine learning. User preferences (explicit feedback, implicit feedback) → Recommender System → Recommendations (predicting future behaviour)
Two approaches: collaborative filtering and content-based filtering.
[Diagram: users John, Jim, Anne, Liz and Erica linked to items]

Slide 14

Slide 14 text

Collaborative Filtering: “Similar people like similar things”
User-item (“utility”) matrix: rows are users (Arnold, Peter, Susan, Valerie, Jean, Walter, Charlie), columns are movies, and each filled cell holds a rating from 1 to 5. Cells for movies a user has not rated are left empty.
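
The idea that similar people like similar things can be sketched in a few lines of plain Python. This is a minimal illustration, not the tutorial's own code: the ratings below are hypothetical (the cell positions of the slide's matrix are not recoverable from the text), the movie labels "A" to "D" are placeholders, and cosine similarity with a similarity-weighted average is just one common choice.

```python
from math import sqrt

# Hypothetical ratings (user -> {movie: rating}); NOT the values from the slide.
ratings = {
    "Arnold": {"A": 5, "B": 4, "C": 1},
    "Peter": {"A": 4, "B": 5, "D": 2},
    "Susan": {"A": 1, "C": 5, "D": 4},
}


def cosine_similarity(u, v):
    """Cosine similarity between two users; unrated movies count as 0."""
    shared = set(u) & set(v)
    dot = sum(u[m] * v[m] for m in shared)
    norm_u = sqrt(sum(r * r for r in u.values()))
    norm_v = sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def predict(user, movie):
    """Similarity-weighted average of other users' ratings for `movie`.

    Returns None when nobody else has rated the movie.
    """
    num = den = 0.0
    for other, their_ratings in ratings.items():
        if other == user or movie not in their_ratings:
            continue
        sim = cosine_similarity(ratings[user], their_ratings)
        num += sim * their_ratings[movie]
        den += sim
    return num / den if den else None
```

Here Arnold's tastes are closer to Peter's than to Susan's, so `predict("Arnold", "D")` lands nearer Peter's rating of 2 than Susan's rating of 4.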

Slide 18

Slide 18 text

Cold Start Problem: “A system cannot draw any inferences for users or items about which it has not yet gathered sufficient information.”
[User-item matrix (users × movies) with ratings from 1 to 5; cells with no information are marked “?”]
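
One way to make the cold start problem concrete is to flag the users and movies that have too few ratings for collaborative filtering to say anything about them. The sketch below assumes ratings arrive as `(user, movie, rating)` tuples; the function name and the `min_ratings` threshold are illustrative choices, not part of the slides.

```python
from collections import Counter


def cold_start_entities(ratings, known_users, known_movies, min_ratings=1):
    """Return (users, movies) that have fewer than `min_ratings` ratings.

    `ratings` is an iterable of (user, movie, rating) tuples.
    """
    ratings = list(ratings)
    user_counts = Counter(user for user, _, _ in ratings)
    movie_counts = Counter(movie for _, movie, _ in ratings)
    cold_users = sorted(u for u in known_users if user_counts[u] < min_ratings)
    cold_movies = sorted(m for m in known_movies if movie_counts[m] < min_ratings)
    return cold_users, cold_movies


# Hypothetical example: Charlie has never rated anything, movie "C" has no ratings.
example = [("Arnold", "A", 5), ("Peter", "A", 4), ("Peter", "B", 2)]
cold_users, cold_movies = cold_start_entities(
    example, ["Arnold", "Peter", "Charlie"], ["A", "B", "C"]
)
```

For these cold entries a system has to fall back on something other than past ratings, such as the user and item features used in content-based filtering.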

Slide 19

Slide 19 text

Content-Based Filtering: User and Item Features
(Row and column alignment below is reconstructed from the slide's reading order.)

User features:
User     age  gender  country  language  horror?  family?  comedy?
Arnold   45   M       CA       EN        Y        N        Y
Peter    32   M       US       EN        Y        Y        Y
Susan    20   M       US       EN        N        Y        N
Valerie  59   F       US       FR        Y        N        Y
Jean     47   M       FR       EN        N        Y        Y
Walter   17   F       CA       EN        Y        N        Y
Charlie  36   F       CA       CA        N        N        N

Movie features (seven movies; titles are not captured in the slide text):
year of release  language  horror  family  comedy  drama  thriller
96               EN        N       N       N       Y      N
97               EN        N       Y       N       Y      N
07               EN        N       Y       N       Y      N
10               EN        N       Y       Y       N      N
19               EN        N       Y       Y       N      N
16               EN        N       N       Y       Y      N
03               EN        Y       N       N       N      N
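
Content-based filtering compares items through their features rather than through ratings. A minimal sketch, assuming each movie is described by binary genre flags like those on the slide: the movie names and flag values here are hypothetical, and cosine similarity is one common choice of distance, not necessarily the one used in the tutorial.

```python
from math import sqrt

GENRES = ["horror", "family", "comedy", "drama", "thriller"]

# Hypothetical movies, one 0/1 flag per genre in GENRES order.
movies = {
    "movie_a": [0, 1, 1, 0, 0],
    "movie_b": [0, 1, 0, 1, 0],
    "movie_c": [1, 0, 0, 0, 1],
    "movie_d": [0, 1, 1, 0, 0],
}


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(title, k=2):
    """Return the k movies whose genre flags are closest to `title`."""
    target = movies[title]
    scored = [
        (other, cosine(target, features))
        for other, features in movies.items()
        if other != title
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

Because this only needs item features, it can recommend a brand-new movie that nobody has rated yet.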

Slide 20

Slide 20 text

Tutorial

Slide 21

Slide 21 text

Environment Set-up
Option 1: Run notebook locally
• Need to install Jupyter Notebook
Option 2: Run notebook in the cloud
• Google Colab is a Jupyter notebook environment that runs in the cloud
• Minimal set-up required (need a Gmail account)
• Supports free GPU
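
If the same notebook should work under both options, a small check can tell which environment it is running in. This is a sketch based on the common idiom of looking for the `google.colab` module, which Colab loads into its kernel; the helper name is an assumption, not something from the slides.

```python
import sys


def running_in_colab():
    """Return True when the notebook kernel has loaded the google.colab module."""
    return "google.colab" in sys.modules


# Choose environment-specific behaviour, e.g. where to look for data files.
environment = "Google Colab" if running_in_colab() else "local Jupyter"
```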

Slide 22

Slide 22 text

MovieLens Dataset

Slide 23

Slide 23 text

MovieLens Dataset
• Created by GroupLens research group at the University of Minnesota
• The “Titanic dataset” of recommenders
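
A first look at the data usually starts with a few summary counts. The sketch below assumes the `ratings.csv` layout of the small MovieLens release (columns `userId`, `movieId`, `rating`, `timestamp`); the slides do not say which release the tutorial uses, and the function name and sample rows are illustrative only.

```python
import csv
import io


def summarize_ratings(csv_file):
    """Count ratings, users and movies in a MovieLens-style ratings file."""
    users, movies = set(), set()
    total, n = 0.0, 0
    for row in csv.DictReader(csv_file):
        users.add(row["userId"])
        movies.add(row["movieId"])
        total += float(row["rating"])
        n += 1
    return {
        "n_ratings": n,
        "n_users": len(users),
        "n_movies": len(movies),
        "mean_rating": total / n if n else None,
    }


# Tiny in-memory sample so the sketch runs without downloading anything.
sample = io.StringIO(
    "userId,movieId,rating,timestamp\n"
    "1,1,4.0,964982703\n"
    "1,3,4.0,964981247\n"
    "2,1,3.0,1445714835\n"
)
summary = summarize_ratings(sample)
```

With a downloaded copy of the dataset, the same function can be called on an opened file, for example `with open("ratings.csv", newline="") as f: summarize_ratings(f)`, where the path is a placeholder.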