Slide 1

Slide 1 text

Metro’s Newsfeed Algorithm
How Metro built an algorithmically driven homepage

Slide 2

Slide 2 text

Who am I?
• David Jensen
• Head of Development
• Metro.co.uk
• WordPress VIP
  » since Dec 2012

Slide 3

Slide 3 text

Why algorithms?
• Metro is a very lean operation
• 6 developers
• 20 content producers
• 24/7 mindset
• Constant experimentation
• Trending -> Timeline -> Newsfeed

Slide 4

Slide 4 text

Started as a dissertation project
Collating data from:
• Facebook
  » Shares
  » Likes
  » Comments
• Twitter
• Omniture
• WordPress

Slide 5

Slide 5 text

Initial calculations
Views + ((Tweets + Facebook Interactions) * 50) = Score
• Calculated every 30 minutes
• Rate of change = constantly changing
Current Score - Previous Score = Trending
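The two formulas on this slide can be sketched in a few lines of Python. This is a minimal illustration only: the `Snapshot` type, the function names, and the sample numbers are assumptions made for the example, not Metro's actual code or data.

```python
from dataclasses import dataclass

SOCIAL_WEIGHT = 50  # multiplier taken from the slide's formula


@dataclass
class Snapshot:
    """Counts for one story at one 30-minute calculation (hypothetical type)."""
    views: int
    tweets: int
    facebook_interactions: int  # shares + likes + comments


def score(s: Snapshot) -> int:
    # Views + ((Tweets + Facebook Interactions) * 50) = Score
    return s.views + (s.tweets + s.facebook_interactions) * SOCIAL_WEIGHT


def trending(current: Snapshot, previous: Snapshot) -> int:
    # Current Score - Previous Score = Trending
    return score(current) - score(previous)


# Made-up sample values for two consecutive calculations.
previous = Snapshot(views=1000, tweets=4, facebook_interactions=6)
current = Snapshot(views=1800, tweets=10, facebook_interactions=20)

print(score(previous))              # 1500
print(score(current))               # 3300
print(trending(current, previous))  # 1800
```

Because the trending value is a difference between two consecutive scores, a story that stops gaining views and interactions drops towards zero even if its total score stays high.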

Slide 6

Slide 6 text

Trending Designs

Slide 7

Slide 7 text

Trending Stats

Slide 8

Slide 8 text

Timeline “A stream of news”
• Removed swipe
• Trialled at the bottom of the homepage
• Then rolled out to the bottom of every page
• Native Content
• Native Display Units

Slide 9

Slide 9 text

Timeline Design
• Time-based stream
• Picture size based on current popularity
• Native content clearly marked
• Native display units for CPA- and CPM-based advertising
• Consistency across all platforms

Slide 10

Slide 10 text

Timeline Statistics
• Scrolls
• Clicks

Slide 11

Slide 11 text

Newsfeed “Algorithmic stream of news”
(Views + (Social Interactions * 10)) * Time Since Publish Coefficient
• Highest clicks on the top of every stream
• Popular content at the top of each stream
• Gives fresh content a chance
• Penalises popular content after a period
• Four hours is about the half-life of a story
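The Newsfeed formula can be sketched as below. The slide does not say how the Time Since Publish Coefficient is calculated, so this example assumes an exponential decay with the four-hour half-life mentioned above. That decay, the function names, and the sample stories are all assumptions made for illustration.

```python
SOCIAL_WEIGHT = 10      # multiplier taken from the slide's formula
HALF_LIFE_HOURS = 4.0   # "four hours is about the half-life of a story"


def time_coefficient(hours_since_publish: float) -> float:
    # Assumed form: the coefficient halves every HALF_LIFE_HOURS.
    return 0.5 ** (hours_since_publish / HALF_LIFE_HOURS)


def newsfeed_score(views: int, social_interactions: int,
                   hours_since_publish: float) -> float:
    # (Views + (Social Interactions * 10)) * Time Since Publish Coefficient
    base = views + social_interactions * SOCIAL_WEIGHT
    return base * time_coefficient(hours_since_publish)


# Made-up stories: (title, views, social interactions, hours since publish)
stories = [
    ("older hit", 10000, 300, 12.0),
    ("fresh story", 1500, 40, 0.5),
]

ranked = sorted(stories, key=lambda s: newsfeed_score(*s[1:]), reverse=True)
for title, *_ in ranked:
    print(title)
```

With these sample numbers the fresh story ranks above the older hit. The older story has a much larger base score, but it has been through three half-lives. This is the "gives fresh content a chance" and "penalises popular content after a period" behaviour described on the slide.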

Slide 12

Slide 12 text

Stream based iterations / DAU
[Chart: Scrolls/DAU and Clicks/DAU for Timeline, New Style, Newsfeed and Infinite; axis from 0 to 0.45]

Slide 13

Slide 13 text

NewsFeed Statistics
• Based on actions per daily active user (DAU)
• Timeline -> Newsfeed clicks increased 9%
• Allowed us to take over the homepage
• Content density A/B test increased clicks 20%
• Infinite scroll increased clicks 20%
• Native Display -> 10x click through vs sidebar MPU
• Native content traffic drivers on every page
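A per-DAU comparison like the ones above can be worked through as follows. The counts are invented purely to show the arithmetic and are not Metro's figures. The function names are also assumptions.

```python
def per_dau(actions: int, daily_active_users: int) -> float:
    """Actions (clicks, scrolls, ...) per daily active user."""
    if daily_active_users <= 0:
        raise ValueError("daily_active_users must be positive")
    return actions / daily_active_users


def relative_change(before: float, after: float) -> float:
    """Fractional change from `before` to `after` (0.09 means +9%)."""
    return (after - before) / before


# Hypothetical counts for two variants of the stream.
timeline_clicks_per_dau = per_dau(actions=20_000, daily_active_users=100_000)
newsfeed_clicks_per_dau = per_dau(actions=23_980, daily_active_users=110_000)

uplift = relative_change(timeline_clicks_per_dau, newsfeed_clicks_per_dau)
print(f"{uplift:.0%}")
```

Dividing by DAU first means the two variants can be compared even when their audiences differ in size. In this example the raw click count rises by almost 20%, but the per-user uplift is about 9%.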

Slide 14

Slide 14 text

Lessons learned
• Content volume is key to ensure freshness
• Cut the data at the highest level for cache-ability
• Speed of lazy load essential
• Publishing times can affect clustering
• MySQL is simple but limited
• Common understanding helps iterating

Slide 15

Slide 15 text

WordPress Lessons
• Cache the first page of it using wpcom_vip_file_get_contents
• Copy the public API format to be able to change between sources quickly
• Large options allow you to store data in them
• Post meta can also store information
• CHEEZETEST is great but can add complication

Slide 16

Slide 16 text

API / Frontend
• Microservices architecture
  » Data mining
  » Newsfeed
  » Commercial feed
• Backbone used for templates
• CloudFront for caching

Slide 17

Slide 17 text

Metro10 – Android App
• Just top 10 stories on the site at any time
• Gives you more of what you read most
• 600 installs, 120 DAUs, 2 sessions a day, 13 screens/session
• Wouldn't have been possible without the API

Slide 18

Slide 18 text

Thanks for listening
http://blog.david-jensen.com
@elgrom