A personalized news feed

Viadeo
April 10, 2015

A personalized real-time news feed for 65 million users - Devoxx conference, April 2015

Fabrice Robini @frobini
Greg Truchetet @Myxz
Quentin Suire @Kuhess

Transcript

  1. @Viadeo #elasticfeed “A personalized news feed” Fabrice Robini @frobini, Greg Truchetet @Myxz, Quentin Suire @Kuhess
  2. @Viadeo #elasticfeed Why Elasticsearch? • Our use cases are search oriented • We want to score & boost results • And we already use it in our infrastructure :)
  3. @Viadeo #elasticfeed Activity Stream /newsfeed/news/urn:viadeo:news:33
    { "producerId": "urn:viadeo:member:john", "published": "2014-12-17T12:45:00.000Z", "actor": { "id": "urn:viadeo:member:john", "objectType": "member", "displayName": "John" }, "verb": "add", "object": { "id": "urn:viadeo:position:10", "objectType": "position", "displayName": "Software Engineer @ Viadeo" }, "target": { "id": "urn:viadeo:profile:john", "objectType": "profile", "displayName": "John's profile" } }
  4. @Viadeo #elasticfeed Network Document: “Michel knows Ned, John and Mary” can be represented as
    { "producers": [ "urn:viadeo:member:ned", "urn:viadeo:member:john", "urn:viadeo:member:mary" ] } /network/network/urn:viadeo:network:michel
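The two document shapes on slides 3 and 4 can be sketched in plain Python to show how they relate: an activity carries a `producerId`, and a member's network document lists the producers that member follows. The helper names (`iso_millis`, `member_urn`, `make_activity`, `make_network`) are hypothetical and not taken from the talk; only the field names and URN patterns come from the slides.

```python
from datetime import datetime, timezone


def iso_millis(moment):
    """Format an aware datetime as UTC ISO 8601 with millisecond precision."""
    utc = moment.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"


def member_urn(name):
    """Build a member URN following the pattern shown on the slides."""
    return f"urn:viadeo:member:{name}"


def make_activity(actor, verb, obj, target, published):
    """Assemble an actor/verb/object/target activity (hypothetical helper)."""
    return {
        "producerId": actor["id"],  # the field the feed filter matches on
        "published": iso_millis(published),
        "actor": actor,
        "verb": verb,
        "object": obj,
        "target": target,
    }


def make_network(producer_names):
    """Assemble a network document: the producers one member follows."""
    return {"producers": [member_urn(name) for name in producer_names]}


john = {"id": member_urn("john"), "objectType": "member", "displayName": "John"}
activity = make_activity(
    actor=john,
    verb="add",
    obj={"id": "urn:viadeo:position:10", "objectType": "position",
         "displayName": "Software Engineer @ Viadeo"},
    target={"id": "urn:viadeo:profile:john", "objectType": "profile",
            "displayName": "John's profile"},
    published=datetime(2014, 12, 17, 12, 45, tzinfo=timezone.utc),
)

# Michel's network document, as on slide 4.
network = make_network(["ned", "john", "mary"])

# John's activity belongs in Michel's feed because John is one of his producers.
in_feed = activity["producerId"] in network["producers"]
```

Keeping `producerId` at the top level of the activity, rather than only inside `actor`, is what lets a single flat field be matched against the `producers` list.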
  5. @Viadeo #elasticfeed Personalized: explicit terms filter
    "terms": { "producerId": [ "urn:viadeo:member:john", "urn:viadeo:member:ned", "urn:viadeo:member:mary" ] }
  6. @Viadeo #elasticfeed Personalized: terms filter lookup
    "terms": { "producerId": { "index": "network", "type": "network", "id": "urn:viadeo:network:michel", "path": "producers" } } http://www.elastic.co/guide/
  7. @Viadeo #elasticfeed Basic ordering #1 No relevancy: timeline-ordered activities
    "sort": [ { "published": { "order": "desc" } } ] http://www.elastic.co/guide/
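A minimal sketch of how the terms filter lookup and the timeline sort might fit together in one search body. Wrapping the filter in a `filtered` query is an assumption based on the Elasticsearch 1.x syntax of the period; the slides show only the `terms` and `sort` fragments. `feed_query` and `run_locally` are hypothetical helpers, and `run_locally` merely imitates the lookup in memory so the example runs without a cluster.

```python
def feed_query(member):
    """Build a search body: terms filter lookup plus newest-first sort."""
    lookup = {
        "index": "network",
        "type": "network",
        "id": f"urn:viadeo:network:{member}",
        "path": "producers",
    }
    return {
        # Assumption: 1.x-era "filtered" wrapper, not shown on the slides.
        "query": {"filtered": {"filter": {"terms": {"producerId": lookup}}}},
        "sort": [{"published": {"order": "desc"}}],
    }


def run_locally(body, network_docs, activities):
    """Imitate the lookup and sort in memory (illustration only)."""
    lookup = body["query"]["filtered"]["filter"]["terms"]["producerId"]
    # Resolve the lookup: fetch the network document and read its "path" field.
    producers = set(network_docs[lookup["id"]][lookup["path"]])
    hits = [a for a in activities if a["producerId"] in producers]
    newest_first = body["sort"][0]["published"]["order"] == "desc"
    # UTC ISO 8601 strings in one fixed format sort chronologically as text.
    return sorted(hits, key=lambda a: a["published"], reverse=newest_first)


network_docs = {
    "urn:viadeo:network:michel": {
        "producers": [
            "urn:viadeo:member:ned",
            "urn:viadeo:member:john",
            "urn:viadeo:member:mary",
        ]
    }
}
activities = [
    {"id": "news:1", "producerId": "urn:viadeo:member:john",
     "published": "2014-12-17T12:45:00.000Z"},
    {"id": "news:2", "producerId": "urn:viadeo:member:mary",
     "published": "2015-01-05T09:00:00.000Z"},
    {"id": "news:3", "producerId": "urn:viadeo:member:zoe",
     "published": "2015-02-01T08:00:00.000Z"},
]

feed = run_locally(feed_query("michel"), network_docs, activities)
feed_ids = [a["id"] for a in feed]
```

With the lookup form, the client sends only the network document's id; the list of producers never has to travel with each request, unlike the explicit terms filter on slide 5.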
  8. @Viadeo #elasticfeed Function score #2 Relevancy on Likes: boost on Likes value
    "script_score": { "script": "_score * doc['likes'].value" } http://www.elastic.co/guide/
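The script expression on this slide can be mirrored in Python to see its effect on ranking. This is only a local imitation of the arithmetic, with made-up hits; the query score of 1.0 is an assumption standing in for a filter-only query. In a real `function_score` query, how the script result is combined with the query score also depends on the `boost_mode` setting.

```python
def script_score(query_score, likes):
    """Python equivalent of the expression _score * doc['likes'].value."""
    return query_score * likes


# Hypothetical hits: every query score is 1.0, so likes alone decide the order.
hits = [
    {"id": "news:1", "_score": 1.0, "likes": 3},
    {"id": "news:2", "_score": 1.0, "likes": 12},
    {"id": "news:3", "_score": 1.0, "likes": 0},
]

ranked = sorted(
    hits,
    key=lambda hit: script_score(hit["_score"], hit["likes"]),
    reverse=True,
)
ranked_ids = [hit["id"] for hit in ranked]
zero_like_score = script_score(1.0, 0)
```

One consequence of a pure multiplication is that an activity with no likes scores zero regardless of how recent it is, which is one motivation for bringing the publication date into the score.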
  9. @Viadeo #elasticfeed Decay function #3 Relevancy on Likes & Date: using a decay function
    "gauss": { "published": { "origin": "2015-04-01T12:00:00.000Z", "offset": "3d", "scale": "3d", "decay": 0.5 } } http://www.elastic.co/guide/
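To make the `gauss` parameters concrete, the sketch below re-implements the Gaussian decay formula as documented for Elasticsearch's `function_score` query: documents within `offset` of `origin` keep a factor of 1, and the factor drops to `decay` once the distance reaches `offset + scale`. The function name `gauss_decay` is hypothetical; this is a local re-computation, not a call into Elasticsearch.

```python
import math
from datetime import datetime, timedelta, timezone


def gauss_decay(value, origin, offset, scale, decay=0.5):
    """Gaussian decay factor in (0, 1] for a date field."""
    # Distance beyond the offset, in seconds; inside the offset it is zero.
    distance = max(0.0, abs((value - origin).total_seconds()) - offset.total_seconds())
    # Variance chosen so that the factor equals `decay` at distance == scale.
    sigma_sq = -(scale.total_seconds() ** 2) / (2.0 * math.log(decay))
    return math.exp(-(distance ** 2) / (2.0 * sigma_sq))


# Parameters from the slide: origin 2015-04-01T12:00Z, offset 3d, scale 3d, decay 0.5.
origin = datetime(2015, 4, 1, 12, 0, tzinfo=timezone.utc)
offset = timedelta(days=3)
scale = timedelta(days=3)

two_days_old = gauss_decay(origin - timedelta(days=2), origin, offset, scale)
six_days_old = gauss_decay(origin - timedelta(days=6), origin, offset, scale)
nine_days_old = gauss_decay(origin - timedelta(days=9), origin, offset, scale)
```

With these settings an activity published up to three days before the origin is not penalized, a six-day-old one keeps half its weight, and a nine-day-old one keeps 0.5 to the fourth power, that is 0.0625, since it lies two scale lengths beyond the offset.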
  10. @Viadeo #elasticfeed Our ES cluster • 300 million documents • 50 million search queries / day • 3 nodes • Backup every 15 minutes • Average response time: 50 ms
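As a rough back-of-the-envelope reading of these figures, the daily query volume can be converted to an average rate. The even spread over the day and across the three nodes is an assumption made purely for illustration; real traffic has peaks, and the slides do not say how load is balanced.

```python
QUERIES_PER_DAY = 50_000_000  # from the slide
NODES = 3                     # from the slide
SECONDS_PER_DAY = 24 * 60 * 60

# Average rate, assuming queries are spread evenly over the day.
avg_qps = QUERIES_PER_DAY / SECONDS_PER_DAY

# Assumption: load is shared equally by the three nodes.
avg_qps_per_node = avg_qps / NODES
```

That works out to roughly 579 queries per second for the cluster, or about 193 per node under the equal-sharing assumption.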
  11. @Viadeo #elasticfeed Take Away • Terms filter lookup to aggregate documents across several indices • Deal with relevancy by using function_score (decay functions in our case) • Play & POC with ES :-) http://www.elastic.co/guide/
  12. @Viadeo #elasticfeed Thank you! & we’re hiring :) Fabrice Robini @frobini, Greg Truchetet @Myxz, Quentin Suire @Kuhess