A personalized news feed

Viadeo
April 10, 2015

A personalized real-time news feed for 65 million users - Devoxx conference, April 2015

Fabrice Robini @frobini
Greg Truchetet @Myxz
Quentin Suire @Kuhess

Transcript

  1. @Viadeo
    #elasticfeed
    “ A personalized news feed ”
    Fabrice Robini @frobini
    Greg Truchetet @Myxz
    Quentin Suire @Kuhess

  2. @Viadeo
    #elasticfeed

  3. @Viadeo
    #elasticfeed
    News feed

  4. @Viadeo
    #elasticfeed
    What is a news feed?

  5. @Viadeo
    #elasticfeed
    What is a news item?
    http://www.w3.org/TR/activitystreams-core/

  6. @Viadeo
    #elasticfeed
    Why Elasticsearch?
    ● Our use cases are search oriented
    ● We want to score & boost results
    ● And we already use it in our infrastructure :)

  7. @Viadeo
    #elasticfeed
    Activity Stream
    /newsfeed/news/urn:viadeo:news:33
    {
      "producerId": "urn:viadeo:member:john",
      "published": "2014-12-17T12:45:00.000Z",
      "actor": {
        "id": "urn:viadeo:member:john",
        "objectType": "member",
        "displayName": "John"
      },
      "verb": "add",
      "object": {
        "id": "urn:viadeo:position:10",
        "objectType": "position",
        "displayName": "Software Engineer @ Viadeo"
      },
      "target": {
        "id": "urn:viadeo:profile:john",
        "objectType": "profile",
        "displayName": "John's profile"
      }
    }
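
The document above could be assembled in application code along these lines. This is only a sketch: the helper names `urn`, `ref` and `build_news` are hypothetical, and setting `producerId` to the actor's id is an assumption drawn from this one example.

```python
from datetime import datetime, timezone


def urn(kind: str, key: str) -> str:
    """Build an identifier in the urn:viadeo:<kind>:<key> form used on the slide."""
    return f"urn:viadeo:{kind}:{key}"


def ref(kind: str, key: str, display_name: str) -> dict:
    """One actor/object/target entry: id, objectType and displayName."""
    return {"id": urn(kind, key), "objectType": kind, "displayName": display_name}


def build_news(actor: dict, verb: str, obj: dict, target: dict, published: datetime) -> dict:
    """Assemble a news document; the producer is assumed to be the actor."""
    # Millisecond-precision UTC timestamp, matching the slide's format.
    stamp = published.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"
    return {
        "producerId": actor["id"],
        "published": stamp,
        "actor": actor,
        "verb": verb,
        "object": obj,
        "target": target,
    }


news = build_news(
    actor=ref("member", "john", "John"),
    verb="add",
    obj=ref("position", "10", "Software Engineer @ Viadeo"),
    target=ref("profile", "john", "John's profile"),
    published=datetime(2014, 12, 17, 12, 45, tzinfo=timezone.utc),
)
```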

  8. @Viadeo
    #elasticfeed
    News Creation

  9. @Viadeo
    #elasticfeed
    Store a pre-computed news feed for each consumer
    Fan-out on write

  10. @Viadeo
    #elasticfeed
    Fan-out on write
    Store a pre-computed news feed for each consumer

  11. @Viadeo
    #elasticfeed
    Fan-out on read
    Store atomic news and compose a news feed at the query time

  12. @Viadeo
    #elasticfeed
    Store atomic news and compose a news feed at the query time
    Fan-out on read
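
The two strategies can be contrasted with a small in-memory sketch. Nothing here comes from the talk's implementation: the `followers` and `network` dictionaries and both classes are hypothetical stand-ins for real stores.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for the real stores.
followers = {"john": ["michel", "ned"], "mary": ["michel"]}  # producer -> consumers
network = {"michel": ["john", "mary"], "ned": ["john"]}      # consumer -> producers


class FanOutOnWrite:
    """Copy each news item into every consumer's pre-computed feed at write time."""

    def __init__(self):
        self.feeds = defaultdict(list)

    def publish(self, producer, news):
        for consumer in followers.get(producer, []):
            self.feeds[consumer].append(news)

    def read(self, consumer):
        return list(self.feeds[consumer])


class FanOutOnRead:
    """Store each news item once and compose the feed at query time."""

    def __init__(self):
        self.news = []  # (producer, news) pairs

    def publish(self, producer, news):
        self.news.append((producer, news))

    def read(self, consumer):
        producers = set(network.get(consumer, []))
        return [item for producer, item in self.news if producer in producers]


on_write, on_read = FanOutOnWrite(), FanOutOnRead()
for store in (on_write, on_read):
    store.publish("john", "John added a position")
    store.publish("mary", "Mary updated her profile")
```

Both stores return the same feed for a given consumer. The write-side variant holds one copy per consumer (three entries here), while the read-side variant holds each item once (two entries) and does the filtering work on every read.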

  13. @Viadeo
    #elasticfeed
    Personalized

  14. @Viadeo
    #elasticfeed
    What is a news feed?
    Network

  15. @Viadeo
    #elasticfeed
    News Filtering

  16. @Viadeo
    #elasticfeed
    Network Document
    Michel knows Ned, John and Mary
    can be represented as
    {
      "producers": [
        "urn:viadeo:member:ned",
        "urn:viadeo:member:john",
        "urn:viadeo:member:mary"
      ]
    }
    /network/network/urn:viadeo:network:michel

  17. @Viadeo
    #elasticfeed
    Personalized
    Explicit terms filter
    "terms": {
      "producerId": [
        "urn:viadeo:member:john",
        "urn:viadeo:member:ned",
        "urn:viadeo:member:mary"
      ]
    }
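
With the explicit form, the client has to read the network document first and then send the whole producer list with each search. A minimal sketch of that step, where the function name `explicit_terms_filter` is hypothetical and the network document is passed in as a plain dict:

```python
def explicit_terms_filter(network_doc: dict, field: str = "producerId") -> dict:
    """Build a terms filter listing every producer of the network document."""
    return {"terms": {field: list(network_doc["producers"])}}


# Network document for Michel, as on the "Network Document" slide.
michel_network = {
    "producers": [
        "urn:viadeo:member:ned",
        "urn:viadeo:member:john",
        "urn:viadeo:member:mary",
    ]
}

terms_filter = explicit_terms_filter(michel_network)
```

The filter grows with the size of the member's network, since every producer id travels in the request body.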

  18. @Viadeo
    #elasticfeed
    Personalized
    Terms filter lookup
    "terms": {
      "producerId": {
        "index": "network",
        "type": "network",
        "id": "urn:viadeo:network:michel",
        "path": "producers"
      }
    }
    http://www.elastic.co/guide/
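
With the lookup form, the request only names the network document and Elasticsearch reads the producer list itself. A sketch of a builder for it follows; the function name `terms_lookup_filter` is hypothetical, and the `type` key is kept because the slide shows it (later Elasticsearch releases removed mapping types, so it may need to be dropped there).

```python
def terms_lookup_filter(member_key: str, field: str = "producerId") -> dict:
    """Terms filter whose values are read from the member's network document."""
    return {
        "terms": {
            field: {
                "index": "network",
                "type": "network",  # as on the slide; may not apply to newer releases
                "id": f"urn:viadeo:network:{member_key}",
                "path": "producers",
            }
        }
    }


lookup = terms_lookup_filter("michel")
```

The request stays the same size however many producers the member follows.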

  19. @Viadeo
    #elasticfeed
    Demo

  20. @Viadeo
    #elasticfeed
    Basic ordering
    #1 No relevancy: timeline-ordered activities
    "sort": [
      {
        "published": {
          "order": "desc"
        }
      }
    ]
    http://www.elastic.co/guide/
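
The sort clause is one part of a search body. A sketch of how it might sit next to the feed filter is shown here; the function name `timeline_search`, the `size` default and the `bool`/`filter` wrapper are assumptions, since the slides do not show the enclosing query and its syntax depends on the Elasticsearch version.

```python
def timeline_search(feed_filter: dict, size: int = 20) -> dict:
    """Search body returning the newest matching news first."""
    return {
        "size": size,
        # Assumed wrapper: the slides only show the filter and sort fragments.
        "query": {"bool": {"filter": [feed_filter]}},
        "sort": [{"published": {"order": "desc"}}],
    }


body = timeline_search(
    {"terms": {"producerId": ["urn:viadeo:member:john", "urn:viadeo:member:mary"]}}
)
```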

  21. @Viadeo
    #elasticfeed
    Function score
    #2 Relevancy on likes: boost on the likes value
    "script_score": {
      "script": "_score * doc['likes'].value"
    }
    http://www.elastic.co/guide/
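
To see what the script expression does to an ordering, it can be mirrored in plain Python. This mirrors the expression only: how Elasticsearch combines the result with the query score depends on settings the slide does not show, and the equal base score for every hit is an assumption.

```python
def script_value(score: float, likes: int) -> float:
    """Plain-Python mirror of the slide's expression: _score * doc['likes'].value."""
    return score * likes


# Hypothetical news items with their like counts.
docs = [
    {"id": "a", "likes": 0},
    {"id": "b", "likes": 3},
    {"id": "c", "likes": 12},
]

BASE_SCORE = 1.0  # assumed: every hit starts with the same score

ranked = sorted(docs, key=lambda d: script_value(BASE_SCORE, d["likes"]), reverse=True)
ranked_ids = [d["id"] for d in ranked]
```

One consequence of a pure multiplication is that an item with no likes gets a value of zero whatever its base score, and the publication date plays no part in the ordering.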

  22. @Viadeo
    #elasticfeed
    Decay function
    #3 Relevancy on likes & date: using a decay function
    "gauss": {
      "published": {
        "origin": "2015-04-01T12:00:00.000Z",
        "offset": "3d",
        "scale": "3d",
        "decay": 0.5
      }
    }
    http://www.elastic.co/guide/
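
The effect of these four parameters can be reproduced outside Elasticsearch. The sketch below follows the Gaussian decay formula given in the Elasticsearch documentation, exp(-max(0, |value - origin| - offset)^2 / (2σ^2)) with σ^2 = -scale^2 / (2 ln(decay)); the function name `gauss_decay` is hypothetical.

```python
import math
from datetime import datetime, timedelta, timezone


def gauss_decay(value, origin, offset, scale, decay=0.5):
    """Multiplier in (0, 1] for a date `value`, given the slide's parameters."""
    # No decay at all within `offset` of the origin.
    distance = max(timedelta(0), abs(value - origin) - offset)
    # Sigma is chosen so the multiplier equals `decay` at distance == scale.
    sigma_sq = -(scale.total_seconds() ** 2) / (2 * math.log(decay))
    return math.exp(-(distance.total_seconds() ** 2) / (2 * sigma_sq))


origin = datetime(2015, 4, 1, 12, 0, tzinfo=timezone.utc)
three_days = timedelta(days=3)

fresh = gauss_decay(origin - timedelta(days=2), origin, three_days, three_days)
older = gauss_decay(origin - timedelta(days=6), origin, three_days, three_days)
oldest = gauss_decay(origin - timedelta(days=9), origin, three_days, three_days)
```

With the slide's values, news up to 3 days from the origin keeps a multiplier of 1, news 6 days away gets 0.5 (offset plus one scale), and news 9 days away gets about 0.0625.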

  23. @Viadeo
    #elasticfeed
    Decay Function

  24. @Viadeo
    #elasticfeed
    Today

  25. @Viadeo
    #elasticfeed
    Our ES cluster
    • 300 million documents
    • 50 million search queries / day
    • 3 nodes
    • Backup every 15 minutes
    • Average response time: 50ms
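
For a sense of scale, the daily figures can be turned into rates with simple arithmetic. These are averages only: the slide gives no peak traffic numbers, and the variable names are illustrative.

```python
QUERIES_PER_DAY = 50_000_000
SECONDS_PER_DAY = 24 * 60 * 60
BACKUP_INTERVAL_MINUTES = 15

# Average search rate across the whole cluster, assuming evenly spread traffic.
avg_queries_per_second = QUERIES_PER_DAY / SECONDS_PER_DAY

# Number of backups taken in one day at the stated interval.
backups_per_day = (24 * 60) // BACKUP_INTERVAL_MINUTES
```

That works out to roughly 579 searches per second on average, and 96 backups per day.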

  26. @Viadeo
    #elasticfeed
    Take Away
    • Terms Filter Lookup to aggregate documents across
    several indices
    • Deal with relevancy by using function_score (decay
    functions in our case)
    • Play & POC with ES :-)
    http://www.elastic.co/guide/

  27. @Viadeo
    #elasticfeed
    Thank you!
    & we’re hiring :)
    Fabrice Robini @frobini
    Greg Truchetet @Myxz
    Quentin Suire @Kuhess
