Activating Your Site - A look at activity streams

Justin Quick
September 04, 2014

DjangoCon US 2014 talk with Ben Fonarov and Farhan Syed.

http://lanyrd.com/2014/djangocon-us/sdcytk/
https://www.youtube.com/watch?v=R_tyN8bNe4A

Transcript

  1. Activating Your Site: A Look at Activity Streams
    - Presented by Justin Quick, Ben Fonarov & Farhan Syed
    - DjangoCon US 2014
  2. Who We Are
    - Farhan Syed: Application Developer, github.com/nearhan
    - Ben Fonarov: Full Stack-Tech Lead, github.com/chuwiey
    - Justin Quick: AppDev Tech Lead, Python/Django > 7yrs, github.com/justquick
    - github.com/natgeo
  3. Agenda
    - Activity Streams: What? Why?
    - The Need
    - The Problems
    - Engineering Considerations
    - The Activity Streams Specification
    - Solutions
    - As a Django App: django-activity-stream
    - As a Service: Horizon Activity Stream
  4. Activities increase engagement
    - Even something as simple as a “like button” increases traffic*
    - People like clicking things
    - Seeing what others are doing is always super awesome
    - *http://thinktraffic.net/will-facebook-like-buttons-bring-more-visitors-to-your-site-think-traffic-monthly-report-8
  6. Varied Implementations: Too many peppers == Evil
    - No cross-site interaction
    - Duplication of work == $$$
    - No common semantic structuring
  7. Storage, Data Structure & Schema
    - What to store?
    - What data structure? Relational vs Non-Relational
    - Will you have more writes or reads?
    - Should activities and final stream be stored differently?
  8. Centrality vs Sparsity
    - Centrality: too costly? Enforce limits? Is this a UX thing?
    - Sparsity: “Not fun”, “Cold Start”. Recommended content/activities in stream?
  9. Scalability
    - How do you deal with a massive amount of activities?
    - Where do you compute?
    - How do you deal with complex queries?
  10. Real-time vs Pre-computed
    - When do you do your computation?
    - Are you pushing out in real-time?
    - Half and half? (whipped please)
    - What are you actually computing/sending?
  12. It’s Spec Time!!!
    - http://activitystrea.ms/
    - ATOM/JSON
    - Gnip, BBC, Google Buzz, Gowalla, IBM, MySpace, Opera, Socialcast, Superfeedr, TypePad, Windows Live, YIID
  13. Anatomy of The Spec
    - Actor (required): The entity that performed the activity (e.g. user)
    - Verb (required): Specifies the type of action which was done by the actor (e.g. liked)
    - Action Object: Primary object of the activity (e.g. photo)
    - Target: What the action object belongs to (e.g. album)
    - Time, Title, Summary
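The anatomy above can be sketched as a small data structure. This is a minimal illustration, not the specification itself: the field names `actor`, `verb`, `object`, `target` and `published` follow the JSON form of the spec, but the real format represents actor, object and target as nested objects, whereas this sketch assumes plain string identifiers such as `"photo:42"` (hypothetical values).

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class Activity:
    actor: str                       # required: who performed the activity
    verb: str                        # required: what was done
    object: Optional[str] = None     # primary object of the activity
    target: Optional[str] = None     # what the object belongs to
    published: Optional[str] = None  # time of the activity (ISO 8601 string)

    def to_json(self) -> str:
        # Drop unset optional fields so only supplied parts are serialized.
        data = {k: v for k, v in asdict(self).items() if v is not None}
        return json.dumps(data, sort_keys=True)


# "justin liked photo 42 in album 7" (hypothetical identifiers)
like = Activity(actor="user:justin", verb="liked",
                object="photo:42", target="album:7")
print(like.to_json())
```

Only the actor and verb are mandatory here, mirroring the two parts the slide marks as required.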
  15. django-activity-stream
    - Uses the spec
    - Tracks any object in your Django project
    - Works with any Django-supported DB, using GFKs (generic foreign keys)
    - Provides a way to render activity reprs via templates/feeds
    - All generated at request time (caching is up to you)
    - github.com/justquick/django-activity-stream
  16. Models up close
    - ACTION
      - GFKs: actor, target, action_object (each points to ANY MODEL)
      - verb (char)
      - timestamp (datetime)
      - description (text)
      - public (bool)
    - FOLLOW
      - FK: user (points to AUTH/CUSTOM USER)
      - GFK: follow_object (points to ANY MODEL)
      - started (datetime)
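To make the shape of these two models concrete without requiring a Django project, here is a plain-Python sketch. It is not the library's model code: a generic foreign key is approximated by a `(type label, primary key)` tuple, and the type labels such as `"gallery.photo"` are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional, Tuple

# A generic reference: (type label, primary key). It stands in for a GFK,
# which in Django pairs a content type with an object id.
Ref = Tuple[str, int]


def _now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class Action:
    actor: Ref                            # generic: any model
    verb: str
    action_object: Optional[Ref] = None   # generic: any model
    target: Optional[Ref] = None          # generic: any model
    description: str = ""
    public: bool = True
    timestamp: datetime = field(default_factory=_now)


@dataclass
class Follow:
    user: int             # plain FK: always points at the user table
    follow_object: Ref    # generic: may point at any model
    started: datetime = field(default_factory=_now)


liked = Action(actor=("auth.user", 1), verb="liked",
               action_object=("gallery.photo", 42),
               target=("gallery.album", 7))
follow = Follow(user=1, follow_object=("gallery.album", 7))
```

The difference between the two reference kinds is visible in the types: `Follow.user` is a bare id because its table is fixed, while every generic reference has to carry its type alongside the id.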
  17. User Stream Query
    - user FOLLOWS object1, object2, object3
    - each followed object INVOLVES actions (action1 to action5) as Actor, Target or Object
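One way to read the user stream diagram is as a filter: keep every action in which a followed object appears as actor, target or action object. The sketch below assumes that reading and works on in-memory tuples rather than database rows; `user_stream` and the sample identifiers are hypothetical names, not part of django-activity-stream.

```python
from typing import List, NamedTuple, Optional, Set, Tuple

Ref = Tuple[str, int]  # (type label, primary key), standing in for a GFK


class Action(NamedTuple):
    actor: Ref
    verb: str
    action_object: Optional[Ref] = None
    target: Optional[Ref] = None


def user_stream(followed: Set[Ref], actions: List[Action]) -> List[Action]:
    """Return actions whose actor, action object or target is followed."""
    return [a for a in actions
            if a.actor in followed
            or a.action_object in followed
            or a.target in followed]


actions = [
    Action(("user", 2), "liked", ("photo", 42), ("album", 7)),
    Action(("user", 3), "joined", None, ("group", 1)),
    Action(("user", 4), "commented", ("comment", 9), ("photo", 42)),
]
followed = {("user", 2), ("photo", 42)}
stream = user_stream(followed, actions)
```

The custom stream on the next slide follows the same pattern with a different membership set (the players a game contains) and a narrower match (actor only).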
  18. Custom Stream Query
    - game CONTAINS player1, player2, player3
    - each player INVOLVES actions (action1 to action5) as Actor
  19. DB Problems
    - Normally: O(A * (Aa + At + Ao))
    - With prefetch_related (Django>=1.4): O(C)
    - GFKs can't handle aggregation/annotation, e.g. Count(actor__health__gt=5)
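The gap between the two costs can be illustrated by counting lookups. The slide does not define its symbols, so this sketch makes an assumption: the naive path issues one query per generic reference, while the batched path (the idea behind `prefetch_related`) issues one query per distinct referenced type. The in-memory `TABLES` dict and both function names are hypothetical stand-ins for the database.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Ref = Tuple[str, int]  # (type label, primary key)

# Hypothetical in-memory "tables", keyed by type label, then primary key.
TABLES: Dict[str, Dict[int, str]] = {
    "user": {1: "justin", 2: "ben"},
    "photo": {42: "sunset.jpg"},
}


def resolve_naive(refs: List[Ref]) -> Tuple[List[str], int]:
    """Resolve each reference with its own lookup ("query")."""
    queries = 0
    out = []
    for type_label, pk in refs:
        queries += 1
        out.append(TABLES[type_label][pk])
    return out, queries


def resolve_batched(refs: List[Ref]) -> Tuple[List[str], int]:
    """Group ids by type, then do one lookup ("query") per distinct type."""
    ids_by_type: Dict[str, Set[int]] = defaultdict(set)
    for type_label, pk in refs:
        ids_by_type[type_label].add(pk)
    fetched: Dict[Ref, str] = {}
    queries = 0
    for type_label, pks in ids_by_type.items():
        queries += 1  # stands in for a single "WHERE id IN (...)" query
        for pk in pks:
            fetched[(type_label, pk)] = TABLES[type_label][pk]
    return [fetched[r] for r in refs], queries


refs = [("user", 1), ("photo", 42), ("user", 2), ("user", 1)]
naive_values, naive_queries = resolve_naive(refs)
batched_values, batched_queries = resolve_batched(refs)
```

Both paths return the same values; only the number of round trips differs, and the batched count stays flat as more actions of the same types are added.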
  21. Horizon
    - Built as a service
    - Follows the activity stream spec (or tries to…)
    - Real-time/Pre-compute mix
    - Your models must have an API
    - Clear separation between front and back end
    - github.com/natgeo/activitystreams
  22. Horizon Ecosystem
    - Horizon: Node.js, Sails.js (MVC)
    - Neo4j, Redis
    - Pre-compute: Apache Storm, MQ
    - Front-End: Snippet, Stream, AJAX, Web Socket
  23. Storage Considerations
    - A graph database is perfectly suited for storing activities and for making interesting queries
    - Actor (node), edge, Object (node), edge, Target (node)
  24. Horizon Ecosystem
    - Horizon: Node.js, Sails.js (MVC)
    - Neo4j, Redis
    - Apache Storm, MQ
    - Snippet, Stream, AJAX, Web Socket
  25. Neo4j (In General)
    - Based on TinkerPop (a Java framework for property graphs)
    - Implements doubly linked lists as its data structure for relationships (edges), and the nodes have pointers to their relationships
  26. Complexity (Neo4j)
    - Doubly Linked Lists, average case:
      - Indexing: O(N)
      - Search: O(N)
      - Insert: O(1)
      - Delete: O(1)
    - * N = # of edges
    - ** Neo4j uses Lucene for indexing, so we get much better average-case for indexing and search
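The figures above are the textbook costs of a doubly linked list. The sketch below is a generic implementation that shows where each cost comes from; it is not Neo4j's storage code, and the verb strings are placeholder values.

```python
from typing import Iterator, Optional


class Node:
    def __init__(self, value: str) -> None:
        self.value = value
        self.prev: Optional["Node"] = None
        self.next: Optional["Node"] = None


class DoublyLinkedList:
    def __init__(self) -> None:
        self.head: Optional[Node] = None

    def insert_front(self, value: str) -> Node:
        """O(1): only the head pointers change."""
        node = Node(value)
        node.next = self.head
        if self.head is not None:
            self.head.prev = node
        self.head = node
        return node

    def delete(self, node: Node) -> None:
        """O(1) given the node itself: relink its two neighbours."""
        if node.prev is not None:
            node.prev.next = node.next
        else:
            self.head = node.next
        if node.next is not None:
            node.next.prev = node.prev
        node.prev = node.next = None

    def search(self, value: str) -> Optional[Node]:
        """O(N): walk the chain until the value is found."""
        node = self.head
        while node is not None:
            if node.value == value:
                return node
            node = node.next
        return None

    def __iter__(self) -> Iterator[str]:
        node = self.head
        while node is not None:
            yield node.value
            node = node.next


edges = DoublyLinkedList()
for verb in ("liked", "followed", "shared"):
    edges.insert_front(verb)
found = edges.search("followed")
if found is not None:
    edges.delete(found)
remaining = list(edges)
```

Deletion is constant time only when the node is already in hand; finding it by value first costs a linear search, which is the part an index improves.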
  27. What We Store
    - node: api: <string>, aid: <string>, type: <applabel_modelname>, created: <timestamp>, updated: <timestamp>
    - edge: type: Neo4j Native, created: <timestamp>, updated: <timestamp>
    - node: api: <string>, aid: <string>, type: <applabel_modelname>, created: <timestamp>, updated: <timestamp>
    - We do not store any model data in the graph!
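The stored fields can be sketched as two small records. This is an illustration only: the attribute names come from the slide, but the example URLs, the use of the verb as the edge type, and the dict used as a stand-in graph are assumptions rather than Horizon's actual schema.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class GraphNode:
    api: str        # where the model's data lives (hypothetical URL)
    aid: str        # id of the object inside its own application
    type: str       # "applabel_modelname"
    created: float  # timestamps as seconds since the epoch
    updated: float


@dataclass(frozen=True)
class GraphEdge:
    type: str       # assumed here to carry the verb, e.g. "FAVORITED"
    created: float
    updated: float


now = 1409788800.0  # fixed timestamp so the example is deterministic
user = GraphNode("http://example.com/api/user", "7", "auth_user", now, now)
video = GraphNode("http://example.com/api/video", "1", "youtube_video", now, now)

# A dict keyed by (start node, end node) stands in for the graph database.
graph: Dict[Tuple[GraphNode, GraphNode], GraphEdge] = {
    (user, video): GraphEdge("FAVORITED", now, now),
}
```

Nothing about the video itself (title, length, thumbnail) appears in the node; anyone who needs that data would follow `api` and `aid` back to the owning application.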
  28. Horizon Ecosystem
    - Horizon: Node.js, Sails.js (MVC)
    - Neo4j, Redis
    - Pre-compute: Apache Storm, MQ
    - Snippet, Stream, AJAX, Web Socket
  29. Apache Storm: Pre-Computation
    - Like Hadoop but for real-time message processing
    - Not a dependency, but a recommendation
    - Communication via message queue (e.g. Kafka)
    - Storm topologies: many different computation patterns
    - Dump results into Redis to have cached responses
  30. Horizon Ecosystem
    - Horizon: Node.js, Sails.js (MVC)
    - Neo4j, Redis
    - Pre-compute: Apache Storm, MQ
    - Front-End: Snippet, Stream, AJAX, Web Socket
  31. Horizon API
    - Supports multiple content models with the use of simple namespacing: applabel_modelname
    - Allows access to activities from the viewpoint of an actor, object, target and more
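The namespacing convention is simple enough to sketch as a pair of helpers. `make_type` and `split_type` are hypothetical names, and splitting on the first underscore assumes app labels never contain one, which the slides do not state.

```python
from typing import Tuple


def make_type(app_label: str, model_name: str) -> str:
    """Build the "applabel_modelname" string for a content model."""
    return f"{app_label}_{model_name}".lower()


def split_type(type_name: str) -> Tuple[str, str]:
    """Split on the first underscore (assumes app labels contain none)."""
    app_label, _, model_name = type_name.partition("_")
    if not app_label or not model_name:
        raise ValueError(f"not in applabel_modelname form: {type_name!r}")
    return app_label, model_name


video_type = make_type("youtube", "Video")
app_label, model_name = split_type(video_type)
```

Because the type travels with every node, two applications can both have a model called `video` without their activities colliding.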
  32. Example Call
    - GET all of the activities in which the youtube_video with id 1 has been FAVORITED
    - /object/ means that in this case youtube_video is an “action object”
  33. More Examples
    - Would return a specific activity as described by the spec
    - Remember that actor_type and object_type are in “applabel_modelname” syntax
    - Would return all activities done to an object by a specific type of actor (useful for counts)
  34. And Even Complex Stuff…
    - Proxy controller facilitates verbs like “followed”
    - The returned data looks very much like the example call
    - P.S.: there’s also a reverse proxy API
    - Diagram labels: Actor, Verb, Proxy, Proxy Verb, Object, Object, Object
  35. Problems
    - No control over external data
    - Graph databases don’t have a great ecosystem, yet (adapters, GORM etc.)
    - Front end vs back end computation of stream-specific logic
  36. Horizon Ecosystem
    - Horizon: Node.js, Sails.js (MVC)
    - Neo4j, Redis
    - Pre-compute: Apache Storm, MQ
    - Front-End: Snippet, Stream, AJAX, Web Socket
  37. The Snippet
    - Snippet: “Like Button”
    - Represents an Activity that an Actor can take on an Object
    - Sends activities over the wire to the Horizon service
    - And displays some information about the state of that activity
    - github.com/natgeo/modules-activitysnippet
  38. The Stream
    - Stream: “News Feed”
    - Displays streams based on context
    - Streams activities real-time in the browser
    - github.com/natgeo/modules-activitystream
  39. Request Response Cycle
    - App1, App2, Horizon, Stream
    - AJAX Request(s), Web Socket
    - Responses are cached (Local Storage)
  41. Open Source
    - All of the projects we spoke about are open source!
    - Contribute! There’s a lot more we can do!
    - We are also creating a django-horizon app
    - github.com/natgeo/modules-activitystream
    - github.com/natgeo/modules-activitysnippet
    - github.com/natgeo/activitystreams
    - github.com/justquick/django-activity-stream
  42. ATTRIBUTION
    - Activity Streams graphic content released under CC 3.0 by http://activitystrea.ms
    - Other images released under various licenses:
      - https://www.flickr.com/photos/thomashawk/8562000383 (CC BY-NC 2.0)
      - https://www.flickr.com/photos/rikomatic/2784359847 (CC BY-NC-SA 2.0)
      - https://www.flickr.com/photos/ajc1/5058431577 (CC BY-NC-SA 2.0)
      - https://www.flickr.com/photos/bumi/2803382816 (CC BY-SA 2.0)
      - https://www.flickr.com/photos/_tar0_/5241548889 (CC BY 2.0)
      - https://www.flickr.com/photos/katerha/14310234256 (CC BY 2.0)
      - https://www.flickr.com/photos/quinnanya/8203757695 (CC BY-SA 2.0)
      - http://en.wikipedia.org/wiki/Infinite_monkey_theorem#mediaviewer/File:Monkey-typing.jpg (public domain via Wikimedia Commons)
    - Cropping and resizing modifications were made to fit the presentation format.