
Dealing With Time Travelers in Analytics

xyu
September 10, 2015

It's not enough to have an analytics infrastructure that handles only real-time online actions. The proliferation of mobile devices and international expansion mean analytics platforms must also account for offline actions while ensuring that tracking does not harm the user experience. At WordPress.com, we are building and scaling an analytics infrastructure to satisfy the dual mandate of accepting arbitrarily ordered events while efficiently coalescing them for querying. Here's what we've learned on our journey towards a lambda architecture.
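
As a rough illustration of "accepting arbitrarily ordered events while coalescing them for querying", here is a minimal sketch, not the WordPress.com implementation. It assumes each event carries a client-generated unique `id` and a timestamp `ts`; those field names and the `coalesceEvents` helper are hypothetical and do not come from the deck.

```javascript
// Hypothetical event shape: { id, name, ts }.
// `id` is assumed to be unique per event so that re-sent copies
// (at-least-once delivery) can be recognised and dropped.
function coalesceEvents(batches) {
  const seen = new Map(); // id -> event; the first copy wins

  for (const batch of batches) {
    for (const event of batch) {
      if (!seen.has(event.id)) {
        seen.set(event.id, event);
      }
    }
  }

  // Order by event time; break ties by id so the result is deterministic.
  return [...seen.values()].sort(
    (a, b) => a.ts - b.ts || (a.id < b.id ? -1 : a.id > b.id ? 1 : 0)
  );
}

// Batches arrive out of order, and event "c" is delivered twice.
const batches = [
  [{ id: 'c', name: 'wpcom_post_edit', ts: 30 }],
  [
    { id: 'a', name: 'wpcom_post_edit', ts: 10 },
    { id: 'c', name: 'wpcom_post_edit', ts: 30 },
  ],
  [{ id: 'b', name: 'wpcom_post_edit', ts: 20 }],
];

const merged = coalesceEvents(batches);
console.log(merged.map((e) => e.id).join(',')); // prints "a,b,c"
```

Keeping the first copy of each `id` makes the merge idempotent, so replaying a batch does not change the result.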


Transcript

  1. xyu@automattic.com  @HypertextRanch  xyu.io  xyu  Dealing With Time Travelers in Analytics  Xiao Yu / Automattic
  2.   

  3.  

  4. VaultPress, Jetpack, Simplenote, Akismet, Polldaddy, Gravatar, VideoPress, IntenseDebate, Simperium, Code Poet, Cloudup, WooCommerce, Longreads, WordPress.com, WordPress.com VIP
  5. “Gee... I really like watching a spinner go round and round.” –No One Ever
  6. 100ms Delay → 62 Years Wasted

  7. None
  8. Better Page Speed → More Conversions

  9. Emerging Markets & Mobile More Latencies & Less Bandwidth

  10. Data Scientist Performance Engineer

  11. –Sage Advice “You can't improve what you can't measure.”

  12. Data Scientist: • Load pre-render • Add callback to all actions • Wait for successful response. Performance Engineer: • Lazy load • Track only minimum required • Send via background task
  13.  Don’t Block The Critical Path

  14.  Don’t Block The Critical Path Build Optimistic UIs

  15.  Don’t Block The Critical Path Build Optimistic UIs Build Native Apps
  16. Client Speed — Don’t Block

  17. <script> window._tkq = window._tkq || []; window._tkq.push( [ 'recordEvent', 'wpcom_post_edit', { custom_data: '…' } ] ); </script>

  18. <script> window._tkq = window._tkq || []; window._tkq.push( [ 'recordEvent', 'wpcom_post_edit', { custom_data: '…' } ] ); </script>

  19. <script> window._tkq = window._tkq || []; window._tkq.push( [ 'recordEvent', 'wpcom_post_edit', { custom_data: '…' } ] ); </script> … <script src='//stats.wp.com/w.js' async defer></script>
  20. “Everything should be made as simple as possible, but not simpler.” –Einstein
  21. None
  22. Faster Analytics → 11.7% Bump in Conversions

  23. Clients guarantee events are sent at least once; servers clean up the data and do everything else.
  24. Clients tell us when events happened…

  25. Clients tell us when events happened… but client clocks are unreliable.
  26. Event A Event B ?

  27. Morphine

  28. [Diagram: Client Clock vs. Server Clock; values shown: 10, 20, 21, 22, 32, 10]

  29. [Diagram: Client Clock vs. Server Clock; values shown: 10, 11, 20, 21, 22, 32, 10]
  30. Now T-9

  31. Morphine

  32. Morphine

  33. https://automattic.com/work-with-us/data-wrangler/

  34. xyu@automattic.com  @HypertextRanch  xyu.io  xyu  Thanks!
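
Slides 17 to 19 push events onto `window._tkq` before the async script has loaded. The sketch below shows one way such a command queue could be drained once a loader arrives. It is an assumption about the general pattern, not the contents of `w.js`: `installQueue`, `handlers`, `sent`, and the `wpcom_post_publish` event name are all hypothetical.

```javascript
// Stand-in for the browser's `window` so the sketch also runs under Node.
const win = typeof window !== 'undefined' ? window : {};

// 1. Page code (as on the slides): queue a command in a plain array.
//    Nothing is fetched or executed yet, so the critical path is not blocked.
win._tkq = win._tkq || [];
win._tkq.push(['recordEvent', 'wpcom_post_edit', { custom_data: '…' }]);

// 2. Hypothetical loader logic: what an async script could do on arrival.
const sent = []; // stands in for network requests
const handlers = {
  recordEvent(name, props) {
    sent.push({ name, props });
  },
};

function installQueue(w) {
  const pending = Array.isArray(w._tkq) ? w._tkq : [];
  const dispatch = ([method, ...args]) => {
    if (Object.prototype.hasOwnProperty.call(handlers, method)) {
      handlers[method](...args);
    }
  };
  // Swap the array for an object whose push() dispatches immediately,
  // then replay everything that was queued before the loader ran.
  w._tkq = { push: (...commands) => commands.forEach((c) => dispatch(c)) };
  pending.forEach((c) => dispatch(c));
}

installQueue(win);

// 3. Later page code keeps calling push(); it is now handled right away.
win._tkq.push(['recordEvent', 'wpcom_post_publish', {}]);

console.log(sent.map((e) => e.name).join(',')); // prints "wpcom_post_edit,wpcom_post_publish"
```

Because page code only ever calls `push()`, it does not need to know whether the loader has arrived yet.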
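
Slides 24 to 30 note that client clocks are unreliable. One common way to cope, sketched here as an assumption rather than as the method used in the talk, is to trust the client clock only for the elapsed time between the event and the upload, and to anchor that age to the server's receive time. The field names `occurredAt` and `sentAt`, the `correctTimestamp` helper, and the sample numbers are hypothetical, and network latency is ignored.

```javascript
// Hypothetical payload: both values are read from the *client* clock.
//   occurredAt: when the event happened
//   sentAt:     when the batch was uploaded
// receivedAt is read from the *server* clock when the batch arrives.
function correctTimestamp({ occurredAt, sentAt }, receivedAt) {
  // The client clock may be wrong in absolute terms, but the difference
  // between two of its readings is usually a fair estimate of elapsed time.
  const age = sentAt - occurredAt;

  // If the age is missing or negative (for example, the clock was changed
  // between the two readings), fall back to the server's receive time.
  if (!Number.isFinite(age) || age < 0) {
    return receivedAt;
  }
  return receivedAt - age;
}

// Illustrative numbers: the event is 9 ticks old when it is sent,
// and the server receives it at server time 32.
const corrected = correctTimestamp({ occurredAt: 11, sentAt: 20 }, 32);
console.log(corrected); // prints 23
```

The absolute offset between the two clocks cancels out, so an offline device whose clock is hours off still yields a usable server-side timestamp as long as its clock ran at a steady rate between the event and the upload.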