
An Evening with MongoDB - San Diego: Real User Measurements with MongoDB

mongodb
July 25, 2012


Eric Azoulay, Neustar
In this talk I will briefly cover what Neustar does and what real user monitoring is, then explain why we chose MongoDB to store the massive amount of data we expect to collect. I will go into more detail about our data model, our architecture on Amazon AWS, our aggregation jobs, and related topics. After that I will cover daily operations such as maintenance, backups, and monitoring with the MongoDB Monitoring Service (MMS). Finally, I will talk about what is coming up with MongoDB in this project: sharding for scaling out, and the new aggregation framework for advanced features of our product.


Transcript

  1. Eric Azoulay
    Neustar Web Performance
    [email protected]
    Real User Measurements
    with MongoDB


  2. What is…
    Neustar?


  3. Neustar offerings
    »  Number Portability, Common Short Codes & QR Codes
    »  UltraDNS
    »  DDoS Protection
    »  IP Intelligence: Fraud Prevention, Localized Web Content
    »  Web Performance Management: Performance Monitoring, Load Testing, Real User Measurements


  4. Neustar Web Performance Management
    »  Synthetic website monitoring
    »  Load testing service
    »  Intelligent alerting
    »  Real User Measurements


  5. Why Real User Measurements?
    »  RUM identifies issues experienced by your users
    »  Covers your users' locations, browsers, and paths through your website
    »  Data collected: URL, total page load time, time to first byte, DNS time, redirect time
    »  Data NOT collected: session, cookie, personal information


  6. Navigation Timing API
    »  Chrome, FF 7+, IE9+ (more than 2/3 of browsers)
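
The metrics listed on the previous slide (DNS time, redirect time, time to first byte, total page load time) can all be derived from the timestamps the Navigation Timing API exposes. The following is a minimal sketch, assuming the standard `performance.timing` attribute names; the helper name `deriveTimings`, its output keys, and the sample timestamps are made up for illustration, and the exact formulas used by the Neustar beacon are not given in the deck.

```javascript
// Hypothetical helper: derive durations (in ms) from a Navigation Timing
// `performance.timing`-style object. Output key names are illustrative only.
function deriveTimings(t) {
  return {
    redirectTime: t.redirectEnd - t.redirectStart,
    dnsTime: t.domainLookupEnd - t.domainLookupStart,
    connectTime: t.connectEnd - t.connectStart,
    timeToFirstByte: t.responseStart - t.navigationStart,
    pageLoadTime: t.loadEventEnd - t.navigationStart
  };
}

// Made-up timestamps standing in for window.performance.timing.
const sample = {
  navigationStart: 1000,
  redirectStart: 0, redirectEnd: 0,          // no redirect happened
  domainLookupStart: 1005, domainLookupEnd: 1011,
  connectStart: 1011, connectEnd: 1100,
  responseStart: 1400,
  loadEventEnd: 4505
};

const timings = deriveTimings(sample);
console.log(timings);
```

In a browser the helper would be called as `deriveTimings(window.performance.timing)`, but only once the load event has finished, since `loadEventEnd` is 0 until then.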


  7. Neustar Web Performance RUM
    »  Beta product – free!
    »  Captures experience of actual users on your site
    »  Only thing to do is copy a tiny piece of JavaScript code into your web page template
    »  Performance data is collected for every page visit after the page is done loading
    »  THIS COULD TURN INTO A LOT OF DATA


  8. Why MongoDB?
    »  Built to scale
    »  JSON everywhere in a JavaScript ecosystem (Node.js)
    »  No ALTER TABLE!!
    »  Ease of use, reduced development time
    »  Lots of nice features: replica sets, JavaScript shell, MMS
    »  Support from community and 10gen

    View Slide

  9. Real User Measurements architecture
    [Diagram labels: INPUT, MongoDB, Magic, OUTPUT]



  11. RUM Web Beacon

    ns_rum = {};
    ns_rum.init = function () {
      var s = document.createElement('script'); s.id = 'rum';
      s.type = 'text/javascript'; s.src = 'https://djzsy4s19uaxq.cloudfront.net/[your ID]/neustar.beacon.js';
      document.getElementsByTagName('head')[0].appendChild(s);
    }
    window.addEventListener ? window.addEventListener('load', ns_rum.init, false) :
    window.attachEvent ? window.attachEvent('onload', ns_rum.init) : false;


  12. Data Flow, step 1: browser to data collector
    https://rum-collector.wpm.neustar.biz/beacon?u=https%3A%2F%2Fmonitor.wpm.neustar.biz%2F&t_done=3505&t_page=1770&r=https%3A%2F%2Frum.wpm.neustar.biz%2F&nt_redirectCount=0&nt_navigationType=0&nt_redirectTime=0&nt_dnsTime=0&nt_connectTime=757&nt_firstPacket=1735&nt_sslTime=203
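
To make the beacon request above easier to read, here is a rough sketch of how a collector might decode it into named fields. It only uses the standard `URL` class; the helper name `parseBeacon` and the rule that `t_` and `nt_` parameters are integers are assumptions for this example, not something the deck states about the real collector.

```javascript
// Hypothetical collector-side helper: turn a beacon URL into a plain object.
// Treating keys that start with "t_" or "nt_" as integers is an assumption
// made for this example.
function parseBeacon(beaconUrl) {
  const params = new URL(beaconUrl).searchParams;
  const out = {};
  for (const [key, value] of params) {
    out[key] = /^(t_|nt_)/.test(key) ? parseInt(value, 10) : value;
  }
  return out;
}

// The beacon URL from the slide, split only for readability.
const beacon = parseBeacon(
  'https://rum-collector.wpm.neustar.biz/beacon' +
  '?u=https%3A%2F%2Fmonitor.wpm.neustar.biz%2F&t_done=3505&t_page=1770' +
  '&r=https%3A%2F%2Frum.wpm.neustar.biz%2F&nt_redirectCount=0' +
  '&nt_navigationType=0&nt_redirectTime=0&nt_dnsTime=0' +
  '&nt_connectTime=757&nt_firstPacket=1735&nt_sslTime=203'
);

console.log(beacon.u, beacon.t_done, beacon.nt_connectTime);
```

Decoded this way, `u` is the measured page, `r` is the referrer-style URL, and the remaining parameters are timings in milliseconds.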


  13. Data Flow, step 2: raw data collection
    "ts" : ISODate("2012-07-10T12:40:00.231Z"),
    "mid" : "1234567890abcdef",
    "ua" : {
      "asn" : "45839 - piradius net",
      "co" : "malaysia",
      "st" : "kuala lumpur",
      "ls" : "high",
      "br" : "firefox 11"
    },
    "u" : "mywebsite.com/example",
    "t_page" : 4957,
    "t_dom" : 3702,
    "t_dns" : 6,
    "_id" : ObjectId("4ffc22a0f0db9a6f590000e5")


  14. Data Flow, step 3: aggregated data collection
    "ts": ISODate("2012-03-05"),
    "mid": "CA91DA1B4B6F44229121FA84795D143E",
    "hours": [
      {
        "hour": 0,
        "cnt": 12,
        "tplt": 46000,
        "apdex": 0.77,
        …,
        "mins": [
          {
            "min": 0,
            "cnt": 3,
            "tplt": 6000,
            "apdex": 0.69,


  15. Data rollup job
    »  JS script run every minute on the primary DB node
    »  Aggregate data, calculate Apdex and min/max, add sample counts to buckets
    »  If day document does not exist, create one padded with 0s
    »  In-place update of the current hour and minute
    »  Document does not grow in size (keep padding factor at 1)
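
The rollup steps above (create a zero-padded day document, then update the current hour and minute in place) could look roughly like the sketch below. It only builds plain objects and does not talk to a database. The function names are hypothetical, `tplt` is assumed to be a running sum of page load times, and the Apdex and min/max fields are left out because they cannot be maintained with a simple increment.

```javascript
// Build a day document with every hour and minute bucket pre-filled with 0s,
// so later in-place updates never change the document's size.
function emptyDayDoc(mid, day) {
  const hours = [];
  for (let h = 0; h < 24; h++) {
    const mins = [];
    for (let m = 0; m < 60; m++) mins.push({ min: m, cnt: 0, tplt: 0 });
    hours.push({ hour: h, cnt: 0, tplt: 0, mins: mins });
  }
  return { ts: day, mid: mid, hours: hours };
}

// Build a MongoDB $inc update that adds one minute's samples to both the
// hour bucket and the minute bucket, addressed with dot notation.
function rollupUpdate(ts, pageLoadTimes) {
  const h = ts.getUTCHours();
  const m = ts.getUTCMinutes();
  const cnt = pageLoadTimes.length;
  const total = pageLoadTimes.reduce((a, b) => a + b, 0);
  const inc = {};
  inc['hours.' + h + '.cnt'] = cnt;
  inc['hours.' + h + '.tplt'] = total;
  inc['hours.' + h + '.mins.' + m + '.cnt'] = cnt;
  inc['hours.' + h + '.mins.' + m + '.tplt'] = total;
  return { $inc: inc };
}

const dayDoc = emptyDayDoc('1234567890abcdef', new Date('2012-07-10T00:00:00Z'));
const update = rollupUpdate(new Date('2012-07-10T12:40:00.231Z'), [4957, 1043]);
console.log(dayDoc.hours.length, JSON.stringify(update));
```

In the mongo shell, an update document like this would be passed to `update()` on the aggregated collection, matched on `mid` and the day's `ts`; the collection name is not given in the slides.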


  16. Keeping data under control
    »  Hourly job to remove old data
    »  Weekly job to compact collections
    »  Monthly job to rotate the MongoDB log files
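
As a rough illustration of the three maintenance jobs, the sketch below builds the query filter and command documents that such jobs could send. `compact` and `logRotate` are existing MongoDB commands, but the helper names, the collection name `raw`, and the 7-day retention window are assumptions chosen for this example.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Filter matching raw documents older than the retention window.
function oldDataFilter(now, retentionDays) {
  return { ts: { $lt: new Date(now.getTime() - retentionDays * DAY_MS) } };
}

// Command document for compacting one collection.
function compactCommand(collectionName) {
  return { compact: collectionName };
}

// Command document for rotating the MongoDB log file.
function logRotateCommand() {
  return { logRotate: 1 };
}

const filter = oldDataFilter(new Date('2012-07-17T00:00:00Z'), 7);
console.log(filter.ts.$lt.toISOString());
console.log(compactCommand('raw'), logRotateCommand());
```

In the mongo shell these would be used as `db.raw.remove(filter)`, `db.runCommand(compactCommand('raw'))`, and `db.adminCommand(logRotateCommand())`.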


  17. Deployment on Amazon EC2
    »  Easy and affordable
    »  Can scale: from experiment to production
    »  Redundancy, security groups, etc.


  18. Deployment on Amazon EC2 (cont.)


  19. Deployment on Amazon EC2 (cont.)


  20. Monitoring with MMS


  21. Monitoring with MMS (cont.)


  22. RUM UI – Time Series
    »  7 days' worth of raw data
    »  Quick drill down to the minute


  23. RUM UI – Time Series (cont.)


  24. Apdex score
    »  Measures user satisfaction
    »  Apdex = (Satisfied + Tolerated/2) / Total
    Level        Time
    Satisfied    <= 2 seconds
    Tolerated    Between 2 and 8 seconds
    Frustrated   Greater than 8 seconds
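
The Apdex formula and thresholds above translate directly into a small function. This is a minimal sketch: the function name `apdex` and the sample load times are made up, and returning `null` for an empty sample set is a choice made here, not something the slide specifies.

```javascript
// Apdex = (Satisfied + Tolerated/2) / Total, with the slide's thresholds:
// satisfied <= 2 s, tolerated up to and including 8 s, frustrated above 8 s.
function apdex(loadTimesMs, satisfiedMs = 2000, toleratedMs = 8000) {
  if (loadTimesMs.length === 0) return null; // no samples, no score
  let satisfied = 0;
  let tolerated = 0;
  for (const t of loadTimesMs) {
    if (t <= satisfiedMs) satisfied++;
    else if (t <= toleratedMs) tolerated++;
    // anything slower counts as frustrated and adds nothing to the score
  }
  return (satisfied + tolerated / 2) / loadTimesMs.length;
}

// Two satisfied, two tolerated, one frustrated: (2 + 2/2) / 5
const score = apdex([1200, 1800, 3505, 4957, 9000]);
console.log(score);
```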


  25. RUM UI – The fun stuff
    Now: with MapReduce
    Soon: with Aggregation Framework


  26. Soon with MongoDB
    »  TTL collections – 2.2 release
    »  Aggregation framework – 2.2 release
    »  Sharding
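
For a feel of what the two 2.2 features might look like against the raw documents shown earlier, the sketch below builds a TTL index specification and an aggregation pipeline as plain objects. The 7-day expiry, the per-country grouping, and the names `ttlIndex` and `avgLoadByCountry` are assumptions for illustration; the deck does not say how the features will actually be used.

```javascript
// TTL index: documents expire a fixed number of seconds after their `ts` date.
const SEVEN_DAYS_S = 7 * 24 * 60 * 60;
const ttlIndex = {
  keys: { ts: 1 },
  options: { expireAfterSeconds: SEVEN_DAYS_S }
};

// Aggregation pipeline: average page load time per country for one monitor,
// slowest countries first. Field names follow the raw document on slide 13.
function avgLoadByCountry(mid) {
  return [
    { $match: { mid: mid } },
    { $group: { _id: '$ua.co', avgPageLoad: { $avg: '$t_page' }, cnt: { $sum: 1 } } },
    { $sort: { avgPageLoad: -1 } }
  ];
}

const pipeline = avgLoadByCountry('1234567890abcdef');
console.log(ttlIndex.options.expireAfterSeconds, JSON.stringify(pipeline));
```

In a 2.2-era mongo shell these would be passed to `ensureIndex(ttlIndex.keys, ttlIndex.options)` and `aggregate(pipeline)` on the raw collection.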


  27. Visit us at wpm.neustar.biz
    Thank you
