Mike Howsden - Zen of Quality - How PBS measures QoS for digital viewers

It is extremely important to PBS that digital viewers have an awesome experience when viewing online videos. In this talk, we explain how PBS built a system to collect, analyze, and measure who's getting a good experience -- and who's not.



PyCon 2015

April 18, 2015


  1. "Making plans is often the occupation of an opulent and boastful mind, which thus obtains the reputation of creative genius by demanding what it cannot itself supply, by censuring what it cannot improve, and by proposing what it knows not where to find." Immanuel Kant - Prolegomena (1783)
  2. Zen of Quality: How PBS measures Quality of Service for digital viewers. Mike Howsden, PyCon 2015
  3. Who am I? *Thanks for the terrible colors on that Venn diagram, Wikipedia
  4. PBS Digital
    • We do web (pbs.org/pbskids.org), OTT (Roku, FireTV, Chromecast, AppleTV, etc.), and mobile application development for PBS
    • We serve videos and lists of videos to as many platforms as possible
    • Disclaimer: PBS does not endorse any product or service mentioned, and the opinions expressed here are my own
  5. Why do we need this?
    • PBS runs its own video streaming system
    • It’s hard to know what is at fault when a user has a bad streaming experience
    • We need insight into user experiences around the country, end-user bandwidth, and ISP service degradation in relation to our CDNs (net neutrality?)
    • Complaints are inherently anecdotal and generally don’t come with enough data to make a good diagnosis
  6. Why do we need this?
    • We care about our users
    • We needed better answers for VIPs
    • We needed better answers for ourselves
  7. Build vs Buy
    • Vendors providing this type of service are very expensive
    • Integrations have a non-trivial development cost
    • We want an independent way to evaluate our vendors’ players
    • No good open source solutions (yet?)
  8. None
  9. Agora Overview
    • This framework attempts to follow best practices, but is specifically tuned to video QoS
    • Similar frameworks exist for other business applications
    • The goal is to be modular and clearly define interfaces
    • This is one of the projects at PBS Digital that we consider “big data”
    • 70 million events -> 7 million stream profiles, daily*
    * I just picked a day and did very rough math
  10. What’s the Challenge?
    • For each video view, a player fires events like MediaStarted, BufferingStart/Stop, MediaPaused, etc.
    • These events need to be collected to represent a stream
    • They then need to be interpreted to create a stream profile
    • These profiles need to be analyzed and summarized to answer our business questions
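The collect-then-interpret step described on slide 10 can be illustrated with a minimal Python sketch. This is not the agora-proc implementation: the event field names (`stream_id`, `type`, `ts`), the profile keys, and the function name `build_profiles` are assumptions made for illustration. Only the event names MediaStarted, BufferingStart, BufferingStop, and MediaPaused come from the slide.

```python
from collections import defaultdict


def build_profiles(events):
    """Group player events by stream, then reduce each group to a profile.

    Each event is assumed (hypothetically) to be a dict with the keys
    'stream_id', 'type', and 'ts' (a timestamp in seconds).
    """
    # Step 1: collect the events that belong to the same stream.
    by_stream = defaultdict(list)
    for event in events:
        by_stream[event["stream_id"]].append(event)

    # Step 2: interpret each stream's events, in time order, as a profile.
    profiles = {}
    for stream_id, stream_events in by_stream.items():
        stream_events.sort(key=lambda e: e["ts"])
        profile = {
            "started": False,
            "pauses": 0,
            "buffering_events": 0,
            "buffering_seconds": 0.0,
        }
        buffering_since = None
        for event in stream_events:
            kind = event["type"]
            if kind == "MediaStarted":
                profile["started"] = True
            elif kind == "MediaPaused":
                profile["pauses"] += 1
            elif kind == "BufferingStart":
                buffering_since = event["ts"]
            elif kind == "BufferingStop" and buffering_since is not None:
                # Count only buffering periods that have both a start and a stop.
                profile["buffering_events"] += 1
                profile["buffering_seconds"] += event["ts"] - buffering_since
                buffering_since = None
        profiles[stream_id] = profile
    return profiles


# Made-up sample events for two interleaved streams.
sample_events = [
    {"stream_id": "a", "type": "MediaStarted", "ts": 0.0},
    {"stream_id": "a", "type": "BufferingStart", "ts": 5.0},
    {"stream_id": "b", "type": "MediaStarted", "ts": 1.0},
    {"stream_id": "a", "type": "BufferingStop", "ts": 7.5},
    {"stream_id": "a", "type": "MediaPaused", "ts": 20.0},
]
sample_profiles = build_profiles(sample_events)
```

The grouping step corresponds to a map/shuffle phase keyed on the stream, and the per-stream loop to a reduce phase, which is roughly the shape a batch job would take. Summaries for business questions would then be computed over the resulting profiles.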
  11. [Architecture diagram; recoverable labels: Goonhilly, S3 (logs.tgz), EMR, S3 (streams.json); step numbers 1 to 6]
  12. [Architecture diagram; recoverable labels: S3 (.json), batch INSERT, RDS, Summary Table/Tableau Extract; step numbers 1 to 6]

    /* PostgreSQL table partitioning: a trigger on the master table creates child tables on INSERT */
    CREATE TABLE ga_streams_YYYY-MM-DD ( ) INHERITS (ga_streams_master);
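As a rough illustration of the per-day child tables on slide 12, the following Python sketch builds the `CREATE TABLE ... INHERITS` statement for a given date. On the slide this work is done by a trigger on the master table; the helper names `quote_ident` and `child_table_ddl` are hypothetical, and generating the statement client-side is an assumption made only to show the naming scheme. One detail worth noting: a table name containing hyphens, such as `ga_streams_2015-04-18`, has to be double-quoted to be a valid PostgreSQL identifier.

```python
from datetime import date


def quote_ident(name):
    """Double-quote a PostgreSQL identifier, escaping embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def child_table_ddl(day, master="ga_streams_master"):
    """Return DDL for one day's child table inheriting from the master table.

    The 'ga_streams_' prefix and the master table name come from the slide;
    everything else here is an illustrative assumption.
    """
    child = "ga_streams_" + day.isoformat()  # e.g. ga_streams_2015-04-18
    return "CREATE TABLE IF NOT EXISTS {} () INHERITS ({});".format(
        quote_ident(child), quote_ident(master)
    )


example_ddl = child_table_ddl(date(2015, 4, 18))
```

A production setup would normally also add a CHECK constraint on the partitioning column to each child table so the planner can skip irrelevant partitions. It is omitted here because the slide does not show the table's columns.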
  13. Future enhancements
    • Real-time vs. batch
    • Spark? Storm?
    • Dashboard-style live visualizations for operational monitoring
    • Alerts and automated decisioning
    • Continued flexibility for changing business needs
  14. Code/Resources
    • http://github.com/pbs/goonhilly - simple event logging service
    • https://github.com/Yelp/mrjob - Python library we used for agora-proc
    • http://github.com/pbs/agora-proc - batch analyzer that uses an Amazon EMR cluster to reduce video events into defined characteristics of a particular stream (this repo also documents the player events)
    • http://www.postgresql.org/docs/9.4/static/ddl-partitioning.html - for more on PostgreSQL table partitioning
  15. Questions? Thanks for your time.