Science! Science Harder!

How we reinvented ourselves to be data literate, experiment-driven, and ship faster.

Presented by Laura Thomson and Rebecca Weiss at Monitorama 2017

Laura Thomson

May 23, 2017

Transcript

  1. Science! Science Harder!
    How we reinvented ourselves to be
    data literate, experiment-driven, and ship faster
    Rebecca Weiss & Laura Thomson (@lxt)
    {rweiss, laura}@mozilla.com

  2. Science! Science Harder!
    How we are reinventing ourselves to be
    data literate, experiment-driven, and ship faster
    Rebecca Weiss & Laura Thomson (@lxt)
    {rweiss, laura}@mozilla.com

  3. In the beginning...

  4. Data is bad.
    It’s bad. And you’re bad for wanting it.
    You carry only 3 things as you enter the wilderness:
    1. Downloads
    2. Blocklist pings
    3. Crash data (kept in a monastery, guarded by ninjas)

  6. Decision-making,
    without evidence
    (also bad)

  7. ● Did this code change affect retention?
    ● Are we regressing on Firefox performance?
    ● Is crashiness getting worse or better?
    ● How often do our users experience janky behavior?
    ● How many active users do we have, anyway?

  8. Browser = app?
    Not quite.

  9. Browser = service?
    Sort of.

  10. =

  11. Failure to abstract
    Build system to answer question.
    Answer the question! Yay!
    Ask more questions. Fail.
    Build new measurement system.

  12. Measuring by accident, reluctantly
    ● Blocklist ping
      ○ Disables badness (default on, for your protection)
    ● Telemetry (v1)
      ○ Aggregate performance measurement (default off)
    ● Firefox Health Report
      ○ Very limited individual-level longitudinal measurement (default on)
      ○ Extended individual-level longitudinal measurement (default off)
    ● Crash stats
      ○ Crash details (default off)

  15. Need one data collection
    system, not four*
    *there are more than four. perhaps a lot more.

  16. Also, privacy

  18. Meanwhile,
    in the real
    world...
    From http://flickr.com/photo/44124323641@N01/246805948, CC-Generic 2.0 with attribution

  21. Enabling a data culture
    Based on scientific principles
    Embrace data consistency and reliability over precision
    Prioritize self-service over personalized tools
    Build a badass experimental platform ASAP

  22. A single data
    collection system

  23. Unified Telemetry
    One infrastructure, many types of probes and pings
    Many types of events to measure
    One good system vs. many half-baked ones
    Unify mental models
    One set of infrastructure quirks
    One set of errors
    By Mike Wutzler AKA Darth Mike (Own work) [GFDL (http://www.gnu.org/copyleft/fdl.html), CC-BY-SA-3.0 (http://creativecommons.org/licenses/by-sa/3.0/) or FAL], via Wikimedia Commons

  24. Choose self-service
    services

  25. Before

  26. Now
    ● re:dash: automatic dashboarding for simple requests
    ● Airflow: monitoring and job scheduling for dataset creation
    ● Spark and Parquet: standardized analytical model and data format
    ● Presto: SQL access to many data sets
    ● And more...

  27. Reproducible knowledge generation
    Experimented
    Tabulated
    Version controlled
    Productionized
    Democratized

  28. Transparency = [citation needed]
    Results must be reproducible as a URL
    Audit a number all the way down to the code
    Enforce transparent model of work
    Open science, open methodology
    https://xkcd.com/285/

  29. Success smells like
    https://sql.telemetry.mozilla.org/dashboard/re-dash-health
    623 total users so far

  30. Public reports you can use
    https://metrics.mozilla.com/firefox-hardware-report/

  31. Experimenting with
    experiments

  34. Previously:
    Funnelcakes
    Telemetry experiments
    Test Pilot (Mark I)
    Weird science
    By The original uploader was Lorax at English Wikipedia (Taken by user Lorax and released under the GNU FDL) [GFDL (http://www.gnu.
    copyleft/fdl.html) or CC-BY-SA-3.0 (http://creativecommons.org/licenses/by-sa/3.0/)], via Wikimedia Commons

  35. These kind of sucked:
    High barriers to entry
    Inconsistent measurement approaches
    Not production-y
    Science needs to be open

  36. System add-ons
    Part of core Firefox, modularized into add-ons
    Build/test against existing Firefox builds
    Update via xpi, independent of Fx
    Update up to daily (for now) on any release channel
    (Not really an experimental mechanism)
    Photo by Francesco Lodolo, l10n team.

  37. Test Pilot Mark II
    Users opt in to install and try whole new features with extended measurement
    Successful features graduate to Firefox
    - e.g. Firefox Screenshots
    More: https://testpilot.firefox.com/

  38. SHIELD Studies
    Real sampling!
    Flip a preference, install a new feature, collect data
    Multiple experiments in flight at once
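
The SHIELD Studies slide mentions real sampling and several experiments running at once. One common way to get both is deterministic hash-based bucketing, sketched below. This is only an illustration under stated assumptions, not how SHIELD itself is implemented: the function names, the bucket count, and the use of SHA-256 are all hypothetical choices made for the example.

```python
import hashlib

# Hypothetical bucket count; finer buckets allow finer sampling rates.
BUCKETS = 10_000


def bucket(client_id: str, salt: str) -> int:
    """Map a (salt, client) pair to a stable bucket in [0, BUCKETS)."""
    digest = hashlib.sha256(f"{salt}:{client_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % BUCKETS


def is_enrolled(client_id: str, experiment: str, percent: float) -> bool:
    """Enroll roughly `percent` percent of clients in `experiment`.

    Salting with the experiment name gives each experiment its own
    independent sample, so several can be in flight at once.
    """
    return bucket(client_id, experiment) < percent * BUCKETS / 100


def assign_branch(client_id: str, experiment: str, branches: list) -> str:
    """Pick a branch (e.g. control/treatment) with a separate salt."""
    return branches[bucket(client_id, experiment + ":branch") % len(branches)]


if __name__ == "__main__":
    client = "client-0001"  # made-up client identifier
    for name in ("new-tab-layout", "pref-flip-demo"):  # made-up experiment names
        if is_enrolled(client, name, percent=10):
            print(name, assign_branch(client, name, ["control", "treatment"]))
        else:
            print(name, "not enrolled")
```

Because the assignment depends only on the client identifier and the experiment name, a client gets the same answer on every run without any stored state, and enrollment in one experiment says nothing about enrollment in another.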

  39. Run Firefox Run

  41. Data will shape the Web.
    What kind of Web do you want?
    laura@mozilla.com / @lxt
    rweiss@mozilla.com / @rweiss
