Slide 1

Slide 1 text

Science! Science Harder!
How we reinvented ourselves to be data literate, experiment-driven, and ship faster
Rebecca Weiss & Laura Thomson (@lxt)
{rweiss, laura}@mozilla.com

Slide 2

Slide 2 text

Science! Science Harder!
How we are reinventing ourselves to be data literate, experiment-driven, and ship faster
Rebecca Weiss & Laura Thomson (@lxt)
{rweiss, laura}@mozilla.com

Slide 3

Slide 3 text

In the beginning...

Slide 4

Slide 4 text

Data is bad. It’s bad. And you’re bad for wanting it.
You carry only 3 things as you enter the wilderness:
1. Downloads
2. Blocklist pings
3. Crash data (kept in a monastery, guarded by ninjas)

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

Decision-making, without evidence (also bad)

Slide 7

Slide 7 text

• Did this code change affect retention?
• Are we regressing on Firefox performance?
• Is crashiness getting worse or better?
• How often do our users experience janky behavior?
• How many active users do we have, anyway?

Slide 8

Slide 8 text

Browser = app? Not quite.

Slide 9

Slide 9 text

Browser = service? Sort of.

Slide 10

Slide 10 text

=

Slide 11

Slide 11 text

Failure to abstract
Build system to answer question.
Answer the question! Yay!
Ask more questions.
Fail.
Build new measurement system.

Slide 12

Slide 12 text

Measuring by accident, reluctantly
• Blocklist ping
  ○ Disables badness (default on, for your protection)
• Telemetry (v1)
  ○ Aggregate performance measurement (default off)
• Firefox Health Report
  ○ Very limited individual-level longitudinal measurement (default on)
  ○ Extended individual-level longitudinal measurement (default off)
• Crash stats
  ○ Crash details (default off)

Slide 13

Slide 13 text

No content

Slide 14

Slide 14 text

No content

Slide 15

Slide 15 text

Need one data collection system, not four*
*There are more than four. Perhaps a lot more.

Slide 16

Slide 16 text

Also, privacy

Slide 17

Slide 17 text

No content

Slide 18

Slide 18 text

Meanwhile, in the real world...
From http://flickr.com/photo/44124323641@N01/246805948, CC-Generic 2.0 with attribution

Slide 19

Slide 19 text

No content

Slide 20

Slide 20 text

No content

Slide 21

Slide 21 text

Enabling a data culture
Based on scientific principles
Embrace data consistency and reliability over precision
Prioritize self-service over personalized tools
Build a badass experimental platform ASAP

Slide 22

Slide 22 text

A single data collection system

Slide 23

Slide 23 text

Unified Telemetry
One infrastructure, many types of probes and pings
Many types of events to measure
One good system vs many half-baked
Unify mental models
One set of infrastructure quirks
One set of errors
By Mike Wutzler AKA Darth Mike (Own work) [GFDL (http://www.gnu.org/copyleft/fdl.html), CC-BY-SA-3.0 (http://creativecommons.org/licenses/by-sa/3.0/) or FAL], via Wikimedia Commons
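One way to picture "one infrastructure, many types of pings" is a single envelope that every measurement shares, with only the payload varying by ping type. The sketch below is illustrative only: the `Ping` class, its field names, and the `serialize` helper are assumptions made for this example and are not the actual Firefox ping schema.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict


@dataclass
class Ping:
    """Hypothetical common envelope shared by every ping type."""
    ping_type: str            # e.g. "crash" or "performance" (illustrative names)
    client_id: str
    payload: Dict[str, Any]   # the only part that differs between ping types
    ping_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    created: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def serialize(ping: Ping) -> str:
    """Validate and serialize any ping through the same code path."""
    if not ping.ping_type:
        raise ValueError("ping_type is required")
    if not isinstance(ping.payload, dict):
        raise ValueError("payload must be a dict")
    return json.dumps(asdict(ping), sort_keys=True)


# Two different kinds of measurement, one envelope, one pipeline.
crash = Ping("crash", "client-123", {"signature": "example"})
perf = Ping("performance", "client-123", {"startup_ms": 1200})
```

Because both pings pass through `serialize`, they share one set of validation rules and one set of failure modes, which is the point the slide makes about quirks and errors.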

Slide 24

Slide 24 text

Choose self-service services

Slide 25

Slide 25 text

Before

Slide 26

Slide 26 text

Now
• re:dash: automatic dashboarding for simple requests
• Airflow: monitoring and job scheduling for dataset creation
• Spark and Parquet: standardized analytical model and data format
• Presto: SQL access to many data sets
• And more...

Slide 27

Slide 27 text

Reproducible knowledge generation
Experimented
Tabulated
Version controlled
Productionized
Democratized

Slide 28

Slide 28 text

Transparency = [citation needed]
Results must be reproducible as a URL
Audit a number all the way down to the code
Enforce transparent model of work
Open science, open methodology
https://xkcd.com/285/

Slide 29

Slide 29 text

Success smells like
https://sql.telemetry.mozilla.org/dashboard/re-dash-health
623 total users so far

Slide 30

Slide 30 text

Public reports you can use
https://metrics.mozilla.com/firefox-hardware-report/

Slide 31

Slide 31 text

Experimenting with experiments

Slide 32

Slide 32 text

No content

Slide 33

Slide 33 text

No content

Slide 34

Slide 34 text

Weird science
Previously:
Funnelcakes
Telemetry experiments
Test Pilot (Mark I)
By The original uploader was Lorax at English Wikipedia (Taken by user Lorax and released under the GNU FDL) [GFDL (http://www.gnu.org/copyleft/fdl.html) or CC-BY-SA-3.0 (http://creativecommons.org/licenses/by-sa/3.0/)], via Wikimedia Commons

Slide 35

Slide 35 text

These kind of sucked:
High barriers to entry
Inconsistent measurement approaches
Not production-y
Science needs to be open

Slide 36

Slide 36 text

System add-ons
Part of core Firefox, modularized into add-ons
Build/test against existing Firefox builds
Update via xpi, independent of Fx
Update up to daily (for now) on any release channel
(Not really an experimental mechanism)
Photo by Francesco Lodolo, l10n team.

Slide 37

Slide 37 text

Mark II Test Pilot
Users opt in to install and try whole new features with extended measurement
Successful features graduate to Firefox - e.g. Firefox Screenshots
More: https://testpilot.firefox.com/

Slide 38

Slide 38 text

SHIELD Studies
Real sampling!
Flip a preference, install a new feature, collect data
Multiple experiments in flight at once
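"Real sampling" with several experiments in flight at once is commonly done by hashing a stable client identifier together with the study name, so that enrollment is deterministic per client but independent across studies. The following is a minimal sketch of that general technique, not a description of how SHIELD itself enrolls users; the function names, the bucket count, and the study and branch names are all assumptions made for illustration.

```python
import hashlib
from typing import List, Optional

TOTAL_BUCKETS = 10_000  # assumed granularity: 0.01% steps


def stable_bucket(client_id: str, salt: str, buckets: int) -> int:
    """Map a client to a bucket in [0, buckets) that is stable for a given salt."""
    digest = hashlib.sha256(f"{salt}:{client_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % buckets


def assign_branch(
    client_id: str, study: str, sample_rate: float, branches: List[str]
) -> Optional[str]:
    """Return the client's branch for this study, or None if not sampled."""
    # Enrollment: salting with the study name keeps samples for
    # different studies independent of each other.
    bucket = stable_bucket(client_id, study, TOTAL_BUCKETS)
    if bucket >= int(sample_rate * TOTAL_BUCKETS):
        return None
    # Branch choice uses a second salt so it does not depend on the
    # enrollment bucket.
    index = stable_bucket(client_id, f"{study}:branch", len(branches))
    return branches[index]


# Hypothetical study: enroll roughly 10% of clients, split across two branches.
branch = assign_branch("client-123", "example-pref-flip", 0.10, ["control", "treatment"])
```

The same client always gets the same answer for a given study, so no enrollment state has to be stored before the preference is flipped, and the sampled population can be recomputed later when auditing results.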

Slide 39

Slide 39 text

Run Firefox Run

Slide 40

Slide 40 text

No content

Slide 41

Slide 41 text

Data will shape the Web.
What kind of Web do you want?
laura@mozilla.com / @lxt
rweiss@mozilla.com / @rweiss