A/B Testing Got You Elected Mister President

Penelope Phippen

April 06, 2013

Transcript

  1. A/B Testing
    Got you elected
    Mister President

  2. @samphippen

  3. Should I make
    this change?

  4. Users
    A group (50%): site change
    B group (50%): old site

  5. Measure some metric

  6. Do maths on
    the two
    groups

  7. Lemme show you my
    favourite A/B test

  8. Also some videos

  9. +$60 million

  10. Same user always sees
    same version
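
One way to get this "sticky" behaviour is to derive the variant from a hash of the user's id, so no per-user state needs storing. This is only a minimal sketch and is not taken from the split gem; `variant_for` and its arguments are hypothetical names chosen for illustration.

```ruby
require 'digest'

# Hypothetical helper (not part of the split gem): map a user to a variant
# by hashing the experiment name and user id, so the answer never changes.
def variant_for(experiment, user_id, variants = %w[a b])
  digest = Digest::MD5.hexdigest("#{experiment}:#{user_id}")
  # Read the first 8 hex digits as an integer and use it to pick a bucket.
  variants[digest[0, 8].to_i(16) % variants.size]
end

first  = variant_for("experiment_name", 42)
second = variant_for("experiment_name", 42)
puts first == second # the same user gets the same variant on every call
```

Including the experiment name in the hashed string means a user who lands in group A for one experiment is not automatically in group A for every other experiment.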

  11. Roughly same
    performance

  12. Also for feature flagging

  13. A super lightning fast
    guide on how to do it and
    what it looks like

  14. # config.ru
    require 'split/dashboard'

    # Mount the app at "/" and the Split dashboard at "/split".
    run Rack::URLMap.new(
      "/" => YourApp::Application,
      "/split" => Split::Dashboard.new
    )

  15. <% ab_test("experiment_name", "a", "b") do |c| %>
    Get points?
    <% end %>

  16. What it looks like

  17. https://github.com/andrew/split

  18. How to interpret the
    results

  19. Confidence
    Value

  20. 95% confidence (p < 0.05)
    is used in medical
    trials

  21. Common mistake:
    Assumption of normality
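
One reading of this slide is that the usual test relies on a normal approximation, which breaks down when a group has very few clicks or very few non-clicks. A rough sketch of a sanity check follows; the threshold of 10 is a common rule of thumb and an assumption here, not something stated in the deck, and `normal_approximation_ok?` is a hypothetical helper.

```ruby
# Rule-of-thumb check (an assumption, not from the slides): treat the
# normal approximation as reasonable only when a group has at least
# `minimum` users who clicked and at least `minimum` who did not.
def normal_approximation_ok?(clicks, users, minimum = 10)
  clicks >= minimum && (users - clicks) >= minimum
end

puts normal_approximation_ok?(27, 100) # => true
puts normal_approximation_ok?(3, 100)  # => false
```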

  22. This will probably work
    for you

  23. How to design the
    experiment

  24. Step 1: clearly
    state your
    hypothesis

  25. Example:
    I will get more donations
    if our button is Jimmy
    Wales's face

  26. Formally:
    Null hypothesis: there
    will be no increase in
    donations if we use
    Jimmy Wales's face

  27. Formally:
    Positive (alternative) hypothesis: there
    will be an increase in
    donations if we use
    Jimmy Wales's face

  28. Step 2: Pick a statistical
    test

  29. Example: difference of
    proportions (the
    standard A/B test)

  30. http://stattrek.com/hypothesis-test/difference-in-proportions.aspx
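
For reference, a common form of the difference-of-proportions test is the pooled two-proportion z-test, written out here. The symbols are notation introduced for this sketch, not taken from the slides: $x_A, x_B$ are the number of users who clicked in each group, $n_A, n_B$ are the group sizes, and $\Phi$ is the standard normal CDF.

```latex
\hat{p}_A = \frac{x_A}{n_A}, \qquad
\hat{p}_B = \frac{x_B}{n_B}, \qquad
\hat{p} = \frac{x_A + x_B}{n_A + n_B}

z = \frac{\hat{p}_A - \hat{p}_B}
         {\sqrt{\hat{p}\,(1 - \hat{p})\left(\frac{1}{n_A} + \frac{1}{n_B}\right)}}

\text{confidence (one-sided)} = \Phi(z)
```

The one-sided form matches a hypothesis of the kind "A is better than B"; a two-sided test would instead ask whether the two groups differ in either direction.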

  31. Step 3: Decide an
    experiment length
    (number of days)

  32. Example: we get 200 hits
    a day, let’s test for 15
    days for 3000 hits

  33. Alternatively: A fixed
    sample size
    Stop after 10000 users
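
The two stopping rules are related by simple arithmetic, sketched below. `days_needed` is a hypothetical helper, and the sketch assumes traffic stays flat at the example's 200 hits a day and that each hit counts as one user.

```ruby
# Days needed to reach a target sample size at a given daily traffic level.
# Rounds up, because a partial day still has to be run in full.
def days_needed(target_users, hits_per_day)
  (target_users.to_f / hits_per_day).ceil
end

puts 200 * 15                 # hits collected in a 15-day test => 3000
puts days_needed(10_000, 200) # days to reach 10000 users => 50
```

Whichever rule is used, the point of fixing it in advance is that the test stops at the planned moment rather than when the numbers happen to look good.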

  34. Step 4: Split

  35. Half the users get Jimmy
    Wales's face;
    half the users get
    whatever the button
    was before

  36. Step 5: inspect results
    and analyse

  37. Let’s talk about analysis

  38. Let’s work two examples
    (one null, one positive)

  39.                       With Jimmy    Without Jimmy
    Users in test           100           100
    Users that clicked      27            18

  40. Confidence = 93.6%
    Below the 95% threshold,
    so we can't conclude
    that this is better

  41. Common mistake:
    Sample size

  42.                       With Jimmy    Without Jimmy
    Users in test           1000          1000
    Users that clicked      270           180

  43. 99.9% confidence
    High enough for us to
    declare this better
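
Both worked examples can be reproduced with a few lines of Ruby. This is a sketch under assumptions: it uses a one-sided, pooled two-proportion z-test with a normal approximation, which the slides do not spell out but which gives the 93.6% figure for the first example. `confidence` is a hypothetical helper name.

```ruby
# One-sided confidence that variant A converts better than variant B,
# using the pooled two-proportion z-test (normal approximation).
def confidence(clicks_a, users_a, clicks_b, users_b)
  p_a = clicks_a.to_f / users_a
  p_b = clicks_b.to_f / users_b
  pooled = (clicks_a + clicks_b).to_f / (users_a + users_b)
  standard_error = Math.sqrt(pooled * (1 - pooled) * (1.0 / users_a + 1.0 / users_b))
  z = (p_a - p_b) / standard_error
  # Standard normal CDF expressed through the error function.
  0.5 * (1 + Math.erf(z / Math.sqrt(2)))
end

small = confidence(27, 100, 18, 100)
large = confidence(270, 1000, 180, 1000)

puts format("%.1f%%", small * 100) # prints 93.6%
puts large > 0.999                 # prints true
```

The click-through rates are identical in both examples (27% against 18%); only the number of users changes, which is why sample size alone moves the result from inconclusive to conclusive.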

  44. Confounding
    factors ARE
    bad

  45. this is hard stuff
    I hope you understood :)
    ask me questions
    @samphippen
