$30 off During Our Annual Pro Sale. View Details »
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
A/B Testing Got You Elected Mister President
Search
Penelope Phippen
April 06, 2013
Technology
1
370
A/B Testing Got You Elected Mister President
Penelope Phippen
April 06, 2013
Tweet
Share
More Decks by Penelope Phippen
See All by Penelope Phippen
Introducing Rubyfmt
penelope_zone
0
590
How RSpec Works
penelope_zone
0
6.7k
Quick and easy browser testing using RSpec and Rails 5.1
penelope_zone
1
96
Teaching RSpec to play nice with Rails
penelope_zone
2
160
Little machines that eat strings
penelope_zone
1
110
What is processor (brighton ruby edition)
penelope_zone
0
130
What is processor?
penelope_zone
1
370
extremely defensive coding - rubyconf edition
penelope_zone
0
280
Agile, etc.
penelope_zone
2
240
Other Decks in Technology
See All in Technology
SQLだけでマイグレーションしたい!
makki_d
0
1.2k
「もしもデータ基盤開発で『強くてニューゲーム』ができたなら今の僕はどんなデータ基盤を作っただろう」
aeonpeople
0
230
_第4回__AIxIoTビジネス共創ラボ紹介資料_20251203.pdf
iotcomjpadmin
0
130
半年で、AIゼロ知識から AI中心開発組織の変革担当に至るまで
rfdnxbro
0
140
Authlete で実装する MCP OAuth 認可サーバー #CIMD の実装を添えて
watahani
0
160
シニアソフトウェアエンジニアになるためには
kworkdev
PRO
3
270
ペアーズにおけるAIエージェント 基盤とText to SQLツールの紹介
hisamouna
2
1.6k
AI駆動開発の実践とその未来
eltociear
1
480
Next.js 16の新機能 Cache Components について
sutetotanuki
0
170
20251218_AIを活用した開発生産性向上の全社的な取り組みの進め方について / How to proceed with company-wide initiatives to improve development productivity using AI
yayoi_dd
0
650
「図面」から「法則」へ 〜メタ視点で読み解く現代のソフトウェアアーキテクチャ〜
scova0731
0
490
『君の名は』と聞く君の名は。 / Your name, you who asks for mine.
nttcom
1
110
Featured
See All Featured
A Modern Web Designer's Workflow
chriscoyier
698
190k
Lightning Talk: Beautiful Slides for Beginners
inesmontani
PRO
1
400
個人開発の失敗を避けるイケてる考え方 / tips for indie hackers
panda_program
122
21k
How to Align SEO within the Product Triangle To Get Buy-In & Support - #RIMC
aleyda
1
1.3k
Paper Plane (Part 1)
katiecoart
PRO
0
1.9k
GraphQLとの向き合い方2022年版
quramy
50
14k
Put a Button on it: Removing Barriers to Going Fast.
kastner
60
4.1k
The Power of CSS Pseudo Elements
geoffreycrofte
80
6.1k
Speed Design
sergeychernyshev
33
1.4k
ラッコキーワード サービス紹介資料
rakko
0
1.8M
How to Get Subject Matter Experts Bought In and Actively Contributing to SEO & PR Initiatives.
livdayseo
0
29
Hiding What from Whom? A Critical Review of the History of Programming languages for Music
tomoyanonymous
0
300
Transcript
A/B Testing Got you elected Mister President
@samphippen @samphippen
Should I make this change?
Users A group: 50% B group: 50% Site change Old
site
Measure some metric
Do maths on the two groups
???
Profit
Lemme show you my favourite A/B test
None
None
None
None
None
None
Also some videos
None
+$60 million
None
Protips
Same user always sees same version
Caching
Roughly same performance
Also for feature flagging
A super lightning fast guide on how to do it
and what it looks like
gem 'split'
require 'split/dashboard' run Rack::URLMap.new \ "/" => YourApp::Application, "/split" =>
Split::Dashboard.new
<% ab_test("experiment_name", "a", "b") do |c| %> <a href="/win" class="btn
<%= c %>"> Get points? </a> <% end %>
What it looks like
None
None
None
https://github.com/ andrew/split
How to interpret the results
Stats time
Confidence Value
P =0.95 is used in medical trials
Common mistake: Assumption of normality
None
This will probably work for you
How to design the experiment
Step 1: clearly state your hypothesis
Example: I will get more donations if our button is
jimmy wale’s face
Formally: Null Hypothesis: there will be no increase in donations
if we use jimmy wales face
Formally: positive Hypothesis: there will be an increase in donations
if we use jimmy wales face
Step 2: Pick a statistical test
Example: difference of proportions (the standard A/b test)
http://stattrek.com/ hypothesis-test/ difference-in- proportions.aspx
Step 3: Decide an experiment length (number of days)
Example: we get 200 hits a day, let’s test for
15 days for 3000 hits
Alternatively: A fixed sample size Stop after 10000 users
Step 4: Split
Half the users get jimmy wales face half the users
get whatever the button was before
Step 5: inspect results and analyse
Let’s talk about analysis
Let’s work two examples (one null, one positive)
With jimmy Without Jimmy Users in test 100 100 Users
that clicked 27 18
Confidence = 93.6% Too low at 95% to conclude that
this is better
common mistake: Sample size
With jimmy Without Jimmy Users in test 1000 1000 Users
that clicked 270 180
99.9% confidence High enough for us to declare this better
Confounding factors ARE bad
this is hard stuff I hope you understood :) ask
me questions @samphippen