Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
A/B Testing Got You Elected Mister President
Search
Penelope Phippen
April 06, 2013
Technology
370
1
Share
Embed
Copy iframe code
Copy JS code
Copy link
Start on current slide
A/B Testing Got You Elected Mister President
Penelope Phippen
April 06, 2013
More Decks by Penelope Phippen
See All by Penelope Phippen
Introducing Rubyfmt
penelope_zone
0
610
How RSpec Works
penelope_zone
0
6.8k
Quick and easy browser testing using RSpec and Rails 5.1
penelope_zone
1
110
Teaching RSpec to play nice with Rails
penelope_zone
2
170
Little machines that eat strings
penelope_zone
1
130
What is processor (brighton ruby edition)
penelope_zone
0
140
What is processor?
penelope_zone
1
380
extremely defensive coding - rubyconf edition
penelope_zone
0
290
Agile, etc.
penelope_zone
2
260
Other Decks in Technology
See All in Technology
Taking back control of your AI development
inesmontani
PRO
0
110
「コーディング」しない人のための Claude Code 入門 ChatGPT の次の一歩 — 業務に組み込む 育成・共有・自動化
rfdnxbro
2
1.3k
実装は速くなった、レビューはどうする? ― 自身のレビューをAIで再現させるサーヴァントエンジニアリングのすゝめ / Implementation got faster. So what about reviews? — An invitation to Servant Engineering: Recreating your own code reviews with AI
nrslib
8
4.4k
あなたの AI ワークスペースに、 専門コーダーを連れてくる - Amazon Quick Desktop 最新情報
kawaji_scratch
1
120
「エンジニア進化論」2028年の開発完全自動化、エンジニアはどう進化するか
cyberagentdevelopers
PRO
3
1.6k
SIer20年! 培ったスキルがスタートアップで輝く時
shucho0103
0
800
自律型AIエージェントは何を破壊するのか
kojira
0
130
OCI Oracle AI Database Services新機能アップデート(2026/03-2026/05)
oracle4engineer
PRO
0
330
On-behalf-of Token exchange with AgentCore Identity
hironobuiga
2
120
AIソロプレナー時代に2ヶ月で20人増員した事業創造会社の開発組織の話
miyatakoji
0
470
LLMにもCAP定理があるという話
harukasakihara
0
280
機械学習を「社会実装」するということ 2026年夏版 / Social Implementation of Machine Learning June 2026 Version
moepy_stats
3
760
Featured
See All Featured
Side Projects
sachag
455
43k
How Software Deployment tools have changed in the past 20 years
geshan
0
34k
We Are The Robots
honzajavorek
0
240
Digital Ethics as a Driver of Design Innovation
axbom
PRO
1
310
CoffeeScript is Beautiful & I Never Want to Write Plain JavaScript Again
sstephenson
162
16k
DBのスキルで生き残る技術 - AI時代におけるテーブル設計の勘所
soudai
PRO
65
55k
The Pragmatic Product Professional
lauravandoore
37
7.3k
The Organizational Zoo: Understanding Human Behavior Agility Through Metaphoric Constructive Conversations (based on the works of Arthur Shelley, Ph.D)
kimpetersen
PRO
0
360
A brief & incomplete history of UX Design for the World Wide Web: 1989–2019
jct
2
390
brightonSEO & MeasureFest 2025 - Christian Goodrich - Winning strategies for Black Friday CRO & PPC
cargoodrich
3
720
Information Architects: The Missing Link in Design Systems
soysaucechin
0
960
The Spectacular Lies of Maps
axbom
PRO
1
790
Transcript
A/B Testing Got you elected Mister President
@samphippen @samphippen
Should I make this change?
Users A group: 50% B group: 50% Site change Old
site
Measure some metric
Do maths on the two groups
???
Profit
Lemme show you my favourite A/B test
None
None
None
None
None
None
Also some videos
None
+$60 million
None
Protips
Same user always sees same version
Caching
Roughly same performance
Also for feature flagging
A super lightning fast guide on how to do it
and what it looks like
gem 'split'
require 'split/dashboard' run Rack::URLMap.new \ "/" => YourApp::Application, "/split" =>
Split::Dashboard.new
<% ab_test("experiment_name", "a", "b") do |c| %> <a href="/win" class="btn
<%= c %>"> Get points? </a> <% end %>
What it looks like
None
None
None
https://github.com/ andrew/split
How to interpret the results
Stats time
Confidence Value
P =0.95 is used in medical trials
Common mistake: Assumption of normality
None
This will probably work for you
How to design the experiment
Step 1: clearly state your hypothesis
Example: I will get more donations if our button is
jimmy wale’s face
Formally: Null Hypothesis: there will be no increase in donations
if we use jimmy wales face
Formally: positive Hypothesis: there will be an increase in donations
if we use jimmy wales face
Step 2: Pick a statistical test
Example: difference of proportions (the standard A/b test)
http://stattrek.com/ hypothesis-test/ difference-in- proportions.aspx
Step 3: Decide an experiment length (number of days)
Example: we get 200 hits a day, let’s test for
15 days for 3000 hits
Alternatively: A fixed sample size Stop after 10000 users
Step 4: Split
Half the users get jimmy wales face half the users
get whatever the button was before
Step 5: inspect results and analyse
Let’s talk about analysis
Let’s work two examples (one null, one positive)
With jimmy Without Jimmy Users in test 100 100 Users
that clicked 27 18
Confidence = 93.6% Too low at 95% to conclude that
this is better
common mistake: Sample size
With jimmy Without Jimmy Users in test 1000 1000 Users
that clicked 270 180
99.9% confidence High enough for us to declare this better
Confounding factors ARE bad
this is hard stuff I hope you understood :) ask
me questions @samphippen