Load Testing with
1,000,000 Users
Sebastian Cohnen, @tisba
stormforger.com, @StormForgerApp
WebPerfDays Barcelona 2014
Slide 2
Slide 2 text
The year is 2014…
• Still no global IPv6 rollout
• But we finally have .technology, .domains, .xyz and .guru TLDs
• TV shows are getting interactive
Slide 3
Slide 3 text
Quizduell
• alias "QuizReto", "QuizClash", …
• Mobile Quiz Game/App
• >30M players worldwide
• >14M in Germany
Slide 4
Slide 4 text
Let’s make an interactive
TV show out of it!
Slide 5
Slide 5 text
"Quizduell im Ersten"
VS
Slide 6
Slide 6 text
Behind the Scenes
• In-App Web View using AngularJS
• HTTP & JSON API written in Go
• Hosted on Google App Engine
Slide 7
Slide 7 text
Show Premiere
• May 12th, 2014
• ~1.6M viewers
• 200,000 pre-registered players
Slide 9
Slide 9 text
Nothing worked… :-/
• The very first round of “Team Germany” failed
• Bad press; much speculation
about hackers, [D]DoS etc.
Slide 10
Slide 10 text
Disaster Recovery
Slide 11
Slide 11 text
I was called in
Slide 12
Slide 12 text
During the next days…
• Large-scale load testing to provide insights
• Profiling, debugging, refactoring, …
Slide 13
Slide 13 text
The Challenge?
Slide 14
Slide 14 text
TV Synchronicity
• This is what makes the show “interactive”
• API polling every 1-10 sec
• “server-side DDoS orchestration” (synchronized state polling & you have to answer questions within 15 sec)
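A rough back-of-the-envelope sketch of why this polling is expensive: if every client polls once per interval, the steady-state request rate is roughly the number of users divided by the interval. The function name and the assumption of evenly spread polls against a single endpoint are illustrative only and are not taken from the actual Quizduell client.

```python
def polling_rps(users: int, interval_s: float) -> float:
    """Approximate steady-state requests per second when every user
    polls once per interval (assumption: polls are evenly spread)."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return users / interval_s


if __name__ == "__main__":
    # Bounds for the 1-10 sec polling range mentioned on the slide.
    for interval in (1, 10):
        rate = polling_rps(1_000_000, interval)
        print(f"1,000,000 users polling every {interval}s -> {rate:,.0f} rps")
```

This is an average only. With synchronized state polling, requests may cluster around the same moments, so short-term peaks can be higher than this estimate.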
Slide 15
Slide 15 text
Y U NO
LOAD TEST BEFORE!?
Slide 16
Slide 16 text
Pre-launch load tests:
up to 85k rps (~250k Users)
Slide 17
Slide 17 text
Pre-launch load tests:
up to 85k rps (~250k Users)
New load tests:
up to 330k rps (~1M Users)
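To put the two test runs side by side, the following sketch derives the per-user request rate and the growth factor from the figures on the slide. The dictionary and function names are made up for illustration, and the slide figures are approximate ("up to", "~"), so the results are rough as well.

```python
# Figures from the slide (approximate).
PRE_LAUNCH = {"rps": 85_000, "users": 250_000}
NEW_TESTS = {"rps": 330_000, "users": 1_000_000}


def per_user_rate(test: dict) -> float:
    """Average requests per second issued by a single simulated user."""
    return test["rps"] / test["users"]


def growth(before: dict, after: dict, key: str) -> float:
    """How many times larger `key` is in the later test."""
    return after[key] / before[key]


if __name__ == "__main__":
    for name, test in (("pre-launch", PRE_LAUNCH), ("new", NEW_TESTS)):
        rate = per_user_rate(test)
        print(f"{name}: {rate:.2f} rps per user, one request every {1 / rate:.1f}s")
    print(f"users grew {growth(PRE_LAUNCH, NEW_TESTS, 'users'):.1f}x, "
          f"rps grew {growth(PRE_LAUNCH, NEW_TESTS, 'rps'):.1f}x")
```

Both runs work out to roughly one request per user every three seconds, so the new tests scaled the user count rather than changing the per-user behaviour.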
Slide 18
Slide 18 text
Remember…
Call* before load
testing with 1M Users!
*even Google
Slide 19
Slide 19 text
May 21st, 2014:
First try with App…
Slide 21
Slide 21 text
SUCCESS!
Slide 22
Slide 22 text
Issues
• Google DoS Protection
• Understand Google App Engine’s tuning & scaling
knobs
• Runtime environment on App Engine is not
transparent
Slide 23
Slide 23 text
The Actual Problem
• Customer insisted on last-minute changes to the
backend, mostly related to real-time statistics
• No time to load test again prior to the show premiere
Slide 24
Slide 24 text
And the moral of this story…
Do continuous load testing!*
*e.g. with stormforger.com :)
Slide 25
Slide 25 text
Thanks!
Sebastian Cohnen (@tisba), stormforger.com
Article: http://bit.ly/Loadtest1MUsers
Slide 27
Slide 27 text
Test Setup
• 50 Load Generators (AWS EC2 Ireland)
• 800 cores, 1.5 TB RAM, lots of bandwidth
• 3.3 TB data moved in over 2B requests
• 1M Users, 330k rps peak
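The setup numbers can be broken down per load generator and per request. This is a sketch under stated assumptions: load is spread evenly across the 50 generators, 1 TB is taken as 10^12 bytes, and "over 2B requests" is treated as exactly 2 billion, which makes the bytes-per-request figure an upper estimate. All names are illustrative.

```python
GENERATORS = 50
CORES = 800
RAM_BYTES = 1_500_000_000_000         # 1.5 TB, decimal units (assumption)
DATA_BYTES = 3_300_000_000_000        # 3.3 TB, decimal units (assumption)
REQUESTS = 2_000_000_000              # "over 2B", treated as exactly 2B
PEAK_RPS = 330_000


def breakdown() -> dict:
    """Per-generator and per-request averages, assuming an even spread."""
    return {
        "cores_per_generator": CORES / GENERATORS,
        "ram_gb_per_generator": RAM_BYTES / GENERATORS / 1e9,
        "peak_rps_per_generator": PEAK_RPS / GENERATORS,
        "bytes_per_request": DATA_BYTES / REQUESTS,
    }


if __name__ == "__main__":
    for key, value in breakdown().items():
        print(f"{key}: {value:,.0f}")
```

Under these assumptions each generator has about 16 cores and 30 GB of RAM, drives about 6,600 rps at peak, and an average request moves at most about 1.65 kB.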