Slide 1

Slide 1 text

Load Testing with 1,000,000 Users
 Sebastian Cohnen, @tisba
 stormforger.com, @StormForgerApp
 WebPerfDays Barcelona 2014

Slide 2

Slide 2 text

The year is 2014… • Still no global IPv6 rollout • But we finally have .technology, .domains, .xyz and .guru TLDs
 • TV Shows are getting interactive

Slide 3

Slide 3 text

Quizduell • alias "QuizReto", "QuizClash", … • Mobile Quiz Game/App • >30M players worldwide • >14M in Germany

Slide 4

Slide 4 text

Let’s make an interactive TV show out of it!

Slide 5

Slide 5 text

"Quizduell im Ersten" VS

Slide 6

Slide 6 text

Behind the Scenes • In-App Web View using AngularJS • HTTP & JSON API written in Go • Hosted on Google App Engine

Slide 7

Slide 7 text

Show Premiere • May 12th, 2014 • ~1.6M viewers • 200,000 pre-registered players

Slide 8

Slide 8 text

No content

Slide 9

Slide 9 text

Nothing worked… :-/ • The very first round of “Team Germany” failed • Bad press; much speculation
 about hackers, [D]DoS etc.

Slide 10

Slide 10 text

Disaster Recovery

Slide 11

Slide 11 text

I was called in

Slide 12

Slide 12 text

During the next days… • large scale load testing to provide insights • profiling, debugging, refactoring, …

Slide 13

Slide 13 text

The Challenge?

Slide 14

Slide 14 text

TV Synchronicity • This is what makes the show “interactive” • API polling every 1-10 sec • “server-side DDoS orchestration” (synchronized state polling & you have to answer questions within 15 sec)
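
A minimal sketch of why synchronized polling can look like a DDoS from the server's side. It assumes 1,000 hypothetical clients that each poll every 5 seconds (a value inside the 1-10 sec range above) and compares evenly spread start times with everyone starting at the same instant. The function `peak_rps` and all numbers are illustrative and are not taken from the actual Quizduell setup.

```python
from collections import Counter


def peak_rps(offsets_s, interval_s, duration_s):
    """Return the request count of the busiest one-second bucket.

    Each client sends its first request at its offset and then polls
    again every interval_s seconds until duration_s is reached.
    """
    buckets = Counter()
    for offset in offsets_s:
        t = offset
        while t < duration_s:
            buckets[int(t)] += 1  # count the request in its one-second bucket
            t += interval_s
    return max(buckets.values())


# Hypothetical numbers, chosen only for illustration.
clients = 1000
interval_s = 5
duration_s = 60

# Case 1: start times spread evenly across one polling interval.
spread_offsets = [i * interval_s / clients for i in range(clients)]
# Case 2: every client starts at the same instant (e.g. triggered by the show).
synced_offsets = [0.0] * clients

spread_peak = peak_rps(spread_offsets, interval_s, duration_s)
synced_peak = peak_rps(synced_offsets, interval_s, duration_s)

print(f"spread peak: {spread_peak} rps")
print(f"synced peak: {synced_peak} rps")
```

With evenly spread clients the peak equals the average rate (clients divided by interval); with synchronized clients every poll lands in the same second, so the peak is the full client count.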

Slide 15

Slide 15 text

Y U NO LOAD TEST BEFORE!?

Slide 16

Slide 16 text

Pre-launch load tests:
 up to 85k rps (~250k users)

Slide 17

Slide 17 text

Pre-launch load tests:
 up to 85k rps (~250k users) New load tests:
 up to 330k rps (~1M users)
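
As a rough cross-check of these figures, the per-user request rate can be derived from the rps and user counts on the slide. This is only back-of-the-envelope arithmetic; it assumes the user counts are concurrent users and that load is spread evenly across them, which the slides do not state.

```python
def requests_per_user_per_second(rps, users):
    """Average request rate of a single user."""
    return rps / users


# Figures from the slide (approximate, as given there).
pre_launch = requests_per_user_per_second(85_000, 250_000)
new_tests = requests_per_user_per_second(330_000, 1_000_000)

# The inverse is the implied average time between two requests of one user.
pre_launch_interval_s = 1 / pre_launch
new_tests_interval_s = 1 / new_tests

print(f"pre-launch: {pre_launch:.2f} req/s per user, one request every {pre_launch_interval_s:.2f} s")
print(f"new tests:  {new_tests:.2f} req/s per user, one request every {new_tests_interval_s:.2f} s")
```

Both test series work out to roughly one request per user every 3 seconds, which sits inside the 1-10 sec polling range mentioned earlier.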

Slide 18

Slide 18 text

Remember… Call* before load testing with 1M Users! *even Google

Slide 19

Slide 19 text

May 21st 2014:
 First try with App…

Slide 20

Slide 20 text

No content

Slide 21

Slide 21 text

SUCCESS!

Slide 22

Slide 22 text

Issues • Google DoS Protection • Understand Google App Engine’s tuning & scaling knobs • Runtime environment on App Engine is not transparent

Slide 23

Slide 23 text

The Actual Problem • Customer insisted on last-minute changes to the backend, mostly related to real-time statistics • No time to load test again prior to show premiere

Slide 24

Slide 24 text

And the moral of this story… Do continuous load testing!* *e.g. with stormforger.com :)

Slide 25

Slide 25 text

Thanks! Sebastian Cohnen (@tisba), stormforger.com Article: http://bit.ly/Loadtest1MUsers

Slide 26

Slide 26 text

No content

Slide 27

Slide 27 text

Test Setup • 50 Load Generators (AWS EC2 Ireland) • 800 cores, 1.5 TB RAM, lots of bandwidth • 3.3 TB data moved in over 2B requests • 1M users, 330k rps peak
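
To put the setup in perspective, the totals above can be broken down per load generator. This is a sketch that assumes the 50 generators were identical, that units are decimal (1 TB = 1,000 GB), and that exactly 2B requests were made. The slide says "over 2B", so the bytes-per-request figure is an upper bound.

```python
# Totals taken from the slide.
generators = 50
total_cores = 800
total_ram_gb = 1_500                  # 1.5 TB, assuming decimal units
total_bytes = 3_300_000_000_000       # 3.3 TB, assuming decimal units
total_requests = 2_000_000_000        # "over 2B", so treated as a lower bound
peak_rps = 330_000
users = 1_000_000

# Per-generator averages, assuming identical machines and even distribution.
cores_per_generator = total_cores / generators
ram_gb_per_generator = total_ram_gb / generators
peak_rps_per_generator = peak_rps / generators
users_per_generator = users / generators

# Average data moved per request (upper bound, see the assumption above).
avg_bytes_per_request = total_bytes / total_requests

print(f"cores per generator:    {cores_per_generator:.0f}")
print(f"RAM per generator:      {ram_gb_per_generator:.0f} GB")
print(f"peak rps per generator: {peak_rps_per_generator:.0f}")
print(f"users per generator:    {users_per_generator:.0f}")
print(f"avg bytes per request:  {avg_bytes_per_request:.0f}")
```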