Numbers
99.95% availability
Max 250ms latency
$2B of transactions processed per day
3 deployments in Fortune 500 companies
Growing by 2 people per week (Hiring!)
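To put the availability figure in perspective, here is a small sketch of the downtime it allows. It assumes the 99.95 on the slide is a percentage; the function name and the 30-day and 365-day periods are illustrative choices, not figures from the talk.

```python
def downtime_budget_minutes(availability_percent: float, period_hours: float) -> float:
    """Minutes of downtime allowed in a period for a given availability target."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * period_hours * 60.0


# Assumption: 99.95 is read as 99.95%.
per_30_days = downtime_budget_minutes(99.95, 30 * 24)
per_year = downtime_budget_minutes(99.95, 365 * 24)
print(f"{per_30_days:.1f} min per 30 days, {per_year / 60:.2f} h per year")
```

Under that reading, the budget is roughly 21.6 minutes per 30 days, or about 4.4 hours per year.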
Slide 5
A typical deployment
Pulse server
App engine
SQL Databases
ZooKeeper
RabbitMQ
Cassandra
18 machines
Slide 6
The largest deployment
2 Datacenters
6 environments
150 Servers
13 TB of memory
3000 cores
250 TB of storage
Slide 7
Go beyond unit tests
End to end functionality
Fault-tolerance
High-availability
Machine / datacenter synchronization
Slide 8
Docker and bash
Hard to maintain
Hard to expand
Hard to CI
Hard to test scripts
Hard to script tests
Slide 9
System-tests framework
Slide 11
System-tests framework
Write system-tests like unit tests
Slide 13
Feature highlights
Set up a multi-container, multi-datacenter test scenario in a couple of lines
Easily assert cluster states, connections, applications, etc.
Write deployment-like system-tests the way you write a unit test
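As an illustration of what "system-tests like unit tests" can look like, here is a minimal sketch. It is not the framework from the talk: `FakeCluster` and `FakeContainer` are hypothetical in-memory stand-ins invented for this example, and nothing is actually started in Docker.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FakeContainer:
    """Hypothetical stand-in for a running container; nothing is started."""
    name: str
    image: str
    datacenter: str
    running: bool = True


@dataclass
class FakeCluster:
    """Hypothetical scenario builder; a real framework would call Docker here."""
    containers: Dict[str, FakeContainer] = field(default_factory=dict)

    def add(self, datacenter: str, image: str, count: int = 1) -> "FakeCluster":
        # Name containers <datacenter>-<image>-<index>, counting existing ones.
        for _ in range(count):
            index = len(self.select(image, datacenter, only_running=False))
            name = f"{datacenter}-{image}-{index}"
            self.containers[name] = FakeContainer(name, image, datacenter)
        return self

    def stop(self, name: str) -> None:
        self.containers[name].running = False

    def select(self, image: str, datacenter: str,
               only_running: bool = True) -> List[FakeContainer]:
        return [c for c in self.containers.values()
                if c.image == image and c.datacenter == datacenter
                and (c.running or not only_running)]


def test_stopping_one_node_leaves_the_rest_running():
    # Scenario setup in a couple of lines: two datacenters, three nodes each.
    cluster = (FakeCluster()
               .add("dc1", "zookeeper", count=3)
               .add("dc2", "zookeeper", count=3))
    cluster.stop("dc1-zookeeper-0")
    # Assert on cluster state, exactly as a unit test would.
    assert len(cluster.select("zookeeper", "dc1")) == 2
    assert len(cluster.select("zookeeper", "dc2")) == 3
```

A test runner such as pytest would pick up the `test_` function as-is; the point of the sketch is only the shape: short scenario setup, one action, plain assertions.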
Slide 14
Feature highlights
Container creation, setup, and volume complexity are abstracted away from the tests
Integrated into the CI environment (Jenkins)
Developers can easily implement system-tests
Slide 15
Bonus – production firewalls
All containers are blocked by a firewall
When testing, open only the necessary ports
Ensures functionality with production security
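The "block everything, open only what the test needs" idea can be sketched as a rule generator. The talk does not say which firewall tool is used; iptables is an assumption here, and `firewall_rules` is a hypothetical helper that only builds command strings without running them. The ports in the usage line are the default client ports of RabbitMQ (5672), ZooKeeper (2181), and Cassandra (9042).

```python
from typing import Iterable, List


def firewall_rules(open_tcp_ports: Iterable[int]) -> List[str]:
    """Build iptables commands: drop inbound traffic by default,
    then accept only the listed TCP ports. Commands are returned, not executed."""
    rules = [
        "iptables -P INPUT DROP",             # default policy: block everything inbound
        "iptables -A INPUT -i lo -j ACCEPT",  # keep loopback traffic working
        "iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT",
    ]
    for port in sorted(set(open_tcp_ports)):
        if not 1 <= port <= 65535:
            raise ValueError(f"invalid TCP port: {port}")
        rules.append(f"iptables -A INPUT -p tcp --dport {port} -j ACCEPT")
    return rules


# Assumed example: a test that only needs RabbitMQ, ZooKeeper, and Cassandra.
for rule in firewall_rules([5672, 2181, 9042]):
    print(rule)
```

Keeping the rule list as data makes it easy for a test to declare its ports up front and for the framework to apply them inside each container.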
Slide 16
Bumps along the way
The Docker daemon may have some concurrency problems, and some calls fail in some tests
Containers would not be removed
Solved by retrying some Docker calls
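The retry workaround can be sketched as a small wrapper. This is a generic pattern, not the code used in the talk: `call_with_retries`, its parameters, and the simulated flaky call are all hypothetical, and the linearly growing delay is one choice among several.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(operation: Callable[[], T],
                      attempts: int = 3,
                      delay_seconds: float = 0.5,
                      retry_on: Tuple[Type[BaseException], ...] = (Exception,),
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Run operation, retrying on the given exceptions with a growing delay."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last failure
            sleep(delay_seconds * attempt)  # wait a little longer each time
    raise AssertionError("unreachable")


# Simulated flaky call standing in for something like a container removal.
failures_left = [2]


def flaky_remove() -> str:
    if failures_left[0] > 0:
        failures_left[0] -= 1
        raise ConnectionError("daemon busy")
    return "removed"


print(call_with_retries(flaky_remove, attempts=3, sleep=lambda s: None))
```

Retrying only suits calls that are safe to repeat, such as removing a container that may already be gone; narrowing `retry_on` to the specific client exception avoids masking real bugs.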
Slide 17
Bumps along the way
Docker-java leaves TCP/IP sockets stuck (TIME_WAIT)
No sockets available during a build run
Worked around by enabling socket reuse at the kernel level
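One way to confirm this kind of socket exhaustion is to count sockets in the TIME_WAIT state, presumably the state the slide refers to. The sketch below parses text in the format of Linux's `/proc/net/tcp`, where the fourth column is a hex state code and `06` means TIME_WAIT. The sample data is made up. The slide does not name the kernel setting; `net.ipv4.tcp_tw_reuse` is one Linux candidate and is mentioned here only as an assumption.

```python
TIME_WAIT_STATE = "06"  # hex state code for TIME_WAIT in /proc/net/tcp

# Assumption: the kernel-level workaround may correspond to this sysctl key.
ASSUMED_SYSCTL_KEY = "net.ipv4.tcp_tw_reuse"


def count_time_wait(proc_net_tcp_text: str) -> int:
    """Count sockets in TIME_WAIT from /proc/net/tcp-formatted text."""
    count = 0
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) > 3 and fields[3] == TIME_WAIT_STATE:
            count += 1
    return count


# Made-up sample: one listening socket (0A) and two in TIME_WAIT (06).
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 0100007F:0CEA 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 12345
   1: 0100007F:C350 0100007F:0947 06 00000000:00000000 03:00000A5C 00000000     0        0 0
   2: 0100007F:C352 0100007F:0947 06 00000000:00000000 03:00000B10 00000000     0        0 0
"""
print(count_time_wait(SAMPLE))
```

On a build machine, the same function could be fed the contents of `/proc/net/tcp` before and after a test run to see whether the count keeps climbing.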
Slide 18
Bumps along the way
Before Docker 1.9, container IPs would change when containers were stopped
/etc/hosts updated on every docker run
After 1.9, IPs are static and more predictable
Containers always keep their IP
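To show why changing IPs were a nuisance, here is a sketch of regenerating hosts-file entries from a name-to-IP mapping, which is the kind of bookkeeping that has to be redone whenever an address changes. `render_hosts` and the container names and addresses are hypothetical; the talk does not describe how the file was actually updated.

```python
import ipaddress
from typing import Dict


def render_hosts(container_ips: Dict[str, str]) -> str:
    """Render /etc/hosts-style lines for a mapping of container name to IP."""
    lines = ["127.0.0.1\tlocalhost"]
    for name in sorted(container_ips):
        # ip_address() raises ValueError if the address is malformed.
        ip = ipaddress.ip_address(container_ips[name])
        lines.append(f"{ip}\t{name}")
    return "\n".join(lines) + "\n"


# Hypothetical containers and addresses.
print(render_hosts({"zookeeper-0": "172.17.0.2", "rabbitmq-0": "172.17.0.3"}), end="")
```

With stable addresses, a file like this can be written once per scenario instead of after every restart.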
Slide 19
Current status
128 GB RAM
24 CPU cores
over SSD
315 tests
3h build
7 parallel tests running
Up to 100 parallel containers running
Bottleneck
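The status figures allow a rough back-of-the-envelope check. The sketch below assumes all 7 parallel slots stay busy for the whole 3-hour build, which gives an upper bound on the average test duration, and that the 100 containers are spread evenly across the running tests; neither assumption comes from the talk.

```python
tests = 315
build_minutes = 3 * 60
parallel_tests = 7
max_containers = 100

# Total slot time available if every slot is busy for the whole build.
slot_minutes = build_minutes * parallel_tests

# Upper bound on average test duration under that assumption.
avg_minutes_per_test = slot_minutes / tests

# Average containers per running test if the 100 are spread evenly.
containers_per_test = max_containers / parallel_tests

print(f"{avg_minutes_per_test:.1f} min per test, "
      f"up to {containers_per_test:.1f} containers per test")
```

Under these assumptions the build averages at most about 4 minutes per test, with roughly 14 containers available to each running test.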
Slide 20
Next steps
Integrate more images
Docker Swarm integration
Distributed test execution
Slide 21
Thank You
Slide 22
Visit www.feedzai.com/wickedsmart
LIVE EVENTS | MEET THE PEOPLE BEHIND THE MACHINE