Bulletproof Test Environments

emanuil
November 12, 2020

We all want fast, stable and reliable automated tests. Usually, we can’t reach those goals because of something outside of our control: our test environments. Or so we think. Test environments were manually cobbled together a long, long time ago by someone who no longer works at the company. They are someone else’s problem. They are a house of cards, requiring slow, manual intervention. Because it is so hard to create a new environment from scratch, teams share them to run both automated and exploratory tests. To get stable automated test results in such environments, we have to rely on crutches: adding complex retry logic, increasing timeouts, quarantining flaky tests.

With the ubiquity of containerization, there is a better way. We take control of our test environment. We use code to describe it. We create it on the fly when we need it. It is in the same pristine, clean state each time it starts. It is dedicated only to our automated tests. It lives only as long as our tests are running, and then it is destroyed. It is isolated from the outside world, making it very fast and reliable. In fact, in such an environment at Falcon, we achieved 120x faster automated tests while keeping flakiness as low as 0.13%. These environments can also run on your laptop for easy debugging of a microservices application.
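
The lifecycle described above (described as code, created on demand, destroyed after the run) could look roughly like the following. This is a minimal sketch, not the setup used at Falcon: it assumes Docker Compose is the container tool, the file name `docker-compose.test.yml` is hypothetical, and the `run` parameter exists only so the lifecycle can be exercised without Docker installed.

```python
import subprocess
import uuid
from contextlib import contextmanager


def compose_cmd(project, compose_file, *args):
    """Build a `docker compose` command line for one isolated project."""
    return ["docker", "compose", "-p", project, "-f", compose_file, *args]


@contextmanager
def ephemeral_env(compose_file="docker-compose.test.yml", run=subprocess.run):
    """Start a fresh environment, yield its project name, always tear it down.

    `compose_file` is a hypothetical file name; `run` is injectable so a
    fake runner can stand in for subprocess.run.
    """
    # A unique project name keeps parallel runs from sharing containers.
    project = f"testenv-{uuid.uuid4().hex[:8]}"
    run(compose_cmd(project, compose_file, "up", "-d"), check=True)
    try:
        yield project
    finally:
        # --volumes also removes named volumes, so the next run starts clean.
        run(compose_cmd(project, compose_file, "down", "--volumes"), check=True)
```

A test run would wrap its runner invocation in `with ephemeral_env() as project:` so that the teardown happens even when the tests fail.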

In this talk you’ll learn why it is important to start owning your test environment if you want to achieve unparalleled test reliability. You’ll see how to start simple with containers and then gradually make the setup more complex by adding different services, and which tools are most useful for simulating services that your team does not own (Authentication, Permissions) as well as any external API dependencies (Cloud Storage, Social Networks). You will also learn how to multiply the value of your automated tests by probing deeper into the test environment to find problems that usually only surface in production. This talk is for everyone who works in a fast-paced, modern development environment.
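
One way to picture simulating an external API dependency is a small stub that answers with canned responses, plus hosts-file lines that point the real domain names at it. The sketch below is an illustration under stated assumptions, not how github.com/emanuil/nagual is implemented: every path, payload, and function name is invented, and it serves plain HTTP only, so the trusted root certificate needed for HTTPS traffic is out of scope.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Canned responses keyed by (method, path). All paths and payloads are invented.
CANNED = {
    ("GET", "/v1/me"): (200, {"id": "42", "name": "Test User"}),
    ("POST", "/v1/posts"): (201, {"id": "p-1"}),
}


def lookup(method, path):
    """Return (status, body) for a request, or a 404 for anything unscripted."""
    return CANNED.get((method, path), (404, {"error": "not simulated"}))


def hosts_entries(proxy_ip, domains):
    """Lines for /etc/hosts that point external domains at the simulator."""
    return [f"{proxy_ip} {domain}" for domain in domains]


class StubHandler(BaseHTTPRequestHandler):
    def _reply(self):
        # Drain any request body before answering.
        length = int(self.headers.get("Content-Length") or 0)
        if length:
            self.rfile.read(length)
        status, body = lookup(self.command, self.path)
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    do_GET = _reply
    do_POST = _reply

    def log_message(self, format, *args):
        pass  # keep test output quiet


def start_stub(port=0):
    """Start the stub on localhost in a background thread; port 0 picks a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), StubHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Because unscripted requests get a 404 instead of reaching the real service, a test that depends on an unsimulated call fails loudly rather than silently leaving the isolated environment.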





  1. Bulletproof Test Environments @EmanuilSlavov emo@falcon.io

  2. Test Environment Problems: used for manual and automated tests; lack of control and limited access; need constant manual interventions.
  3. You should have maximum control over your test environment: described as code, start/stop, debug, full control of external dependencies.
  4. Advantages

  5. 3 hours to 3 minutes. *Need for Speed: Accelerate Tests from 3 Hours to 3 Minutes
  6. Falcon’s flaky test rate: 0.13%. Google’s flaky test rate: 1.5%*. *Flaky Tests at Google and How We Mitigate Them
  7. How to Start

  8. None
  9. Extract a single service in a container. Execute tests on this service. It still talks to the ‘old’ test env.
  10. Continue extracting the rest of the services. Still using the ‘old’ databases.
  11. Test env. now fully containerized. Need to simulate external dependencies for a truly “hermetic” test env.
  12. External Dependencies to Simulate

  13. Runs in Memory

  14. Test Data Generation*: synthetic test data; prepare test data in advance; the actual test case starts below. *From the Highly Reliable Tests Data Management talk
  15. Test Data Generation Advantages: makes it possible to run tests in parallel; easy to debug a failing test; independent of the test environment it runs on.
  16. Simulate External Dependencies

  17. Service Virtualization: Before. Diagram labels: Test Env, Facebook, Paypal, Amazon S3

  18. Service Virtualization: After. Diagram labels: Test Env, Proxy*, Facebook, Paypal, Amazon S3. *github.com/emanuil/nagual
      Traffic redirected by:
      1. Entries in /etc/hosts file
      2. Trusted Root Certificate
  19. None
  20. How it all works in practice (Demo)

  21. None
  22. Bonus: Advanced Capabilities

  23. The Faults in Our Logs

  24. Some exceptions are caught, logged and never acted upon. Look for unexpected errors/exceptions in the app logs.
  25. None
  26. Bad Data

  27. Bad data depends on the context.

  28. One of those values was zero (0)

  29. Custom Data Integrity Checks

  30. If all tests pass, but there is bad data, then fail the test run and investigate.
  31. Baseline Application Stats

  32. Record various application stats after each test run. Easy on a dedicated environment, especially with containers. With fast tests* you can tie perf bottlenecks to specific commits.
  33. [Chart] Size of the app log file: new lines after each commit (54% increase)
  34. [Chart] Total Mongo queries: count after each commit (26% increase)
  35. What data to collect after a test run is completed:
      Logs: lines, size, exceptions/errors count
      DB: read/write queries, transaction time, network connections
      OS: peak CPU and memory usage, swap size, disk I/O
      Network: 3rd party API calls, packet counts, DNS queries
      Language specific: objects created, thread count, GC runs, heap size
  36. Conclusion: Reliable automated tests require full test environment control. A big bang is not necessary; start with a single service. Take advantage of advanced defect detection techniques. Create test data on the fly. Figure out how to simulate external dependencies.
  37. FALCON.IO WE’RE HIRING. Sofia · Copenhagen · Budapest · Chennai
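
The post-run checks from the "Bonus: Advanced Capabilities" slides (scanning app logs for unexpected errors, comparing application stats against a baseline) might be sketched as below. This is an assumption-laden illustration, not the tooling shown in the talk: the error markers, the allow-list mechanism, the stat names, and the 10% tolerance are all invented for the example.

```python
import re

# Markers that suggest a caught-and-logged failure; adjust to the app's log format.
ERROR_PATTERN = re.compile(r"ERROR|FATAL|Exception|Traceback")


def unexpected_errors(log_lines, allowed=()):
    """Return log lines that look like errors and match none of the allowed regexes."""
    allowed_res = [re.compile(p) for p in allowed]
    return [
        line for line in log_lines
        if ERROR_PATTERN.search(line)
        and not any(a.search(line) for a in allowed_res)
    ]


def regressions(baseline, current, tolerance=0.10):
    """Return {stat: relative_increase} for stats that grew past the tolerance."""
    grown = {}
    for name, old in baseline.items():
        new = current.get(name)
        if new is None or old <= 0:
            continue  # nothing to compare against
        increase = (new - old) / old
        if increase > tolerance:
            grown[name] = increase
    return grown


def run_is_clean(log_lines, baseline, current, allowed=()):
    """A run counts as clean only if logs and stats are both free of surprises."""
    return not unexpected_errors(log_lines, allowed) and not regressions(baseline, current)
```

Wiring `run_is_clean` into the end of a test run lets an otherwise green run fail when the logs or the stats look wrong, which is the behaviour the slides argue for with bad data.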