
Highly Reliable Tests Data Management

August 25, 2018

Automated tests are a great tool for regression testing, but they are only as good as the data they use. There are different approaches to test data management, and they all trade off the quality, quantity, and availability of the data. One team may use whatever data is currently present, another may use an obfuscated subset of production data, and a third may preload the full database with synthetic data. There are even costly commercial tools that promise to do "data virtualization" magic for you.

However, if you want highly reliable automated tests, the best approach is for each test to create all the data that it needs. With this strategy your tests will be reliable and independent. They can run in parallel, on any environment, and on an empty or a dirty database, and they can detect problems that they were not specifically written to find. The tests will also be very stable (we achieved 0.13% flakiness) and fast (we lowered the execution time from 3 hours to less than 3 minutes).

Those great advantages come at a cost, however: you need to completely overhaul your testing framework. This presentation will help you do just that, from deciding which interfaces to use for data insertion to abstracting this low-level functionality at the correct level in your framework.

Test data generation at the test case level is only one part of the solution. This presentation will also touch on topics such as random test data generation, strategies for cleaning test data, and how to deal with test data when you use service virtualization to test against third-party services outside of your control.
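
Two of these topics, random data generation and cleanup, can be sketched together. The example below is an assumption-laden illustration, not the speaker's implementation: the word list, the `tweets` table, and the `run_tag` column are hypothetical. It builds a tweet from a random sentence, a special character, a mention, an email address, and a URL, tags every inserted row, and later deletes only the tagged rows.

```python
import random
import sqlite3
import string

WORDS = ["alpha", "bravo", "charlie", "delta", "echo", "foxtrot"]
SPECIAL_CHARS = ["'", '"', "&", "<", ">", "%"]


def random_sentence(rng, length=5):
    # Join random words into a capitalized sentence ending with a period.
    words = [rng.choice(WORDS) for _ in range(length)]
    return " ".join(words).capitalize() + "."


def random_tweet(rng):
    # Mix several fragment types so the text resembles realistic input.
    handle = "".join(rng.choice(string.ascii_lowercase) for _ in range(8))
    fragments = [
        random_sentence(rng),
        rng.choice(SPECIAL_CHARS),
        f"@{handle}",
        f"{handle}@example.com",
        f"http://example.com/{handle}",
    ]
    rng.shuffle(fragments)
    return " ".join(fragments)


def insert_tweet(conn, text, run_tag):
    # Every row carries a tag identifying the test run that created it.
    conn.execute(
        "INSERT INTO tweets (text, run_tag) VALUES (?, ?)", (text, run_tag)
    )


def delete_tagged(conn, run_tag):
    # Cleanup removes only rows created by this run; returns the row count.
    cur = conn.execute("DELETE FROM tweets WHERE run_tag = ?", (run_tag,))
    return cur.rowcount


if __name__ == "__main__":
    rng = random.Random(2018)  # a fixed seed makes a failing input reproducible
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tweets (text TEXT, run_tag TEXT)")
    for _ in range(3):
        insert_tweet(conn, random_tweet(rng), "run-1")
    print(delete_tagged(conn, "run-1"), "rows cleaned up")
```

Seeding the generator is a design choice worth considering: random inputs can surface bugs that fixed fixtures miss, and logging the seed lets a failing run be replayed.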




  1. DATA MANAGEMENT HIGHLY RELIABLE TESTS @EmanuilSlavov emo@falcon.io

  2. High-Level Automated Tests Problems: Slow, Unreliable

  3. 3 hours to 3 minutes* (*Need for Speed: Accelerate Tests from 3 Hours to 3 Minutes)

  4. Falcon’s flaky test rate: 0.13%. Google’s flaky test rate: 1.5%* (*Flaky Tests at Google and How We Mitigate Them)

  5. The way we achieved this: each test creates all the data that it needs.

  6. @EmanuilSlavov

  7. @EmanuilSlavov

  8. The time needed to create data for one test: call 12 API endpoints, modify data in 11 tables, takes about 1.2 seconds. And then the test starts.

  9. Static vs Dynamic Data

  10. @EmanuilSlavov

  11. @EmanuilSlavov

  12. Random tweet examples, with parts labeled Random Sentence, Constant String, and Special Character: Eum odit omnis impedit officia adipisci id non. random tweet '' random tweet Provident ipsa dolor excepturi quo asperiores animi. @someMention & random tweet Dignissimos eos accusamus aut ratione aracely@jenkins.co random tweet Ut optio illum libero. Natus accusantium aliquam dolore atque voluptatum et a. http://ryanpacocha.biz/nikita random tweet

  13. Service Virtualization: Application, Facebook, Paypal, Amazon S3

  14. Service Virtualization: Application, Proxy*, Facebook, Paypal, Amazon S3 (*github.com/emanuil/nagual)

  15. Existing Tools (March 2016). Features compared: Transparent, Fake SSL certs, Dynamic Responses, Persist State, Return Binary Data, Regex URL match. Tools: Stubby4J, WireMock, Wilma, soapUI, MockServer, mountebank, Hoverfly, Mirage

  16. @EmanuilSlavov

  17. Advantages: Independent (run in isolation); Run in random order (do all the state setting); Run in parallel (to bring speed); Run on any database (only schema is needed); Easy to investigate (independent data per test); Catch more bugs (using realistic generators)

  18. Tips & Tricks

  19. Use an official interface to insert the test data. Careful when testing in production (write operations). Don’t expose test-only endpoints to the outside world.

  20. Test Data Cleaning

  21. Each Test Deletes its Data @EmanuilSlavov

  22. Tag Test Data @EmanuilSlavov

  23. In case of a Dedicated Test Environment @EmanuilSlavov

  24. Other Test Data Strategies

  25. Use a dedicated test data set. Use (sanitized) production data. Seed a DB with test data before all tests start.

  26. None
  27. FALCON.IO WE’RE HIRING. Sofia · Copenhagen · Budapest

  28. emo@falcon.io @EmanuilSlavov EmanuilSlavov.com