Highly Reliable Tests Data Management

emanuil
August 25, 2018

Automated tests are a great tool for regression testing; however, they are only as good as the data they use. There are different approaches to test data management, and they all trade off the quality and quantity of the data against its availability. One team may use whatever data is currently present, another may use an obfuscated subset of production data, and a third may preload the full database with synthetic data. There are even costly commercial tools that promise to do magic ‘data virtualization’ for you.

However, if you want highly reliable automated tests, the best approach is for each test to create all the data that it needs. With this strategy your tests are reliable and independent, can run in parallel, can run on any environment, on an empty or a dirty database, and can detect problems they were not specifically programmed to find. The tests will also be very stable (we achieved 0.13% flakiness) and fast (we lowered the execution time from 3 hours to less than 3 minutes).
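As a minimal sketch of what "each test creates all the data that it needs" can look like, the Python below uses a hypothetical in-memory `FakeApi` in place of the real application interface. The class, the `unique` helper and the check function are illustrative assumptions, not code from the presentation.

```python
import uuid


class FakeApi:
    """Hypothetical in-memory stand-in for the application's official API."""

    def __init__(self):
        self.users = {}
        self.posts = {}

    def create_user(self, name):
        user_id = uuid.uuid4().hex
        self.users[user_id] = {"id": user_id, "name": name}
        return user_id

    def create_post(self, user_id, text):
        if user_id not in self.users:
            raise KeyError(f"unknown user {user_id}")
        post_id = uuid.uuid4().hex
        self.posts[post_id] = {"id": post_id, "user_id": user_id, "text": text}
        return post_id

    def posts_by(self, user_id):
        return [p for p in self.posts.values() if p["user_id"] == user_id]


def unique(prefix):
    """Return a name that will not collide with data left by other tests."""
    return f"{prefix}-{uuid.uuid4().hex[:8]}"


def check_user_sees_only_own_posts(api):
    # Arrange: the test creates every record it depends on.
    author = api.create_user(unique("author"))
    other = api.create_user(unique("other"))
    api.create_post(author, "first")
    api.create_post(author, "second")
    api.create_post(other, "unrelated")

    # Act and assert against this test's own data only.
    texts = sorted(p["text"] for p in api.posts_by(author))
    assert texts == ["first", "second"]
```

Because the check never reads data it did not create, it gives the same result on an empty or a dirty store, and two copies can run side by side without interfering.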

Those great advantages come at a cost, however: you need to completely overhaul your testing framework. This presentation will help you do just that, from deciding which interfaces to use for data insertion to abstracting this low-level functionality at the correct level in your framework.

Test data generation at the test case level is only one part of the solution. This presentation will also touch on topics such as random test data generation, strategies for cleaning test data, and how to deal with test data if you’re using service virtualization when testing against third-party services outside of your control.
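Random data generation and tag-based cleaning can be combined, as in the rough Python sketch below. It mirrors the slide examples, where each generated tweet mixes a random sentence, the constant string "random tweet" and a special character. The word list, the special values, the function names and the seeded generator are all assumptions made for illustration.

```python
import random

TAG = "random tweet"  # constant marker, as in the slide examples
# Hypothetical vocabulary and special values; a real generator would use more.
WORDS = ["eum", "odit", "omnis", "impedit", "officia", "adipisci", "dolor", "ipsa"]
SPECIALS = ["''", "&", "@someMention", "http://example.com/path"]


def random_sentence(rng, min_words=3, max_words=8):
    """Build a capitalized sentence from randomly chosen words."""
    count = rng.randint(min_words, max_words)
    words = [rng.choice(WORDS) for _ in range(count)]
    return " ".join(words).capitalize() + "."


def random_tweet(rng):
    """Combine a random sentence, the constant tag and one special value."""
    parts = [random_sentence(rng), TAG, rng.choice(SPECIALS)]
    rng.shuffle(parts)  # vary where the special value lands in the text
    return " ".join(parts)


def delete_tagged(rows):
    """Return the rows with all tagged test data removed."""
    return [row for row in rows if TAG not in row]
```

Passing a seeded `random.Random` instance is one way to make a failing run reproducible, and the constant tag makes generated rows easy to find and delete afterwards.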

Transcript

  1. HIGHLY RELIABLE TESTS
    DATA MANAGEMENT
    @EmanuilSlavov
    [email protected]

  2. High Level Automated Tests Problems
    Slow
    Unreliable

  3. 3 hours to 3 minutes
    *Need for Speed: Accelerate Tests from 3 Hours to 3 Minutes

  4. Falcon’s flaky test rate: 0.13%
    Google’s flaky test rate: 1.5%*
    *Flaky Tests at Google and How We Mitigate Them

  5. The way we achieved this
    Each test creates all the data that it needs.

  6. @EmanuilSlavov

  7. @EmanuilSlavov

  8. The time needed to create data for one test
    Call 12 API endpoints
    Modify data in 11 tables
    Takes about 1.2 seconds
    And then the test starts

  9. Static vs Dynamic Data

  10. @EmanuilSlavov

  11. @EmanuilSlavov

  12. Random Sentence, Constant String, Special Character
    Eum odit omnis impedit officia adipisci id non. random tweet ''
    random tweet Provident ipsa dolor excepturi quo asperiores animi. @someMention
    & random tweet Dignissimos eos accusamus aut ratione
    [email protected] random tweet Ut optio illum libero.
    Natus accusantium aliquam dolore atque voluptatum et a. http://ryanpacocha.biz/nikita random tweet

  13. Service Virtualization
    Application
    Facebook
    Paypal
    Amazon S3

  14. Service Virtualization
    Application
    Proxy*
    Facebook
    Paypal
    Amazon S3
    *github.com/emanuil/nagual

  15. Existing Tools (March 2016)
    Features: Transparent, Fake SSL certs, Dynamic Responses, Persist State, Return Binary Data, Regex URL match
    Tools: Stubby4J, WireMock, Wilma, soapUI, MockServer, mountebank, Hoverfly, Mirage

  16. @EmanuilSlavov

  17. Advantages
    Independent (run in isolation)
    Run in random order (do all the state setting)
    Run in parallel (to bring speed)
    Run on any database (only schema is needed)
    Easy to investigate (independent data per test)
    Catch more bugs (using realistic generators)

  18. Tips & Tricks

  19. Use an official interface to insert the test data
    Careful when testing in production - write operations
    Don’t expose test-only endpoints to the outside world

  20. Test Data Cleaning

  21. Each Test Deletes its Data

  22. Tag Test Data

  23. In case of a Dedicated Test Environment

  24. Other Test Data Strategies

  25. Use a dedicated test data set
    Use (sanitized) production data
    Seed a DB with test data before all tests start

  26. FALCON.IO
    WE’RE HIRING.
    Sofia · Copenhagen · Budapest

  27. [email protected]
    @EmanuilSlavov
    EmanuilSlavov.com
