Building a screenshot testing pipeline that scales

Lukas Appelhans
September 21, 2023

Scaling screenshot tests has been a difficult topic on Android for a while. When we rigorously apply visual regression testing to design system components or large-scale applications, we run into problems of scale: long CI/CD run times, increased costs, and reduced productivity due to conflicting pull requests.

This talk goes through some of the tradeoffs involved and proposes a screenshot testing pipeline that uses Paparazzi to generate screenshots and reg-suit to create a diff report stored in a GCS bucket.

Transcript

  1. Structure of this talk

    01 Choosing which frameworks to use.
    02 Building a screenshot testing pipeline.
    03 Using the screenshot testing pipeline.
    04 Adoption at Mercari and what we’re missing.
  2. What is the purpose of screenshot testing?

    Verifying that UI code is rendering correctly. (As opposed to: Verifying that our code’s logic is working correctly.)
  3. How many tests do we expect to have?

    A lot of them. (Figure: test pyramid with E2E tests, integration tests, unit tests and screenshot tests.)
  5. How does screenshot testing differ from other tests?

    Unit/Integration/E2E tests
    • Given, when, then is clearly specified.
    • Post-condition can be verified automatically.

    Screenshot tests
    • Given (UI code with data), then.
    • Post-condition cannot be verified automatically. (Whether UI code renders “correctly” is up to the viewer.)
    • Use diffing to find which parts of the UI changed to make manual review easier.
  6. Test cases: Framework choices

    Paparazzi
    • Fast and cheap.
    • Easy to set up.
    • There are some differences between UI rendered using Paparazzi and on-device.

        @Test
        fun launchComposable() {
            paparazzi.snapshot {
                MyComposable()
            }
        }
  7. Test cases: Framework choices

    Shot
    • As close to the “real world” as possible.
    • More difficult setup, especially on CI. (Instrumented tests.)
    • Takes significantly longer to run.
    • More costly. (You’ll probably run this on Firebase Test Lab.)

        @Test
        fun rendersMyComposable() {
            composeRule.setContent {
                MyComposable()
            }
            compareScreenshot(composeRule)
        }
  8. Reporting visual differences: Diffing tools

    Compare two sets of screenshots, make visual differences easy to review.

    01 Built-in reporting (e.g. Paparazzi)
    02 reg-suit
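
To make "compare two sets of screenshots" concrete, here is a minimal shell sketch that classifies files as new, changed, unchanged or deleted. It is an illustration only: it compares files byte-for-byte with `cmp`, which is far cruder than the image diffing a tool like reg-suit performs, and the function name `classify_screenshots`, the directory names and the file names are all hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch: byte-level comparison of two screenshot directories.
# Real diffing tools compare pixels; this only shows the bookkeeping.
set -eu

classify_screenshots() {
  expected_dir="$1"
  actual_dir="$2"
  # Every file in the actual set is new, changed or unchanged.
  for file in "$actual_dir"/*.png; do
    [ -e "$file" ] || continue   # the glob matched nothing
    name="$(basename "$file")"
    if [ ! -e "$expected_dir/$name" ]; then
      echo "new $name"
    elif cmp -s "$expected_dir/$name" "$file"; then
      echo "unchanged $name"
    else
      echo "changed $name"
    fi
  done
  # Files that exist only in the expected set were deleted.
  for file in "$expected_dir"/*.png; do
    [ -e "$file" ] || continue
    name="$(basename "$file")"
    [ -e "$actual_dir/$name" ] || echo "deleted $name"
  done
}

# Demo with placeholder files (not real images).
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/expected" "$demo_dir/actual"
printf 'a' > "$demo_dir/expected/chip_short.png"
printf 'a' > "$demo_dir/actual/chip_short.png"
printf 'b' > "$demo_dir/expected/chip_long.png"
printf 'c' > "$demo_dir/actual/chip_long.png"
printf 'd' > "$demo_dir/expected/chip_old.png"
printf 'e' > "$demo_dir/actual/chip_new.png"

report="$(classify_screenshots "$demo_dir/expected" "$demo_dir/actual")"
echo "$report"
```

Only the "changed", "new" and "deleted" entries need a human look, which is the point of generating a diff report instead of reviewing every screenshot.
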
  9. Reporting visual differences: Where to store screenshot files?

    01 Store directly in git (use git-lfs)
    02 Store externally (e.g. GCS bucket)
  10. Reporting visual differences: What to compare to?

    In a pull request, we want a report of the visual differences between the current state of the branch and when the branch was started.

    (Figure: commit graph of commits 1 to 4, marking the base commit, master HEAD and branch HEAD.)
  11. Generate screenshots

    1. Generate screenshots in all modules.

        ./gradlew recordPaparazzi[Debug/Dev/Release]

    2. Copy screenshots of each module into a separate directory.

        cp ./*/src/test/snapshots/images/*.png ./screenshots

    Or create further subdirectories per module, package, etc.
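
The "subdirectories per module" variant could look roughly like the sketch below. It assumes the `src/test/snapshots/images` layout from the `cp` command above; the function name `collect_screenshots` and the demo module names are hypothetical. Keeping one subdirectory per module means two modules that both produce a file with the same name do not overwrite each other.

```shell
#!/bin/sh
# Hypothetical sketch: copy each module's snapshots into
# <output_dir>/<module name>/ instead of one flat directory.
set -eu

collect_screenshots() {
  project_root="$1"
  output_dir="$2"
  for images_dir in "$project_root"/*/src/test/snapshots/images; do
    [ -d "$images_dir" ] || continue   # no module has snapshots
    # Strip the fixed suffix to recover the module directory, then its name.
    module_dir="${images_dir%/src/test/snapshots/images}"
    module_name="$(basename "$module_dir")"
    mkdir -p "$output_dir/$module_name"
    for png in "$images_dir"/*.png; do
      [ -e "$png" ] || continue        # module has an empty images directory
      cp "$png" "$output_dir/$module_name/"
    done
  done
}

# Demo: two made-up modules that both contain a file named default.png.
demo_root="$(mktemp -d)"
mkdir -p "$demo_root/chip/src/test/snapshots/images"
mkdir -p "$demo_root/button/src/test/snapshots/images"
printf 'chip' > "$demo_root/chip/src/test/snapshots/images/default.png"
printf 'button' > "$demo_root/button/src/test/snapshots/images/default.png"

collect_screenshots "$demo_root" "$demo_root/screenshots"
find "$demo_root/screenshots" -type f | sort
```
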
  12. Generate diff report

    3. Use reg-suit to generate report and store screenshots in the cloud.

        export EXPECTED_KEY=$(git merge-base HEAD origin/master)
        export ACTUAL_KEY=$GITHUB_SHA
        npx reg-suit run
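
To see why `git merge-base` yields the commit the branch started from, even after master has moved on, the following sketch builds a throwaway repository and computes both keys. It is a local illustration under stated assumptions: it compares against a local `master` branch instead of `origin/master`, uses `git rev-parse HEAD` in place of `$GITHUB_SHA` because no CI environment is available, and the `commit` helper, branch name `feature` and commit messages are made up.

```shell
#!/bin/sh
# Hypothetical sketch: derive the expected/actual keys in a scratch repository.
set -eu

repo_dir="$(mktemp -d)"
cd "$repo_dir"
git init -q
# Name the initial branch master regardless of the local git default.
git symbolic-ref HEAD refs/heads/master

# Helper that creates an empty commit without relying on global git config.
commit() {
  git -c user.name=demo -c user.email=demo@example.com -c commit.gpgsign=false \
    commit -q --allow-empty -m "$1"
}

commit "base commit"
base_commit="$(git rev-parse HEAD)"

git checkout -q -b feature
commit "work on the branch"

git checkout -q master
commit "master moves on"

git checkout -q feature
commit "branch HEAD"

# Same idea as the exports above, against the local master branch.
EXPECTED_KEY="$(git merge-base HEAD master)"
ACTUAL_KEY="$(git rev-parse HEAD)"
export EXPECTED_KEY ACTUAL_KEY

echo "expected key: $EXPECTED_KEY"
echo "actual key:   $ACTUAL_KEY"
```

The expected key stays pinned to the base commit while master advances, so the report only contains changes made on the branch itself.
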
  14. Configuring reg-suit (./regconfig.json)

        {
          …,
          "plugins": {
            "reg-simple-keygen-plugin": {
              "expectedKey": "${EXPECTED_KEY}",
              "actualKey": "${ACTUAL_KEY}"
            },
            …
          }
        }
  15. Configuring reg-suit (./regconfig.json)

        {
          …,
          "plugins": {
            …,
            "reg-publish-gcs-plugin": {
              "bucketName": "gcs-bucket-name"
            },
            …
          }
        }
  16. Configuring reg-suit (./regconfig.json)

        {
          …,
          "plugins": {
            …,
            "reg-notify-github-plugin": {
              "clientId": "your-client-id",
              "setCommitStatus": false,
              "shortDescription": true
            }
          }
        }
  19. Using screenshot tests

    We know how to build the infrastructure, but how do we use it?
  20. How to write tests

        class ChipScreenshotTest {
            @get:Rule
            val paparazzi = MercariPaparazzi()

            // Test cases
        }
  21. How to write test cases

        class ChipScreenshotTest {
            …

            @Test
            fun shortLabel() = paparazzi.snapshot {
                Chip(
                    label = "Foo",
                    selected = false,
                    onSelectionChanged = {}
                )
            }
        }
  22. Current scale

    ~30 secs: time it takes to generate the screenshots.
    ~700: number of screenshots.
  23. Caveats

    1. Code that requires multiple compositions will not render correctly.

        val styles = listOf(Large, Medium, Small)
        var index by remember { mutableStateOf(0) }
        Text(
            maxLines = 1,
            style = styles[index],
            onTextLayout = { textLayoutResult ->
                if (textLayoutResult.hasVisualOverflow) {
                    index = index.plus(1).coerceAtMost(styles.size - 1)
                }
            }
        )
  24. Caveats

    1. Code that requires multiple compositions will not render correctly.
    2. Supporting multiple densities for each test case.

    In addition, there was a bug in Paparazzi that prevented it from working properly with our build cache. Fixed in version 1.3.
  25. What are we missing?

    Full screens: Testing full screens.
    Adoption: Adoption outside of Design System Components.