True reasons behind software testing

Slide 1

Slide 1 text

Slide 2

Slide 2 text

Us (the startup I work for) vs Them (everyone else 😒)

Slide 3

Slide 3 text

6 years later, mobile engineering at Azimo
● Release once per week
● Crash-free 99.99%
● Decentralised mobile engineering
● QA culture
● Small team

Slide 4

Slide 4 text

“We don’t follow the hype” - one of Azimo’s company values
● No racks or shelves of devices
● No 24/7 monkey runners
● Unit test coverage: probably ~50%
● Keep the number of UI and functional tests as low as possible
Azimo’s company values: For hard working people; Every moment counts; None of us is as smart as all of us; We don’t follow the hype

Slide 5

Slide 5 text

The beginning
● Zero tests
● Crash-free 90-95%
● Monolithic code (classes of ~3k lines of code) 👉
● Release every 1-3 months
[Image: preview of 1/20 of the file; 34 A4 pages when printed]

Slide 6

Slide 6 text

This 💩 code helped the company gain traction 💰.

Slide 7

Slide 7 text

How can the company earn more money? Deliver faster, iterate more often.

Slide 8

Slide 8 text

The problem: release every 1-3 months
The causes I pointed out:
● Zero tests
● Crash-free 90-95%
● Monolithic code
My remedy: “Freeze product development for 3-6 months and let me build this 👉”

Slide 9

Slide 9 text

Next product change in half a year!!! 😱
Can your company afford that?
(*In the end we got 1 full month to improve our codebase.)

Slide 10

Slide 10 text

Why wasn’t our release process stable?
● Centralised QA team (usually available to us on Thursdays)
● Manual testing from the ground up
● A lot of back-and-forth between QA and devs

Slide 11

Slide 11 text

Monday: Start coding a feature
Tuesday: Coding finished
Wednesday: 1 day left for QA, let’s add one more feature…
Thursday: ...
Friday: “It took more than predicted…”
Let’s wait another 6 days.

Slide 12

Slide 12 text

Next Thursday:
QA: “App is crashing.”
Me: “OK, 3 lines of code.”
QA: “Cool, commit it. We will check it next Thursday.”
Next Friday to Wednesday: Let’s add a few more changes...
Next Next Thursday:
QA: “App is crashing.”
Me: “🤬...”

Slide 13

Slide 13 text

Goal #1: stable release cycle, once per month

Slide 14

Slide 14 text

Goal #1: stable release cycle, once per month
How: Unit tests to decrease back-and-forth between QA and devs.
Supporting metric: unit test coverage

Slide 15

Slide 15 text

Our rules for unit testing
● A bug, once found, will never be repeated
● Test logic that is hard to reproduce
● Test tedious things that need to be tested
● Improve code architecture (“if it’s hard to test, it’s a bug”)
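The first rule, that a bug once found is never repeated, is usually enforced with a regression test written alongside the fix. The sketch below is only an illustration in Python: `parse_amount` and the imagined bug report (a crash when an amount is typed with a comma as the decimal separator) are hypothetical and are not taken from Azimo's codebase.

```python
from decimal import Decimal, InvalidOperation


def parse_amount(text: str) -> Decimal:
    """Parse a user-entered amount (hypothetical helper for illustration)."""
    # The fix for the imagined bug: accept "," as a decimal separator.
    normalized = text.strip().replace(",", ".")
    try:
        amount = Decimal(normalized)
    except InvalidOperation:
        raise ValueError(f"not a valid amount: {text!r}") from None
    if amount <= 0:
        raise ValueError("amount must be positive")
    return amount


def test_comma_decimal_separator_is_accepted():
    # Regression test: pins the exact input from the bug report.
    assert parse_amount("12,50") == Decimal("12.50")


def test_garbage_input_is_rejected():
    # Invalid input must surface as a ValueError, not as a crash.
    try:
        parse_amount("abc")
    except ValueError:
        return
    raise AssertionError("expected ValueError for non-numeric input")


if __name__ == "__main__":
    test_comma_decimal_separator_is_accepted()
    test_garbage_input_is_rejected()
    print("regression tests passed")
```

Once such a test is in the suite, the same input cannot break a release again without a failing build pointing at it.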

Slide 16

Slide 16 text

“It’s easy to fool code coverage metrics.”
Yes, unless you have a good reason to use them.
[Image: our goal tracker in 2015/16]

Slide 17

Slide 17 text

Purpose of measuring test coverage and improving it
● Good practice
● Others do this
● Faster product delivery
● Identify what’s not tested
Martin Fowler on test coverage metrics: https://martinfowler.com/bliki/TestCoverage.html

Slide 18

Slide 18 text

Milestone #1: At least 1 release per month
What else:
● Crash-free 95% -> 99.0%
● Better code architecture (MVP, DI, easier testing)
● Unit test coverage 50-60%

Slide 19

Slide 19 text

Goal #2: Release cycle 1 month -> 2 weeks, Crash Free >= 99.0%

Slide 20

Slide 20 text

Goal #2: Release cycle 1 month -> 2 weeks + Crash-free >= 99.0%
How:
● QA testers -> QA engineers
● Reduce manual testing as much as possible

Slide 21

Slide 21 text

QA engineers in the team, why now?
● Not possible before code cleanup
● Without unit tests we would automate the wrong things - see the testing pyramid 👉
● Internal career progression (QA Tester => QA Engineer)
Martin Fowler on the testing pyramid: https://martinfowler.com/bliki/TestPyramid.html

Slide 22

Slide 22 text

QA engineers’ priorities
1. Test new releases (we cannot be slower than 1/mo)
2. 🤖 Automate as much as possible (we have to be faster than 1/mo)

Slide 23

Slide 23 text

Why functional and UI tests?

Slide 24

Slide 24 text

Mobile fragmentation
OpenSignal report on Android fragmentation in 2015 (link)

Slide 25

Slide 25 text

Unit tests aren’t enough (esp. after 50-60% test coverage)

Slide 26

Slide 26 text

Test things in the easiest place to test them
[Diagram: Backend (monolithic system)]

Slide 27

Slide 27 text

UI & functional test coverage - not % but product features
1. Login, registration
2. Price, transaction, payment
3. Everything else (with a focus on the things that take most of our manual testing time)

Slide 28

Slide 28 text

Milestone #2:
Release train - 2 weeks
Stable crash-free - 99.0%
What else:
● QA engineers in the team
● Hundreds of functional and UI tests
● Unit test coverage 60-70%

Slide 29

Slide 29 text

Goal #3: Release cycle, 2 weeks to 1 week, Crash Free >= 99.5%

Slide 30

Slide 30 text

Goal #3: Release cycle, 2 weeks -> 1 week + Crash-free >= 99.5%
How:
● Breaking changes in the testing stack

Slide 31

Slide 31 text

Testing stack was pushed to its limits 🥵
● 5 hours for the full test suite (probably not a single successful run)
● Non-measurable flakiness
● Hard to debug (esp. AVDs, ADB)
● No internal competencies to improve test run management (Fastlane/Ruby)

Slide 32

Slide 32 text

AutomationTestSupervisor
● Configurable test sharding
● Re-running failing tests
● Multi-level logging: AVD, ADB, app logs
● Testing stack as code (AVD management, test package split)
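Two of the ideas named on this slide, sharding and re-running failing tests, can be sketched in a few lines of Python. This is not AutomationTestSupervisor's actual code: `shard`, `run_with_retries` and the fake runner are hypothetical names used for illustration, and the shards are processed sequentially here, whereas a real setup would give each shard its own emulator.

```python
from typing import Callable, Dict, List, Optional


def shard(tests: List[str], shard_count: int) -> List[List[str]]:
    """Split test names round-robin into shard_count groups."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    return [tests[i::shard_count] for i in range(shard_count)]


def run_with_retries(
    tests: List[str],
    run_test: Callable[[str], bool],
    max_attempts: int = 3,
) -> Dict[str, Optional[int]]:
    """Run tests, re-running only the failures.

    Returns the attempt number on which each test passed, or None if it
    never passed. A test that passes on attempt 2 or later is flaky.
    """
    results: Dict[str, Optional[int]] = {}
    pending = list(tests)
    for attempt in range(1, max_attempts + 1):
        still_failing = []
        for name in pending:
            if run_test(name):
                results[name] = attempt
            else:
                still_failing.append(name)
        pending = still_failing
        if not pending:
            break
    for name in pending:
        results[name] = None
    return results


if __name__ == "__main__":
    # Fake runner: each test fails a fixed number of times before passing.
    failures_left = {"test_login": 0, "test_payment": 1, "test_price": 5}

    def fake_runner(name: str) -> bool:
        if failures_left[name] > 0:
            failures_left[name] -= 1
            return False
        return True

    for group in shard(list(failures_left), 2):
        print(group, run_with_retries(group, fake_runner))
```

Recording the attempt number, rather than only pass or fail, is what makes flakiness measurable: the tests that needed a retry are exactly the flaky ones.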

Slide 33

Slide 33 text

AutomationTestSupervisor
1. A few months of development
2. Ruby migrated to Python (our competency)
3. Logs which work for us
4. 5-6 parallel emulators on a maxed-out MacBook Pro
5. Testing time reduced by 50% (2-3hrs now)
Full blog post about ATS (link)
AutomationTestSupervisor on GitHub (link)

Slide 34

Slide 34 text

How about iOS?
1. 2017 (Xcode 9) - test sharding via command line
2. 2018 (Xcode 10) - test sharding integrated in Xcode UI
3. 4hrs reduced to 1hr due to parallelisation
Blog post with full coverage of iOS parallel testing (link)

Slide 35

Slide 35 text

Milestone #3:
Release train - 1 week
Crash-free - 99.5%
What else:
● 2-3hrs for QA
● Full control over the testing stack
● Test parallelisation
● Emerging picture of flakiness

Slide 36

Slide 36 text

Goal #4: Decentralise the mobile engineering team
Keep 1 week release and crash-free 99.5%
How:
● Process speed-up through simplifications

Slide 37

Slide 37 text

Challenges
● 20% flakiness
● Overgrown test suite - too many tests, too long to run
● Custom test stack

Slide 38

Slide 38 text

Test things in the right (not just the easiest) place to test them
[Diagram: Backend (microservices system)]

Slide 39

Slide 39 text

Speed up test configuration phase
[Diagram: a UI test that walks through Launch app, Create recipient, Make payment, Main screen and Check price before its assertions on transfer status, compared with a test that uses QA utils preconfiguration (microservice) and goes to the Transfer status screen. Captions: configuration, “Do what you need to do”, assertions, +1s added, 30s saved.]

Slide 40

Slide 40 text

Remove unnecessary tests

Slide 41

Slide 41 text

"Adding is favoured over subtracting in problem solving" “(...) subtractive solutions are also less likely to be appreciated. People might expect to receive less credit for subtractive solutions than for additive ones. A proposal to get rid of something might feel less creative than would coming up with something new to add(...)” Nature | Vol 592 | April 2021 https://www.nature.com/articles/d41586-021-00592-0

Slide 42

Slide 42 text

Results after test count reduction and simplification
● 20% flakiness -> 10%
● Test execution < 2hrs
● Removed ~100 UI/functional tests (~20%) while the product was growing
Our thoughts on flakiness (link)

Slide 43

Slide 43 text

Milestone #4:
Decentralised team (2 teams work in parallel)
Release train - 1 week
Stable crash-free - 99.5%
What else:
● QA test run < 2hrs
● Flakiness < 10%
● Tests in the right place

Slide 44

Slide 44 text

Goal #5: Test the team’s scalability (2 -> 3 teams working in parallel)
+ Keep 1 week release train
+ Improve crash-free to > 99.9%

Slide 45

Slide 45 text

AutomationTestSupervisor - when the unlocker becomes an obstacle
● Keeping it up to date (SDKs, dev tools, emulators)
● Limitations of local machines
● No support from the outside world (we’re still a team of 5)

Slide 46

Slide 46 text

Flakiness 10%
300 tests = 30 flaky tests
👇
QA engineers’ insights and manual re-tests double or triple the testing time
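The arithmetic on this slide can be worked through as a rough model. In the sketch below, the 300 tests and 10% flakiness come from the slide; the 120-minute automated run and the 4 or 8 minutes of manual follow-up per flaky test are assumptions chosen only to show how the total could double or triple, not measured figures.

```python
def flaky_test_count(total_tests: int, flakiness_percent: int) -> int:
    """Number of tests expected to need a second look in one run."""
    return total_tests * flakiness_percent // 100


def total_testing_minutes(
    suite_minutes: int, flaky_tests: int, minutes_per_retest: int
) -> int:
    """Automated run time plus manual follow-up on every flaky test."""
    return suite_minutes + flaky_tests * minutes_per_retest


flaky = flaky_test_count(300, 10)
print(flaky)  # 30 flaky tests, as on the slide

suite = 120  # assumed: a 2-hour automated run
print(total_testing_minutes(suite, flaky, 4))  # 240 minutes, double the run
print(total_testing_minutes(suite, flaky, 8))  # 360 minutes, triple the run
```

Under the same assumptions, 2% flakiness leaves 6 tests to re-check, so the manual overhead shrinks from hours to minutes.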

Slide 47

Slide 47 text

Firebase Test Lab

Slide 48

Slide 48 text

Firebase Test Lab: one step backward
● No control over ADB (e.g. resetting the app or device state)
● Test debugging 😱 😱 😱 (scrolling through videos, kilometers of logs)
● Harder to get data for our reports (no webhooks, just scraping data from console output)

Slide 49

Slide 49 text

Firebase Test Lab: one step forward
● Support from the community https://firebase.community/
● QA engineer’s machine isn’t blocked
● Sharing test results with software engineers by copy/pasting URLs (remote work)
● Unlimited scaling. We effortlessly increased from 5 emulators on the local machine to 20 on Firebase Test Lab

Slide 50

Slide 50 text

Results?
● Testing time 2h -> 25min
● Flakiness 10% -> 2% (thanks to faster iterations)

Slide 51

Slide 51 text

💰 $400/mo - is it a lot?
● QA machine is not blocked anymore
● Software engineers get feedback 8x faster
● No maintenance freezes = a few hundred hours saved per year

Slide 52

Slide 52 text

Firebase Test Lab: implementation details
● Test sharding with Flank https://flank.github.io/flank/
● Do we really need video? Better logs instead 👉

Slide 53

Slide 53 text

Milestone #5:
Decentralised team
Release train - 1 week
Stable crash-free > 99.99% 💪
What else:
● QA test run < 30min
● Flakiness 1-2%

Slide 54

Slide 54 text

Goal #6 (the current one): Release every day (if we want to) + Keep crash-free > 99.99%

Slide 55

Slide 55 text

How to release a change within 24hrs?
● Low flakiness and <30min test results mean:
  ○ QA self-service for software engineers
  ○ QA engineers responsible for the test stack, not just testing
● Better code review process 👉
● And others: multi-module project, CI/CD in the cloud, Kotlin
How we improved our code review process (link)

Slide 56

Slide 56 text

6 years later, mobile engineering at Azimo
● Release once per week (soon: on demand)
● Crash-free 99.99%
● Decentralised mobile engineering
● QA culture
● Small team

Slide 57

Slide 57 text

The journey
● Release 1/month, crash-free > 95%
● Release 1/2 weeks, crash-free 99.0%
● Release 1/week, crash-free 99.5%
● Release 1/week, crash-free 99.5%, 2 parallel teams
● Release 1/week, crash-free 99.99%, 2-3 parallel teams
● (ongoing) Release on demand & crash-free 99.99%

Slide 58

Slide 58 text

Don’t follow the hype. Be 1% better each day.

Slide 59

Slide 59 text

References
AzimoLabs blog
- Series about testing history (5 articles) https://medium.com/azimolabs/the-evolution-of-apps-quality-assurance-at-azimo-b2fa31d5cc5e
- Code review process improvements https://medium.com/azimolabs/how-we-improved-code-review-process-in-android-engineering-team-a637dd68cfaa
- Parallel testing of iOS app https://medium.com/azimolabs/parallel-testing-get-feedback-earlier-release-faster-b66d4dd08930
- Story behind AutomationTestSupervisor https://medium.com/azimolabs/story-behind-automationtestsupervisor-our-custom-made-tool-for-android-automation-tests-180c74a5cbfb
- What is flakiness https://medium.com/azimolabs/what-is-flakiness-and-how-we-deal-with-it-39b270ed5445
Martin Fowler blog
- Test coverage https://martinfowler.com/bliki/TestCoverage.html
- Test pyramid https://martinfowler.com/bliki/TestPyramid.html
Nature
- Adding is favoured over subtracting in problem solving https://www.nature.com/articles/d41586-021-00592-0

Slide 60

Slide 60 text

Thank you!
twitter.com/froger_mcs