Why you should run A/B tests in your Android apps

Ed, Connie and Jamie

What we’ll cover
• Surprises: surprising results from experiments we’ve run at Deliveroo
• Learnings: what we learnt from those experiments
• Best Practices: best practices that you should know about before running your own experiments

About Deliveroo
• 100k+ restaurants & 150k+ riders
• 11 markets
• 3 Android apps
• ~200 experiments per year

Edward Harker, Android Engineer - Menu
Connie Reinholdsson, Android Engineer - Loyalty
Jamie Adkins, Android Engineer - Home

Experiment #1 Changing a button color

We can run an A/B test!

What is an A/B test?
• A way to compare multiple versions of a feature against each other
• Both versions are live at the same time to different users
  ◦ Compare apples to apples
• Analyse results and release best performing variant
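
The "analyse results" step usually means checking whether the gap between control and variant is bigger than chance alone would explain. Below is a minimal sketch, assuming a yes/no metric such as sign-ups and a pooled two-proportion z-test. The function name and the counts are hypothetical, and this is not necessarily how Deliveroo analyses its experiments.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) comparing control (a) with variant (b)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts, for illustration only.
z, p = two_proportion_z_test(conv_a=1000, n_a=10000, conv_b=1100, n_b=10000)
```

A small p-value suggests the variant really did change the sign-up rate. The threshold for "small" should be agreed before the experiment starts.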

Control Variant

Did more people sign up?

More sign ups :)

A small change can have a big impact!

Experiment #2 Baskets

What did we do?
1. Look at customer feedback
2. Measure how big of a problem it is
3. Save the basket to disk

Did people order more food?

More orders :)

State restoration matters!

Experiment #3 Preloading the home feed

What is happening during the splash screen?
• We need to do some important work when you open the app
  ◦ Fetching the user’s location
  ◦ Fetching feature flag values
• We have a pretty animation to hide that work

30% improvement in Time To Interactive

More orders :)

Learnings
• Time To Interactive matters!
  ◦ This was one of our most successful experiments of 2021
• Only build as much as you need to be able to run the experiment
  ◦ We only did this initially on iOS
  ◦ You need to be confident that the feature works, but maybe those UI tests can wait until later?

Experiment #4 Android 12 Splash

Android 12 Splash
Android 12 added a mandatory splash screen that displays until your app draws its first frame

We already had a splash screen
Weird double splash screen effect that feels odd :(

Even worse for Plus users
Teal -> Purple transition really doesn’t work

Android 12 Splash
Can we achieve similar TTI improvements to the iOS preloading experiment, without doing all the work?
Adopt new Android 12 splash format -> shorter animation -> lower TTI -> more orders?

10% improvement in Time To Interactive

Fixed double splash effect

But we removed the Plus animation

Fewer orders :(

Plus users ordered less often

Learnings
• Users do care about splash screens!
  ◦ This was one of our worst experiments of 2021
• Negative results are not necessarily bad
• Challenge your assumptions
  ◦ We almost shipped this without experimenting
• Knowing the exact impact of your work is engaging

Experiment #5 Allowing users to easily switch between Plus subscription tiers

Deliveroo Plus

Pick Plus subscription …

Unlimited free delivery 🎉

To switch to Silver tier, existing customers have to …
1. Cancel their current subscription 🛑
2. Wait for their subscription cycle to finish … ⏱
3. Sign up to the Silver plan … 👆

Experiment: Allow users to easily switch between subscription tiers

Hypothesis
The ability to switch plans will decrease cancellations and thus retain users on Plus for longer.
Control Variant

Did Plus retention increase or decrease?

Plus retention increased

And … more users switched from Silver to Gold!

Learnings
• An experiment can produce unexpected results on other tracked metrics, which can be used for customer insight and for building future experiments
• Solving a known customer problem is a big win! For example, customers will no longer need to contact our Care teams to ask to switch tiers

Experiment #6 Showing a Plus subscription pop-up

How do we feel about pop-ups?
Source: https://wisepops.com/mobile-popup/

How do we feel about pop-ups?
Our user perspective:
• Can be intrusive and interfere with the app experience
• Can be frustrating when irrelevant or too frequent
Our developer perspective:
• Pop-ups, dialogs and banners can be difficult to work with if there are lots of them, especially if using third-party tools to drive in-app messages
• Can be hard to test
• Can be hard to make accessible
• We already use dialogs for permissions and errors, so pop-ups can overlap with them

The Plus Pop-up Experiment
Background: We had a number of promotional pop-ups on the basket screen
Hypothesis: Removing these pop-ups will provide insight on how valuable they are in increasing Plus subscriptions

Did removing the pop-up increase or decrease Plus subscriptions?

Plus subscriptions decreased

Learnings
• Challenge your assumptions
  ◦ Pop-ups and dialogs can be powerful, even if devs dislike them
  ◦ Experiment on the value of pop-ups to understand their impact

Takeaways
Pun not intended

Why should you run A/B tests?
• Evaluate the value of the features that are rolled out
• Learnings from one experiment often inform the next one
• Data-driven decision making
• Apples to apples comparison

Best Practices
• Decide your metrics before you run the experiment
• Power analysis
  ◦ How long does our experiment need to run for?
• More effort does not equal better results
• Avoid overlapping interaction between experiments
• Clean up after the experiment finishes
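
The power analysis question ("how long does our experiment need to run for?") can be roughed out before any code ships. The sketch below uses the standard normal-approximation formula for comparing two proportions. It assumes a yes/no metric, a two-sided test and equal-sized groups, and all function names, default values and traffic numbers are hypothetical.

```python
import math
from statistics import NormalDist


def users_per_variant(baseline_rate, relative_lift, alpha=0.05, power=0.8):
    """Approximate users needed in each group to detect the given lift."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    # Critical values for the significance level and the desired power.
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def days_to_run(users_needed_per_variant, daily_users, variants=2):
    """Days of traffic needed if every daily user enters the experiment."""
    return math.ceil(users_needed_per_variant * variants / daily_users)


# Hypothetical inputs: 10% baseline conversion, hoping to detect a 10% relative lift.
needed = users_per_variant(baseline_rate=0.10, relative_lift=0.10)
days = days_to_run(needed, daily_users=5000)
```

Smaller expected lifts need far more users: halving the lift roughly quadruples the sample size, so it pays to decide on the smallest effect worth detecting up front.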

Determinator
https://github.com/deliveroo/determinator
• Open source backend tool for determining which experiments/flags are enabled for a given user
• Isolated systems will have the same outcomes for every user, feature flag and experiment
• When increasing and lowering rollout fractions, the same users will be included and excluded at each fraction
• Other tools are available!
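
The two properties above (same outcome on isolated systems, stable inclusion as the rollout fraction changes) are typically achieved by hashing the user and experiment instead of storing assignments. The following is a sketch of that general idea only. It is not Determinator's actual algorithm or API, and the function names, hash choice and key format are assumptions made for illustration.

```python
import hashlib


def _unit_interval(*parts):
    """Map the given strings to a stable number in [0, 1)."""
    digest = hashlib.sha256(",".join(parts).encode("utf-8")).digest()
    # 48 bits fit exactly in a float, so the result is always below 1.0.
    return int.from_bytes(digest[:6], "big") / 2**48


def assign(experiment, user_id, rollout, variants=("control", "variant")):
    """Return the user's variant, or None if they are outside the rollout."""
    # Inclusion depends only on (experiment, user), so raising the rollout
    # adds users without dropping anyone who was already included.
    if _unit_interval(experiment, user_id, "rollout") >= rollout:
        return None
    # A separate hash picks the variant, independently of inclusion.
    position = _unit_interval(experiment, user_id, "variant")
    return variants[int(position * len(variants))]


# Hypothetical experiment name and user id.
first = assign("new_splash", "user-42", rollout=0.5)
second = assign("new_splash", "user-42", rollout=0.5)  # same answer on any server
```

Because nothing is stored, any service that knows the experiment name and the user id reaches the same answer without coordinating with the others.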

Any questions? Also, we’re hiring! careers.deliveroo.co.uk
