
Why you should run A/B tests in your Android apps

Jamie Adkins

October 06, 2022

Transcript

  1. What we’ll cover
     • Surprises: surprising results from experiments we’ve run at Deliveroo
     • Learnings: what we learnt from those experiments
     • Best practices: what you should know about before running your own experiments
  2. About Deliveroo
     • 100k+ restaurants & 150k+ riders
     • 11 markets
     • 3 Android apps
     • ~200 experiments per year
  3. What is an A/B test?
     • A way to compare multiple versions of a feature against each other
     • Both versions are live at the same time to different users
       ◦ Compare apples to apples
     • Analyse results and release the best-performing variant
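
The deck doesn’t show code, but the client-side shape of an A/B gate is worth sketching. A minimal Kotlin sketch, where ExperimentClient, variantFor and the button functions are all hypothetical names, not Deliveroo’s actual API:

    // Hypothetical client-side gate for an A/B test. The experiment
    // platform assigns each user a stable variant; the app only branches.
    enum class Variant { CONTROL, TREATMENT }

    interface ExperimentClient {
        // Must return the same variant for the same user for the whole
        // experiment, so both groups stay comparable (apples to apples).
        fun variantFor(experimentName: String): Variant
    }

    fun renderCheckoutButton(experiments: ExperimentClient) {
        when (experiments.variantFor("new_checkout_button")) {
            Variant.CONTROL -> showExistingButton()     // current design
            Variant.TREATMENT -> showRedesignedButton() // candidate design
        }
    }

    // Stubs so the sketch compiles on its own.
    fun showExistingButton() = println("control UI")
    fun showRedesignedButton() = println("treatment UI")
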
  4. What did we do?
     1. Look at customer feedback
     2. Measure how big of a problem it is
     3. Save the basket to disk (sketched below)
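
The deck doesn’t include the implementation of step 3; here is a minimal sketch of persisting a basket across process death, assuming SharedPreferences plus kotlinx.serialization (the Basket fields are invented for illustration):

    import android.content.Context
    import kotlinx.serialization.Serializable
    import kotlinx.serialization.decodeFromString
    import kotlinx.serialization.encodeToString
    import kotlinx.serialization.json.Json

    @Serializable
    data class Basket(val restaurantId: String, val itemIds: List<String>)

    class BasketStore(context: Context) {
        private val prefs =
            context.getSharedPreferences("basket", Context.MODE_PRIVATE)

        // Serialise to JSON so the basket survives process death and restarts.
        fun save(basket: Basket) {
            prefs.edit().putString("basket_json", Json.encodeToString(basket)).apply()
        }

        // Null when there is no saved basket to restore.
        fun restore(): Basket? =
            prefs.getString("basket_json", null)
                ?.let { Json.decodeFromString<Basket>(it) }
    }
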
  5. What is happening during the splash screen?
     • We need to do some important work when you open the app
     • Fetching the user’s location
     • Fetching feature flag values
     • We have a pretty animation to hide that work
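
One way to shrink the work that animation has to hide is to run the fetches concurrently. A sketch with Kotlin coroutines; the fetch functions and StartupData are hypothetical stand-ins:

    import kotlinx.coroutines.async
    import kotlinx.coroutines.coroutineScope
    import kotlinx.coroutines.delay

    data class Location(val lat: Double, val lng: Double)
    data class StartupData(val location: Location, val flags: Map<String, Boolean>)

    // Stubs standing in for real network/device calls.
    suspend fun fetchUserLocation(): Location { delay(100); return Location(51.5, -0.1) }
    suspend fun fetchFeatureFlagValues(): Map<String, Boolean> { delay(100); return emptyMap() }

    // Run both fetches in parallel instead of one after the other,
    // so total splash time is max(fetches), not their sum.
    suspend fun warmUp(): StartupData = coroutineScope {
        val location = async { fetchUserLocation() }
        val flags = async { fetchFeatureFlagValues() }
        StartupData(location.await(), flags.await())
    }
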
  6. Learnings
     • Time To Interactive matters!
       ◦ This was one of our most successful experiments of 2021
     • Only build as much as you need to be able to run the experiment
       ◦ We only did this initially on iOS
       ◦ You need to be confident that the feature works, but maybe those UI tests can wait until later?
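
On Android, one concrete hook for Time To Interactive is Activity.reportFullyDrawn(): call it when the screen is genuinely usable, and startup tooling (for example Macrobenchmark’s time-to-full-display metric) can measure against it. A sketch; when exactly you call it is your definition of “interactive”:

    import android.os.Bundle
    import androidx.appcompat.app.AppCompatActivity

    class HomeActivity : AppCompatActivity() {
        override fun onCreate(savedInstanceState: Bundle?) {
            super.onCreate(savedInstanceState)
            // ... inflate UI and kick off initial loading ...
        }

        // Call once the user can actually interact with real content,
        // not merely when the first frame is drawn.
        private fun onInitialContentLoaded() {
            reportFullyDrawn()
        }
    }
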
  7. Android 12 Splash
     Android 12 added a mandatory splash screen that displays until your app draws its first frame.
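
Adopting the Android 12 splash format is done through the androidx.core:core-splashscreen compat library. A minimal sketch using its documented entry points; note that installSplashScreen() must run before super.onCreate():

    import android.os.Bundle
    import androidx.appcompat.app.AppCompatActivity
    import androidx.core.splashscreen.SplashScreen.Companion.installSplashScreen

    class MainActivity : AppCompatActivity() {

        private var startupComplete = false

        override fun onCreate(savedInstanceState: Bundle?) {
            // Must be called before super.onCreate().
            val splashScreen = installSplashScreen()
            super.onCreate(savedInstanceState)

            // Hold the system splash on screen until startup work finishes,
            // instead of playing a separate in-app animation afterwards.
            splashScreen.setKeepOnScreenCondition { !startupComplete }
        }
    }
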
  8. Android 12 Splash
     Can we achieve similar TTI improvements as the iOS preloading experiment, without doing all the work?
     Adopt the new Android 12 splash format -> shorter animation -> lower TTI -> more orders?
  9. Learnings
     • Users do care about splash screens!
       ◦ This was one of our worst experiments of 2021
     • Negative results are not necessarily bad
     • Challenge your assumptions
       ◦ We almost shipped this without experimenting
     • Knowing the exact impact of your work is engaging
  10. To switch to Silver tier, existing customers have to …
     1. Cancel their current subscription 🛑
     2. Wait for their subscription cycle to finish … ⏱
     3. Sign up to the Silver plan … 👆
  11. Hypothesis
     The ability to switch plans will decrease cancellations and thus retain users on Plus for longer.
     [Screenshots: Control vs. Variant]
  12. Learnings
     • An experiment can produce unexpected results on other tracked metrics, which can be used for customer insight and for building future experiments
     • Solving a known customer problem is a big win! For example, customers will no longer contact our Care teams to ask to switch tiers
  13. How do we feel about pop-ups?
     Our user perspective:
     • Can be intrusive and interfere with the app experience
     • Can be frustrating when irrelevant or too frequent
     Our developer perspective:
     • Pop-ups, dialogs and banners can be difficult to work with if there are lots of them, especially if using third-party tools to drive in-app messages
     • Can be hard to test
     • Can be hard to make accessible
     • Dialogs already used for permissions and errors can overlap with them
  14. The Plus Pop-up Experiment
     Background: We had a number of promotional pop-ups on the basket screen.
     Hypothesis: Removing these pop-ups will provide insight into how valuable they are in increasing Plus subscriptions.
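
The removal experiment itself is cheap to build: it is just a flag around the existing pop-up. A sketch reusing the hypothetical ExperimentClient from earlier; showPlusPromoDialog is also an invented name:

    // Control keeps today’s behaviour; treatment suppresses the pop-up,
    // and we watch Plus sign-ups and conversion to learn its real value.
    fun maybeShowPlusPromo(experiments: ExperimentClient) {
        if (experiments.variantFor("remove_basket_popups") == Variant.CONTROL) {
            showPlusPromoDialog()
        }
        // Treatment arm: intentionally show nothing.
    }

    fun showPlusPromoDialog() = println("promo pop-up shown") // stub
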
  15. Learnings
     • Challenge your assumptions
       ◦ Pop-ups and dialogs can be powerful, even if devs dislike them
       ◦ Experiment on the value of pop-ups to understand their impact
  16. Why should you run A/B tests?
     • Evaluate the value of the features that are rolled out
     • Learnings from one experiment often inform the next one
     • Data-driven decision making
     • Apples-to-apples comparison
  17. Best Practices
     • Decide your metrics before you run the experiment
     • Power analysis
       ◦ How long does our experiment need to run for? (see the sketch below)
     • More effort does not equal better results
     • Avoid overlapping interactions between experiments
     • Clean up after the experiment finishes
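
Power analysis for a conversion metric boils down to a standard two-proportion sample-size calculation. A sketch using the usual normal approximation; the baseline and lift numbers are made up for illustration:

    import kotlin.math.ceil
    import kotlin.math.pow

    // Users needed per variant to detect an absolute lift `delta` over a
    // baseline conversion rate `p`, at 5% two-sided significance (z = 1.96)
    // and 80% power (z = 0.84), using the normal approximation.
    fun sampleSizePerVariant(p: Double, delta: Double): Int {
        val z = 1.96 + 0.84
        val pBar = p + delta / 2 // average rate across the two arms
        val n = 2 * z.pow(2) * pBar * (1 - pBar) / delta.pow(2)
        return ceil(n).toInt()
    }

    fun main() {
        // e.g. 10% baseline, hoping to detect a +1 point absolute lift:
        // ~14,700 users per variant, which fixes how long we must run.
        println(sampleSizePerVariant(p = 0.10, delta = 0.01))
    }
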
  18. Determinator
     https://github.com/deliveroo/determinator
     • Open-source backend tool for determining which experiments/flags are enabled for a given user
     • Isolated systems will have the same outcomes for every user, feature flag and experiment
     • When increasing and lowering rollout fractions, the same users will be included and excluded at each fraction
     • Other tools are available!
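
Determinator itself is a Ruby library, but the deterministic-bucketing idea behind those guarantees is easy to sketch in Kotlin (this illustrates the concept, not Determinator’s actual algorithm):

    import java.security.MessageDigest

    // Hash feature + user to a stable position in [0, 1). Every isolated
    // system computing this gets the same answer, and raising the rollout
    // fraction only adds users; it never reshuffles the existing ones.
    fun rolloutIndicator(featureId: String, userId: String): Double {
        val digest = MessageDigest.getInstance("SHA-256")
            .digest("$featureId:$userId".toByteArray())
        val firstFourBytes = digest.take(4)
            .fold(0L) { acc, b -> (acc shl 8) or (b.toLong() and 0xff) }
        return firstFourBytes / 4294967296.0 // normalise by 2^32
    }

    fun isEnabled(featureId: String, userId: String, rolloutFraction: Double) =
        rolloutIndicator(featureId, userId) < rolloutFraction

    fun main() {
        // The indicator is fixed per (feature, user), so anyone enabled
        // at a 10% rollout is still enabled when it grows to 50%.
        println(isEnabled("new_splash", "user-42", 0.10))
        println(isEnabled("new_splash", "user-42", 0.50))
    }
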