The Present & Future of AI in Mobile Software

“The future is already here - it’s just not very evenly distributed” - William Gibson

What’s already happening in startups and with early adopters can tell us a lot about where AI is taking our industry. There are also a number of trends in hardware, model capabilities and the economics of AI that can help us predict where we might end up in the future.

Mark Wilcox is a Principal Engineer at Olio. He started working in the mobile software industry a quarter of a century ago. He has seen it change more in the last 6 months than in the previous 10 years, and believes the change is only just getting started.

Leeds Mobile

June 01, 2026

Transcript

  1. Leeds Mobile Meetup · May 2026 · Cover

    The Present and Future of AI in Mobile Software. Six months of change, the new builder's loop, and where this is all heading. $ ./start
  2. Agenda · Entering Plan Mode.

    mobile-dev@meetup $ cat agenda.md
    01 The last six months: what just changed
    02 How teams are actually working: two variants
    03 Mobile-specific bottlenecks: at both edges
    04 Where it's going: models, silicon, on-device
    $ ./start
  3. § 01 · The last six months (Chapter One)

    What changed since Nov '25, and how fast it actually moved.
  4. What changed

    Coding agents became good enough. The most opinionated voices in software went from skeptical to agent-first in months.

    Andrej Karpathy (@karpathy, X, Jan '26): "Went from 80% manual + autocomplete in November to 80% agent coding in a few weeks. Biggest workflow change in two decades of programming."

    David Heinemeier Hansson (@dhh, X, Jan '26): "AI agents really came alive for me. The most exciting thing we've made computers do since we connected them to the internet."

    Linus Torvalds (@torvalds, LKML, May '26): "The patch volume is wild. AI-assisted code is the new normal, even for the kernel."
  5. The leap

    10× capability per dollar, in one year. 3× from algorithmic improvement · 2× from hardware · the rest is price competition.
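The 10× breakdown on this slide leaves the price-competition share unstated. A minimal sketch of the arithmetic, assuming the three factors multiply (the slide does not say how they combine, so the residual below is an illustration, not a figure from the talk):

```python
# Figures quoted on the slide.
total_gain = 10.0        # capability per dollar, in one year
algorithmic_gain = 3.0   # from algorithmic improvement
hardware_gain = 2.0      # from hardware

# Assumption: the factors are multiplicative, so whatever is left over
# after algorithms and hardware is attributed to price competition.
price_competition_gain = total_gain / (algorithmic_gain * hardware_gain)

print(f"price competition: ~{price_competition_gain:.2f}x")
```

Under that assumption, price competition accounts for roughly 1.67× of the overall gain.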
  6. § 02 · How they work now (Chapter Two)

    How do mobile teams work now? Early-adopter mobile teams aren't using AI like autocomplete anymore.
  7. Poll · Raise a hand

    How are you using AI to build apps? // pick the one closest to your daily flow

    A. I'm not.
    B. Glorified auto-complete. const add = (a, b) => { } with the suggestion "return a + b;" (⇥ Tab to accept)
    C. In-IDE agent.
    D. Terminal agent. $ claude → reading 12 files… → drafting diff (+182 / −34)
  8. Two ways to actually use them

    Same tools. Codebase picks the loop. Risk tolerance and existing constraints split early-adopter teams into two camps.

    01 · Conservative: human-in-the-loop. Agent drafts. You review every diff. Tests gate everything.
    Fit: legacy code, regulated industries, mature products.
    Loop: spec → agent → review → tests → merge
    Bet: humans are the source of truth; agents are a force multiplier, not an authority.

    02 · Aggressive: agent-on-agent. Agents review agents. Ship to canary. Fix forward.
    Fit: greenfield, internal tools, prototype velocity.
    Loop: spec → agent → agent → canary → measure → patch
    Bet: next year's model will be good enough to fix our tech debt.
  9. Variant 01 · In action

    The conservative loop, on a real branch.

    claude-code · feature/onboarding-paywall
    mobile-dev@meetup $ add the new paywall variant to onboarding, hook up the experiment
    → reading OnboardingFlow.swift, ExperimentClient.kt, 9 more files
    → drafting changes across 4 files (+182 / −34)
    → running xcodebuild test… ✓ 84 passed
    → running ./gradlew testDebugUnitTest… ✓ 312 passed
    → diff ready. opening for review.
    note · experiment client needed a default case; added one. flag this if you'd rather throw.
    $ /review
  10. Results · Production

    Results at Olio. // five months in · one 9-engineer team · agent-led day-to-day

    >2× throughput per engineer
    −30% AWS spend · infra refactors + perf wins
    100% of frontend engineers now full-stack (React Native devs → Rails PRs)

    ✓ Lowest API response times in company history.
    ✓ Significant reduction in Sentry errors across the stack.
    ✓ Major refactors taken on with confidence, not deferred.
    ✓ Scalable end-to-end test infrastructure built; coverage up across the apps.
  11. § 03 · Mobile bottlenecks (Chapter Three)

    Native app tooling for agents still hasn't caught up.
  12. The bottleneck moved

    Building got faster, scoping and validating… not yet. Just because you can build it, doesn't mean you should. Or that it works everywhere!

    Scope: What & how.
    · PM skills don't provide context and history.
    · Agents aren't doing user research for us.
    · Design can be accelerated, but UX insight is lacking.
    · System design trade-offs need judgement.

    Build: Diffs.
    · Agents write the code.
    · Fixes in minutes, not hours. Features in hours, not days.
    · Developers optimise the process by observing what went wrong.

    Review: Did it work?
    · Right thing built?
    · Anything regress?
    · Will it ship cleanly on every device?
    · Agents testing apps are still clumsy and slow.
  13. $ status --mobile

    Universal challenges, harder on mobile.

    mobile-ai · edges
    [ scope ] platform conventions ⟶ tribal knowledge, sparse in training data
    [ scope ] cross-platform parity ⟶ double the design spec, double the API surface
    [ scope ] "which native API?" ⟶ multiple valid frameworks per task
    [ review ] ui readability ⟶ screenshots or accessibility trees < DOM
    [ review ] real-device UI tests ⟶ flaky, retries eat minutes
    [ review ] store cycles ⟶ patches take hours to days, fix forward is high risk
    → six chokepoints · three at scope, three at review · the middle is fine.
  14. § 04 · Where it's going (Chapter Four)

    Models, weights, silicon. Trends heading towards your device?
  15. The Pareto frontier is flattening.

    [Chart: Arena Elo (1200 to 1500) against price in USD per million tokens ($0.10 to $100, cheaper to pricier), with frontier lines for Jan 2026 and May 2026. Models plotted: Llama 3 8B, Gemma 3 4B, Gemma 3 12B, Gemma 3 27B, Gemini 3 Flash, Gemini 3 Pro.]

    Frontier · Jan 2026. Four months ago: Gemma 3 family at the cheap end; Gemini 3 had just landed (Nov/Dec '25), pushing the ceiling to 1486 Elo.
  16. The Pareto frontier is flattening.

    [Chart: Arena Elo (1200 to 1500) against price in USD per million tokens ($0.10 to $100, cheaper to pricier), with frontier lines for Jan 2026 and May 2026 (today). Models plotted: Llama 3 8B, DeepSeek V4 Flash, Gemma 4 31B, DeepSeek V4 Pro, Gemini 3 Flash, Gemini 3.5 Flash, Claude Opus 4.7.]

    Frontier · Today. Four months later: DeepSeek V4 fills the $0.20–$1 gap; Gemma 4 31B and Opus 4.7 extend the rest. $0.20/M today buys what $1/M did in January.
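The closing claim on this slide ($0.20/M today buys what $1/M did in January) can be turned into a rate. A small sketch, assuming a four-month gap as stated on the slide and, purely for illustration, a constant monthly rate of decline; `price_drop` is a hypothetical helper, and the result is a description of those two data points, not a forecast:

```python
def price_drop(old_price: float, new_price: float, months: float) -> tuple[float, float]:
    """Return (overall price factor, implied constant monthly decline).

    Assumes the price fell by the same fraction every month, which the
    slide does not claim; it only gives the two endpoints.
    """
    factor = old_price / new_price
    monthly_decline = 1 - (new_price / old_price) ** (1 / months)
    return factor, monthly_decline


# Slide figures: $1 per million tokens in January, $0.20 in May.
factor, monthly_decline = price_drop(1.00, 0.20, 4)
print(f"{factor:.0f}x cheaper overall, about {monthly_decline:.0%} cheaper per month")
```

On these numbers the same capability is 5× cheaper, which corresponds to a price falling by roughly a third each month if the decline were steady.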
  17. Open vs proprietary · Intelligence index

    Open weights rapidly catching up to the frontier. // artificialanalysis.ai · frontier models · may 2026

    GPT-5.5            60
    Claude Opus 4.7    57
    Gemini 3.1 Pro     57
    Kimi K2.6          54
    MiMo V2.5 Pro      54
    Claude Opus 4.6    53
    DeepSeek V4 Pro    52
    GLM-5.1            51
    MiniMax M2.7       50

    Legend: proprietary · open-weights. Scale: intelligence index, 0 → 60.
  18. Cloud silicon · Gen by gen

    Cost per token follows the silicon down. // bar = FP8 PFLOPS per chip · $/Mtok indicative · ironwood anchored to Google's published $0.02

    '24 H1   TPU Trillium · v6     Google      $0.050 /Mtok
    '24 Q4   AWS Trainium 2        AWS         $0.080 /Mtok
    '25 Q2   TPU Ironwood · v7     Google      $0.020 /Mtok
    '26 Q1   Microsoft Maia 200    Microsoft   $0.015 /Mtok
    '26 H2   AWS Trainium 3        AWS         $0.040 /Mtok
    '26 Q4   TPU Zebrafish · v8i   Google      $0.012 /Mtok

    Legend: google · tpu, aws · trainium, microsoft · maia. Cost: vendor claims + estimates.
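To see how steeply each vendor's cost falls from one generation to the next, the slide's figures can be compared pairwise. A minimal sketch using only the $/Mtok values shown on the slide (which it labels as vendor claims and estimates); `generation_ratios` is a hypothetical helper name:

```python
# (period, chip, vendor, usd_per_mtok), copied from the slide.
chips = [
    ("'24 H1", "TPU Trillium v6", "google", 0.050),
    ("'24 Q4", "AWS Trainium 2", "aws", 0.080),
    ("'25 Q2", "TPU Ironwood v7", "google", 0.020),
    ("'26 Q1", "Microsoft Maia 200", "microsoft", 0.015),
    ("'26 H2", "AWS Trainium 3", "aws", 0.040),
    ("'26 Q4", "TPU Zebrafish v8i", "google", 0.012),
]


def generation_ratios(rows):
    """Compare each chip's $/Mtok with the same vendor's previous chip.

    Rows must be in chronological order. Vendors with a single chip
    (Microsoft here) produce no ratio.
    """
    last_cost = {}
    ratios = []
    for _period, chip, vendor, cost in rows:
        if vendor in last_cost:
            ratios.append((chip, last_cost[vendor] / cost))
        last_cost[vendor] = cost
    return ratios


for chip, ratio in generation_ratios(chips):
    print(f"{chip}: {ratio:.1f}x cheaper than the vendor's previous generation")
```

On the slide's numbers, each new generation is between about 1.7× and 2.5× cheaper per token than the one before it from the same vendor.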
  19. On-device · Open questions

    How much really goes on-device? More questions than answers. The pros and cons depend on what your app actually does.

    ? Will every app ship its own model? No. The app-size, battery, and update headaches only make sense for a few categories.
    ? Will the built-in models get good? Yes, but will they work well enough for your use case?
    ? Will we need more RAM than devices have today? Likely, for anything close to today's frontier performance. What does that mean for the long tail of low-end devices?
  20. Thanks · The end?

    "Now this is not the end. It is not even the beginning of the end. But it is, perhaps, the end of the beginning." - Winston Churchill, 1942

    /questions · thank you · $ open ./organiser-slides → leeds-mobile.github.io/organiser-slides/#12