Mobile Continuous Integration at SoundCloud

In the past two years at SoundCloud, we've shipped a brand new iPhone and iPad app and completely transformed the way we work. The team grew from 3 to 10 developers, and we've moved away from pull requests and now rely heavily on automated testing. As SoundCloud grew, its backend infrastructure became more complex with the addition of many microservices. Ensuring that our apps work with these microservices in production is key to SoundCloud's continuing success. That's why we're building custom continuous integration solutions for our mobile apps.

https://www.youtube.com/watch?v=Rq721qtKKNk

Vincent Garrigues

September 16, 2015

Transcript

  1. None
  2. Mobile Continuous Integration at SoundCloud • NSSpain • 2015-09-16 • Vincent Garrigues @garriguv

  3. Why invest in a mobile CI pipeline?

  4. • move fast • keep codebase healthy • confidence • ship reliable apps
  5. • move fast • keep codebase healthy • confidence • ship reliable apps
  6. None
  7. None
  8. • started from scratch • months in development • millions of users
  9. [Chart] iOS Crash Complaints (avg per week), y-axis 0 to 140, April to August 2014 (SoundCloud community team)
  10. [Chart] iOS Crash Complaints (avg per week), y-axis 0 to 140, April 2014 to August 2015 (SoundCloud community team)
  11. None
  12. How do we work?

  13. Theory

  14. In practice

  15. • pairing • short lived branches • feature flags

  16. Feature flagging github.com/michaelengland/fliptheswitch

        - (id<AnalyticsProviderInterface>)soundcloudInternalProvider {
            if ([FlipTheSwitch isDevEventgatewayEnabled]) {
                return [self eventGatewayProvider];
            } else {
                return [self eventLoggerProvider];
            }
        }

  17. Feature flagging github.com/michaelengland/fliptheswitch

  18. Timeline

  19. 2012 2013 2014 2015

  20. started very simple

  21. None
  22. None
  23. • unit tests • α and β builds • app store build
  24. Overview of the current system

  25. [Pipeline diagram] build • analysis • unit tests • acceptance tests • AppStore • AdHoc • α and β
  26. [Pipeline diagram] build • analysis • unit tests • acceptance tests • AppStore • AdHoc • α and β (durations shown: ~10 min, ~10 min, ~3 min, ~20 min)
  27. Let's have a look at the details

  28. [Pipeline diagram] build • analysis • unit tests • acceptance tests • AppStore • AdHoc • α and β
  29. [Diagram] build • unit tests • acceptance tests • build • linter • dependencies • local libraries • unit tests • i18n • push
  30. [Diagram] build • unit tests • acceptance tests • build • linter • dependencies • local libraries • unit tests • i18n • push
  31. [Diagram] build • unit tests • acceptance tests • build • linter • dependencies • ~3000 • local libraries • unit tests • i18n • push
  32. [Diagram] build • unit tests • acceptance tests • build • linter • dependencies • ~3000 • ~7000 • local libraries • unit tests • i18n • push
  33. [Diagram] build • unit tests • acceptance tests • build • linter • dependencies • ~3000 • local libraries • unit tests • i18n • push • ~7000
  34. [Diagram] build • unit tests • acceptance tests • build • linter • dependencies • ~3000 • local libraries • unit tests • i18n • push • ~7000
  35. [Pipeline diagram] build • analysis • unit tests • acceptance tests • AppStore • AdHoc • α and β
  36. • frank (http://www.testingwithfrank.com) • cucumber (https://cukes.info)

  37. None
  38. • flexible platform & OS • bandwidth and reachability settings • launch arguments
  39. mobile api proxy. The proxy can change: • response status code • response body • record and undo actions (like, repost…)
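
To make the proxy idea on slide 39 concrete, here is a minimal sketch of how a test-time proxy might override a response's status code or body and keep a log of user actions to undo later. The slides do not show the proxy's implementation; every name here (`Response`, `Override`, `ProxyRules`, the `/tracks` path) is a hypothetical chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Response:
    status: int
    body: str


@dataclass
class Override:
    """Hypothetical rule: rewrite responses for paths starting with `path_prefix`."""
    path_prefix: str
    status: Optional[int] = None  # None means "keep the upstream status"
    body: Optional[str] = None    # None means "keep the upstream body"


@dataclass
class ProxyRules:
    overrides: List[Override] = field(default_factory=list)
    recorded_actions: List[Tuple[str, str]] = field(default_factory=list)

    def apply(self, path: str, upstream: Response) -> Response:
        """Return the upstream response, rewritten by the first matching rule."""
        for rule in self.overrides:
            if path.startswith(rule.path_prefix):
                return Response(
                    status=upstream.status if rule.status is None else rule.status,
                    body=upstream.body if rule.body is None else rule.body,
                )
        return upstream

    def record(self, action: str, target: str) -> None:
        """Remember an action (for example a like or repost) made during a test."""
        self.recorded_actions.append((action, target))

    def undo_all(self) -> List[Tuple[str, str]]:
        """Hand back recorded actions, newest first, and clear the log."""
        undone = list(reversed(self.recorded_actions))
        self.recorded_actions.clear()
        return undone
```

A test could register `Override("/tracks", status=503)` to check how the app behaves when a backend call fails, then call `undo_all()` in teardown to learn which likes or reposts to revert.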
  40. ~350 acceptance tests

  41. ~350 acceptance tests ~200 min

  42. distributed cucumber

  43. distributed cucumber

  44. x20 VM VM
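
The figures on slides 40 to 44 (~350 acceptance tests, ~200 min, 20 VMs) suggest splitting the suite across machines. The slides do not say how tests are assigned to VMs, so the sketch below assumes one common approach: a greedy "longest test first, onto the least-loaded worker" split. The per-test durations are made up so that they sum to 200 minutes.

```python
import heapq


def distribute(durations, workers):
    """Assign tests to workers: longest first, always onto the least-loaded worker."""
    heap = [(0.0, i) for i in range(workers)]  # (minutes assigned so far, worker index)
    heapq.heapify(heap)
    buckets = [[] for _ in range(workers)]
    for name, minutes in sorted(durations.items(), key=lambda kv: kv[1], reverse=True):
        load, i = heapq.heappop(heap)
        buckets[i].append(name)
        heapq.heappush(heap, (load + minutes, i))
    return buckets


def wall_clock(buckets, durations):
    """Total run time is set by the slowest worker."""
    return max(sum(durations[t] for t in bucket) for bucket in buckets)


# Hypothetical suite: 350 tests of equal length whose durations sum to 200 minutes.
durations = {f"test_{i}": 200 / 350 for i in range(350)}
buckets = distribute(durations, workers=20)
print(f"slowest worker: {wall_clock(buckets, durations):.1f} min")
```

With a perfectly even split, 200 minutes over 20 workers is 10 minutes of test time per worker, before any overhead for booting a VM or installing the app.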

  45. [Pipeline diagram] build • analysis • unit tests • acceptance tests • AppStore • AdHoc • α and β (durations shown: ~10 min, ~10 min, ~3 min, ~20 min)
  46. acceptance tests • iOS version • iPhone (4S, 5, 6, 6+) • iPad (retina and non retina) • feature flag configurations
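
The dimensions on slide 46 multiply quickly. As a rough illustration, the sketch below enumerates a test matrix as the Cartesian product of iOS versions, devices, and feature flag configurations. The specific versions and the `new_player` flag are assumptions for the example, not SoundCloud's actual matrix.

```python
from itertools import product

# Illustrative values only; the slide names the dimensions, not these entries.
IOS_VERSIONS = ["8.4", "9.0"]
DEVICES = [
    "iPhone 4S", "iPhone 5", "iPhone 6", "iPhone 6 Plus",
    "iPad (retina)", "iPad (non retina)",
]
FLAG_CONFIGS = [{"new_player": False}, {"new_player": True}]  # hypothetical flag


def configurations(ios_versions, devices, flag_configs):
    """Return every (iOS version, device, flag configuration) combination."""
    return [
        {"ios": ios, "device": device, "flags": flags}
        for ios, device, flags in product(ios_versions, devices, flag_configs)
    ]


matrix = configurations(IOS_VERSIONS, DEVICES, FLAG_CONFIGS)
print(len(matrix), "configurations")
```

Even this small example yields 2 × 6 × 2 = 24 configurations, which is one reason a team might run only a chosen subset of the matrix on every push.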
  47. What's the value of an acceptance test?

  48. avoid regressions on hard to test flows/screens

  49. None
  50. None
  51. keeping the build green

  52. None
  53. None
  54. pre-ci

  55. None
  56. https://www.youtube.com/watch?v=BhMSzC1crr0 SpaceX

  57. Flaky tests

  58. Flaky tests • test driver is flaky • backend is flaky • app is unpredictable
  59. Flaky tests • identify • isolate • fix

  60. flakyrazor

  61. flakyrazor 1. Take failing test out of the test pool 2. Run the test multiple times (flaky or failing?) 3. Assign it to the author/committer
  62. flakyrazor 4. Assess test value 5. Act on test duration changes 6. Show statistics on why the test failed
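
Steps 1 to 3 of the flakyrazor slides can be sketched as follows. This is not flakyrazor's code, which the slides do not show; it is a minimal illustration that assumes each test is a callable returning `True` on success, and the names `classify` and `triage` are hypothetical.

```python
def classify(run_test, attempts=10):
    """Rerun a test that already failed once in the pipeline.

    If every rerun fails, the test is consistently "failing"; if any rerun
    passes, the earlier failure was intermittent, so the test is "flaky".
    """
    failures = sum(1 for _ in range(attempts) if not run_test())
    return "failing" if failures == attempts else "flaky"


def triage(failed_tests, pool, authors, attempts=10):
    """Quarantine each failed test, classify it, and pick an assignee."""
    report = {}
    for name, run_test in failed_tests.items():
        pool.discard(name)                      # step 1: take it out of the test pool
        verdict = classify(run_test, attempts)  # step 2: flaky or failing?
        report[name] = {                        # step 3: assign to author/committer
            "verdict": verdict,
            "assignee": authors.get(name, "unassigned"),
        }
    return report
```

Removing the test from the pool first keeps the main build green while the rerun result decides whether the problem is a real regression or an unreliable test.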
  63. Taking care of our build machines

  64. ❤ ❤ ❤ ❤ ❤ ❤ ❤ ❤

  65. None
  66. • we have images for all the machines • CLIs to provision the machines remotely • OS X server stores the images and controls the imaging process
  67. Thank you! NSSpain • 2015-09-16 • Vincent Garrigues @garriguv