Making CocoaPods Fast (with Modern Ruby Tooling)

Given at RubyConf Taiwan 2019 in Taipei

Samuel E. Giddins

July 27, 2019

Transcript

  1. @segiddins, Mobile Developer Experience @ Square, CocoaPods Core Team
    Making CocoaPods Fast – Samuel Giddins @ RubyConf Taiwan 2019
  2. quick aside
    ➡ how many of you write ruby almost every day?
    ➡ how many of you write ruby that's deployed to a web server?
  3. CocoaPods combines a definition of libraries (podspecs) with a way to integrate them into a user's application (podfile)
  4. RubyGems / Bundler compared with CocoaPods:
    RubyGems / Bundler                 | CocoaPods
    Gemspec                            | Podspec
    Gemfile                            | Podfile
    Gemfile.lock                       | Podfile.lock
    rubygems.org, index.rubygems.org   | github.com/CocoaPods/Specs
  5. Xcode
    ➡ Proprietary toolchain for compiling apps for Apple platforms
    ➡ Uses its own manifest file format
    ➡ Invokes many compilation tools
  6. So, CocoaPods has to do a heck of a lot more than RubyGems or Bundler
  7. pod install
    ➡ Updating local specs repositories
    ➡ Analyzing dependencies
    ➡ Fetching dependencies
    ➡ Installing dependencies
    ➡ Generating Pods project
    ➡ Integrating client project
  8. So, CocoaPods exists! It was slow? So what.
  9. Why Performance?
    ➡ Optimization is fun!
    ➡ Rapid iteration == happier & more productive developers
    ➡ "Scale"
    ➡ Free performance is hard to find
  10. So, I have a slow Ruby app. How do I make it faster?
  11. How do I make it faster?
    ➡ Is it really slow?
      ➡ What is it doing?
      ➡ How often does it run?
      ➡ Would it being faster make a difference?
    ➡ Can it do less work?
    ➡ How can I keep it from getting worse?
  12. Is it really slow?
    ➡ How much would I invest to make it 10% faster? 50%? 90%?
    ➡ How do I know which part of what it's doing is slow?
  13. From a feeling to a measurement:
    ➡ this feels slow
    ➡ ⏱ this took 10 seconds
    ➡ this segment took 10 seconds
    ➡ this method was called 97645 times, averaging 23ms per call, taking up 48% of total runtime
  14. Profiling
    The art of putting numbers to the feeling of "this feels slow"
  15. What can I profile?
    ➡ Method calls
    ➡ Call Stack Sampling
    ➡ Memory allocations
    ➡ Disk I/O
    ➡ Network I/O
    ➡ Database queries
    ➡ Garbage collection
  16. My work on CocoaPods focused on:
    ➡ Method Call Profiling
    ➡ Call Stack Sampling
  17. quick aside: One of our contributors got amazing gains by profiling allocations and memoizing things like #hash & Pathname#join
  18. Profilers
    Just like most tools, it's important to pick the right one for the job at hand
  19. Tracing Profilers
    Tell you how many times something was called, and usually how much time is spent in that call.
  20. Tracing Profilers
    ➡ Can distort relative numbers
    ➡ Can make running a program painfully slow
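As a rough illustration of what a tracing profiler records, the sketch below counts method calls with Ruby's built-in `TracePoint` API. The `fib` method and the `call_counts` hash are made up for this example. Real tracing profilers also record timings and call relationships, and that per-call bookkeeping is where the slowdown and distortion come from.

```ruby
# Hypothetical workload: a deliberately call-heavy method.
def fib(n)
  n < 2 ? n : fib(n - 1) + fib(n - 2)
end

call_counts = Hash.new(0)

# The :call event fires on every call to a method defined in Ruby.
tracer = TracePoint.new(:call) { |tp| call_counts[tp.method_id] += 1 }

# Tracing is only active while the block runs.
tracer.enable { fib(10) }

call_counts.each do |method_name, count|
  puts "#{method_name} was called #{count} times"
end
```

Every single call passes through the hook, so the overhead grows with the number of calls rather than with the amount of real work done.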
  21. Tracing Profilers
    "If it takes 35 minutes with no profiling, it may never finish under rubyprof. Sorry, I didn't think of that." (CocoaPods/CocoaPods#5180)
  22. Allocation Profilers
    Tracing profilers that count memory allocations instead of method calls.
  23. Allocation Profilers
    ➡ Suffer from similar problems, but even more so since the typical Ruby program is incredibly allocation-heavy.
  24. Sampling Profilers
    Take a peek at your program at regular intervals, from the outside.
  25. Sampling Profilers
    ➡ (Generally) will not interfere at all with your app's normal execution.
    ➡ Possible to miss things, it's luck whether a particular stack trace gets sampled at the right time.
    ➡ Can’t tell you how many times something has been called.
    ➡ “Here were the stack traces when I looked”
      ➡ What was at the top of the stack at the time?
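To make the idea of "looking at the stack at regular intervals" concrete, here is a minimal in-process sketch. It is a simplification: external samplers such as rbspy inspect the program from a separate process, whereas this version uses a second thread inside the same process. `sample_top_frames` and `busy_work` are hypothetical names invented for the example.

```ruby
# Hypothetical helper: samples the calling thread's top stack frame
# at a fixed interval while the given block runs.
def sample_top_frames(interval: 0.005)
  target = Thread.current
  counts = Hash.new(0)
  running = true

  sampler = Thread.new do
    while running
      # Thread#backtrace returns an array of frame strings (or nil if the thread is dead).
      frame = target.backtrace&.first
      counts[frame] += 1 if frame
      sleep interval
    end
  end

  yield
  counts
ensure
  # Stop the sampler and wait for it before the caller reads the counts.
  running = false
  sampler&.join
end

# Hypothetical CPU-bound workload.
def busy_work
  200_000.times.sum { |i| Math.sqrt(i) }
end

samples = sample_top_frames { 20.times { busy_work } }

# Show the frames that were on top of the stack most often.
samples.sort_by { |_, count| -count }.first(5).each do |frame, count|
  puts "#{count} samples at #{frame}"
end
```

The result says where the program was when it was observed, not how many times anything ran, and a short-lived call can be missed entirely.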
  26. Manual Profilers
    Sometimes, you don't want or need a fancy tool. The easiest output to understand is the one you write yourself.
  27. Manual Profilers
    ➡ Wrapping method calls in a call to Benchmark.measure will tell you how long it took.
    ➡ Can help you focus in on what you already suspect to be hotspots
    ➡ More likely to tell you if the distribution of calls is not unimodal
      ➡ e.g. calling a memoized method, so only the first call will take a significant amount of time
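A small sketch of this kind of manual timing, using `Benchmark.measure` from Ruby's standard library. The `Report` class and its memoized `totals` method are hypothetical stand-ins for a suspected hotspot; the `sleep` simulates real work.

```ruby
require 'benchmark'

# Hypothetical class with a memoized, expensive method.
class Report
  def totals
    @totals ||= begin
      sleep 0.05 # stand-in for real work
      [1, 2, 3].sum
    end
  end
end

report = Report.new

# Time each call separately so a non-unimodal distribution stays visible.
timings = 3.times.map { Benchmark.measure { report.totals }.real }

timings.each_with_index do |seconds, index|
  puts format('call %d: %.4fs', index + 1, seconds)
end
```

Only the first call pays the cost; an average over all three calls would hide that.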
  28. Chronometer
    A tool I built to automate this sort of low-fidelity tracing.
  29. Chronometer
    ➡ Wraps methods with some timing / recording infrastructure
    ➡ Can output results in a format compatible with chrome://tracing
    ➡ Happens in-process, so you can write code that uses the timing results
  30. Chronofile
      for_class Pod::Installer do
        method :install!, context: event_phase_context[:toplevel]
        methods %i[
          prepare
          resolve_dependencies
          download_dependencies
          validate_targets
          generate_pods_project
          integrate_user_project
          perform_post_install_actions
          run_podfile_post_install_hooks
        ]
      end
      # ...
      for_class Pod::Resolver do
        method :resolve
      end
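The general technique of wrapping selected methods with timing code can be sketched with `Module#prepend`. This is an illustration only and not Chronometer's actual implementation: `MethodTimer`, its `RECORDS` store, and the toy `Installer` class are all hypothetical names.

```ruby
# Hypothetical helper, not Chronometer's implementation: wraps the named
# instance methods of a class and records wall-clock durations in-process.
module MethodTimer
  RECORDS = Hash.new { |hash, key| hash[key] = [] }

  def self.wrap(klass, *method_names)
    wrapper = Module.new do
      method_names.each do |name|
        define_method(name) do |*args, **kwargs, &block|
          started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
          begin
            # The wrapper is prepended, so super reaches the original method.
            super(*args, **kwargs, &block)
          ensure
            elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
            RECORDS["#{klass}##{name}"] << elapsed
          end
        end
      end
    end
    klass.prepend(wrapper)
  end
end

# Toy class standing in for a real installer.
class Installer
  def install!
    resolve
    download
  end

  def resolve
    sleep 0.01
  end

  def download
    sleep 0.02
  end
end

MethodTimer.wrap(Installer, :install!, :resolve, :download)
Installer.new.install!

MethodTimer::RECORDS.each do |name, durations|
  puts format('%-20s %d call(s), %.3fs total', name, durations.size, durations.sum)
end
```

Because the recordings live in an ordinary Ruby hash, the same process can post-process them, for example to print a summary or serialize them for a trace viewer.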
  31. Profilers
    For me, this meant a mix of
    ➡ ruby-prof
      ➡ Tracing Profiler
    ➡ rbspy
      ➡ Sampling Profiler
    ➡ chronometer
  32. All 3 are amazing, and I used all of them extensively last year. But knowing which to reach for at each step of the investigation would’ve saved me a bunch of time.
  33. Chronometer hit the sweet spot for the type of software I was developing, and also solved a couple other problems I had.
  34. Graph Traversal
    The performance bottleneck that all build systems end up fighting.
  35. Graph Traversal
          A
         / \
        B   C
         \ /
          D
  36. Graph Traversal
    The base problem we often face is: How do you find “all the nodes/vertices that come after A”?
  37. It's tempting to say: “A is followed by B and C, let’s recurse!”
  38. It's tempting to say: “A is followed by B and C, let’s recurse!”
    But then you end up visiting D twice!
    This might not sound like a big deal, but imagine 50 more nodes come after D...
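The cost of the double visit can be counted directly. The sketch below builds the diamond graph with a hypothetical chain of 50 extra nodes after D (the `graph` hash and both helper methods are invented for this example) and compares the number of visits made by plain recursion with a traversal that remembers what it has seen.

```ruby
require 'set'

# Adjacency list: each key is followed by the nodes in its array.
graph = { a: %i[b c], b: %i[d], c: %i[d], d: %i[n1] }
# Hypothetical chain of 50 nodes hanging off D: n1 -> n2 -> ... -> n50.
(1..50).each { |i| graph[:"n#{i}"] = i < 50 ? [:"n#{i + 1}"] : [] }

# Plain recursion: revisits everything below D once per path that reaches it.
def naive_visits(graph, node)
  1 + graph.fetch(node).sum { |successor| naive_visits(graph, successor) }
end

# Set#add? returns nil for nodes already seen, so each node is visited once.
def deduplicated_visits(graph, node, seen = Set.new)
  return 0 unless seen.add?(node)

  1 + graph.fetch(node).sum { |successor| deduplicated_visits(graph, successor, seen) }
end

puts "naive:        #{naive_visits(graph, :a)} visits"        # 105
puts "deduplicated: #{deduplicated_visits(graph, :a)} visits" # 54
```

One diamond roughly doubles the work below it; every further diamond stacked along the way doubles it again, which is how small graphs turn into very slow traversals.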
  39. It's not just CocoaPods that has this issue!
  40. Graph traversal like this happens all over build tools. Why?
  41. DAGs
    ➡ Directed Acyclic Graphs
      ➡ describe dependency structure.
    ➡ Every node (target, library, gem, etc.) has a list of dependencies
      ➡ only rule is “you can’t depend upon something that depends on you”
    ➡ Most operations rely on what’s called the “transitive closure” of a node
      ➡ “what are all the nodes that come after this one, no matter how far away?”
  42. def recursive_predecessors
        vertices = predecessors
        vertices += vertices.flat_map(&:recursive_predecessors)
        vertices.uniq! # returns nil when nothing was removed, so return vertices explicitly
        vertices
      end
    to
      def recursive_predecessors
        vertices = Set.new
        # Visit each predecessor once; Set#add? returns nil for vertices already seen.
        visit = ->(vertex) do
          vertex.incoming_edges.each do |edge|
            origin = edge.origin
            next unless vertices.add?(origin)
            visit[origin]
          end
        end
        visit[self]
        vertices
      end
  43. Something that comes along with trying to solve graph traversal problems is memoization!
  44. From before:
          A
         / \
        B   C
         \ /
          D
  45. Let’s say some build setting for A depends on a build setting of B, C, D.
  46. Memoization
    Calculate the value for the setting once and then store it
  47. Memoization
    ➡ This is slightly different from caching
      ➡ There’s no “invalidation” that can happen
    ➡ We’re just lazily computing a value, and storing it so next time it’s needed, we can return the stored value
  48. In Ruby, we do this a lot! Ever seen something like this:
      def expensive_to_calculate
        @expensive_to_calculate ||= ...
      end
  49. This works great!
    ➡ But imagine if the return value for expensive_to_calculate can be nil, or false.
    ➡ The ||= short-circuiting won’t kick in.
    ➡ We’ll need to recompute the value every time.
    ➡ Not ideal.
  50. Instead, we need to turn to a more complicated variation on the pattern:
      def expensive_to_calculate
        return @expensive_to_calculate if defined?(@expensive_to_calculate)
        @expensive_to_calculate = ...
      end
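The difference between the two patterns shows up as soon as the computed value is nil. In this sketch the `Settings` class, its two methods, and the run counters are hypothetical and exist only to count how often each body executes.

```ruby
# Hypothetical class comparing the two memoization patterns for a nil result.
class Settings
  attr_reader :or_equals_runs, :defined_runs

  def initialize
    @or_equals_runs = 0
    @defined_runs = 0
  end

  # ||= re-evaluates the right-hand side whenever the stored value is nil or false.
  def with_or_equals
    @with_or_equals ||= begin
      @or_equals_runs += 1
      nil # a legitimate "no value" result
    end
  end

  # defined? checks whether the ivar has been assigned at all, even to nil.
  def with_defined
    return @with_defined if defined?(@with_defined)

    @defined_runs += 1
    @with_defined = nil
  end
end

settings = Settings.new
3.times { settings.with_or_equals }
3.times { settings.with_defined }

puts "||= body ran #{settings.or_equals_runs} times"      # 3
puts "defined? body ran #{settings.defined_runs} time(s)" # 1
```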
  51. In CocoaPods, when I re-wrote the build settings, I added a tiny little DSL to make memoization a bit easier:
      def self.define_build_settings_method(
        method_name,
        build_setting: false,
        memoized: false,
        sorted: false,
        uniqued: false,
        compacted: false,
        frozen: true,
        from_search_paths_aggregate_targets: false,
        from_pod_targets_to_link: false,
        &implementation)
  52. How the DSL memoizes:
    ➡ It uses an ivar hash called @__memoized to store those values, calls #fetch on that hash, and returns the stored value unless the hash doesn't yet contain the key, in which case the value is computed and stored.
    ➡ One benefit of this is that the memoized state can be easily discarded by clearing that one ivar, instead of needing to keep track of a bunch of ivars, each only used in a single method implementation.
    ➡ By carefully selecting the key, we’re able to memoize both what a superclass and subclass return separately, so multiple calls to super can also be memoized.
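A minimal sketch of hash-backed memoization along these lines. It is not the CocoaPods implementation: the `Memoizable` module, the `memoized` macro, the choice of `[class, method_name]` as the key, and the two settings classes are all assumptions made for illustration. Only the `@__memoized` ivar name and the use of `#fetch` come from the description above.

```ruby
# Hypothetical mixin, loosely modeled on the description above; not CocoaPods' code.
module Memoizable
  def memoized(method_name)
    implementation = instance_method(method_name)
    # Assumed key choice: owner class + name keeps superclass and subclass results apart.
    key = [self, method_name]

    define_method(method_name) do
      store = (@__memoized ||= {})
      # fetch only runs the block when the key is missing, so nil/false values stay memoized.
      store.fetch(key) { store[key] = implementation.bind(self).call }
    end
    method_name
  end
end

class BaseSettings
  extend Memoizable

  attr_reader :computations

  def initialize
    @computations = 0
  end

  memoized def search_paths
    @computations += 1
    ['base/include']
  end

  # Discards all memoized state by clearing the single ivar.
  def clear_memoized!
    @__memoized = nil
  end
end

class PodSettings < BaseSettings
  memoized def search_paths
    @computations += 1
    super + ['pod/include']
  end
end

settings = PodSettings.new
2.times { settings.search_paths }
puts settings.search_paths.inspect
puts "bodies executed: #{settings.computations}"
```

Each class level runs its body once per instance, and `clear_memoized!` resets everything in one step without tracking individual ivars.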
  53. This DSL (and accompanying refactor) allowed me to fix years-old bugs in build settings generation. Under the old implementation, those bug fixes would've slowed installation down by over 5 minutes.
  54. Design Lessons Learned
    Even after all this work (and it’s still ongoing!), CocoaPods isn't as fast as I’d like. And it never will be.
  55. Why it never will be:
    ➡ Some of this comes down to language.
    ➡ Some of this comes down to the fact that the job CocoaPods does is inherently complicated.
    ➡ Some of it is our own fault, for having a system design from 7 years ago that wasn’t built to scale this far.
  56. Exposing Mutable Objects
    ➡ Makes memoization hard.
    ➡ Need to have sidecar objects that hold computed values only when it's safe to do so.
    ➡ (unless you want to get into cache invalidation bugs)
  57. Reading & Writing to the File System In-Line
    ➡ Makes it very hard to parallelize IO.
  58. Having Consistent CLI Output
    ➡ Once again, makes parallelization hard.
    ➡ How do you show:
      ➡ progress
      ➡ underlying command invocations
      ➡ failures
  59. Inefficient Data Structures
    ➡ Particularly the podfile & podspec.
    ➡ The entire Xcode project model
    ➡ Built using nested hashes, making change-tracking hard.
    ➡ Lack of copy-on-write.
  60. Using Ruby to eval Objects Stored on Disk
    ➡ Allows state to leak in, makes computing stable hashes complicated because there are multiple valid representations to compute them from.
  61. Not Storing all Podfile information in the Podfile.lock
    ➡ Figuring out when it's changed based on only the info checked into git is nearly impossible.
  62. All-or-Nothing Installation Operation
    ➡ Needing to do the entire installation every time means all invocations are equally slow.
    ➡ Fixed now, thanks to @sebastianv1!
  63. Pathname
    ➡ Nope.
    ➡ Just nope.
  64. Design Lessons Learned
    Architecture matters just as much as algorithms.
  65. Picking your programming language is a tradeoff