
Making CocoaPods Fast (with Modern Ruby Tooling)

Given at RubyConf Taiwan 2019 in Taipei

Samuel E. Giddins

July 27, 2019

Transcript

  1. Making CocoaPods Fast
    with Modern Ruby Tooling
    Samuel Giddins

  2. @segiddins
    Mobile Developer Experience @ Square
    CocoaPods Core Team
    Making CocoaPods Fast – Samuel Giddins @ RubyConf Taiwan 2019 2


  3. bundler & rubygems

  4. quick aside
    ➡ how many of you write ruby almost every day?
    ➡ how many of you write ruby that's deployed to a web server?

  5. What even is CocoaPods?

  6. ==
    +

  7. CocoaPods combines a definition of libraries (podspecs)
    with a way to integrate them into a user's application (podfile)

  8. RubyGems       Bundler               CocoaPods
    Gemspec         -                     Podspec
    -               Gemfile               Podfile
    -               Gemfile.lock          Podfile.lock
    rubygems.org    index.rubygems.org    github.com/CocoaPods/Specs

  9. That little difference: Xcode

  10. Xcode
    ➡ Proprietary toolchain for compiling apps for Apple platforms
    ➡ Uses its own manifest file format
    ➡ Invokes many compilation tools

  11. So, CocoaPods has to do a heck of a lot
    more than RubyGems or Bundler

  12. pod install
    ➡ Updating local specs repositories
    ➡ Analyzing dependencies
    ➡ Fetching dependencies
    ➡ Installing dependencies
    ➡ Generating Pods project
    ➡ Integrating client project

  13. So, CocoaPods exists!
    It was slow?
    So what.

  14. Why Performance?

  15. Why Performance?
    ➡ Optimization is fun!
    ➡ Rapid iteration == happier & more productive developers
    ➡ "Scale"
    ➡ Free performance is hard to find

  16. So, I have a slow ruby app.
    How do I make it faster?

  17. How do I make it faster?
    ➡ Is it really slow?
      ➡ What is it doing?
      ➡ How often does it run?
      ➡ Would it being faster make a difference?
    ➡ Can it do less work?
    ➡ How can I keep it from getting worse?

  18. Is it really slow?
    ➡ How much would I invest to make it 10% faster?
    50%? 90%?
    ➡ How do I know which part of what it's doing
    is slow?


  19. ➡ this feels slow
    ➡ this took 10 seconds
    ➡ this segment took 10 seconds
    ➡ this method was called 97645 times,
    averaging 23ms per call,
    taking up 48% of total runtime

  20. Profiling

  21. Profiling
    The art of putting numbers to the feeling of
    "this feels slow"

  22. What can I profile?
    ➡ Method calls
    ➡ Call Stack Sampling
    ➡ Memory allocations
    ➡ Disk I/O
    ➡ Network I/O
    ➡ Database queries
    ➡ Garbage collection

  23. My work on CocoaPods focused on:
    Method Call Profiling
    Call Stack Sampling

  24. quick aside:
    One of our contributors got amazing gains
    by profiling allocations
    and memoizing things like #hash & Pathname#join

  25. Profilers
    Just like most tools,
    it's important to pick the right one for the job at hand

  26. Tracing Profilers
    Tell you how many times something was called,
    and usually how much time is spent in that call.
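
    A minimal sketch of the idea, using Ruby's built-in TracePoint API to count calls to Ruby-defined methods while a block runs. The `count_calls` helper and the `Greeter` class are hypothetical names made up for this illustration; real tracing profilers such as ruby-prof also record time per call.

```ruby
# Count how often each Ruby-defined method is called while the block runs.
# The :call event fires for methods written in Ruby, not for C-implemented ones.
def count_calls
  counts = Hash.new(0)
  trace = TracePoint.new(:call) do |tp|
    counts["#{tp.defined_class}##{tp.method_id}"] += 1
  end
  trace.enable { yield }
  counts
end

# Hypothetical workload used only for this example.
class Greeter
  def greet(name)
    "hello, #{format_name(name)}"
  end

  def format_name(name)
    name.capitalize
  end
end

counts = count_calls do
  greeter = Greeter.new
  3.times { greeter.greet("taipei") }
end

counts.each { |method_name, calls| puts "#{method_name}: #{calls} calls" }
```

    Every traced call runs the hook block, which is where the overhead of this style of profiling comes from.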

  27. Tracing Profilers
    ➡ Can distort relative numbers
    ➡ Can make running a program painfully slow

  28. Tracing Profilers
    If it takes 35 minutes with no profiling, it may never finish under
    rubyprof. Sorry, I didn't think of that.
    — CocoaPods/CocoaPods#5180

  29. Allocation Profilers
    Tracing profilers that count memory allocations
    instead of method calls.

  30. Allocation Profilers
    ➡ Suffer from similar problems,
    but even more so since the typical ruby program
    is incredibly allocation-heavy.
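
    To get a feel for how allocation-heavy ordinary Ruby code is, here is a coarse sketch that diffs the interpreter's running allocation total around a block. The `allocations_during` helper is a hypothetical name; it only gives a total, whereas a real allocation profiler attributes each allocation to a call site.

```ruby
# Number of objects allocated while the block runs, according to the
# interpreter's own running total.
def allocations_during
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

concatenated = allocations_during { 1_000.times { |i| "pod-" + i.to_s } }
compared     = allocations_during { 1_000.times { |i| i > 500 } }

puts "string building: #{concatenated} objects"
puts "integer compare: #{compared} objects"
```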

  31. Sampling Profilers
    Take a peek at your program at regular intervals,
    from the outside.

  32. Sampling Profilers
    ➡ (Generally) will not interfere at all with your app's normal execution.
    ➡ Possible to miss things,
    it's luck whether a particular stack trace
    gets sampled at the right time.
    ➡ Can’t tell you how many times something has been called.
    ➡ “Here were the stack traces when I looked”
      ➡ What was at the top of the stack at the time?
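
    The "peek at regular intervals" idea can be sketched in-process with a background thread. This is only an illustration under stated assumptions: `sample_stacks` and `busy_work` are hypothetical names, and a tool like rbspy samples from outside the process rather than from a Ruby thread.

```ruby
# Record the label of the top stack frame of the calling thread at a fixed
# interval while the block runs. Returns a hash of frame label => sample count.
def sample_stacks(interval: 0.01)
  target = Thread.current
  samples = Hash.new(0)
  sampler = Thread.new do
    loop do
      frame = target.backtrace_locations&.first
      samples[frame.label] += 1 if frame
      sleep interval
    end
  end
  yield
  samples
ensure
  # Stop the sampler before the caller reads the results.
  sampler&.kill
  sampler&.join
end

# Hypothetical CPU-bound workload.
def busy_work(seconds)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + seconds
  count = 0
  count += 1 while Process.clock_gettime(Process::CLOCK_MONOTONIC) < deadline
  count
end

samples = sample_stacks(interval: 0.005) { busy_work(0.5) }
samples.sort_by { |_label, hits| -hits }.each do |label, hits|
  puts "#{hits} samples in #{label}"
end
```

    The output says where the program was when it was sampled, never how many times a method ran, which matches the trade-offs listed above.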

  33. Manual Profilers
    Sometimes, you don't want or need a fancy tool.
    The easiest output to understand
    is the one you write yourself.

  34. Manual Profilers
    ➡ Wrapping method calls in a call to Benchmark.measure
    will tell you how long it took.
    ➡ Can help you focus in on what you already suspect
    to be hotspots
    ➡ More likely to tell you if the distribution of calls is not unimodal
      ➡ e.g. calling a memoized method, so only the first call will take a
      significant amount of time
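
    A small sketch of the memoized-method case, timing each call separately with Benchmark.measure. `SpecIndex` and its `sleep` are hypothetical stand-ins for some slow lookup.

```ruby
require "benchmark"

# Hypothetical class with an expensive, memoized lookup.
class SpecIndex
  def all_names
    @all_names ||= begin
      sleep 0.05 # stand-in for slow work
      %w[Alamofire SnapKit]
    end
  end
end

index = SpecIndex.new
timings = Array.new(3) { Benchmark.measure { index.all_names }.real }

timings.each_with_index do |seconds, i|
  puts format("call %d: %.4fs", i + 1, seconds)
end
```

    Timing the calls one by one shows the first call dominating, which an averaged per-call figure would hide.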

  35. Chronometer
    A tool I built to automate this sort of low-fidelity tracing.

  36. Chronometer
    ➡ Wraps methods with some timing / recording infrastructure
    ➡ Can output results in a format compatible with
    chrome://tracing
    ➡ Happens in-process, so you can write code that uses the timing
    results
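
    The general technique can be sketched as follows. This is not Chronometer's API: `MethodTimer` and `FakeInstaller` are hypothetical names, and the sketch assumes Ruby 3 keyword-argument semantics. It wraps methods with `Module#prepend`, records one event per call, and emits the JSON shape that chrome://tracing reads (complete events with `ph: "X"` and microsecond `ts`/`dur`).

```ruby
require "json"

# Hypothetical helper: wraps instance methods and records one timing event per call.
module MethodTimer
  EVENTS = []

  def self.wrap(klass, *method_names)
    wrapper = Module.new do
      method_names.each do |name|
        define_method(name) do |*args, **kwargs, &block|
          started = Process.clock_gettime(Process::CLOCK_MONOTONIC, :microsecond)
          begin
            super(*args, **kwargs, &block)
          ensure
            finished = Process.clock_gettime(Process::CLOCK_MONOTONIC, :microsecond)
            EVENTS << { name: "#{klass}##{name}", ph: "X", ts: started,
                        dur: finished - started, pid: Process.pid, tid: 0 }
          end
        end
      end
    end
    # Prepending puts the wrapper in front of the original methods.
    klass.prepend(wrapper)
  end

  # Serialize recorded events in the Trace Event JSON object format.
  def self.to_trace_json
    JSON.generate(traceEvents: EVENTS)
  end
end

# Hypothetical class standing in for something worth timing.
class FakeInstaller
  def resolve_dependencies
    sleep 0.01
    :resolved
  end
end

MethodTimer.wrap(FakeInstaller, :resolve_dependencies)
FakeInstaller.new.resolve_dependencies
puts MethodTimer.to_trace_json
```

    Because the events live in an ordinary array in the same process, other code can inspect them directly, for example to fail a test when a phase exceeds a budget.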

  37. Chronofile
    for_class Pod::Installer do
      method :install!, context: event_phase_context[:toplevel]
      methods %i[
        prepare
        resolve_dependencies
        download_dependencies
        validate_targets
        generate_pods_project
        integrate_user_project
        perform_post_install_actions
        run_podfile_post_install_hooks
      ]
    end
    # ...
    for_class Pod::Resolver do
      method :resolve
    end

  38. Profilers
    For me, this meant a mix of
    ➡ ruby-prof
      ➡ Tracing Profiler
    ➡ rbspy
      ➡ Sampling Profiler
    ➡ chronometer

  39. All 3 are amazing,
    and I used all of them extensively last year.
    But knowing which to reach for at each step of the investigation
    would’ve saved me a bunch of time

  40. Chronometer hit the sweet spot
    for the type of software I was developing,
    and also solved a couple other problems I had.

  41. Making CocoaPods Fast – Samuel Giddins @ RubyConf Taiwan 2019 41


  42. Making CocoaPods Fast – Samuel Giddins @ RubyConf Taiwan 2019 42


  43. Profiling revealed a fundamental problem with
    CocoaPods' architecture.

  44. Graph Traversal

  45. Graph Traversal
    The performance bottleneck that all
    build systems end up fighting.

  46. Graph Traversal
    A
    / \
    B C
    \ /
    D

  47. Dependency Graph

  48. Graph Traversal
    The base problem we often face is
    How do you find “all the nodes/vertices that come after A”?

  49. It's tempting to say:
    “A is followed by B and C, let’s recurse!”

  50. It's tempting to say:
    “A is followed by B and C, let’s recurse!”
    But then you end up visiting D twice!
    This might not sound like a big deal,
    but imagine 50 more nodes come after D...
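
    To put a number on that, here is a sketch that counts visits made by the naive recursion on a chain of diamonds stacked end to end. The graph shape and the helper names `diamond_chain` and `naive_visits` are made up for this illustration.

```ruby
# Build a chain of `depth` diamonds: each "top" fans out to a left and a
# right node, and both lead to the next "top".
def diamond_chain(depth)
  graph = {}
  depth.times do |i|
    graph["top#{i}"] = ["left#{i}", "right#{i}"]
    graph["left#{i}"] = ["top#{i + 1}"]
    graph["right#{i}"] = ["top#{i + 1}"]
  end
  graph["top#{depth}"] = []
  graph
end

# Naive recursion with no record of what was already seen: returns the total
# number of node visits, counting repeats.
def naive_visits(graph, node)
  1 + graph.fetch(node).sum { |child| naive_visits(graph, child) }
end

[1, 5, 10].each do |depth|
  graph = diamond_chain(depth)
  puts "#{graph.size} nodes, #{naive_visits(graph, 'top0')} visits"
end
```

    A single diamond costs 5 visits for 4 nodes. Ten stacked diamonds have 31 nodes but cost 4093 visits, because the work doubles at every level.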

  51. It's not just CocoaPods that has this issue!

  52. Graph traversal like this happens all over build tools.
    Why?

  53. DAGs
    ➡ Directed Acyclic Graphs
      ➡ describe dependency structure.
    ➡ Every node (target, library, gem, etc.) has a list of dependencies
      ➡ only rule is “you can’t depend upon something that depends on you”
    ➡ Most operations rely on what’s called the “transitive closure” of a node
      ➡ “what are all the nodes that come after this one, no matter how far
      away?”
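
    A minimal sketch of computing that set for one node, assuming the graph is a plain hash from each node to its direct dependencies. The `DEPENDENCIES` data and the `transitive_closure` helper are hypothetical, not CocoaPods code.

```ruby
require "set"

# Hypothetical dependency graph: A depends on B and C, which both depend on D.
DEPENDENCIES = {
  "A" => %w[B C],
  "B" => %w[D],
  "C" => %w[D],
  "D" => []
}.freeze

# Every node reachable from `start`, each expanded at most once.
def transitive_closure(graph, start)
  seen = Set.new
  stack = graph.fetch(start).dup
  until stack.empty?
    node = stack.pop
    # Set#add? returns nil when the node was already present.
    next unless seen.add?(node)
    stack.concat(graph.fetch(node))
  end
  seen
end

puts transitive_closure(DEPENDENCIES, "A").sort.join(", ")
```

    Tracking the seen set is what keeps the cost proportional to the size of the graph instead of the number of paths through it.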

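  54. def recursive_predecessors
      vertices = predecessors
      vertices += vertices.flat_map(&:recursive_predecessors)
      vertices.uniq # uniq! would return nil when there were no duplicates
    end
    to
    def recursive_predecessors
      vertices = Set.new # needs require 'set' before Ruby 3.2
      visit = lambda do |vertex|
        vertex.incoming_edges.each do |edge|
          origin = edge.origin
          # add? returns nil for a vertex already seen, so each is expanded once
          next unless vertices.add?(origin)
          visit[origin]
        end
      end
      visit[self]
      vertices
    end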

  55. Memoization

  56. Something that comes along with trying to solve graph traversal
    problems is memoization!

  57. From before:
    A
    / \
    B C
    \ /
    D

  58. Let’s say some build setting for A
    depends on a build setting of B, C, D.

  59. This can get slow.
    Really slow.
    Reeeeeeeeeaaaaaaaallllllly slow.

  60. Memoization
    Calculate the value for the setting once and then store it

  61. ➡ This is slightly different from caching
      ➡ There’s no “invalidation” that can happen
    ➡ We’re just lazily computing a value, and storing it
    so next time it’s needed,
    we can return the stored value

  62. In Ruby, we do this a lot! Ever seen something like this?
    def expensive_to_calculate
      @expensive_to_calculate ||= ...
    end

  63. This works great!
    ➡ But imagine if the return value for expensive_to_calculate can
    be nil, or false.
    ➡ The ||= short-circuiting won’t kick in.
    ➡ We’ll need to recompute the value every time.
    ➡ Not ideal.

  64. Instead, we need to turn to a more complicated variation on the
    pattern:
    def expensive_to_calculate
      return @expensive_to_calculate if defined?(@expensive_to_calculate)
      @expensive_to_calculate = ...
    end
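
    A small runnable comparison of the two patterns for a value that legitimately comes out as nil. `NilableSettings` and its counters are hypothetical, added only to make the recomputation visible.

```ruby
# Hypothetical class whose "expensive" value happens to be nil.
class NilableSettings
  attr_reader :or_equals_runs, :defined_runs

  def initialize
    @or_equals_runs = 0
    @defined_runs = 0
  end

  # ||= treats a stored nil as "not computed yet" and runs the body again.
  def with_or_equals
    @with_or_equals ||= begin
      @or_equals_runs += 1
      nil
    end
  end

  # defined? checks whether the ivar has been assigned, whatever its value.
  def with_defined
    return @with_defined if defined?(@with_defined)

    @defined_runs += 1
    @with_defined = nil
  end
end

settings = NilableSettings.new
3.times do
  settings.with_or_equals
  settings.with_defined
end
puts "||= computed #{settings.or_equals_runs} times"
puts "defined? computed #{settings.defined_runs} time"
```

    After three calls each, the `||=` version has done the work three times and the `defined?` version once.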

  65. In CocoaPods, when I re-wrote the build settings, I added a tiny
    little DSL to make memoization a bit easier:
    def self.define_build_settings_method(
      method_name, build_setting: false,
      memoized: false,
      sorted: false,
      uniqued: false,
      compacted: false,
      frozen: true,
      from_search_paths_aggregate_targets: false,
      from_pod_targets_to_link: false,
      &implementation)

  66. ➡ It uses an ivar hash called @__memoized to store those values:
    it calls #fetch on that hash, which returns the stored value
    unless the hash doesn't yet contain the key (in which case the value is computed and stored).
    ➡ One benefit of this is that the memoized state can be easily
    discarded by clearing that one ivar, instead of needing to keep track
    of a bunch of ivars, each only used in a single method
    implementation.
    ➡ By carefully selecting the key, we’re able to memoize both what a
    superclass and subclass return separately, so multiple calls to super
    can also be memoized.
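
    The points above can be sketched with a much smaller macro. This is an illustration under assumptions, not the CocoaPods implementation: the `Memoizable` module, the `memoize` macro, the `[owner, name]` key, and the example classes are all hypothetical.

```ruby
# Hypothetical class-level macro: memoize an already-defined instance method
# in a single @__memoized hash, keyed by defining class and method name.
module Memoizable
  def memoize(name)
    owner = self
    original = instance_method(name)
    define_method(name) do
      @__memoized ||= {}
      key = [owner, name]
      # fetch distinguishes "key missing" from "stored nil or false".
      @__memoized.fetch(key) { @__memoized[key] = original.bind(self).call }
    end
  end
end

class BaseSettings
  extend Memoizable
  attr_reader :base_runs

  def initialize
    @base_runs = 0
  end

  def flags
    @base_runs += 1
    ["-base"]
  end
  memoize :flags

  # All memoized state lives in one ivar, so discarding it is one assignment.
  def reset_memoized!
    @__memoized = nil
  end
end

class ChildSettings < BaseSettings
  def flags
    super + ["-child"]
  end
  memoize :flags
end

child = ChildSettings.new
2.times { child.flags }
puts child.flags.inspect
puts "base computed #{child.base_runs} time(s)"
```

    Including the defining class in the key is what lets the subclass result and the result of its call to super sit side by side in the same hash.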

  67. This DSL (and accompanying refactor)
    allowed me to fix years-old bugs in build settings generation.
    Those bug fixes would've slowed installation down by over
    5 minutes under the old implementation.

  68. Design Lessons Learned

  69. Design Lessons Learned
    Even after all this work (and it’s still ongoing!),
    CocoaPods isn't as fast as I’d like.
    And it never will be.

  70. ➡ Some of this comes down to language.
    ➡ Some of this comes down to the fact that
    the job CocoaPods does is inherently complicated.
    ➡ Some of it is our own fault,
    for having a system design from 7 years ago
    that wasn’t built to scale this far.

  71. Some particulars

  72. Exposing Mutable Objects
    ➡ Makes memoization hard.
    ➡ Need to have sidecar objects that hold computed values only
    when it's safe to do so.
    ➡ (unless you want to get into cache invalidation bugs)
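
    A minimal sketch of the problem, using hypothetical `LeakyTarget` and `FrozenTarget` classes: once callers can mutate an object that a memoized value was derived from, the stored value silently goes stale.

```ruby
# Hypothetical target that exposes its internal array and memoizes a value
# derived from it.
class LeakyTarget
  attr_reader :sources

  def initialize(sources)
    @sources = sources
  end

  def source_count
    @source_count ||= sources.size
  end
end

# Same shape, but the exposed collection is a frozen copy, so nothing can
# change underneath the memoized value.
class FrozenTarget < LeakyTarget
  def initialize(sources)
    super(sources.dup.freeze)
  end
end

leaky = LeakyTarget.new(["a.m"])
leaky.source_count     # computes and stores 1
leaky.sources << "b.m" # a caller mutates the exposed array
puts "memoized: #{leaky.source_count}, actual: #{leaky.sources.size}"

frozen = FrozenTarget.new(["a.m"])
begin
  frozen.sources << "b.m"
rescue FrozenError => e
  puts "mutation rejected: #{e.class}"
end
```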

  73. Reading & Writing to the File System In-Line
    ➡ Makes it very hard to parallelize IO.

  74. Having Consistent CLI Output
    ➡ Once again, makes parallelization hard.
    ➡ How do you show:
      ➡ progress
      ➡ underlying command invocations
      ➡ failures

  75. Inefficient Data Structures
    ➡ Particularly the podfile & podspec.
    ➡ The entire Xcode project model
    ➡ Built using nested hashes, making change-tracking hard.
    ➡ Lack of copy-on-write.

  76. Using Ruby to eval Objects Stored on Disk
    ➡ Allows state to leak in, makes computing stable hashes
    complicated because there are multiple valid representations to
    compute them from.

  77. Not Storing all Podfile information in the Podfile.lock
    ➡ Figuring out when it's changed based on only the info checked
    into git is nearly impossible.

  78. All-or-Nothing Installation Operation
    ➡ Needing to do the entire installation every time means all
    invocations are equally slow.
    ➡ Fixed now, thanks to @sebastianv1!

  79. Pathname
    ➡ Nope.
    ➡ Just nope.

  80. Design Lessons Learned
    Architecture matters just as much as algorithms.

  81. Picking your programming language is a
    tradeoff

  82. The Future of CocoaPods Performance

  83. Samuel Giddins
    @segiddins