Slide 1

Slide 1 text

Making CocoaPods Fast with Modern Ruby Tooling
Samuel Giddins

Slide 2

Slide 2 text

@segiddins Mobile Developer Experience @ Square CocoaPods Core Team Making CocoaPods Fast – Samuel Giddins @ Ruby on Ice 2019 2

Slide 3

Slide 3 text

bundler & rubygems

Slide 4

Slide 4 text

What even is CocoaPods?

Slide 5

Slide 5 text

== +

Slide 6

Slide 6 text

CocoaPods combines a definition of libraries (podspecs) with a way to integrate them into a user's application (the Podfile).

Slide 7

Slide 7 text

RubyGems / Bundler                    CocoaPods
Gemspec                               Podspec
Gemfile                               Podfile
Gemfile.lock                          Podfile.lock
rubygems.org, index.rubygems.org      github.com/CocoaPods/Specs

Slide 8

Slide 8 text

That little difference: Xcode

Slide 9

Slide 9 text

Xcode
➡ Proprietary toolchain for compiling apps for Apple platforms
➡ Uses its own manifest file format
➡ Invokes many compilation tools

Slide 10

Slide 10 text

So, CocoaPods has to do a heck of a lot more than RubyGems or Bundler.

Slide 11

Slide 11 text

➡ Updating local specs repositories
➡ Analyzing dependencies
➡ Fetching dependencies
➡ Installing dependencies
➡ Generating Pods project
➡ Integrating client project

Slide 12

Slide 12 text

So, CocoaPods exists! It was slow? So what.

Slide 13

Slide 13 text

Why Performance?

Slide 14

Slide 14 text

Why Performance?
➡ Optimization is fun!
➡ Rapid iteration == happier & more productive developers
➡ "Scale"
➡ Free performance is hard to find

Slide 15

Slide 15 text

So, I have a slow Ruby app. How do I make it faster?

Slide 16

Slide 16 text

How do I make it faster?
➡ Is it really slow?
  ➡ What is it doing?
  ➡ How often does it run?
  ➡ Would making it faster make a difference?
➡ Can it do less work?
➡ How can I keep it from getting worse?

Slide 17

Slide 17 text

Is it really slow?
➡ How much would I invest to make it 10% faster? 50%? 90%?
➡ How do I know which part of what it's doing is slow?

Slide 18

Slide 18 text

➡ this feels slow
➡ ⏱ this took 10 seconds
➡ this segment took 10 seconds
➡ this method was called 97645 times, averaging 23ms per call, taking up 48% of total runtime

Slide 19

Slide 19 text

Profiling

Slide 20

Slide 20 text

Profiling
The art of putting numbers to the feeling of "this feels slow"

Slide 21

Slide 21 text

What can I profile?
➡ Method calls
➡ Call Stack Sampling
➡ Memory allocations
➡ Disk I/O
➡ Network I/O
➡ Database queries
➡ Garbage collection

Slide 22

Slide 22 text

My work on CocoaPods focused on:
➡ Method Call Profiling
➡ Call Stack Sampling

Slide 23

Slide 23 text

Quick aside: one of our contributors got amazing gains by profiling allocations and memoizing things like #hash & Pathname#join.

Slide 24

Slide 24 text

Profilers
Just like most tools, it's important to pick the right one for the job at hand.

Slide 25

Slide 25 text

Tracing Profilers
They tell you how many times something was called, and usually how much time was spent in those calls.

Slide 26

Slide 26 text

Tracing Profilers
➡ Can distort relative numbers
➡ Can make running a program painfully slow
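
To make "tracing" concrete, here is a minimal sketch of a call counter built on core Ruby's TracePoint. It is not how ruby-prof works internally; the `count_calls` helper and the `Greeter` class are made up for illustration. Because a hook runs on every single call, it also hints at why tracing can slow a program down so much.

```ruby
# Count calls to methods defined in Ruby while the given block runs.
# count_calls and Greeter are hypothetical names used only for this sketch.
def count_calls
  counts = Hash.new(0)
  trace = TracePoint.new(:call) do |tp|
    counts["#{tp.defined_class}##{tp.method_id}"] += 1
  end
  trace.enable { yield } # the hook is only active inside this block
  counts
end

class Greeter
  def greet(name)
    format_name(name)
  end

  def format_name(name)
    name.capitalize
  end
end

counts = count_calls do
  greeter = Greeter.new
  3.times { greeter.greet("sam") }
end

puts counts["Greeter#greet"]
puts counts["Greeter#format_name"]
```

Each of the two `Greeter` methods is counted once per call, so both lines report 3.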

Slide 27

Slide 27 text

Tracing Profilers
"If it takes 35 minutes with no profiling, it may never finish under rubyprof. Sorry, I didn't think of that."
(CocoaPods/CocoaPods#5180)

Slide 28

Slide 28 text

Allocation Profilers
Tracing profilers that count memory allocations instead of method calls.

Slide 29

Slide 29 text

Allocation Profilers
➡ Suffer from similar problems, but even more so, since the typical Ruby program is incredibly allocation-heavy.
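
A rough way to see how allocation-heavy ordinary Ruby is, without a dedicated profiler, is to diff `GC.stat(:total_allocated_objects)` around a block. The `allocations` helper below is an illustrative assumption, and it only counts objects; a real allocation profiler also records where each allocation happened.

```ruby
# Number of objects allocated while the block runs (whole process, so keep
# other threads quiet when measuring). `allocations` is a hypothetical helper.
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# String interpolation builds at least one new String per iteration.
with_interpolation = allocations { 1_000.times { |i| "pod-#{i}" } }

# A symbol literal is reused, so the loop body allocates nothing new.
with_symbols = allocations { 1_000.times { :pod } }

puts "interpolation: #{with_interpolation} objects"
puts "symbols:       #{with_symbols} objects"
```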

Slide 30

Slide 30 text

Sampling Profilers
Take a peek at your program at regular intervals, from the outside.

Slide 31

Slide 31 text

Sampling Profilers
➡ (Generally) will not interfere at all with your app's normal execution.
➡ Possible to miss things; it's luck whether a particular stack trace gets sampled at the right time.
➡ Can’t tell you how many times something has been called.
➡ “Here were the stack traces when I looked”
  ➡ What was at the top of the stack at the time?

Slide 32

Slide 32 text

Manual Profilers
Sometimes, you don't want or need a fancy tool. The easiest output to understand is the one you write yourself.

Slide 33

Slide 33 text

Manual Profilers
➡ Wrapping method calls in a call to Benchmark.measure will tell you how long they took.
➡ Can help you focus in on what you already suspect to be hotspots
➡ More likely to tell you if the distribution of calls is not unimodal
  ➡ e.g. calling a memoized method, so only the first call will take a significant amount of time
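
A small sketch of that last point, using the standard library's Benchmark.measure. The `Spec` class and its `checksum` method are invented for the example, and the `sleep` stands in for real work.

```ruby
require "benchmark"

# Hypothetical class: checksum is memoized, so only the first call is slow.
class Spec
  def checksum
    @checksum ||= begin
      sleep 0.05 # stand-in for expensive work
      "abc123"
    end
  end
end

spec = Spec.new

# Time each call separately instead of averaging them together.
timings = 3.times.map { Benchmark.measure { spec.checksum }.real }

timings.each_with_index do |seconds, index|
  puts format("call %d: %.4fs", index + 1, seconds)
end
```

Timing every call individually is what exposes the lopsided distribution: an average over the three calls would hide the fact that only the first one costs anything.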

Slide 34

Slide 34 text

Chronometer
A tool I built to automate this sort of low-fidelity tracing.

Slide 35

Slide 35 text

Chronometer
➡ Wraps methods with some timing / recording infrastructure
➡ Can output results in a format compatible with chrome://tracing
➡ Happens in-process, so you can write code that uses the
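
The general idea of wrapping methods with timing can be sketched with Module#prepend. This is not Chronometer's implementation or API; `Timed`, `Timed.wrap`, and the `Installer` class below are hypothetical names chosen for the illustration.

```ruby
# Hypothetical helper: records wall-clock duration for selected methods.
module Timed
  RECORDS = []

  def self.wrap(klass, *method_names)
    wrapper = Module.new do
      method_names.each do |name|
        define_method(name) do |*args, **kwargs, &block|
          start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
          begin
            super(*args, **kwargs, &block) # call the original method
          ensure
            elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
            RECORDS << { name: "#{klass}##{name}", seconds: elapsed }
          end
        end
      end
    end
    klass.prepend(wrapper) # the wrapper now sits in front of the originals
  end
end

# Hypothetical class standing in for something worth timing.
class Installer
  def install!
    resolve
    :done
  end

  def resolve
    sleep 0.01
  end
end

Timed.wrap(Installer, :install!, :resolve)
result = Installer.new.install!

Timed::RECORDS.each do |record|
  puts format("%-20s %.4fs", record[:name], record[:seconds])
end
```

Records are appended when a method returns, so the inner `resolve` call is listed before the outer `install!` call that contains it.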

Slide 36

Slide 36 text

Chronofile

    for_class Pod::Installer do
      method :install!, context: event_phase_context[:toplevel]
      methods %i[
        prepare
        resolve_dependencies
        download_dependencies
        validate_targets
        generate_pods_project
        integrate_user_project
        perform_post_install_actions
        run_podfile_post_install_hooks
      ]
    end

    # ...

    for_class Pod::Resolver do
      method :resolve
    end

Slide 37

Slide 37 text

Profilers
For me, this meant a mix of
➡ ruby-prof
  ➡ Tracing Profiler
➡ rbspy
  ➡ Sampling Profiler
➡ chronometer

Slide 38

Slide 38 text

All 3 are amazing, and I used all of them extensively last year. But knowing which to reach for at each step of the investigation would’ve saved me a bunch of time.

Slide 39

Slide 39 text

Chronometer hit the sweet spot for the type of software I was developing, and also solved a couple other problems I had.

Slide 40

Slide 40 text

Profiling revealed a fundamental problem with CocoaPods' architecture.

Slide 41

Slide 41 text

Graph Traversal

Slide 42

Slide 42 text

Graph Traversal
The performance bottleneck that all build systems end up fighting.

Slide 43

Slide 43 text

Graph Traversal

      A
     / \
    B   C
     \ /
      D

Slide 44

Slide 44 text

Dependency Graph

Slide 45

Slide 45 text

Graph Traversal
The base problem we often face is: how do you find “all the nodes/vertices that come after A”?

Slide 46

Slide 46 text

It's tempting to say: “A is followed by B and C, let’s recurse!”

Slide 47

Slide 47 text

It's tempting to say: “A is followed by B and C, let’s recurse!”
But then you end up visiting D twice!
This might not sound like a big deal, but imagine 50 more nodes come after D...
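
To put a number on "visiting D twice", here is a small sketch that counts visits made by the naive recursion on a chain of diamonds, where each diamond's bottom node is the top of the next one. The graph representation (a Hash of node name to successor names) and both helper names are assumptions made for the example.

```ruby
# Build `levels` diamonds stacked end to end: n0 -> (l0, r0) -> n1 -> (l1, r1) -> n2 ...
def diamond_chain(levels)
  graph = {}
  levels.times do |i|
    top, left, right, bottom = "n#{i}", "l#{i}", "r#{i}", "n#{i + 1}"
    graph[top] = [left, right]
    graph[left] = [bottom]
    graph[right] = [bottom]
  end
  graph
end

# Naive recursion: counts every visit below `node`, repeats included.
def naive_visits(graph, node)
  graph.fetch(node, []).sum { |successor| 1 + naive_visits(graph, successor) }
end

graph = diamond_chain(10)
distinct = graph.values.flatten.uniq.size
visits = naive_visits(graph, "n0")

puts "distinct nodes after n0: #{distinct}"
puts "visits by naive recursion: #{visits}"
```

With 10 diamonds there are only 30 distinct nodes after the start, yet the naive recursion makes 4092 visits, because every extra diamond doubles the repeated work below it.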

Slide 48

Slide 48 text

It's not just CocoaPods that has this issue!

Slide 49

Slide 49 text

Graph traversal like this happens all over build tools. Why?

Slide 50

Slide 50 text

DAGs
➡ Directed Acyclic Graphs
  ➡ describe dependency structure.
➡ Every node (target, library, gem, etc.) has a list of dependencies
  ➡ only rule is “you can’t depend upon something that depends on you”
➡ Most operations rely on what’s called the “transitive closure” of a node
  ➡ “what are all the nodes that come after this one, no matter how far away?”

Slide 51

Slide 51 text

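    # Before: shared predecessors are re-walked once per path that reaches them.
    def recursive_predecessors
      vertices = predecessors
      vertices += vertices.flat_map(&:recursive_predecessors)
      vertices.uniq! # returns nil when nothing was removed, so return the array explicitly
      vertices
    end

to

    require 'set'

    # After: a shared Set ensures each vertex is expanded at most once.
    def recursive_predecessors
      vertices = Set.new
      visit = ->(vertex) do
        vertex.incoming_edges.each do |edge|
          origin = edge.origin
          next unless vertices.add?(origin) # add? returns nil if already seen
          visit[origin]
        end
      end
      visit[self]
      vertices
    end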
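
For readers who want to run the idea end to end, here is a self-contained sketch on the A/B/C/D diamond. The `Vertex` class is a minimal stand-in invented for this example (it stores direct predecessors rather than edge objects), and the traversal uses an explicit stack instead of recursion, which is a variation on the slide's code rather than a copy of it.

```ruby
require "set"

# Hypothetical minimal vertex: a name plus its direct predecessors.
class Vertex
  attr_reader :name, :predecessors

  def initialize(name)
    @name = name
    @predecessors = []
  end

  # All vertices reachable through predecessor links, each expanded once.
  def recursive_predecessors
    seen = Set.new
    stack = [self]
    until stack.empty?
      stack.pop.predecessors.each do |predecessor|
        # Set#add? returns nil for already-seen vertices, so they are skipped.
        stack.push(predecessor) if seen.add?(predecessor)
      end
    end
    seen
  end
end

a, b, c, d = %w[A B C D].map { |name| Vertex.new(name) }
b.predecessors << a
c.predecessors << a
d.predecessors << b << c

puts d.recursive_predecessors.map(&:name).sort.join(", ")
```

A is reachable from D through both B and C, but the Set means it is recorded and expanded only once.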

Slide 52

Slide 52 text

Memoization

Slide 53

Slide 53 text

Something that comes along with trying to solve graph traversal problems is memoization!

Slide 54

Slide 54 text

From before:

      A
     / \
    B   C
     \ /
      D

Slide 55

Slide 55 text

Let’s say some build setting for A depends on a build setting of B, C, and D.

Slide 56

Slide 56 text

This can get slow. Really slow. Reeeeeeeeeaaaaaaaallllllly slow.

Slide 57

Slide 57 text

Memoization
Calculate the value for the setting once and then store it.

Slide 58

Slide 58 text

➡ This is slightly different from caching
  ➡ There’s no “invalidation” that can happen
➡ We’re just lazily computing a value, and storing it so next time it’s needed, we can return the stored value

Slide 59

Slide 59 text

In Ruby, we do this a lot! Ever seen something like this?

    def expensive_to_calculate
      @expensive_to_calculate ||= ...
    end

Slide 60

Slide 60 text

This works great!
- But imagine if the return value for expensive_to_calculate can be nil, or false.
- The ||= short-circuiting won’t kick in.
- We’ll need to recompute the value every time.
- Not ideal.

Slide 61

Slide 61 text

Instead, we need to turn to a more complicated variation on the pattern:

    def expensive_to_calculate
      return @expensive_to_calculate if defined?(@expensive_to_calculate)
      @expensive_to_calculate = ...
    end
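
The difference between the two patterns is easy to check by counting how often the expensive part runs. The `Settings` class and its `lookup` method below are made up for the demonstration; `lookup` stands in for an expensive computation whose legitimate answer is nil.

```ruby
# Hypothetical class comparing the two memoization patterns.
class Settings
  attr_reader :lookups

  def initialize
    @lookups = 0
  end

  # `||=` recomputes whenever the stored value is nil or false.
  def with_or_equals
    @with_or_equals ||= lookup
  end

  # `defined?` asks whether the ivar was ever assigned, so nil is remembered.
  def with_defined
    return @with_defined if defined?(@with_defined)
    @with_defined = lookup
  end

  private

  # Stand-in for expensive work that legitimately returns nil.
  def lookup
    @lookups += 1
    nil
  end
end

or_equals = Settings.new
3.times { or_equals.with_or_equals }

with_defined = Settings.new
3.times { with_defined.with_defined }

puts "||= ran the lookup #{or_equals.lookups} times"
puts "defined? ran the lookup #{with_defined.lookups} time"
```

Three calls through `||=` run the lookup three times; three calls through the `defined?` version run it once.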

Slide 62

Slide 62 text

In CocoaPods, when I re-wrote the build settings, I added a tiny little DSL to make memoization a bit easier:

    def self.define_build_settings_method(method_name,
                                          build_setting: false,
                                          memoized: false,
                                          sorted: false,
                                          uniqued: false,
                                          compacted: false,
                                          frozen: true,
                                          from_search_paths_aggregate_targets: false,
                                          from_pod_targets_to_link: false,
                                          &implementation)

Slide 63

Slide 63 text

➡ It uses an ivar hash called @__memoized to store those values: it calls #fetch on that hash and returns the stored value, computing it only if the hash doesn't yet contain the key.
➡ One benefit of this is that the memoized state can be easily discarded by clearing that one ivar, instead of needing to keep track of a bunch of ivars, each only used in a single method implementation.
➡ By carefully selecting the key, we’re able to memoize both what a superclass and subclass return separately, so multiple calls to super can also be memoized.
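
The first two points can be sketched in a few lines. This is a simplified illustration, not the CocoaPods implementation: `Memoizable`, `define_memoized`, `clear_memoized!`, and the stand-in `BuildSettings` class are hypothetical, and the sketch keys the hash by method name only, so it does not cover the superclass/subclass case from the last point.

```ruby
# Hypothetical mini-DSL: one hash ivar holds every memoized value.
module Memoizable
  def define_memoized(name, &implementation)
    raw_name = :"_raw_#{name}"
    define_method(raw_name, &implementation) # the unmemoized implementation
    private raw_name

    define_method(name) do
      @__memoized ||= {}
      # Hash#fetch distinguishes "key missing" from "stored nil/false".
      @__memoized.fetch(name) { @__memoized[name] = send(raw_name) }
    end
  end
end

# Stand-in class, unrelated to the real CocoaPods class of the same name.
class BuildSettings
  extend Memoizable
  attr_reader :computations

  def initialize
    @computations = 0
  end

  define_memoized(:other_ldflags) do
    @computations += 1
    nil # a legitimate "nothing to add" result is still memoized
  end

  # Resetting one ivar discards all memoized values at once.
  def clear_memoized!
    @__memoized = {}
  end
end

settings = BuildSettings.new
3.times { settings.other_ldflags }
after_three_calls = settings.computations

settings.clear_memoized!
settings.other_ldflags
after_clear = settings.computations

puts "computations after three calls: #{after_three_calls}"
puts "computations after clearing and calling again: #{after_clear}"
```

Using a single hash is what makes the reset a one-liner; with one ivar per method, every memoized method would need its own cleanup.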

Slide 64

Slide 64 text

This DSL (and the accompanying refactor) allowed me to fix years-old bugs in build settings generation. Under the old implementation, those bug fixes would've slowed installation down by over 5 minutes.

Slide 65

Slide 65 text

Design Lessons Learned

Slide 66

Slide 66 text

Design Lessons Learned
Even after all this work (and it’s still ongoing!), CocoaPods isn't as fast as I’d like. And it never will be.

Slide 67

Slide 67 text

➡ some of this comes down to language
➡ some of this comes down to the fact that the job CocoaPods does is inherently complicated
➡ some of it is our own fault, for having a system design from 7 years ago that wasn’t built to scale this far

Slide 68

Slide 68 text

Some particulars

Slide 69

Slide 69 text

Exposing Mutable Objects
Makes memoization hard. Need to have sidecar objects that hold computed values only when it's safe to do so. (unless you want to get into cache invalidation bugs)
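
A tiny sketch of the kind of bug this refers to: a memoized value going stale because a caller mutated an object the class handed out. The `Target` class is hypothetical, and freezing the input is just one possible mitigation shown for contrast, not necessarily the approach CocoaPods takes.

```ruby
# Hypothetical class that exposes a mutable array and memoizes a value derived from it.
class Target
  attr_reader :flags # hands callers the array itself, not a copy

  def initialize(flags)
    @flags = flags
  end

  def flag_string
    @flag_string ||= flags.join(" ")
  end
end

target = Target.new(["-ObjC"])
target.flag_string        # memoizes "-ObjC"
target.flags << "-lz"     # a caller mutates the exposed array
stale = target.flag_string

# With a frozen array the mutation fails loudly instead of silently
# invalidating the memoized value.
frozen_target = Target.new(["-ObjC"].freeze)
mutation_blocked =
  begin
    frozen_target.flags << "-lz"
    false
  rescue FrozenError
    true
  end

puts "memoized value after mutation: #{stale.inspect}"
puts "mutation of frozen array blocked: #{mutation_blocked}"
```

The memoized string still reads "-ObjC" even though the array now also holds "-lz", which is the cache invalidation problem in miniature.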

Slide 70

Slide 70 text

Reading & Writing to the file system in-line
Makes it very hard to parallelize IO.

Slide 71

Slide 71 text

Having Consistent CLI Output
Once again, makes parallelization hard. How do you show progress, underlying command invocations, failures, etc?

Slide 72

Slide 72 text

Inefficient Data Structures
Particularly the Podfile & podspec. Built using nested hashes, making change-tracking hard. Lack of copy-on-write.

Slide 73

Slide 73 text

Using Ruby to eval Objects Stored on Disk
Allows state to leak in, and makes computing stable hashes complicated because there are multiple valid representations to compute them from.

Slide 74

Slide 74 text

Not Storing all Podfile information in the Podfile.lock
Figuring out when it's changed based on only the info checked into git is nearly impossible.

Slide 75

Slide 75 text

All-or-Nothing Installation Operation
Needing to do the entire installation every time means all invocations are equally slow. Fixed now, thanks to @sebastianv1!

Slide 76

Slide 76 text

Pathname
Nope. Just nope.

Slide 77

Slide 77 text

Design Lessons Learned
Architecture matters just as much as algorithms.

Slide 78

Slide 78 text

Samuel Giddins
@segiddins