Slide 1

Slide 1 text

Yuta Saito (@kateinoigakukun) 2023/01/21 Optimizing your Swift code try! Swift 2023 Tokyo Meetup

Slide 2

Slide 2 text

About me
• Yuta Saito / @kateinoigakukun
• Waseda University B4
• Working at
• Maintainer of SwiftWasm
• Committer to Swift / LLVM / CRuby

Slide 3

Slide 3 text

https://www.goodnotes.com/windows

Slide 4

Slide 4 text

Outline
1. Motivation: Why is performance important for us?
2. Background: Why is Swift “slow”?
3. Techniques: How to write compiler-friendly code
    1. Profile! Profile!! Profile!!!
    2. Reduce dynamic dispatch
    3. Reveal hidden CoW cost
    4. Value operation cost

Slide 5

Slide 5 text

1. Motivation: Why is performance important for us?

Slide 6

Slide 6 text

Motivation
When/Where does performance matter?
• Most apps don’t need to care
• Apps sensitive to frame dropping
    • Game apps, camera apps, etc.
    • 1 frame must be done within 16 ms (60 fps) or 8 ms (120 fps)

Slide 7

Slide 7 text

Motivation
When/Where does performance matter?
Non-optimized WebAssembly is still slow
• V8 has two AOT compilers*1:
    • Baseline (Liftoff)
    • Optimizing (TurboFan)
• Baseline is about 2x slower 😣
*1: https://v8.dev/docs/wasm-compilation-pipeline

Slide 8

Slide 8 text

Motivation GoodNotes’ Ink algorithm Compute Curve

Slide 9

Slide 9 text

2. Background: Why is Swift “slow”?

Slide 10

Slide 10 text

Why is Swift “slow”? https://benchmarksgame-team.pages.debian.net/benchmarksgame/box-plot-summary-charts.html

Slide 11

Slide 11 text

Why is Swift “slow”?
• Tends to be slower than C/C++
• High-level language features
    • ARC (Automatic Reference Counting)
    • CoW (Copy-on-Write)
    • Protocol
• Short code can have a large hidden cost

Slide 12

Slide 12 text

Why is Swift “slow”?
Automatic Reference Counting

class Animal {
    func bark() {}
}
class Cat: Animal {
    override func bark() { print("meow") }
}

let cat = Cat()
let animal = cat as Animal
animal.bark()

Q. Where will retain/release be placed?

Slide 13

Slide 13 text

Why is Swift “slow”?
Automatic Reference Counting

class Animal {
    func bark() {}
}
class Cat: Animal {
    override func bark() { print("meow") }
}

let cat = Cat()
// retain(cat)
let animal = cat as Animal
animal.bark()
// release(animal)
// release(cat)

Q. Where will retain/release be placed?
A. → Hidden cost!!

Slide 14

Slide 14 text

3. Techniques: How to write compiler-friendly code

Slide 15

Slide 15 text

What do you mean by “compiler-friendly”?
• Easy for compilers to optimize
• Compilers can optimize only a limited set of program patterns
• Hand-annotated restrictions help the compiler optimize

Slide 16

Slide 16 text

Profile! Profile!! Profile!!!
Performance is often bound by non-CPU work
• GPU
    • Alpha blending
• Event latency
• Blocking IO
    • Disk IO
    • Network IO
Do we really need to optimize CPU work?

Slide 17

Slide 17 text

Instruments.app • Go https://help.apple.com/instruments/mac/

Slide 18

Slide 18 text

Reduce dynamic dispatch
Dynamic dispatch happens when:
• A method is called through a class instance
• A method is called through a protocol type

Slide 19

Slide 19 text

Reduce dynamic dispatch
Class instance methods
• Avoid the open access modifier

open class Animal {
    open func bark() {}
}
func useAnimal(_ x: Animal) {
    x.bark() // Animal.bark can be overridden outside the module
}

• Check whether the compiler can know the callee method at compile time 🧐
• Use the most specific type possible (Cat < Animal < Any)
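A minimal sketch of the bullets above, assuming a hypothetical Animal/Cat hierarchy that lives in one module. Marking the subclass final and taking the most specific parameter type gives the compiler a statically known callee; whether the call is actually devirtualized still depends on the optimization settings.

```swift
// Hypothetical types for illustration; they are not taken from the talk's code base.
class Animal {
    func bark() -> String { "..." }
}

// `final` guarantees that Cat.bark has no further overrides.
final class Cat: Animal {
    override func bark() -> String { "meow" }
}

// The callee is unknown at compile time: any subclass of Animal may arrive here.
func useAnimal(_ x: Animal) -> String {
    x.bark()
}

// The callee is known: the parameter is a final class, so a direct call is possible.
func useCat(_ x: Cat) -> String {
    x.bark()
}

let cat = Cat()
print(useAnimal(cat)) // meow
print(useCat(cat))    // meow
```

Both functions return the same result; only the amount of information available to the optimizer differs.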

Slide 20

Slide 20 text

Reduce dynamic dispatch
Protocol methods
• Avoid existential containers to stay specialization-friendly

func usePingable(_ x: Pingable) {
    x.ping()
}

// Better if possible 👍
func usePingable(_ x: some Pingable) {
    x.ping()
}
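A small self-contained sketch of the two parameter styles, assuming Swift 5.7 or later and a hypothetical Server type that conforms to Pingable. The existential version receives a boxed value and calls through the protocol witness table; the opaque-parameter version is shorthand for a generic function, which the optimizer may specialize per concrete type.

```swift
protocol Pingable {
    func ping() -> Int
}

// Hypothetical conformer used only for this illustration.
struct Server: Pingable {
    let latency: Int
    func ping() -> Int { latency }
}

// Existential parameter: the concrete type is erased at the call boundary.
func pingExistential(_ x: any Pingable) -> Int {
    x.ping()
}

// Opaque parameter: equivalent to `func pingGeneric<T: Pingable>(_ x: T)`,
// so the concrete type stays visible to the optimizer.
func pingGeneric(_ x: some Pingable) -> Int {
    x.ping()
}

let server = Server(latency: 42)
print(pingExistential(server)) // 42
print(pingGeneric(server))     // 42
```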

Slide 21

Slide 21 text

Reduce dynamic dispatch
Protocol methods

Module A:
public func usePingable(_ x: some Pingable) {
    x.ping()
}

Module B:
import ModuleA
struct PingableImpl: Pingable { ... }
usePingable(PingableImpl())

Cannot remove dynamic dispatch 😣

Slide 22

Slide 22 text

Reduce dynamic dispatch
Protocol methods

Module A:
public struct PingableImpl: Pingable { ... }

// The generic parameter is named X so that the where clause can refer to it.
@_specialize(where X == PingableImpl)
public func usePingable<X: Pingable>(_ x: X) {
    x.ping()
}

Module B:
import ModuleA
usePingable(PingableImpl())

No dynamic dispatch! 💨

Slide 23

Slide 23 text

Reveal hidden CoW cost

func computeCurve(newEvents: [Event]) -> Curve {
    var allPoints = self.points
    for event in newEvents {
        allPoints.append(event.point)
    }
    let tails = Array(allPoints.suffix(3))
    return Curve(p0: tails[0], p1: tails[1], p2: tails[2])
}

⚠ Large copy
⚠ Temporary Array allocation
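The "large copy" can be made visible with a small experiment. The sketch below assumes a hypothetical helper, storageAddress, that reports where an array keeps its elements; it is only meant for comparing addresses, not for dereferencing them.

```swift
// Hypothetical helper: returns the address of the array's element storage.
// The pointer is used for comparison only and never dereferenced.
func storageAddress(_ array: [Int]) -> UnsafeRawPointer? {
    array.withUnsafeBufferPointer { buffer in
        buffer.baseAddress.map { UnsafeRawPointer($0) }
    }
}

let points = Array(0..<1_000)
var allPoints = points // no copy yet: both arrays share one buffer
let sharedBeforeAppend = storageAddress(points) == storageAddress(allPoints)

allPoints.append(1_000) // first mutation: the shared buffer is copied
let sharedAfterAppend = storageAddress(points) == storageAddress(allPoints)

print(sharedBeforeAppend, sharedAfterAppend) // true false
```

The assignment itself is cheap; the cost is paid at the first mutation, when all 1,000 elements are copied into a fresh buffer.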

Slide 24

Slide 24 text

Reveal hidden CoW cost

func computeCurve(newEvents: [Event]) -> Curve {
    let p0, p1, p2: Point
    switch newEvents.count {
    case 0:
        p0 = self.points[self.points.count - 3]
        p1 = self.points[self.points.count - 2]
        p2 = self.points[self.points.count - 1]
    case 1:
        p0 = self.points[self.points.count - 2]
        p1 = self.points[self.points.count - 1]
        p2 = newEvents[0].point
    case 2:
        p0 = self.points[self.points.count - 1]
        p1 = newEvents[0].point
        p2 = newEvents[1].point
    default:
        p0 = newEvents[newEvents.count - 3].point
        p1 = newEvents[newEvents.count - 2].point
        p2 = newEvents[newEvents.count - 1].point
    }
    return Curve(p0: p0, p1: p1, p2: p2)
}
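The same idea can be written more compactly by indexing into the two arrays as if they were concatenated. This is a sketch of a variation, not the code from the talk: Point, Event, Curve, and the enclosing Stroke type are hypothetical stand-ins, and the loop at the end checks the variation against the copying version.

```swift
// Hypothetical stand-ins for the types used on the slides.
struct Point: Equatable { var x: Double; var y: Double }
struct Event { var point: Point }
struct Curve: Equatable { var p0, p1, p2: Point }

struct Stroke {
    var points: [Point]

    // Copying version: duplicates `points` and allocates a temporary array.
    func computeCurveCopying(newEvents: [Event]) -> Curve {
        var allPoints = points
        for event in newEvents { allPoints.append(event.point) }
        let tails = Array(allPoints.suffix(3))
        return Curve(p0: tails[0], p1: tails[1], p2: tails[2])
    }

    // Allocation-free variation: reads the last three points in place.
    func computeCurveInPlace(newEvents: [Event]) -> Curve {
        let total = points.count + newEvents.count
        precondition(total >= 3, "need at least three points")
        // Looks up index i in the virtual concatenation points + newEvents.
        func point(at i: Int) -> Point {
            i < points.count ? points[i] : newEvents[i - points.count].point
        }
        return Curve(p0: point(at: total - 3),
                     p1: point(at: total - 2),
                     p2: point(at: total - 1))
    }
}

let stroke = Stroke(points: (0..<5).map { Point(x: Double($0), y: 0) })
var allMatch = true
for n in 0...4 {
    let events = (0..<n).map { Event(point: Point(x: 100 + Double($0), y: 1)) }
    let a = stroke.computeCurveCopying(newEvents: events)
    let b = stroke.computeCurveInPlace(newEvents: events)
    if a != b { allMatch = false }
}
print(allMatch) // true
```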

Slide 25

Slide 25 text

Reveal hidden CoW cost
https://forums.swift.org/t/in-place-mutation-of-an-enum-associated-value/11747

enum JSON {
    case string(String)
    case array([JSON])
}

func appendString(json: inout JSON) {
    switch json {
    case .array(var array):
        // `array` and the payload of `json` share the same storage here,
        // so the append below has to copy it first.
        array.append(.string("extra string"))
        json = .array(array)
    default:
        break
    }
}

⚠ Sharing the same storage
⚠ CoW triggered!

Slide 26

Slide 26 text

Reveal hidden CoW cost
https://forums.swift.org/t/in-place-mutation-of-an-enum-associated-value/11747

enum JSON {
    case string(String)
    // case array([JSON])
    case array(Box<[JSON]>)
}

func appendString(json: inout JSON) {
    switch json {
    case .array(let array):
        array.value.append(.string("extra string"))
    default:
        break
    }
}

🤔 Wrapped with Box to be uniquely referenced
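The slide does not show how Box is defined. A minimal sketch, assuming Box is a plain reference-type wrapper with a mutable value property, could look like this:

```swift
// Assumed definition: a final class that holds one mutable value.
final class Box<T> {
    var value: T
    init(_ value: T) { self.value = value }
}

enum JSON {
    case string(String)
    case array(Box<[JSON]>)
}

func appendString(json: inout JSON) {
    switch json {
    case .array(let box):
        // The array is referenced only by the box, so it is mutated in place.
        box.value.append(.string("extra string"))
    default:
        break
    }
}

var json = JSON.array(Box<[JSON]>([.string("first")]))
appendString(json: &json)
if case .array(let box) = json {
    print(box.value.count) // 2
}
```

The trade-off is that the payload now has reference semantics: two JSON values copied from each other share the same box, so a mutation through one is visible through the other.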

Slide 27

Slide 27 text

Value operation cost
• Struct copy is cheap only when the struct type is trivial
• Trivial types (POD: Plain Old Data): no extra copy, move, or destruction semantics
    • Int, Bool, Double, …
    • A struct type that consists only of trivial types
• Many container types in stdlib (Array, Set, …) have a fast path for trivial types
    • Optimized to be a memcpy

Slide 28

Slide 28 text

Value operation cost

class Owner { ... }
struct Item {          // non-trivial
    let id: Int        // trivial
    let owner: Owner   // non-trivial
}
var newItems = self.items
// for item in newItems {
//     retain(item.owner)
// }
// memcpy(newItems, self.items)
newItems.append(Item(...))

⚠ Non-trivial copy operation

struct Owner {}
struct Item {          // trivial
    let id: Int        // trivial
    let owner: Owner   // trivial
}
var newItems = self.items
// memcpy(newItems, self.items)
newItems.append(Item(...))

✅ Relatively trivial copy operation

Slide 29

Slide 29 text

Value operation cost
Check whether a type is trivial with _isPOD

print(_isPOD(Int.self))          // true
print(_isPOD(String.self))       // false
print(_isPOD(Array<Int>.self))   // false

struct Box<T> {
    let value: T
}
print(_isPOD(Box<Int>.self))     // true
print(_isPOD(Box<String>.self))  // false
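The same check can be applied to your own types. The sketch below uses hypothetical TrivialItem and NonTrivialItem structs to show how one class reference makes the whole struct non-trivial. Note that _isPOD is an underscored standard library function, so it is better suited to debugging and tests than to production logic.

```swift
// Hypothetical types for illustration.
final class Owner {}

struct TrivialItem {
    let id: Int
    let ownerID: Int // an identifier instead of a reference keeps the struct trivial
}

struct NonTrivialItem {
    let id: Int
    let owner: Owner // a class reference needs retain/release on copy
}

print(_isPOD(TrivialItem.self))    // true
print(_isPOD(NonTrivialItem.self)) // false
print(_isPOD((Int, Double).self))  // true: a tuple of trivial types
print(_isPOD([TrivialItem].self))  // false: Array owns a reference to its buffer
```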

Slide 30

Slide 30 text

Summary
• Swift has hidden costs even in short code
• Understanding the underlying mechanism makes your code fast!
• GoodNotes’ Ink algorithm is now 7x faster!
[Bar chart: Benchmark Time (ms), Before vs. After, axis from 0 to 600]

Slide 31

Slide 31 text

Resources
• Low-level Swift optimization tips by Kelvin Ma
  https://swiftinit.org/articles/low-level-swift-optimization
• Writing High-Performance Swift Code
  https://github.com/apple/swift/blob/main/docs/OptimizationTips.rst
• Understanding Swift Performance - WWDC 2016
  https://developer.apple.com/videos/play/wwdc2016/416/