Optimizing your Swift code

Yuta Saito
January 21, 2023

Transcript

  1. About me
    • Yuta Saito / @kateinoigakukun
    • Waseda University B4
    • Working at
    • Maintainer of SwiftWasm
    • Committer to Swift / LLVM / CRuby
  2. Outline
    1. Motivation: Why is performance important for us?
    2. Background: Why is Swift “slow”?
    3. Techniques: How to write compiler-friendly code
      1. Profile! Profile!! Profile!!!
      2. Reduce dynamic dispatch
      3. Reveal hidden CoW cost
      4. Value operation cost
  3. Motivation: When/Where does performance matter?
    • Apps sensitive to frame dropping
    • Most apps don’t need to care
    • Game apps, camera apps, etc.
    • One frame must be done within 16 ms (60 fps) or 8 ms (120 fps)
  4. Motivation: When/Where does performance matter?
    Non-optimized WebAssembly is still slow
    • V8 has two AOT compilers*1:
      • Baseline (Liftoff)
      • Optimizing (TurboFan)
    • Baseline is about 2x slower 😣
    *1: https://v8.dev/docs/wasm-compilation-pipeline
  5. Why is Swift “slow”?
    • Tends to be slower than C/C++
    • High-level language features
      • ARC (Automatic Reference Counting)
      • CoW (Copy-on-Write)
      • Protocol
    • Short code can have a large hidden cost
  6. Why is Swift “slow”? Automatic Reference Counting

        class Animal {
            func bark() {}
        }
        class Cat: Animal {
            override func bark() { print("meow") }
        }
        let cat = Cat()
        let animal = cat as Animal
        animal.bark()

    Q. Where will retain/release be placed?
  7. Why is Swift “slow”? Automatic Reference Counting

        class Animal {
            func bark() {}
        }
        class Cat: Animal {
            override func bark() { print("meow") }
        }
        let cat = Cat()
        // retain(cat)
        let animal = cat as Animal
        animal.bark()
        // release(animal)
        // release(cat)

    Q. Where will retain/release be placed?
    A. → Hidden cost!!
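
The retain and release calls on this slide are inserted by the compiler and never appear in source, but their effect can be observed indirectly. The following is a minimal sketch, assuming a hypothetical `Owner` class and `observeSharing` helper that are not part of the slides. It uses the standard library's `isKnownUniquelyReferenced` to show that binding a second name to the same object adds a strong reference.

```swift
// Hypothetical class used only for this illustration.
final class Owner {}

// Returns whether the object was uniquely referenced before and while
// a second strong reference to it was alive.
func observeSharing() -> (before: Bool, during: Bool) {
    var first = Owner()
    // Only `first` refers to the object here.
    let before = isKnownUniquelyReferenced(&first)

    // Binding another name makes ARC retain the same object.
    let second = first

    // Keep `second` alive while checking, so the optimizer cannot
    // release it early.
    let during = withExtendedLifetime(second) {
        isKnownUniquelyReferenced(&first)
    }
    return (before, during)
}

let result = observeSharing()
print(result.before, result.during) // prints "true false"
```

The same uniqueness check is what copy-on-write containers rely on, which is why an extra reference can turn into an extra copy later in the talk.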
  8. What does “compiler-friendly” mean?
    • Easy for compilers to optimize
    • Compilers can optimize only a limited set of program patterns
    • Hand-written annotations that restrict behavior help the compiler optimize
  9. Profile! Profile!! Profile!!!
    Performance is often bound by non-CPU work
    • GPU
      • Alpha blending
    • Event latency
    • Blocking IO
      • Disk IO
      • Network IO
    Do we really need to optimize CPU work?
  10. Reduce dynamic dispatch
    Dynamic dispatch happens when:
    • A method is called through a class instance
    • A method is called through a protocol type
  11. Reduce dynamic dispatch: Class instance methods
    • Avoid the open access modifier

        open class Animal {
            open func bark() {}
        }
        func useAnimal(_ x: Animal) {
            x.bark() // Animal.bark can be overridden outside the module
        }

    • Check whether the compiler can know the callee method at compile time 🧐
    • Use the most specific type possible (Cat < Animal < Any)
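
One way to let the compiler know the callee at compile time is the `final` keyword. The slide itself only mentions avoiding `open` and preferring specific types, so `final` is an addition here, taken from the "Writing High-Performance Swift Code" document listed under Resources. A minimal sketch, with return values added to the slide's `bark()` purely so the result can be checked:

```swift
class Animal {
    func bark() -> String { "..." }
}

// `final` promises that no subclass can override Cat's methods.
final class Cat: Animal {
    override func bark() -> String { "meow" }
}

// The callee depends on the runtime type of `x`, so this call generally
// needs dynamic dispatch unless the optimizer can prove the exact type.
func useAnimal(_ x: Animal) -> String { x.bark() }

// `Cat` is final, so Cat.bark is the only possible callee and the
// compiler is free to call it directly.
func useCat(_ x: Cat) -> String { x.bark() }

let cat = Cat()
let viaAnimal = useAnimal(cat)
let viaCat = useCat(cat)
```

Both calls return the same value; the difference is only in how much the compiler knows about the callee when it compiles `useCat` versus `useAnimal`.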
  12. Reduce dynamic dispatch: Protocol methods
    • Avoid existential containers to stay specialization-friendly

        func usePingable(_ x: Pingable) { x.ping() }

        // Better if possible 👍
        func usePingable(_ x: some Pingable) { x.ping() }
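
The two signatures above can be compared side by side in a self-contained sketch. The `Pingable` protocol body, the `Server` type, and the function names `pingExistential` and `pingGeneric` are assumptions made for illustration; the slides do not show them. The `some` parameter syntax needs Swift 5.7 or later.

```swift
protocol Pingable {
    func ping() -> String
}

// Hypothetical conforming type for the example.
struct Server: Pingable {
    let name: String
    func ping() -> String { "pong from \(name)" }
}

// Existential parameter: the value is boxed in an existential container
// and `ping` is looked up through the protocol witness table at runtime.
func pingExistential(_ x: any Pingable) -> String { x.ping() }

// Opaque (generic) parameter: the concrete type is known at each call
// site, so the compiler can specialize the function and call
// Server.ping directly.
func pingGeneric(_ x: some Pingable) -> String { x.ping() }

let server = Server(name: "a")
let viaExistential = pingExistential(server)
let viaGeneric = pingGeneric(server)
```

The observable result is identical; only the generic form gives the optimizer the chance to remove the indirection.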
  13. Reduce dynamic dispatch: Protocol methods
    Module A:

        public func usePingable(_ x: some Pingable) { x.ping() }

    Module B:

        import ModuleA
        struct PingableImpl: Pingable { ... }
        usePingable(PingableImpl())

    Cannot remove dynamic dispatch across the module boundary 😣
  14. Reduce dynamic dispatch: Protocol methods
    Module A:

        public struct PingableImpl: Pingable { ... }

        @_specialize(where X == PingableImpl)
        public func usePingable<X: Pingable>(_ x: X) { x.ping() }

    Module B:

        import ModuleA
        usePingable(PingableImpl())

    No dynamic dispatch! 💨
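
The attribute can be tried out in a single file. This is a minimal sketch, assuming a `Pingable` protocol whose `ping()` returns a `String` (the slides elide the protocol body). Note that `@_specialize` is an underscored, unofficial attribute whose behavior may change between compiler versions, and that inside one module the optimizer can usually specialize on its own; the attribute matters when the caller lives in another module.

```swift
public protocol Pingable {
    func ping() -> String
}

public struct PingableImpl: Pingable {
    public init() {}
    public func ping() -> String { "pong" }
}

// Ask the compiler to also emit a version of this function specialized
// for X == PingableImpl, so callers passing PingableImpl can reach a
// version that calls PingableImpl.ping directly.
@_specialize(where X == PingableImpl)
public func usePingable<X: Pingable>(_ x: X) -> String {
    x.ping()
}

let reply = usePingable(PingableImpl())
```

The `where` clause refers to a generic parameter by name, which is why the sketch spells out `<X: Pingable>` rather than using `some Pingable`.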
  15. Reveal hidden CoW cost

        func computeCurve(newEvents: [Event]) -> Curve {
            var allPoints = self.points              // ⚠ Large copy
            for event in newEvents {
                allPoints.append(event.point)
            }
            let tails = Array(allPoints.suffix(3))   // ⚠ Temporary Array allocation
            return Curve(p0: tails[0], p1: tails[1], p2: tails[2])
        }
  16. Reveal hidden CoW cost

        func computeCurve(newEvents: [Event]) -> Curve {
            let p0, p1, p2: Point
            switch newEvents.count {
            case 0:
                p0 = self.points[self.points.count - 3]
                p1 = self.points[self.points.count - 2]
                p2 = self.points[self.points.count - 1]
            case 1:
                p0 = self.points[self.points.count - 2]
                p1 = self.points[self.points.count - 1]
                p2 = newEvents[0].point
            case 2:
                p0 = self.points[self.points.count - 1]
                p1 = newEvents[0].point
                p2 = newEvents[1].point
            default:
                p0 = newEvents[newEvents.count - 3].point
                p1 = newEvents[newEvents.count - 2].point
                p2 = newEvents[newEvents.count - 1].point
            }
            return Curve(p0: p0, p1: p1, p2: p2)
        }
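
The switch above reads the last three elements of the concatenation of two arrays without ever building that concatenation. The same idea can be written with index arithmetic. This is a sketch under assumptions: `Point`, `lastThreeNaive`, and `lastThreeIndexed` are hypothetical names, and free functions stand in for the method on the slide so the two approaches can be compared directly.

```swift
struct Point: Equatable {
    var x: Double
    var y: Double
}

// Naive version: copies every stored point and allocates a temporary array.
func lastThreeNaive(stored: [Point], incoming: [Point]) -> [Point] {
    var all = stored
    all.append(contentsOf: incoming)
    return Array(all.suffix(3))
}

// Index-based version: treats `stored + incoming` as one virtual sequence
// and reads only the three elements that are needed.
func lastThreeIndexed(stored: [Point], incoming: [Point]) -> (Point, Point, Point) {
    let total = stored.count + incoming.count
    precondition(total >= 3, "need at least three points")

    // Element `i` of the virtual concatenation.
    func element(_ i: Int) -> Point {
        i < stored.count ? stored[i] : incoming[i - stored.count]
    }
    return (element(total - 3), element(total - 2), element(total - 1))
}

let stored = [Point(x: 0, y: 0), Point(x: 1, y: 1), Point(x: 2, y: 2)]
let incoming = [Point(x: 3, y: 3)]
let naive = lastThreeNaive(stored: stored, incoming: incoming)
let indexed = lastThreeIndexed(stored: stored, incoming: incoming)
```

Both versions return the same three points; the indexed one does no array copy and no allocation regardless of how many points are stored.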
  17. Reveal hidden CoW cost
    https://forums.swift.org/t/in-place-mutation-of-an-enum-associated-value/11747

        enum JSON {
            case string(String)
            case array([JSON])
        }
        func appendString(json: inout JSON) {
            switch json {
            case .array(var array):                 // ⚠ Sharing the same storage
                array.append(.string("extra string")) // ⚠ CoW triggered!
                json = .array(array)
            default: break
            }
        }
  18. Reveal hidden CoW cost
    https://forums.swift.org/t/in-place-mutation-of-an-enum-associated-value/11747

        enum JSON {
            case string(String)
            // case array([JSON])
            case array(Box<[JSON]>)
        }
        func appendString(json: inout JSON) {
            switch json {
            case .array(let array):
                array.value.append(.string("extra string"))
            default: break
            }
        }

    🤔 Wrapped with Box so that the array stays uniquely referenced
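
The slide does not show how `Box` is defined. A minimal runnable sketch follows, assuming `Box` is a small `final class` with a mutable `value` property; the `elementCount` helper is hypothetical and exists only to inspect the result.

```swift
// Assumed definition: a reference-type wrapper around a value.
final class Box<T> {
    var value: T
    init(_ value: T) { self.value = value }
}

enum JSON {
    case string(String)
    case array(Box<[JSON]>)
}

func appendString(json: inout JSON) {
    switch json {
    case .array(let box):
        // Pattern matching copies only the reference to the box. The array
        // inside is still referenced from one place, so append mutates it
        // in place without a copy-on-write.
        box.value.append(.string("extra string"))
    default:
        break
    }
}

// Hypothetical helper to read back the number of array elements.
func elementCount(_ json: JSON) -> Int {
    if case .array(let box) = json { return box.value.count }
    return 0
}

var document = JSON.array(Box<[JSON]>([JSON.string("first")]))
appendString(json: &document)
print(elementCount(document)) // prints "2"
```

The trade-off is that `JSON` no longer has pure value semantics: two copies of a `.array` value share the same box, so a mutation through one is visible through the other.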
  19. Value operation cost
    • Struct copy is cheap only when the struct type is trivial
    • Trivial types (POD: Plain Old Data): no extra copy, move, or destruction semantics
      • Int, Bool, Double, …
      • A struct type that consists only of trivial types
    • Many container types in the stdlib (Array, Set, …) have a fast path for trivial types
      • Optimized to be a memcpy
  20. Value operation cost

        class Owner { ... }
        struct Item {            // non-trivial
            let id: Int          // trivial
            let owner: Owner     // non-trivial
        }
        var newItems = self.items
        // for item in newItems {
        //     retain(item.owner)
        // }
        // memcpy(self.items, newItems)
        newItems.append(Item(...))

    ⚠ Non-trivial copy operation

        struct Owner {}
        struct Item {            // trivial
            let id: Int          // trivial
            let owner: Owner     // trivial
        }
        var newItems = self.items
        // memcpy(self.items, newItems)
        newItems.append(Item(...))

    ✅ Relatively trivial copy operation
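
The two `Item` variants can be checked with the standard library's underscored `_isPOD` function, which the deck also uses on its own slide. This is a small sketch; the type names `OwnerRef`, `OwnerValue`, `ItemWithRef`, and `ItemWithValue` are hypothetical, chosen so both variants can coexist in one file.

```swift
// Owner as a class: holding it requires reference counting.
final class OwnerRef {}

// Owner as an empty struct: nothing to retain or release.
struct OwnerValue {}

struct ItemWithRef {
    let id: Int
    let owner: OwnerRef
}

struct ItemWithValue {
    let id: Int
    let owner: OwnerValue
}

// A struct is trivial only if every stored property is trivial.
let refItemIsTrivial = _isPOD(ItemWithRef.self)
let valueItemIsTrivial = _isPOD(ItemWithValue.self)
print(refItemIsTrivial, valueItemIsTrivial) // prints "false true"
```

A single class-typed field is enough to make the whole struct non-trivial, which is what turns an array copy from a plain memcpy into a loop of retains.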
  21. Value operation cost
    Check whether a type is trivial with _isPOD

        print(_isPOD(Int.self))          // true
        print(_isPOD(String.self))       // false
        print(_isPOD(Array<Int>.self))   // false

        struct Box<T> { let value: T }
        print(_isPOD(Box<Int>.self))     // true
        print(_isPOD(Box<String>.self))  // false
  22. Summary
    • Swift code can carry hidden costs even when it is short
    • Understanding the underlying mechanisms makes your code fast!
    • GoodNotes’ Ink algorithm is now 7x faster!
    [Bar chart: Benchmark Time (ms), Before vs. After, axis from 0 to 600]
  23. Resources
    • Low-level Swift optimization tips by Kelvin Ma
      https://swiftinit.org/articles/low-level-swift-optimization
    • Writing High-Performance Swift Code
      https://github.com/apple/swift/blob/main/docs/OptimizationTips.rst
    • Understanding Swift Performance - WWDC 2016
      https://developer.apple.com/videos/play/wwdc2016/416/