Slow Swift

Exploring Swift runtime performance. Singapore 2017. http://iOSConf.sg

Video: https://www.youtube.com/watch?v=G88qaR9R0v0

Marcin Krzyzanowski

October 20, 2017

Transcript

  1. 2017
    Slow Swift

  2. Marcin Krzyżanowski
    @krzyzanowskim
    PDFViewer.io
    pspdfkit.com
    github.com/krzyzanowskim
    CryptoSwift
    ObjectivePGP

    Natalie
    krzyzanowskim.com

  4. 1.4x

  5. Why Swift?
    “A [Objective-]C program is like a fast dance on a newly
    waxed dance floor by people carrying razors.”
    –Waldi Ravens (“Objective-” added by the editor)

  7. Script Swift?
    #!/usr/bin/env xcrun -sdk macosx swift
    print("iOSConf.sg Rocks!")

  8. Script Swift?
    #!/usr/bin/env xcrun -sdk macosx swift
    print("iOSConf.sg Rocks!")
    -Onone
    500x to 1000x slower than default optimized build

  9. Compiled Swift
    • Optimized builds

    • Build times ;-(

  10. Compiled Swift
    • Optimized builds

    • Build times ;-(

    let arr: [Int] = [1] + [2] + [3] + [4] + [5] + [6]

  11. Compiled Swift
    • Optimized builds

    • Build times ;-(

    let arr: [Int] = [1] + [2] + [3] + [4] + [5] + [6]
    Expression is too complex
    to be solved in reasonable time
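
The error on this slide comes from the type checker having to resolve the overloaded `+` for every operand of one long expression. A minimal sketch of two common workarounds, assuming the goal is simply to build the array; the names `literal`, `head`, `tail` and `combined` are illustrative and not from the talk:

```swift
// Workaround 1: a single array literal needs no overloaded `+` at all.
let literal: [Int] = [1, 2, 3, 4, 5, 6]

// Workaround 2: split the chain into short, explicitly typed steps so the
// type checker solves each small expression separately.
let head: [Int] = [1] + [2] + [3]
let tail: [Int] = [4] + [5] + [6]
let combined: [Int] = head + tail

print(literal == combined)
```

Whether the original one-line expression still fails depends on the compiler version, so treat this as a pattern rather than a required fix.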

  12. Compiled Swift
    “Speedup AES.encrypt() compilation time from 68230.37ms to 678.34ms”

  13. Source → Binary
    as seen…

  14. Compilation
    SWIFT_OPTIMIZATION_LEVEL
    -Onone
    -O
    -O -whole-module-optimization

  16. Compilation
    SWIFT_DISABLE_SAFETY_CHECKS
    Disable runtime safety checks when optimizing.

  17. Compilation
    SWIFT_ENFORCE_EXCLUSIVE_ACCESS
    full
    compile-time
    none
    (SE-0176)

  18. Compilation
    GCC_GENERATE_DEBUGGING_SYMBOLS
    The shortcuts taken by optimized code may occasionally produce
    surprising results: some variables you declared may not exist at all

  19. Source Optimization
    • &+, &- and &* instead of +, - and *
    • discarding any overflow (≈ SWIFT_DISABLE_SAFETY_CHECKS)
    • init(truncatingIfNeeded:)
    • init(truncatingBitPattern:) no longer public ;-)
    • private init(_truncatingBits:) uses Builtin.truncOrBitCast
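
A small sketch of the overflow operators and the truncating initializer named on this slide. The values are made up for illustration; only `&+`, `&-`, `&*` and `init(truncatingIfNeeded:)` come from the slide:

```swift
let a: UInt8 = 250
let b: UInt8 = 10

// `a + b` would trap at runtime because 260 does not fit in UInt8.
// `&+` wraps around instead of checking for overflow.
let wrappedSum = a &+ b            // 260 mod 256

// `&-` and `&*` wrap the same way.
let wrappedDifference = b &- a     // -240 mod 256
let wrappedProduct = a &* 2        // 500 mod 256

// init(truncatingIfNeeded:) keeps only the low bits that fit the target type.
let wide: UInt32 = 0x1234_56FF
let lowByte = UInt8(truncatingIfNeeded: wide)

print(wrappedSum, wrappedDifference, wrappedProduct, lowByte)
```

The wrapping operators skip the overflow check on each operation, which is why they show up in hot loops such as hashing or cipher rounds where wrap-around is the intended arithmetic.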

  20. Source Optimization
    In-Out Parameters
    “An in-out parameter has a value that is passed in to the
    function, is modified by the function, and is passed back out
    of the function to replace the original value.”
    –The Swift Programming Language
    func core(block: inout Array)

  21. Source Optimization
    In-Out Parameters
    • Can be optimized to pass-by-value
    • Pointers are useful for performance
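
The Swift language reference describes in-out as copy-in copy-out, which the compiler may optimize so that the callee works directly on the caller's storage. The sketch below contrasts an in-place `inout` function with one that returns a new array, and then uses a buffer pointer for the same job. `doubleInPlace` and `doubled` are hypothetical stand-ins for the slide's `core(block:)`, and the `UInt32` element type is an assumption:

```swift
// Hypothetical stand-in for core(block:): doubles each element in place.
func doubleInPlace(block: inout [UInt32]) {
    for i in block.indices {
        block[i] = block[i] &* 2
    }
}

// Returning a new array instead allocates fresh storage on every call.
func doubled(block: [UInt32]) -> [UInt32] {
    return block.map { $0 &* 2 }
}

var state: [UInt32] = [1, 2, 3, 4]
doubleInPlace(block: &state)
let afterInout = state

// Pointer-based variant: mutate the array's storage through a buffer pointer.
state.withUnsafeMutableBufferPointer { buffer in
    for i in buffer.indices {
        buffer[i] = buffer[i] &+ 1
    }
}

print(afterInout, state)
```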

  22. Source Optimization
    Obvious
    • struct over class

    • final class

    • private final class
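
A rough illustration of the three bullets, using made-up types (`PointValue`, `OpenShape`, `Square`) that are not from the talk. The comments state the usual reasoning behind each choice; actual gains depend on the code and the optimizer:

```swift
// Value type: no reference counting, methods are dispatched statically.
struct PointValue {
    var x: Int
    var y: Int
    func lengthSquared() -> Int { return x * x + y * y }
}

// Non-final class: subclasses may override area(), so calls made through
// an OpenShape reference can require dynamic dispatch.
class OpenShape {
    func area() -> Int { return 0 }
}

// final: nothing can override Square.area(), so calls on a Square
// can be resolved directly at compile time.
final class Square: OpenShape {
    let side: Int
    init(side: Int) { self.side = side }
    override func area() -> Int { return side * side }
}

let point = PointValue(x: 3, y: 4)
let square = Square(side: 5)
print(point.lengthSquared(), square.area())
```

Marking a class `private` in addition to `final` limits its visibility to one file, which gives the compiler a complete view of how the type is used.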

  23. Source Optimization
    non-Obvious
    2.7s

  24. Source Optimization
    non-Obvious

  25. Source Optimization
    non-Obvious
    1.2s
    2.7s

  26. Source Optimization
    • for-loop over map
    • for-loop over reduce
    • transformation over the dictionary with reduce may end up with O(n²)
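
A sketch of why `reduce` over a dictionary can turn quadratic, using a made-up word-count example. With an immutable accumulator each step copies the dictionary before inserting; the for-loop and `reduce(into:)` mutate one dictionary in place:

```swift
let words = ["swift", "is", "slow", "swift", "is", "fast"]

// reduce with an immutable accumulator: every step copies the dictionary
// before inserting, so building n entries can cost O(n²) overall.
let copied = words.reduce([String: Int]()) { (counts: [String: Int], word: String) -> [String: Int] in
    var next = counts
    next[word, default: 0] += 1
    return next
}

// Plain for-loop: one dictionary, mutated in place.
var counted: [String: Int] = [:]
for word in words {
    counted[word, default: 0] += 1
}

// reduce(into:) (Swift 4 and later) also mutates the accumulator in place.
let reducedInto = words.reduce(into: [String: Int]()) { counts, word in
    counts[word, default: 0] += 1
}

print(copied == counted, reducedInto == counted)
```

All three produce the same result; the difference is only in how much copying happens along the way.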

  27. Source Optimization
    0.9s

  28. Source Optimization
    0.8s

  29. Source Optimization
    Inlining

  30. Source Optimization
    Inlining
    • Automatic inlining (SIL, LLVM)

  31. Source Optimization
    Inlining
    • Automatic inlining (SIL, LLVM)
    • Inline all the things with @inline(__always)

  32. Source Optimization
    Inlining
    • Automatic inlining (SIL, LLVM)
    • Inline all the things with @inline(__always)
    • Force inlining with @_transparent

  33. Source Optimization
    Inlining
    • Automatic inlining (SIL, LLVM)
    • Inline all the things with @inline(__always)
    • Force inlining with @_transparent
    • Public interface @_inlineable

  34. Source Optimization
    Inlining
    • Automatic inlining (SIL, LLVM)
    • Inline all the things with @inline(__always)
    • Force inlining with @_transparent
    • Public interface @_inlineable
    • Be careful with ABI and public API
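
A minimal sketch of the `@inline(__always)` attribute from the list above, applied to a small hot-path helper. `rotateLeft` and `checksum` are hypothetical functions written for this example; the underscored attributes (`@_transparent`, `@_inlineable`) are left out because they are unofficial and their spelling has changed between Swift versions:

```swift
// Hypothetical helper: a bit rotation, typical of tiny functions called in hot loops.
@inline(__always)
func rotateLeft(_ value: UInt32, by amount: UInt32) -> UInt32 {
    return (value << amount) | (value >> (32 - amount))
}

// @inline(never) is the opposite request, handy for keeping a function
// visible as its own frame while profiling.
@inline(never)
func checksum(_ words: [UInt32]) -> UInt32 {
    var accumulator: UInt32 = 0
    for word in words {
        accumulator = rotateLeft(accumulator ^ word, by: 5)
    }
    return accumulator
}

print(checksum([1, 2]))
```

Forcing inlining everywhere grows code size, so it is usually worth measuring before and after rather than applying the attribute by default.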

  35. Memory

  37. Memory
    • What if…. raw chunk of memory (Unmanaged)

    • allocate buffer

    • deallocate buffer

    • no ARC, no COW, nothing

  38. Memory

  39. Memory
    Array() turns out to be nearly 10% slower compared to
    UnsafeMutablePointer.allocate(capacity:)
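
A sketch of the two approaches being compared, assuming a 1024-byte buffer of `UInt8` (the element type and size are not given in the transcript). It uses the current spelling `deallocate()`; Swift 4.0, current at the time of the talk, spelled it `deallocate(capacity:)`:

```swift
let count = 1024

// Managed storage: Array brings ARC, copy-on-write and bounds checks.
var managed = [UInt8](repeating: 0, count: count)
managed[0] = 0xFF

// Raw storage: allocate, initialize, use, then clean up by hand.
let raw = UnsafeMutablePointer<UInt8>.allocate(capacity: count)
raw.initialize(repeating: 0, count: count)
raw[0] = 0xFF
let firstByte = raw[0]

// Nothing frees this memory automatically, and nothing checks bounds.
raw.deinitialize(count: count)
raw.deallocate()

print(managed[0] == firstByte)
```

The manual version trades safety for control: forgetting `deallocate()` leaks, and touching `raw` after it is undefined behavior.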

  40. Memory
    Stack (0xFFFFFF)
    free space
    Heap
    Static data
    Literals
    Instructions (0x000000)

  41. Memory
    Stack
    free space
    Heap
    Static data
    Literals
    Instructions
    • Stack allocation - fast (%rsp)

  42. Memory
    Stack
    free space
    Heap
    Static data
    Literals
    Instructions
    • Stack allocation - fast (%rsp)
    • Heap allocation - slow (slower) (syscalls)

  43. Memory
    Stack
    free space
    Heap
    Static data
    Literals
    Instructions
    • Stack allocation - fast (%rsp)
    • Heap allocation - slow (slower) (syscalls)
    • Preallocation

  44. Memory
    Stack
    free space
    Heap
    Static data
    Literals
    Instructions
    • Stack allocation - fast (%rsp)
    • Heap allocation - slow (slower) (syscalls)
    • Preallocation
    • array.reserveCapacity(1024)
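
A small sketch of preallocation with `reserveCapacity`, using the 1024 from the slide. The variable names are illustrative:

```swift
let n = 1024

// Growing on demand: the array reallocates its heap buffer several times as it fills.
var grown: [Int] = []
for i in 0..<n {
    grown.append(i)
}

// Preallocating: one heap allocation up front, no reallocation while appending.
var reserved: [Int] = []
reserved.reserveCapacity(n)
let capacityBefore = reserved.capacity
for i in 0..<n {
    reserved.append(i)
}

print(capacityBefore >= n, reserved.capacity == capacityBefore)
```

Both loops produce the same contents; the second one avoids the repeated allocate-and-copy cycles.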

  45. Quiz

  46. Allocation
    Stack
    free space
    Heap
    Static data
    Literals
    Instructions

  47. Allocation
    Stack
    free space
    Heap
    Static data
    Literals
    Instructions
    (lldb) register read rsp
    rsp = 0x00007fff5fbff630
    (lldb) frame variable -L
    0x00007fff5fbff658: ([Int]) array = 3 values {
    0x00007fa5391de610: [0] = 0
    0x00007fa53e6bf3f0: [1] = -1
    0x00007fa53e6a91d0: [2] = 1
    }

  48. Allocation
    Stack
    free space
    Heap
    Static data
    Literals
    Instructions

  49. Allocation
    Stack
    free space
    Heap
    Static data
    Literals
    Instructions
    (lldb) register read rsp
    rsp = 0x000070000ec06c00
    (lldb) fr variable -L
    0x000070000ec06c18: (@lvalue [Int]) array = 0x0000000100d002b0: {
    0x0000000100d002b0: &array = 2 values {
    0x00007fd7df41ae10: [0] = 0
    0x00007fd7df485c90: [1] = -1
    0x00007fd7df4829b0: [2] = 1
    }

  50. Allocation
    Stack
    free space
    Heap
    Static data
    Literals
    Instructions

  51. Allocation
    Stack
    free space
    Heap
    Static data
    Literals
    Instructions
    (lldb) register read rsp
    rsp = 0x000070000ec06c00
    (lldb) fr variable -L
    0x00007fff5fbff680: (@lvalue MyStruct) s = 0x0000000100f06820: {
    0x0000000100f06820: &s = {
    0x0000000100f06820: value = "test"
    }
    }

  52. Generics

  53. “Generic code enables you to write flexible, reusable
    functions and types that can work with any type, subject to
    requirements that you define.”
    –who knows

  55. Generics

  56. Generics
    0.20s

  57. Generics

  58. Generics
    0.06s
    0.20s

  59. Generics

  62. Generics & Optimization
    • Automatic specialization

  63. Generics & Optimization
    • Automatic specialization
    • in the same module

  64. Generics & Optimization
    • Automatic specialization
    • in the same module
    • otherwise use @_specialize(exported: true, where T == Int)

  65. Generics & Optimization
    • Automatic specialization
    • in the same module
    • otherwise use @_specialize(exported: true, where T == Int)
    • Whole Module Optimization won’t help

  66. Generics & Optimization
    • Automatic specialization
    • in the same module
    • otherwise use @_specialize(exported: true, where T == Int)
    • Whole Module Optimization won’t help
    • Sometimes it won’t optimize for the same module

  67. Generics & Optimization
    • Automatic specialization
    • in the same module
    • otherwise use @_specialize(exported: true, where T == Int)
    • Whole Module Optimization won’t help
    • Sometimes it won’t optimize for the same module
    • Avoid generics in public API
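
A hedged sketch of the three options the bullets describe: a plain generic function, a concrete one, and a generic function with a specialization hint. `sumGeneric`, `sumInt` and `sumSpecialized` are hypothetical names. `@_specialize` is an underscored, unofficial attribute, so its availability and behavior may differ between compiler versions; the `exported: true` form from the slide is omitted here because the example lives in a single module:

```swift
// Generic version: if the optimizer does not specialize it, the arithmetic
// goes through the Numeric protocol's witness table.
func sumGeneric<T: Numeric>(_ values: [T]) -> T {
    var total: T = 0
    for value in values {
        total += value
    }
    return total
}

// Concrete version: the compiler knows the exact type at every call.
func sumInt(_ values: [Int]) -> Int {
    var total = 0
    for value in values {
        total += value
    }
    return total
}

// Specialization hint: asks the compiler to also emit a copy of the generic
// function for T == Int (unofficial attribute, treat as an assumption).
@_specialize(where T == Int)
func sumSpecialized<T: Numeric>(_ values: [T]) -> T {
    var total: T = 0
    for value in values {
        total += value
    }
    return total
}

print(sumGeneric([1, 2, 3]), sumInt([1, 2, 3]), sumSpecialized([1, 2, 3]))
```

All three return the same value; any speed difference only shows up in optimized builds and is worth measuring for the specific call site.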

  68. Let's Recap…

  69. Let's Recap…
    Debug
    is
    slow

  70. Let's Recap…
    Pre-allocate
    memory

  71. Let's Recap…
    Stack
    vs
    Heap

  72. Let's Recap…
    Copy
    is
    bad

  73. Let's Recap…
    Use carefully

  74. Let's Recap…
    Less Swift is
    faster Swift

  75. @krzyzanowskim
    krzyzanowskim.com
    Thank you!
