go-perftuner

Oleg Kovalov

April 25, 2019

Transcript

  1. go-perftuner
    WARSAW, APR 25 2019
    Oleg Kovalov
    Allegro
    Twitter: oleg_kovalov
    GitHub: cristaloleg

  2. Me

  3. What is a performance optimization?

  9. When is performance optimization needed?

  17. When isn’t performance optimization needed?

  24. Types of performance tuning

  33. What can the Go compiler do?

  37. So, welcome go-perftuner

  43. Code inlining

  50. almostInlined example (1)
    func ReadRuneSimple(s string) (ch rune, size int) {
        if len(s) == 0 {
            return
        }
        ch, size = utf8.DecodeRuneInString(s)
        return
    }
    $ go-perftuner inl -threshold=100 old.go
    $

  52. almostInlined example (2)
    func ReadRuneLogged(s string) (ch rune, size int) {
        if len(s) == 0 {
            return
        }
        log.Printf("we're working")
        ch, size = utf8.DecodeRuneInString(s)
        return
    }
    $ go-perftuner inl -threshold=100 new.go
    inl: ./new.go:34:6: ReadRuneLogged: budget exceeded by 50
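
In the example above, adding a logging call pushes ReadRuneLogged over the inlining budget. One common workaround is to move the expensive work into a separate helper so the frequently called function stays small. The sketch below assumes logging is only wanted on the multi-byte path, which changes the behavior of the slide's function. The names ReadRuneSplit and readRuneSlow are hypothetical. Whether the wrapper really fits the budget depends on the compiler version, so it is worth re-checking with go-perftuner inl or `go build -gcflags=-m`.

```go
package main

import (
	"fmt"
	"log"
	"unicode/utf8"
)

// ReadRuneSplit (hypothetical name) handles the empty and single-byte
// cases itself and delegates everything else, keeping its body small.
func ReadRuneSplit(s string) (ch rune, size int) {
	if len(s) == 0 {
		return
	}
	if s[0] < utf8.RuneSelf {
		// ASCII fast path: no logging, no full decode.
		return rune(s[0]), 1
	}
	return readRuneSlow(s)
}

// readRuneSlow (hypothetical name) holds the costly part: the logging
// call and the full UTF-8 decoding.
func readRuneSlow(s string) (rune, int) {
	log.Printf("decoding a multi-byte rune")
	return utf8.DecodeRuneInString(s)
}

func main() {
	for _, s := range []string{"", "go", "żubr"} {
		ch, size := ReadRuneSplit(s)
		fmt.Printf("%q -> %q (%d bytes)\n", s, ch, size)
	}
}
```

The trade-off is an extra function call on the slow path in exchange for a cheaper call on the fast one; a benchmark should confirm that the split pays off.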

  53. Bounds-checking elimination

  57. boundChecks example (1)
    func PutUint32(b []byte, v uint32) {
        b[0] = byte(v)
        b[1] = byte(v >> 8)
        b[2] = byte(v >> 16)
        b[3] = byte(v >> 24)
    }

  58. boundChecks example (2)
    func PutUint32(b []byte, v uint32) {
        b[0] = byte(v)
        b[1] = byte(v >> 8)
        b[2] = byte(v >> 16)
        b[3] = byte(v >> 24)
    }
    $ go-perftuner bce old.go
    bce: ./old.go:4:7: slice/array has bound checks
    bce: ./old.go:5:7: slice/array has bound checks
    bce: ./old.go:6:7: slice/array has bound checks
    bce: ./old.go:7:7: slice/array has bound checks
    $

  59. boundChecks example (3)
    func PutUint32(b []byte, v uint32) {
        _ = b[3] // early check to guarantee safety
        b[0] = byte(v)
        b[1] = byte(v >> 8)
        b[2] = byte(v >> 16)
        b[3] = byte(v >> 24)
    }

  60. boundChecks example (4)
    func PutUint32(b []byte, v uint32) {
        _ = b[3] // early check to guarantee safety
        b[0] = byte(v)
        b[1] = byte(v >> 8)
        b[2] = byte(v >> 16)
        b[3] = byte(v >> 24)
    }
    $ go-perftuner bce new.go
    bce: ./new.go:4:7: slice/array has bound checks
    $

  61. boundChecks example (5)
    func PutUint32(b []byte, v uint32) {
        if len(b) < 4 { return }
        b[0] = byte(v)
        b[1] = byte(v >> 8)
        b[2] = byte(v >> 16)
        b[3] = byte(v >> 24)
    }
    $ go-perftuner bce new.go
    $
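
The two fixes above do not behave the same way on short input: `_ = b[3]` still panics when the slice has fewer than four bytes, while the `len(b) < 4` guard returns silently. The sketch below keeps the panicking variant and applies the same hint to the read side. Uint32 is a name assumed for this illustration, modeled on the little-endian helpers in encoding/binary. The remaining checks can be listed with go-perftuner bce or with `go build -gcflags=-d=ssa/check_bce/debug=1`.

```go
package main

import "fmt"

// PutUint32 writes v into b in little-endian order.
func PutUint32(b []byte, v uint32) {
	_ = b[3] // one early check; panics if len(b) < 4
	b[0] = byte(v)
	b[1] = byte(v >> 8)
	b[2] = byte(v >> 16)
	b[3] = byte(v >> 24)
}

// Uint32 (name assumed for this sketch) reads a little-endian value
// back, using the same early check before the four loads.
func Uint32(b []byte) uint32 {
	_ = b[3] // one early check; panics if len(b) < 4
	return uint32(b[0]) | uint32(b[1])<<8 | uint32(b[2])<<16 | uint32(b[3])<<24
}

func main() {
	buf := make([]byte, 4)
	PutUint32(buf, 0xDEADBEEF)
	fmt.Printf("% x -> %#x\n", buf, Uint32(buf))
}
```

Choosing between the hint and the guard is an API decision: the hint preserves the panic a caller would get anyway, while the guard hides a programming error.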

  62. Escape analysis

  67. escapedVariable example (1)
    func getRands(size int) []int {
        nums := make([]int, size)
        for i := range nums {
            nums[i] = rand.Int()
        }
        return nums
    }
    $ go-perftuner esc old.go
    esc: ./old.go:16:14: make([]int, size)
    $

  68. escapedVariable example (2)
    func sumRand() (total int) {
        nums := make([]int, 8191)
        for i := range nums {
            nums[i] = rand.Int()
        }
        for _, x := range nums {
            total += x
        }
        return total
    }

  69. escapedVariable example (3)
    func sumRand() (total int) {
        nums := make([]int, 8191)
        for i := range nums {
            nums[i] = rand.Int()
        }
        for _, x := range nums {
            total += x
        }
        return total
    }
    $ go-perftuner esc 1.go
    $

  71. escapedVariable example (4)
    func sumRand() (total int) {
        nums := make([]int, 8191 + 1)
        for i := range nums {
            nums[i] = rand.Int()
        }
        for _, x := range nums {
            total += x
        }
        return total
    }
    $ go-perftuner esc 1.go
    esc: ./old.go:16:14: make([]int, 8191 + 1)
    $
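
The one-element difference between the last two examples can also be observed at run time. A likely explanation, assuming 8-byte ints on a 64-bit platform, is that 8192 elements reach 64 KiB, which is too large for the compiler to keep on the stack. The sketch below counts heap allocations with testing.AllocsPerRun. The names sumRandSmall, sumRandLarge and sink are hypothetical, and the exact numbers depend on the Go version and platform.

```go
package main

import (
	"fmt"
	"math/rand"
	"testing"
)

// sumRandSmall mirrors the slide's function with 8191 elements.
func sumRandSmall() (total int) {
	nums := make([]int, 8191)
	for i := range nums {
		nums[i] = rand.Int()
	}
	for _, x := range nums {
		total += x
	}
	return total
}

// sumRandLarge differs only in the slice length: 8191 + 1 elements.
func sumRandLarge() (total int) {
	nums := make([]int, 8191+1)
	for i := range nums {
		nums[i] = rand.Int()
	}
	for _, x := range nums {
		total += x
	}
	return total
}

// sink keeps the results alive so the calls are not optimized away.
var sink int

func main() {
	small := testing.AllocsPerRun(100, func() { sink += sumRandSmall() })
	large := testing.AllocsPerRun(100, func() { sink += sumRandLarge() })
	fmt.Printf("8191 elements: %.0f allocs/run\n", small)
	fmt.Printf("8192 elements: %.0f allocs/run\n", large)
}
```

If the larger variant reports an allocation and the smaller one does not, that matches what go-perftuner esc prints for the two versions.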

  72. A very handy tool

  78. benchstat
    $ go get golang.org/x/perf/cmd/benchstat
    $ go test -bench=. -count 10 > old.txt
    $ # < do some coding and magic with go-perftuner >
    $ go test -bench=. -count 10 > new.txt
    $ benchstat old.txt new.txt
    name  old time/op  new time/op  delta
    Foo   13.6ms ± 1%  11.8ms ± 1%  -13.31%  (p=0.016 n=4+5)
    Bar   32.1ms ± 1%  31.8ms ± 1%     ~     (p=0.286 n=4+5)
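
The benchstat workflow needs a benchmark to measure. A minimal sketch of one follows, reusing PutUint32 from the bounds-checking slides. BenchmarkPutUint32 is a hypothetical name; in a real project it would live in a `_test.go` file and be run by `go test -bench=.`. Here testing.Benchmark is used only so the sketch runs as a standalone program.

```go
package main

import (
	"fmt"
	"testing"
)

// PutUint32 is the function under test, with the early bounds check.
func PutUint32(b []byte, v uint32) {
	_ = b[3] // one early check; panics if len(b) < 4
	b[0] = byte(v)
	b[1] = byte(v >> 8)
	b[2] = byte(v >> 16)
	b[3] = byte(v >> 24)
}

// BenchmarkPutUint32 (hypothetical name) has the signature that
// `go test -bench` looks for in _test.go files.
func BenchmarkPutUint32(b *testing.B) {
	buf := make([]byte, 4)
	for i := 0; i < b.N; i++ {
		PutUint32(buf, uint32(i))
	}
}

func main() {
	// testing.Benchmark runs a benchmark function outside of go test.
	res := testing.Benchmark(BenchmarkPutUint32)
	fmt.Println("PutUint32", res)
}
```

On current Go releases benchstat is installed with `go install golang.org/x/perf/cmd/benchstat@latest` rather than `go get`.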

  79. Performance tuning summary

  85. That’s all folks