Adventures in Cgo Performance

Cgo is a powerful tool in the Go programmer’s arsenal. It allows Go programmers to interoperate with other languages. However, Cgo documentation is scarce and best practices for performance are hard to come by. In this tutorial session, I discuss lessons I've learned working on the Go API for Wallaroo, a high-performance distributed stream processor written in Pony.

I cover hard-won knowledge about using Cgo in performance-sensitive code, including ways in which Cgo makes interoperation with other languages difficult, how you can work around common sources of performance and scaling problems, and an issue with the Go runtime that can't be worked around.

Sean T Allen

August 29, 2018

Transcript

  1. ADVENTURES IN CGO PERFORMANCE

  2. SEAN T. ALLEN VP OF ENGINEERING AT WALLAROO LABS AUTHOR OF “STORM APPLIED” MEMBER OF THE PONY CORE TEAM LOVER OF ARTISANAL STREET ART @SEANTALLEN @WALLAROOLABS @PONYLANG

  3. WHAT’S IN THIS TALK…

  4. CGO AND ME GLUING LANGUAGES TOGETHER FOR FUN AND PROFIT

  5. I’M A C PROGRAMMER

  6. I’M A PONY PROGRAMMER

  7. YOU MIGHT EVEN CALL ME A CGO PROGRAMMER

  8. YOU PROBABLY WOULDN’T CALL ME A GO PROGRAMMER

  9. I ENDED UP HERE BECAUSE WALLAROO LABS NEEDED TO CALL GO CODE FROM PONY CODE

  10. WALLAROO SCALE-INDEPENDENT COMPUTING FOR GO (AND PYTHON)

  11. FRAMEWORK FOR DOING “BIG DATA STUFF” * think Hadoop, Spark, Storm, Flink, Kafka Streams

  12. FRAMEWORK FOR HORIZONTALLY SCALING EVENT-BY-EVENT STREAM PROCESSING APPLICATIONS * think Storm, Flink, Kafka Streams

  13. TWO-LAYER ARCHITECTURE

  14. scale-independent scale-aware API

  15. scale-independent scale-aware API

  16. scale-independent scale-aware API

  17. scale-independent scale-aware API

  18. scale-independent scale-aware API

  19. scale-independent scale-aware API

  20. WALLAROO & GO A MARRIAGE MADE IN CGO

  21. WALLAROO Decode Compute Encode WALLAROO RUNTIME

  22. WALLAROO: PONY RUNTIME Decode Compute Encode WALLAROO RUNTIME

  23. WALLAROO: GO COMPUTATIONS Decode Compute Encode WALLAROO RUNTIME

  24. WALLAROO: CGO BRIDGE Decode Compute Encode WALLAROO RUNTIME

  25. TWO-LAYER ARCHITECTURE REVISITED scale-independent scale-aware API

  26. SCALE-AWARE PONY RUNTIME scale-independent scale-aware API

  27. SCALE-AWARE PONY RUNTIME scale-independent Pony API

  28. USER SUPPLIED SCALE-INDEPENDENT GO scale-independent Pony API

  29. USER SUPPLIED SCALE-INDEPENDENT GO Go Pony API

  30. CGO BRIDGE BETWEEN GO AND PONY Go Pony API

  31. CGO BRIDGE BETWEEN GO AND PONY Go Pony cgo

  32. AND SO ENDS THE SEAN T. ALLEN CGO BACKSTORY

  33. CGO WE AREN’T IN KANSAS ANYMORE

  34. CALL “C” FROM GO

  35. CALL GO FROM “C”

  36. TO MY EYE, CGO IS NOT FFI* * except it is according to Wikipedia

  37. AND IT’S NOT GO

  38. CGO PERFORMANCE IT’S COMPLICATED

  39. CGO PERFORMANCE CALLING “C” FROM GO

  40. CALLING “C” FROM GO

    METHOD  OPERATIONS COMPLETED  NANOS PER OPERATION
    CGO     10,000,000            171 NS/OP
    GO      2,000,000,000         1.83 NS/OP
    * according to a simple Cockroach Labs benchmark

  41. CGO PERFORMANCE CALLING GO FROM “C”

  42. CALLING GO FROM C

    TEST MACHINE                  MILLISECONDS PER OPERATION
    AWS (various instance types)  1-2 MS/OP
    2014 MacBook Pro              5-6 MS/OP
    * according to a simple Sean T. Allen benchmark

  43. RUNTIME/PROC.GO LINE 1771

  44. RECOMMENDATION: BATCH YOUR CGO CALLS

  45. GO => “C” DO AS MUCH AS YOU CAN IN A SINGLE “C” CALL

  46. “C” => GO DO AS MUCH AS YOU CAN IN A SINGLE GO CALL

  47. THE PROBLEM WITH POINTERS

  48. BUT FIRST, LET’S TALK ABOUT GARBAGE COLLECTION

  49. THERE ARE MORE TYPES OF GARBAGE COLLECTION IN HEAVEN AND EARTH THAN ARE DREAMT OF IN YOUR PHILOSOPHY

  50. “COPYING” GARBAGE COLLECTORS WILL MOVE OBJECTS IN MEMORY

  51. RELOCATING OBJECTS IN MEMORY ADDS COMPLEXITY TO FFI

  52. C ISN’T ALLOWED TO HOLD ONTO GO POINTERS

  53. C ISN’T ALLOWED TO HOLD ONTO GO POINTERS AND IT’S CHECKED AT RUNTIME

  54. AND BAD THINGS HAPPEN IF YOU DON’T FOLLOW THE RULES

    • Go code may pass a Go pointer to C provided the Go memory to which it points does not contain any Go pointers.
    • C code may not keep a copy of a Go pointer after the call returns.
    • A Go function called by C code may not return a Go pointer.
    • Go code may not store a Go pointer in C memory.

  55. BUT… WHAT IF I REALLY NEED TO? * like Wallaroo for example

  56. “BIG OLD MAP” A SOLUTION OF SORTS

  57. GO IS ALLOWED TO RETURN NON-POINTERS TO C * like a uint64

  58. A “BIG OLD MAP” OF INTEGERS TO GO OBJECTS SOLVES OUR POINTER PROBLEM

  59. None
  60. None
  61. None
  62. None
  63. BONUS PROBLEM SOLVED…

  64. HOLDING OBJECTS IN THE BIG OLD MAP KEEPS THEM FROM BEING GARBAGE COLLECTED

  65. “BIG OLD MAP” THERE ARE PROBLEMS

  66. PERFORMANCE WILL SUFFER UNDER CONTENTION

  67. CONTENDED LOCKS DESTROY PERFORMANCE

  68. “BIG OLD MAP” WON’T TAKE YOU VERY FAR

  69. TIME FOR A LITTLE SHARDING

  70. CONCURRENT MAP FROM 1 LOCK TO MANY LOCKS

  71. None
  72. None
  73. None
  74. None
  75. None
  76. ID TO SHARD WITH 8 SHARDS

    ID  SHARD
    0   0
    1   1
    8   0
    12  4

  77. ID GENERATION

  78. None
  79. None
  80. CONCURRENT MAP THERE ARE PROBLEMS

  81. THERE’S THAT LOCK IN OUR ID GENERATOR

  82. CAN WE DITCH THAT ID GENERATION LOCK?

  83. ATOMICS A MORE CONCURRENCY FRIENDLY ALTERNATIVE

  84. None
  85. RECOMMENDATIONS

  86. FOR ID GENERATION: USE THE ATOMICS PACKAGE OR SOMETHING INTRINSIC TO THE VALUE

  87. PICK YOUR “MAP” CAREFULLY.. CONCURRENT MAP PROBABLY ISN’T RIGHT FOR YOU

  88. CONSIDER PERFORMANCE UPFRONT

  89. AVOID LOCKS

  90. CGO AND YOU HOW YOU CAN HELP IMPROVE CGO

  91. DOCUMENTATION IS NEEDED

  92. IF YOU ARE USING CGO, TALK ABOUT THE PAIN * and the value

  93. WORK TO MAKE THE CGO EXPERIENCE MORE LIKE THE GO EXPERIENCE

  94. WORK ON THAT TODO ON LINE 1771 IN RUNTIME/PROC.GO * or otherwise contribute code

  95. THANKS BRIAN KETELSEN ANDREW TURLEY

  96. SPECIAL THANKS TO JEFF WENDLING AKA @ZEEBO ON THE GOPHER SLACK

  97. LEARN MORE GITHUB.COM/SEANTALLEN/ADVENTURES-IN-CGO-PERFORMANCE
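
Slides 56 to 64 describe the “big old map”: because C may not hold a Go pointer, Go hands C an integer such as a uint64 and keeps the real object in a map keyed by that integer. The slides that showed the code have no text in this transcript, so the following is only a minimal sketch of the idea in pure Go, not the code from the talk. The names `registry`, `Add`, `Get`, and `Remove` are hypothetical, and the cgo calls that would carry the handle across the boundary are left out.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a hypothetical "big old map": a single lock guarding a map
// from integer handles to Go values. C code would only ever see the handle.
type registry struct {
	mu      sync.Mutex
	nextID  uint64
	objects map[uint64]interface{}
}

func newRegistry() *registry {
	return &registry{objects: make(map[uint64]interface{})}
}

// Add stores v and returns a non-pointer handle that is safe to give to C.
// While the entry exists, the map keeps v reachable for the garbage collector.
func (r *registry) Add(v interface{}) uint64 {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.nextID++
	r.objects[r.nextID] = v
	return r.nextID
}

// Get resolves a handle that came back from C into the Go value.
func (r *registry) Get(id uint64) (interface{}, bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	v, ok := r.objects[id]
	return v, ok
}

// Remove drops the entry so the value can be garbage collected.
func (r *registry) Remove(id uint64) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.objects, id)
}

func main() {
	r := newRegistry()
	id := r.Add("decoder state")
	v, ok := r.Get(id)
	fmt.Println(id, v, ok)
	r.Remove(id)
	_, ok = r.Get(id)
	fmt.Println(ok)
}
```

Every operation takes the same mutex, which is the contention problem slides 65 to 68 point at.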
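
Slides 69 to 76 move from one lock to many by sharding the map. The table on slide 76 (IDs 0, 1, 8, and 12 landing in shards 0, 1, 0, and 4) is consistent with picking the shard as the ID modulo the shard count, so the sketch below assumes that rule. It is an illustration in pure Go, not the concurrent map implementation used in the talk, and `shardedMap`, `shardIndex`, `Store`, and `Load` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

const shardCount = 8

// shard is one slice of the key space with its own lock.
type shard struct {
	mu      sync.RWMutex
	objects map[uint64]interface{}
}

// shardedMap spreads entries over shardCount independently locked maps.
type shardedMap [shardCount]*shard

func newShardedMap() *shardedMap {
	var m shardedMap
	for i := range m {
		m[i] = &shard{objects: make(map[uint64]interface{})}
	}
	return &m
}

// shardIndex assumes the modulo rule suggested by the slide 76 table.
func shardIndex(id uint64) uint64 {
	return id % shardCount
}

// Store locks only the shard that owns id.
func (m *shardedMap) Store(id uint64, v interface{}) {
	s := m[shardIndex(id)]
	s.mu.Lock()
	defer s.mu.Unlock()
	s.objects[id] = v
}

// Load takes a read lock on the owning shard only.
func (m *shardedMap) Load(id uint64) (interface{}, bool) {
	s := m[shardIndex(id)]
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.objects[id]
	return v, ok
}

func main() {
	m := newShardedMap()
	for _, id := range []uint64{0, 1, 8, 12} {
		m.Store(id, fmt.Sprintf("object-%d", id))
		fmt.Printf("id %d -> shard %d\n", id, shardIndex(id))
	}
}
```

Two callers now contend only when their IDs fall in the same shard, which is why sequential IDs spread evenly under this rule.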