OSS Performance Tuning Tips

orisano
October 28, 2019

Transcript

  1. OSS Performance Tuning Tips #gocon #gocon_hall GoCon 2019 Autumn @orisano

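  2. Goal: I want to share two things

  3. How to proceed with tuning; Expand your toolbox

  4. How to proceed with tuning; Expand your toolbox

  5. How I go about it

  6. 0. Lower your threshold for dissatisfaction

  7. Dissatisfaction is what triggers improvement

  8. Try assuming that the libraries and software you use every day are slow

  9. Create your own triggers and motivation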

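  10. 1. Lower the barrier to measuring

  11. Build the ability to casually investigate why something is slow

  12. People rarely do things that take effort

  13. Get a lot of experience

  14. Put mechanisms in place that make investigation easy

  15. This one is for those who publish OSS: if you provide a way to capture a profile, enabled by an environment variable or on by default, the project becomes easier to improve and more likely to be improved

  16. It is a big help when people who hit a problem can investigate the cause easily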
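As a rough illustration of slide 15, the sketch below enables CPU profiling only when an environment variable is set. It uses the standard library's runtime/pprof; the variable name MYTOOL_CPUPROFILE and the helper startCPUProfile are hypothetical names chosen for this example, not something from the talk.

```go
package main

import (
	"fmt"
	"os"
	"runtime/pprof"
)

// startCPUProfile starts CPU profiling when the environment variable named
// by envKey holds a file path. The returned stop function must be called
// before the program exits so the profile is flushed to disk.
func startCPUProfile(envKey string) (stop func(), err error) {
	path := os.Getenv(envKey)
	if path == "" {
		return func() {}, nil // profiling disabled: nothing to stop
	}
	f, err := os.Create(path)
	if err != nil {
		return nil, err
	}
	if err := pprof.StartCPUProfile(f); err != nil {
		f.Close()
		return nil, err
	}
	return func() {
		pprof.StopCPUProfile()
		f.Close()
	}, nil
}

func main() {
	// MYTOOL_CPUPROFILE is a hypothetical variable name for this sketch.
	stop, err := startCPUProfile("MYTOOL_CPUPROFILE")
	if err != nil {
		fmt.Fprintln(os.Stderr, "profile:", err)
		os.Exit(1)
	}
	defer stop()

	// ... the real work of the CLI would go here ...
}
```

With something like this in place, a user who hits a slow run could set the variable to a file path and attach the resulting profile to a bug report.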

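  17. 2. Identify the problem spot

  18. The typical case where something feels slow but you cannot tell which function is responsible is when you are running a CLI

  19. Try adding github.com/pkg/profile to main

  20. (no transcribed text)
  21. That is all it takes

  22. If a single run of the CLI takes more than an hour, for example, net/http/pprof may be the better choice

  23. Look at the profile and pin the problem down to the function level

  24. Get used to the pprof web UI and understand what it is actually telling you

  25. Look at it through different views: top, graph, flame-graph, source, disasm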
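For the long-running case on slide 22, a minimal sketch of exposing net/http/pprof from a program might look like the following. The helper name startPprofServer and the port 6060 are assumptions made for this example.

```go
package main

import (
	"log"
	"net"
	"net/http"
	_ "net/http/pprof" // registers /debug/pprof/* handlers on http.DefaultServeMux
)

// startPprofServer serves the profiling endpoints on addr in the background
// and returns the listener so the caller can inspect the address or close it.
func startPprofServer(addr string) (net.Listener, error) {
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		return nil, err
	}
	go http.Serve(ln, nil) // a nil handler means http.DefaultServeMux
	return ln, nil
}

func main() {
	// Bind to localhost only; the port is arbitrary.
	ln, err := startPprofServer("localhost:6060")
	if err != nil {
		log.Fatal(err)
	}
	defer ln.Close()
	log.Printf("pprof available at http://%s/debug/pprof/", ln.Addr())

	// ... the long-running work of the CLI would go here ...
}
```

While the program is running, a profile can then be pulled with `go tool pprof` pointed at the /debug/pprof/profile endpoint, without waiting for the run to finish.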

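  26. 3. Write a benchmark

  27. If one already exists, you are lucky. If it does not show the slowdown, add a case that does

  28. When writing a benchmark, check that the code under test has not been optimized away

  29. Once you have a benchmark, capturing a profile is easy

  30. Remember go test -cpuprofile and go test -memprofile (and read go help testflag)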
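One common way to guard against the problem on slide 28 is to assign the result to a package-level variable so the compiler cannot discard the call. The sketch below assumes a trivial function sum as the code under test; both sum and sink are hypothetical names.

```go
package main

import (
	"fmt"
	"testing"
)

// sum is a stand-in for the function under test (hypothetical).
func sum(xs []int) int {
	total := 0
	for _, x := range xs {
		total += x
	}
	return total
}

// sink is package-level so the compiler cannot prove the result is unused
// and remove the call from the benchmark loop.
var sink int

func BenchmarkSum(b *testing.B) {
	xs := make([]int, 1024)
	for i := range xs {
		xs[i] = i
	}
	b.ReportAllocs()
	b.ResetTimer() // exclude the setup above from the measurement
	for i := 0; i < b.N; i++ {
		sink = sum(xs)
	}
}

func main() {
	// testing.Benchmark lets this sketch run as a plain program. In a real
	// project BenchmarkSum would live in a _test.go file and run via go test.
	r := testing.Benchmark(BenchmarkSum)
	fmt.Println(r.String(), r.MemString())
}
```

Placed in a _test.go file, the same benchmark can be combined with the -cpuprofile and -memprofile flags from slide 30.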

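  31. 4. Think of a solution

  32. How to solve the problem is covered in "Expand your toolbox"

  33. OSS performance tuning is done only once the patch is merged

  34. What makes a patch one that gets merged?

  35. Merging means the other side takes on the maintenance

  36. Keep the change small: do not add to the maintainers' costs

  37. No breaking changes: the more widely used the project, the more this matters

  38. No unusual costs: avoid implementations that make the code harder to change and maintain (such as using assembly)

  39. If you add a dependency, choose it carefully: is it actively maintained? Is it really necessary?

  40. Performance improves a lot: dramatic improvements get accepted

  41. Tests are in place: if tests already exist, the patch is more likely to be accepted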

  42. 5. Iterate by trial and error

  43. First, record benchmark results for the initial state, with enough runs and without hitting the timeout

  44. go test -bench . -count=10 -timeout=30000s | tee old.txt

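  45. You will send benchstat results along with the patch, so take the baseline with enough runs

  46. The default timeout is 10m, which is surprisingly easy to exceed

  47. When you come up with a solution, first run the benchmark once

  48. If it looks promising, run the benchmark enough times

  49. After many rounds of trial and error, you lose track of which profile belongs to which source

  50. pprof can display source, but it does not store the source itself, only the path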

  51. git add .
    git commit -m "$2"
    REV=$(git rev-parse HEAD)
    go test -bench $1 -benchmem -cpuprofile cpu.${REV}.pb.gz -memprofile mem.${REV}.pb.gz | tee ${REV}.txt
    rm ./${REV}.* && git reset HEAD^
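  52. Let git manage the source, so that each profile stays properly tied to its source

  53. With many profiles, it can be hard to see the effect of a single change

  54. go tool pprof -diff_base shows the difference between profiles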

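  55. 6. Send the patch

  56. Run the final benchmark enough times

  57. Measure on a dedicated instance; if that is difficult, stop as many other processes as possible

  58. How to proceed with tuning; Expand your toolbox

  59. Working from real examples, I will show how I solved each problem; I would like you to think about how you would solve it yourself

  60. 1. Patches I actually submitted 2. I want to write a fast library

  61. 1. Patches I actually submitted 2. I want to write a fast library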

  62. src-d/go-git

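  63. A pure-Go library for performing git operations

  64. Background to the patch

  65. aquasecurity/trivy originally used it, but cloning a large repository took an abnormally long time (10min~)

  66. The cause was the code that builds the git Index

  67. The Index's internal representation (public) was a slice, and to keep entries unique by name, a delete and an append were executed once per file

  68. delete removes an entry by name, so it scans every element of the slice

  69. At first I planned to change the internal representation to a map, since that would reduce the cost of delete

  70. However: it is a public field, in a package that is not internal, in a well-known project with over 4500 stars

  71. I concluded the change would never be accepted and gave up

  72. Some days later

  73. While re-reading the source code

  74. The documented contract of the field (the order of the slice is not guaranteed); the moment when a large Index is built (at clone time)

  75. After re-reading, I came up with a way to fix it without a breaking change

  76. Only in the frequently called path, hold the entries in a map and convert to a slice when returning

  77. By passing the map around through private functions, I avoided a breaking change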
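The idea on slides 76 and 77 can be sketched roughly as follows. Entry, Index, and buildIndex are simplified, hypothetical stand-ins and not go-git's actual types; the point is only that the public slice field stays as it is while the hot path deduplicates through a private map.

```go
package main

import "fmt"

// Entry and Index are simplified stand-ins for illustration only.
type Entry struct {
	Name string
	Size int
}

// Index keeps its public slice field, so existing callers are unaffected.
// As in the talk, the order of Entries is not guaranteed.
type Index struct {
	Entries []*Entry
}

// buildIndex deduplicates by name using a map (constant-time update per
// file) instead of scanning the slice for every file, then converts the
// map to a slice once at the end.
func buildIndex(files []Entry) *Index {
	byName := make(map[string]*Entry, len(files))
	for i := range files {
		e := files[i]
		byName[e.Name] = &e // a later entry replaces an earlier one with the same name
	}
	idx := &Index{Entries: make([]*Entry, 0, len(byName))}
	for _, e := range byName {
		idx.Entries = append(idx.Entries, e)
	}
	return idx
}

func main() {
	idx := buildIndex([]Entry{{"a.go", 1}, {"b.go", 2}, {"a.go", 3}})
	fmt.Println(len(idx.Entries)) // 2: the duplicate "a.go" was replaced, not appended
}
```

This only works because the field's contract does not promise an order; if callers depended on ordering, the conversion step would need a sort.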

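  78. (no transcribed text)
  79. With that fixed, mallocgc became the heavy part

  80. The memory usage appeared to come from io.Copy

  81. io.Copy was being called for every file

  82. In places that may be called a very large number of times, using io.CopyBuffer instead of io.Copy lets you control memory usage

  83. How should the buffer be supplied? Unless it can be provided from outside, memory usage does not change after all

  84. Even for a private method, adding a parameter widens the scope of the change

  85. As a solution, I placed a sync.Pool at global scope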
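A minimal sketch of the combination described on slides 82 to 85: a package-level sync.Pool hands out buffers to io.CopyBuffer, so no buffer parameter has to be threaded through the call chain. The names bufPool, copyPooled, and concat, and the 32 KiB buffer size, are assumptions for this example and are not taken from the go-git patch.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
	"sync"
)

// bufPool is a package-level pool of copy buffers. Storing *[]byte rather
// than []byte avoids an extra allocation each time a buffer is put back.
var bufPool = sync.Pool{
	New: func() interface{} {
		b := make([]byte, 32*1024)
		return &b
	},
}

// copyPooled has the same signature as io.Copy but borrows its buffer
// from bufPool instead of allocating a new one per call.
func copyPooled(dst io.Writer, src io.Reader) (int64, error) {
	bp := bufPool.Get().(*[]byte)
	defer bufPool.Put(bp)
	return io.CopyBuffer(dst, src, *bp)
}

// concat copies every part into one buffer through copyPooled. The struct
// wrappers hide the WriterTo/ReaderFrom methods of strings.Reader and
// bytes.Buffer; without them io.CopyBuffer would bypass the buffer.
func concat(parts ...string) (string, error) {
	var out bytes.Buffer
	for _, p := range parts {
		src := struct{ io.Reader }{strings.NewReader(p)}
		dst := struct{ io.Writer }{&out}
		if _, err := copyPooled(dst, src); err != nil {
			return "", err
		}
	}
	return out.String(), nil
}

func main() {
	s, err := concat("first file\n", "second file\n")
	if err != nil {
		panic(err)
	}
	fmt.Print(s)
}
```

The trade-off is a piece of global state in exchange for leaving every function signature untouched, which keeps the patch small.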

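  86. (no transcribed text)
  87. https://github.com/src-d/go-git/pull/1179

  88. (no transcribed text)
  89. 605 s -> 249 s

  90. But memory usage is still high: 56 GB/op

  91. The code that applies deltas to files was using a large amount of memory

  92. Inside that function, a public function that cannot take a buffer from the caller was being used

  93. At least for internal callers, I wanted to be able to specify the buffer

  94. (no transcribed text)
  95. Solved by making the buffer specifiable for internal use

  96. https://github.com/src-d/go-git/pull/1180

  97. 56.1 GB -> 29.8 GB

  98. (no transcribed text)
  99. (no transcribed text)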
  100. image/png

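  101. Background to the patch

  102. In an in-house ISUCON we needed to generate QR codes quickly

  103. While reviewing the in-house ISUCON afterwards, I was enjoying performance tuning on my own

  104. Aside

  105. To output a large number of PNGs quickly:

  106. png.EncoderBufferPool, added in Go 1.9

  107. Set the CompressionLevel

  108. Do not use a custom image.Image implementation

  109. For black-and-white images, use image.Gray (the Opaque check can be bypassed)

  110. End of aside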
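The points in the aside could be combined along these lines. This is a sketch under assumptions: the bufferPool adapter, the shared encoder variable, and the checkerboard helper are invented for the example, and png.BestSpeed is just one possible CompressionLevel.

```go
package main

import (
	"bytes"
	"fmt"
	"image"
	"image/color"
	"image/png"
	"sync"
)

// bufferPool adapts sync.Pool to the png.EncoderBufferPool interface.
type bufferPool struct{ pool sync.Pool }

func (p *bufferPool) Get() *png.EncoderBuffer {
	// The assertion yields nil when the pool is empty; the encoder then
	// allocates a fresh buffer itself.
	b, _ := p.pool.Get().(*png.EncoderBuffer)
	return b
}

func (p *bufferPool) Put(b *png.EncoderBuffer) { p.pool.Put(b) }

// encoder is shared across calls so its buffers are reused between images.
var encoder = png.Encoder{
	CompressionLevel: png.BestSpeed,
	BufferPool:       &bufferPool{},
}

// encodeChecker draws a size x size black-and-white checkerboard into an
// image.Gray and returns the encoded PNG bytes.
func encodeChecker(size int) ([]byte, error) {
	img := image.NewGray(image.Rect(0, 0, size, size))
	for y := 0; y < size; y++ {
		for x := 0; x < size; x++ {
			if (x+y)%2 == 0 {
				img.SetGray(x, y, color.Gray{Y: 255})
			}
		}
	}
	var buf bytes.Buffer
	if err := encoder.Encode(&buf, img); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	data, err := encodeChecker(64)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(data) > 0)
}
```

Which compression level is appropriate depends on whether CPU time or output size matters more, so it is worth benchmarking both.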

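  111. image/png had become the bottleneck

  112. Benchmarks were already written, so I ran them

  113. There was a blatantly slow part, though not one I was using this time

  114. https://go-review.googlesource.com/c/go/+/187417

  115. (no transcribed text)
  116. The part that was actually heavy was compress/flate (DEFLATE)

  117. The line pprof reported as heavy was the line containing the for statement

  118. (no transcribed text)
  119. Looking at the asm for that for statement revealed wasteful memory accesses

  120. Defined a local variable so that the compiler would use a register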
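The general shape of the change on slide 120 might look like the sketch below. It is a hypothetical illustration, not the code from the actual change: the histogram type and both methods are invented here, and whether the compiler really keeps the value in a register should be confirmed in the disasm view.

```go
package main

import "fmt"

// histogram is a hypothetical type used only for this illustration.
type histogram struct {
	counts [256]uint32
	total  uint64
}

// addDirect updates h.total through the pointer on every iteration, which
// may force a load and a store of the field each time around the loop.
func (h *histogram) addDirect(data []byte) {
	for _, b := range data {
		h.counts[b]++
		h.total += uint64(b)
	}
}

// addLocal accumulates into a local variable, which the compiler is free
// to keep in a register, and writes the field back once after the loop.
func (h *histogram) addLocal(data []byte) {
	total := h.total
	for _, b := range data {
		h.counts[b]++
		total += uint64(b)
	}
	h.total = total
}

// totals runs both variants on the same input and returns both results.
func totals(data []byte) (direct, local uint64) {
	var a, b histogram
	a.addDirect(data)
	b.addLocal(data)
	return a.total, b.total
}

func main() {
	d, l := totals([]byte{1, 2, 3})
	fmt.Println(d, l) // 6 6: both variants compute the same total
}
```

The two variants are equivalent in behavior, so the only way to justify the rewrite is a benchmark comparison before and after.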

  121. (no transcribed text)
  122. https://go-review.googlesource.com/c/go/+/187837

  123. (no transcribed text)
  124. (no transcribed text)
  125. GoogleContainerTools/kaniko

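  126. Background to the patch

  127. I liked the idea

  128. I was using kaniko in CI and got curious about where the time was going

  129. kaniko keeps a snapshot of the filesystem in memory

  130. Each time a command is executed, it compares against the snapshot to check for differences

  131. It compares using md5

  132. That md5 was heavy

  133. It only needs to know whether files are identical, so it does not have to be md5

  134. Switched to minio/HighwayHash (not really ideal)

  135. (no transcribed text)
  136. (no transcribed text)
  137. Mounted directories are excluded from the snapshot

  138. The code that decides whether a file is in the whitelist used strings.Split

  139. With many files or deep directories, it uses memory needlessly

  140. Considering how the result is used, I switched to strings.SplitN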
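To illustrate why strings.SplitN helps here, the sketch below checks whether a path lies under a prefix while splitting only as many components as the prefix has. The function hasPathPrefix is a hypothetical example and not kaniko's actual code.

```go
package main

import (
	"fmt"
	"strings"
)

// hasPathPrefix reports whether path lies under prefix, comparing whole
// path components so that "/process" does not match "/proc".
func hasPathPrefix(path, prefix string) bool {
	prefixParts := strings.Split(prefix, "/")
	// Only the first len(prefixParts) components of path matter, so cap
	// the split: the rest of the path stays unsplit in one extra element
	// instead of producing one string per directory level.
	pathParts := strings.SplitN(path, "/", len(prefixParts)+1)
	if len(pathParts) < len(prefixParts) {
		return false
	}
	for i, p := range prefixParts {
		if pathParts[i] != p {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(hasPathPrefix("/proc/1/status", "/proc")) // true
	fmt.Println(hasPathPrefix("/process/x", "/proc"))     // false
}
```

For a deep path the capped split returns a slice whose length is bounded by the prefix depth, which is where the memory saving comes from.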

  141. (no transcribed text)
  142. (no transcribed text)
  143. https://github.com/GoogleContainerTools/kaniko/pull/694

  144. 129.54 s -> 88.29 s

  145. (no transcribed text)
  146. 1. Patches I actually submitted 2. I want to write a fast library

  147. orisano/wyhash

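  148. While writing the kaniko patch, I wondered: what makes a good hash?

  149. Q. Which hash is fast?

  150. Q. Which hash is fast? A. Measure it

  151. dgryski/trifles/hashbench: try running it locally

  152. (It does not run, because some packages cannot be found)

  153. wyhash showed up on GitHub Trending

  154. Supposedly fast, portable, and strong

  155. It is very simple, so I decided to port it to Go

  156. The port was finished in about 2 days

  157. Added it to hashbench; when I checked, it was losing badly

  158. How do you speed up code that consists of nothing but arithmetic?

  159. Survey of similar libraries

  160. Crypto and hash libraries basically all use asm

  161. Is it faster with asm?

  162. Let's write some

  163. Go asm has its own peculiar style. Hardly any Japanese developers write it? There is not much documentation

  164. With some effort, wrote the heavy part using AVX

  165. Fixed a bug (SEGV) using lldb

  166. Benchmark result: it got slower

  167. (no transcribed text)
  168. Why?

  169. Functions written in asm are not inlined

  170. math/bits, encoding/binary: the compiler optimizes these cleverly. https://dave.cheney.net/2019/08/20/go-compiler-intrinsics

  171. For small functions that would otherwise be inlined, asm brings no benefit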
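As an illustration of slides 169 to 171, a small arithmetic helper can often stay in pure Go and rely on compiler intrinsics. The sketch below uses bits.Mul64 from math/bits for a 64x64 to 128-bit multiply and folds the two halves with XOR, a step typical of wyhash-style hashes. The function mix is written for this example and is not code from orisano/wyhash.

```go
package main

import (
	"fmt"
	"math/bits"
)

// mix multiplies two 64-bit values into a 128-bit product and folds the
// high and low halves together with XOR. Being a small pure-Go function,
// it remains a candidate for inlining, unlike a hand-written asm routine.
func mix(a, b uint64) uint64 {
	hi, lo := bits.Mul64(a, b)
	return hi ^ lo
}

func main() {
	fmt.Println(mix(2, 3))     // 6: the product fits in the low half
	fmt.Println(mix(1<<63, 2)) // 1: the product is 2^64, so hi=1 and lo=0
}
```

Whether a call like this is actually inlined and lowered to a single multiply instruction depends on the target architecture and compiler version, so it is worth checking the disasm view.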

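  172. In a case like this one, the whole loop should be moved into asm

  173. I do not want to write a lot of code in asm I am not used to

  174. Let's use mmcloughlin/avo

  175. (no transcribed text)
  176. The approach: write a program in Go that generates the asm

  177. What is good about it?

  178. avo takes care of the Go asm conventions for you

  179. Completion in Go IDEs works

  180. Worked hard with avo: 5 GB/s -> 11 GB/s

  181. Writing fast asm is hard

  182. pprof cannot tell you why something is slow at the asm level

  183. Being mindful of pipelining: 11 GB/s -> 14 GB/s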

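  184. Summary

  185. How to proceed with tuning
    • 0. Lower your threshold for dissatisfaction
    • 1. Lower the barrier to measuring
    • 2. Identify the problem spot
    • 3. Write a benchmark

  186. How to proceed with tuning
    • 4. Think of a solution
    • Aim for a solution that is easy to merge
    • One that does not raise maintenance costs
    • 5. Iterate by trial and error
    • 6. Send the patch

  187. Expand your toolbox
    • Memory usage tends to become a problem
    • Consider APIs that can accept a buffer from the caller
    • When such a change is not easy, consider sync.Pool

  188. Expand your toolbox
    • Choose an appropriate hash
    • Assembly has a high maintenance cost, so avoid it as much as possible
    • Try changing how the data is held only in the heavy part of the processing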