OSS Performance Tuning Tips
#gocon #gocon_hall
GoCon 2019 Autumn
@orisano
Goal: share two things
• How to go about tuning
• How to widen your repertoire of techniques
How I go about it
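0. Lower the bar for dissatisfaction
Dissatisfaction is what triggers improvement.
Try assuming that the libraries and software you use every day are slow.
Create the trigger and the motivation yourself.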
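1. Lower the bar for measurement
Build the ability to look into the cause of slowness casually.
People rarely do things that take effort.
Get lots of practice.
Put mechanisms in place that make investigation easy.
For those who publish OSS: offering a way to capture a profile, either through an environment variable or by default, makes the project easier to improve and more likely to be improved.
It helps when whoever runs into a problem can investigate the cause themselves.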
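2. Identify the slow spot
It feels slow, but you cannot tell which function is responsible.
When you are running a CLI:
• Add github.com/pkg/profile to main.
• That is all.
When a single run of the CLI takes an hour or more, net/http/pprof can be the better choice.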
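Read the profile and narrow the problem down to the function level.
Get used to the pprof web UI until you really understand what it shows.
Look at the profile from several views: top, graph, flame graph, source, disasm.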
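3. Write a benchmark
If one already exists, you are in luck. If it does not exercise the slow path, add a case that does.
When writing a benchmark, check that the code under test has not been optimized away.
Once a benchmark exists, taking a profile is easy.
Learn go test -cpuprofile and go test -memprofile (read go help testflag).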
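A common way to keep the compiler from removing the benchmarked call is to store the result in a package-level variable. The sketch below assumes a toy function popcount as the code under test; it calls testing.Benchmark from main only so that it runs as a single file, whereas a real benchmark would live in a _test.go file.

```go
package main

import (
	"fmt"
	"testing"
)

// popcount is a stand-in for the function under test.
func popcount(x uint64) int {
	n := 0
	for x != 0 {
		x &= x - 1
		n++
	}
	return n
}

// sink is package-level so the compiler cannot prove the result is unused.
var sink int

func BenchmarkPopcount(b *testing.B) {
	r := 0
	for i := 0; i < b.N; i++ {
		r += popcount(uint64(i) | 1<<40)
	}
	sink = r
}

func main() {
	// testing.Init registers the testing flags; it is only needed because
	// this sketch runs the benchmark outside of "go test".
	testing.Init()
	res := testing.Benchmark(BenchmarkPopcount)
	fmt.Println(res.N, res.NsPerOp(), sink > 0)
}
```

If the reported time per operation is implausibly small (well under a nanosecond for non-trivial work), that is a hint the body was eliminated.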
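4. Think about the solution
How to actually solve things is covered under “Widening your repertoire”.
Performance tuning for OSS is finished only when the patch is merged.
What makes a patch mergeable?
Merging a patch means the maintainers take on its maintenance.
• The change is small: do not add to the maintainers' cost.
• No breaking changes: the more widely used the project, the more this matters.
• No unusual cost: avoid implementations that are hard to change or maintain (using assembler, for example).
• If you add a dependency, choose it carefully: is it actively maintained, and is it really necessary?
• Performance improves a lot: dramatic improvements get accepted.
• Tests exist: when tests are already written, the patch is easier to accept.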
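5. Iterate by trial and error
First, record benchmark results for the initial state, with enough runs and without hitting the timeout.
go test -bench . -count=10 -timeout=30000s | tee old.txt
You will send benchstat output with the patch, so the baseline needs enough runs too.
The default timeout is 10m, which is surprisingly easy to exceed.
When you think of a fix, run the benchmark once first.
If it looks promising, run the benchmark enough times.
After many iterations it gets hard to tell which profile belongs to which source.
pprof can display source, but a profile does not store the source itself, only paths.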
# $1: benchmark pattern, $2: commit message describing the attempt
git add .
git commit -m "$2"
REV=$(git rev-parse HEAD)
go test -bench $1 -benchmem -cpuprofile cpu.${REV}.pb.gz -memprofile mem.${REV}.pb.gz | tee ${REV}.txt
rm ./${REV}.* && git reset HEAD^
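Let git manage the source so that each profile is tied to the exact code that produced it.
With many profiles, the effect of a single change can be hard to see.
go tool pprof -diff_base shows the difference between two profiles.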
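6. Send the patch
Run the benchmark for the final version enough times.
Measure on a dedicated instance. If that is difficult, stop other processes as far as possible.
Real examples
• How I solved each problem
• Think about how you would solve it yourself
1. Patches I actually submitted
2. Wanting to write a fast library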
src-d/go-git
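A pure-Go library for working with git.
Background to the patch
aquasecurity/trivy originally used it, but cloning a large repository took an abnormally long time (10 min or more).
The cause was the code that builds the git Index.
The internal representation of the Index (a public field) was a slice, and to keep entries unique by name, a delete and an append ran once per file.
delete removes an entry by name, so it scans the whole slice; building the Index therefore costs time quadratic in the number of files.
My first idea was to change the internal representation to a map, which would cut the cost of delete.
However:
• it is a public field
• the package is not internal
• the project is well known, with more than 4,500 stars
I decided the change could never be accepted and gave up.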
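Later
While rereading the source code, I noticed:
• the documented contract of the field (the order of the slice is not guaranteed)
• the point at which a large Index gets built (at clone time)
With these in mind, I found a way to fix it without a breaking change.
Only on the heavily called path, keep the entries in a map and convert them to a slice on return.
Passing the map around through private functions avoided the breaking change.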
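The idea can be sketched roughly as follows. Entry, Index, and buildIndex are simplified stand-ins invented for this example and are not go-git's actual types or functions; the point is only that the public slice field keeps its type while the hot path works on a map.

```go
package main

import "fmt"

// Entry and Index are hypothetical stand-ins, not go-git's real types.
type Entry struct {
	Name string
	Hash string
}

type Index struct {
	Entries []*Entry // public field: its type must not change
}

// buildIndex collects entries by name in a map, so replacing a duplicate
// is O(1), and converts to the public slice form only once at the end.
func buildIndex(files []Entry) *Index {
	byName := make(map[string]*Entry, len(files))
	for i := range files {
		e := files[i]
		byName[e.Name] = &e // a later entry with the same name wins
	}
	idx := &Index{Entries: make([]*Entry, 0, len(byName))}
	for _, e := range byName {
		idx.Entries = append(idx.Entries, e) // order is unspecified
	}
	return idx
}

func main() {
	idx := buildIndex([]Entry{
		{Name: "a.go", Hash: "1"},
		{Name: "b.go", Hash: "2"},
		{Name: "a.go", Hash: "3"},
	})
	fmt.Println(len(idx.Entries))
}
```

This only works because the field's documented contract does not promise any ordering; iterating a map yields entries in an unspecified order.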
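With that fixed, mallocgc became the heavy part.
The memory usage appeared to come from io.Copy.
io.Copy was called once per file.
Where a copy may be called a very large number of times, using io.CopyBuffer instead of io.Copy lets you control memory usage.
How should the buffer be supplied? If it cannot be passed in from outside, memory usage ends up unchanged.
Even for private methods, adding a parameter widens the scope of the change.
The solution was to place a sync.Pool at package (global) scope.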
https://github.com/src-d/go-git/pull/1179
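605 s -> 249 s
But memory usage was still high: 56 GB/op
The code that applies a delta to a file was using a lot of memory.
Inside it, a public function that cannot accept a buffer from the caller was being used.
At least for internal callers, I wanted to be able to specify the buffer.
Solved by making it possible to specify the buffer for internal use.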
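The general shape of such a change might look like the sketch below: the public function keeps its signature and delegates to a private variant that takes a buffer. ApplyDelta and applyDeltaInto are hypothetical names, and the "delta" is simply appended to the base, since real delta decoding is beside the point here.

```go
package main

import "fmt"

// ApplyDelta is a hypothetical public function whose signature must stay
// as it is. It allocates a fresh result on every call.
func ApplyDelta(base, delta []byte) []byte {
	return applyDeltaInto(nil, base, delta)
}

// applyDeltaInto is the private variant: it writes the result into buf's
// backing array (growing it if needed), so internal callers can reuse one
// buffer across many files.
func applyDeltaInto(buf, base, delta []byte) []byte {
	buf = append(buf[:0], base...)
	return append(buf, delta...)
}

func main() {
	scratch := make([]byte, 0, 64)
	for _, d := range []string{"-v1", "-v2"} {
		scratch = applyDeltaInto(scratch, []byte("file"), []byte(d))
		fmt.Println(string(scratch))
	}
	fmt.Println(string(ApplyDelta([]byte("file"), []byte("-pub"))))
}
```

External users see no API change, while the hot internal loop stops allocating a new result for every file.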
https://github.com/src-d/go-git/pull/1180
56.1 GB -> 29.8 GB
image/png
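An in-house ISUCON required generating QR codes fast.
During the retrospective for that in-house ISUCON, I was enjoying performance tuning on my own.
Aside
To output a large number of PNGs fast:
• png.EncoderBufferPool, available since Go 1.9
• Set CompressionLevel
• Do not use your own image.Image implementation
• For black-and-white images, use image.Gray (the Opaque check can be bypassed)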
End of aside
image/png had become the bottleneck.
Benchmarks were already written, so I ran them.
There was a blatantly slow part, although not one I was using this time.
https://go-review.googlesource.com/c/go/+/187417
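The part that was actually heavy: the deflate code (compress/flate)
The line pprof flagged as heavy was the one containing the for statement.
Looking at the assembly for that for loop showed needless memory accesses.
Defined a local variable so that the compiler would keep the value in a register.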
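The pattern, in a simplified form that is not the actual standard-library change, is to copy values reached through a pointer into locals before the loop. The matcher type below is invented for illustration, and whether the first version really reloads from memory depends on the Go version and the loop body, so the generated assembly should be checked rather than assumed.

```go
package main

import "fmt"

type matcher struct {
	window []byte
	limit  int
}

// countField reads m.limit and m.window through the pointer inside the
// loop. The compiler may or may not keep them in registers.
func (m *matcher) countField(b byte) int {
	n := 0
	for i := 0; i < m.limit; i++ {
		if m.window[i] == b {
			n++
		}
	}
	return n
}

// countLocal copies the fields into a local slice first, which makes it
// easy for the compiler to keep everything in registers for the loop.
func (m *matcher) countLocal(b byte) int {
	window := m.window[:m.limit]
	n := 0
	for _, c := range window {
		if c == b {
			n++
		}
	}
	return n
}

func main() {
	m := &matcher{window: []byte("abracadabra"), limit: 11}
	fmt.Println(m.countField('a'), m.countLocal('a'))
}
```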
https://go-review.googlesource.com/c/go/+/187837
GoogleContainerTools/kaniko
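I liked the idea.
After using kaniko in CI, I wondered where the time was going.
kaniko keeps a snapshot of the filesystem in memory.
Every time it runs a command, it checks whether anything changed.
It compares using md5.
That md5 was heavy.
All it needs to know is whether two files are identical, so it does not have to be md5.
Changed it to minio/HighwayHash (not really a good idea).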
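Mounted directories are excluded from the snapshot.
The code that decides whether a file is covered by the whitelist used strings.Split.
With many files or deep directories, this uses memory needlessly.
Given how the result is used, switched to strings.SplitN.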
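As an illustration of why SplitN helps, assume the check only needs the first path component (the actual kaniko logic may differ). strings.Split allocates a slice entry for every component of a deep path, while strings.SplitN with n = 2 stops after the first separator. Both function names here are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// topLevelSplit splits the whole path, although only the first
// component is used: a deep path yields a long throwaway slice.
func topLevelSplit(p string) string {
	return strings.Split(strings.TrimPrefix(p, "/"), "/")[0]
}

// topLevelSplitN stops after the first separator, so the result has at
// most two elements regardless of how deep the path is.
func topLevelSplitN(p string) string {
	return strings.SplitN(strings.TrimPrefix(p, "/"), "/", 2)[0]
}

func main() {
	p := "/usr/local/lib/go/src/strings/strings.go"
	fmt.Println(topLevelSplit(p), topLevelSplitN(p))
}
```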
https://github.com/GoogleContainerTools/kaniko/pull/694
129.54s -> 88.29s
orisano/wyhash
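While writing the kaniko patch, I wondered what makes a good hash.
Q. Which hash is fast? A. Measure it.
dgryski/trifles/hashbench: try running it locally.
(It does not run as is, because some packages can no longer be found.)
wyhash came up on GitHub Trending.
It is said to be fast, portable, and strong.
It is very simple, so I decided to port it to Go.
The port was finished in about 2 […].
After adding it to hashbench and checking, it lost badly.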
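How do you speed up code that is nothing but arithmetic?
Survey similar libraries
Crypto and hash libraries mostly use asm.
Is it faster with asm?
Let's write some.
Go asm
• It has a style of its own
• Hardly anyone in Japan seems to write it?
• There is not much documentation
Worked hard and wrote the heavy part with AVX.
Fixed a bug (SEGV) using lldb.
Benchmark result: it got slower.
Why?
Functions written in asm are not inlined.
math/bits and encoding/binary: the compiler optimizes these cleverly.
https://dave.cheney.net/2019/08/20/go-compiler-intrinsics
For a function small enough to be inlined, asm brings no benefit.
In a case like this one, the asm should cover the loop as well.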
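To make the inlining point concrete, here is a sketch of the kind of small helper where pure Go already compiles well. It follows the multiply-and-fold idea used by wyhash-style hashes, but the function names and the constant are chosen for this example and are not taken from orisano/wyhash.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math/bits"
)

// mum multiplies two 64-bit values into a 128-bit product and folds the
// two halves together. bits.Mul64 is treated as an intrinsic on common
// 64-bit targets, and a function this small can be inlined at call sites.
func mum(a, b uint64) uint64 {
	hi, lo := bits.Mul64(a, b)
	return hi ^ lo
}

// mix8 reads 8 little-endian bytes and mixes them with a seed. The
// multiplier is an arbitrary odd constant picked for this sketch.
func mix8(p []byte, seed uint64) uint64 {
	return mum(binary.LittleEndian.Uint64(p)^seed, 0x9e3779b97f4a7c15)
}

func main() {
	fmt.Println(mum(2, 3))
	fmt.Printf("%#x\n", mix8([]byte("12345678"), 1))
}
```

Replacing a helper like mum with a hand-written asm function adds a real call on every use, which can cost more than the instructions it saves.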
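I do not want to write a lot of code in asm I am not used to.
Let's use mmcloughlin/avo.
The approach: write a Go program that generates the asm.
What is good about it?
• avo takes care of the Go asm conventions
• Completion works in a Go IDE
Pushed on with avo: 5 GB/s -> 11 GB/s
Writing fast asm is hard.
pprof cannot tell you why something is slow at the asm level.
Paying attention to pipelining: 11 GB/s -> 14 GB/s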
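Summary
How to go about tuning
• 0. Lower the bar for dissatisfaction
• 1. Lower the bar for measurement
• 2. Identify the slow spot
• 3. Write a benchmark
• 4. Think about the solution
  • Aim for a solution that is easy to merge
  • One that does not raise maintenance cost
• 5. Iterate by trial and error
• 6. Send the patch
Widening your repertoire
• Memory usage tends to become the problem
• Consider an API that accepts a buffer from the caller
• If that change is not easy, consider sync.Pool
• Choose an appropriate hash
• Assembler has a high maintenance cost, so avoid it as far as possible
• Try changing how data is held only inside the heavy processing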