Link
Embed
Share
Beginning
This slide
Copy link URL
Copy link URL
Copy iframe embed code
Copy iframe embed code
Copy javascript embed code
Copy javascript embed code
Share
Tweet
Share
Tweet
Slide 1
Slide 1 text
Java Ͱ࠷ͷ ϋογϡΞϧΰϦζϜΛٻΊͯ 2015-08-10 JJUG Night seminar @komiya_atsushi
Slide 2
Slide 2 text
͓·ͩΕ ʢ͓લ୭Αʁʣ
Slide 3
Slide 3 text
KOMIYA Atsushi @komiya_atsushi
Slide 4
Slide 4 text
No content
Slide 5
Slide 5 text
bit.ly/WeLoveSmartNews
Slide 6
Slide 6 text
ຊͷτϐοΫ: ϋογϡΞϧΰϦζϜ (ؔ)
Slide 7
Slide 7 text
ϋογϡؔʁ
Slide 8
Slide 8 text
ϋογϡؔʁ ͋ͬɺ͜ΕਐݚθϛͰ+%,ͷ ιʔεͰΈͨͭͩʂʂ
Slide 9
Slide 9 text
ϋογϡؔʁ KBWBVUJM)BTI.BQ Λ͏ͱ͖ʹ͓ੈʹ ͳͬͯΔΞϨͰ͢
Slide 10
Slide 10 text
ϋογϡؔͷར༻༻్ • ΞϧΰϦζϜ / σʔλߏ • ϋογϡςʔϒϧ • ϒϧʔϜϑΟϧλ • Count-Min sketch • ػցֶश • Locality sensitive hashing • Feature hashing • ηΩϡϦςΟ • ϝοηʔδμΠδΣετ / ϝοηʔδೝূූ߸
Slide 11
Slide 11 text
ϋογϡؔͷར༻༻్ • ΞϧΰϦζϜ / σʔλߏ • ϋογϡςʔϒϧ • ϒϧʔϜϑΟϧλ • Count-Min sketch • ػցֶश • Locality sensitive hashing • Feature hashing • ηΩϡϦςΟ • ϝοηʔδμΠδΣετ / ϝοηʔδೝূූ߸ )BTI.BQҎ֎Ͱ සൟʹ͓ੈʹͳͬͯ·͢
Slide 12
Slide 12 text
ϋογϡΞϧΰϦζϜʹٻΊΔػೳɾੑೳ • ػೳ • Մมͷ͞ͷσʔλʹରͯ͠ɺϋογϡΛܭࢉͰ͖Δ • γʔυΛ༩͑Δ͜ͱͰɺϋογϡؔͷόϦΤʔγϣϯΛ࡞ Δ͜ͱ͕Ͱ͖Δ • ಉ͡ϋογϡΞϧΰϦζϜʹಉ͡σʔλΛ༩͑ͯɺ γʔυ͕ҟͳΔͳΒϋογϡҟͳΔ • ੑೳ • ͍ • িಥ͠ʹ͍͘
Slide 13
Slide 13 text
ϋογϡΞϧΰϦζϜʹٻΊΔػೳɾੑೳ • ػೳ • Մมͷ͞ͷσʔλʹରͯ͠ɺϋογϡΛܭࢉͰ͖Δ • γʔυΛ༩͑Δ͜ͱͰɺϋογϡؔͷόϦΤʔγϣϯΛ࡞ Δ͜ͱ͕Ͱ͖Δ • ಉ͡ϋογϡΞϧΰϦζϜʹಉ͡σʔλΛ༩͑ͯɺ γʔυ͕ҟͳΔͳΒϋογϡҟͳΔ • ੑೳ • ͍ • িಥ͠ʹ͍͘ ͍ʹਖ਼ٛ
Slide 14
Slide 14 text
※҉߸ֶతϋογϡؔ • ϝοηʔδμΠδΣετϝοηʔδೝূූ ߸ͷੜʹར༻Ͱ͖Δϋογϡؔ • ී௨ͷϋογϡؔͷಛੑʹՃ͑ɺڧিಥ ੑऑিಥੑͳͲͷಛੑΛͭඞཁ͕ ͋Δ • ྫɿMD5 SHA-xx γϦʔζͳͲ
Slide 15
Slide 15 text
※҉߸ֶతϋογϡؔ • ϝοηʔδμΠδΣετϝοηʔδೝূූ ߸ͷੜʹར༻Ͱ͖Δϋογϡؔ • ී௨ͷϋογϡؔͷಛੑʹՃ͑ɺڧিಥ ੑऑিಥੑͳͲͷಛੑΛͭඞཁ͕ ͋Δ • ྫɿMD5 SHA-xx γϦʔζͳͲ ҉߸ֶతϋογϡؔ ຊऔΓѻ͍·ͤΜ ʢ͍ͷͰʣ
Slide 16
Slide 16 text
ຊऔΓ্͛Δ ϋογϡΞϧΰϦζϜ
Slide 17
Slide 17 text
5BCMFGSPNIUUQTHJUIVCDPN$ZBOYY)BTI
Slide 18
Slide 18 text
5BCMFGSPNIUUQTHJUIVCDPN$ZBOYY)BTI 2VBMJUZ͕ेͳ ͜ͷͭΛऔΓ্͛·͢
Slide 19
Slide 19 text
MurmurHash series • 2008 ʙ • ༷ʑͳϓϩμΫτͰ৭ʑͳ༻్ͰΘΕ͍ͯΔ • Nginx, Hadoop, Cassandra, Solr… • from https://en.wikipedia.org/wiki/ MurmurHash#Implementations • Current version: MurmurHash3 • ࠷େ 128 bit ͷϋογϡΛܭࢉ͢Δ͜ͱ͕Ͱ͖Δ
Slide 20
Slide 20 text
CityHash • 2011 ʙ • Google ۘͷϋογϡΞϧΰϦζϜ • http://google-opensource.blogspot.jp/2011/04/introducing- cityhash.html • “inspired by (தུ) MurmurHash” • ࠷େ 128 bit ͷϋογϡΛܭࢉ͢Δ͜ͱ͕Ͱ͖Δ • ϦϦʔεϊʔτʹʮMurmurHash3 ΑΓ͍ʯతͳ͜ͱ͕ॻ͔Ε͍ͯΔ ͕… • https://code.google.com/p/cityhash/source/browse/trunk/README
Slide 21
Slide 21 text
xxHash • 2012 ʙ • Extremely fast ͳѹॖΞϧΰϦζϜ LZ4 Λ։ൃ͍ͯ͠Δํ (Yann Collet @ Facebook Paris) ͷɺ͜Ε·ͨ extremely fast ͳϋογϡΞϧΰϦζϜ • C ࣮Ͱ MurmurHash3 ͦͷଞΛ͑ͯ #1 ͷΒ͍͠ • ࠷େ 64 bit ͷϋογϡΛܭࢉ͢Δ͜ͱ͕Ͱ͖Δ • ར༻࣮·ͩଟ͘ͳ͛͞ • Presto ͷϋογϡΞϧΰϦζϜ͕ MurmurHash3 ͔Β xxHash ʹࠩ͠ ସ͑ΒΕΔͳͲɺ࠾༻ঃʑʹ͕͖͍ͬͯͯΔʁ • https://github.com/facebook/presto/commit/ 87cb4f2ba8a57a3edb6e4d5a89658b6a3191b3e7
Slide 22
Slide 22 text
֤छϋογϡΞϧΰϦζϜͷ Java ࣮
Slide 23
Slide 23 text
Java ͰͷϋογϡΞϧΰϦζϜ࣮ • ࠷ۙͷϋογϡΞϧΰϦζϜ CPU ͷ໋ྩΛҙࣝͯ͠ઃܭ͞Εͨͷ ͕ଟ͍ • JVM ্Ͱಈ͘ Java ɺͦͷઃܭͷԸܙΛड͚ΒΕΔͱݶΒͳ͍ • ಉ͡ϋογϡΞϧΰϦζϜͰɺ࣮ํ๏ʹΑ͕ͬͯࠩੜ͡Δ • Pure Java ࣮ • sun.misc.Unsafe ࣮ • Private API ͱͳΜͩͬͨͷ͔… • JNI ܦ༝ͷ native ࣮
Slide 24
Slide 24 text
Guava • ‘com.google.guava:guava:18.0’ • Google Core Libraries for Java • ศརͳػೳ͕͍Ζ͍Ζೖ͍ͬͯΔ • ϋογϡΞϧΰϦζϜͷ࣮ MurmurHash3 ͷΈ • ೖྗɾग़ྗΠϯλϑΣʔεͱʹॆ࣮͍ͯ͠Δ
Slide 25
Slide 25 text
Zero-allocation hashing (OpenHFT) • ‘net.openhft:zero-‐allocation-‐hashing:0.3’ • HFT (ߴසऔҾ) ͳձࣾʁͷϓϩμΫτ • ϋογϡΞϧΰϦζϜͷ࣮ʹಛԽ • MurmurHash3, CityHash, xxHash ͷ࣮͕ఏڙ͞Ε͍ͯΔ • Google guava ͱಉ͡Α͏ʹΠϯλϑΣʔε͕ͦͦ͜͜ॆ࣮͠ ͍ͯΔ • sun.misc.Unsafe API Λར༻͍ͯ͠Δ
Slide 26
Slide 26 text
lz4-java • ‘net.jpountz.lz4:lz4:1.3.0’ • LZ4 ͷ Java ϙʔςΟϯά • ͚ͩͲɺΕͳ͘ xxHash ͷ Java ࣮ ͍ͭͯ͘Δ • Pure Java / sun.misc.Unsafe / Native Ͱͷ ֤࣮Λఏڙ͍ͯ͠Δ
Slide 27
Slide 27 text
ೖྗΠϯλϑΣʔεͷൺֱ (VBWB 0QFO)'5 M[KBWB CZUF<> ̋ ̋ ̋ #ZUF#V⒎FS ̋ 4USJOH ̋ ̋ MPOH ̋ ̋ JOU ̋ ̋ 4USFBNJOH ̋ ̋ CZUF<> PUIFST 0CKFDU 0UIFSQSJNJUJWFT BSSBZ
Slide 28
Slide 28 text
ೖྗΠϯλϑΣʔεͷൺֱ (VBWB 0QFO)'5 M[KBWB CZUF<> ̋ ̋ ̋ #ZUF#V⒎FS ̋ 4USJOH ̋ ̋ MPOH ̋ ̋ JOU ̋ ̋ 4USFBNJOH ̋ ̋ CZUF<> PUIFST 0CKFDU 0UIFSQSJNJUJWFT BSSBZ ೖྗ*'ͷ๛͞ ;FSPBMMPDBUJPO IBTIJOH͕ѹత
Slide 29
Slide 29 text
ೖྗΠϯλϑΣʔεͷൺֱ (VBWB 0QFO)'5 M[KBWB CZUF<> ̋ ̋ ̋ #ZUF#V⒎FS ̋ 4USJOH ̋ ̋ MPOH ̋ ̋ JOU ̋ ̋ 4USFBNJOH ̋ ̋ CZUF<> PUIFST 0CKFDU 0UIFSQSJNJUJWFT BSSBZ
Slide 30
Slide 30 text
ग़ྗΠϯλϑΣʔεͷൺֱ (VBWB 0QFO)'5 M[KBWB CZUF<> ̋ MPOH ̋ ̋ ̋ JOU ̋ 4USJOH ̋
Slide 31
Slide 31 text
ग़ྗΠϯλϑΣʔεͷൺֱ (VBWB 0QFO)'5 M[KBWB CZUF<> ̋ MPOH ̋ ̋ ̋ JOU ̋ 4USJOH ̋ ग़ྗ*' (VBWB͕ ༏Ε͍ͯΔ
Slide 32
Slide 32 text
ϕϯνϚʔΫ
Slide 33
Slide 33 text
ೖྗσʔλ • byte ྻ • 8 byte, 1024 byte, 64K byte ͷ 3 ύλʔϯ • long (ϓϦϛςΟϒ) • String • 64K จࣈ
Slide 34
Slide 34 text
ϋογϡΞϧΰϦζϜ • ͍ͣΕͷϋογϡΞϧΰϦζϜγʔυݻఆ • MurmurHash • 128 bit ൛Λར༻ • CityHash • 64 bit ൛Λར༻ • xxHash • 64 bit ൛Λར༻
Slide 35
Slide 35 text
bit.ly/jjug-2015-hash-bench
Slide 36
Slide 36 text
ϕϯνϚʔΫ݁Ռ (byte array)
Slide 37
Slide 37 text
Byte array (8 bytes)
Slide 38
Slide 38 text
Byte array (8 bytes) /BUJWFΦʔόʔϔουେ͖Ί ͳͷ͔ɺ͍σʔλΛͨ͘͞Μ ॲཧ͢Δͷ͕ۤखͬΆ͍
Slide 39
Slide 39 text
Byte array (8 bytes)
Slide 40
Slide 40 text
Byte array (1024 bytes)
Slide 41
Slide 41 text
Byte array (64K bytes)
Slide 42
Slide 42 text
Byte array (64K bytes) 6OTBGF͍
Slide 43
Slide 43 text
Byte array (64K bytes) /BUJWFେ͖͍σʔλʹର͍ͯ͠
Slide 44
Slide 44 text
Byte array (64K bytes) YY)BTI$JUZ)BTI .VSNVS)BTI
Slide 45
Slide 45 text
ϕϯνϚʔΫ݁Ռ (primitive long)
Slide 46
Slide 46 text
Primitive long
Slide 47
Slide 47 text
ϕϯνϚʔΫ݁Ռ (string)
Slide 48
Slide 48 text
String
Slide 49
Slide 49 text
·ͱΊ
Slide 50
Slide 50 text
ࠓ͜Ε͚֮ͩ͑ͯؼ͍ͬͯͩ͘͞ • 20158݄࣌Ͱ࠷ͷϋογϡΞϧΰϦζϜ • xxHash • 20158݄࣌Ͱ࠷ͷ Java ࣮ • OpenHFT ͷ Zero-allocation hashing
Slide 51
Slide 51 text
࣮ࡍέʔεόΠέʔε • 128 bit ͷϋογϡ͕ཉ͍͠ • Guava • 64 bit Ͱ͍͍ͷͰɺ࠷ͷ MurmurHash ͕ཉ͍͠ • Zero-allocation hashing • ετϦʔϜతʹϋογϡΛܭࢉ͍ͨ͠ • lz4-java or Guava
Slide 52
Slide 52 text
ࠔͬͨΒͱΓ͋͑ͣɺ xxHash Zero-allocation hashing
Slide 53
Slide 53 text
Thank you & Happy hashing!!
Slide 54
Slide 54 text
We’re hiring! iOSΤϯδχΞ / AndroidΤϯδχΞ / WebΞϓϦέʔγϣϯΤϯδχΞ / ϓϩμΫςΟϏςΟΤϯδχΞ / ػցֶश / ࣗવݴޠॲཧΤϯδχΞ / άϩʔεϋοΫΤϯδχΞ / αʔόαΠυΤϯδχΞ / ࠂΤϯδχΞ…