The Technology Behind Sanny, an Approximate Nearest Neighbor Search Engine / sanny_inside

monochromegane

June 14, 2018

Transcript

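  1. Neural network: model

    • A learning method that estimates a function f mapping an m-dimensional vector x to an n-dimensional vector y.
    • Each input is multiplied by a weight w, a bias is added, and the result is passed through an activation function: $o_1 = \sigma\left(\sum_{i=1}^{m} w_{1i} x_i + b_1\right)$
    • The processing for all nodes can be handled as one matrix operation: $o = \sigma(Wx + b)$, where W is the h × m weight matrix (w_11 … w_hm), x = (x_1, …, x_m), and b = (b_1, …, b_h).
    [Diagram] A three-layer network: inputs x_1 … x_m, a hidden layer of h nodes, and outputs y_1 … y_n.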
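The layer computation above can be sketched in Go as follows. This is a minimal illustration, not code from the deck: the function names `sigmoid` and `forward` are hypothetical, and the sigmoid is assumed as the activation function σ because the slide does not name one.

```go
package main

import (
	"fmt"
	"math"
)

// sigmoid is one common choice for the activation function σ (an assumption).
func sigmoid(v float64) float64 {
	return 1.0 / (1.0 + math.Exp(-v))
}

// forward computes o = σ(Wx + b) for one layer.
// w has h rows and m columns, x has m elements, and b has h elements.
func forward(w [][]float64, x, b []float64) []float64 {
	o := make([]float64, len(w))
	for j, row := range w {
		sum := b[j]
		for i, wji := range row {
			sum += wji * x[i]
		}
		o[j] = sigmoid(sum)
	}
	return o
}

func main() {
	w := [][]float64{
		{0.5, -0.5},
		{1.0, 1.0},
	}
	x := []float64{1.0, 1.0}
	b := []float64{0.0, -2.0}
	// Both weighted sums are 0, so both outputs are σ(0) = 0.5.
	fmt.Println(forward(w, x, b))
}
```

Each row of `w` produces one output node, which is why the whole layer reduces to a matrix-vector product followed by an element-wise activation.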
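  2. Similar images search system using ANN

    [Diagram] Register: images pass through a Deep CNN, and the resulting features ([n][2048]float64) are indexed in the ANN. Query: an image passes through the Deep CNN to produce a feature ([2048]float64), the ANN finds similar features, and the response contains the similar items.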
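  3. Proposed method

    • The query and the set of high-dimensional vectors are split equally into sub-vectors of an arbitrary number of dimensions. Nearest neighbor search runs in parallel on each sub-vector, and a second nearest neighbor search is then run over the neighbor candidates, which are the union of those results.
    • ① Parallel nearest neighbor search in low-dimensional spaces
    • ② Aggregation of a fixed number of neighbor candidates
    • ③ Linear search over the neighbor candidates
    [Diagram] Before decomposition: a query q is searched against N records X. After decomposition: each sub-query is searched (NN) against the matching sub-vectors of X, the resulting candidate sets are merged by union, and the answer is the argmin of d(q, x) over the merged candidates x. The diagram is labeled "speed improvement".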
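The three steps can be sketched in Go as below. This is a simplified illustration under stated assumptions, not Sanny's implementation: the names `sqDist`, `topK`, and `search` are hypothetical, a brute-force scan stands in for whatever nearest neighbor algorithm handles each sub-vector, and the number of dimensions is assumed to be divisible by the number of parts.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// sqDist returns the squared Euclidean distance between a and b.
func sqDist(a, b []float32) float32 {
	var sum float32
	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}
	return sum
}

// topK is a brute-force stand-in for the per-sub-vector NN search (step 1).
// It looks only at dimensions [lo, hi) and returns the indices of the k nearest records.
func topK(query []float32, data [][]float32, lo, hi, k int) []int {
	idx := make([]int, len(data))
	for i := range idx {
		idx[i] = i
	}
	sort.Slice(idx, func(a, b int) bool {
		return sqDist(query[lo:hi], data[idx[a]][lo:hi]) < sqDist(query[lo:hi], data[idx[b]][lo:hi])
	})
	if k > len(idx) {
		k = len(idx)
	}
	return idx[:k]
}

// search splits the dimensions into equal ranges, searches each range in
// parallel, takes the union of the candidates, and linearly scans that union.
func search(query []float32, data [][]float32, parts, k int) int {
	width := len(query) / parts // assumes len(query) is divisible by parts
	results := make([][]int, parts)
	var wg sync.WaitGroup
	for p := 0; p < parts; p++ {
		wg.Add(1)
		go func(p int) { // step 1: independent low-dimensional searches
			defer wg.Done()
			results[p] = topK(query, data, p*width, (p+1)*width, k)
		}(p)
	}
	wg.Wait()

	candidates := map[int]struct{}{} // step 2: union of at most parts*k candidates
	for _, r := range results {
		for _, i := range r {
			candidates[i] = struct{}{}
		}
	}

	best, bestDist := -1, float32(0) // step 3: linear search in the full space
	for i := range candidates {
		d := sqDist(query, data[i])
		if best == -1 || d < bestDist || (d == bestDist && i < best) {
			best, bestDist = i, d
		}
	}
	return best
}

func main() {
	data := [][]float32{
		{0, 0, 0, 0},
		{1, 1, 1, 1},
		{5, 5, 5, 5},
		{2, 1, 9, 9},
	}
	query := []float32{1, 1, 2, 2}
	fmt.Println(search(query, data, 2, 2)) // prints 1
}
```

With two parts and k = 2, the sub-searches return the candidates {1, 3} and {1, 0}, so the final linear scan covers only three of the four records. Because each goroutine touches only its own range of dimensions, the same split could also be spread across separate processes.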
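  4. Sanny: an implementation of the proposed method

    • Sanny is responsible for splitting the query and the search-target data equally into an arbitrary number of dimensions, and for aggregating the results.
    • Any nearest neighbor search algorithm can be used for the sub-vectors.
    • Because the search for each sub-vector is independent, a distributed configuration is possible.
    [Diagram labels: Query, Sanny (×3), Algorithm (×3), NN, ∪, steps ① ② ③]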
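  5. Performance

    • The vertical axis shows the reciprocal of the search time in seconds, so higher is faster. The horizontal axis shows precision.
    • Applying the proposed method to the original algorithms improved both accuracy and speed.

    "Sanny: A Distributable Approximate Nearest Neighbor Search Engine That Balances Accuracy and Speed for Large-Scale EC Sites"
    https://blog.monochromegane.com/blog/2018/05/16/wsa_2_sanny/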
  6. Euclidean distance calculation

    • Sanny runs a nearest neighbor search over the set of neighbor candidates using Euclidean distance.
    • This is a linear search whose cost is proportional to the number of dimensions and the number of records, so its processing time should be shortened.
    • $(p_i - q_i)^2$ is independent for each dimension, so it can be computed in parallel.

    $d(p, q) = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}$
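As a rough pure-Go illustration of why the per-dimension terms parallelize, the sketch below computes the same distance twice: once with a single running sum and once with 8 independent partial sums, which is the shape a 256-bit register of 8 float32 values would take. The function names `euclidean` and `euclideanLanes` are hypothetical, and the second version assumes the length is a multiple of 8.

```go
package main

import (
	"fmt"
	"math"
)

// euclidean computes d(p, q) = sqrt(sum_i (p_i - q_i)^2) one dimension at a time.
func euclidean(p, q []float32) float32 {
	var sum float32
	for i := range p {
		d := p[i] - q[i]
		sum += d * d
	}
	return float32(math.Sqrt(float64(sum)))
}

// euclideanLanes keeps 8 independent partial sums, mirroring how a 256-bit
// register holds 8 float32 lanes. It assumes len(p) is a multiple of 8.
func euclideanLanes(p, q []float32) float32 {
	var lanes [8]float32
	for i := 0; i < len(p); i += 8 {
		for l := 0; l < 8; l++ {
			d := p[i+l] - q[i+l]
			lanes[l] += d * d
		}
	}
	var sum float32
	for _, v := range lanes {
		sum += v
	}
	return float32(math.Sqrt(float64(sum)))
}

func main() {
	p := []float32{3, 0, 0, 0, 0, 0, 0, 0}
	q := []float32{0, 4, 0, 0, 0, 0, 0, 0}
	fmt.Println(euclidean(p, q), euclideanLanes(p, q)) // prints 5 5
}
```

The lanes never depend on one another until the final reduction, so the inner loop over `l` is exactly the work a single SIMD instruction can do in one step.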
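  7. Hello, SIMD

    [Diagram] SIMD unit; x = float32 * 16, y = float32 * 16, z = float32 * 16

        // Build with AVX enabled, for example: cc -mavx -std=c99
        #include <stddef.h>
        #include <immintrin.h> // AVX intrinsics, _mm_malloc, _mm_free

        int main(void) {
            const size_t n = 16;
            float *x, *y, *z;

            // Align to 256 bits (32 bytes). The buffers are not initialized here.
            x = (float *)_mm_malloc(sizeof(float) * n, 32);
            y = (float *)_mm_malloc(sizeof(float) * n, 32);
            z = (float *)_mm_malloc(sizeof(float) * n, 32);

            // View each buffer as an array of 256-bit vectors (8 floats each).
            __m256 *vz = (__m256 *)z;
            __m256 *vx = (__m256 *)x;
            __m256 *vy = (__m256 *)y;

            const size_t end = n / 8;
            for (size_t i = 0; i < end; ++i)
                vz[i] = _mm256_add_ps(vx[i], vy[i]); // adds 8 floats per instruction

            _mm_free(x);
            _mm_free(y);
            _mm_free(z);
            return 0;
        }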
  8. Euclidean distance using SIMD and cgo

        /*
        #cgo CFLAGS: -mavx -std=c99
        #cgo LDFLAGS: -lm
        #include <stdio.h>
        #include <math.h>
        #include <stdlib.h>
        #include <immintrin.h>

        // n must be a multiple of 8, and x and y must be 32-byte aligned.
        float avx_euclidean_distance(const size_t n, float *x, float *y) {
          __m256 *vx = (__m256 *)x;
          __m256 *vy = (__m256 *)y;
          __m256 vsub = {0};
          __m256 vsum = {0};

          // Accumulate the squared differences, 8 floats at a time.
          const size_t end = n / 8;
          for (size_t i = 0; i < end; ++i) {
            vsub = _mm256_sub_ps(vx[i], vy[i]);
            vsum = _mm256_add_ps(vsum, _mm256_mul_ps(vsub, vsub));
          }

          // Sum the 8 lanes and take the square root.
          __attribute__((aligned(32))) float t[8] = {0};
          _mm256_store_ps(t, vsum);
          return sqrtf(t[0] + t[1] + t[2] + t[3] + t[4] + t[5] + t[6] + t[7]);
        }
        */
        import "C"
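  9. Euclidean distance using SIMD and cgo

        // MmMalloc returns a []float32 backed by 32-byte-aligned C memory.
        // The length is rounded up to a multiple of 8 and the padding is zeroed.
        // The first size elements are uninitialized. Release the slice with MmFree.
        func MmMalloc(size int) []float32 {
            padded := align(size)
            ptr := C._mm_malloc(C.size_t(padded)*C.size_t(C.sizeof_float), 32)
            // unsafe.Slice (Go 1.17+) replaces a hand-built reflect.SliceHeader.
            goSlice := unsafe.Slice((*float32)(ptr), padded)
            for i := size; i < padded; i++ {
                goSlice[i] = 0.0
            }
            return goSlice
        }

        // MmFree releases memory obtained from MmMalloc.
        func MmFree(v []float32) {
            C._mm_free(unsafe.Pointer(&v[0]))
        }

        // align rounds size up to the next multiple of 8,
        // the number of float32 values in one 256-bit register.
        func align(size int) int {
            return (size + 7) / 8 * 8
        }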
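  10. Euclidean distance using SIMD and cgo

        // EuclideanDistance returns the Euclidean distance between x and y using AVX.
        // x and y must come from MmMalloc so that they are aligned and padded to align(size).
        func EuclideanDistance(size int, x, y []float32) float32 {
            size = align(size)
            dist := C.avx_euclidean_distance((C.size_t)(size), (*C.float)(&x[0]), (*C.float)(&y[0]))
            return float32(dist)
        }

    Benchmark (2048dim)

        BenchmarkEuclideanDistance-8              30000      59465 ns/op      0 B/op   0 allocs/op
        BenchmarkEuclideanDistanceGoroutine-8      2000    1034087 ns/op    479 B/op   5 allocs/op
        BenchmarkHypot-8                          50000      34059 ns/op      0 B/op   0 allocs/op
        BenchmarkEuclideanDistanceAVX-8         5000000        359 ns/op      0 B/op   0 allocs/op

    At 359 ns/op, BenchmarkEuclideanDistanceAVX-8 is roughly 166 times faster than BenchmarkEuclideanDistance-8 at 59465 ns/op.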
  11. Smux (Socket Multiplexer)

    • Provides virtual multiplexing over a single connection.
    • Applications send and receive data through streams, using smux's own simple protocol.

    "I built smux (Socket Multiplexer), which multiplexes and speeds up TCP and socket communication in Go"
    https://blog.monochromegane.com/blog/2018/05/03/smux/

    [Diagram] Each request and response is split into frames made of a header and a payload. Frames belonging to Stream 1 and Stream 2 are interleaved over one TCP connection in both directions (Req, Res).
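To make the frame idea concrete, here is a small Go sketch of header-plus-payload framing over one byte stream. It is an illustration only: the slides do not describe smux's actual wire format, so the 8-byte header layout (stream ID and payload length, big-endian) and the names `frame`, `writeFrame`, `readFrame`, `demux`, and `roundTrip` are all assumptions made for this example.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// frame is a hypothetical frame: a fixed-size header followed by a payload.
// It is not the real smux wire format.
type frame struct {
	StreamID uint32
	Payload  []byte
}

// writeFrame writes an 8-byte header (stream ID, payload length) and the payload.
func writeFrame(w io.Writer, f frame) error {
	var header [8]byte
	binary.BigEndian.PutUint32(header[0:4], f.StreamID)
	binary.BigEndian.PutUint32(header[4:8], uint32(len(f.Payload)))
	if _, err := w.Write(header[:]); err != nil {
		return err
	}
	_, err := w.Write(f.Payload)
	return err
}

// readFrame reads one frame written by writeFrame.
func readFrame(r io.Reader) (frame, error) {
	var header [8]byte
	if _, err := io.ReadFull(r, header[:]); err != nil {
		return frame{}, err
	}
	f := frame{StreamID: binary.BigEndian.Uint32(header[0:4])}
	f.Payload = make([]byte, binary.BigEndian.Uint32(header[4:8]))
	if _, err := io.ReadFull(r, f.Payload); err != nil {
		return frame{}, err
	}
	return f, nil
}

// demux reads frames until EOF and groups the payloads by stream ID.
func demux(r io.Reader) (map[uint32][]byte, error) {
	streams := map[uint32][]byte{}
	for {
		f, err := readFrame(r)
		if err == io.EOF {
			return streams, nil
		}
		if err != nil {
			return nil, err
		}
		streams[f.StreamID] = append(streams[f.StreamID], f.Payload...)
	}
}

// roundTrip writes the frames to an in-memory buffer, which stands in for the
// single TCP connection, and then demultiplexes them again.
func roundTrip(frames []frame) (map[uint32][]byte, error) {
	var conn bytes.Buffer
	for _, f := range frames {
		if err := writeFrame(&conn, f); err != nil {
			return nil, err
		}
	}
	return demux(&conn)
}

func main() {
	streams, err := roundTrip([]frame{
		{1, []byte("Hel")},
		{2, []byte("foo")},
		{1, []byte("lo")},
	})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s %s\n", streams[1], streams[2]) // prints Hello foo
}
```

The stream ID in each header is what lets frames from different streams share the connection in any order and still be reassembled per stream on the other side.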
  12. Smux (Socket Multiplexer)

    Server

        // smux server
        server := smux.Server{
            Network: "tcp",            // or "unix"
            Address: "localhost:3000", // or "sockfile"
            Handler: smux.HandlerFunc(func(w io.Writer, r io.Reader) {
                io.Copy(ioutil.Discard, r)
                fmt.Fprint(w, "Hello, smux client!")
            }),
        }
        server.ListenAndServe()

    Client

        // smux client
        client := smux.Client{
            Network: "tcp",            // or "unix"
            Address: "localhost:3000", // or "sockfile"
        }
        body, _ := client.Post([]byte("Hello, smux server!"))
        fmt.Printf("%s\n", body) // "Hello, smux client!"
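  13. Performance

    • Benchmark of smux, HTTP/1.1, and HTTP/2.
    • Measures the time taken to process a fixed number of requests while increasing the number of connections or streams.
    • A sleep of several tens of milliseconds stands in for server-side processing.
    • While some of the methods hit a performance ceiling as the degree of multiplexing increased, smux showed high performance.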