Similar Image Search with Chainer and Annoy

takaaki shimbo

August 29, 2018

Transcript

  1. Similar Image Search with Chainer and Annoy: mokumoku results presentation + addendum. Shimbo Takaaki, 2018/08/25 @Chainer mokumoku #2
  2. About me
     • Name: Shimbo Takaaki
     • Working since April (at an ERP company)
     • Twitter: @ta7uw
     • GitHub: @takaaki82
     • Qiita: https://qiita.com/ta7uwtaka
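  3. What is similar image search?
     • Finding, in a database, the images closest to a query image, where every image is represented by a feature vector.
     • Comparing the query against every image in the database is computationally expensive and slow, so similar images have to be found efficiently.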
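The cost of comparing a query against every stored image can be seen in a small brute-force sketch. It is only an illustration: cosine similarity is assumed as the similarity measure, the function names `cosine_similarity` and `brute_force_search` are hypothetical, and the three-dimensional vectors are made up. Plain Python is used to keep the sketch self-contained; it does not handle zero-length vectors.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def brute_force_search(query, database, k):
    """Return the indices of the k database vectors most similar to query.

    Every database vector is compared with the query, so the cost grows
    linearly with the number of stored images and with the vector dimension.
    """
    scored = [(cosine_similarity(query, vec), i) for i, vec in enumerate(database)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]


# Toy "feature vectors" (made up for illustration only).
database = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
query = [1.0, 0.05, 0.0]

top2 = brute_force_search(query, database, k=2)
print(top2)
```

With tens of thousands of images and feature vectors of a thousand or more dimensions, this full scan is repeated for every query, which is the motivation for approximate search.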

  4. Pinterest

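  5. Approach
     • Similarity computation on image features: extract features from the images with a CNN or similar, then obtain the similarity between images with a function such as cosine similarity. → This is the approach implemented here.
     • Similarity learning with deep learning: train with a loss defined so that similar images receive a higher similarity.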
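The slide does not say which loss is meant for similarity learning. As one common possibility, the sketch below shows a margin-based triplet loss on already-extracted feature vectors: the loss reaches zero once a dissimilar image (negative) is farther from the anchor than a similar image (positive) by at least a margin. The hinge formulation, the function names and the toy vectors are all assumptions for illustration, not the speaker's method.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss.

    Penalises the triplet unless the negative is at least `margin`
    farther from the anchor than the positive is.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Made-up 2-dimensional embeddings.
anchor = [0.0, 0.0]
positive = [0.0, 1.0]        # distance 1.0 from the anchor
near_negative = [0.0, 1.5]   # distance 1.5: violates the margin
far_negative = [3.0, 4.0]    # distance 5.0: satisfies the margin

loss_hard = triplet_margin_loss(anchor, positive, near_negative)
loss_easy = triplet_margin_loss(anchor, positive, far_negative)
print(loss_hard, loss_easy)
```

Here a smaller distance plays the role of a larger similarity, so minimising the loss pulls similar images together and pushes dissimilar ones apart.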
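  6. Architecture
     • Image feature extraction: use an intermediate layer of a trained image recognition model → Chainer
     • Feature similarity computation: use approximate nearest neighbor (ANN) search → Annoy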

  7. Annoy, an approximate nearest neighbor search library

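  8. Annoy
     • A library implementing approximate nearest neighbor (ANN) search
     • Open-source software developed by Spotify
     • Used for music recommendation
     • Implemented in C++ with Python bindings
     • Easy to use from Python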
  9. Searching for similar images on MNIST (figure: the image used as the query)

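  10. Training on CIFAR-10
      • CIFAR-10 is the dataset used here
      • Training is done in Chainer, in the same way as ordinary image classification
      • The trained model is then used for feature extraction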

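  11. Adding the features to Annoy

        import chainer
        from annoy import AnnoyIndex

        # `model` (the trained classifier) and `get_hidden` (returns the output
        # of the given layer) are not shown on the slide and are assumed to be
        # defined elsewhere.
        train, test = chainer.datasets.get_cifar10()
        dim = 1024  # number of elements in the intermediate layer's output
        # 'angular' was Annoy's default metric; newer releases expect it to be
        # passed explicitly.
        annoy_model = AnnoyIndex(dim, 'angular')

        with chainer.using_config('train', False), \
                chainer.using_config('enable_backprop', False):
            for i in range(len(train)):
                img, _ = train[i]
                # numpy -> cupy
                x = model.xp.asarray(img[None, ...])
                # From the inference on this training image, take the output
                # of the 10th layer of the classification model
                x = get_hidden(model, 9, x=x).data
                x = x.reshape(-1)
                # cupy -> numpy
                x = chainer.cuda.to_cpu(x)
                annoy_model.add_item(i, x)

        # Build the Annoy index (no more items can be added afterwards)
        annoy_model.build(1000)
        annoy_model.save("cifar-10-1000tree.ann")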
  12. Running the approximate nearest neighbor search

        x = model.xp.asarray(x[None, ...])
        x = get_hidden(model, 9, x=x).data
        x = x.reshape(-1)
        x = chainer.cuda.to_cpu(x)
        # Given the feature vector x, Annoy computes similarities and returns
        # the indices of the 5 most similar items
        predict_indexes = annoy_model.get_nns_by_vector(x, 5, search_k=-1)
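To see how similar each hit is, `get_nns_by_vector` can also be called with `include_distances=True`, in which case it returns a pair of lists (indices, distances). For the angular metric, Annoy documents the distance as sqrt(2(1 - cos)), so it can be converted back to a cosine similarity. The helper name `angular_to_cosine` is hypothetical, and the indices and distances below are made-up values shaped like Annoy's output, not results from the talk.

```python
import math


def angular_to_cosine(distance):
    """Invert d = sqrt(2 * (1 - cos)) to recover the cosine similarity."""
    return 1.0 - (distance ** 2) / 2.0


# Hypothetical values, shaped like the result of
#   annoy_model.get_nns_by_vector(x, 5, include_distances=True)
# They are made up for illustration only.
indexes = [12, 857, 4031]
distances = [0.0, 0.5, math.sqrt(2.0)]

similarities = [angular_to_cosine(d) for d in distances]
for idx, sim in zip(indexes, similarities):
    print(idx, round(sim, 3))
```

An angular distance of 0 corresponds to identical directions (cosine 1), and a distance of sqrt(2) to orthogonal feature vectors (cosine 0).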
  13. Results (figure: query image)

  14. (figure: query image)

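  15. Summary
      • Chainer together with the approximate nearest neighbor search library Annoy makes similar image search possible
      • Next step: implement similar image search based on deep-learning similarity learning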

  16. References
      • Start Today Technologies TECH BLOG (https://tech.starttoday-tech.com/entry/detection_and_retrieval)
      • Tatsuya Harada (2017), 画像認識 (Image Recognition), 機械学習プロフェッショナルシリーズ (Machine Learning Professional Series)
      • Deep metric learning using Triplet network (https://arxiv.org/pdf/1412.6622.pdf)
  17. The details are written up on Qiita; have a look if you are interested.

  18. Thank you for your attention.