Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
疎行列と Jaccard 類似度の高速計算
Search
Sponsored
·
Your Podcast. Everywhere. Effortlessly.
Share. Educate. Inspire. Entertain. You do you. We'll handle the rest.
→
na-o-ys
March 29, 2017
Programming
670
1
Share
疎行列と Jaccard 類似度の高速計算
na-o-ys
March 29, 2017
More Decks by na-o-ys
See All by na-o-ys
IoTと監視
naoys
1
830
RubyとJIT
naoys
0
180
将棋盤を画像認識したかった
naoys
0
1.6k
Rust で乗り換え案内
naoys
0
650
有理数集合の濃度
naoys
2
160
YARVの最適化について調べた
naoys
0
160
転職会議サービスのAWS移行記録
naoys
0
91
Anonymous Recursion in C++
naoys
0
440
入門AlphaGo
naoys
5
3.8k
Other Decks in Programming
See All in Programming
新規プロダクトを高速で生み出すハーネスエンジニアリング
seanchas116
15
6.9k
誰も頼んでない機能を出荷した話
zekutax
0
150
サーバーレスで作る、動画データ管理基盤
oyasumipants
0
300
TypeSpec で繋ぐ複数プロダクトの型安全
maroon8021
1
260
Transactional Change Stream Processing With Debezium and Apache Flink
gunnarmorling
1
140
3Dシーンの圧縮
fadis
1
440
RTSPクライアントを自作してみた話
simotin13
0
300
開発体験を左右するライブラリの API 設計 - GraphQL スキーマ構築ライブラリから考える #tskaigi
izumin5210
2
1.2k
生成AI時代にこそ効くGo | Why Go Works in the Age of Generative AI
mom0tomo
8
2.9k
AIとRubyの静的型付け
ukin0k0
0
270
今さら聞けないCancellationToken
htkym
0
200
権限チェックの一貫性を型で守る TypeScript による多層防御
mnch
4
900
Featured
See All Featured
4 Signs Your Business is Dying
shpigford
187
22k
B2B Lead Gen: Tactics, Traps & Triumph
marketingsoph
0
130
Leo the Paperboy
mayatellez
7
1.8k
Building Experiences: Design Systems, User Experience, and Full Site Editing
marktimemedia
0
520
The B2B funnel & how to create a winning content strategy
katarinadahlin
PRO
1
370
The SEO Collaboration Effect
kristinabergwall1
1
460
StorybookのUI Testing Handbookを読んだ
zakiyama
31
6.8k
[RailsConf 2023] Rails as a piece of cake
palkan
59
6.6k
Navigating Team Friction
lara
192
16k
The Art of Programming - Codeland 2020
erikaheidi
57
14k
SEO in 2025: How to Prepare for the Future of Search
ipullrank
3
3.5k
Future Trends and Review - Lecture 12 - Web Technologies (1019888BNR)
signer
PRO
0
3.6k
Transcript
ૄߦྻ ͋Δ͍ Jaccard ྨࣅΛߴͰܭࢉ͢Δํ๏ @na_o_ys
Agenda 1. ૄߦྻͷσʔλߏ 2. Python ͱܭࢉ 3. Python ͱૄߦྻ 4.
Jaccard ྨࣅ
1. ૄߦྻͷσʔλߏ
ૄߦྻͱ ΄ͱΜͲͷཁૉ͕ 0 Ͱ͋Δߦྻ
1. ૄߦྻͷσʔλߏ (1) • ௨ৗͷߦྻ Array • ૄߦྻΛ Array Ͱѻ͏ͱϝϞϦԋࢉແବ
• 0 ϕΫτϧಉ࢜ͷࢉͱ͔໌Β͔ʹແବ
1. ૄߦྻͷσʔλߏ (2) • Compressed Sparse Row (CSR) • CSR
ಉ࢜ͷՃࢉ, ߦྻੵ͕ߴ • ߦϕΫτϧͷऔΓग़͕͠ߴ • ྻϕΫτϧͷऔΓग़͕͠ • (wikipedia)
2. Python ͱܭࢉ
2. Python ͱܭࢉ • ख़ͨ͠ܭࢉϥΠϒϥϦ • NumPy, SciPy • Scikit-learn
ͱ͜ΖͰɺPython ͍ (DEMO)
Python ͍ • 5000 ഒ ࣮ߦ࣌ؒ 1ZUIPO NT Ұ෦/VN1Z NT
શ෦/VN1Z NT
Python-loop is Evil • ߦྻϧʔϓઈରʹॻ͍͍͚ͯͳ͍ • 1 ඵͰऴΘΔͣͷॲཧʹ 2 ͔͔࣌ؒΔ
• ߦϧʔϓ/ྻϧʔϓॻ͔ͳ͍ํ͕ྑ͍ • 1 ඵͰऴΘΔͣͷॲཧʹ 1 ͔͔Δ
3. Python ͱૄߦྻ
3. Python ͱૄߦྻ • scipy.sparse.csr_matrix
ޮతͳߦྻॲཧ • ߦϕΫτϧͷऔΓग़͠ • Ճࢉࢉ, ߦྻੵ • ෦දݱΛ numpy.ndarray ͱͯ͠อ࣋
• औΓग़ͯ͠ૢ࡞Ͱ͖Δ (NumPy ͷੈք Ͱ)
4. Jaccard ྨࣅ
4. Jaccard ྨࣅ • ϕΫτϧಉ࢜ͷྨࣅ • ڠௐϑΟϧλϦϯάͱ͔Ͱ͏ • ϢʔβAͱϢʔβBͲΕ͘Β͍ࣅ͍ͯΔ͔ Jaccard(a,
b) = a・b / (a・a + b・b - a・b)
ࣄͰඞཁʹͳͬͨ͜ͱ • ૄߦྻͷߦϕΫτϧಉ࢜ͷ Jaccard ྨࣅΛ ܭࢉ͍ͨ͠
DEMO
·ͱΊ
·ͱΊ • Python ͍ • ϥΠϒϥϦΛ͏·͘͏ඞཁ͕͋Δ • ϒϩάΛॻ͍ͨ • http://na-o-ys.github.io/others/
2015-11-07-sparse-vector- similarities.html