Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
疎行列と Jaccard 類似度の高速計算
Search
na-o-ys
March 29, 2017
Programming
1
620
疎行列と Jaccard 類似度の高速計算
na-o-ys
March 29, 2017
Tweet
Share
More Decks by na-o-ys
See All by na-o-ys
IoTと監視
naoys
1
780
RubyとJIT
naoys
0
160
将棋盤を画像認識したかった
naoys
0
1.5k
Rust で乗り換え案内
naoys
0
620
有理数集合の濃度
naoys
2
130
YARVの最適化について調べた
naoys
0
130
転職会議サービスのAWS移行記録
naoys
0
69
Anonymous Recursion in C++
naoys
0
420
入門AlphaGo
naoys
5
3.8k
Other Decks in Programming
See All in Programming
🔨 小さなビルドシステムを作る
momeemt
2
620
Laravel Boost 超入門
fire_arlo
2
160
AHC051解法紹介
eijirou
0
640
Rancher と Terraform
fufuhu
2
150
Processing Gem ベースの、2D レトロゲームエンジンの開発
tokujiros
2
110
UbieのAIパートナーを支えるコンテキストエンジニアリング実践
syucream
2
790
Ruby Parser progress report 2025
yui_knk
1
250
Oracle Database Technology Night 92 Database Connection control FAN-AC
oracle4engineer
PRO
1
340
Langfuseと歩む生成AI活用推進
licux
3
320
Updates on MLS on Ruby (and maybe more)
sylph01
1
160
令和最新版手のひらコンピュータ
koba789
14
8.2k
旅行プランAIエージェント開発の裏側
ippo012
1
540
Featured
See All Featured
Save Time (by Creating Custom Rails Generators)
garrettdimon
PRO
32
1.5k
The Web Performance Landscape in 2024 [PerfNow 2024]
tammyeverts
9
790
実際に使うSQLの書き方 徹底解説 / pgcon21j-tutorial
soudai
PRO
185
54k
Embracing the Ebb and Flow
colly
87
4.8k
Connecting the Dots Between Site Speed, User Experience & Your Business [WebExpo 2025]
tammyeverts
8
500
Keith and Marios Guide to Fast Websites
keithpitt
411
22k
Principles of Awesome APIs and How to Build Them.
keavy
126
17k
A better future with KSS
kneath
239
17k
Making Projects Easy
brettharned
117
6.4k
Rebuilding a faster, lazier Slack
samanthasiow
83
9.1k
Git: the NoSQL Database
bkeepers
PRO
431
66k
Designing for Performance
lara
610
69k
Transcript
ૄߦྻ ͋Δ͍ Jaccard ྨࣅΛߴͰܭࢉ͢Δํ๏ @na_o_ys
Agenda 1. ૄߦྻͷσʔλߏ 2. Python ͱܭࢉ 3. Python ͱૄߦྻ 4.
Jaccard ྨࣅ
1. ૄߦྻͷσʔλߏ
ૄߦྻͱ ΄ͱΜͲͷཁૉ͕ 0 Ͱ͋Δߦྻ
1. ૄߦྻͷσʔλߏ (1) • ௨ৗͷߦྻ Array • ૄߦྻΛ Array Ͱѻ͏ͱϝϞϦԋࢉແବ
• 0 ϕΫτϧಉ࢜ͷࢉͱ͔໌Β͔ʹແବ
1. ૄߦྻͷσʔλߏ (2) • Compressed Sparse Row (CSR) • CSR
ಉ࢜ͷՃࢉ, ߦྻੵ͕ߴ • ߦϕΫτϧͷऔΓग़͕͠ߴ • ྻϕΫτϧͷऔΓग़͕͠ • (wikipedia)
2. Python ͱܭࢉ
2. Python ͱܭࢉ • ख़ͨ͠ܭࢉϥΠϒϥϦ • NumPy, SciPy • Scikit-learn
ͱ͜ΖͰɺPython ͍ (DEMO)
Python ͍ • 5000 ഒ ࣮ߦ࣌ؒ 1ZUIPO NT Ұ෦/VN1Z NT
શ෦/VN1Z NT
Python-loop is Evil • ߦྻϧʔϓઈରʹॻ͍͍͚ͯͳ͍ • 1 ඵͰऴΘΔͣͷॲཧʹ 2 ͔͔࣌ؒΔ
• ߦϧʔϓ/ྻϧʔϓॻ͔ͳ͍ํ͕ྑ͍ • 1 ඵͰऴΘΔͣͷॲཧʹ 1 ͔͔Δ
3. Python ͱૄߦྻ
3. Python ͱૄߦྻ • scipy.sparse.csr_matrix
ޮతͳߦྻॲཧ • ߦϕΫτϧͷऔΓग़͠ • Ճࢉࢉ, ߦྻੵ • ෦දݱΛ numpy.ndarray ͱͯ͠อ࣋
• औΓग़ͯ͠ૢ࡞Ͱ͖Δ (NumPy ͷੈք Ͱ)
4. Jaccard ྨࣅ
4. Jaccard ྨࣅ • ϕΫτϧಉ࢜ͷྨࣅ • ڠௐϑΟϧλϦϯάͱ͔Ͱ͏ • ϢʔβAͱϢʔβBͲΕ͘Β͍ࣅ͍ͯΔ͔ Jaccard(a,
b) = a・b / (a・a + b・b - a・b)
ࣄͰඞཁʹͳͬͨ͜ͱ • ૄߦྻͷߦϕΫτϧಉ࢜ͷ Jaccard ྨࣅΛ ܭࢉ͍ͨ͠
DEMO
·ͱΊ
·ͱΊ • Python ͍ • ϥΠϒϥϦΛ͏·͘͏ඞཁ͕͋Δ • ϒϩάΛॻ͍ͨ • http://na-o-ys.github.io/others/
2015-11-07-sparse-vector- similarities.html