Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
疎行列と Jaccard 類似度の高速計算
Search
na-o-ys
March 29, 2017
Programming
1
580
疎行列と Jaccard 類似度の高速計算
na-o-ys
March 29, 2017
Tweet
Share
More Decks by na-o-ys
See All by na-o-ys
IoTと監視
naoys
1
750
RubyとJIT
naoys
0
150
将棋盤を画像認識したかった
naoys
0
1.5k
Rust で乗り換え案内
naoys
0
610
有理数集合の濃度
naoys
2
110
YARVの最適化について調べた
naoys
0
120
転職会議サービスのAWS移行記録
naoys
0
55
Anonymous Recursion in C++
naoys
0
400
入門AlphaGo
naoys
5
3.7k
Other Decks in Programming
See All in Programming
コミュニティ駆動 AWS CDK ライブラリ「Open Constructs Library」 / community-cdk-library
gotok365
2
260
Better Code Design in PHP
afilina
0
180
クックパッド検索システム統合/Cookpad Search System Consolidation
giga811
0
150
推しメソッドsource_locationのしくみを探る - はじめてRubyのコードを読んでみた
nobu09
2
360
バッチを作らなきゃとなったときに考えること
irof
2
560
Webフレームワークとともに利用するWeb components / JSConf.jp おかわり
spring_raining
1
140
Drawing Heighway’s Dragon- Recursive Function Rewrite- From Imperative Style in Pascal 64 To Functional Style in Scala 3
philipschwarz
PRO
0
160
ML.NETで始める機械学習
ymd65536
0
240
Learning Kotlin with detekt
inouehi
1
200
変化の激しい時代における、こだわりのないエンジニアの強さ
satoshi256kbyte
1
120
若手バックエンドエンジニアが Elasticsearch を使ってみた話
hott0mott0
1
100
TCAを用いたAmebaのリアーキテクチャ
dazy
0
220
Featured
See All Featured
It's Worth the Effort
3n
184
28k
Save Time (by Creating Custom Rails Generators)
garrettdimon
PRO
29
1.1k
How GitHub (no longer) Works
holman
314
140k
A better future with KSS
kneath
238
17k
Intergalactic Javascript Robots from Outer Space
tanoku
270
27k
Building an army of robots
kneath
303
45k
Chrome DevTools: State of the Union 2024 - Debugging React & Beyond
addyosmani
4
390
Practical Tips for Bootstrapping Information Extraction Pipelines
honnibal
PRO
13
1k
Music & Morning Musume
bryan
46
6.4k
JavaScript: Past, Present, and Future - NDC Porto 2020
reverentgeek
47
5.2k
The Cost Of JavaScript in 2023
addyosmani
47
7.5k
Adopting Sorbet at Scale
ufuk
75
9.2k
Transcript
ૄߦྻ ͋Δ͍ Jaccard ྨࣅΛߴͰܭࢉ͢Δํ๏ @na_o_ys
Agenda 1. ૄߦྻͷσʔλߏ 2. Python ͱܭࢉ 3. Python ͱૄߦྻ 4.
Jaccard ྨࣅ
1. ૄߦྻͷσʔλߏ
ૄߦྻͱ ΄ͱΜͲͷཁૉ͕ 0 Ͱ͋Δߦྻ
1. ૄߦྻͷσʔλߏ (1) • ௨ৗͷߦྻ Array • ૄߦྻΛ Array Ͱѻ͏ͱϝϞϦԋࢉແବ
• 0 ϕΫτϧಉ࢜ͷࢉͱ͔໌Β͔ʹແବ
1. ૄߦྻͷσʔλߏ (2) • Compressed Sparse Row (CSR) • CSR
ಉ࢜ͷՃࢉ, ߦྻੵ͕ߴ • ߦϕΫτϧͷऔΓग़͕͠ߴ • ྻϕΫτϧͷऔΓग़͕͠ • (wikipedia)
2. Python ͱܭࢉ
2. Python ͱܭࢉ • ख़ͨ͠ܭࢉϥΠϒϥϦ • NumPy, SciPy • Scikit-learn
ͱ͜ΖͰɺPython ͍ (DEMO)
Python ͍ • 5000 ഒ ࣮ߦ࣌ؒ 1ZUIPO NT Ұ෦/VN1Z NT
શ෦/VN1Z NT
Python-loop is Evil • ߦྻϧʔϓઈରʹॻ͍͍͚ͯͳ͍ • 1 ඵͰऴΘΔͣͷॲཧʹ 2 ͔͔࣌ؒΔ
• ߦϧʔϓ/ྻϧʔϓॻ͔ͳ͍ํ͕ྑ͍ • 1 ඵͰऴΘΔͣͷॲཧʹ 1 ͔͔Δ
3. Python ͱૄߦྻ
3. Python ͱૄߦྻ • scipy.sparse.csr_matrix
ޮతͳߦྻॲཧ • ߦϕΫτϧͷऔΓग़͠ • Ճࢉࢉ, ߦྻੵ • ෦දݱΛ numpy.ndarray ͱͯ͠อ࣋
• औΓग़ͯ͠ૢ࡞Ͱ͖Δ (NumPy ͷੈք Ͱ)
4. Jaccard ྨࣅ
4. Jaccard ྨࣅ • ϕΫτϧಉ࢜ͷྨࣅ • ڠௐϑΟϧλϦϯάͱ͔Ͱ͏ • ϢʔβAͱϢʔβBͲΕ͘Β͍ࣅ͍ͯΔ͔ Jaccard(a,
b) = a・b / (a・a + b・b - a・b)
ࣄͰඞཁʹͳͬͨ͜ͱ • ૄߦྻͷߦϕΫτϧಉ࢜ͷ Jaccard ྨࣅΛ ܭࢉ͍ͨ͠
DEMO
·ͱΊ
·ͱΊ • Python ͍ • ϥΠϒϥϦΛ͏·͘͏ඞཁ͕͋Δ • ϒϩάΛॻ͍ͨ • http://na-o-ys.github.io/others/
2015-11-07-sparse-vector- similarities.html