Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
疎行列と Jaccard 類似度の高速計算
Search
na-o-ys
March 29, 2017
Programming
1
650
疎行列と Jaccard 類似度の高速計算
na-o-ys
March 29, 2017
Tweet
Share
More Decks by na-o-ys
See All by na-o-ys
IoTと監視
naoys
1
810
RubyとJIT
naoys
0
170
将棋盤を画像認識したかった
naoys
0
1.6k
Rust で乗り換え案内
naoys
0
640
有理数集合の濃度
naoys
2
140
YARVの最適化について調べた
naoys
0
150
転職会議サービスのAWS移行記録
naoys
0
79
Anonymous Recursion in C++
naoys
0
430
入門AlphaGo
naoys
5
3.8k
Other Decks in Programming
See All in Programming
ゲームの物理 剛体編
fadis
0
400
AI前提で考えるiOSアプリのモダナイズ設計
yuukiw00w
0
210
Combinatorial Interview Problems with Backtracking Solutions - From Imperative Procedural Programming to Declarative Functional Programming - Part 2
philipschwarz
PRO
0
140
dchart: charts from deck markup
ajstarks
3
950
Graviton と Nitro と私
maroon1st
0
160
クラウドに依存しないS3を使った開発術
simesaba80
0
220
Context is King? 〜Verifiability時代とコンテキスト設計 / Beyond "Context is King"
rkaga
10
1.5k
Python札幌 LT資料
t3tra
7
1.1k
Kotlin Multiplatform Meetup - Compose Multiplatform 외부 의존성 아키텍처 설계부터 운영까지
wisemuji
0
170
Denoのセキュリティに関する仕組みの紹介 (toranoana.deno #23)
uki00a
0
220
JETLS.jl ─ A New Language Server for Julia
abap34
2
470
CSC307 Lecture 04
javiergs
PRO
0
620
Featured
See All Featured
Test your architecture with Archunit
thirion
1
2.1k
Avoiding the “Bad Training, Faster” Trap in the Age of AI
tmiket
0
51
The State of eCommerce SEO: How to Win in Today's Products SERPs - #SEOweek
aleyda
2
9.3k
SEOcharity - Dark patterns in SEO and UX: How to avoid them and build a more ethical web
sarafernandez
0
100
Writing Fast Ruby
sferik
630
62k
ピンチをチャンスに:未来をつくるプロダクトロードマップ #pmconf2020
aki_iinuma
128
55k
Utilizing Notion as your number one productivity tool
mfonobong
2
200
Context Engineering - Making Every Token Count
addyosmani
9
590
Site-Speed That Sticks
csswizardry
13
1k
Become a Pro
speakerdeck
PRO
31
5.8k
How to Think Like a Performance Engineer
csswizardry
28
2.4k
GraphQLの誤解/rethinking-graphql
sonatard
74
11k
Transcript
ૄߦྻ ͋Δ͍ Jaccard ྨࣅΛߴͰܭࢉ͢Δํ๏ @na_o_ys
Agenda 1. ૄߦྻͷσʔλߏ 2. Python ͱܭࢉ 3. Python ͱૄߦྻ 4.
Jaccard ྨࣅ
1. ૄߦྻͷσʔλߏ
ૄߦྻͱ ΄ͱΜͲͷཁૉ͕ 0 Ͱ͋Δߦྻ
1. ૄߦྻͷσʔλߏ (1) • ௨ৗͷߦྻ Array • ૄߦྻΛ Array Ͱѻ͏ͱϝϞϦԋࢉແବ
• 0 ϕΫτϧಉ࢜ͷࢉͱ͔໌Β͔ʹແବ
1. ૄߦྻͷσʔλߏ (2) • Compressed Sparse Row (CSR) • CSR
ಉ࢜ͷՃࢉ, ߦྻੵ͕ߴ • ߦϕΫτϧͷऔΓग़͕͠ߴ • ྻϕΫτϧͷऔΓग़͕͠ • (wikipedia)
2. Python ͱܭࢉ
2. Python ͱܭࢉ • ख़ͨ͠ܭࢉϥΠϒϥϦ • NumPy, SciPy • Scikit-learn
ͱ͜ΖͰɺPython ͍ (DEMO)
Python ͍ • 5000 ഒ ࣮ߦ࣌ؒ 1ZUIPO NT Ұ෦/VN1Z NT
શ෦/VN1Z NT
Python-loop is Evil • ߦྻϧʔϓઈରʹॻ͍͍͚ͯͳ͍ • 1 ඵͰऴΘΔͣͷॲཧʹ 2 ͔͔࣌ؒΔ
• ߦϧʔϓ/ྻϧʔϓॻ͔ͳ͍ํ͕ྑ͍ • 1 ඵͰऴΘΔͣͷॲཧʹ 1 ͔͔Δ
3. Python ͱૄߦྻ
3. Python ͱૄߦྻ • scipy.sparse.csr_matrix
ޮతͳߦྻॲཧ • ߦϕΫτϧͷऔΓग़͠ • Ճࢉࢉ, ߦྻੵ • ෦දݱΛ numpy.ndarray ͱͯ͠อ࣋
• औΓग़ͯ͠ૢ࡞Ͱ͖Δ (NumPy ͷੈք Ͱ)
4. Jaccard ྨࣅ
4. Jaccard ྨࣅ • ϕΫτϧಉ࢜ͷྨࣅ • ڠௐϑΟϧλϦϯάͱ͔Ͱ͏ • ϢʔβAͱϢʔβBͲΕ͘Β͍ࣅ͍ͯΔ͔ Jaccard(a,
b) = a・b / (a・a + b・b - a・b)
ࣄͰඞཁʹͳͬͨ͜ͱ • ૄߦྻͷߦϕΫτϧಉ࢜ͷ Jaccard ྨࣅΛ ܭࢉ͍ͨ͠
DEMO
·ͱΊ
·ͱΊ • Python ͍ • ϥΠϒϥϦΛ͏·͘͏ඞཁ͕͋Δ • ϒϩάΛॻ͍ͨ • http://na-o-ys.github.io/others/
2015-11-07-sparse-vector- similarities.html