Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
K Nearest Neighbourhood on GPU
Search
Ciel
July 24, 2014
Research
0
37
K Nearest Neighbourhood on GPU
K Nearest Neighbourhood using inverted list on GPU
Ciel
July 24, 2014
Tweet
Share
More Decks by Ciel
See All by Ciel
LLVM IR & Optimisation Techniques
imwithye
0
140
Other Decks in Research
See All in Research
RSJ2024「基盤モデルの実ロボット応用」チュートリアルA(河原塚)
haraduka
2
590
[第62回NLPコロキウム]「なりきり」を促すHCI設計:対話型接客ロボットの遠隔操作者へのリアルタイム変換音声フィードバックの適用
nami_ogawa
0
280
Physics of Language Models: Part 3.1, Knowledge Storage and Extraction
sosk
1
880
SSII2024 [PD] SSII、次の30年への期待
ssii
PRO
2
1.4k
日本語医療LLM評価ベンチマークの構築と性能分析
fta98
3
480
研究の進め方 ランダムネスとの付き合い方について
joisino
PRO
49
17k
第 2 部 11 章「大規模言語モデルの研究開発から実運用に向けて」に向けて / MLOps Book Chapter 11
upura
0
270
出生抑制策と少子化
morimasao16
0
410
SSII2024 [OS1] 画像認識におけるモデル・データの共進化
ssii
PRO
0
490
「Goトレ」のご紹介
smartfukushilab1
0
570
「人間にAIはどのように辿り着けばよいのか?ー 系統的汎化からの第一歩 ー」@第22回 Language and Robotics研究会
maguro27
0
540
クラウドソーシングによる学習データ作成と品質管理(セキュリティキャンプ2024全国大会D2講義資料)
takumi1001
0
200
Featured
See All Featured
WebSockets: Embracing the real-time Web
robhawkes
59
7.3k
The Straight Up "How To Draw Better" Workshop
denniskardys
232
130k
Designing with Data
zakiwarfel
98
5.1k
Documentation Writing (for coders)
carmenintech
65
4.3k
Agile that works and the tools we love
rasmusluckow
327
21k
[RailsConf 2023] Rails as a piece of cake
palkan
49
4.7k
Performance Is Good for Brains [We Love Speed 2024]
tammyeverts
1
280
Designing for Performance
lara
604
68k
GraphQLとの向き合い方2022年版
quramy
43
13k
Atom: Resistance is Futile
akmur
261
25k
Six Lessons from altMBA
skipperchong
26
3.4k
Faster Mobile Websites
deanohume
304
30k
Transcript
Genie-and- Lamp-GPU Yiwei Gong K Nearest Neighbourhood using inverted list
on GPU
K Nearest Neighbourhood Fundamental Operator in Data Mining Classification 0
5 10 15 20 0 3 6 9 12 Regression Collaborative Filtering You may like * Apple * Google * Amazon
SELECT SEX M AGE 18 SALARY 2900 Sex Age Salary
… M 20 3000 … F 17 3600 … M 18 4000 … F 19 2900 … K Nearest Neighbourhood A running example
SELECT SEX M AGE 18 SALARY 2900 K Nearest Neighbourhood
Sex Age Salary … M 20 3000 … F 17 3600 … M 18 4000 … F 19 2900 … A running example
DIM + VALUE SEX+M SEX+F AGE+18 AGE+19 … 2 0
3 1 2 Invert list: row_id SELECT SEX M AGE 18 SALARY 2900 3 How do we store the inverted list table on GPU?
DIM + VALUE Inverted List … … AGE+17 1 AGE+18
2, 3 AGE+19 4 AGE+20 9, 10 AGE+21 11 … … Row ID Count AGG … … … 1 0 0 2 0 0 3 0 0 4 0 0 … … … SELECT AGE 18±1 Step 1: Matching & Aggregation
DIM + VALUE Inverted List … … AGE+17 1 AGE+18
2, 3 AGE+19 4 AGE+20 9, 10 AGE+21 11 … … Row ID Count AGG … … … 1 0 0 2 1 1*0.5 3 1 1*0.5 4 0 0 … … … SELECT AGE 18±1 Step 1: Matching & Aggregation
DIM + VALUE Inverted List … … AGE+17 1 AGE+18
2, 3 AGE+19 4 AGE+20 9, 10 AGE+21 11 … … Row ID Count AGG … … … 1 1 1*0.5 2 1 1*0.5 3 1 1*0.5 4 1 1*0.5 … … … SELECT AGE 18±1 Step 1: Matching & Aggregation
DIM + VALUE Inverted List … … SALARY+2500 NULL SALARY+3000
0, 3 SALARY+3500 1 SALARY+4000 2 SALARY+4500 4,5 … … SELECT SALARY 2900±1000 Row ID Count AGG … … … 1 1 0.5 2 1 0.5 3 1 0.5 4 1 0.5 … … … Step 1: Matching & Aggregation
DIM + VALUE Inverted List … … SALARY+2500 NULL SALARY+3000
0, 3 SALARY+3500 1 SALARY+4000 2 SALARY+4500 4,5 … … Row ID Count AGG … … … 1 1 0.5 2 1 0.5 3 2 1*0.3+0.5 4 1 0.5 … … … SELECT SALARY 2900±1000 Step 1: Matching & Aggregation
Block 1 Block 2 Block 2 SEX AGE SALARY GPU
Parallel Matching
Row ID Count AGG … … … 1 1 0.5
2 1 0.5 3 2 0.8 4 1 0.5 … … … K Selection What is the fast K Selection algorithm? Step 2: K Selection
R_id R_id R_id R_id R_id R_id R_id D+V1 D+V2 D+V3
invert_list_idx invert_list_table end_index First approach to store the inverted list table on GPU GPU
Host Device Map Main Memory ! KEY GPU Memory !
VALUE
dimension + value1 dimension + value2 Invert_list_idx Invert_list_table
None
Mapping C P U ! M E M O R
Y
Mapping C P U ! M E M O R
Y
Mapping C P U ! M E M O R
Y MAP(KEY, INDEX) device_vector
Mapping C P U ! M E M O R
Y raw_pointer get(key) map(key, value) freeze() ratio()
Bucket Top K Selection Algorithm 2 4 1 5 2
1 K = 10 First 7 results Bucket_Num = (Value - MIN) / (MAX - MIN) * Number_Of_Buckets
Bucket Top K Selection Algorithm Accept Multi Queries K =
2 K = 5 K = 6 K = 3
#define NAME “YIWEI GONG” #define UNIVERSITY “NTU” #define EMAIL “
[email protected]
”
#define BLOG “http://ciel.im” #define ME “A stupid programmer” THANK YOU
Block 1 Block 2 Block 3 Block 4 Block 5
Block 6 GPU Thread 1 Thread 2 Thread 3 Thread 4 Thread 5 Thread 6 Block