K Nearest Neighbourhood on GPU
Ciel
July 24, 2014
K Nearest Neighbourhood using inverted list on GPU
Transcript
Genie-and-Lamp-GPU
Yiwei Gong
K Nearest Neighbourhood using inverted list on GPU

K Nearest Neighbourhood: Fundamental Operator in Data Mining
Classification
[Figure: plot with axis ticks 0 to 20 and 0 to 12]
Regression
Collaborative Filtering: You may like * Apple * Google * Amazon

K Nearest Neighbourhood: A running example
Query: SELECT SEX M AGE 18 SALARY 2900

Sex   Age   Salary   …
M     20    3000     …
F     17    3600     …
M     18    4000     …
F     19    2900     …

Invert list: row_id
Query: SELECT SEX M AGE 18 SALARY 2900

DIM + VALUE   row_id
SEX+M         0, 2
SEX+F         1, 3
AGE+18        2
AGE+19        3
…

How do we store the inverted list table on GPU?
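
As a rough illustration of how an inverted list like the one above could be built on the host before any GPU transfer, here is a minimal C++ sketch. It assumes the four rows of the running example carry row ids 0 to 3 in table order; the `Row` struct and the helpers `build_inverted_list` and `example_rows` are hypothetical names that do not appear in the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// One row of the running example (hypothetical struct, not from the slides).
struct Row {
    std::string sex;
    int age;
    int salary;
};

// Inverted list: "DIM+VALUE" key -> ids of the rows that contain that pair.
using InvertedList = std::map<std::string, std::vector<int>>;

// Scans the table once and appends each row id to the list of every
// DIM+VALUE key found in that row.
InvertedList build_inverted_list(const std::vector<Row>& rows) {
    InvertedList inv;
    for (std::size_t i = 0; i < rows.size(); ++i) {
        assert(!rows[i].sex.empty());  // assumption: every row has a SEX value
        const int row_id = static_cast<int>(i);
        inv["SEX+" + rows[i].sex].push_back(row_id);
        inv["AGE+" + std::to_string(rows[i].age)].push_back(row_id);
        inv["SALARY+" + std::to_string(rows[i].salary)].push_back(row_id);
    }
    return inv;
}

// The four rows of the running example; row ids 0..3 are an assumption.
std::vector<Row> example_rows() {
    return {{"M", 20, 3000}, {"F", 17, 3600}, {"M", 18, 4000}, {"F", 19, 2900}};
}
```

Under these assumptions, `build_inverted_list(example_rows())` maps SEX+M to rows 0 and 2, SEX+F to rows 1 and 3, AGE+18 to row 2 and AGE+19 to row 3.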

Step 1: Matching & Aggregation
Query: SELECT AGE 18±1

DIM + VALUE   Inverted List
…             …
AGE+17        1
AGE+18        2, 3
AGE+19        4
AGE+20        9, 10
AGE+21        11
…             …

Initial state:
Row ID   Count   AGG
…        …       …
1        0       0
2        0       0
3        0       0
4        0       0
…        …       …

After matching AGE+18:
Row ID   Count   AGG
…        …       …
1        0       0
2        1       1*0.5
3        1       1*0.5
4        0       0
…        …       …

After matching AGE+17 and AGE+19:
Row ID   Count   AGG
…        …       …
1        1       1*0.5
2        1       1*0.5
3        1       1*0.5
4        1       1*0.5
…        …       …

Step 1: Matching & Aggregation
Query: SELECT SALARY 2900±1000

DIM + VALUE   Inverted List
…             …
SALARY+2500   NULL
SALARY+3000   0, 3
SALARY+3500   1
SALARY+4000   2
SALARY+4500   4, 5
…             …

Before matching:
Row ID   Count   AGG
…        …       …
1        1       0.5
2        1       0.5
3        1       0.5
4        1       0.5
…        …       …

After matching SALARY+3000:
Row ID   Count   AGG
…        …       …
1        1       0.5
2        1       0.5
3        2       1*0.3+0.5
4        1       0.5
…        …       …
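
The matching and aggregation step can be sketched sequentially in C++ as follows. This is a CPU reference under stated assumptions, not the GPU kernel from the talk: a query item "value ± radius" is treated as a closed interval, each matching row gets its Count increased by 1 and its AGG increased by 1 * weight, and the weights 0.5 (AGE) and 0.3 (SALARY) are read off the AGG column of the slides. The names `DimIndex`, `Score`, `match_and_aggregate` and `run_example` are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Per-dimension inverted list: value -> ids of the rows holding that value.
using DimIndex = std::map<int, std::vector<int>>;

// Running totals kept per row id (the Count and AGG columns).
struct Score {
    int count = 0;
    double agg = 0.0;
};

// For one query item on one dimension, visits every inverted list whose
// value lies in [center - radius, center + radius] and updates the rows in it.
void match_and_aggregate(const DimIndex& index, int center, int radius,
                         double weight, std::vector<Score>& scores) {
    assert(radius >= 0);
    auto first = index.lower_bound(center - radius);
    auto last = index.upper_bound(center + radius);
    for (auto it = first; it != last; ++it) {
        for (int row_id : it->second) {
            assert(row_id >= 0 && row_id < static_cast<int>(scores.size()));
            scores[row_id].count += 1;
            scores[row_id].agg += 1 * weight;
        }
    }
}

// Replays the two query items of the slides on the inverted lists shown there.
std::vector<Score> run_example() {
    DimIndex age = {{17, {1}}, {18, {2, 3}}, {19, {4}}, {20, {9, 10}}, {21, {11}}};
    DimIndex salary = {{3000, {0, 3}}, {3500, {1}}, {4000, {2}}, {4500, {4, 5}}};
    std::vector<Score> scores(12);  // row ids 0..11
    match_and_aggregate(age, 18, 1, 0.5, scores);         // SELECT AGE 18±1
    match_and_aggregate(salary, 2900, 1000, 0.3, scores); // SELECT SALARY 2900±1000
    return scores;
}
```

With these inputs row 3 ends with Count 2 and AGG 0.3 + 0.5, as on the slide. Under the closed-interval assumption SALARY+3500 also lies within 2900±1000, so row 1 ends with Count 2 in this sketch, whereas the slide frames only show the SALARY+3000 list being applied.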

Parallel Matching
GPU: Block 1 (SEX), Block 2 (AGE), Block 3 (SALARY)

Step 2: K Selection

Row ID   Count   AGG
…        …       …
1        1       0.5
2        1       0.5
3        2       0.8
4        1       0.5
…        …       …

What is the fast K Selection algorithm?

First approach to store the inverted list table on GPU

invert_list_idx (end_index):   D+V1 | D+V2 | D+V3
invert_list_table:             R_id | R_id | R_id | R_id | R_id | R_id | R_id
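
One way to read this layout is as two flat arrays: `invert_list_table` stores all row-id lists back to back, and `invert_list_idx` stores, for each DIM+VALUE key, the end index of its list. The C++ sketch below builds and queries such a layout on the host. It is an interpretation of the diagram, not the author's code: treating `end_index` as exclusive, the `key_to_slot` map, and the helpers `flatten` and `rows_for` are all assumptions.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Flattened inverted list (hypothetical struct mirroring the slide's names).
struct FlatInvertedList {
    std::map<std::string, int> key_to_slot;  // host-side lookup: key -> slot
    std::vector<int> invert_list_idx;        // per slot: exclusive end index
    std::vector<int> invert_list_table;      // all row ids, back to back
};

// Concatenates the per-key lists and records where each one ends.
FlatInvertedList flatten(const std::map<std::string, std::vector<int>>& lists) {
    FlatInvertedList flat;
    for (const auto& entry : lists) {
        const int slot = static_cast<int>(flat.invert_list_idx.size());
        flat.key_to_slot[entry.first] = slot;
        flat.invert_list_table.insert(flat.invert_list_table.end(),
                                      entry.second.begin(), entry.second.end());
        flat.invert_list_idx.push_back(
            static_cast<int>(flat.invert_list_table.size()));
    }
    return flat;
}

// Recovers the row ids of one key: the list starts where the previous
// slot ended (or at 0 for the first slot) and stops at its own end index.
std::vector<int> rows_for(const FlatInvertedList& flat, const std::string& key) {
    auto it = flat.key_to_slot.find(key);
    if (it == flat.key_to_slot.end()) return std::vector<int>();
    const int slot = it->second;
    assert(slot >= 0 && slot < static_cast<int>(flat.invert_list_idx.size()));
    const int begin = (slot == 0) ? 0 : flat.invert_list_idx[slot - 1];
    const int end = flat.invert_list_idx[slot];
    return std::vector<int>(flat.invert_list_table.begin() + begin,
                            flat.invert_list_table.begin() + end);
}
```

Storing only end indices keeps the index array at one integer per key, and both arrays are contiguous, which is what makes them straightforward to copy into GPU memory as plain buffers.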

Host Device Map
Main Memory: KEY (dimension + value1, dimension + value2)
GPU Memory: VALUE (Invert_list_idx, Invert_list_table)

Mapping (CPU, MEMORY)
MAP(KEY, INDEX)
device_vector
raw_pointer
get(key)
map(key, value)
freeze()
ratio()

Bucket Top K Selection Algorithm
Buckets: 2 4 1 5 2 1
K = 10
First 7 results
Bucket_Num = (Value - MIN) / (MAX - MIN) * Number_Of_Buckets
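
The slide suggests that values are distributed into buckets with the formula above and that whole buckets are accepted until the next one would exceed K (2 + 4 + 1 = 7 results for K = 10). The C++ sketch below is one sequential interpretation of that idea, not the GPU implementation from the talk. It assumes "top" means the largest values, clamps Value == MAX into the last bucket, and resolves the bucket that straddles K by sorting it; `bucket_of` and `bucket_top_k` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Bucket_Num = (Value - MIN) / (MAX - MIN) * Number_Of_Buckets, clamped so
// that Value == MAX lands in the last bucket instead of one past it.
int bucket_of(double value, double min, double max, int num_buckets) {
    if (max == min) return 0;  // all values equal: a single bucket suffices
    const int b = static_cast<int>((value - min) / (max - min) * num_buckets);
    return std::min(b, num_buckets - 1);
}

// Returns up to k of the largest values, in descending order.
std::vector<double> bucket_top_k(const std::vector<double>& values, int k,
                                 int num_buckets) {
    assert(k >= 0 && num_buckets > 0);
    if (values.empty() || k == 0) return std::vector<double>();
    const double min = *std::min_element(values.begin(), values.end());
    const double max = *std::max_element(values.begin(), values.end());

    // Distribute every value into its bucket.
    std::vector<std::vector<double>> buckets(num_buckets);
    for (double v : values) {
        buckets[bucket_of(v, min, max, num_buckets)].push_back(v);
    }

    // Walk the buckets from the highest down. A bucket that fits is taken
    // whole; the bucket that straddles k contributes only its largest values.
    std::vector<double> result;
    for (int b = num_buckets - 1;
         b >= 0 && static_cast<int>(result.size()) < k; --b) {
        std::vector<double>& bucket = buckets[b];
        std::sort(bucket.begin(), bucket.end(), std::greater<double>());
        const int need = k - static_cast<int>(result.size());
        const int take = std::min(need, static_cast<int>(bucket.size()));
        result.insert(result.end(), bucket.begin(), bucket.begin() + take);
    }
    return result;
}
```

Only the buckets that contribute results are sorted here, so lower buckets are never ordered; a GPU version would more likely re-bucket the straddling bucket than sort it, but that variant is not shown on the slides.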

Bucket Top K Selection Algorithm: Accept Multi Queries
K = 2, K = 5, K = 6, K = 3

#define NAME "YIWEI GONG"
#define UNIVERSITY "NTU"
#define EMAIL "[email protected]"
#define BLOG "http://ciel.im"
#define ME "A stupid programmer"
THANK YOU

GPU: Block 1, Block 2, Block 3, Block 4, Block 5, Block 6
Block: Thread 1, Thread 2, Thread 3, Thread 4, Thread 5, Thread 6