Matrix Multiplication
Moro
November 14, 2018
Parallel Computing in Shared Memory using OpenMP - Matrix Multiplication problem.
Transcript
Matrix Multiplication: Parallel Computing in Shared Memory using OpenMP
Gabriel Moro - Knowledge Transfer (KT), Porto Alegre - November 2018
Matrix Multiplication A B C
Ways to improve the performance of this algorithm
- Algorithm complexity
- Parallelism
  - Shared Memory
  - Distributed Memory
Parallel OpenMP Model (figure labels: A, C, T1, T2, T3)
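The labels on this slide suggest that each thread (T1, T2, T3) works on its own band of rows. The sketch below illustrates one way such a split can be computed. It is an assumption for illustration, not the deck's code: the helper names row_range and print_partition are hypothetical, and the exact division an OpenMP runtime chooses for schedule(static) may differ.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not from the slides): computes the half-open row
 * range [*begin, *end) that thread `tid` out of `nthreads` would handle
 * when `size` rows are divided into contiguous, nearly equal bands. */
void row_range(int size, int nthreads, int tid, int *begin, int *end)
{
    assert(size >= 0 && nthreads > 0 && tid >= 0 && tid < nthreads);
    int base = size / nthreads;   /* rows that every thread receives */
    int extra = size % nthreads;  /* the first `extra` threads get one more */
    *begin = tid * base + (tid < extra ? tid : extra);
    *end = *begin + base + (tid < extra ? 1 : 0);
}

/* Prints the band assigned to each thread, numbered T1, T2, ... as on the slide. */
void print_partition(int size, int nthreads)
{
    for (int tid = 0; tid < nthreads; tid++) {
        int begin, end;
        row_range(size, nthreads, tid, &begin, &end);
        printf("T%d: rows [%d, %d)\n", tid + 1, begin, end);
    }
}
```

Because the bands are disjoint, each thread writes to different rows of C and no synchronization is needed inside the loop.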
Turing
- Processor
  - 4 x Intel Xeon X7550 Nehalem
  - 32 physical cores
  - HyperThreading
- Memory
  - 128GB DDR3
- GPPD-UFRGS
Version: normal_seq

    /* Sequential C = A x B using 2-D indexing */
    for (i = 0; i < size; i++) {
        for (j = 0; j < size; j++) {
            tmp = 0;
            for (k = 0; k < size; k++)
                tmp = tmp + A[i][k] * B[k][j];
            C[i][j] = tmp;
        }
    }
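Version: normal_par

    /* Iterations of the outer i loop (rows of C) are divided among threads;
       i, j, k and tmp are private to each thread */
    #pragma omp parallel for private(i,j,k,tmp)
    for (i = 0; i < size; i++) {
        for (j = 0; j < size; j++) {
            tmp = 0;
            for (k = 0; k < size; k++)
                tmp = tmp + A[i][k] * B[k][j];
            C[i][j] = tmp;
        }
    }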
Version: continuos_seq

    /* Matrices stored as flat arrays: element (i, j) is at index i * size + j */
    for (i = 0; i < size; i++) {
        for (j = 0; j < size; j++) {
            tmp = 0;
            for (k = 0; k < size; k++)
                tmp = tmp + A[i * size + k] * B[k * size + j];
            C[i * size + j] = tmp;
        }
    }
Version: continuos_par

    /* Flat-array version with the outer i loop divided among threads */
    #pragma omp parallel for private(i,j,k,tmp)
    for (i = 0; i < size; i++) {
        for (j = 0; j < size; j++) {
            tmp = 0;
            for (k = 0; k < size; k++)
                tmp = tmp + A[i * size + k] * B[k * size + j];
            C[i * size + j] = tmp;
        }
    }
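The flat indexing above assumes that each matrix lives in a single block of memory. A minimal sketch of that layout follows. The function names matrix_alloc and matmul_flat are hypothetical and not taken from the deck. The pragma is guarded so the code also builds, and runs sequentially, when OpenMP is not enabled at compile time.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocates a size x size matrix as one contiguous,
 * zero-filled block. Element (i, j) lives at index i * size + j (row-major).
 * Returns NULL on allocation failure; the caller frees the result. */
double *matrix_alloc(size_t size)
{
    return calloc(size * size, sizeof(double));
}

/* C = A x B on flat row-major arrays, mirroring the loop on the slide.
 * Loop variables are declared inside the loops, so each thread has its own
 * copies and no private(...) clause is needed. */
void matmul_flat(const double *A, const double *B, double *C, int size)
{
    assert(A && B && C && size > 0);
#ifdef _OPENMP
    #pragma omp parallel for
#endif
    for (int i = 0; i < size; i++) {
        for (int j = 0; j < size; j++) {
            double tmp = 0.0;
            for (int k = 0; k < size; k++)
                tmp += A[i * size + k] * B[k * size + j];
            C[i * size + j] = tmp;
        }
    }
}
```

With GCC or Clang the parallel path is typically enabled with the -fopenmp flag. A single allocation per matrix keeps consecutive rows adjacent in memory, which an array of separately allocated rows does not guarantee.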
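Version: tiling_seq

    /* Loop tiling: B is walked in block x block tiles to improve cache reuse.
       R must be zero-initialized; min() is assumed to be defined elsewhere. */
    register int jj, kk, i, j, k;
    double tmp = 0;
    for (jj = 0; jj < size; jj = jj + block) {
        for (kk = 0; kk < size; kk = kk + block) {
            for (i = 0; i < size; i++) {
                for (j = jj; j < min(jj + block, size); j++) {
                    tmp = 0;
                    for (k = kk; k < min(kk + block, size); k++) {
                        tmp = tmp + A[i][k] * B[k][j];
                    }
                    /* add this kk tile's partial sum instead of overwriting */
                    R[i][j] = R[i][j] + tmp;
                }
            }
        }
    }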
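Version: tiling_par

    /* Tiled version; within each (jj, kk) tile the i loop is divided among threads.
       R must be zero-initialized; min() is assumed to be defined elsewhere. */
    register int jj, kk, i, j, k;
    double tmp = 0;
    for (jj = 0; jj < size; jj = jj + block) {
        for (kk = 0; kk < size; kk = kk + block) {
            #pragma omp parallel for private(i,j,k,tmp) schedule(static)
            for (i = 0; i < size; i++) {
                for (j = jj; j < min(jj + block, size); j++) {
                    tmp = 0;
                    for (k = kk; k < min(kk + block, size); k++) {
                        tmp = tmp + A[i][k] * B[k][j];
                    }
                    /* add this kk tile's partial sum instead of overwriting */
                    R[i][j] = R[i][j] + tmp;
                }
            }
        }
    }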
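In the tiled loops each kk tile covers only part of the k range, so the partial sums from all kk tiles have to be added together to obtain the full dot product. The sketch below is a self-contained sequential variant that makes this explicit. It uses flat row-major arrays for simplicity (an assumption; the slides use 2-D indexing), and the names min_int and matmul_tiled are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper standing in for the min used on the slides. */
static int min_int(int a, int b) { return a < b ? a : b; }

/* Tiled R = A x B on flat row-major arrays. R is cleared first, then every
 * (jj, kk) tile adds its partial dot products into R. */
void matmul_tiled(const double *A, const double *B, double *R,
                  int size, int block)
{
    assert(A && B && R && size > 0 && block > 0);
    for (int n = 0; n < size * size; n++)
        R[n] = 0.0;
    for (int jj = 0; jj < size; jj += block) {
        for (int kk = 0; kk < size; kk += block) {
            for (int i = 0; i < size; i++) {
                for (int j = jj; j < min_int(jj + block, size); j++) {
                    double tmp = 0.0;
                    for (int k = kk; k < min_int(kk + block, size); k++)
                        tmp += A[i * size + k] * B[k * size + j];
                    R[i * size + j] += tmp;  /* accumulate, do not overwrite */
                }
            }
        }
    }
}
```

A quick way to check a tiled implementation is to compare its output with the untiled loop for a size that is not a multiple of block, so the min bounds are exercised as well.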
Links
- Top 500: https://www.top500.org/lists/2018/11/
- Green 500: https://www.top500.org/green500/lists/2018/11/
- NAS Parallel Benchmark: https://www.nas.nasa.gov/publications/npb.html
Thanks!
https://github.com/tido4410/knowledge-transfer-gbmoro.git
Gabriel Moro - Matrix Multiplication - OpenMP - KT, Porto Alegre - November 2018