HLT CPU Consumption
Sasha Mazurov
February 06, 2012
Transcript
HLT CPU Consumption
Sasha Mazurov
6 February 2012
Tool
Gaudi Auditor & Intel® VTune™ Amplifier XE 2011
Can be run on any lxplus node
Benefits
➔ Can focus on a specific sequence or algorithm(s).
➔ Skips the initialization and finalization phases.
➔ Reports CPU consumption per algorithm, function, class, or module.
➔ Perfect GUI and reports.
http://amazurov.ru/cern/intelprofiler/
- installation
- documentation
- screencasts
$> intelprofiler -o /where/to/store/profiler/output myJob.py
Profiler vs. HLT1 Lines (Offline)
https://github.com/mazurov/HltProfiling
Job Options
Moore v12r10
profiler = IntelProfilerAuditor()
profiler.StartFromEventN = 5000
profiler.StopAtEventN = 15000
profiler.IncludeAlgorithms = ["Hlt1TrackAllL0", "Hlt1DiMuonHighMass", "Hlt1DiMuonLowMass"]
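The job options above restrict profiling to an event window and to a list of algorithms. The following is a minimal pure-Python sketch of that selection logic, not the Gaudi auditor itself: the class `ProfilingWindow` and its method `is_active` are hypothetical names introduced for illustration, and the boundary handling (start inclusive, stop exclusive) and the "empty list means all algorithms" rule are assumptions rather than documented behaviour.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProfilingWindow:
    """Hypothetical stand-in that mimics the three options shown on the slide."""

    start_from_event_n: int = 0
    stop_at_event_n: int = 0
    include_algorithms: List[str] = field(default_factory=list)

    def is_active(self, event_number: int, algorithm: str) -> bool:
        # Assumption: the start event is included, the stop event is not.
        in_window = self.start_from_event_n <= event_number < self.stop_at_event_n
        # Assumption: an empty include list selects every algorithm.
        selected = not self.include_algorithms or algorithm in self.include_algorithms
        return in_window and selected


# Values taken from the job options on the slide.
window = ProfilingWindow(
    start_from_event_n=5000,
    stop_at_event_n=15000,
    include_algorithms=["Hlt1TrackAllL0", "Hlt1DiMuonHighMass", "Hlt1DiMuonLowMass"],
)
```

With these settings the first 5000 events are skipped, so initialization and warm-up costs stay out of the collected profile, and only the three listed HLT1 lines contribute samples.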
Hotspots
Top Hotspots
CPU / Per Function
CPU / Per Module
CPU / Per Algorithm
http://amazurov.ru/cern/hltprofilingresults/
CPU / Per Function In Algorithm
CPU / Per Source Code (debug mode)
TCMalloc vs. “new” Operator
Before: CPU 238 s
After: CPU 222 s
Results
➔ tc_new is twice as fast as the “new” operator.
➔ 5% total improvement for the Hlt1 job.
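As a quick check, the before and after CPU totals quoted for this comparison (238 s and 222 s) can be turned into a relative saving. The helper name `relative_saving` is introduced here for illustration only. On those two numbers the saving comes out somewhat higher than the 5% stated in the results, so the 5% figure may be rounded or may come from a different measurement; the slides do not say which.

```python
def relative_saving(before_s: float, after_s: float) -> float:
    """Fraction of CPU time saved when going from before_s to after_s."""
    if before_s <= 0:
        raise ValueError("before_s must be positive")
    return (before_s - after_s) / before_s


# CPU totals as quoted on the slide, in seconds.
saving = relative_saving(238.0, 222.0)
print(f"{saving:.1%}")  # prints 6.7%
```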
GCC 4.3 vs. GCC 4.6
GCC 4.3
GCC 4.6
-O2 flag
~3.6% worse
Two profiles comparison
Result (preliminary)
➔ It is not evident that GCC 4.6 optimizes better than GCC 4.3 (for HLT1 jobs).
Future plans
➔ Profile code compiled with GCC 4.6 and the -O3 flag.
➔ Profile code compiled with GCC 4.6's profile-driven optimization.
➔ Create a web interface to display collected profiler results.
http://amazurov.ru/cern/hltprofilingpresentation