HLT CPU Consumption
Sasha Mazurov
February 06, 2012
Transcript
HLT CPU Consumption Sasha Mazurov 6 February 2012
Tool
Gaudi Auditor & Intel® VTune™ Amplifier XE 2011
Can be run on any lxplus node.
Benefits
➔ Can focus on a specific sequence or algorithm(s).
➔ Skips the initialization & finalization phases.
➔ Reports CPU consumption per algorithm / function / class / module.
➔ Perfect GUI & reports.
http://amazurov.ru/cern/intelprofiler/
- installation
- documentation
- screencasts
$> intelprofiler -o /where/to/store/profiler/output myJob.py
Profiler vs. HLT1 Lines (Offline)
https://github.com/mazurov/HltProfiling
Job Options (Moore v12r10)
profiler = IntelProfilerAuditor()
profiler.StartFromEventN = 5000
profiler.StopAtEventN = 15000
profiler.IncludeAlgorithms = ["Hlt1TrackAllL0", "Hlt1DiMuonHighMass", "Hlt1DiMuonLowMass"]
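The effect of these three options can be illustrated with a small stand-alone model. This is only a sketch of the selection logic the option names suggest, not the Gaudi auditor itself: the `ProfilingWindow` class, its `should_profile` method, and the choice of an inclusive start and exclusive stop event are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProfilingWindow:
    """Hypothetical model of the auditor options; not the Gaudi implementation."""

    start_from_event: int
    stop_at_event: int
    include_algorithms: List[str] = field(default_factory=list)

    def should_profile(self, event_number: int, algorithm: str) -> bool:
        # Assumption: the start event is inside the window, the stop event is not.
        in_window = self.start_from_event <= event_number < self.stop_at_event
        # Only algorithms named explicitly are collected.
        return in_window and algorithm in self.include_algorithms


# Values taken from the job options above.
window = ProfilingWindow(
    start_from_event=5000,
    stop_at_event=15000,
    include_algorithms=["Hlt1TrackAllL0", "Hlt1DiMuonHighMass", "Hlt1DiMuonLowMass"],
)

print(window.should_profile(4999, "Hlt1TrackAllL0"))   # before the window
print(window.should_profile(5000, "Hlt1TrackAllL0"))   # inside the window
print(window.should_profile(6000, "SomeOtherAlgorithm"))  # not in the include list
```

Restricting collection to an event window is what lets the profile skip the initialization and finalization phases of the job.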
Hotspots
Top Hotspots
CPU / Per Function
CPU / Per Module
CPU / Per Algorithm
http://amazurov.ru/cern/hltprofilingresults/
CPU / Per Function In Algorithm
CPU / Per Source Code (debug mode)
TCMalloc vs. “new” Operator
Before: CPU: 238 s
After: CPU: 222 s
Results
➔ tc_new is twice as fast as the “new” operator.
➔ 5% total improvement for the Hlt1 job.
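For reference, the relative change between the two CPU totals quoted on the previous slide can be worked out directly. The helper name `relative_improvement` is introduced here for illustration. The figure on the slide may be rounded or may come from a different measurement, so the two need not agree exactly.

```python
def relative_improvement(before_s: float, after_s: float) -> float:
    """Fraction of CPU time saved, relative to the 'before' measurement."""
    return (before_s - after_s) / before_s


# CPU totals from the "Before" and "After" profiles.
improvement = relative_improvement(before_s=238.0, after_s=222.0)
print(f"{improvement:.1%}")  # prints 6.7%
```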
GCC 4.3 vs. GCC 4.6
GCC 4.3 / GCC 4.6 (-O2 flag): ~3.6% worse
Two profiles comparison
Result (preliminary)
➔ It is not evident that GCC 4.6 optimizes better than GCC 4.3 (for HLT1 jobs).
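A comparison of two profiles like the one behind this result can be sketched as a per-algorithm diff of CPU times. This is a minimal illustration, not the VTune comparison view. The function `compare_profiles`, the dictionary layout, and the algorithm names and timings are all placeholders, not measurements from the talk.

```python
from typing import Dict, List, Optional, Tuple

Row = Tuple[str, float, Optional[float]]


def compare_profiles(baseline: Dict[str, float], candidate: Dict[str, float]) -> List[Row]:
    """Return (algorithm, delta_seconds, relative_change) rows, largest |delta| first."""
    rows: List[Row] = []
    for name in sorted(set(baseline) | set(candidate)):
        old = baseline.get(name, 0.0)
        new = candidate.get(name, 0.0)
        delta = new - old
        # The relative change is undefined when the algorithm is absent from the baseline.
        relative = delta / old if old else None
        rows.append((name, delta, relative))
    rows.sort(key=lambda row: abs(row[1]), reverse=True)
    return rows


# Placeholder CPU times in seconds, for illustration only.
baseline = {"AlgA": 10.0, "AlgB": 4.0}
candidate = {"AlgA": 9.0, "AlgB": 6.0, "AlgC": 0.5}

rows = compare_profiles(baseline, candidate)
for name, delta, relative in rows:
    rel_text = "n/a" if relative is None else f"{relative:+.0%}"
    print(f"{name}: {delta:+.1f} s ({rel_text})")
```

Sorting by absolute difference puts the algorithms that changed most between the two builds at the top, whether they became faster or slower.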
Future plans
➔ Profile code compiled with GCC 4.6 and the -O3 flag.
➔ Profile code compiled with GCC 4.6's profile-driven optimization.
➔ Create a web interface to display collected profiler results.
http://amazurov.ru/cern/hltprofilingpresentation