HLT CPU Consumption
Sasha Mazurov
February 06, 2012
Transcript
HLT CPU Consumption. Sasha Mazurov, 6 February 2012
Tool
Gaudi Auditor & Intel® VTune™ Amplifier XE 2011
Can be run on any lxplus node.
Benefits
➔ Can focus on a specific sequence or specific algorithms.
➔ Skips the initialization and finalization phases.
➔ Reports CPU consumption per algorithm, function, class, or module.
➔ Perfect GUI & reports.
http://amazurov.ru/cern/intelprofiler/
- installation
- documentation
- screencasts
$> intelprofiler -o /where/to/store/profiler/output myJob.py
Profiler vs. HLT1 Lines (Offline)
https://github.com/mazurov/HltProfiling
Job Options (Moore v12r10):
profiler = IntelProfilerAuditor()
profiler.StartFromEventN = 5000
profiler.StopAtEventN = 15000
profiler.IncludeAlgorithms = ["Hlt1TrackAllL0", "Hlt1DiMuonHighMass", "Hlt1DiMuonLowMass"]
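The event window and the algorithm filter in the job options above can be read as a simple predicate over (event number, algorithm name). The sketch below is a hypothetical stand-in, not the real IntelProfilerAuditor: the class name ProfilingWindow, the method is_profiled, and the boundary handling (start inclusive, stop exclusive) are assumptions made only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProfilingWindow:
    """Hypothetical model of the auditor settings (not the Gaudi class)."""

    start_from_event: int = 5000
    stop_at_event: int = 15000
    include_algorithms: List[str] = field(
        default_factory=lambda: [
            "Hlt1TrackAllL0",
            "Hlt1DiMuonHighMass",
            "Hlt1DiMuonLowMass",
        ]
    )

    def is_profiled(self, event_number: int, algorithm: str) -> bool:
        # Assumed semantics: collect samples only inside the event window
        # and only while one of the listed algorithms is executing.
        in_window = self.start_from_event <= event_number < self.stop_at_event
        return in_window and algorithm in self.include_algorithms


window = ProfilingWindow()
```

Under these assumptions, window.is_profiled(6000, "Hlt1TrackAllL0") is true, while any event before 5000 (the skipped initialization phase) or any algorithm outside the list is ignored.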
Hotspots
Top Hotspots
CPU / Per Function
CPU / Per Module
CPU / Per Algorithm
http://amazurov.ru/cern/hltprofilingresults/
CPU / Per Function In Algorithm
CPU / Per Source Code (debug mode)
TCMalloc vs. “new” Operator
Before: CPU: 238 s. After: CPU: 222 s.
Results
➔ tc_new is twice as fast as the “new” operator.
➔ 5% total improvement for the Hlt1 job.
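The before and after CPU times quoted above can be turned into a relative saving with a one-line calculation. This is a minimal sketch; the helper name relative_saving is hypothetical. Note that the two quoted times give roughly 6.7%, while the slide states 5% for the whole Hlt1 job; the transcript does not say which measurement each figure is based on.

```python
def relative_saving(before_s: float, after_s: float) -> float:
    """Fraction of CPU time saved when going from before_s to after_s."""
    return (before_s - after_s) / before_s


# CPU times taken from the Before/After figures on the slide.
saving = relative_saving(238.0, 222.0)
print(f"{saving:.1%}")  # prints 6.7%
```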
GCC 4.3 vs. GCC 4.6
GCC 4.3 GCC 4.6 -O2 flag ~ 3.6% worth
Two profiles comparison
Result (preliminary)
➔ It is not evident that GCC 4.6 optimizes better than GCC 4.3 (for HLT1 jobs).
Future plans
➔ Profile code compiled with GCC 4.6 and the -O3 flag.
➔ Profile code compiled with GCC 4.6's profile-driven optimization.
➔ Create a web interface to display collected profiler results.
http://amazurov.ru/cern/hltprofilingpresentation