Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
HLT CPU Consumption
Search
Sasha Mazurov
February 06, 2012
Science
0
340
HLT CPU Consumption
Sasha Mazurov
February 06, 2012
Tweet
Share
More Decks by Sasha Mazurov
See All by Sasha Mazurov
L1Calo Offline Software Status
mazurov
0
72
Performance and Regression tests for Simulation
mazurov
0
71
About v2
mazurov
0
65
L1Calo Offline Software Status
mazurov
0
96
L1Calo Offline Software Status
mazurov
0
98
LHCbPR V2
mazurov
0
130
Paper approval
mazurov
0
58
Conventions' Publications
mazurov
0
58
Ph.D final exam
mazurov
0
100
Other Decks in Science
See All in Science
Lean4による汎化誤差評価の形式化
milano0017
1
250
データベース10: 拡張実体関連モデル
trycycle
PRO
0
720
ガウス過程回帰とベイズ最適化
nearme_tech
PRO
1
450
academist Prize 4期生 研究トーク延長戦!「美は世界を救う」っていうけど、どうやって?
jimpe_hitsuwari
0
140
LayerXにおける業務の完全自動運転化に向けたAI技術活用事例 / layerx-ai-jsai2025
shimacos
2
1.2k
機械学習 - 決定木からはじめる機械学習
trycycle
PRO
0
990
機械学習 - DBSCAN
trycycle
PRO
0
920
05_山中真也_室蘭工業大学大学院工学研究科教授_だてプロの挑戦.pdf
sip3ristex
0
520
Hakonwa-Quaternion
hiranabe
1
110
データベース09: 実体関連モデル上の一貫性制約
trycycle
PRO
0
710
論文紹介 音源分離:SCNET SPARSE COMPRESSION NETWORK FOR MUSIC SOURCE SEPARATION
kenmatsu4
0
200
眼科AIコンテスト2024_特別賞_6位Solution
pon0matsu
0
420
Featured
See All Featured
What's in a price? How to price your products and services
michaelherold
246
12k
Gamification - CAS2011
davidbonilla
81
5.4k
10 Git Anti Patterns You Should be Aware of
lemiorhan
PRO
656
60k
The Art of Delivering Value - GDevCon NA Keynote
reverentgeek
15
1.5k
The Art of Programming - Codeland 2020
erikaheidi
54
13k
The Web Performance Landscape in 2024 [PerfNow 2024]
tammyeverts
8
690
Visualizing Your Data: Incorporating Mongo into Loggly Infrastructure
mongodb
46
9.6k
Fight the Zombie Pattern Library - RWD Summit 2016
marcelosomers
233
17k
Why Our Code Smells
bkeepers
PRO
336
57k
ピンチをチャンスに:未来をつくるプロダクトロードマップ #pmconf2020
aki_iinuma
126
53k
Learning to Love Humans: Emotional Interface Design
aarron
273
40k
For a Future-Friendly Web
brad_frost
179
9.8k
Transcript
HLT CPU Consumption Sasha Mazurov 6 Febrary 2012
Tool Gaudi Auditor & Intel® VTune™ Amplifier XE 2011 Can
be run on any lxplus node
Benefits ➔ Can focus on a specific sequence/algorithm(s). ➔ Skip
initialization & finalization phase. ➔ Report CPU consumption per algorithm / function / class / module. ➔ Perfect GUI & reports.
http://amazurov.ru/cern/intelprofiler/ - installation - documentation - screencasts $> intelprofiler -o
/where/to/store/profiler/output myJob.py
None
Profiler vs. HLT1 Lines (Offline )
https://github.com/mazurov/HltProfiling profiler = IntelProfilerAuditor() profiler.StartFromEventN = 5000 profiler.StopAtEventN = 15000
profiler.IncludeAlgorithms = ["Hlt1TrackAllL0", "Hlt1DiMuonHighMass", "Hlt1DiMuonLowMass"] Jop Options Moore v12r10
Hotspots
Top Hotspots
CPU/Per Function
CPU / Per Module
CPU/Per Algorithm
http://amazurov.ru/cern/hltprofilingresults/
CPU / Per Function In Algorithm
CPU / Per Source Code (debug mode)
TCMalloc vs. “new” Operator
Before: After: CPU: 238 s CPU: 222 s
Results ➔ tc_new is twice faster than “new” operator. ➔
5% total improvement for Hlt1 job.
GCC 4.3 vs. GCC 4.6
GCC 4.3 GCC 4.6 -O2 flag ~ 3.6% worth
Two profiles comparison
Result (preliminary) ➔ It's not evident, that GCC 4.6 optimize
better than GCC 4.3 (for HLT1 jobs).
Future plans ➔ Profile code compiled with GCC 4.6 and
-O3 flag. ➔ Profile code compiled with GCC 4.6's profile driven optimization. ➔ Create a web interface to display collected profiler results.
http://amazurov.ru/cern/hltprofilingpresentation