HLT CPU Consumption
Sasha Mazurov
February 06, 2012
Transcript
HLT CPU Consumption
Sasha Mazurov
6 February 2012
Tool
Gaudi Auditor & Intel® VTune™ Amplifier XE 2011
Can be run on any lxplus node
Benefits
➔ Can focus on specific sequences or algorithms.
➔ Skips the initialization and finalization phases.
➔ Reports CPU consumption per algorithm / function / class / module.
➔ Perfect GUI & reports.
http://amazurov.ru/cern/intelprofiler/
- installation
- documentation
- screencasts
$> intelprofiler -o /where/to/store/profiler/output myJob.py
Profiler vs. HLT1 Lines (Offline)
https://github.com/mazurov/HltProfiling
Moore v12r10
Job Options
profiler = IntelProfilerAuditor()
profiler.StartFromEventN = 5000
profiler.StopAtEventN = 15000
profiler.IncludeAlgorithms = ["Hlt1TrackAllL0", "Hlt1DiMuonHighMass", "Hlt1DiMuonLowMass"]
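The three options above restrict profiling to a window of events and a list of algorithms. The following is a minimal, self-contained sketch of the selection rule they appear to describe. It does not use Gaudi: `ProfilerWindow` and `should_profile` are hypothetical names introduced only for illustration, and treating the start bound as inclusive and the stop bound as exclusive is an assumption, not something the slide states.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProfilerWindow:
    """Hypothetical stand-in that mirrors the three options set on the slide."""
    start_from_event_n: int = 5000
    stop_at_event_n: int = 15000
    include_algorithms: List[str] = field(default_factory=lambda: [
        "Hlt1TrackAllL0", "Hlt1DiMuonHighMass", "Hlt1DiMuonLowMass"])

    def should_profile(self, event_n: int, algorithm: str) -> bool:
        # Assumption: start bound inclusive, stop bound exclusive.
        in_window = self.start_from_event_n <= event_n < self.stop_at_event_n
        return in_window and algorithm in self.include_algorithms


window = ProfilerWindow()
print(window.should_profile(100, "Hlt1TrackAllL0"))   # False: before the window
print(window.should_profile(7000, "Hlt1TrackAllL0"))  # True: in window and listed
print(window.should_profile(7000, "SomeOtherAlg"))    # False: not in the include list
```

Starting at event 5000 keeps the early events, where initialization and caches still dominate, out of the measurement.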
Hotspots
Top Hotspots
CPU / Per Function
CPU / Per Module
CPU / Per Algorithm
http://amazurov.ru/cern/hltprofilingresults/
CPU / Per Function In Algorithm
CPU / Per Source Code (debug mode)
TCMalloc vs. “new” Operator
Before: CPU: 238 s
After: CPU: 222 s
Results
➔ tc_new is twice as fast as the “new” operator.
➔ 5% total improvement for the Hlt1 job.
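As a quick way to relate the before/after CPU totals to a percentage, here is a minimal sketch. `relative_saving` is a hypothetical helper, not part of any profiler tool. Applied to the two totals quoted on the slide (238 s and 222 s) it gives a somewhat larger ratio than the 5% listed under Results; the slide does not say which measurement the 5% refers to.

```python
def relative_saving(before_s: float, after_s: float) -> float:
    """Return the fraction of CPU time saved going from before_s to after_s."""
    if before_s <= 0:
        raise ValueError("before_s must be positive")
    return (before_s - after_s) / before_s


# CPU totals taken from the slide: 238 s before, 222 s after.
saving = relative_saving(238.0, 222.0)
print(f"{saving:.1%}")  # prints "6.7%"
```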
GCC 4.3 vs. GCC 4.6
GCC 4.3 | GCC 4.6
-O2 flag
~3.6% worth
Two profiles comparison
Result (preliminary)
➔ It is not evident that GCC 4.6 optimizes better than GCC 4.3 (for HLT1 jobs).
Future plans
➔ Profile code compiled with GCC 4.6 and the -O3 flag.
➔ Profile code compiled with GCC 4.6's profile-driven optimization.
➔ Create a web interface to display collected profiler results.
http://amazurov.ru/cern/hltprofilingpresentation