GPU-based Online Tracking for the PANDA Experiment [NV App Lab 2014]

AndiH
October 17, 2014

Talk for the NVIDIA Application Lab workshop 2014 at the Jülich Supercomputing Centre.

Transcript

  1. GPU-based Online Tracking for the PANDA Experiment

    Andreas Herten, NVIDIA Application Lab Workshop 2014, 8 October 2014
    Member of the Helmholtz Association (Mitglied der Helmholtz-Gemeinschaft)
  2. Outline

    • PANDA
      – HEP
      – Our Experiment
      – Online Event Filter
    • Algorithms
      – Hough Transform
      – Riemann Track Finder
      – Triplet Finder
  10. High Energy Physics

    • High Energy Physics (HEP):
      – HEP recipe
        1. Accelerate particles (e, p, …)
        2. Accelerate particles more!
        3. Smash them into each other
        4. Look at the resulting particles
        5. Understand the universe ✓
      – GPUs are interesting for HEP
        • Many events due to high collision rate
        • Events are independent and can be divided into subsets
        • Many features extractable (computationally intensive)
    Slide annotation: $E = mc^2$
  14. FAIR

    • Facility for Antiproton and Ion Research
      – New accelerator complex (Darmstadt, Germany)
      – Next to GSI laboratory
      – Construction in progress, ending 2018
      – Four pillars of research:
        • APPA: Atom & plasma physics
        • NUSTAR: Nuclear structure, astrophysics
        • CBM: Hadron physics
        • PANDA: Hadron physics
    Image credits: © Google Maps, fair-center.eu
  16. FAIR Accelerator Complex

    Figure legend: existing facilities, new facilities, experiments

    High Energy Storage Ring
    • Beam momentum: $p = 1.5$ to $15\,\mathrm{GeV}/c$
    • High resolution: $L = 10^{31}\,\mathrm{cm^{-2}\,s^{-1}}$, $\Delta p/p < 4 \cdot 10^{-5}$
    • High luminosity: $L = 2 \cdot 10^{32}\,\mathrm{cm^{-2}\,s^{-1}}$, $\Delta p/p < 10^{-4}$
  20. PANDA: The Experiment

    Figure: detector overview (scale label: 13 m) with the antiproton beam $\bar{p}$ and the proton target $p$; labeled components: Magnet, STT, MVD
  21. PANDA: MVD & STT

    Micro Vertex Detector
    • Silicon-based pixel + strip detector
    • 10 000 000 + 200 000 channels
    • Vertex resolution: < 100 μm

    Straw Tube Tracker
    • 4636 small drift tubes (ø 1 cm)
    • Drift times: < 250 ns
    • 26 layers, 8 skewed
    • Material budget: 1.2 % radiation length
  23. PANDA: Physics

    • Meson spectroscopy
      – Light mesons
      – Charmonium
      – Open charm
      – Exotic states
        • Glueballs
        • Hybrids
        • Molecules/multiquarks
    • Baryon production
    • Nucleon structure, e.m. processes
    • Charm in nuclei
    • Strangeness physics
    → Broad physics program

    Figure: accessible mass range (Mass / GeV/c², 1 to 6) against antiproton momentum ($\bar{p}$ Momentum / GeV/c, 0 to 15), with hadron pair production thresholds (ΛΛ, ΣΣ, ΞΞ, ΩΩ, ΛcΛc, ΣcΣc, ΞcΞc, ΩcΩc, DD, DsDs) and the expected light-quark, charmonium (J/ψ, ηc, χcJ), glueball, hybrid and multiquark states
  24. PANDA: Event Reconstruction

    • Continuous readout
      – Novel feature
      – Background & signal similar
      – No hardware trigger based on few sub-detectors, but online event reconstruction using full detector information

    Data flow
    • Rate: $2 \times 10^{7}$ events/s, 200 GB/s raw data
    • Reduction (reject background events, save interesting events): amount ~1/1000, time 50 ns/evt
    • Storage space for offline analysis: 3 PB/y
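The numbers on this slide can be cross-checked with a short calculation. This is only a sketch: it assumes decimal units (1 GB = 10⁹ bytes) and treats the quoted rates as averages; the per-event size and the implied data-taking time are inferred here and are not stated on the slide.

```latex
\begin{align*}
t_\text{event} &= \frac{1}{2\times10^{7}\,\mathrm{s^{-1}}} = 50\,\mathrm{ns} \\
\text{raw size per event} &\approx \frac{200\,\mathrm{GB/s}}{2\times10^{7}\,\mathrm{s^{-1}}} = 10\,\mathrm{kB} \\
\text{rate after reduction} &\approx \frac{200\,\mathrm{GB/s}}{1000} = 200\,\mathrm{MB/s} \\
\text{data-taking time for } 3\,\mathrm{PB} &\approx \frac{3\times10^{15}\,\mathrm{B}}{2\times10^{8}\,\mathrm{B/s}} = 1.5\times10^{7}\,\mathrm{s}
\end{align*}
```

The 50 ns per event is the average time budget at the input rate, so any algorithm that needs longer per event has to process many events in parallel.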
  28. PANDA: Read Out Scheme

    Requirements for online tracking
    • Fast
    • Sophisticated algorithms possible; reprogrammable
    • Parallelism beyond single devices
    • Limited precision ok
    → GPUs
  36. PANDA: Online Tracking Example

    The physics side: antiproton-proton event
    $\bar{p}p \to \psi' \to \psi\,\pi^+\pi^-$, with $\psi \to e^+e^-$
    Figure labels: $\pi^+$, $\pi^-$, $e^+$, $e^-$, $\psi'$
  48. PANDA: Online Tracking Example

    The detector side: everything in reverse
    • Particle tracks are curves (actually: 3D helices)
      → Find curves connecting hit points!
    • Sort by track quality: Hits well matched? How many hits?
    • Identify final particles ($\pi^+$, $\pi^-$, $e^+$, $e^-$?): curvature, length, …
    • Identify intermediate particles ($\psi'$): mass constraints, geometry, …
    • Identify process: $\bar{p}p \to \psi' \to e^+e^-\,\pi^+\pi^-$
  61. PANDA: Triggering

    • Usual HEP experiment: fast detector layer(s) trigger the data acquisition
    • PANDA: Online Tracking!
    Figure labels: $\pi^+$, $\pi^-$, $e^+$, $e^-$, $\psi'$
  63. ALGORITHMS #1

    Hough Transform, Riemann Track Finder, Triplet Finder
  65. Algorithm: Hough Transform

    • Idea: Transform $(x,y)_i \to (\alpha,r)_{ij}$, find lines via $(\alpha,r)$ space
    • Solve the line equation $r_{ij} = \cos\alpha_j \cdot x_i + \sin\alpha_j \cdot y_i + \rho_i$ for
      – Lots of hits $(x,y,\rho)_i$ and
      – Many $\alpha_j \in [0°,360°)$ each
    • Fill histogram
    • Extract track parameters

    Numbers: i: ~100 hits/event (STT); j: every 0.2°; $r_{ij}$: 180 000 values

    Figure: embedded "Hough Transform" principle sketch with x-y panels and an (α, r) histogram (caption truncated in the transcript)
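The steps on this slide (evaluate the line equation for every hit and every angle, fill a histogram, take the maximum) can be sketched on the CPU as follows. This is a minimal illustration and not the Thrust or CUDA implementation from the talk: the names `Hit`, `Peak` and `houghPeak`, the histogram layout, and the binning parameters are assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One detector hit: position plus isochrone radius rho (0 for point-like hits).
struct Hit {
  double x, y, rho;
};

// Most populated bin of the (alpha, r) histogram.
struct Peak {
  std::size_t alphaBin, rBin;
  unsigned count;
};

// Fills an nAlpha x nR histogram with
//   r_ij = cos(alpha_j) * x_i + sin(alpha_j) * y_i + rho_i
// for alpha_j in [0, alphaRangeDeg) and returns the bin with the most entries.
Peak houghPeak(const std::vector<Hit>& hits, std::size_t nAlpha, std::size_t nR,
               double rMin, double rMax, double alphaRangeDeg) {
  assert(nAlpha > 0 && nR > 0 && rMax > rMin);
  const double degToRad = std::acos(-1.0) / 180.0;
  std::vector<unsigned> histogram(nAlpha * nR, 0u);

  for (const Hit& hit : hits) {                 // index i: hits
    for (std::size_t j = 0; j < nAlpha; ++j) {  // index j: angles
      const double alpha = degToRad * alphaRangeDeg * static_cast<double>(j) /
                           static_cast<double>(nAlpha);
      const double r = std::cos(alpha) * hit.x + std::sin(alpha) * hit.y + hit.rho;
      if (r < rMin || r >= rMax) continue;  // outside the histogram range
      const double fraction = (r - rMin) / (rMax - rMin);
      const std::size_t rBin = std::min(
          static_cast<std::size_t>(fraction * static_cast<double>(nR)), nR - 1);
      ++histogram[j * nR + rBin];
    }
  }

  // Simplest possible peak finder: the global maximum of the histogram.
  Peak best{0, 0, 0};
  for (std::size_t j = 0; j < nAlpha; ++j) {
    for (std::size_t b = 0; b < nR; ++b) {
      const unsigned count = histogram[j * nR + b];
      if (count > best.count) best = Peak{j, b, count};
    }
  }
  return best;
}
```

Each (i, j) pair is independent of all others, which is why the fill step parallelizes well; on a GPU the histogram increment would have to be an atomic operation or a sort-and-reduce step.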
  71. Hough Transform: Visualization (Lines)

    • Create lines going through hit point $(x,y)_i$
      – Line parameterized by $r_{ij} = \cos(\alpha_j) \cdot x_i + \sin(\alpha_j) \cdot y_i$
    • Fill line parameters $(\alpha,r)_{ij}$ into histogram
      – Rasterize for many $\alpha_j \in [0°,180°)$
    Illustrated with α = 0°, 10°, 20°, …
  77. Hough Transform: Visualization (Points)

    • Create lines going through hit point $(x,y)_i$
    • Repeat for every hit point i
    Illustrated with α = 0°, 10°, 20°, …
  92. Hough Transform: Granularity

    • Choice of α granularity determines resolution
    Illustrated with α = 0°, 2°, 4°, …
  94. Algorithm: Hough Transform

    Figure: Hough-transformed histogram of 68 (x,y) points from the PANDA STT+MVD on an 1800 × 1800 grid; horizontal axis: angle α / ° (0 to 180), vertical axis: r
  95. Hough Transform: Remarks

    Two implementations
    • Thrust (CUDA's STL)
      – Performance: 3 ms/evt
      – Reduce to set of standard routines
        • Fast (uses Thrust's optimized algorithms)
        • Inflexible (hard to customize)
        • Not yet at performance maximum
    • Plain CUDA
      – Performance: 0.5 ms/evt
      – Built completely for this task
        • Fitting for PANDA; customizable
        • A bit more complicated in parts
    • Peak finding challenging
  98. Hough Transform: Remarks

    Peak finding challenging (current research); two approaches shown:
    • Hill Climber
    • Iterative Maximum Deleter

    Figures: HT histograms of r against α / ° (0 to 180) for three successive iterations: houghIt0 (9000 entries), houghIt1 (5580 entries), houghIt2 (2700 entries)
  99. ALGORITHMS #2

    Hough Transform, Riemann Track Finder, Triplet Finder
  100. Riemann Track Finder: Method

    • Idea: Don't fit lines (in 2D), fit planes (in 3D)!
    • Create seeds (step 1)
      – All possible three-hit combinations
    • Grow seeds to tracks (step 2): continuously test whether the next hit fits
      – Use mapping to Riemann paraboloid (+ s-z fit, det. layer)
    Figure: hits (x) in the x-y plane mapped onto the paraboloid (z′ axis)
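The idea of fitting planes instead of circles can be illustrated with a short sketch. It assumes the common form of the mapping, $z' = x^2 + y^2$, under which every circle in the x-y plane becomes a plane in $(x, y, z')$; the slide does not give the exact parameterization, and all names below (`liftToParaboloid`, `planeThrough`, `circleFromPlane`, `distanceToPlane`, `riemannCircle`) are hypothetical. A seed of three hits defines the plane exactly, and a further hit fits the seed when its lifted point lies close to that plane.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;

// Lift a 2D hit onto the paraboloid z' = x^2 + y^2.
Vec3 liftToParaboloid(double x, double y) { return Vec3{x, y, x * x + y * y}; }

// Plane n . p + c = 0.
struct Plane {
  Vec3 n;
  double c;
};

// Plane through three lifted hits (normal = cross product of two edges).
Plane planeThrough(const Vec3& a, const Vec3& b, const Vec3& d) {
  const Vec3 u{b[0] - a[0], b[1] - a[1], b[2] - a[2]};
  const Vec3 v{d[0] - a[0], d[1] - a[1], d[2] - a[2]};
  const Vec3 n{u[1] * v[2] - u[2] * v[1], u[2] * v[0] - u[0] * v[2],
               u[0] * v[1] - u[1] * v[0]};
  return Plane{n, -(n[0] * a[0] + n[1] * a[1] + n[2] * a[2])};
}

struct Circle {
  double cx, cy, radius;
};

// Circle in the x-y plane that corresponds to the plane:
// n_z (x^2 + y^2) + n_x x + n_y y + c = 0.
Circle circleFromPlane(const Plane& p) {
  assert(p.n[2] != 0.0);  // n_z = 0 would mean the three hits are collinear
  const double cx = -p.n[0] / (2.0 * p.n[2]);
  const double cy = -p.n[1] / (2.0 * p.n[2]);
  return Circle{cx, cy, std::sqrt(cx * cx + cy * cy - p.c / p.n[2])};
}

// Distance of a lifted hit from the plane; small values mean the hit fits.
double distanceToPlane(const Plane& p, const Vec3& q) {
  const double norm =
      std::sqrt(p.n[0] * p.n[0] + p.n[1] * p.n[1] + p.n[2] * p.n[2]);
  return std::fabs(p.n[0] * q[0] + p.n[1] * q[1] + p.n[2] * q[2] + p.c) / norm;
}

// Convenience wrapper: circle through three 2D hits via the paraboloid.
Circle riemannCircle(double x1, double y1, double x2, double y2, double x3,
                     double y3) {
  return circleFromPlane(planeThrough(liftToParaboloid(x1, y1),
                                      liftToParaboloid(x2, y2),
                                      liftToParaboloid(x3, y3)));
}
```

For example, the hits (7, 3), (2, 8) and (-3, 3) lie on a circle of radius 5 around (2, 3), and the lifted point of (2, -2), which is on the same circle, has distance zero from their plane. With more than three hits the plane would be fitted, which comes down to small matrix computations.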
  104. Riemann Track Finder: GPU Adaptations

    → 100 × faster than CPU version: ~0.6 ms/event
    Still needs implementing into PandaRoot

    1. CPU: 3 loops to generate seeds serially

        for (int i = 0; i < hitsInLayerOne.size(); i++) {
          for (int j = 0; j < hitsInLayerTwo.size(); j++) {
            for (int k = 0; k < hitsInLayerThree.size(); k++) {
              /* Triplet Generation */
            }
          }
        }

       GPU: Needed: mapping of inherent GPU indexing variable to triplet index

        int ijk = threadIdx.x + blockIdx.x * blockDim.x;

        $$n_{\mathrm{Layer}\,x} = \frac{1}{2}\left(\sqrt{8x + 1} - 1\right)$$

        $$\mathrm{pos}(n_{\mathrm{Layer}\,x}) = \frac{\sqrt[3]{\sqrt{3}\,\sqrt{243x^2 - 1} + 27x}}{3^{2/3}} + \frac{1}{\sqrt[3]{3}\,\sqrt[3]{\sqrt{3}\,\sqrt{243x^2 - 1} + 27x}} - 1$$

    2. Port of CPU code; parallelism on seed base. Only easy computations; e.g. 3x3 matrices
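The two closed-form expressions on this slide are the inverses of the triangular numbers $n(n+1)/2$ and the tetrahedral numbers $n(n+1)(n+2)/6$. One way they can turn a linear thread index into a hit triplet is sketched below. The slide does not show the exact indexing convention, so this sketch assumes that seeds are unordered combinations $i < j < k$ drawn from a single hit list; the function names are hypothetical, and the integer correction loops are an addition to guard against floating-point rounding.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>

using Index = std::uint64_t;

// Binomial coefficients C(n, 2) and C(n, 3).
Index choose2(Index n) { return n < 2 ? 0 : n * (n - 1) / 2; }
Index choose3(Index n) { return n < 3 ? 0 : n * (n - 1) * (n - 2) / 6; }

// Real n solving n(n+1)/2 = x (inverse triangular number).
double inverseTriangular(double x) {
  return 0.5 * (std::sqrt(8.0 * x + 1.0) - 1.0);
}

// Real n solving n(n+1)(n+2)/6 = x (inverse tetrahedral number), for x >= 1.
double inverseTetrahedral(double x) {
  const double c =
      std::cbrt(std::sqrt(3.0) * std::sqrt(243.0 * x * x - 1.0) + 27.0 * x);
  return c / std::cbrt(9.0) + 1.0 / (std::cbrt(3.0) * c) - 1.0;
}

// Map a linear index (e.g. threadIdx.x + blockIdx.x * blockDim.x in a kernel)
// to the triplet (i, j, k) with i < j < k, using
//   ijk = C(k, 3) + C(j, 2) + i.
std::array<Index, 3> tripletFromIndex(Index ijk) {
  // Closed-form estimate for k, then integer correction for rounding errors.
  Index k = (ijk == 0)
                ? 2
                : static_cast<Index>(inverseTetrahedral(static_cast<double>(ijk))) + 2;
  while (choose3(k) > ijk) --k;
  while (choose3(k + 1) <= ijk) ++k;
  const Index rest = ijk - choose3(k);

  // Same procedure one dimension lower for j.
  Index j = static_cast<Index>(inverseTriangular(static_cast<double>(rest))) + 1;
  while (choose2(j) > rest) --j;
  while (choose2(j + 1) <= rest) ++j;
  const Index i = rest - choose2(j);

  assert(i < j && j < k);
  return std::array<Index, 3>{i, j, k};
}

// Inverse mapping, used here only to check the round trip.
Index indexFromTriplet(const std::array<Index, 3>& t) {
  return t[0] + choose2(t[1]) + choose3(t[2]);
}

// True if every index below count survives the round trip.
bool roundTripHolds(Index count) {
  for (Index x = 0; x < count; ++x) {
    if (indexFromTriplet(tripletFromIndex(x)) != x) return false;
  }
  return true;
}
```

With n hits there are `choose3(n)` seeds, so a kernel would launch at least that many threads and let threads with a larger index return immediately.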
  105. ALGORITHMS #3

    Hough Transform, Riemann Track Finder, Triplet Finder
  106. Triplet Finder

    • Algorithm specifically designed for the PANDA Straw Tube Tracker (STT)
      – Original algorithm by Marius Mertens et al.
    • Ported to GPU by Andrew Adinetz
      – CUDA, Dynamic Parallelism, Thrust
      – Quality of tracks comparable to CPU
    http://www.fz-juelich.de/ias/jsc/
    Figure scale label: 1.5 m
  107. Triplet Finder

    • Idea: Use only subset of detector as seed
      – Don't use STT isochrones (drift times)
      – Calculate circle from 3 points (no fit)
  114. STT: Drift Tubes and t0

    • Particle ionizes gas atoms in drift tubes
    • Electrons drift to anode wire, ions to wall
    • Signal only when electrons arrive at wire
    • No information about drift duration! For that, start time ($t_0$) needed:
      $t_\mathrm{arrival} - t_0 \approx t_\mathrm{drift}$, and with $v_\mathrm{drift} = \mathrm{const}$: $t_\mathrm{drift} \cdot v_\mathrm{drift} = r_\mathrm{isochrone}$
    • Resolution without $t_0$: $\mathcal{O}(0.1\,\mathrm{cm})$ ($r_\mathrm{tube}$)
    • Resolution with $t_0$: $\mathcal{O}(0.015\,\mathrm{cm})$
    • Usual HEP experiment: $t_0$ by trigger. But PANDA has no trigger…
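The relation between arrival time, start time and isochrone radius can be written as a small function. This is a sketch under the slide's own simplification of a constant drift velocity; the function name and the numeric values in the usage note (0.002 cm/ns, obtained from a 0.5 cm tube radius and the quoted maximum drift time of 250 ns) are assumptions for illustration, not calibration values from the talk.

```cpp
#include <algorithm>
#include <cassert>

// Isochrone radius in cm from the measured arrival time and the event start
// time t0 (both in ns), assuming a constant drift velocity.
// The result is limited to the physical range [0, tube radius].
double isochroneRadiusCm(double tArrivalNs, double t0Ns, double vDriftCmPerNs,
                         double tubeRadiusCm) {
  assert(vDriftCmPerNs > 0.0 && tubeRadiusCm > 0.0);
  const double tDriftNs = tArrivalNs - t0Ns;          // drift duration
  const double radiusCm = tDriftNs * vDriftCmPerNs;   // r = t_drift * v_drift
  return std::min(std::max(radiusCm, 0.0), tubeRadiusCm);
}
```

With these assumed values, `isochroneRadiusCm(125.0, 0.0, 0.002, 0.5)` gives about 0.25 cm. Without a known $t_0$ the drift duration is undefined, and only the tube position itself can be used as the hit position.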
  116. Triplet Finder

    • Idea: Use only subset of detector as seed
      – Don't use STT isochrones (drift times)
      – Calculate circle from 3 points (no fit)
    • Features
      – Fast & robust algorithm, no event time needed
      – Many tuning possibilities
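"Calculate circle from 3 points (no fit)" has a direct closed-form solution, the circumcircle of the three points. The sketch below shows the standard formula; which three points the Triplet Finder actually combines is not specified on this slide, and the names `Point`, `Circle` and `circleThrough` are hypothetical.

```cpp
#include <cassert>
#include <cmath>

struct Point {
  double x, y;
};

// Circle through three points; valid is false if the points are collinear.
struct Circle {
  double cx, cy, radius;
  bool valid;
};

Circle circleThrough(const Point& a, const Point& b, const Point& c) {
  // d is four times the signed area of the triangle (a, b, c).
  const double d =
      2.0 * (a.x * (b.y - c.y) + b.x * (c.y - a.y) + c.x * (a.y - b.y));
  if (std::fabs(d) < 1e-12) return Circle{0.0, 0.0, 0.0, false};

  const double a2 = a.x * a.x + a.y * a.y;
  const double b2 = b.x * b.x + b.y * b.y;
  const double c2 = c.x * c.x + c.y * c.y;

  // Circumcenter: the point with equal distance to a, b and c.
  const double cx = (a2 * (b.y - c.y) + b2 * (c.y - a.y) + c2 * (a.y - b.y)) / d;
  const double cy = (a2 * (c.x - b.x) + b2 * (a.x - c.x) + c2 * (b.x - a.x)) / d;
  const double radius = std::hypot(a.x - cx, a.y - cy);
  assert(std::isfinite(radius));
  return Circle{cx, cy, radius, true};
}
```

Because no iteration or fit is involved, the cost per seed is a fixed, small number of floating-point operations, which suits one-thread-per-seed execution.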
  117. Triplet Finder: Animation

    Legend: triplet; isochrone early; isochrone early & skewed; isochrone close; isochrone late; MVD hit; track timed out; track current
  119. Triplet Finder: Times

    Figure: processing times (GPU: K20X; in publication)
  125. Triplet Finder: Optimizations

    • Bunching Wrapper
      – Hits from one event have similar timestamps
      – Combine hits to sets (bunches) which occupy GPU best
      – $\mathcal{O}(N^2) \to \mathcal{O}(N)$
    Figure labels: Hit, Event, Bunch
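The bunching idea can be sketched as follows: sort the hits by timestamp and cut the stream into bunches wherever there is a time gap or a bunch is full. The cut criteria (`maxGapNs`, `maxBunchSize`) and the function name are assumptions for illustration; the slide does not describe how the actual bunching wrapper chooses its bunches.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Group hit timestamps (ns) into bunches. A new bunch starts when the gap to
// the previous hit exceeds maxGapNs or the current bunch has maxBunchSize hits.
std::vector<std::vector<double>> makeBunches(std::vector<double> timestampsNs,
                                             double maxGapNs,
                                             std::size_t maxBunchSize) {
  assert(maxBunchSize > 0);
  std::sort(timestampsNs.begin(), timestampsNs.end());

  std::vector<std::vector<double>> bunches;
  for (double t : timestampsNs) {
    const bool startNew = bunches.empty() ||
                          bunches.back().size() >= maxBunchSize ||
                          t - bunches.back().back() > maxGapNs;
    if (startNew) bunches.emplace_back();
    bunches.back().push_back(t);
  }
  return bunches;
}
```

If the work per bunch grows quadratically with its size b, then N hits in N/b bunches cost about (N/b)·b² = N·b operations, which is linear in N for a fixed bunch size. This is one way to read the $\mathcal{O}(N^2) \to \mathcal{O}(N)$ note on the slide.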
  126. Triplet Finder: Bunching Performance

    Figure: bunching performance (GPU: K20X; in publication)
  127. Triplet Finder: Optimizations

    • Compare data processing strategies
      – Dynamic Parallelism: 1 thread/bunch; a calling kernel launches the Triplet Finder stages (TF Stage #1 to #4) on the GPU
      – Joined Kernel: 1 block/bunch; the Triplet Finder runs as one joined kernel
      – Host Streams: 1 stream/bunch; calling and combining streams launch TF Stage #1 to #4 from the CPU
  128. Triplet Finder: Bunch Sizes

    Figure: performance for different bunch sizes (in publication)
  129. Triplet Finder: Data Processing Explanation

    Figure: comparison of the data processing strategies (GPU: K20X; in publication)
  134. Triplet Finder: Binning: Sector Rows

    • Sector Row testing
      – After a track is found: hit association not with all hits of the current window, but only with a subset (first test rows of sector, then hits of row)
  135. Triplet Finder — Binning: Sector Rows

    [Plot: K20X; in publication. Curves: All Tubes (No Binning), Sector-Row Binning]
  136. Triplet Finder — Binning: Skewlets

    [Plot: K20X; in publication. Curves: Skewlet Binning, All Skewlets (No Binning)]
  138. Triplet Finder — AoS vs. SoA

    [Plot: K20X; in publication]
    → Andrew's talk
  139. Triplet Finder — CUDA Versions

    [Plot: K20X; in publication]
  140. Triplet Finder — Clock Speed / GPU

    [Plot: in publication]
    • K40: memory clock 3004 MHz; core clock 745 MHz / GPU Boost 875 MHz
    • K20X: memory clock 2600 MHz; core clock 732 MHz / GPU Boost 784 MHz
  141. Triplet Finder — Summary

    • Best performance: 14 µs/event
      – $14 \cdot 10^{-6}\,\mathrm{s/event} \times 2 \cdot 10^{7}\,\mathrm{events/s} \Rightarrow 280$ GPUs (2014)
      – PANDA (2019): multi-GPU system
      – $\mathcal{O}(100)$ GPUs
    • Optimizations possible and needed
      – ε needs to be improved
      – Speed and cost (€):
        • Cards with more single-precision and less double-precision performance, à la K10
        • Consumer-grade cards, à la GTX
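
    The GPU count above is a product of two numbers from the slide. A minimal sketch of that arithmetic, assuming perfect scaling across GPUs and that each GPU processes one event per 14 µs (the function name `gpus_needed` is hypothetical):

```python
import math
from fractions import Fraction


def gpus_needed(seconds_per_event, events_per_second):
    """GPUs required so that the combined throughput matches the event rate."""
    return math.ceil(seconds_per_event * events_per_second)


# Values from the slide: 14 microseconds per event, 2e7 events per second.
# Fractions keep the product exact, so rounding up cannot overshoot.
time_per_event = Fraction(14, 10**6)
event_rate = 2 * 10**7
n_gpus = gpus_needed(time_per_event, event_rate)
print(n_gpus)
```

    Halving the time per event halves the count, which is why the per-event processing time is the quantity the optimizations target.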
  142. Summary

    • PANDA is investigating the use of GPUs as part of its online event reconstruction scheme
    • Algorithms are under active evaluation and optimization
      – Triplet Finder has been performance-optimized
    • Data transfer to the GPU is under investigation: FairMQ
  143. Thank you!

    Andreas Herten, a.herten@fz-juelich.de
  144. List of Resources Used

    • #3: Jupiter icon by Nikki Rodriguez from The Noun Project
    • #3: Einstein icon by Roman Rusinov from The Noun Project
    • #6: FAIR vector logo from official FAIR website
    • #6: FAIR rendering from official website
    • #11: Flare Gun icon by Jop van der Kroef from The Noun Project
    • #27: STT event animation by Marius C. Mertens
    • #35: Graphics cards images by NVIDIA promotion
    • #35: GPU Specifications
      – Tesla K20X Specifications: http://www.nvidia.com/content/PDF/kepler/Tesla-K20X-BD-06397-001-v07.pdf
      – Tesla K40 Specifications: http://www.nvidia.com/content/PDF/kepler/Tesla-K40-Active-Board-Spec-BD-06949-001_v03.pdf
      – Tesla Family Overview: http://www.nvidia.com/content/tesla/pdf/NVIDIA-Tesla-Kepler-Family-Datasheet.pdf
  145. BACKUP

  152. Hough Transform — Principle

    [Figure: hits (*) in the x-y plane and the corresponding (r, α) plane, with points (r, α)_1 and (r, α)_2]
    $r_{ij} = \cos\alpha_j \cdot x_i + \sin\alpha_j \cdot y_i + \rho_i$
  158. Hough Transform — Principle

    → Bin with highest multiplicity gives track parameters
    $r_{ij} = \cos\alpha_j \cdot x_i + \sin\alpha_j \cdot y_i + \rho_i$
    [Figure: hits (*) in the x-y plane and the (r, α) plane]
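
    The binning step can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the PANDA code: the number of angle steps, the bin width, the range of r and the function name `hough_best_bin` are hypothetical, and ρ_i is treated simply as the per-hit offset that appears in the formula, since the slide does not define it further.

```python
import math
from collections import Counter


def hough_best_bin(hits, n_alpha=180, r_min=-10.0, r_width=0.1):
    """Fill an (alpha, r) histogram and return its fullest bin.

    hits: iterable of (x, y, rho) with rho the per-hit offset from the formula.
    Returns (multiplicity, alpha, r at the bin centre).
    """
    accumulator = Counter()
    for x, y, rho in hits:
        for j in range(n_alpha):
            alpha = j * math.pi / n_alpha
            # r_ij = cos(alpha_j) * x_i + sin(alpha_j) * y_i + rho_i
            r = math.cos(alpha) * x + math.sin(alpha) * y + rho
            r_bin = math.floor((r - r_min) / r_width)
            accumulator[(j, r_bin)] += 1
    (j, r_bin), count = accumulator.most_common(1)[0]
    return count, j * math.pi / n_alpha, r_min + (r_bin + 0.5) * r_width


# Hypothetical hits on a straight line with alpha = 30 degrees, r = 2.05, rho = 0.
alpha0, r0 = math.radians(30), 2.05
hits = [(r0 * math.cos(alpha0) - t * math.sin(alpha0),
         r0 * math.sin(alpha0) + t * math.cos(alpha0), 0.0)
        for t in (-4, -2, 0, 2, 4)]
count, alpha, r = hough_best_bin(hits)
print(count, math.degrees(alpha), r)
```

    Each hit fills one bin per angle, so the inner loop is independent across hits and angles, which is what makes the method a natural fit for a GPU.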
  161. Riemann Algorithm — Procedure

    • Step 1: Create triplets of hit points
      – All possible three-hit combinations need to become triplets
    • Step 2: Grow triplets to tracks: continuously test whether the next hit fits the triplet track
      – Use the Riemann paraboloid to circle-fit the track
        • Test closeness of new hit: good → add hit; bad → dismiss hit
        • Continue with next hit
      – Helix fit: arc length s vs. z position
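
    The circle fit on the paraboloid can be sketched as follows. A circle (x - x0)² + (y - y0)² = R² maps onto the plane z' = 2·x0·x + 2·y0·y + (R² - x0² - y0²) when each hit is lifted to z' = x² + y², so a plane fit yields the circle. This sketch uses an ordinary least-squares plane fit solved with Cramer's rule, which is a simplification; the fit criterion of the actual Riemann Track Finder may differ. The names `fit_circle` and `hit_fits` and the tolerance are hypothetical.

```python
import math


def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_circle(points):
    """Least-squares plane fit z' = a*x + b*y + c on the paraboloid.

    Returns the circle as (x0, y0, radius).
    """
    sxx = sxy = syy = sx = sy = sxz = syz = sz = 0.0
    for x, y in points:
        z = x * x + y * y  # lift the hit onto the paraboloid
        sxx += x * x
        sxy += x * y
        syy += y * y
        sx += x
        sy += y
        sxz += x * z
        syz += y * z
        sz += z
    m = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, float(len(points))]]
    rhs = [sxz, syz, sz]
    det = _det3(m)
    if abs(det) < 1e-12:
        raise ValueError("hits are collinear or too few; no unique circle")
    solution = []
    for col in range(3):  # Cramer's rule: replace one column by the right-hand side
        replaced = [row[:] for row in m]
        for row in range(3):
            replaced[row][col] = rhs[row]
        solution.append(_det3(replaced) / det)
    a, b, c = solution
    x0, y0 = a / 2.0, b / 2.0
    return x0, y0, math.sqrt(c + x0 * x0 + y0 * y0)


def hit_fits(circle, hit, tolerance):
    """Closeness test: is the hit within tolerance of the fitted circle?"""
    x0, y0, radius = circle
    return abs(math.hypot(hit[0] - x0, hit[1] - y0) - radius) <= tolerance


# Hypothetical seed triplet on a circle with centre (1, 2) and radius 3.
seed = [(1 + 3 * math.cos(phi), 2 + 3 * math.sin(phi)) for phi in (0.1, 0.7, 1.5)]
circle = fit_circle(seed)
good_hit = (1 + 3 * math.cos(2.9), 2 + 3 * math.sin(2.9))
bad_hit = (0.0, 0.0)
print(circle, hit_fits(circle, good_hit, 0.1), hit_fits(circle, bad_hit, 0.1))
```

    In a growing step, a hit that passes the closeness test would be appended to the track and the fit repeated with the enlarged hit list.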
  168. Riemann Track Finder — Seeds (step 1)

    [Figure: hits on layers 1 to 5 (axis: layer number); seed triplets labelled 21 11 31, 31 11 41, 31 11 32]
  180. Riemann Algorithm — Expansion (step 2)

    [Figure: hits (x) in the x-y plane; expand to z′; hits on the Riemann surface (paraboloid)]
  193. Triplet Finder — Method

    • STT hit in pivot straw
    • Find surrounding hits → create virtual hit (triplet) at center of gravity (cog)
    • Combine with
      1. Second STT pivot-cog virtual hit
      2. Interaction point
    • Calculate circle through three points → track candidate
    [Figure labels: STT, Interaction Point]
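
    The geometry of these steps can be sketched as follows: a virtual hit is the center of gravity of a pivot hit and its neighbors, and the track candidate is the circle through two virtual hits and the interaction point. This is a minimal 2D illustration; the function names, the hit coordinates and the interaction point at the origin are assumptions.

```python
import math


def centre_of_gravity(hits):
    """Virtual hit: mean position of a pivot hit and its neighbouring hits."""
    n = len(hits)
    return (sum(x for x, _ in hits) / n, sum(y for _, y in hits) / n)


def circle_through(a, b, c):
    """Circle through three points; returns (x0, y0, radius)."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        raise ValueError("points are collinear; no circle")
    a2 = ax * ax + ay * ay
    b2 = bx * bx + by * by
    c2 = cx * cx + cy * cy
    x0 = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    y0 = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return x0, y0, math.hypot(ax - x0, ay - y0)


# Hypothetical hit positions for two triplets (pivot hit plus two neighbours each).
triplet_1 = [(1.5, -0.5), (2.5, -0.5), (2.0, 1.0)]
triplet_2 = [(-0.5, 1.5), (0.5, 1.5), (0.0, 3.0)]
interaction_point = (0.0, 0.0)  # assumed to sit at the origin

cog_1 = centre_of_gravity(triplet_1)
cog_2 = centre_of_gravity(triplet_2)
candidate = circle_through(interaction_point, cog_1, cog_2)
print(cog_1, cog_2, candidate)
```

    Because every pair of virtual hits is handled independently, each combination can be assigned to its own GPU thread.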
  194. Triplet Finder — Optimizations

    • Sector Row testing
      – Thicken track; shrink sector-row layer to a line
      – Find intersection
    [Figure: Sector-Row Testing; labels: Track, Sector-Row]
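
    One way to picture the intersection test: the thickened track becomes a ring of radius R ± w around the track circle, and the sector row becomes a line segment. The segment crosses the ring if the range of distances from the circle center to the segment overlaps [R - w, R + w]. This is a sketch under those assumptions (full circle instead of an arc, hypothetical names and numbers), not the actual implementation.

```python
import math


def segment_distance_range(centre, p, q):
    """Smallest and largest distance from centre to the segment p-q."""
    cx, cy = centre
    px, py = p
    qx, qy = q
    dx, dy = qx - px, qy - py
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        t = 0.0  # degenerate segment: a single point
    else:
        # Project the centre onto the segment and clamp to its end points.
        t = max(0.0, min(1.0, ((cx - px) * dx + (cy - py) * dy) / length_sq))
    nearest = math.hypot(px + t * dx - cx, py + t * dy - cy)
    # The farthest point of a segment from any point is one of its ends.
    farthest = max(math.hypot(px - cx, py - cy), math.hypot(qx - cx, qy - cy))
    return nearest, farthest


def row_crosses_track(centre, radius, half_width, p, q):
    """True if the segment p-q touches the ring of radius +/- half_width."""
    nearest, farthest = segment_distance_range(centre, p, q)
    return nearest <= radius + half_width and farthest >= radius - half_width


# Hypothetical track circle and sector rows.
centre, radius, half_width = (0.0, 0.0), 5.0, 0.5
crossing_row = ((4.0, -1.0), (6.0, 1.0))
inner_row = ((1.0, 0.0), (2.0, 0.0))
print(row_crosses_track(centre, radius, half_width, *crossing_row),
      row_crosses_track(centre, radius, half_width, *inner_row))
```

    Only hits in rows that pass this test would then be checked individually, which is what reduces the number of hit-to-track comparisons.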
  195. Triplet Finder — Kernel Launch Strategies

    • Joined Kernel (JK): slowest
      – High number of registers → low occupancy
    • Dynamic Parallelism (DP) / Host Streams (HS): comparable performance
      – Performance
        • HS faster for a small number of processed hits, DP faster for > 45000 hits
        • HS stagnates there, while DP continues rising
      – Limiting factors
        • High number of required kernel calls
        • Kernel launch latency
        • Memcopy
      – HS more affected by this, because of
        • More PCI-E transfers (launch configurations for kernels)
        • Lower launch throughput, so kernel launch latency becomes more important
        • False dependencies of launched kernels
          – Single CPU thread handles all CUDA streams (multi-threading possible, but synchronization overhead too high for good performance)
          – Grid scheduling done in hardware (Grid Management Unit) (DP: software)
            » False dependencies when N(streams) > N(device connections) = 32 (compute capability 3.5)
  196. Triplet Finder — Host Stream Connections

    [Plot: in publication]