Slide 1

Slide 1 text

Heterogeneous Computing Trends and Business Value Creation
Thayaparan Sripavan
MillenniumIT
June 2017

Slide 2

Slide 2 text

End of ‘Free Lunch’ ~2005 - 2007 Frequency Scaling

Slide 3

Slide 3 text

End of ‘Free Lunch’ ~2005 - 2007 Frequency Scaling Core Replication NUMA NU I/O

Slide 4

Slide 4 text

Design for ‘concurrency’ ~2005 - 2007 Frequency Scaling Core Replication Exploit Concurrency NUMA NU I/O

Slide 5

Slide 5 text

Parallel Processors Gaming / Graphics Reconfigurable Digital Systems

Slide 6

Slide 6 text

Fabrication Technology
[Chart: process node (nm, axis 0 to 100) by year (2006, 2008, 2011, 2013, 2016) for CPU, FPGA and GPU]

Slide 7

Slide 7 text

Fabrication Technology
[Chart: process node (nm, axis 0 to 100) by year (2006, 2008, 2011, 2013, 2016) for CPU, FPGA and GPU]
“Parity on process node!”
Roadmap visibility till 2030 (IEEE)
CPU: 35x Cores
FPGA: 20x Estate
GPU: 50x FLOPS

Slide 8

Slide 8 text

Evolution 2006 – 2016
More Cores, Integrated Memory, I/O & Graphics…
Models, Languages, Libs…
Hard IP, Soft IP, SoC, High Level Synthesis…
{Scaling / Efficiency} {Transformation}

Slide 9

Slide 9 text

Interconnects 2006 – 2016
Parity on process technology
Increasing interconnect bandwidth
Interconnect latencies
Synchronization overhead
Source: National Instruments
8000x BW; Latency Wall @ ~300 ns
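A rough way to see why a latency wall matters even as bandwidth grows: offloading a task to another processor only pays off when the time saved exceeds the round trip over the interconnect. The sketch below is illustrative only. The function names, the speedup values and the reading of ~300 ns as a one-way latency are assumptions; only the ~300 ns figure itself comes from the slide.

```python
# Hypothetical break-even model for offloading work across an interconnect.
# Only the ~300 ns figure comes from the slide; treating it as a one-way
# latency, and every other number used here, is an assumption.

LATENCY_WALL_NS = 300.0  # assumed one-way interconnect latency


def offload_time_ns(host_time_ns, speedup, one_way_latency_ns=LATENCY_WALL_NS):
    """Accelerated run time including one round trip over the interconnect."""
    return 2 * one_way_latency_ns + host_time_ns / speedup


def worth_offloading(host_time_ns, speedup, one_way_latency_ns=LATENCY_WALL_NS):
    """True when the accelerated path is strictly faster than the host path."""
    return offload_time_ns(host_time_ns, speedup, one_way_latency_ns) < host_time_ns


def break_even_host_time_ns(speedup, one_way_latency_ns=LATENCY_WALL_NS):
    """Host-side task time above which offloading starts to pay off.

    Derived from 2L + T/s < T, which gives T > 2L * s / (s - 1).
    """
    if speedup <= 1:
        raise ValueError("offloading never pays off without a speedup above 1")
    return 2 * one_way_latency_ns * speedup / (speedup - 1)


if __name__ == "__main__":
    for s in (2.0, 10.0, 100.0):
        print(f"speedup {s:>5}: break-even at {break_even_host_time_ns(s):.0f} ns")
```

Under these assumptions, tasks shorter than the break-even time run faster on the host no matter how fast the accelerator is, which is one reading of why coarse partitions dominate when boundaries are expensive.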

Slide 10

Slide 10 text

Goldilocks Problem Dense – Homogeneous & Distributed Distributed Heterogeneous & Distributed {Extension} {Transformation}

Slide 11

Slide 11 text

Challenges
• Heterogeneous design space
  – Continuously Expanding Horizon
• Constrained by interconnect speeds / synchronization
  – Inefficient Boundaries
• Coarse-grained design partitioning
  – Compromises in Solutions
• Deal with a mix of tools and methodologies
  – Complexity / Skill Specificity
Heterogeneous Approach - Differentiator or Killer?

Slide 12

Slide 12 text

Value Creation Non-Functional Enhancement Functional Enhancement Paradigm Shifts Business Model Transformation Legacy Heterogeneous Technology Business Model Existing New Transformation Base Constrained by Technology (Prohibitive) Constrained by Business Model (Sub-optimal) Enhancement

Slide 13

Slide 13 text

Value Creation Non-Functional Enhancement Functional Enhancement Paradigm Shift Business Model Transformation Predictably high performance Space efficiency Energy efficiency

Slide 14

Slide 14 text

Case: Real Time Market Data
• Coarse-grained mapping to Processors
• Real-time on FPGA / rest on CPU
• Sub-component on FPGA
• Matching scalability
• Boundary inefficiency
• Use case based optimization
Consistent (>10x) performance at best space efficiency.
• Best Execution Transparency
• Open Access
• Increased liquidity
• Balanced Performance & Configurability

Slide 15

Slide 15 text

The Convergence NVLink Multi-Chip P2P PCIe (PCIe 3) 3D FPGAs Direct Cache Access (PCIe 3) NVLink (IBM) Direct Cache Access (PCIe 3) CAPI (IBM) FPGA On-Die (Intel) P2P PCIe (PCIe 3) GPU Direct (P2P PCIe)

Slide 16

Slide 16 text

Goldilocks Problem Gets Complex Coarse-Grained {Transformation} Fine-Grained (Sub-Optimal) {Extension} Fine-Grained (Optimal)

Slide 17

Slide 17 text

Tackling the Challenge “Fine-Grained Modular Designs” “Dynamic / Late Mapping” “User Scheduled” “Domain Specific Descriptions”
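As one possible illustration of "Fine-Grained Modular Designs" combined with "Dynamic / Late Mapping", the sketch below keeps each module independent of any processor and picks a target only once the available hardware is known. It is a minimal sketch, not the presenter's method: the `Module` type, the `late_map` function, the module names and all cost figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Module:
    """A fine-grained design unit with an estimated cost per target.

    Hypothetical type: the names and numbers below are illustrative only.
    """
    name: str
    cost_ns: dict  # target name -> estimated latency in nanoseconds


def late_map(modules, available_targets):
    """Choose the cheapest available target for each module at deploy time."""
    mapping = {}
    for module in modules:
        # Keep only the targets that exist in this deployment.
        candidates = {t: c for t, c in module.cost_ns.items()
                      if t in available_targets}
        if not candidates:
            raise ValueError(f"no available target can run {module.name}")
        mapping[module.name] = min(candidates, key=candidates.get)
    return mapping


# Illustrative modules; the cost figures are made up for the example.
MODULES = [
    Module("feed_decode", {"cpu": 900, "fpga": 120}),
    Module("book_build", {"cpu": 400, "fpga": 300, "gpu": 2000}),
    Module("analytics", {"cpu": 5000, "gpu": 600}),
]

if __name__ == "__main__":
    print(late_map(MODULES, {"cpu", "fpga"}))
    print(late_map(MODULES, {"cpu", "fpga", "gpu"}))
```

The same module descriptions yield different mappings as the set of available targets changes, which is the point of deferring the decision. A real mapper would also have to account for the cost of crossing between targets.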

Slide 18

Slide 18 text

Value Creation Non-Functional Enhancement Functional Enhancement Paradigm Shifts Business Model Transformation Brute force solutions in real-time for complex problems Sensitive to emerging data & predictive analytics Deep learning, speculation & statistical processing

Slide 19

Slide 19 text

Value Creation Non-Functional Enhancement Functional Enhancement Paradigm Shifts Business Model Transformation Real-time visibility into risk, margin and portfolio optimization Self-healing systems Technology agnostic Post-Trade Description Language (PDL)

Slide 20

Slide 20 text

Case : Real Time Risk

Slide 21

Slide 21 text

Thank you Thayaparan Sripavan [email protected] June 2017