
SC03 Sun Microsystems Keynote

Adrian Cockcroft
November 19, 2022


I wrote and presented this deck at Supercomputing 2003 in Phoenix, AZ, when I was the Chief Architect for the High Performance Technical Computing team at Sun. Since then some things have changed, but a lot has stayed the same...


Transcript

  1. Adrian Cockcroft
    HPTC Chief Architect
    Sun Microsystems Inc.
    11/24/03
    High Performance
    Technical Computing Solutions
    “Grid Everywhere”
    www.sun.com/hptc


  2. Agenda
    Organization
    Technical Strategy
    Architecture
    Solutions
    Execution


  3. Challenges

    Accelerate innovation

    Accelerate competitive advantage
    – For Sun
    – For customers

    Accelerate research to solution

    Accelerate solution to mainstream
    – Reproducible results
    – Lower delivered cost
    Not just technology challenges . . .
    Organizational challenges . . .


  4. Organizational Evolution
    Customer: Leading Innovation + Mainstream Products
    Sun Labs: Bleeding Innovation
    HPTC: Seeding Innovation


  5. HPTC Go To Market Program:
    “Grid Everywhere” Campaign
    Sun-Wide Alignment and Focus
    Demand Creation
    Marketing Program
    Solutions
    Development
    Stay tuned for “Grid Everywhere” – rolling out during FY04.....


  6. Technical Strategy
    What are the trends?


  7. HPTC Technical Strategy
    Make it easier to sell and use HPTC solutions
    Hardware – Open, commodity, flexible, scalable
    Software – Linux nodes, Solaris infrastructure
    Grid - Evolve from Enterprise Grid to Global Grid
    Interconnect – Map out many alternatives
    Developer – added focus on Java Web/Grid Services
    Early Adopter - streamlined solutions development
    DARPA HPCS - internal mindshare and funding
    Leverage Sun and Partner Products into Solutions


  8. HPTC As an Early Adopter Market
    Willingness to engage to solve problems
    Characteristics
    Technically advanced end users and developers
    Deep understanding of technology
    Ability to figure out solutions to problems
    Ability to optimize applications to the system
    Sales and support interactions need very
    experienced Sun staff
    Requirement to partner and co-develop


  9. Early Adopter Example: Interval Math
    Coordinate research, partner and take it to market
    Interval Components
    Sun Fortran has Interval Datatype
    Solver libraries are under development
    Interval algorithms exist
    The “Interval Problem”
    The textbooks say you can't solve nonlinear
    optimization problems – true for point
    solutions only!!
    Hard to convince/explain what this means....


  10. Hill Climb Optimization Example
    Start by looking at the conventional optimization method for
    an over-simplified one-dimensional example
    [Charts, axis ticks 0 to 11: "Hill"; "Sample the Hill"]


  11. Hill Climb Optimization Example
    A naive algorithm finds a high point on the hill, but not always
    the highest
    [Charts, axis ticks 0 to 11: "Narrow to required resolution"; "Evaluate either side and walk to top of hill"]


  12. Hill Climb Optimization Example
    Try the same algorithm on a more difficult dataset, with very narrow
    and specific optimal solutions – it's no longer a robust solution
    [Charts, axis ticks 0 to 0.9: "Telegraph poles on the prairie"; "Sample Prairie"]


  13. Hill Climb Optimization Example
    Zoom in on a local high point, but it doesn't find a good solution
    [Charts, axis ticks 0 to 0.9: "Narrow to required resolution"; "Evaluate either side and walk to top of hill"]
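The failure described on slides 12 and 13 can be reproduced with a minimal sketch. Everything here is an illustrative assumption rather than the data behind the original charts: the `prairie` function, the pole positions, heights and width, and the `naive_hill_climb` helper are all hypothetical names and values. The sketch samples the terrain coarsely, then evaluates either side of the best sample and walks uphill at the required resolution.

```python
# Hypothetical one-dimensional "prairie": a low hill with two very narrow
# poles. Positions, heights and width are illustrative values only.
POLES = {0.23: 5.0, 0.71: 9.0}   # pole centre -> pole height
POLE_HALF_WIDTH = 0.002


def prairie(x):
    """Height of the terrain at x: pole height on a pole, else a low hill."""
    for centre, height in POLES.items():
        if abs(x - centre) <= POLE_HALF_WIDTH:
            return height
    return 1.0 - abs(x - 0.5)   # gentle hill peaking at x = 0.5


def naive_hill_climb(f, lo, hi, samples=10, resolution=1e-3):
    """Sample f coarsely, then walk uphill from the best sample."""
    step = (hi - lo) / samples
    best = max((lo + i * step for i in range(samples + 1)), key=f)
    while True:
        # Evaluate either side of the current point; move only if higher.
        neighbours = (max(lo, best - resolution), min(hi, best + resolution))
        candidate = max(neighbours, key=f)
        if f(candidate) <= f(best):
            return best, f(best)
        best = candidate


x, height = naive_hill_climb(prairie, 0.0, 1.0)
print(f"hill climb stops at x={x:.3f}, height={height:.3f}")
print(f"tallest pole is at x=0.71, height={prairie(0.71):.1f}")
```

With these assumed values none of the coarse samples lands on a pole, so the walk ends on the low hill at x = 0.5 and never sees the much taller pole at x = 0.71.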


  14. Interval Optimization Example
    An interval solver tells you the range of the possible solution over
    an interval. Here we split the data into four intervals and evaluate
    [Charts, axis ticks 0 to 0.9: "Telegraph poles on the prairie"; "Prairie Interval ranges"]


  15. Interval Optimization Example
    Keep zooming in until the result is a small enough interval, and you
    end up with a deterministic and correctly bounded solution
    [Charts, axis ticks 0 to 0.9: "Zoom in on Maximum"; "Interval Driven True Solution", annotated "The Answer!"]
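The split-and-zoom idea on slides 14 and 15 can be sketched as a small branch-and-bound loop. This is a hedged illustration, not Sun's solver: `prairie_range` is a hand-written range enclosure for a hypothetical terrain (a low hill plus two narrow poles with made-up positions and heights), standing in for the bounds that a real interval datatype would compute through interval arithmetic. `interval_maximize` is also a hypothetical name. Each pass splits every surviving interval into four, bounds the terrain on each piece, and discards pieces whose upper bound cannot beat the best guaranteed lower bound.

```python
# Hypothetical terrain: hill 1 - |x - 0.5| plus two narrow poles.
POLES = {0.23: 5.0, 0.71: 9.0}   # pole centre -> pole height (illustrative)
POLE_HALF_WIDTH = 0.002


def prairie_range(a, b):
    """Return (low, high) bounds enclosing every terrain height on [a, b]."""
    nearest = min(max(a, 0.5), b)              # point of [a, b] closest to 0.5
    high = 1.0 - abs(nearest - 0.5)            # hill maximum on [a, b]
    low = 1.0 - max(abs(a - 0.5), abs(b - 0.5))  # hill minimum on [a, b]
    for centre, height in POLES.items():
        left, right = centre - POLE_HALF_WIDTH, centre + POLE_HALF_WIDTH
        if a <= right and b >= left:           # interval touches this pole
            high = max(high, height)
            if left <= a and b <= right:       # interval lies entirely on it
                low = height
    return low, high


def interval_maximize(range_fn, lo, hi, tol=1e-4, parts=4):
    """Bound the maximiser and maximum of a function given its range bounds."""
    boxes = [(lo, hi)]
    while max(b - a for a, b in boxes) > tol:
        pieces = [(a + i * (b - a) / parts, a + (i + 1) * (b - a) / parts)
                  for a, b in boxes for i in range(parts)]
        bounds = [range_fn(a, b) for a, b in pieces]
        best_low = max(low for low, _ in bounds)
        # Keep only pieces that could still contain the global maximum.
        boxes = [p for p, (_, high) in zip(pieces, bounds) if high >= best_low]
    bounds = [range_fn(a, b) for a, b in boxes]
    x_range = (min(a for a, _ in boxes), max(b for _, b in boxes))
    value_range = (max(low for low, _ in bounds),
                   max(high for _, high in bounds))
    return x_range, value_range


x_range, value_range = interval_maximize(prairie_range, 0.0, 1.0)
print("maximiser lies in", x_range)
print("maximum lies in", value_range)
```

Because a piece is discarded only when its upper bound is below a lower bound that some other piece is guaranteed to reach, the pole holding the true maximum can never be thrown away, however narrow it is. That is the property the point-sampling hill climb lacks.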


  16. Architecture
    It's not just the technology....
    It's how you put it together....


  17. Workload Characteristics
    Workloads are vastly different, which drives different solutions
    High Performance for Technical
    Streaming, low variance, high utilization
    Measure peak CPU Gflops, SPECfp, Gbytes/sec
    Efficient, runs at the peak capacity of the system
    High Performance for Commercial
    More transactions, faster sub-second response
    Bursty, queueing effects, high variance, low utilization
    CPU bound benchmarks (e.g. TPC) not relevant to real world
    Mostly I/O latency bound on disk or network
    Very inefficient use of system capacity
    [Charts, scaled to 100%: "Technical"; "Commercial"]
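One way to see why bursty commercial workloads are run at low utilization is a simple queueing estimate. The sketch below assumes an M/M/1 queue (random arrivals, one server), which is an assumption introduced here and not a model named on the slide. The function name and the 100 ms service time are illustrative only. Under that model the mean response time is the service time divided by (1 - utilization).

```python
def mm1_response_time(service_time, utilization):
    """Mean response time of an M/M/1 queue: S / (1 - U)."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return service_time / (1.0 - utilization)


SERVICE_TIME = 0.1   # seconds per transaction, illustrative value

for utilization in (0.1, 0.5, 0.8, 0.9, 0.99):
    response = mm1_response_time(SERVICE_TIME, utilization)
    print(f"utilization {utilization:4.0%}: mean response {response:6.2f} s")
```

Under these assumptions response time doubles at 50% utilization and grows without bound as utilization approaches 100%, so a sub-second response target forces spare capacity. A streaming technical job with low variance has no such queue and can run near the peak capacity of the system.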


  18. Capability and Capacity Computing
    [Diagrams: processors, memory and I/O connected through a Memory Switch; processor, memory and I/O nodes connected through a Network Switch]
    Scale Vertically (Capability)
    Single OS Instance
    Cache-coherent shared-memory multi-processors (SMP)
    Tightly-coupled: highest bandwidth, lowest latency
    Large workloads: ad-hoc transaction processing, data warehousing
    Shared pool to 100 processors
    Single Terabyte scale memory
    Scale Horizontally (Capacity)
    Multiple OS Instances
    Cluster multi-processor
    Loosely coupled
    Standard H/W & S/W
    Highly parallel (web, some HPTC)
    Cluster Mgmt.


  19. Vertical vs. Horizontal Workloads
    Scale Vertically
    • Commercial Workloads
    — Large databases
    — Transactional databases
    — Data warehouses

    • HPTC Workloads
    — Climate modeling
    — Data mining
    — Signal Processing
    — Cryptanalysis
    — Nuclear simulation
    — Some structural analysis
    — EDA full assembly simulation
    Scale Horizontally
    • Commercial Workloads
    — Web servers, Firewalls
    — Proxy servers, Directories
    — SSL, VPN
    — Media streaming
    — XML processing
    • HPTC Workloads
    — Seismic analysis
    — Genomics
    — Computational Fluid Dynamics
    — EDA sub-assembly simulation
    — Some Structural Analysis
    — Crash Testing


  20. Workload Performance Factors

    Processor speed, capacity and throughput

    Memory capacity

    System interconnect
    latency & bandwidth

    Network and storage I/O

    Operating system scalability

    Visualization performance and quality

    Optimized applications

    Network service availability
    #1 issue for real world cluster performance and scaling


  21. Solutions
    Partners and Expertise make the difference


  22. Grid Solutions Overview
    Visualization
    Storage
    Integration
    Global Grid Services & Portal
    Compute Grid
    Data Grid
    Visual Grid
    Applications & Users


  23. Sun Grid Engine Portal & Sun ONE Portal Server
    Storage Systems
    Desktops and Information Appliances
    Solaris/Linux Operating Environment
    N1
    Sun Mgmnt Center
    Sun Control Station
    Sun Grid Engine
    Sun ONE Developer Studio
    Sun HPC Cluster Tools
    Sun Grid Services Environment
    Web User Interface
    Throughput and HPC Clusters, Enterprise Servers
    Global Grid Layer
    SysAdmin Tools
    Distributed Workload Management
    Development Tools
    Sun ONE Web Services
    Globus/Avaki
    OGSA


  24. Data Grid
    Sun StorEdge™ Performance Suite
    Sun™ Cluster
    Heterogeneous Client
    Sun StorEdge™ Utilization Suite
    Sun StorEdge™ 3900 Series
    Sun StorEdge QFS Shared File Systems
    Solaris Linux IRIX AIX Solaris
    Win2K, NT HP-UX Future
    Achieved 3 GB/Sec!
    HPC SAN
    Professional Services


  25. Graphics Grid:
    Access for More Users to Visualization Services at
    Required Visual Quality and Performance Levels
    Storage
    Compute
    Display
    Clients
    Visualization
    SAN/NAS
    Graphics InterConnect
    Digital Video Delivery
    Compute Cluster
    Visualization Services Over LAN/WAN


  26. Interconnect Components
    Scale Vertically (Capability) or Scale Horizontally (Capacity)?
    Sun Fire Link: 4.8 GB/s, 4 µs latency
    GBE: 100 MB/s, 100 µs latency

    Parallel applications: OpenMP

    Large Shared Memory

    Top Performance

    Higher acquisition cost

    Lower development and
    management complexity

    Serial and parallel applications: MPI

    Throughput

    Lower acquisition cost

    Higher development and management complexity
    Myrinet: 400 MB/s, 4 µs latency
    Infiniband: 800 MB/s, 8 µs latency
    V480
    V210
    V60X
    SF4800
    V1280
    V880
    V480
    SF15K
    SF12K
    SF6800
    Interdependent Threads
    Cluster
    Performance
    The Deciding
    Factor
    What do the
    workloads
    require?
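The bandwidth and latency figures quoted on this slide can be turned into a rough per-message cost with the usual first-order model: transfer time is latency plus message size divided by bandwidth. This is a sketch under stated assumptions: the model ignores contention, protocol overhead and software stack costs, MB and GB are taken as powers of ten, and the `transfer_time` helper and message sizes are hypothetical choices for illustration.

```python
# (bandwidth in bytes/sec, latency in seconds), as quoted on the slide.
# Decimal units are an assumption: 1 MB = 1e6 bytes, 1 GB = 1e9 bytes.
INTERCONNECTS = {
    "GBE": (100e6, 100e-6),
    "Myrinet": (400e6, 4e-6),
    "Infiniband": (800e6, 8e-6),
    "Sun Fire Link": (4.8e9, 4e-6),
}


def transfer_time(size_bytes, bandwidth, latency):
    """First-order cost of one message: fixed latency plus streaming time."""
    return latency + size_bytes / bandwidth


for size in (64, 64 * 1024, 16 * 1024 * 1024):   # illustrative message sizes
    for name, (bandwidth, latency) in INTERCONNECTS.items():
        seconds = transfer_time(size, bandwidth, latency)
        print(f"{size:>9} bytes over {name:<13}: {seconds * 1e6:10.1f} us")
```

Small messages are dominated by latency and large ones by bandwidth, which is one way to approach the slide's closing question of what the workloads require.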


  27. Interconnect Components
    Mapping Out Bandwidth and Latency
    CMT On-Chip: 100 – x00 GB/s, 0.1 - 0.01 µs, $xxx?
    Memory: 9.6 - 57 GB/s, 1 - 0.1 µs, $xx,xxx
    Myrinet/IB/FL: 0.4 – 4.8 GB/s, 10 - 1 µs, $x,xxx
    Ethernet: 0.1 GB/s, 100 - 10 µs, $xxx
    [Chart: Gigabytes/sec Bandwidth (logarithmic scale, 0.1 to 1000) against Latency (inverted log scale, 10ns to 100us); Proximity labels: System Call, Library Call, Load/Store, Instruction]


  28. A Complete Compute Platform Solution
    Proven and Repeatable Reference Architectures
    Servers
    Workstations
    Control Network (Gigabit Ethernet)
    Data Network (Gigabit Ethernet)
    Sun StorEdge storage solutions
    (Direct-attached, NAS, HA-NFS, HPTC SAN)
    Sun ONE
    Grid Engine
    Sun Compute
    Grid rack systems
    Sun Cluster
    Grid Manager


  29. Infrastructure Partners

    Grid partners
    – Altair Engineering
    – Avaki
    – Engineous
    – Globus
    – GridIron
    – GridXpert
    – Meiosys
    – Platform

    Interconnect partners
    – Force10
    – Infinicon
    – Mellanox
    – Myrinet
    – Topspin

    Data grid partners
    – Instrumental
    – Precision I/O
    – Qlogic


  30. ISV Application Partners

    Energy
    – CGG
    – Landmark
    – Paradigm Geotechnology
    – Schlumberger/Geoquest

    Life Sciences
    – Accelrys
    – Gene Logic
    – MDL
    – Oracle
    – Spotfire

    Manufacturing
    – Ansys
    – Computational Dynamics
    – ESI
    – Fluent
    – LSTC
    – MSC.Software

    Visualization / Analysis
    – AVS
    – CEI
    – EDS
    – ICEM
    – Multigen/Paradigm
    – VNI


  31. A Complete Solution

    Focused on key business and technical challenges

    Based on proven and repeatable
    reference architectures
    – Optimized for specific
    industries and applications

    Adoption and deployment
    assistance at all levels of
    Grid Computing
    – Cluster and
    enterprise grids up
    to global grids
    Hardware, Software, and Services
    Services: Architecture, assessment, implementation, and training
    Grid Computing Reference Architectures: Proven and repeatable methodologies
    Hardware: Workstations, servers, complete rack systems, and storage
    Software: Sun ONE Grid Engine Family, Sun Control Station, Grid Engine Portal
    Sun-Certified Grid Computing Partners


  32. Execution
    From slideware to reality


  33. Execution Strategy
    Clear ambitious vision
    Identify and push back on obstacles
    Efficient engagement across Sun
    Use SunShot, SunCAP, SunSigma toolsets
    Clear scope and ownership
    HPTC "owns" technical markets, Grid, and early
    adopter commercial markets for Sun


  34. Call to Action

    Take advantage of Sun's new focus on Grid to build
    partnerships between early adopters and HPTC

    Learn to speak the new languages of Grid, e.g. Java
    based Open Grid Services Architecture (OGSA) and
    Globus, Avaki and other Web Services

    Feedback: Tell Sun's HPTC group what works, what
    doesn't, where the opportunities lie...


  35. Grid Everywhere
    www.sun.com/hptc
    www.sun.com/grid
    Adrian Cockcroft
    [email protected]
