Who is afraid of I/O? – Exploring I/O Challenges and Opportunities at the Exascale

SciTech
August 02, 2017

Clear trends in past and current petascale systems (i.e., Jaguar and Titan) and in the new generation of systems that will transition us toward exascale (i.e., Aurora and Summit) show that concurrency and peak performance are growing dramatically while I/O bandwidth remains stagnant. Next-generation systems are expected to deliver 7 to 10 times higher peak floating-point performance but only 1 to 2 times higher parallel file system (PFS) bandwidth than the current generation.

Data-intensive applications, especially those exhibiting bursty I/O, must take this imbalance into account and be more selective about what data is written to disk and how it is written. In addressing the needs of these applications, can we take advantage of a rapidly changing technology landscape, including containerized environments, burst buffers, and in-situ/in-transit analytics? Are these technologies ready to transition these applications to exascale? In general, the existing software components that manage these technologies are I/O-ignorant, so systems running data-intensive applications exhibit contention, hot spots, and poor performance.

In this talk, we explore the challenges of dealing with I/O-ignorant high performance computing systems and the opportunities for integrating I/O awareness into these systems. Specifically, we present solutions that use I/O awareness to reduce contention in scheduling policies that manage underprovisioned systems with burst buffers, and to mitigate data movement in data-intensive simulations. Our proposed solutions go beyond high performance computing and open opportunities for interdisciplinary collaborations.

Transcript

  1. Who is afraid of I/O? Exploring I/O Challenges and Opportunities at the Exascale

    Michela Taufer, Computer and Information Sciences, University of Delaware, Newark, Delaware, USA
  2. Acknowledgements

    Travis J., Boyu Z., Trilce E., Adam L., Silvia C., Jim G., Marc S., Tom S., Becky S., Dong A., Don L., … and Mark G. Sponsors: The GCLab@UD
  3. Challenges at the Extreme Scale

    From a talk by Lucy Nowell, DoE Program Director (DoE Workflow Workshop, Rockville, MD, USA, April 20-21, 2015)
    Simulations today:
    •  Save all the data to analyze later!
    Simulations at exascale:
    •  Analyze data as they are generated
    •  Save only what is really needed!
    [Figure labels: Peak FLOPS; Peak PFS I/O Bandwidth]
    We must change how we run our simulations at the exascale
  4. Perspective

    The scientist: “Storage technologies are advancing […] and it is really not clear at all [to me] that especially distributed storage platforms would not be able to handle […] petabyte data sets” (Anonymous Feedback)
    The computer architect: “[…] there will be burst buffers on the DOE machines which will give applications much faster I/O […]” (Anonymous Feedback)
  5. Burst Buffers

    Many have heard about them, few have seen real machines with them, even fewer have run applications on those machines …
  6. Traditional System

    [Diagram labels: CN, IB, ION, Parallel File System] Based on: http://www.nersc.gov/users/computational-systems/cori/burst-buffer/burst-buffer/
    [Plot labels: Application 1, Application 2, Application 3; Bursty I/O patterns (PFS-side); PFS BW (1 TB/s)] Based on: Liu, N, Cope, J, Carns, P, Carothers, C, Ross, R, Grider, G, Crume, A, Maltzahn, C. “On the Role of Burst Buffers in Leadership-class Storage Systems” MSST/SNAPI 2012
  7. Burst Buffer System

    [Diagram labels: CN, IB, ION, BB SSD, Parallel File System] Based on: http://www.nersc.gov/users/computational-systems/cori/burst-buffer/burst-buffer/
    [Plot labels: Application 1, Application 2, Application 3; Stream I/O patterns (App-side); BB BW (10s TB/s)] Based on: Liu, N, Cope, J, Carns, P, Carothers, C, Ross, R, Grider, G, Crume, A, Maltzahn, C. “On the Role of Burst Buffers in Leadership-class Storage Systems” MSST/SNAPI 2012
  8. Streamed I/O patterns (PFS-side)

    [Diagram labels: CN, IB, ION, BB SSD, Parallel File System; Burst Buffer System] Based on: http://www.nersc.gov/users/computational-systems/cori/burst-buffer/burst-buffer/
    [Plot labels: Application 1, Application 2, Application 3; BB BW (10s TB/s); PFS BW (100s GB/s)] Based on: Liu, N, Cope, J, Carns, P, Carothers, C, Ross, R, Grider, G, Crume, A, Maltzahn, C. “On the Role of Burst Buffers in Leadership-class Storage Systems” MSST/SNAPI 2012
  9. PFS Bottleneck

    BB BW > PFS BW
    [Diagram labels: CN, IB, ION, BB SSD, Parallel File System; BB BW (10s TB/s); PFS BW (100s GB/s)] Based on: http://www.nersc.gov/users/computational-systems/cori/burst-buffer/burst-buffer/
    Applications’ cumulative average bandwidths can exceed the PFS bandwidth, causing I/O contention
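
The bottleneck condition on this slide is simple arithmetic: contention appears once the sum of the applications' average bandwidths exceeds the PFS bandwidth. The sketch below illustrates that check only. The function name `pfs_contention` and every number in it are hypothetical values chosen for illustration, not measurements from the talk.

```python
# Hypothetical numbers for illustration only; not measurements from the talk.
PFS_BANDWIDTH_GBS = 500.0  # assumed PFS bandwidth, in the "100s GB/s" range


def pfs_contention(avg_bandwidths_gbs, pfs_bandwidth_gbs=PFS_BANDWIDTH_GBS):
    """Return (cumulative demand, oversubscription factor, contended?)."""
    demand = sum(avg_bandwidths_gbs)
    factor = demand / pfs_bandwidth_gbs
    return demand, factor, demand > pfs_bandwidth_gbs


# Assumed average drain bandwidths (GB/s) of three applications.
apps = {"Application 1": 220.0, "Application 2": 180.0, "Application 3": 260.0}

demand, factor, contended = pfs_contention(apps.values())
print(f"cumulative demand: {demand:.0f} GB/s "
      f"({factor:.2f}x the PFS bandwidth), contention: {contended}")
```

Each application alone stays below the assumed PFS bandwidth here; only the cumulative demand exceeds it, which is the situation the slide describes.
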
  10. Challenges

    •  Burst Buffers are not the magic I/O silver bullet
       §  I/O contention still a problem if we exceed the burst buffer capability
       §  Burst buffers improve offloading bandwidth but do NOT help uploading data from storage for analysis and visualization
  11. In-situ and In-transit Analysis

    [Diagram labels: Cores; Simulation; Analysis; Shared Memory; Node 1, Node 2, Node 3; Network Interconnect]
    Examples of tools:
    •  DataSpaces (Rutgers U.)
    •  DataStager (Georgia Tech)
  12. MD simulations are alive and kicking!

    XSEDE SUs used by type of targeted science over the past 6 months (March 1, 2016 - August 31, 2016)
    Four of the top 10 XSEDE users run molecular simulations (i.e., Schulten at UIUC, Feig at Michigan State U, Voth at U Chicago, and Case at Rutgers U)
  13. Analysis Requirements

    •  We want to capture what is going on in each frame without:
       §  Disrupting the simulation (e.g., stealing CPU and memory on the node)
       §  Moving all the frames to a central file system and analyzing them once the simulation is over
       §  Comparing each frame with past frames of the same job
       §  Comparing each frame with frames of other jobs
    [Figure: frames of an MD trajectory: Frame 55, Frame 60, Frame 65, Frame 70, Frame 75, Frame 80]
  14. MD Simulations == Ensemble of Jobs

    An MD job generates a sequence of conformational frames
    An MD simulation comprises hundreds of thousands of MD jobs
    [Figures: MD trajectory and its frames] From: http://images.slideplayer.com/18/5667960/slides/slide_1.jpg; By Vincent Voelz - Sent to the uploader personally, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=218912120
  15. MD Simulations == Ensemble of Black Boxes?

    An MD job generates a sequence of conformational frames
    An MD simulation comprises hundreds of thousands of MD jobs
    [Figure: repeated “MD job” labels]
  16. Modeling Molecules

    Use Cα atoms
    [Figure] From: By Tomixdf (talk) - Own work (Original text: self-made), Public Domain, https://commons.wikimedia.org/w/index.php?curid=23662306
  17. Capturing Secondary Structures

    •  Measure the distance d between Cα i and Cα j
    •  Build the substructure Euclidean Distance Matrix (D)
    •  Compute largest eigenvalue λmax
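
The three steps on this slide (pairwise Cα distances, Euclidean distance matrix D, largest eigenvalue λmax) can be sketched with NumPy as follows. This is a minimal illustration, not the authors' implementation: the helper names `distance_matrix` and `largest_eigenvalue` are made up here, and the coordinates are a synthetic, idealized helix (assumed 1.5 Å rise, 100° turn per residue, 2.3 Å radius) rather than data from a real trajectory.

```python
import numpy as np


def distance_matrix(coords):
    """Euclidean distance matrix D, with D[i, j] = ||coords[i] - coords[j]||."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def largest_eigenvalue(sym_matrix):
    """Largest eigenvalue of a real symmetric matrix."""
    # eigvalsh returns the eigenvalues in ascending order.
    return np.linalg.eigvalsh(sym_matrix)[-1]


# Synthetic stand-in for the C-alpha coordinates of a 10-residue substructure:
# points on an idealized helix (illustrative geometry, not trajectory data).
t = np.arange(10)
helix = np.column_stack([
    2.3 * np.cos(np.radians(100) * t),
    2.3 * np.sin(np.radians(100) * t),
    1.5 * t,
])

D = distance_matrix(helix)
lam_max = largest_eigenvalue(D)
print(f"lambda_max = {lam_max:.3f}")
```

`numpy.linalg.eigvalsh` is used because D is real and symmetric; it avoids the complex arithmetic of a general eigenvalue solver.
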
  18. Capturing Tertiary Structures

    •  Measure the distance d between Cα j and Cβ i
    •  Build a bipartite distance matrix by comparing two substructures (i, j)
    •  Compute largest eigenvalue λmax
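
The slide does not spell out the layout of the bipartite distance matrix. One natural reading, assumed here, is a symmetric block matrix with zero diagonal blocks and the cross-distances between the two substructures in the off-diagonal blocks. The sketch below follows that assumption; the helper names and the random coordinates are illustrative only.

```python
import numpy as np


def cross_distances(a, b):
    """Matrix C with C[p, q] = distance between atom p of a and atom q of b."""
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)


def bipartite_distance_matrix(a, b):
    """Assumed layout: [[0, C], [C.T, 0]], so only cross-distances appear."""
    m, n = len(a), len(b)
    cross = cross_distances(a, b)
    B = np.zeros((m + n, m + n))
    B[:m, m:] = cross
    B[m:, :m] = cross.T
    return B


# Synthetic substructures (random points, not protein data).
rng = np.random.default_rng(0)
sub_i = rng.normal(size=(8, 3))
sub_j = rng.normal(size=(6, 3)) + np.array([10.0, 0.0, 0.0])

B = bipartite_distance_matrix(sub_i, sub_j)
lam_max = np.linalg.eigvalsh(B)[-1]

# Moving the second substructure farther away increases lambda_max.
far_j = sub_j + np.array([20.0, 0.0, 0.0])
lam_far = np.linalg.eigvalsh(bipartite_distance_matrix(sub_i, far_j))[-1]
print(f"lambda_max = {lam_max:.3f}, after moving apart = {lam_far:.3f}")
```

Under this layout, λmax depends only on distances between the two substructures, not within them, so it responds to their relative motion even when each one keeps its internal shape.
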
  19. Proxy for Conformations’ Changes

    [Figure: frames of an MD job (Frame 55, Frame 60) with eigenvalue labels λ55, λ60, λ65, λ70, λ75, λ85]
  20. Proxy for Conformations’ Changes

    [Figure: frames of an MD job (Frame 55, Frame 60, Frame 65) with eigenvalue labels λ55, λ60, λ65, λ70, λ75, λ85]
  21. Proxy for Conformations’ Changes

    [Figure: frames of an MD job (Frame 55, Frame 60, Frame 65, Frame 70, Frame 75, Frame 80) with eigenvalue labels λ55, λ60, λ65, λ70, λ75, λ85]
  22. Proxy for Conformations’ Changes

    [Figure: frames of an MD job (Frame 55, Frame 60, Frame 65, Frame 70, Frame 75, Frame 80) with eigenvalue labels λ55, λ60, λ65, λ70, λ75, λ85]
  23. Proxy for Conformations’ Changes

    [Figure: frames of an MD job (Frame 55, Frame 60, Frame 65, Frame 70, Frame 75, Frame 80) with eigenvalue labels λ55, λ60, λ65, λ70, λ75, λ85]
  24. Proxy for Conformations’ Changes

    [Figure: frames of an MD job (Frame 55, Frame 60, Frame 65, Frame 70, Frame 75, Frame 80) with eigenvalue labels λ55, λ60, λ65, λ70, λ75, λ85]
    Can the distance between two max eigenvalues serve as a proxy for the distance between the two associated conformations?
  25. Proxy for Conformations’ Changes

    •  Euclidean distance matrix D is symmetric
    •  Eigenvalues of symmetric, real matrices are stable
       §  Small perturbations of D result in only small changes in the eigenvalues
       §  Euclidean distance matrix is insensitive to rigid transformation
    •  Use only largest eigenvalue in distance matrix
       λmax = λ1 > λ2 > λ3 > λ4 > λ5 = λmin
       λ1 + λ2 + λ3 + λ4 + λ5 = 0
       λ1 >> λ2 ~ λ3 ~ λ4 ~ 0
       λmax = λ1 ~ - λ5 = - λmin
    [Figure labels: α-carbon, α-carbon]
    Can the distance between two max eigenvalues serve as a proxy for the distance between the two associated conformations?
    “In-Situ Data Analysis and Indexing of Protein Trajectories,” Travis Johnston, Boyu Zhang, Adam Liwo, Silvia Crivelli, and Michela Taufer. JCC 2017.
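
Three of the properties listed on this slide can be checked numerically: the eigenvalues of D sum to zero (D has a zero diagonal, so its trace is zero), they do not change under a rigid transformation, and they move only slightly under a small perturbation. The sketch below does this for five synthetic points; the coordinates, the perturbation size, and the helper name `distance_matrix` are assumptions made for illustration.

```python
import numpy as np


def distance_matrix(coords):
    """Euclidean distance matrix D, with D[i, j] = ||coords[i] - coords[j]||."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)


rng = np.random.default_rng(1)
coords = 4.0 * rng.normal(size=(5, 3))  # five synthetic "alpha-carbon" positions

# Eigenvalues in descending order, so eig[0] is lambda_max.
eig = np.linalg.eigvalsh(distance_matrix(coords))[::-1]

# Rigid transformation: a random rotation (from a QR factorization) plus a shift.
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
if np.linalg.det(Q) < 0:
    Q[:, 0] *= -1.0  # flip one axis so Q is a proper rotation
moved = coords @ Q.T + np.array([5.0, -2.0, 7.0])
eig_moved = np.linalg.eigvalsh(distance_matrix(moved))[::-1]

# Small perturbation of the coordinates.
noisy = coords + 1e-3 * rng.normal(size=coords.shape)
eig_noisy = np.linalg.eigvalsh(distance_matrix(noisy))[::-1]

print("sum of eigenvalues:", eig.sum())
print("max change under rigid motion:", np.abs(eig - eig_moved).max())
print("max change under perturbation:", np.abs(eig - eig_noisy).max())
```

The first two printed values should be zero up to floating-point round-off, and the third should be on the order of the perturbation itself.
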
  26. Proxy for Conformations’ Changes

    [Figure: frames of an MD job (Frame 55, Frame 60, Frame 65, Frame 70, Frame 75, Frame 80) with eigenvalue labels λ55, λ60, λ65, λ70, λ75, λ85]
    Yes, the distance between two max eigenvalues serves as a proxy for the distance between the two associated conformations!
  27. Mapping Largest Eigenvalues to Structures

    PDB dataset: 3,197 different proteins including 22,898 helices and 32,894 strands
    [Figure labels: Cα atoms; 22,898 helices; 32,894 strands; Largest eigenvalue; PDB Dataset; Number of helices; Number of strands]
  28. Case Study I: 2MQ8 Protein

    [Figure: Frame 7686, Frame 8925]
    •  Canonical simulation of 2MQ8 protein including both α helices and β strands
       §  After ~9M steps α helices pack tighter and change into β strands
    Can the eigenvalue analysis capture the conformational change?
  29. Case Study I: 2MQ8 Protein

    Compute largest eigenvalue of 3rd strand (10 amino acids) for each trajectory frame
  30. Case Study I: 2MQ8 Protein

    Compute largest eigenvalue of 3rd strand (10 amino acids) for each trajectory frame
  31. Case Study I: 2MQ8 Protein

    Compute largest eigenvalue of 3rd strand (10 amino acids) for each trajectory frame
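
Computing the largest eigenvalue of one substructure for each trajectory frame yields a one-dimensional series that can be watched for jumps. The sketch below shows the idea on a synthetic trajectory, not on 2MQ8 data. The function name `lambda_max_per_frame`, the array layout (frames × atoms × 3), the choice of residues 20 to 29 as the "strand", and the largest-jump rule are all assumptions for illustration.

```python
import numpy as np


def lambda_max_per_frame(trajectory, residues):
    """Largest eigenvalue of the substructure distance matrix in every frame.

    trajectory: array of shape (n_frames, n_atoms, 3), one atom per residue.
    residues:   slice or index array selecting the substructure.
    """
    sub = trajectory[:, residues, :]
    diff = sub[:, :, None, :] - sub[:, None, :, :]
    D = np.linalg.norm(diff, axis=-1)      # shape (n_frames, k, k)
    return np.linalg.eigvalsh(D)[:, -1]    # eigvalsh works on stacked matrices


# Synthetic trajectory: 100 noisy copies of one random 30-atom conformation.
rng = np.random.default_rng(2)
n_frames, n_atoms = 100, 30
base = 3.0 * rng.normal(size=(n_atoms, 3))
traj = base + 0.01 * rng.normal(size=(n_frames, n_atoms, 3))

# From frame 50 on, stretch a 10-residue "strand" about its centroid.
strand = slice(20, 30)
center = base[strand].mean(axis=0)
traj[50:, strand, :] = center + 1.5 * (traj[50:, strand, :] - center)

series = lambda_max_per_frame(traj, strand)

# Illustrative detection rule: the frame with the largest frame-to-frame jump.
change_frame = int(np.argmax(np.abs(np.diff(series)))) + 1
print("largest jump in lambda_max at frame", change_frame)
```

Only one scalar per frame has to be kept to follow the substructure, which is what makes this kind of monitoring attractive for in-situ analysis.
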
  32. Case Study II: Capturing Movement of α-helices

    Can the eigenvalue analysis capture the movement of helices?
    Capture movement of structures with respect to each other
    [Figure labels: 1330, 1360, 1390]
  33. Case Study II: Capturing Movement of α-helices

    Monitor largest eigenvalue of entire protein
    Something is changing
  34. Case Study II: Capturing Movement of α-helices

    Monitor largest eigenvalue of single helices
    Individual α-helices (Helix 1, Helix 2, and Helix 3) appear stable
  35. Case Study II: Capturing Movement of α-helices

    Monitor largest eigenvalue of bipartite distance matrix
    First and second α-helices appear stable; third helix moves
  36. Case Study II: Capturing Movement of α-helices

    Large relative change between two pairs of α-helices
    [Figure labels: 1330, 1360, 1390]
  37. “Storage technologies are advancing […] and it is really not clear at all [to me] that especially distributed storage platforms would not be able to handle […] petabyte data sets” (Anonymous Feedback)

    Yes, new technologies will be able to handle data at the extreme scale but only if we integrate new software paradigms. In-situ and in-transit analysis are here to stay!