Red Cloud and the MATLAB Distributed Computing Server (MDCS)

Presentation courtesy of Steve Lantz.

Red Cloud is Cornell’s university-wide private cloud for research computing, maintained by the Center for Advanced Computing. Subscribers can meet their on-demand computing needs by launching one or more virtual server instances of up to 28 CPU cores each from two pools comprising 472 total cores. Red Cloud is based on HPE Helion Eucalyptus; this opens up the possibility for extra-large workloads to burst out to the Amazon cloud, as Eucalyptus is fully compatible with Amazon Web Services. Red Cloud instances are commonly managed through a Web-based user console, which also allows Elastic Block Storage (EBS) volumes to be defined and attached to instances.

This presentation focuses on one potential use of Red Cloud, namely, to act as a host for MDCS clusters. MDCS is probably the ultimate strategy for accelerating MATLAB computations. The first step is to parallelize a MATLAB script through the mechanisms provided in the Parallel Computing Toolbox (PCT). If local CPU cores do not offer enough speedup for your PCT computations, then it may be possible to scale out to Red Cloud and bring more resources to bear. I will demo the process of launching a Red Cloud MDCS instance, connecting to it from a MATLAB R2016a client, and speeding up a PCT-enabled computation through the use of multiple workers in the cloud. I will also talk about the various modes of PCT usage, and discuss when it might make sense to introduce one or more of these into your MATLAB scripts.

Presented at SSW: https://cornell-ssw.github.io/meetings/2017-05-08

CUSSW Hosted

May 08, 2017

Transcript

  1. Red Cloud and the MATLAB Distributed Computing Server (MDCS)

    Steve Lantz, Senior Research Associate
    Center for Advanced Computing (CAC)
    [email protected]
    Cornell Scientific Software Club, May 8, 2017
    www.cac.cornell.edu
  2. What Is Red Cloud?

    • Infrastructure-as-a-Service (IaaS) for research computing
    • Extra compute cycles and memory via on-demand VMs
    • Subscription model for buying computer time
    • Easy way to clone a software environment across servers
    • Local source of computing power on Cornell networks
    • Complementary resource to CAC’s disk storage offerings
    • Where to go if your MATLAB runs outgrow your laptop!
  3. What Is MDCS?

    Quoting from The MathWorks:*
    • MATLAB Distributed Computing Server™ lets you run computationally intensive MATLAB® programs and Simulink models on computer clusters, clouds, and grids.
    • You develop your program or model on a multicore desktop computer using Parallel Computing Toolbox™ and then scale up to many computers by running it on MDCS.
    • The server supports batch jobs, parallel computations, and distributed large data. The server includes a built-in cluster job scheduler [...]

    *https://www.mathworks.com/products/distriben.html
  4. Red Cloud and MDCS Demo

    The main steps from the CAC wiki doc will be shown:
    • Working with the Eucalyptus User Console (website)
      – Logging in to Red Cloud
      – Launching an MDCS instance
      – Setting up the security group
    • Connecting to MDCS from MATLAB R2016a
      – Running the built-in validation test
      – Initiating a parpool
      – Trying a couple of quick checks using spmd
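The client-side half of the demo (initiating a parpool, then quick spmd checks) might look roughly like the sketch below. It assumes a cluster profile for the MDCS instance already exists on the client; the profile name 'RedCloudMDCS' and the worker count are hypothetical placeholders, not names from the CAC wiki.

```matlab
% Sketch only: 'RedCloudMDCS' is a hypothetical profile name; substitute the
% profile you created or imported for your own MDCS instance.
clust = parcluster('RedCloudMDCS');   % cluster object for the remote instance
pool  = parpool(clust, 4);            % request 4 workers (adjust to instance size)

% Quick check 1: every worker reports its identity.
spmd
    fprintf('Worker %d of %d is alive\n', labindex, numlabs);
end

% Quick check 2: each worker contributes its index to a global sum,
% which should equal n(n+1)/2 for n workers.
spmd
    total = gplus(labindex);
end
n = pool.NumWorkers;
checkPassed = (total{1} == n*(n+1)/2);   % total is a Composite; read worker 1's copy

delete(pool);                         % release the workers when done
```

Swapping the profile name for 'local' runs the same checks on the cores of the client machine, which is a convenient way to try the code before launching a cloud instance.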
  5. PCT Opens Up Parallel Possibilities

    • MATLAB has multithreading built into core libraries
      – Mostly aids big array operations; not within user control
    • PCT enables user-directed, interactive parallelism
      – Parallel for-loops: parfor
      – Single program, multiple data: spmd, pmode
      – Array partitioning for big-data parallelism: (co)distributed
    • PCT also enables batch-style parallelism
      – Multiple independent runs of a serial function: createJob
      – Single run of parallelized code: createCommunicatingJob
  6. Two Ways to Use PCT

    [Diagram: Interactive vs. batch-style]
    • Interactive: MATLAB Client and MATLAB Workers. Start local or remote parpool, run PCT commands (scripted)
    • Batch-style: MATLAB Client, Scheduler (file transfer), MATLAB Workers (maybe via Distributed Computing Server). Submit jobs, task functions to local or remote parcluster
  7. Interactive PCT: Major Concepts

    • parpool: pool of distinct MATLAB processes = “labs”
      – Differs from multithreading! No shared address space
      – Ultimately allows same concepts to work on MDCS clusters
    • parfor: parallel for-loop, iterations are independent
      – Labs (workers) split up iterations; load balancing is built in
    • spmd: single program, multiple data
      – All labs execute every command; labs can communicate
    • (co)distributed: array is partitioned among workers
      – “Multiple data” for spmd; one array to MATLAB functions
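As a rough illustration of these concepts, the sketch below exercises parfor, spmd, and a distributed array in one short session. It assumes the R2016a-era 'local' profile and a pool of two workers; the array sizes and variable names are arbitrary.

```matlab
pool = parpool('local', 2);           % 'local' is the built-in profile in R2016a

% parfor: iterations are independent, so workers can split them up.
n = 8;
s = zeros(1, n);
parfor i = 1:n
    s(i) = sum((1:i).^2);             % s is a sliced output variable
end

% spmd: every worker runs the block; labindex tells the workers apart.
spmd
    part = labindex * ones(1, 3);     % different data on each worker
end
firstPart = part{1};                  % Composite: index by worker from the client

% distributed: one array, partitioned among the workers.
d = distributed(rand(1000));          % scatter a client matrix across the pool
colSums = gather(sum(d, 1));          % sum runs on the workers; gather collects

delete(pool);
```

The same code should work unchanged against an MDCS cluster by passing that cluster's profile name to parpool, which is the point of the "same concepts" bullet above.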
  8. Batch-Style PCT: Jobs and Tasks

    • parcluster creates a cluster object, which allows you to create Jobs. In PCT, Jobs are containers for Tasks, which are where the actual work is defined.

    [Diagram: the cluster object clust contains Jobs(24) and Jobs(25), created with j=createJob(clust); and j=createCommunicatingJob(clust);. The jobs contain tasks labeled Tasks(1) myFunction(z), Tasks(1) someFunction(x), and Tasks(2) otherFunction(y), each added with createTask(j,…);]
  9. Batch-Style PCT: Types of Jobs

    • PCT has 3 types of jobs: independent, SPMD, and pool
    • Independent: createJob()
      – Can contain many tasks; workers run the tasks one by one
    • SPMD: createCommunicatingJob(...,'Type','SPMD',...)
      – Has ONE task to be run by ALL workers, like a spmd block
    • Pool: createCommunicatingJob(...,'Type','Pool',...)
      – Has ONE task which is run by ONE worker
      – Other workers run spmd blocks or parfor loops in the task
      – Mimics the interactive mode of using PCT
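A minimal sketch of the first two job types follows, assuming the default cluster profile offers at least two workers. The task functions (rand, magic, labindex) are built-ins chosen only so the example needs no file transfer.

```matlab
clust = parcluster();                 % default profile; could be an MDCS profile

% Independent job: several tasks, handed out one by one to free workers.
j1 = createJob(clust);
createTask(j1, @rand,  1, {3, 3});    % 1 output, input arguments {3, 3}
createTask(j1, @magic, 1, {4});
submit(j1);
wait(j1);
out1 = fetchOutputs(j1);              % cell array with one row per task

% SPMD job: ONE task, executed by ALL workers assigned to the job.
j2 = createCommunicatingJob(clust, 'Type', 'SPMD');
j2.NumWorkersRange = [2 2];           % run on exactly two workers
createTask(j2, @labindex, 1, {});     % each worker returns its own index
submit(j2);
wait(j2);
out2 = fetchOutputs(j2);              % cell array with one row per worker

delete(j1);
delete(j2);
```

A pool job is created the same way with 'Type','Pool'; its single task function would then contain the parfor loops or spmd blocks that the remaining workers execute.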
  10. More on SPMD Jobs and spmd Blocks

    • The SPMD task function, like a spmd block, is responsible for implementing parallelism using “labindex” logic
    • The lab* functions allow workers (labs) to communicate; they act just like MPI message-passing methods
      – labSend(data,dest,[tag]); % point-to-point
      – labReceive(source,tag); % datatype, size are implicit
      – labReceive(); % take any source
      – labBroadcast(source); labBarrier; gop(f,x); % collectives
    • (Co)distributed arrays are sliced across workers so huge matrices can be operated on; collect slices with gather
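The message-passing calls listed above can be tried in an interactive spmd block, as in this sketch. It assumes a two-worker 'local' pool and uses the R2016a function names from the slide; the tag value 7 and the broadcast payload are arbitrary.

```matlab
pool = parpool('local', 2);

spmd
    % Point-to-point: worker 1 sends a matrix to worker 2 with tag 7.
    if labindex == 1
        labSend(magic(3), 2, 7);      % data, destination, tag
    elseif labindex == 2
        A = labReceive(1, 7);         % source, tag; type and size are implicit
    end
    labBarrier;                       % everyone waits here

    % Collectives: broadcast from worker 1, then a global reduction.
    root = 1;
    if labindex == root
        B = labBroadcast(root, 42);   % the root supplies the data
    else
        B = labBroadcast(root);       % the others receive it
    end
    S = gop(@plus, labindex);         % sum of all worker indices
end

received  = A{2};                     % Composite values, read on the client
broadcast = B{2};
indexSum  = S{1};

delete(pool);
```

The branch on labindex is the “labindex” logic the first bullet refers to: all workers run the same block, and the index decides each worker's role.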
  11. When Is File Transfer Needed?

    • If your workers do not share a disk with your client, and they will require custom functions or datafiles
    • Example:
        j = createJob(sched);
        createTask(j,@rand,1,{3,3});
        createTask(j,@myfunction,1,{3,3});
        submit(j);
      – The rand function is no problem at all, it’s built in
      – But myfunction.m does not exist on the remote computer
      – We’ll want to transfer this file and get it added to the path
  12. MATLAB Can Copy Files… Or You Can

    • Setting the AutoAttachFiles property tells MATLAB to copy files containing your function definitions
    • Use AttachedFiles to copy any data files or directories the task will need; directory structures are preserved
      – Not very efficient, though: file transfer occurs separately for each worker running a task for that particular job
      – OK for small projects with a couple of files
    • A better-scaling alternative is to copy your files to disk(s) on the remote server(s) in advance
      – Use AdditionalPaths to make the files available at run time
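The sketch below shows how these job properties might be set for the rand/myfunction example. To stay self-contained it writes a throwaway myfunction.m into a temporary folder; the function body and the commented-out remote path are assumptions made for illustration only.

```matlab
% Create a throwaway myfunction.m so the sketch is self-contained.
workDir = tempname;
mkdir(workDir);
fid = fopen(fullfile(workDir, 'myfunction.m'), 'w');
fprintf(fid, 'function y = myfunction(m, n)\ny = ones(m, n);\n');
fclose(fid);
addpath(workDir);

clust = parcluster();                 % default profile; could be an MDCS profile
j = createJob(clust);

% Option 1: let MATLAB copy the file to each worker that needs it.
j.AutoAttachFiles = false;            % turn off dependency analysis...
j.AttachedFiles = {fullfile(workDir, 'myfunction.m')};   % ...and attach by hand

% Option 2 (instead): files were copied to the server in advance.
% The path below is hypothetical.
% j.AdditionalPaths = {'/home/myuser/matlab/project'};

createTask(j, @rand, 1, {3, 3});      % built in, nothing to transfer
createTask(j, @myfunction, 1, {3, 3});
submit(j);
wait(j);
out = fetchOutputs(j);

delete(j);
rmpath(workDir);
```

Leaving AutoAttachFiles at true would let MATLAB find myfunction.m on its own; it is switched off here only to make the manual AttachedFiles route visible.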
  13. Distributing Work with parfeval, batch

    • createJob() isn’t the only way to run independent tasks...
    • parfeval() requests that the given function be executed on one worker in a parpool, asynchronously
    • batch() does the same on one worker NOT in a parpool
      – It creates a one-task job and submits it to a parcluster
      – It can also be a one-line method for initiating a pool job
      – It works with either a function or a script
    • Either can easily be called in a loop over a list of tasks
      – Use fetchNext() to collect results as they become available
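A short sketch of both calls, assuming a two-worker 'local' pool and the default cluster profile; the summed ranges are arbitrary stand-ins for real task inputs.

```matlab
pool = parpool('local', 2);
inputs = [10 20 30 40];

% parfeval: queue one asynchronous evaluation per input on the pool.
f(1:numel(inputs)) = parallel.FevalFuture;     % preallocate the futures
for k = 1:numel(inputs)
    f(k) = parfeval(pool, @(n) sum(1:n), 1, inputs(k));
end

% fetchNext: collect results in completion order, not submission order.
results = zeros(size(inputs));
for k = 1:numel(inputs)
    [idx, value] = fetchNext(f);      % idx identifies which future finished
    results(idx) = value;
end
delete(pool);

% batch: a one-task job on a cluster worker, with no pool required.
clust = parcluster();
job = batch(clust, @sum, 1, {1:10});  % function, number of outputs, inputs
wait(job);
r = fetchOutputs(job);
delete(job);
```

Storing each value at results(idx) keeps the outputs aligned with the inputs even though fetchNext returns them in whatever order the workers finish.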
  14. Distributing Work Without PCT, MDCS

    • Create a MATLAB .m file that takes one or more input parameters (such as the name of an input file).
    • Apply the MATLAB C/C++ compiler (mcc), which converts the script to C, then to a standalone executable.
    • Run N copies of the executable on an N-core batch node or a cluster, each with a different input parameter
      – mpirun can launch non-MPI processes, too
    • MATLAB runtimes (free!) must be available on all nodes
    • For process control, write a master script in Python, say
  15. GPGPU in MATLAB PCT: Fast and Easy

    • Many functions are overloaded to call CUDA code automatically if objects are declared with gpuArray type
    • Benchmarking with large 1D and 2D FFTs shows excellent acceleration on NVIDIA GPUs
    • MATLAB code changes are trivial
      – Move data to GPU by declaring a gpuArray
      – Call method in the usual way
        g = gpuArray(r);
        f = fft2(g);
  16. Are GPUs Really That Simple?

    • No. Your application must meet four important criteria.
      1. Nearly all required operations must be implemented natively for type gpuArray.
      2. The computation must be arranged so the data seldom have to leave the GPU.
      3. The overall working dataset must be large enough to exploit 100s of thread processors.
      4. On the other hand, the overall working dataset must be small enough that it does not exceed GPU memory.
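Criteria 1 and 2 can be made concrete with a sketch like the one below, which assumes a supported NVIDIA GPU is present. The matrix size and the power-spectrum calculation are arbitrary choices made for illustration.

```matlab
r = rand(4096, 'single');             % client-side data; size is arbitrary

g = gpuArray(r);                      % one transfer to the GPU
f = fft2(g);                          % gpuArray-enabled function, runs on the GPU
p = abs(f).^2;                        % intermediate result stays on the GPU
m = max(p(:));                        % still a gpuArray (a scalar one)

peak = gather(m);                     % one small transfer back to the client
stayedOnGpu = isa(p, 'gpuArray');     % true if the data never left the device
```

Every operation between gpuArray and gather has a native gpuArray implementation, so only the input matrix and a single scalar cross the bus. Gathering f or p after each step would add transfers that can erase much of the speedup.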
  17. PCT and MDCS: The Bottom Line

    • PCT can greatly speed up large-scale computations and the analysis of large datasets
      – GPU functionality is a nice addition to the arsenal
    • MDCS allows parallel workers to run on cluster and cloud resources beyond one’s laptop, e.g., Red Cloud
    • Yes, a learning curve must be climbed…
      – General knowledge of how to restructure code so that parallelism is exposed
      – Specific knowledge of PCT functions
    • But speed often matters!