Distributed computing in UQLab using the HPC Dispatcher module

This presentation was given at the seminar organized by the Chair of Risk, Safety and Uncertainty Quantification, ETH Zurich, on 28.10.2020. It covers an upcoming UQLab feature for dispatching uncertainty quantification tasks to distributed computing resources.

Damar Wicaksono

October 28, 2020

Transcript

  1. Distributed computing in UQLab using the HPC Dispatcher module

    Damar Wicaksono, Chair of Risk, Safety and Uncertainty Quantification, ETH Zürich
  4. Introduction: Background

    Some problems you, as UQLab users, might have:

    • You need to run long computations, but you only have your laptop → you want to free up local computing resources
    • You need to run computations with large memory or CPU requirements → you need exceptional resources (CPU, memory)
    • Your simulation code only runs on Linux, possibly with special licensing → compatibility and licensing issues

    Distributed computing resources might help:

    • They are highly available around the clock
    • They have a massive amount of high-performance CPUs, memory, and storage
    • They come with expensive (shared) licensed software installed

    D. Wicaksono (RSUQ, ETH Zürich), 28.10.2020
  5. Introduction: Background

    Distributed computing resources come in many forms and flavors.

    [Images. Credits: Wikipedia, GPL v3; Argonne National Lab., CC BY-SA 2.0; Zamurovic Brothers (Noun Project)]

    Distributed computing resources might help:

    • They are highly available around the clock (maybe)
    • They have a massive amount of high-performance CPUs, memory, and storage (maybe)
    • They come with expensive (shared) licensed software installed (maybe)
  7. Introduction: Distributed computing resources

    Distributed computing resources might help, but...

    [Image: EULER, the large HPC infrastructure of ETH Zurich. Credit: Olivier Byrde, ETH Zurich (2015)]

    Your institution provides distributed computing resources, so you:

    1. ask or look around for how to get access
    2. get access (username, password, computing time, storage space)
    3. read the Wiki
    4. are ready for some distributed computing (right?)
  15. Introduction: Distributed computing

    Distributed computing in 7 (not so) easy steps:

    1. Log in to the remote machine
    2. Create an analysis script to run on the remote machine
    3. Create a job script
    4. Submit the job to the queues
    5. Check the queues until the job is finished
    6. Transfer the output files back (scp, rsync)
    7. Analyze the outputs on the local machine
  16. Introduction: Users vs. machine

    [Diagram: a user (laptops, desktops) on one side and a remote machine (computing servers, HPC clusters, etc.) on the other. The gap between them is real.]

    Users are used to:

    • synchronous execution: computations run immediately
    • local storage: results are available on completion
    • thinking in serial programs: easier to debug

    Now, users have to get used to:

    • asynchronous execution: computations are scheduled and submitted
    • remote storage: results must be retrieved
    • thinking in parallel programs: harder to debug (e.g., race conditions)
  17. Introduction: Introducing the HPC Dispatcher module

    [Diagram: the HPC Dispatcher module sits between a user (laptops, desktops) and a remote machine (computing servers, HPC clusters, etc.).]

    The HPC Dispatcher module is an attempt to bridge the gap:

    • It allows users to dispatch some computations to remote distributed computing resources and to retrieve the results
    • All from a local UQLab session
    • The emphasis is on "some computations", i.e., certain computations that are relevant to uncertainty quantification (UQ) with UQLab
  19. Introduction: Introducing the HPC Dispatcher module

    This presentation is about:

    • The features of the HPC Dispatcher module
    • Its basic usage and a couple of more advanced use cases

    ...and less about (if at all):

    • Distributed computing systems and their organization
    • Parallel algorithms and programming
  21. Dispatcher in action: Model evaluation

    Create a Model object from an m-file:

        ModelOpts.mFile = 'uq_ishigami';

        myModel = uq_createModel(ModelOpts);

    Evaluate the Model on a single input point:

        >> uq_evalModel([0.5*pi 0.5*pi 0.5*pi])
        ans =
            8.6088
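The value 8.6088 can be checked by hand. The sketch below assumes that `uq_ishigami` implements the standard Ishigami function, f(x) = sin(x1) + a sin^2(x2) + b x3^4 sin(x1), with a = 7 and b = 0.1. The slides state neither the formula nor the parameter values, so both are assumptions, and the name `ishigamiSketch` is hypothetical. The code is plain MATLAB and does not need UQLab.

```matlab
% Standard Ishigami function; a = 7 and b = 0.1 are assumed parameter values.
a = 7;
b = 0.1;
ishigamiSketch = @(x) sin(x(:,1)) + a*sin(x(:,2)).^2 + b*x(:,3).^4.*sin(x(:,1));

X = [0.5*pi 0.5*pi 0.5*pi];   % the single input point used above
Y = ishigamiSketch(X);
fprintf('%.4f\n', Y);         % prints 8.6088
```

At this point the sine terms are all 1, so the value reduces to 1 + 7 + 0.1 (pi/2)^4, which is about 8.6088 and matches the output on the slide.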
  23. Dispatcher in action: Model evaluation

    Let's assume a Dispatcher object has been created and stored in a variable myDispatcher (more on this later).

    Dispatch the same Model evaluation to the remote machine:

        >> uq_evalModel([0.5*pi 0.5*pi 0.5*pi], 'HPC')
        ans =
             []

    Whoa, what happened?
  24. Dispatcher in action: Dispatch a computation

        >> uq_evalModel(myModel, X, 'HPC')
        ans =
             []

    [Diagram: the local client (laptops, desktops) sends a dispatch package over SSH to the remote machine (computing servers, HPC clusters, etc.).]
  25. Dispatcher in action: Asynchronous execution

        >> uq_evalModel(myModel, X, 'HPC')
        ans =
             []

    The call returns an empty result because the computation runs asynchronously on the remote machine; the results are fetched later.
  26. Dispatcher in action: Get status

        >> uq_getStatus(myDispatcher)
        ans =
            'running'

    [Diagram: the local client requests the status over SSH; the remote machine returns it ('running') over SSH.]

    Possible statuses:

    • 'submitted': the job is in the queuing system
    • 'running': the job is being executed
    • 'complete': the job has finished successfully
    • 'canceled': the job has been deliberately canceled
    • 'failed': the job has exited with errors
  27. Dispatcher in action: Remote execution is finished

        >> uq_getStatus(myDispatcher)
        ans =
            'complete'

    [Diagram: the results are now available on the remote machine.]
  28. Dispatcher in action: Fetch the results

        >> results = uq_fetchResults(myDispatcher)
        results =
            8.6088

    [Diagram: the local client requests the results over SSH; the remote machine sends them back over SSH.]
  31. Dispatcher in action: Setting up a Dispatcher object (backtracking)

    What you need to set up and use a Dispatcher object:

    • Access to a remote machine; it must run a Linux OS and have an MPI implementation
    • A directory with write permission on the remote machine
    • A passwordless SSH connection to the remote machine
    • A profile file that stores the required information about the remote machine

    An example of a (minimal) remote machine profile file (myProfile.m):

        Hostname = 'euler.ethz.ch';
        Username = 'wdamar';
        PrivateKey = '~/.ssh/id_rsa_dispatcher';
        RemoteFolder = '/home/wdamar/temp';

    In a UQLab session:

        DispatcherOpts.Profile = 'myProfile';
        myDispatcher = uq_createDispatcher(DispatcherOpts);
  32. Dispatcher in action: Setting up a Dispatcher object (backtracking)

    No MATLAB or UQLab installation is required on the remote machine for a UQLink model evaluation.
  33. Dispatcher in action: Setting up a Dispatcher object (backtracking)

    Additional settings in the remote machine profile file:

    • For generic UQLab computations, you need MATLAB and UQLab on the remote machine:

        ...
        MATLABCommand = '/usr/local/bin/matlab';
        RemoteUQLabPath = '~/uqlab';

    • The remote machine might also employ a job scheduler:

        ...
        Scheduler = 'slurm'; % or 'lsf', 'pbs', 'torque'

    • It might also employ a module system to load software and set up its environment:

        ...
        EnvSetup = 'module load open_mpi'; % only on the login node
        PrevCommands = 'module load matlab'; % also on the compute nodes

    • Some other settings are available (e.g., custom scheduler, MPI settings); see the Reference List of the Dispatcher module user manual.
  34. uq_map: Dispatching generic functions

    How about dispatching an evaluation of a generic function?
  36. uq_map: Introducing uq_map

        output = uq_map(fun, inputs)

    [Diagram: uq_map(@fun, inputs) applies fun to each element of inputs and collects each result in one cell of output.]

    In plain English: evaluate fun for each element of inputs. The output is always a cell array.

    It is similar to arrayfun, cellfun, or structfun, but it treats inputs as a sequence or collection regardless of its data type.
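The mapping behaviour described above can be imitated locally in plain MATLAB. The following is only a sketch of the idea, not UQLab's implementation: the variable names are hypothetical, and it does not reproduce any dispatching or error handling that `uq_map` may perform.

```matlab
% Hypothetical local stand-in for the mapping behaviour; not UQLab code.
inputs = {[1 2 3], [4 5 6], 10};

% Explicit loop: evaluate the function once per element of the sequence.
outputs = cell(size(inputs));
for k = 1:numel(inputs)
    outputs{k} = sum(inputs{k});
end

% The same result with the built-in cellfun and non-uniform (cell) output.
outputsCellfun = cellfun(@sum, inputs, 'UniformOutput', false);

disp(isequal(outputs, outputsCellfun))   % displays 1 (true)
```

Both variants return a 1x3 cell array holding 6, 15, and 10, one result per input element.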
  38. uq_map: The basic ingredients: mapping function

        uq_map(fun, inputs)

    Supported functions as the mapping function:

    • All built-in MATLAB functions
    • User-defined functions (stored in m-files)
    • Anonymous functions
    • System commands

    Many MATLAB functions are vectorized. Consider using uq_map if no vectorized function fits your purpose and you are thinking of writing a for-loop.
  39. uq_map: The basic ingredients: mapping function

    Some examples:

    • Built-in functions

        >> uq_map(@sum, {[1 2 3], [4 5 6 7], "a"})
        ans =
          1x3 cell array
            {[6]}    {[22]}    {[NaN]}

    • Anonymous functions

        >> fun = @(X) X * sin(X);
        >> uq_map(fun, linspace(-pi, pi, 3))
        ans =
          1x3 cell array
            {[3.8473e-16]}    {[0]}    {[3.8473e-16]}
  40. uq_map: The basic ingredients: input sequence

        uq_map(fun, inputs)

    Supported types of input sequence:

    • structure arrays, e.g., uq_map(fun, S)
    • cell arrays, e.g., uq_map(fun, C)
    • vectors and matrices

    Structure and cell arrays can contain most types of data users need. Matrices and vectors are supported as a shortcut.
  42. uq_map: Dispatched uq_map

        uq_map(fun, inputs, DispatcherObj)

    uq_map is a dispatcher-aware function:

    • Its computation can be dispatched to a remote machine
    • It only requires UQLab on the remote machine if UQLab functionalities are used
    • It supports advanced functionalities, e.g., attached files, a remote sequence generator, and a custom error handler for a specific computation
    • It follows the same workflow as a dispatched uq_evalModel
  43. uq_map: Dispatched uq_map

    • Dispatch a uq_map computation:

        >> inputs = {linspace(1,10); linspace(1,100); ...
                     linspace(1,1000); rand(10,3)};
        >> uq_map(@sum, inputs, myDispatcher)

    • Get the status of a dispatched computation:

        >> uq_getStatus(myDispatcher)
        ans =
            'complete'

    • Fetch the results:

        >> Y = uq_fetchResults(myDispatcher)
        Y =
          4x1 cell array
            {[       550]}
            {[      5050]}
            {[     50050]}
            {1x3 double}
  44. Notes on more advanced features of the Dispatcher module
  45. Notes on synchronization

    Users can wait for a dispatched computation to finish:

        >> uq_waitForJob(myDispatcher) % this will block the session
        Checking the status of the remote execution ...
        Checking the status of the remote execution ...
        Job Status: complete reached.

    [Flowchart: the local client repeatedly requests the status from the remote machine over SSH. The session is unblocked (back to the MATLAB prompt) once the job has finished ('complete' or 'failed') or a time-out is reached.]
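The blocking behaviour in the flowchart can be sketched as a polling loop with a time-out. This is a plain MATLAB illustration under stated assumptions, not the implementation of `uq_waitForJob`: the status source `getStatusSketch` is a hypothetical stand-in that needs no remote machine, and the time-out and polling interval are made-up values.

```matlab
% Hypothetical status source: reports 'submitted', then 'running', then 'complete'.
statuses = {'submitted', 'running', 'complete'};
getStatusSketch = @(k) statuses{min(k, numel(statuses))};

timeout = 5;          % seconds before giving up (assumed value)
pollInterval = 0.1;   % seconds between two status checks (assumed value)

tStart = tic;
nChecks = 0;
finished = false;
while ~finished && toc(tStart) < timeout
    nChecks = nChecks + 1;
    status = getStatusSketch(nChecks);
    fprintf('Checking the status of the remote execution... %s\n', status);
    % The loop ends as soon as the job reaches a final state.
    finished = any(strcmp(status, {'complete', 'failed'}));
    if ~finished
        pause(pollInterval);
    end
end

if ~finished
    warning('Time-out reached before the job finished.');
end
```

The session stays blocked inside the `while` loop, which is why a time-out is useful: without it, a job stuck in the queue would block the prompt indefinitely.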
  47. Notes on synchronization

    Users can also dispatch a computation in a synchronized mode, e.g.:

        >> Y = uq_map(fun, inputs, DispatcherObj, ...
                      'Synchronized', true)
        Y =
            ...

    [Flowchart: the local client requests the status over SSH until uq_waitForJob has finished. If the job completed, the results are fetched over SSH; otherwise an error is thrown.]
  48. Notes on parallel execution

    uq_evalModel and uq_map are parallelizable:

    • They are embarrassingly ("naively") parallel, based on data parallelism
    • The data are chunked and processes are spawned to deal with each chunk
    • When fetched, the chunked results are automatically merged

    [Diagram, dispatch: the local client sends a dispatch package over SSH; on the remote machine the data are chunked and parallel processes are spawned.]

    [Diagram, fetch: the results of each chunk (results 1, results 2, results 3) are sent back over SSH and merged into a single set of results on the local client.]
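The chunk-and-merge idea can be sketched in plain MATLAB. This is only an illustration of data parallelism under stated assumptions, not the module's implementation: the chunks are processed one after another in a loop rather than by spawned processes, and `fun`, `numChunks`, and the splitting rule are hypothetical choices.

```matlab
% Hypothetical illustration of data chunking and merging; not UQLab code.
X = (1:10)';        % ten input points, one per row
numChunks = 3;      % e.g., one chunk per spawned process (assumed value)
fun = @(x) x.^2;    % stand-in for the model or mapped function

% Split the row indices into contiguous, nearly equal chunks.
edges = round(linspace(0, size(X, 1), numChunks + 1));

% Each chunk could be handled by its own process; here they run in turn.
chunkResults = cell(numChunks, 1);
for c = 1:numChunks
    idx = (edges(c) + 1):edges(c + 1);
    chunkResults{c} = fun(X(idx, :));
end

% Merge the chunked results back in the original order.
Y = vertcat(chunkResults{:});
disp(isequal(Y, fun(X)))   % displays 1 (true)
```

Because each chunk depends only on its own rows, the merged result is identical to a single serial evaluation, which is what makes this kind of computation easy to parallelize.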
  50. Notes on parallel execution

    Parallel execution does not necessarily mean faster execution:

    • Spawning MATLAB processes has some overhead
    • The size of the data, coupled with I/O and network performance, may become a bottleneck
  51. Notes on multiple dispatched computations

    Users may dispatch multiple computations to the remote machine before any of the executions has finished:

        uq_evalModel(myModel, X1, 'HPC');
        uq_evalModel(myModel, X2, 'HPC');
        uq_evalModel(myModel, X3, 'HPC');
        uq_map(@sum, X4, myDispatcher);

    To list all the dispatched computations associated with a Dispatcher object:

        >> uq_listJobs(myDispatcher)
        No.  Job ID  Status     Tag ...
        -----------------------------------------------------------
        1    2574    complete   uq_evalModel of <Model 1> on <25...
        2    2576    submitted  uq_evalModel of <Model 1> on <26...
        3    2577    submitted  uq_evalModel of <Model 1> on <26...
        4    2578    submitted  uq_map of <sum> on <26-Oct-2020 ...

    Without a job scheduler, the remote machine can be flooded!
  53. Notes on retrieving remote computations

    Users may exit the current UQLab session and retrieve the results later.

    Option 1: Save (before exiting) and load the Dispatcher object

    • Use uq_saveDispatcher to save the object to a file
    • Use uq_loadDispatcher to load the object from a file

    Option 2: Recreate the object and retrieve the remote computations

    • Use the same remote machine profile file to create a new object
    • Use uq_retrieveJobs to search through a remote directory and re-attach any remote computations to the current object

    This works as long as the directory on the remote machine remains intact.
  54. Use case: UQLink model evaluation in a cloud cluster

    An example of distributed computing on a (not high-performance) cluster:

    • Input files are created locally
    • The third-party code is executed on the remote machine (it must be installed there)
    • Output files are parsed locally

    [Diagram: the local client (laptops, desktops) connects over SSH to a Virtual Private Cloud consisting of a master node (VM instance), worker nodes (VM instances), and shared storage (NFS).]

    A minor modification to the UQLink model options:

        % Location of the external executable on the remote machine
        EXECPATH = '/home/cluster/code/simply-supported-beam';
        ModelOpts.ExecutablePath = EXECPATH;
        % No MATLAB on the remote machine
        ModelOpts.RemoteMATLAB = false;

    Transferring the input/output files of the third-party code over the network can be time-consuming!
  57. Use case: parametric study of a metamodel construction

    Create PCE metamodels on the 10-dimensional Truss data set:

    • Methods: 'OLS', 'LARS', 'OMP'
    • Degrees: 1, 2, 3, 4, 5
    • QoI: LOO and validation errors

    There are multiple ways to do this with uq_map. The following is just one possibility.
  58. Use case: parametric study of a metamodel construction

    The call:

        uq_map(@myWrapper, ParamSets, myDispatcher, ...
               'Parameters', Params, ...
               'UQLab', true);

    The components:

    • myWrapper: an ad hoc wrapper function to get the QoIs and provide basic error handling
    • ParamSets: the set of all possible configuration options
    • myDispatcher: the Dispatcher object
    • 'Parameters': a named argument to specify parameters, which avoids duplicating data
    • 'UQLab': a flag to load UQLab on the remote machine
  59. Use case: parametric study of a metamodel construction

    • The wrapper function

        function Error = myWrapper(MetaOpts, Params)

        rng(100, 'twister')

        MetaOpts.ExpDesign.X = Params.ExpDesign.X;
        MetaOpts.ExpDesign.Y = Params.ExpDesign.Y;
        MetaOpts.ValidationSet.X = Params.ValidationSet.X;
        MetaOpts.ValidationSet.Y = Params.ValidationSet.Y;
        MetaOpts.Input = Params.Input;
        MetaOpts.Display = 'quiet';

        try
            ModelObj = uq_createModel(MetaOpts, '-private');
            Error.Val = ModelObj.Error.Val;
            Error.LOO = ModelObj.Error.LOO;
        catch ME
            Error = ME;
        end

        end
  60. Use case: parametric study of a metamodel construction

    • The set of (independent) configuration options

        ParametricOpts.Type = 'Metamodel';
        ParametricOpts.MetaType = 'PCE';

        ParametricOpts.Degree = {1,2,3,4,5};
        ParametricOpts.Method = {'OLS','LARS','OMP'};

        ParamSets = uq_createParameterSets(ParametricOpts)

        ParamSets =
          15x1 struct array with fields:
            Type
            MetaType
            Degree
            Method
            Name
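The 15 parameter sets correspond to all combinations of the 5 degrees and the 3 methods. As a sketch of how such a full-factorial expansion can be built in plain MATLAB: this is not the implementation of `uq_createParameterSets`, the variable names are hypothetical, and the ordering of the combinations here is an assumption that may differ from UQLab's.

```matlab
% Hypothetical sketch of a full-factorial expansion; not uq_createParameterSets.
degrees = {1, 2, 3, 4, 5};
methodNames = {'OLS', 'LARS', 'OMP'};

% All index pairs (degree, method).
[iDeg, iMet] = ndgrid(1:numel(degrees), 1:numel(methodNames));
nSets = numel(iDeg);

% One struct per combination.
sets = struct('Degree', cell(nSets, 1), 'Method', cell(nSets, 1));
for k = 1:nSets
    sets(k).Degree = degrees{iDeg(k)};
    sets(k).Method = methodNames{iMet(k)};
end

disp(size(sets))   % displays 15 and 1
```

Each element of `sets` can then be passed as one input of the mapped function, which is the role `ParamSets` plays in the call above.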
  61. Use case: parametric study of a metamodel construction

    • The parameters (i.e., the constants)

        Params.ExpDesign.X = X;
        Params.ExpDesign.Y = Y;
        Params.ValidationSet.X = Xval;
        Params.ValidationSet.Y = Yval;
        Params.Input = myInput; % INPUT object already created
  62. Use case: parametric study of a metamodel construction

    • Dispatch the computation:

        >> uq_map(@myWrapper, ParamSets, myDispatcher, ...
                  'Parameters', Params, ...
                  'UQLab', true);

    • Wait for the job:

        >> uq_waitForJob(myDispatcher);

    • Fetch the results:

        >> myErrorsPCE = uq_fetchResults(myDispatcher);

    • Find the best configuration:

        >> ParamSets(bestIdx)
        ans =
          struct with fields:
                Type: 'Metamodel'
            MetaType: 'PCE'
              Degree: 4
              Method: 'OMP'
                Name: 'Model 12'
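The slides do not show how `bestIdx` is obtained. One possible way is sketched below, assuming that the fetched results form a cell array whose elements are either the error struct returned by the wrapper or an `MException` for a failed run, and that the smallest validation error defines "best". Both are assumptions, and the data in `myErrorsMock` are made-up values for illustration only.

```matlab
% Mock fetched results (made-up numbers): error structs, or an MException on failure.
myErrorsMock = { ...
    struct('Val', 0.12, 'LOO', 0.15); ...
    MException('sketch:failed', 'Metamodel construction failed.'); ...
    struct('Val', 0.03, 'LOO', 0.05)};

% Collect the validation errors; failed runs keep an infinite error.
valErrors = inf(numel(myErrorsMock), 1);
for k = 1:numel(myErrorsMock)
    if isstruct(myErrorsMock{k})
        valErrors(k) = myErrorsMock{k}.Val;
    end
end

% Index of the configuration with the smallest validation error.
[~, bestIdx] = min(valErrors);
disp(bestIdx)   % displays 3
```

Assigning an infinite error to failed runs keeps the indices aligned with the parameter sets, so `bestIdx` can be used directly to look up the winning configuration.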
  63. Use case: parametric study of a metamodel construction

    With the same approach, we can tackle the same kind of problem at a much larger scale (2'400 configuration options).

    Create Kriging metamodels on the 10-dimensional Truss data set:

    • Trend type: 'polynomial'
    • Trend degree: 0, 1, 2
    • Correlation type: 'ellipsoidal', 'separable'
    • Correlation family: 'linear', 'exponential', 'gaussian', 'matern-3_2', 'matern-5_2'
    • Isotropic correlation: true, false
    • Correlation matrix nugget: 0, 1e-10
    • Estimation method: 'ML' (maximum likelihood), 'CV' (cross validation)
    • GP regression: true, false
    • Optimization method: 'BFGS', 'GA', 'HGA', 'CMAES', 'HCMAES'
    • QoI: LOO and validation errors

    Set the Dispatcher property NumCPUs to run the computation in parallel:

        myDispatcher.NumCPUs = 8; % for example
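The figure of 2'400 configurations follows from multiplying the number of options of each Kriging setting in the list above. A short arithmetic check in plain MATLAB (the variable names are hypothetical):

```matlab
% Number of options per Kriging setting, in the order listed above:
% trend type, trend degree, correlation type, correlation family,
% isotropic, nugget, estimation method, GP regression, optimization method.
nOptions = [1 3 2 5 2 2 2 2 5];

nConfigurations = prod(nOptions);
disp(nConfigurations)   % displays 2400
```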
  66. Summary

    The HPC Dispatcher module and its capabilities have been introduced:

    • The module can help bridge the gap in using distributed computing resources (HPC or otherwise) for some computations in UQ with UQLab
    • The module's main goal is to dispatch a computation; whether dispatching makes sense depends on the case
    • Dispatched computations follow a new workflow (dispatch-and-fetch)
    • The dispatcher-aware uq_evalModel and uq_map have been introduced

    A couple of use cases where dispatching makes sense have been presented:

    • UQLink model evaluation
    • Large-scale parametric studies of the construction of UQLab metamodel (or analysis) objects

    Thank you very much for your attention!