
Life-cycle of a grid computing job

Nuno L Ferreira
January 27, 2010


Transcript

  1. Outline (slide 1/24)

    Grid & Science - EGEE
    Virtual Organizations
    enmr.eu architecture
    Grid Job Life Cycle
    Hello Grid!
    CNS tutorial
    Web Portals
  2. The Grid (slide 2/24)

    “Coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations”.
    Foster, I. et al., Int. J. Superc. Appli. (2000) 15:3
  3. Why do scientists need the Grid? (slide 3/24)

    High-energy physics (15 PB/year); 15 PB ~ 20*10^6 CDs
    Genome projects, data mining
    Tackling protein folding, protein structure, …
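The CD comparison on this slide can be checked with a quick calculation. The sketch below assumes decimal petabytes (10^15 bytes) and a 700 MB CD; neither figure is stated on the slide, and the function name is only illustrative.

```shell
# Rough check of "15 PB ~ 20*10^6 CDs".
# Assumptions (not from the slide): 1 PB = 10^15 bytes, 1 CD = 700 * 10^6 bytes.
cds_in_millions() {
    # $1 = data volume in PB; prints the equivalent number of CDs, in millions
    awk -v pb="$1" 'BEGIN { printf "%.1f\n", pb * 1e15 / 700e6 / 1e6 }'
}

cds_in_millions 15
```

Under these assumptions the result is about 21 million CDs, in line with the slide's rounded figure.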
  4. Enabling Grids for E-science (slide 4/24)

    GStat (Jan 2010): http://goc.grid.sinica.edu.tw/gstat/

    Infrastructure
      317 sites
      58 countries
      ~ 140K CPUs 24/7
      ~ 69 PB disk
    Users
      182 registered VOs
      ~ 12K registered users
      > 300K jobs / day
  5. Registered EGEE Virtual Organizations (slide 5/24)

    Application domain     Active VOs   Users
    High-energy Physics    41           4737
    Infrastructures        28           2365
    Life Sciences          10           519
    ...                    ...          ...
    Total                  182          11908

    http://cic.gridops.org/index.php?section=home&page=volist

    VO name   Scope    Registered Users (20090210)   Registered Users (20100125)
    biomed    Global   223                           257
    enmr.eu   Global   54                            155
  6. Authentication and Authorization (1/2) (slide 11/24)

    [nuno@ui-enmr ~]$ ll ~/.globus
    total 16
    -rw-r--r-- 1 nuno users 2189 Nov 14 17:18 usercert.p12
    -rw-r--r-- 1 nuno users 4947 Nov 14 17:19 usercert.pem
    -rw------- 1 nuno users  963 Nov 14 17:20 userkey.pem

    [nuno@ui-enmr ~]$ voms-proxy-init --voms enmr.eu
    Cannot find file or dir: /home/nuno/.glite/vomses
    Enter GRID pass phrase:
    Your identity: /O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Nuno Loureiro Ferreira
    Creating temporary proxy ........................... Done
    Contacting voms-02.pd.infn.it:15014 [/C=IT/O=INFN/OU=Host/L=Padova/CN=voms-02.pd.infn.it] "enmr.eu" Done
    Creating proxy .......................... Done
    Your proxy is valid until Wed Jan 27 03:44:48 2010

    [nuno@ui-enmr ~]$ grid-cert-info -s -i -sd -ed
    /O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Nuno Loureiro Ferreira
    /C=NL/O=NIKHEF/CN=NIKHEF medium-security certification auth
    Oct 23 00:00:00 2009 GMT
    Oct 23 15:15:43 2010 GMT
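In the listing above, userkey.pem is readable and writable by its owner only, and grid tools typically refuse a private key with looser permissions. The sketch below is one way to check this before creating a proxy; the function name `key_is_private` is hypothetical, and the accepted modes (400 or 600) are an assumption based on common Globus practice rather than something the slides state.

```shell
# Return 0 if the file's mode is 400 or 600 (owner-only access), 1 otherwise.
# The mode string is read from "ls -ld"; a trailing "*" in the pattern tolerates
# the extra ACL/SELinux marker that some systems append.
key_is_private() {
    mode=$(ls -ld "$1" | awk '{ print $1 }')
    case "$mode" in
        -r[w-]-------*) return 0 ;;
        *)              return 1 ;;
    esac
}

# Possible use on the UI (path as in the listing above):
#   key_is_private "$HOME/.globus/userkey.pem" || chmod 600 "$HOME/.globus/userkey.pem"
```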
  7. Authentication and Authorization (2/2) (slide 12/24)

    [nuno@ui-enmr ~]$ voms-proxy-init --voms enmr.eu
    Cannot find file or dir: /home/nuno/.glite/vomses
    Enter GRID pass phrase:
    Your identity: /O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Nuno Loureiro Ferreira
    Creating temporary proxy ............................................... Done
    Contacting voms2.cnaf.infn.it:15014 [/C=IT/O=INFN/OU=Host/L=CNAF/CN=voms2.cnaf.infn.it] "enmr.
    Creating proxy ............................................. Done
    Your proxy is valid until Wed Jan 27 03:54:00 2010

    [nuno@ui-enmr ~]$ voms-proxy-info
    subject   : /O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Nuno Loureiro Ferreira/CN=pr
    issuer    : /O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Nuno Loureiro Ferreira
    identity  : /O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Nuno Loureiro Ferreira
    type      : proxy
    strength  : 1024 bits
    path      : /tmp/x509up_u500
    timeleft  : 11:56:19
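Since the proxy is valid for a limited time only, a script that submits many jobs may want to check the remaining lifetime first. The sketch below converts the `timeleft` field of the `voms-proxy-info` output to seconds; the function name `proxy_seconds_left` is hypothetical, and the parsing assumes the `label : HH:MM:SS` layout seen in the session above.

```shell
# Read voms-proxy-info output on stdin and print the remaining lifetime in seconds.
# Assumes a line of the form "timeleft : HH:MM:SS", as in the session above.
proxy_seconds_left() {
    awk '$1 == "timeleft" {
        split($NF, t, ":")
        print t[1] * 3600 + t[2] * 60 + t[3]
    }'
}

# Sample line copied from the session above.
printf 'timeleft : 11:56:19\n' | proxy_seconds_left

# On a UI one would pipe the real command instead:
#   voms-proxy-info | proxy_seconds_left
```

A caller could compare the result with the expected job duration and rerun `voms-proxy-init --voms enmr.eu` when the margin is too small.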
  8. Available resources (slide 14/24)

    [nuno@ui-enmr bcbr]$ lcg-infosites --vo enmr.eu ce all
    #CPU   Free   Total Jobs   Running   Waiting   ComputingElement
    ----------------------------------------------------------
     399     20           85        57        28   grid-ce-01.ba.infn.it:2119/jobmanager-lcgpbs-short
      16      7            9         9         0   ce-enmr.chem.uu.nl:2119/jobmanager-lcgpbs-medium
      88     88            0         0         0   glite-ce.grid.uj.ac.za:8443/cream-pbs-long
    2460    906          103       103         0   trekker.nikhef.nl:2119/jobmanager-pbs-medium
    1632   1584           45        45         0   deimos.htc.biggrid.nl:2119/jobmanager-pbs-medium
     200      0            0         0         0   t2-ce-05.lnl.infn.it:8443/cream-lsf-enmr1
    … snip …

    Avail Space(Kb)   Used Space(Kb)   Type   SEs
    ----------------------------------------------------------
    2444576886        555136905        n.a    prod-se-01.pd.infn.it
    3127661680        1371977164       n.a    prod-se-02.pd.infn.it
    1858674692        106001211        n.a    se-enmr.chem.uu.nl
    13828076063       21152016643      n.a    se01.dur.scotgrid.ac.uk
    … snip …
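The ComputingElement listing lends itself to simple filtering, for instance to see which CE currently reports the most free CPUs. The sketch below assumes the six-column row layout shown above; the function names `most_free_ce` and `sample_ce_rows` are hypothetical, and the sample rows are copied from the slide.

```shell
# Print the ComputingElement with the most free CPUs.
# Expects rows on stdin: #CPU  Free  TotalJobs  Running  Waiting  ComputingElement
# Header, separator and "snip" lines are skipped because they do not match.
most_free_ce() {
    awk 'BEGIN { best = -1 }
         NF == 6 && $1 ~ /^[0-9]+$/ && $2 + 0 > best { best = $2 + 0; ce = $6 }
         END { if (ce != "") print ce }'
}

# Sample rows copied from the slide.
sample_ce_rows() {
    cat <<'EOF'
#CPU Free Total Jobs Running Waiting ComputingElement
399 20 85 57 28 grid-ce-01.ba.infn.it:2119/jobmanager-lcgpbs-short
16 7 9 9 0 ce-enmr.chem.uu.nl:2119/jobmanager-lcgpbs-medium
88 88 0 0 0 glite-ce.grid.uj.ac.za:8443/cream-pbs-long
2460 906 103 103 0 trekker.nikhef.nl:2119/jobmanager-pbs-medium
1632 1584 45 45 0 deimos.htc.biggrid.nl:2119/jobmanager-pbs-medium
200 0 0 0 0 t2-ce-05.lnl.infn.it:8443/cream-lsf-enmr1
EOF
}

sample_ce_rows | most_free_ce
```

In normal use the workload management system makes this choice itself; the filter is only a convenience for inspecting the listing by hand.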
  9. Submit a job (slide 15/24)

    [nuno@ui-enmr bcbr]$ glite-wms-job-submit -a -o jid hello.jdl

    Connecting to the service https://wms-enmr.chem.uu.nl:7443/glite_wms_wmproxy_server

    ====================== glite-wms-job-submit Success ======================
    The job has been successfully submitted to the WMProxy
    Your job identifier is:
    https://lb-enmr.chem.uu.nl:9000/gOtqQuG4ebqpz3m5z8_2Eg
    The job identifier has been saved in the following file:
    /home/nuno/grid/hello/bcbr/jid
    ==========================================================================
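The deck does not show `hello.jdl` or the script it runs. The sketch below is a hypothetical reconstruction, modelled on the JDL attributes used in the CNS example and on the retrieved files (`hello.out`, `hello.err`) and greeting that appear later in the deck. The script name `hello.sh`, its contents, and the helper `write_hello_job` are assumptions.

```shell
# Hypothetical reconstruction: neither hello.sh nor hello.jdl is shown in the slides.
write_hello_job() {
    # $1 = directory in which to create hello.sh and hello.jdl

    # Job payload: print a greeting with the name of the worker node it runs on.
    cat > "$1/hello.sh" <<'EOF'
#!/bin/sh
echo "Hello Grid! I was here : $(hostname)"
EOF
    chmod +x "$1/hello.sh"

    # Job description: ship the script, capture stdout/stderr, return both files.
    cat > "$1/hello.jdl" <<'EOF'
Executable    = "hello.sh";
StdOutput     = "hello.out";
StdError      = "hello.err";
InputSandbox  = {"hello.sh"};
OutputSandbox = {"hello.out", "hello.err"};
EOF
}

# Possible use, followed by the submit command from the slide:
#   write_hello_job . && glite-wms-job-submit -a -o jid hello.jdl
```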
  10. Query Job Status (slide 16/24)

    [nuno@ui-enmr bcbr]$ glite-wms-job-status -i jid

    *************************************************************
    BOOKKEEPING INFORMATION:

    Status info for the Job : https://lb-enmr.chem.uu.nl:9000/gOtqQuG4ebqpz3m5z8_2Eg
    Current Status: Scheduled
    Status Reason:  Job successfully submitted to Globus
    Destination:    pbs-enmr.cerm.unifi.it:2119/jobmanager-lcgpbs-verylong
    Submitted:      Tue Jan 26 16:26:07 2010 CET
    *************************************************************

    [nuno@ui-enmr bcbr]$ glite-wms-job-status -i jid

    *************************************************************
    BOOKKEEPING INFORMATION:

    Status info for the Job : https://lb-enmr.chem.uu.nl:9000/gOtqQuG4ebqpz3m5z8_2Eg
    Current Status: Done (Success)
    Exit code:      0
    Status Reason:  Job terminated successfully
    Destination:    pbs-enmr.cerm.unifi.it:2119/jobmanager-lcgpbs-verylong
    Submitted:      Tue Jan 26 16:26:07 2010 CET
    *************************************************************
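Rather than rerunning the status command by hand, the "Current Status" line can be extracted and polled in a loop. The sketch below is a minimal version; the function names, the 60-second interval and the list of final states (`Done`, `Aborted`, `Cancelled`, `Cleared`) are assumptions and may need adjusting for a given middleware release.

```shell
# Extract the value of "Current Status:" from glite-wms-job-status output on stdin.
job_state() {
    sed -n 's/^[[:space:]]*Current Status:[[:space:]]*//p' |
        sed 's/[[:space:]]*$//' |
        head -n 1
}

# Poll until the job reaches a final state.
# $1 = file holding the job identifier (the "jid" file written at submit time).
wait_for_job() {
    while :; do
        state=$(glite-wms-job-status -i "$1" | job_state)
        echo "$(date) $state"
        case "$state" in
            Done*|Aborted*|Cancelled*|Cleared*) break ;;
        esac
        sleep 60   # assumed interval; be gentle with the logging service
    done
}

# Sample lines copied from the session above.
printf 'Current Status: Done (Success)\nExit code: 0\n' | job_state
```

Only `job_state` runs when this snippet is executed as is; `wait_for_job` needs a UI with the gLite client tools and a valid proxy.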
  11. Retrieve Job Output (slide 17/24)

    [nuno@ui-enmr bcbr]$ glite-wms-job-output -i jid --dir ./out

    Connecting to the service https://wms-enmr.chem.uu.nl:7443/glite_wms_wmproxy_server

    ================================================================================
    JOB GET OUTPUT OUTCOME
    Output sandbox files for the job:
    https://lb-enmr.chem.uu.nl:9000/gOtqQuG4ebqpz3m5z8_2Eg
    have been successfully retrieved and stored in the directory:
    /home/nuno/grid/hello/bcbr/out
    ================================================================================

    [nuno@ui-enmr bcbr]$ ll ./out/
    total 4
    -rw-r--r-- 1 nuno users  0 Jan 26 17:31 hello.err
    -rw-r--r-- 1 nuno users 48 Jan 26 17:31 hello.out

    [nuno@ui-enmr bcbr]$ more ./out/hello.out
    Hello Grid! I was here : wn3-enmr.cerm.unifi.it
  12. CNS example (1/3) (slide 18/24)

    [nuno@ui-enmr cns-example]$ ll
    total 160
    -rw-r--r-- 1 nuno users 144884 Mar 18  2009 cns-input.tgz
    -rw-r--r-- 1 nuno users   1529 Mar 18  2009 README
    -rwxr-xr-x 1 nuno users    134 Mar 18  2009 run-cns
    -rw-r--r-- 1 nuno users    229 Jan 17 17:58 run-cns.jdl

    [nuno@ui-enmr cns-example]$ tar tvzf cns-input.tgz
    -rw-r--r-- abonvin/staff  30070 2008-05-06 12:42:33 CaMM13Tmpcs1.tbl
    -rw-r--r-- abonvin/staff  16946 2008-05-06 12:42:33 CaMM13Tmrdc1.tbl
    -rw-r--r-- abonvin/staff    912 2008-05-06 12:44:53 README
    -rw-r--r-- abonvin/staff 208142 2008-05-06 12:42:33 calmodulin-MM13.pdb
    -rw-r--r-- abonvin/staff 341327 2008-05-06 12:42:33 calmodulin-MM13.psf
    -rw-r--r-- abonvin/staff   4982 2008-05-06 12:42:33 ion.param
    -rw-r--r-- abonvin/staff 158398 2008-05-06 12:42:33 noes.tbl
    -rw-r--r-- abonvin/staff    548 2008-05-06 12:42:33 par_axis.pro
    -rw-r--r-- abonvin/staff  74090 2008-05-06 12:42:33 parallhdg5.3.pro
    -rw-r--r-- abonvin/staff  16549 2008-05-06 12:42:33 phipsi.tbl
    -rw-r--r-- abonvin/staff   9571 2008-05-06 12:42:33 sa-test.inp
    -rw-r--r-- abonvin/staff    273 2008-05-06 12:42:33 tensor.pdb
    -rw-r--r-- abonvin/staff   1181 2008-05-06 12:42:33 tensor.psf
    -rw-r--r-- abonvin/staff     57 2008-05-06 12:42:33 tensor.tbl

    http://www.enmr.eu/eNMR-tutorials
  13. CNS example (2/3) (slide 19/24)

    [nuno@ui-enmr cns-example]$ more run-cns
    source $VO_ENMR_EU_SW_DIR/BCBR/cns/1.2-para/set_cns.bash
    tar xfz cns-input.tgz
    cns < sa-test.inp > sa-test.out
    tar cvfz cns-output.tgz *

    [nuno@ui-enmr cns-example]$ more run-cns.jdl
    Executable = "run-cns";
    StdOutput = "std.out";
    StdError = "std.err";
    InputSandbox = {"cns-input.tgz","run-cns"};
    OutputSandbox = {"std.out", "std.err","cns-output.tgz"};
    Requirements = RegExp ("chem.uu.nl",other.GlueCEUniqueId);
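The `Requirements` line restricts the job to ComputingElements whose identifier matches the pattern `chem.uu.nl`. Its effect can be approximated locally with `grep -E`, as in the sketch below. This is only an approximation: the regular-expression dialect used by the matchmaker may differ from grep's, the function names are hypothetical, and the CE identifiers are those from the `lcg-infosites` listing earlier in the deck.

```shell
# Approximate RegExp("chem.uu.nl", other.GlueCEUniqueId) with grep -E:
# keep only the CE identifiers that contain a match for the pattern.
matching_ces() {
    # $1 = regular expression; CE identifiers are read from stdin, one per line
    grep -E -- "$1"
}

# CE identifiers copied from the lcg-infosites listing.
ce_ids() {
    cat <<'EOF'
grid-ce-01.ba.infn.it:2119/jobmanager-lcgpbs-short
ce-enmr.chem.uu.nl:2119/jobmanager-lcgpbs-medium
glite-ce.grid.uj.ac.za:8443/cream-pbs-long
trekker.nikhef.nl:2119/jobmanager-pbs-medium
deimos.htc.biggrid.nl:2119/jobmanager-pbs-medium
t2-ce-05.lnl.infn.it:8443/cream-lsf-enmr1
EOF
}

ce_ids | matching_ces 'chem.uu.nl'
```

Of the six CEs listed, only `ce-enmr.chem.uu.nl:2119/jobmanager-lcgpbs-medium` matches. Note that an unescaped `.` matches any character, so `chem\.uu\.nl` would be the stricter pattern.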
  14. CNS example (3/3) (slide 20/24)

    [nuno@ui-enmr cns-example]$ glite-wms-job-submit -a -o jid run-cns.jdl
    [nuno@ui-enmr cns-example]$ glite-wms-job-output -i jid --dir ./
    [nuno@ui-enmr cns-example]$ ll
    total 24464
    -rw-r--r-- 1 nuno users   144884 Mar 18  2009 cns-input.tgz
    -rw-r--r-- 1 nuno users 24854174 Jan 26 18:24 cns-output.tgz
    -rw-r--r-- 1 nuno users       79 Jan 26 17:13 jid
    -rw-r--r-- 1 nuno users     1529 Mar 18  2009 README
    -rwxr-xr-x 1 nuno users      137 Jan 26 17:12 run-cns
    -rw-r--r-- 1 nuno users      229 Jan 17 17:58 run-cns.jdl

    [nuno@ui-enmr out]$ more sa_1.pdb
    REMARK FILENAME="/home/enmr016/globus-tmp.wn23-enmr.25892.0/https_3a_2f_2flb-"
    … snip …
    REMARK DATE:26-Jan-2010 17:29:14 created by user: enmr016
    REMARK VERSION:1.2
    ATOM 1 HA  ALA 1 1.868 27.047 -8.664 1.00 15.00 A
    ATOM 2 CB  ALA 1 0.511 28.488 -7.902 1.00 15.00 A
    ATOM 3 HB1 ALA 1 0.379 28.981 -8.854 1.00 15.00 A
    … snip …
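Once the output archive is unpacked, a quick sanity check is to count the ATOM records in a resulting structure file. The sketch below does this with awk; the function name `count_atoms` is hypothetical, and the sample lines are the ones shown on the slide.

```shell
# Count ATOM records in PDB-format text (read from the named files, or stdin).
count_atoms() {
    awk '$1 == "ATOM" { n++ } END { print n + 0 }' "$@"
}

# Sample lines copied from the slide.
count_atoms <<'EOF'
REMARK DATE:26-Jan-2010 17:29:14 created by user: enmr016
REMARK VERSION:1.2
ATOM 1 HA ALA 1 1.868 27.047 -8.664 1.00 15.00 A
ATOM 2 CB ALA 1 0.511 28.488 -7.902 1.00 15.00 A
ATOM 3 HB1 ALA 1 0.379 28.981 -8.854 1.00 15.00 A
EOF

# On the UI, after "tar xzf cns-output.tgz", one might run:
#   count_atoms sa_1.pdb
```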
  15. Zwartkijken / Idées Noires - Franquin (slide 23/24)

    “Life cycle of a GRID computing job? That's something like: conception.., abortion.., conception.., birth.., premature death.., reanimation.., etc? :p T.”
    20100127 – 11AM
  16. Acknowledgments (slide 24/24)

    Big-Picture layer
      /O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Alexandre Bonvin
      Rolf Boleans
    Hardware-layer
      /O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Johan van der Zwan
    Middleware layer
      /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Cristina Aiftimiei
    Application layer
      /O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Marc van Dijk
      /O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Sjoerd De Vries
      /O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Tsjerk Wassenaar
    User layer
      /O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=*.*
  17. Building a Grid (slide 27/24)

    1. The architecture
    2. The hardware
    3. The middleware

    Network Resources Middleware Application User-centric