The grid: a primer

Nuno L Ferreira
February 11, 2009

Transcript

  1. 1/24 Outline

    - The Grid concept
    - Grid architecture
    - Middleware – the core
    - Interact with the Grid: first steps
    - Virtual Organizations
    - enmr.eu VO
    - Site grid administration
  2. 3/24 One step further … The Grid

    Network infrastructure: ON
    Global sharing of resources: ON
    “Coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations.”*
    * Foster, I. et al., Int. J. Superc. Appli. (2000) 15:3
  3. 4/24 Why do scientists need the Grid?

    High-energy physics (15 PB/year); 15 PB ~ 20*10^6 CDs
    Complex problems! Many iterations! Virtual cooperation!
    Genome projects, data mining, tackling protein folding, protein structure, …
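
The "15 PB ~ 20*10^6 CDs" estimate on this slide can be checked with a few lines of arithmetic. This is a minimal sketch: the 700 MB CD capacity and the decimal unit convention are assumptions, since the slide states neither.

```python
# Rough check of the "15 PB ~ 20*10^6 CDs" figure from the slide.
# Assumption: one CD holds 700 MB, and units are decimal (1 PB = 10**15 bytes).
PETABYTE = 10**15                   # bytes, decimal convention (assumed)
MEGABYTE = 10**6                    # bytes, decimal convention (assumed)
CD_CAPACITY_BYTES = 700 * MEGABYTE  # assumed CD size, not given on the slide


def cds_needed(data_bytes: int, cd_bytes: int = CD_CAPACITY_BYTES) -> int:
    """Return the number of CDs needed to hold data_bytes, rounded up."""
    return -(-data_bytes // cd_bytes)  # ceiling division on integers


yearly_data = 15 * PETABYTE
print(f"{cds_needed(yearly_data):,} CDs per year")  # about 21 million
```

Under these assumptions the result is a little over 21 million CDs, consistent with the slide's order-of-magnitude figure.
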
  4. 7/24 Building a Grid - Grid Fabric (I)

    Delivery of Advanced Network Technology to Europe
    State-of-the-art (1985) = 56 Kbps
    Network characterization:
    - Size
    - Throughput
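
To get a feel for the 56 Kbps figure, the sketch below estimates how long a single file would take to cross such a link. The 700 MB file size (one CD) is an assumption chosen for illustration, and protocol overhead is ignored.

```python
# Transfer-time estimate for the 1985 state-of-the-art link quoted on the slide.
LINK_1985_BPS = 56_000       # 56 Kbps, from the slide
CD_BYTES = 700 * 10**6       # assumed payload: one 700 MB CD (not from the slide)


def transfer_seconds(size_bytes: float, link_bits_per_second: float) -> float:
    """Ideal transfer time in seconds, ignoring protocol overhead and latency."""
    return size_bytes * 8 / link_bits_per_second


seconds = transfer_seconds(CD_BYTES, LINK_1985_BPS)
print(f"{seconds:.0f} s, or about {seconds / 3600:.1f} hours")
```

With these numbers a single CD's worth of data needs more than a day on the link, which is why throughput is one of the two characteristics listed for the network.
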
  5. 8/24 Building a Grid - Grid fabric (II)

    Computer performance

    #     Syst. Family    Rmax. (GFps)
    1     IBM cluster     1105
    52    IBM pSeries     48.9
    75    IBM BlueGene    35.1
    93    IBM BlueGene    27.5

    A flop is a basic computational operation
  6. 9/24 Building a Grid - middleware

    "Middleware" is the software that organizes and integrates the resources in a grid.
    gLite*
    * https://twiki.cern.ch/twiki/bin/view/LCG/GenericInstallGuide310
  7. 10/24 How to interact with the Grid

    The UI service
    3 ways to access the Grid: UI service, Web portal, or UI PnP
  8. 11/24 Enabling Grids for E-science

    - > 140 institutions
    - > 300 sites
    - 50 countries
    - > 10,000 users
    - > 80,000 CPU cores
    24/7
    WOULD YOU TRUST YOUR COMPUTER TO A COMPLETE STRANGER?
    Worldwide LHC Computing Grid (WLCG)
  9. 12/24 Registered EGEE Virtual Organizations

    Application domain     Active VOs    Users
    High-energy Physics    36            7994
    Life Sciences          8             333
    ...                    ...           ...
    Total                  155           16263

    Stats: 10 Feb 2009

    VO name    Scope               Registered Users
    biomd      Global              223
    bio        Regional - Italy    57
    enmr.eu    Global              54
  10. 19/24 Fill the form AND pay attention!

    1. Organization
    2. Organizational Unit
    3. Certificate Level: medium
    4. Check & re-check!

    RA: A. Bonvin
    http://ca.dutchgrid.nl/request/ + ID card
    DN: /O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Your Name
    Proof of Possession
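
The DN on this slide is a slash-separated list of attribute=value pairs, and the form fields (Organization, Organizational Unit, name) end up as its components. The sketch below splits such a string so each field can be re-checked before submitting. The `parse_dn` helper is hypothetical, and it assumes that no value contains a "/" or an escaped character.

```python
# Hypothetical helper for illustration; not part of any grid middleware.
def parse_dn(dn: str) -> list[tuple[str, str]]:
    """Split a slash-separated DN into (attribute, value) pairs, in order.

    Assumes values contain no "/" and no escaped characters.
    """
    pairs = []
    for part in dn.strip().split("/"):
        if not part:
            continue  # skip the empty string before the leading "/"
        attribute, _, value = part.partition("=")
        pairs.append((attribute, value))
    return pairs


dn = "/O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Your Name"
fields = parse_dn(dn)
common_name = next(value for attribute, value in fields if attribute == "CN")
print(fields)
print(common_name)
```

Keeping the pairs in a list rather than a dictionary preserves the repeated `O` attributes and their order.
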
  11. 22/24 Wise sentence …

    “If you think this is cumbersome… it is nothing compared to get the grid running.”
    van der Zwan, J.; 26-01-2009 14:43
  12. 23/24 Site Grid Administration: a glimpse

    Goal:
    - keep the grid running 24/7
    Facts:
    - more than 30 middleware updates/year
    - bugs, bugs, and more bugs
    - … nevertheless the grid is running
    How to deal with it:
    - Test before putting a service into production
    - Any more ideas?
    Sandbox:
    - Pre-production: test, destroy, and re-build
    - The art of computer virtualization*: takes 2 min.
    * http://www.xen.org/
  13. 24/24 Acknowledgments

    Hardware-centric / Middleware layer / Application layer / User abstraction

    /O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Alexandre Bonvin
    /O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Johan van der Zwan
    /O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Marc van Dijk
    /O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Sjoerd De Vries
    /O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Tsjerk Wassenaar
    *.*
  14. 25/24 Questions asked / Comments

    - Virus protection: are there any mechanisms?
    - Is there a priority system for the use of resources?
    - Panos liked the presentation, except for the slide about Grid admin (he said it was out of context).
    - In a heterogeneous system, different results are obtained for the same initial problem. However, this also happens in the laboratory. It is nevertheless possible to choose which machines to use on the grid and which not to use.
    - Klartje asked whether it is possible to put other programs on the Grid. Bonvin answered that it is possible to send the program along with the data.
    - Dirk asked whether the communication between the computers is encrypted.
  15. 26/24 Moore’s Law

    Some STATS:
    1. Computer power doubles every … 18 months
    2. Network performance doubles every … 9 months
    3. Data storage density is doubling every … 12 months

    “The number of transistors that could be squeezed on to a silicon chip was doubling every year.” Moore, G. 1965

    With every year that passes, the Grid concept becomes more feasible:
    - Distributed processors can be more tightly integrated
    - Computer grids are increasingly able to solve increasingly complex problems
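
The three doubling times on this slide translate into very different growth factors over the same period. The sketch below computes them for a 3-year horizon, using factor = 2^(elapsed months / doubling months). The horizon is an arbitrary choice for illustration, and steady exponential growth is assumed.

```python
# Doubling times in months, taken from the slide.
DOUBLING_MONTHS = {
    "computer power": 18,
    "network performance": 9,
    "data storage density": 12,
}


def growth_factor(years: float, doubling_months: float) -> float:
    """Growth factor after `years`, assuming steady exponential doubling."""
    return 2 ** (years * 12 / doubling_months)


YEARS = 3  # illustrative horizon, not from the slide
for name, months in DOUBLING_MONTHS.items():
    print(f"{name}: x{growth_factor(YEARS, months):.0f} after {YEARS} years")
```

Over 3 years this gives factors of 4, 16, and 8 respectively: under the slide's figures, the network improves fastest, which is what makes tightly integrating distributed processors increasingly practical.
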
  16. 27/24 gLite INFNGRID – deployment status

    INFNGRID gLite 3.1 (SL4)

    Update    Date
    40        ? (04 Feb 2009, CERN)
    38-39     23 Jan 2009
    35-37     05 Dec 2008
    32-34     07 Nov 2008
    30-31     23 Sep 2008
    ...       ...
    13        19 Feb 2008
    ...       ...
  17. 29/24 The GRID architecture: general view

    - The GRID is a collection of geographically distributed resources
    - GRID users:
      - Are organized in Virtual Organizations
      - Need to run programs without needing to know
        - Where to run a job
        - Where to get the input data from
        - Where to store the output data to
    - The GRID consists of
      - An Authorisation and Authentication System
      - An Information System
      - A Workload Management System
      - A Data Management System
      - An Accounting System
      - Various monitoring services
      - Various installation services
  18. 30/24 The GRID architecture: general view

    - The Authentication and Authorization System:
      - Contains the list of all the people authorized to use the GRID, divided by VO
      - All machines running Grid services
        - verify the users' credentials
        - map the GRID users to the local users of the machine
    - The Information System:
      - Provides information about gLite resources and their statuses
      - Information is published by the individual resources and copied into central databases
      - Used by:
        - WMS: to match resources against job requirements and to rank them
        - DMS: to choose storage resources
        - monitoring systems
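
The authorization step described on this slide (check the per-VO member list, then map the grid user to a local user) can be sketched as a simple lookup. Everything below is hypothetical illustration: the member list, the local account name `enmr001`, and the `map_to_local_user` function are invented for the example, and real certificate verification is not modelled.

```python
# Hypothetical per-VO member lists, keyed by VO name (illustrative data only).
KNOWN_DN = "/O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Your Name"
VO_MEMBERS = {
    "enmr.eu": {KNOWN_DN},
}

# Hypothetical local account used on this machine for each VO.
LOCAL_ACCOUNT = {"enmr.eu": "enmr001"}


def map_to_local_user(dn: str, vo: str) -> str:
    """Return the local account for an authorized grid user.

    Raises PermissionError if the DN is not on the VO's member list.
    Credential (certificate) verification is assumed to have happened already.
    """
    if dn not in VO_MEMBERS.get(vo, set()):
        raise PermissionError(f"{dn} is not a registered member of {vo}")
    return LOCAL_ACCOUNT[vo]


print(map_to_local_user(KNOWN_DN, "enmr.eu"))
```

A user who is not on the list is rejected before any local account is involved, so the site only needs to trust the VO's membership list rather than each individual.
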
  19. 31/24 The GRID architecture: general view

    - The Workload Management System:
      - manages jobs submitted by users
      - matches the job requirements to the available resources
      - schedules the job for execution on an appropriate computing cluster
      - tracks the job status
      - allows the user to retrieve the job output when ready
    - The Data Management System:
      - Allows users to
        - move files in and out of the Grid
        - replicate files among different locations
        - locate files
      - This is achieved by:
        - transferring data via a number of protocols (GridFTP is the most commonly used)
        - interacting with a central file catalog
  20. 32/24 The GRID architecture: general view

    - Monitoring Services:
      - GridICE: monitors the usage of Grid resources (# jobs running, the storage space available, …)
      - R-GMA allows users to
        - monitor applications
        - store results in a relational database
      - Some Monitoring Systems check the status of Grid services
        - more intended for the GRID operations staff
    - Dedicated Fabric Management Services:
      - manage installation, upgrade, and maintenance of local Grid services
        - LCFGng (discontinued)
        - Quattor
        - YAIM (semi-automatic tool based on APT/YUM and shell scripts)
  21. 33/24 Grid analogy

    Electrical Power-Grid: You never worry about where the electricity you are using comes from.
    The Grid: You would never worry about where the computer power you are using comes from.

    Electrical Power-Grid: The infrastructure that makes this possible is called "the power grid".
    The Grid: The infrastructure that makes this possible is called "the Grid".

    Electrical Power-Grid: The power grid is pervasive: electricity is available essentially everywhere and you can simply access it through a standard wall socket.
    The Grid: The Grid would be pervasive: remote computing resources would be accessible from different platforms, and you would simply access the Grid through your web browser.

    Electrical Power-Grid: The power grid is a utility: you ask for electricity, and you get it. You also pay for what you get.
    The Grid: The Grid is a utility: you ask for computer power or storage capacity and you get it. You also pay for what you get.

    "The Grid" doesn't yet exist in this form; however, the world already has hundreds of smaller grids...
  22. 35/24 Evolution of the HDD

    Morris, R.J.T. et al., IBM Systems Journal, 2003, vol. 42, no. 2, p. 205