Slide 1

Slide 1 text

The Grid: a primer
11 February 2009

Slide 2

Slide 2 text

1/24 Outline
- The Grid concept
- Grid architecture
- Middleware – the core
- Interacting with the Grid: first steps
- Virtual Organizations
- enmr.eu VO
- Site grid administration

Slide 3

Slide 3 text

2/24 Our world … today!
- Network infrastructure: ON
- Global sharing of resources: “OFF”

Slide 4

Slide 4 text

3/24 One step further … The Grid
- Network infrastructure: ON
- Global sharing of resources: ON
“Coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations”.*
* Foster, I. et al., Int. J. Superc. Appli. (2000) 15:3

Slide 5

Slide 5 text

4/24 Why do scientists need the Grid?
- High-energy physics: 15 PB/year (15 PB ~ 20*10^6 CDs)
- Complex problems!
- Many iterations!
- Virtual cooperation!
- Genome projects, data mining, tackling protein folding, protein structure, …
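
As a rough check of the CD comparison above, the sketch below converts a yearly data volume into a number of CDs. The 700 MB CD capacity and the decimal units (1 PB = 10^15 bytes) are assumptions; the slide does not state them.

```python
# Rough check of "15 PB ~ 20*10^6 CDs".
# Assumptions (not stated on the slide): decimal units and a 700 MB CD.
PB = 10**15  # bytes in a petabyte (decimal)
MB = 10**6   # bytes in a megabyte (decimal)


def cds_needed(data_bytes, cd_capacity_bytes=700 * MB):
    """Return the number of CDs needed to hold data_bytes (rounded up)."""
    return -(-data_bytes // cd_capacity_bytes)  # ceiling division on integers


if __name__ == "__main__":
    n = cds_needed(15 * PB)
    print(f"15 PB needs about {n:,} CDs")
```

Under these assumptions the result lands slightly above 21 million, which agrees with the order of magnitude quoted on the slide.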

Slide 6

Slide 6 text

5/24 Building a Grid
1. The architecture
2. The hardware
3. The middleware

Slide 7

Slide 7 text

6/24 Building a Grid - architecture
- Network
- Resources
- Middleware
- Application
User-centric

Slide 8

Slide 8 text

7/24 Building a Grid - Grid fabric (I)
Delivery of Advanced Network Technology to Europe
State of the art (1985) = 56 kbps
Network characterization:
- Size
- Throughput

Slide 9

Slide 9 text

8/24 Building a Grid - Grid fabric (II)
Computer performance

#    System family   Rmax (TFlop/s)
1    IBM cluster     1105
52   IBM pSeries     48.9
75   IBM BlueGene    35.1
93   IBM BlueGene    27.5

A flop (floating-point operation) is a basic computational operation.

Slide 10

Slide 10 text

9/24 Building a Grid - middleware "Middleware" is the software that organizes and integrates the resources in a grid. https://twiki.cern.ch/twiki/bin/view/LCG/GenericInstallGuide310 gLite*

Slide 11

Slide 11 text

10/24 How to interact with the Grid
The UI service
3 ways to access the Grid: UI service, Web portal, or UI PnP

Slide 12

Slide 12 text

11/24 Enabling Grids for E-science
- > 140 institutions
- > 300 sites
- 50 countries
- > 10,000 users
- > 80,000 CPU cores
- 24/7
WOULD YOU TRUST YOUR COMPUTER TO A COMPLETE STRANGER?
Worldwide LHC Computing Grid (WLCG)

Slide 13

Slide 13 text

12/24 Registered EGEE Virtual Organizations
Stats: 10 Feb 2009

Application domain    Active VOs   Users
High-energy Physics   36           7994
Life Sciences         8            333
...                   ...          ...
Total                 155          16263

VO name   Scope              Registered users
biomed    Global             223
bio       Regional - Italy   57
enmr.eu   Global             54

Slide 14

Slide 14 text

13/24 New application web portal http://haddock.chem.uu.nl/enmr

Slide 15

Slide 15 text

14/24 www.enmr.eu

Slide 16

Slide 16 text

15/24 The eyes of the Grid http://gridice-enmr.cerm.unifi.it/site/site.php

Slide 17

Slide 17 text

16/24 How to become an enmr.eu user www.gridcafe.org

Slide 18

Slide 18 text

17/24 Trust is the key!

Slide 19

Slide 19 text

18/24 http://ca.dutchgrid.nl/request/

Slide 20

Slide 20 text

19/24 Fill in the form AND pay attention!
1. Organization
2. Organizational Unit
3. Certificate Level: medium
4. Check & re-check!
RA: A. Bonvin
http://ca.dutchgrid.nl/request/ + ID card
DN: /O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Your Name
Proof of Possession
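
To make the structure of the DN above easier to check before submitting the form, here is a minimal sketch that splits a slash-delimited DN into its attribute/value pairs. The helper names `parse_dn` and `common_name` are hypothetical, and the sketch assumes that no value itself contains a "/" character.

```python
def parse_dn(dn):
    """Split a slash-delimited DN into a list of (attribute, value) pairs.

    Assumes values do not contain '/' (a simplification for illustration).
    """
    pairs = []
    for part in dn.strip("/").split("/"):
        key, _, value = part.partition("=")
        pairs.append((key, value))
    return pairs


def common_name(dn):
    """Return the value of the first CN attribute, or None if absent."""
    for key, value in parse_dn(dn):
        if key == "CN":
            return value
    return None


if __name__ == "__main__":
    dn = "/O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Your Name"
    for key, value in parse_dn(dn):
        print(f"{key:>2} = {value}")
```

Printing the pairs one per line makes a wrong Organization or Organizational Unit easy to spot, which is the "check & re-check" step on the slide.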

Slide 21

Slide 21 text

20/24 Wait a couple of days …

Slide 22

Slide 22 text

21/24 https://voms2.cnaf.infn.it:8443/voms/enmr.eu/Login.do
Follow the email instructions!
Problems? Alexandre, Johan, Nuno

Slide 23

Slide 23 text

22/24 Wise sentence … “If you think this is cumbersome… it is nothing compared to get the grid running.” van der Zwan, J. ; 26-01-2009 14:43

Slide 24

Slide 24 text

23/24 Site Grid Administration: a glimpse
Goal:
- Keep the grid running 24/7
Facts:
- More than 30 middleware updates/year
- Bugs, bugs, and more bugs
- … nevertheless the grid is running
How to deal with it:
- Test before putting a service into production
- Any more ideas?
Sandbox:
- Pre-production: test, destroy, and re-build
- The art of computer virtualization*: takes 2 min.
* http://www.xen.org/

Slide 25

Slide 25 text

24/24 Acknowledgments
Hardware-centric
/O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Alexandre Bonvin
/O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Johan van der Zwan
Application layer
/O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Marc van Dijk
/O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Sjoerd De Vries
/O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Tsjerk Wassenaar
User abstraction
*.*
Middleware layer

Slide 26

Slide 26 text

25/24 Questions raised / Comments
- Protection against viruses: are there mechanisms?
- Is there a priority system for the use of resources?
- Panos liked the presentation, except for the slide about Grid administration (he said it was out of context).
- In a heterogeneous system, different results are obtained for the same initial problem. However, this also happens in the laboratory. It is nevertheless possible to choose which machines to use on the grid and which not to use.
- Klartje asked whether it is possible to put other programs on the Grid. Bonvin answered that it is possible to send the program along with the data.
- Dirk asked whether the communication between the computers is encrypted.

Slide 27

Slide 27 text

26/24 Moore’s Law
Some stats:
1. Computer power doubles every … 18 months
2. Network performance doubles every … 9 months
3. Data storage density doubles every … 12 months
“The number of transistors that could be squeezed on to a silicon chip was doubling every year.” Moore, G. 1965
With every year that passes, the Grid concept becomes more feasible:
- Distributed processors can be more tightly integrated
- Computer grids are increasingly able to solve increasingly complex problems
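
The doubling periods above can be turned into growth factors with the relation factor = 2^(t/T), where t is the elapsed time and T the doubling period. The sketch below applies it to the three figures from the slide; the five-year horizon is an arbitrary choice made here for illustration.

```python
def growth_factor(months, doubling_period_months):
    """Growth factor after `months`, given a fixed doubling period."""
    return 2 ** (months / doubling_period_months)


# Doubling periods (in months) as quoted on the slide.
DOUBLING_PERIODS = {
    "computer power": 18,
    "network performance": 9,
    "data storage density": 12,
}

if __name__ == "__main__":
    horizon = 5 * 12  # five years, an arbitrary illustrative horizon
    for name, period in DOUBLING_PERIODS.items():
        factor = growth_factor(horizon, period)
        print(f"{name}: x{factor:.1f} after {horizon} months")
```

Because network performance has the shortest doubling period, it pulls ahead of computing power over any horizon, which is the trend the slide uses to argue that tightly integrated distributed processors become more feasible each year.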

Slide 28

Slide 28 text

27/24 gLite INFNGRID – deployment status
INFNGRID gLite 3.1 (SL4)

Update   Date
40       ? (04 Feb 2009, CERN)
38-39    23 Jan 2009
35-37    05 Dec 2008
32-34    07 Nov 2008
30-31    23 Sep 2008
...      ...
13       19 Feb 2008
...      ...

Slide 29

Slide 29 text

28/24

Slide 30

Slide 30 text

29/24 The GRID architecture: general view
- The GRID is a collection of geographically distributed resources
- GRID users:
  - Are organized in Virtual Organizations
  - Need to run programs without needing to know:
    - Where to run a job
    - Where to get the input data from
    - Where to store the output data
- The GRID consists of:
  - An Authorisation and Authentication System
  - An Information System
  - A Workload Management System
  - A Data Management System
  - An Accounting System
  - Various monitoring services
  - Various installation services

Slide 31

Slide 31 text

30/24 The GRID architecture: general view
- The Authentication and Authorization System:
  - Contains the list of all the people authorized to use the GRID, divided by VO
  - All machines running Grid services:
    - verify the users' credentials
    - map the GRID users to the local users of the machine
- The Information System:
  - Provides information about gLite resources and their status
  - Information is published by the individual resources and copied into central databases
  - Used by:
    - WMS: to match resources against job requirements and to rank them
    - DMS: to choose storage resources
    - monitoring systems
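
The two steps performed by every machine running Grid services (verify that the user is authorized, then map the Grid user to a local user) can be sketched as follows. This is only an illustration: the function name, the in-memory membership table, and the local account name are all hypothetical, and real middleware verifies certificates rather than comparing strings.

```python
def map_to_local_user(dn, vo, vo_members, local_accounts):
    """Return the local account for a Grid user, or raise if not authorized.

    vo_members:     dict mapping VO name -> set of authorized DNs (hypothetical)
    local_accounts: dict mapping VO name -> local account name (hypothetical)
    """
    # Step 1: authorization check against the per-VO list of users.
    if dn not in vo_members.get(vo, set()):
        raise PermissionError(f"{dn} is not a registered member of {vo}")
    # Step 2: map the Grid identity to a local user of this machine.
    return local_accounts[vo]


if __name__ == "__main__":
    user_dn = "/O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Your Name"
    members = {"enmr.eu": {user_dn}}          # illustrative membership list
    accounts = {"enmr.eu": "enmr001"}         # illustrative local account
    print(map_to_local_user(user_dn, "enmr.eu", members, accounts))
```

Keeping the membership list per VO mirrors the slide's "divided by VO": a user registered in one VO gets no access through another.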

Slide 32

Slide 32 text

31/24 The GRID architecture: general view
- The Workload Management System:
  - Manages jobs submitted by users
  - Matches the job requirements to the available resources
  - Schedules the job for execution on an appropriate computing cluster
  - Tracks the job status
  - Allows the user to retrieve the job output when ready
- The Data Management System:
  - Allows users to:
    - move files in and out of the Grid
    - replicate files among different locations
    - locate files
  - This is achieved by:
    - transferring data via a number of protocols (GridFTP is the most commonly used)
    - interacting with a central file catalog
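
The matching step of the Workload Management System (filter the resources that satisfy the job requirements, then rank them) can be illustrated with a small sketch. The resource attributes, site names, and the ranking rule (most free CPUs first) are assumptions chosen for illustration; they are not the attributes or ranking expression used by gLite.

```python
def match_and_rank(job, resources):
    """Return the resources that satisfy the job, best-ranked first."""
    # Matching: keep only resources that meet every requirement of the job.
    candidates = [
        r for r in resources
        if r["free_cpus"] >= job["cpus"] and r["memory_mb"] >= job["memory_mb"]
    ]
    # Ranking: prefer the resource with the most free CPUs (illustrative rule).
    return sorted(candidates, key=lambda r: r["free_cpus"], reverse=True)


if __name__ == "__main__":
    # Hypothetical snapshot, as it might be read from an information system.
    resources = [
        {"name": "site-a", "free_cpus": 4, "memory_mb": 2048},
        {"name": "site-b", "free_cpus": 64, "memory_mb": 4096},
        {"name": "site-c", "free_cpus": 128, "memory_mb": 1024},
        {"name": "site-d", "free_cpus": 16, "memory_mb": 8192},
    ]
    job = {"cpus": 8, "memory_mb": 2048}
    for r in match_and_rank(job, resources):
        print(r["name"], r["free_cpus"])
```

In this snapshot site-a is rejected for too few free CPUs and site-c for too little memory, so the job would be scheduled on the top-ranked remaining site.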

Slide 33

Slide 33 text

32/24 The GRID architecture: general view
- Monitoring Services:
  - GridICE: monitors the usage of Grid resources (number of jobs running, storage space available, …)
  - R-GMA: allows users to monitor applications and store results in a relational database
  - Some monitoring systems check the status of Grid services; these are intended more for the GRID operations staff
- Dedicated Fabric Management Services:
  - Manage the installation, upgrade and maintenance of local Grid services:
    - LCFGng (discontinued)
    - Quattor
    - YAIM (semi-automatic tool based on APT/YUM and shell scripts)

Slide 34

Slide 34 text

33/24 Grid analogy

Electrical power grid: You never worry about where the electricity you are using comes from.
The Grid: You would never worry about where the computer power you are using comes from.

Electrical power grid: The infrastructure that makes this possible is called "the power grid".
The Grid: The infrastructure that makes this possible is called "the Grid".

Electrical power grid: The power grid is pervasive: electricity is available essentially everywhere, and you can simply access it through a standard wall socket.
The Grid: The Grid would be pervasive: remote computing resources would be accessible from different platforms, and you would simply access the Grid through your web browser.

Electrical power grid: The power grid is a utility: you ask for electricity, and you get it. You also pay for what you get.
The Grid: The Grid is a utility: you ask for computer power or storage capacity, and you get it. You also pay for what you get.

"The Grid" doesn't yet exist in this form; however, the world already has hundreds of smaller grids...

Slide 35

Slide 35 text

34/24 Cryptography
- A scytale
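
A scytale is a transposition cipher: the message is written along a strip wound around a rod, and unwinding the strip scrambles the letter order. The sketch below models the rod as a grid with `turns` columns, written row by row and read column by column. The parameter name `turns` and the `_` padding character are assumptions made for illustration.

```python
def scytale_encrypt(plaintext, turns, pad="_"):
    """Write the text in rows of `turns` letters, then read it column by column."""
    rows = -(-len(plaintext) // turns)            # ceiling division
    padded = plaintext.ljust(rows * turns, pad)   # fill the last row
    return "".join(padded[c::turns] for c in range(turns))


def scytale_decrypt(ciphertext, turns, pad="_"):
    """Invert scytale_encrypt for the same number of turns."""
    rows = len(ciphertext) // turns
    # Each column holds `rows` letters, so row r is every rows-th letter from r.
    text = "".join(ciphertext[r::rows] for r in range(rows))
    return text.rstrip(pad)  # note: also strips pad characters ending the message


if __name__ == "__main__":
    secret = scytale_encrypt("HELPME", 3)
    print(secret, scytale_decrypt(secret, 3))
```

The number of turns plays the role of the key: only a rod of the same diameter lines the letters up again.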

Slide 36

Slide 36 text

35/24 Evolution of the HDD
Morris, R.J.T. et al., IBM Systems Journal (2003) 42(2):205