Beowulf Cluster 1

More Decks by वेणु गोपाल

Transcript

  1. Moore’s Law

    “The complexity for minimum component costs has increased at a rate of roughly a factor of two per year ... Certainly over the short term this rate can be expected to continue, if not to increase. Over the longer term, the rate of increase is a bit more uncertain, although there is no reason to believe it will not remain nearly constant for at least 10 years. That means by 1975, the number of components per integrated circuit for minimum cost will be 65,000. I believe that such a large circuit can be built on a single wafer.”
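The arithmetic behind the 65,000 figure can be sketched as below. This is only an illustration: the 1965 baseline of 64 components is an assumption chosen here so that ten yearly doublings land near the number in the quotation, and the names `BASE_YEAR`, `BASE_COMPONENTS`, and `projected_components` are hypothetical.

```python
# Illustrative only: the baseline values are assumptions, not taken from the slide.
BASE_YEAR = 1965
BASE_COMPONENTS = 64  # assumed: 2**6 components per circuit in 1965


def projected_components(year: int) -> int:
    """Components per circuit if the count doubles every year after BASE_YEAR."""
    if year < BASE_YEAR:
        raise ValueError("year must not precede the baseline year")
    return BASE_COMPONENTS * 2 ** (year - BASE_YEAR)


if __name__ == "__main__":
    for year in range(BASE_YEAR, 1976):
        print(year, projected_components(year))
```

Under that assumption, ten doublings multiply the count by 2^10 = 1024, giving 65,536 components in 1975.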
  2. Moore’s Law implies?

    “There is no room left to squeeze anything out by being clever. Going forward from here we have to depend on the size factors – bigger dies and finer dimensions.”
  3. The “Algorithm”

    • The word algorithm is defined as the "sequence of steps" necessary to carry out a computation.
    • This definition leads us to believe that programs are sequential in nature and that only a small subset of them lend themselves to parallelism.
  4. Thinking Parallel?

    • For 30 years, programs were run sequentially, until Dr. Geoffrey Fox published the book Parallel Computing Works!
    • This led to a reversal in thinking.
    • It is now accepted that programs are parallel in nature and that only a small subset are sequential.
    • Yet programming is still taught sequentially in schools and universities.
  5. The Problem

    • It would be a lot easier to program a single processor chip running at 1 PHz than a million processors running at 2 GHz,
    • but we don't know how to build a 1 PHz processor,
    • and even if we did, someone would still want to strap a bunch of them together!
  6. Gustafson's law

    $S(P) = P - \alpha \cdot (P - 1)$, where $P$ is the number of processors, $S$ is the speedup, and $\alpha$ is the non-parallelizable fraction of the process.
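The formula above can be evaluated with a short sketch like the following. Amdahl's fixed-size speedup, 1 / (α + (1 − α) / P), does not appear on this slide and is included only for contrast; the two laws measure α against different baselines, so the side-by-side numbers are indicative rather than a strict comparison. The function names and the value α = 0.1 are assumptions made for illustration.

```python
def gustafson_speedup(p: int, alpha: float) -> float:
    """Scaled speedup S(P) = P - alpha * (P - 1)."""
    return p - alpha * (p - 1)


def amdahl_speedup(p: int, alpha: float) -> float:
    """Fixed-size speedup 1 / (alpha + (1 - alpha) / P), shown for contrast."""
    return 1.0 / (alpha + (1.0 - alpha) / p)


if __name__ == "__main__":
    alpha = 0.1  # assumed serial fraction, for illustration only
    for p in (1, 8, 64, 1024):
        print(p, round(gustafson_speedup(p, alpha), 2), round(amdahl_speedup(p, alpha), 2))
```

With a fixed problem size the Amdahl figure can never exceed 1/α, while the Gustafson figure keeps growing roughly linearly in P because the problem is assumed to grow with the machine.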
  7. A Driving Metaphor: Amdahl’s vs. Gustafson’s

    • Suppose a car is traveling between two cities 60 miles apart and has already spent one hour traveling half the distance at 30 mph.
    • Amdahl's Law says: “No matter how fast you drive the last half, it is impossible to achieve a 90 mph average before reaching the second city. Since it has already taken you 1 hour and you only have a distance of 60 miles total, going infinitely fast you would only achieve 60 mph.”
    • Gustafson's Law says: “Given enough time and distance to travel, the car's average speed can always eventually reach 90 mph, no matter how long or how slowly it has already traveled. In this two-cities case, for example, this could be achieved by driving at 150 mph for an additional hour.”
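The numbers in the metaphor can be checked with a small sketch. The helper `average_speed` and the representation of a leg as a (miles, hours) pair are assumptions made here for illustration; "infinitely fast" is modeled as covering the remaining 30 miles in zero hours.

```python
def average_speed(legs):
    """Average speed in mph over legs given as (miles, hours) pairs."""
    total_miles = sum(miles for miles, _ in legs)
    total_hours = sum(hours for _, hours in legs)
    return total_miles / total_hours


first_leg = (30.0, 1.0)  # 30 miles covered in the first hour

# Amdahl view: the trip is fixed at 60 miles, so even an instantaneous
# second half (30 miles in 0 hours) caps the average at 60 mph.
amdahl_limit = average_speed([first_leg, (30.0, 0.0)])

# Gustafson view: extend the trip instead, with one more hour at 150 mph.
gustafson_average = average_speed([first_leg, (150.0, 1.0)])

print(amdahl_limit, gustafson_average)
```

The first figure is bounded by the hour already spent; the second reaches 90 mph only because the total distance grows from 60 to 180 miles.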
  8. NP-Complete Problems

    • There are more than 3,000 known NP-complete problems.
    • Some of the more commonly known categories of NP-complete problems follow.
    • Source: Garey and Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness
  9.

    • Graph theory
      ◦ Covering and partitioning
      ◦ Subgraphs and supergraphs
      ◦ Vertex ordering
      ◦ Iso- and other morphisms
    • Network design
      ◦ Spanning trees
      ◦ Cuts and connectivity
      ◦ Routing problems
      ◦ Flow problems
    • Sets and partitions
      ◦ Covering, hitting, and splitting
      ◦ Weighted set problems
      ◦ Set partitions
    • Storage and retrieval
      ◦ Data storage
      ◦ Compression and representation
      ◦ Database problems
    • Sequencing and scheduling
      ◦ Sequencing on one processor
      ◦ Multiprocessor scheduling
      ◦ Shop scheduling
    • Mathematical programming
    • Algebra and number theory
      ◦ Divisibility problems
      ◦ Solvability of equations
    • Logic
      ◦ Propositional logic
    • Automata and language theory
      ◦ Automata theory
      ◦ Formal languages
    • Program optimization
      ◦ Code generation
      ◦ Programs and schemes
  10. The Beowulf Recipe

    1. Buy a pile of MMCOTS (mass-market commodity off-the-shelf) PCs.
    2. Buy a nice, cheap Ethernet switch.
    3. Connect each PC to the switch.
    4. Install Linux and parallel computing packages on each.
    5. Blow your code away by running it in parallel.
  11. What is a Beowulf?

    1. The nodes are dedicated to the Beowulf and serve no other purpose.
    2. The network is dedicated to the Beowulf and serves no other purpose.
    3. The nodes are inexpensive MMCOTS computers.
    4. The network is made with COTS switches and routers.
    5. The nodes all run open source software.
    6. Generally, one node is made the master node.
  12. Beowulf Cluster

    • Parallel computer built from commodity hardware and open source software
    • Beowulf cluster characteristics
      ◦ Internal high-speed network
      ◦ Commodity off-the-shelf hardware
      ◦ Open source software and OS
      ◦ Support for parallel programming libraries such as MPI and PVM
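As a rough illustration of what MPI-style parallel programming on such a cluster looks like, the sketch below estimates π by splitting a numerical integration across ranks and summing the pieces on rank 0. It assumes the third-party mpi4py package is installed on every node; the function `partial_pi`, the interval count, and the choice of problem are all hypothetical and not taken from the slides. If mpi4py is missing, the script falls back to a single serial rank.

```python
import math


def partial_pi(rank: int, size: int, n: int) -> float:
    """Midpoint-rule share of the integral of 4/(1+x^2) on [0, 1] for one rank."""
    h = 1.0 / n
    total = 0.0
    for i in range(rank, n, size):  # rank r takes intervals r, r+size, r+2*size, ...
        x = (i + 0.5) * h
        total += 4.0 / (1.0 + x * x)
    return total * h


def main(n: int = 1_000_000) -> None:
    try:
        from mpi4py import MPI  # third-party; assumed installed on every node
    except ImportError:
        comm, rank, size = None, 0, 1  # no MPI: behave as one serial rank
    else:
        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()

    local = partial_pi(rank, size, n)
    # Sum the partial results on rank 0, which plays the role of the master node.
    pi = local if comm is None else comm.reduce(local, op=MPI.SUM, root=0)
    if rank == 0:
        print(f"pi ~= {pi:.10f} (error {abs(pi - math.pi):.2e}) on {size} rank(s)")


if __name__ == "__main__":
    main()
```

Saved under a hypothetical name such as `pi_mpi.py`, this would typically be launched with something like `mpiexec -n 4 python pi_mpi.py`; how the ranks are mapped onto cluster nodes depends on the MPI implementation and its host configuration.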
  13. Beowulf Project

    • Originated at the Center of Excellence in Space Data and Information Sciences (CESDIS) at NASA Goddard Space Flight Center, by Dr. Thomas Sterling and Donald Becker
    • “Beowulf is a project to produce the software for off-the-shelf clustered workstations based on commodity PC-class hardware, a high-bandwidth internal network, and the Linux operating system.”
  14. Why Is Beowulf Good?

    • Low initial implementation cost
      ◦ Inexpensive PCs
      ◦ Standard components and networks
      ◦ Free software: Linux, GNU, MPI, PVM
    • Scalability: can grow and shrink
    • Familiar technology: easy for users to adopt the approach and to use and maintain the system
  15. Biggest Beowulf?

    • 1,000-node Beowulf cluster system
    • Used for genetic algorithm research by John Koza, Stanford University
    • http://www.genetic-programming.com/