
Cloud Genius
December 27, 2013

Meeting app performance needs – scaling up vs scaling out


Transcript

  1. Do you know how to eat an elephant?

    - One bite at a time!
    - Divide and conquer.
  2. A practical problem

    - Coca-Cola needs to analyze consumer sentiment on the Diet Coke brand across popular social networks.
      - What type of machine would they need?
      - Will all the data even fit on the biggest, most expensive machine you can buy today?
  3. The Need for Speed

    - High-performance architectures need more and more resources as demand grows.
    - Methods of adding more resources for a particular application fall into two categories:
      - Scale up (vertical) versus scale out (horizontal)
      - Get a bigger machine versus add more small machines
  4. Scale Up (scale vertically)

    - Get a bigger machine.
    - Add resources to a single node in a system
      - typically by adding CPUs or memory to a single computer.
    - Vertical scaling of existing systems
      - enables effective virtualization
      - provides more resources for the hosted set of operating system and application modules to share.
    - Taking advantage of such resources in a single computer can also be called "scaling up"
      - such as increasing the number of Apache daemon processes currently running.
  5. Scale Out (scale horizontally)

    - Add more nodes to a collection of machines
      - such as adding a new computer to a distributed software application.
      - An example might be scaling out from one web server system to three.
    - Large numbers of low-cost "commodity" systems
      - as computer prices drop and
      - performance continues to increase.
    - Hundreds or thousands of small computers are configured in a cluster to obtain aggregate computing power that often exceeds that of a single traditional RISC-processor-based scientific computer.
    - Scaling out is fueled by the availability of high-performance interconnects (e.g., Myrinet and InfiniBand).
  6. Trade-offs

    - A larger number of computers means
      - increased management complexity
      - a more complex programming model
      - throughput and latency issues between nodes
      - some applications do not lend themselves to a distributed computing model.
    - Configuring an existing idle system has always been less expensive than buying, installing, and configuring a new one, regardless of the model.
  7. Choosing between Scale Up and Scale Out

    - Scale up:
      - You have a hard limit: the size of the machine on which you are running.
    - Scale out:
      - Not limited to the capacity of a single unit
      - Combine the power of multiple machines into a single pool
  8. Scale Up versus Scale Out

    - In concept:
      - In both cases we break a sequential piece of logic into smaller pieces that can be executed in parallel.
    - In practice:
      - The two models are fairly different from an implementation and performance perspective.
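The "in concept" point can be sketched in a few lines of Python. This is only an illustration, not code from the deck: the function names, the chunk size, and the sum-of-squares task are all assumptions chosen for brevity. It splits one sequential loop into chunks, hands each chunk to a worker thread, and combines the partial results. It shows the structure of the approach rather than a speedup, since in CPython threads typically do not accelerate CPU-bound work like this.

```python
from concurrent.futures import ThreadPoolExecutor


def chunks(items, size):
    """Split a list into consecutive pieces of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def parallel_sum_of_squares(numbers, workers=4, chunk_size=1000):
    """Sum the squares of `numbers` by handing each chunk to a worker thread."""
    def work(chunk):
        # Each worker handles one small, independent piece of the problem.
        return sum(n * n for n in chunk)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(work, chunks(numbers, chunk_size)))
    # Combine step: merge the partial results into one answer.
    return sum(partial_sums)


numbers = list(range(10_000))
sequential = sum(n * n for n in numbers)   # the original sequential logic
parallel = parallel_sum_of_squares(numbers)
assert parallel == sequential
```

The same divide, process, combine shape applies whether the workers are threads on one machine or processes on many; what changes is how the chunks and results travel.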
  9. Scale Up versus Scale Out

    - Scale up:
      - Concurrent programming on multi-core machines is often done through multi-threading and in-process message passing.
      - Single large multi-core machines are best utilized in the context of a single application through concurrent programming.
    - Scale out:
      - Distributed programming does something similar by distributing jobs across machines over the network.
      - Patterns used are:
        - MapReduce – Google (2004)
        - Master/Worker
        - Tuple Spaces
        - Blackboard
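As a rough illustration of the MapReduce pattern named above, here is a single-process Python sketch of its three phases. It is not Google's or Hadoop's API: the function names and the sample posts are invented for this example, and a real framework would run the map and reduce calls on different machines and perform the shuffle over the network.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Group all emitted values by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    # Map: every document is processed independently of the others.
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    # Shuffle: bring together the values that share a key.
    grouped = shuffle_phase(pairs)
    # Reduce: every key is processed independently of the others.
    return dict(reduce_phase(k, v) for k, v in grouped.items())


# Invented sample data, loosely echoing the sentiment example in slide 2.
posts = ["diet coke is great", "Diet Coke is too sweet"]
counts = word_count(posts)
```

Because each map call sees only one document and each reduce call sees only one key, both phases can be spread across many machines without the functions themselves changing.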
  10. Scale Up versus Scale Out

    - Scale up: existence of a shared address space
      - Data sharing and message passing can be done simply by passing a reference.
    - Scale out: lack of a shared address space
      - Makes sharing, passing, or updating data significantly more complex.
      - You must pass copies of the data, which adds network, serialization, and de-serialization overhead.
      - Once you cross the boundaries of a single process, you need to deal with partial failure and consistency.
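The difference between passing a reference and passing a copy can be seen in a small Python sketch. It only simulates the scale-out case: `pickle` stands in for whatever wire format a real system would use, no network is involved, and the `add_item` helper and cart data are made up for illustration.

```python
import pickle


def add_item(cart, item):
    """Mutate the list that was handed in."""
    cart.append(item)


# Scale up: caller and callee share one address space, so the callee
# mutates the very same list object the caller holds.
local_cart = ["cola"]
add_item(local_cart, "chips")

# Scale out (simulated): the data crosses a process boundary as bytes.
remote_source = ["cola"]
wire_bytes = pickle.dumps(remote_source)   # serialization on the sender
remote_cart = pickle.loads(wire_bytes)     # de-serialization on the receiver
add_item(remote_cart, "chips")

# The receiver changed only its own copy; the sender's data is unchanged.
assert local_cart == ["cola", "chips"]
assert remote_source == ["cola"]
assert remote_cart == ["cola", "chips"]
```

In the second case the two sides now disagree about the cart's contents, which is the consistency problem the slide refers to: some further message has to carry the update back.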
  11. Why Scale Out

    - Cost/performance flexibility:
      - Optimize cost/performance by selecting the optimal configuration setup at any time.
      - If your system is designed for scale-up only, you are pretty much locked into a certain minimum price driven by the hardware that you are using.
      - In a competitive situation, that lack of flexibility could actually kill your business.
  12. Why Scale Out

    - Continuous availability/redundancy:
      - Failure is inevitable.
      - One big system is a single point of failure.
      - The recovery process could be long.
      - Extended downtime is needed to restore one big machine.
  13. Why Scale Out

    - Continuous upgrades:
      - Building an application as one big unit makes it harder or even impossible to add or change pieces of code individually without bringing the entire system down.
      - It is better to decouple your application into concrete sets of services that can be maintained independently.
  14. Why Scale Out

    - Geographical distribution:
      - There are cases where an application needs to be spread across data centers or geographical locations to handle disaster recovery scenarios or to reduce geographical latency.
      - In those cases the application has to be distributed; putting it in a single box won't work.
  15. Scaling out is non-trivial

    - Scale-out apps need a rewrite because the programming model is different.
    - Scale-out gains are not linear
      - you have to add network overhead, transactions, and replication to operations that were previously done just by passing object references.
    - Beyond a few obvious cases, choosing between scale up and scale out is fairly hard.
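One common way to see why gains are not linear is Amdahl's law, which the slides do not mention; it is brought in here only as an illustration. The sketch below assumes, arbitrarily, that 90% of the work can be spread across nodes and the rest must stay sequential. It ignores network, transaction, and replication overhead entirely, so real scale-out results would typically be worse than these figures.

```python
def amdahl_speedup(nodes, parallel_fraction):
    """Ideal speedup on `nodes` machines when only part of the work parallelizes."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / nodes)


# Assumed value for illustration: 90% of the work is parallelizable.
PARALLEL_FRACTION = 0.9

for nodes in (1, 2, 10, 100):
    print(nodes, round(amdahl_speedup(nodes, PARALLEL_FRACTION), 2))
```

Under this assumption the speedup can never reach 10x however many nodes are added, because the sequential 10% still has to run on one machine.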
  16. Further reading

    - Dean, Jeff and Ghemawat, Sanjay. MapReduce: Simplified Data Processing on Large Clusters.
      - http://research.google.com/archive/mapreduce.html
    - Open source implementation of MapReduce
      - http://hadoop.apache.org/