Meeting app performance needs – scaling up vs scaling out

Cloud Genius
December 27, 2013

Transcript

  1. Do you know how to eat an elephant?

    - One bite at a time!
    - Divide and conquer.
  2. A practical problem

    - Coca-Cola needs to analyze consumer sentiment on the Diet Coke brand across popular social networks.
        - What type of machine would they need?
        - Will all the data even fit on the biggest, most expensive machine you can buy today?
  3. The Need for Speed

    - High-performance architectures need more and more resources as demand grows.
    - Methods of adding more resources for a particular application fall into two categories:
        - Scale up (vertical) versus scale out (horizontal)
        - Get a bigger machine versus add more small machines
  4. Scale Up (scale vertically)

    - Get a bigger machine.
    - Add resources to a single node in a system
        - typically by adding CPUs or memory to a single computer.
    - Vertical scaling of existing systems
        - enables effective virtualization
        - provides more resources for the hosted set of operating system and application modules to share.
    - Taking advantage of such resources in a single computer can also be called "scaling up"
        - such as expanding the number of Apache daemon processes currently running.
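
As a rough illustration of taking advantage of a bigger machine, the sketch below sizes a worker pool from the number of CPUs. It is only an analogy for raising the Apache daemon count: threads stand in for daemon processes, and `pool_size`, `handle`, and the factor of two workers per CPU are hypothetical choices, not values from the slides.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def pool_size(cpu_count, workers_per_cpu=2):
    """Pick a worker count that grows with the machine (assumed ratio)."""
    return max(1, cpu_count * workers_per_cpu)


def handle(request_id):
    """Stand-in for serving one request."""
    return f"handled {request_id}"


# A bigger machine reports more CPUs, so the same code starts more workers.
cpus = os.cpu_count() or 1
with ThreadPoolExecutor(max_workers=pool_size(cpus)) as pool:
    results = list(pool.map(handle, range(8)))

print(pool_size(cpus), results[:2])
```

The application code does not change when the hardware grows; only the pool size does, which is what makes scaling up comparatively easy to adopt.
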
  5. Scale Out (scale horizontally)

    - Add more nodes to a collection of machines
        - such as adding a new computer to a distributed software application.
        - An example might be scaling out from one web server system to three.
    - Large numbers of low-cost "commodity" systems
        - as computer prices drop and
        - performance continues to increase.
    - Many (hundreds or thousands of) small computers configured in a cluster obtain aggregate computing power that often exceeds that of traditional scientific computers based on a single RISC processor.
    - Scaling out is fueled by the availability of high-performance interconnects (e.g., Myrinet and InfiniBand).
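
The "one web server to three" example can be sketched with a simple round-robin dispatcher. This is a minimal illustration, assuming requests can be served by any node; the host names `web1` to `web3` and the helper `make_round_robin` are made up for the example.

```python
from itertools import cycle


def make_round_robin(servers):
    """Return a function that hands out servers in strict rotation."""
    rotation = cycle(servers)

    def pick():
        return next(rotation)

    return pick


# Before scaling out: every request lands on the same machine.
one_server = make_round_robin(["web1"])

# After scaling out: the same requests are spread over three machines.
three_servers = make_round_robin(["web1", "web2", "web3"])

assignments = [three_servers() for _ in range(6)]
print(assignments)
```

Adding a fourth node only means adding one more entry to the list, which is the appeal of scaling out. What the sketch leaves out (health checks, session state, uneven load) is where the extra complexity comes from.
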
  6. Trade-offs

    - A larger number of computers means
        - increased management complexity
        - a more complex programming model
        - throughput and latency constraints between nodes
        - some applications do not lend themselves to a distributed computing model.
    - Configuring an existing idle system has always been less expensive than buying, installing, and configuring a new one, regardless of the model.
  7. Choosing between Scale Up/Scale Out

    - Scale up:
        - You have a hard limit: the size of the machine on which you are running.
    - Scale out:
        - Not limited to the capacity of a single unit
        - Combine the power of multiple machines into a single pool
  8. Scale Up versus Scale Out

    - In concept:
        - In both cases we break a sequential piece of logic into smaller pieces that can be executed in parallel.
    - In practice:
        - The two models are fairly different from an implementation and performance perspective.
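
A minimal sketch of breaking a sequential piece of logic into smaller pieces, assuming a sum of squares as the workload (the function names are hypothetical). A thread pool is used only to show the shape of the decomposition; in standard CPython, threads do not normally speed up CPU-bound work like this.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, n_chunks):
    """Split items into at most n_chunks contiguous slices."""
    size = max(1, -(-len(items) // n_chunks))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def sum_of_squares(numbers):
    """The original sequential logic."""
    return sum(x * x for x in numbers)


def parallel_sum_of_squares(numbers, n_workers=4):
    """Same result, computed as independent partial sums."""
    chunks = chunked(numbers, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)


numbers = list(range(1, 1001))
print(sum_of_squares(numbers), parallel_sum_of_squares(numbers))
```

The same split works whether the pieces run on cores of one machine or on separate machines; what changes is how the chunks and partial results travel.
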
  9. Scale Up versus Scale Out

    - Scale up:
        - Concurrent programming on multi-core machines is often done through multi-threading and in-process message passing.
        - A single large multi-core machine is best utilized in the context of a single application through concurrent programming.
    - Scale out:
        - Distributed programming does something similar by distributing jobs across machines over the network.
        - Patterns used are:
            - MapReduce (Google, 2004)
            - Master/Worker
            - Tuple Spaces
            - Blackboard
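
To make the MapReduce pattern concrete, here is a single-process word-count sketch. It only mimics the map, shuffle, and reduce phases in memory and is not how Hadoop or Google's system is implemented; the sample posts and all function names are invented for illustration.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values seen for one key."""
    return key, sum(values)


def word_count(documents):
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical social-media posts.
posts = ["diet coke is great", "Diet Coke again", "coke"]
counts = word_count(posts)
print(counts)
```

In a real cluster each `map_phase` call could run on a different machine, because no call depends on another; only the shuffle step needs data to move between nodes.
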
  10. Scale Up versus Scale Out

    - Scale up:
        - Existence of a shared address space
        - Data sharing and message passing can be done simply by passing a reference.
    - Scale out:
        - Lack of a shared address space
        - Makes sharing, passing, or updating data significantly more complex
        - You must pass copies of the data, which adds network, serialization, and deserialization overhead.
        - Once you cross the boundaries of a single process, you need to deal with partial failure and consistency.
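
The difference between passing a reference and passing a copy can be sketched as follows. `pickle` stands in for whatever wire format a real system would use, no network is involved, and the `order` data and function names are hypothetical.

```python
import pickle

order = {"id": 1, "items": ["cola"]}


def add_item_in_process(shared_order):
    """Shared address space: the caller's object is updated directly."""
    shared_order["items"].append("diet cola")


def add_item_remote(payload):
    """No shared address space: work on a deserialized copy, send bytes back."""
    remote_copy = pickle.loads(payload)
    remote_copy["items"].append("lemonade")
    return pickle.dumps(remote_copy)


add_item_in_process(order)  # visible to the caller immediately

reply = add_item_remote(pickle.dumps(order))  # serialize, "send", deserialize
updated = pickle.loads(reply)

print(order["items"])    # the caller's object did not see the remote change
print(updated["items"])  # the change exists only in the returned copy
```

After the "remote" call there are two versions of the order, and the caller has to decide which one wins. That reconciliation step is the consistency problem the slide refers to.
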
  11. Why Scale Out

    - Cost/performance flexibility:
        - Optimize cost/performance by selecting the optimal configuration setup at any time.
        - If your system is designed for scale-up only, then you are pretty much locked into a certain minimum price driven by the hardware that you are using.
        - In a competitive situation, the lack of flexibility could actually kill your business.
  12. Why Scale Out

    - Continuous availability/redundancy:
        - Failure is inevitable.
        - One big system is a single point of failure.
        - The recovery process could be long.
        - Extended downtime is needed to restore one big machine.
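
A back-of-the-envelope way to see the redundancy argument, under two assumptions that are not from the slides: nodes fail independently, and any single surviving node can keep the service up. The 99% per-node figure is an arbitrary example.

```python
def combined_availability(node_availability, nodes):
    """Probability that at least one of `nodes` independent nodes is up."""
    if nodes < 1:
        raise ValueError("need at least one node")
    return 1.0 - (1.0 - node_availability) ** nodes


# One big machine versus three smaller redundant ones (assumed 99% each).
single = combined_availability(0.99, 1)
triple = combined_availability(0.99, 3)
print(f"{single:.6f} {triple:.6f}")
```

Real failures are often correlated (shared power, network, or software bugs), so this is an upper bound on what redundancy buys rather than a prediction.
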
  13. Why Scale Out

    - Continuous upgrades:
        - Building an application as one big unit makes it harder or even impossible to add or change pieces of code individually without bringing the entire system down.
        - Better to decouple your application into concrete sets of services that can be maintained independently.
  14. Why Scale Out

    - Geographical distribution:
        - There are cases where an application needs to be spread across data centers or geographical locations to handle disaster recovery scenarios or to reduce geographical latency.
        - In those cases it's better to distribute your application; putting it in a single box won't work.
  15. Scaling out is non-trivial

    - Scale-out apps need a rewrite, as the programming model is different.
    - Scale-out gains are not linear
        - you have to add network overhead, transactions, and replication to operations that were previously done just by passing object references.
    - Beyond a few obvious cases, choosing between scale up and scale out is fairly hard.
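
Why the gains are not linear can be illustrated with a toy cost model. The model is an assumption made for this sketch, not something from the slides: the work divides evenly across nodes, and each extra node adds a fixed coordination cost.

```python
def run_time(work, nodes, overhead_per_extra_node):
    """Toy model: evenly divided work plus a per-node coordination cost."""
    return work / nodes + overhead_per_extra_node * (nodes - 1)


def speedup(work, nodes, overhead_per_extra_node):
    """How many times faster than a single node, under the toy model."""
    baseline = run_time(work, 1, overhead_per_extra_node)
    return baseline / run_time(work, nodes, overhead_per_extra_node)


# Hypothetical numbers: 100 units of work, 1 unit of overhead per extra node.
for nodes in (1, 2, 4, 10, 20):
    print(nodes, round(speedup(100.0, nodes, 1.0), 2))
```

With these made-up numbers, ten nodes give a speedup of roughly five rather than ten, and twenty nodes are slower than ten, because the coordination cost eventually outweighs the divided work.
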
  16. Further reading

    - Dean, Jeff and Ghemawat, Sanjay. MapReduce: Simplified Data Processing on Large Clusters.
        - http://research.google.com/archive/mapreduce.html
    - Open-source implementation of MapReduce
        - http://hadoop.apache.org/