
Highly available, fault tolerant, distributed apache for the masses

Martin Smith
October 17, 2012



  1. What we're trying to do ("The Goal")

    - Insulate customers from each other
    - Provide a fault tolerant environment
    - Create the most performant environment
    - Provide bulk hosting
    - Create an architecture that is easy to administer, with fewer special cases
  2. How do we determine success?

    - Websites are available, fast, reliable, consistent
    - Efficient utilization of resources
    - Keep system load averages below 5
    - Protect against massive resource consumption while still allowing burst
    - Maintainability is also an important factor
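The load-average target above lends itself to a simple automated check. The following is a minimal Python sketch, assuming a Unix-like host. Only the threshold of 5 comes from the slide; the function name `load_within_target` and the idea of running it as a monitoring probe are illustrative assumptions.

```python
import os

LOAD_THRESHOLD = 5.0  # target from the slide: keep load averages below 5


def load_within_target(loadavg, threshold=LOAD_THRESHOLD):
    """Return True if every load average (1, 5, 15 min) is below the threshold."""
    return all(value < threshold for value in loadavg)


if __name__ == "__main__":
    # os.getloadavg() returns the 1-, 5- and 15-minute load averages (Unix only).
    current = os.getloadavg()
    status = "OK" if load_within_target(current) else "OVER TARGET"
    print(f"load averages {current}: {status}")
```

Checking all three averages rather than only the 1-minute value is a design choice made here: it flags sustained load as well as short spikes.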
  3. Despite what you may have heard, Apache is still popular and the de facto standard

    Source: http://news.netcraft.com/archives/2012/10/02/october-2012-web-server-survey.html
  4. Starting point: Most basic Apache configuration

    - One server, multiple virtual hosts
    - Local disk, local logs
    - mod_php, mod_perl, mod_foo for CGI
    - suexec for anything that isn't one of the above
  5. Enhancement: Load balancing

    - Explain different load balancing technologies
    - Explain some of our previous and current decisions for load balancing
    - Explain the effects on customers, difficulty of clustering different software
  6. Enhancement: Document Roots

    - NFS shares, replication
    - Giant directory vs. home directories
    - Pitfalls and problems with this approach
    - Fun Apache behavior
  7. Enhancement: CGI

    - mod_php, mod_perl
    - suexec
    - FastCGI for PHP, Perl
    - Other enhancements for speed
    - Pros and cons with these approaches
    - What do large scale hosting providers do?
  8. httpd.conf (threading model)

    # worker MPM
    # StartServers: initial number of server processes to start
    # MaxClients: maximum number of simultaneous client connections
    # MinSpareThreads: minimum number of worker threads which are kept spare
    # MaxSpareThreads: maximum number of worker threads which are kept spare
    # ThreadsPerChild: constant number of worker threads in each server process
    # MaxRequestsPerChild: maximum number of requests a server process serves
    <IfModule worker.c>
        StartServers          2
        ServerLimit          32
        MaxClients         2048
        MinSpareThreads      32
        MaxSpareThreads     128
        ThreadsPerChild      64
        MaxRequestsPerChild   0
    </IfModule>
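With the worker MPM, the number of threads that can exist at once is bounded by ServerLimit multiplied by ThreadsPerChild, so MaxClients is only reachable if it does not exceed that product. The Python sketch below checks this arithmetic for the values on the slide. The helper names are hypothetical and exist only for illustration.

```python
# Values copied from the slide's <IfModule worker.c> block.
settings = {
    "StartServers": 2,
    "ServerLimit": 32,
    "MaxClients": 2048,
    "MinSpareThreads": 32,
    "MaxSpareThreads": 128,
    "ThreadsPerChild": 64,
}


def worker_capacity(server_limit, threads_per_child):
    """Upper bound on simultaneous worker threads across all child processes."""
    return server_limit * threads_per_child


def max_clients_is_reachable(cfg):
    """True if MaxClients fits within ServerLimit * ThreadsPerChild."""
    return cfg["MaxClients"] <= worker_capacity(
        cfg["ServerLimit"], cfg["ThreadsPerChild"]
    )


capacity = worker_capacity(settings["ServerLimit"], settings["ThreadsPerChild"])
threads_at_startup = settings["StartServers"] * settings["ThreadsPerChild"]

print("thread capacity:", capacity)
print("threads at startup:", threads_at_startup)
print("MaxClients reachable:", max_clients_is_reachable(settings))
```

For these settings the capacity equals MaxClients exactly (32 × 64 = 2048), and the two initial processes start with 128 threads, which matches MaxSpareThreads.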
  9. mod_status (server-status) is cool!

    Total accesses: 519889 - Total Traffic: 6.1 GB
    CPU Usage: u129.93 s40.41 cu0 cs0 - 1.17% CPU load
    35.8 requests/sec - 439.7 kB/second - 12.3 kB/request
    101 requests currently being processed, 91 idle workers

    RR___RR_R_RRRCRW____R_W_R_R_____RR_R_RRR_R__RR_RRRR___RRRRR_R_RW
    RRRRRR__RR_RRWRCR_R______R__RRR_R_R__R___RC_RRR__R__R_____RRR___
    RRR_C_R_R_R_R___RRRR_RR_RR__R__R_____WRR_RR_WRRRR__R__RR_R____R_
    ................................................................
    <snip>

    Scoreboard Key:
    "_" Waiting for Connection, "S" Starting up, "R" Reading Request,
    "W" Sending Reply, "K" Keepalive (read), "D" DNS Lookup,
    "C" Closing connection, "L" Logging, "G" Gracefully finishing,
    "I" Idle cleanup of worker, "." Open slot with no current process

    Srv  PID    Acc         M  CPU    SS  Req  Conn  Child  Slot   Client  VHost  Request
    0-0  23585  0/218/2420  R  20.03  39  7    0.0   2.48   32.84  ?       ?      ..reading..
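The scoreboard string is easy to summarize programmatically. The Python sketch below tallies worker states using the scoreboard key from the slide. The function name `summarize_scoreboard` and the sample string are made up for illustration, and the busy/idle split is an approximation of how mod_status reports its totals: "_" is treated as idle, while ".", "S" and "I" are treated as neither busy nor idle.

```python
from collections import Counter

# Scoreboard key as listed on the slide.
SCOREBOARD_KEY = {
    "_": "Waiting for Connection",
    "S": "Starting up",
    "R": "Reading Request",
    "W": "Sending Reply",
    "K": "Keepalive (read)",
    "D": "DNS Lookup",
    "C": "Closing connection",
    "L": "Logging",
    "G": "Gracefully finishing",
    "I": "Idle cleanup of worker",
    ".": "Open slot with no current process",
}

# Assumption: these states count as neither busy nor idle.
NOT_BUSY = {"_", ".", "S", "I"}


def summarize_scoreboard(scoreboard):
    """Count worker states, ignoring whitespace and unknown characters."""
    counts = Counter(ch for ch in scoreboard if ch in SCOREBOARD_KEY)
    busy = sum(n for ch, n in counts.items() if ch not in NOT_BUSY)
    return {
        "busy": busy,
        "idle": counts["_"],
        "open": counts["."],
        "by_state": dict(counts),
    }


if __name__ == "__main__":
    sample = "RR__W.CK\n_S"  # illustrative sample, not real server output
    summary = summarize_scoreboard(sample)
    for state, count in sorted(summary["by_state"].items()):
        print(f"{state!r:5} {SCOREBOARD_KEY[state]:35} {count}")
    print("busy:", summary["busy"], "idle:", summary["idle"], "open:", summary["open"])
```

A tally like this makes it quick to see whether most busy workers are stuck in one state, such as the many "R" (Reading Request) slots visible in the slide's output.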
  10. httpd.conf (server-status)

    LoadModule status_module modules/mod_status.so

    # ExtendedStatus controls whether Apache will generate "full" status
    # information (ExtendedStatus On) or just basic information (ExtendedStatus
    # Off) when the "server-status" handler is called. The default is Off.
    #
    ExtendedStatus On

    <Location /server-status>
        SetHandler server-status
        Order deny,allow
        Deny from all
        Allow from [some list of networks or IP addresses]
    </Location>
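With `Order deny,allow`, Deny directives are evaluated before Allow directives, and a client that matches both is allowed. Combined with `Deny from all`, the block therefore acts as an allowlist: only clients matching an Allow entry can reach /server-status. The Python sketch below models that decision. The networks in `ALLOW_FROM` are hypothetical placeholders, since the slide does not list the real ones.

```python
import ipaddress

# Hypothetical stand-ins for the slide's
# "[some list of networks or IP addresses]".
ALLOW_FROM = ["10.0.0.0/8", "192.0.2.10"]


def server_status_allowed(client_ip, allow_from=ALLOW_FROM):
    """Model 'Order deny,allow' + 'Deny from all' + 'Allow from <list>'.

    Every client matches 'Deny from all', so access is granted only
    when the client also matches at least one Allow entry.
    """
    client = ipaddress.ip_address(client_ip)
    return any(client in ipaddress.ip_network(entry) for entry in allow_from)


if __name__ == "__main__":
    for ip in ("10.1.2.3", "192.0.2.10", "203.0.113.5"):
        verdict = "allowed" if server_status_allowed(ip) else "denied"
        print(f"{ip}: {verdict}")
```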
  11. Enhancement: Caching

    - PHP provides APC (file and opcode cache)
    - File system cache (VFS, backing systems)
    - Database server cache (maybe)
    - Applications frequently provide caching (e.g. WP-Cache, WP Super Cache, Joomla System Cache, Drupal Boost, ...)
    - Others (memcached, redis, ...: hard in bulk hosting environments)
    - Bottom line: It depends on the customer ->
  12. APC (continued)

    - APC does not really help in an suexec environment
    - We have seen APC cause issues due to... caching!
    - Plan on "doing something" when you deploy new PHP code to flush the cache.
  13. Show some data!!!

    - Number of customers: ~70
    - Number of Apache servers: 18 prod (including Gatorlink webmail)
    - Number of vhosts: ~375
    - Busy site: http://www.ufl.edu
    - www.ufl.edu visits per month: 1.3 million
  14. Server Hardware

    - All of our Apache servers are virtual machines (VMware vSphere 5)
    - Each virtual machine has modest resources:
        2 processor cores
        4 GB RAM
        18 GB disk (plus NFS-mounted home dirs)
  15. Versions - koolaid or homebrew?

    Apache 2.2.x, PHP 5.2.x
    - Worth building packages?
        * security, maintainability, sustainability
    - Bulk hosting has unique challenges
    - Panels and 1-clicks and backups