Slide 1

Slide 1 text

(Attempts at) Highly Available, Fault Tolerant, Distributed Apache for the Masses
Martin Smith, @martinb3
Dan Stoner, @thatlinuxbox

Slide 2

Slide 2 text

Who we are
Sysadmins, programmers, software engineers, system architects, customer service representatives, on-call staff, marketing, technical support, teachers

Slide 3

Slide 3 text

What we're trying to do ("The Goal")
- Insulate customers from each other
- Provide a fault tolerant environment
- Create the most performant environment
- Provide bulk hosting
- Create an architecture that is easy to administer, with fewer special cases

Slide 4

Slide 4 text

How do we determine success
- Websites are available, fast, reliable, consistent
- Efficient utilization of resources
- Keep system load averages below 5
- Protect against massive resource consumption while still allowing burst
- Maintainability is also an important factor
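The load-average target above can be checked with a few lines of Python. This is a minimal sketch, not the presenters' monitoring setup: the threshold of 5 comes from the slide, while the function name `load_is_healthy` and the choice to test all three averages are assumptions made for illustration.

```python
import os

LOAD_THRESHOLD = 5.0  # from the slide: keep system load averages below 5


def load_is_healthy(load_averages, threshold=LOAD_THRESHOLD):
    """Return True if every load average in the tuple is below the threshold."""
    return all(value < threshold for value in load_averages)


if __name__ == "__main__":
    # os.getloadavg() returns the 1-, 5- and 15-minute load averages (Unix only).
    one, five, fifteen = os.getloadavg()
    status = "ok" if load_is_healthy((one, five, fifteen)) else "too high"
    print(f"load averages {one:.2f} {five:.2f} {fifteen:.2f}: {status}")
```

Checking all three averages is deliberately strict; a real check might only alert on the 5- or 15-minute value so that short bursts are tolerated.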

Slide 5

Slide 5 text

Despite what you may have heard, Apache is still popular and the de facto standard
Source: http://news.netcraft.com/archives/2012/10/02/october-2012-web-server-survey.html

Slide 6

Slide 6 text

Busy sites still run Apache
Source: http://news.netcraft.com/archives/2012/10/02/october-2012-web-server-survey.html

Slide 7

Slide 7 text

Starting point: Most basic Apache configuration
- One server, multiple virtual hosts
- Local disk, local logs
- mod_php, mod_perl, mod_foo for CGI
- suexec for anything that isn't one of the above

Slide 8

Slide 8 text

Enhancement: Load balancing
- Explain different load balancing technologies
- Explain some of our previous and current decisions for load balancing
- Explain the effects on customers, difficulty of clustering different software

Slide 9

Slide 9 text

No content

Slide 10

Slide 10 text

Graphs - Apache Requests per Second (all nodes total)

Slide 11

Slide 11 text

Graphs - Apache Requests per Second (single node)

Slide 12

Slide 12 text

Graphs - System Load Average (single node)

Slide 13

Slide 13 text

Graphs - System Memory Profile (single node)

Slide 14

Slide 14 text

CNS Managed Apache Hosting Overview

Slide 15

Slide 15 text

Enhancement: Document Roots
- NFS shares, replication
- Giant directory vs. home directories
- Pitfalls and problems with this approach
- Fun Apache behavior

Slide 16

Slide 16 text

httpd.conf
- Default virtual hosts become important
- CGI is the hardest problem

Slide 17

Slide 17 text

Enhancement: CGI
- mod_php, mod_perl
- suexec
- FastCGI for PHP, Perl
- Other enhancements for speed
- Pros and cons with these approaches
- What do large scale hosting providers do?

Slide 18

Slide 18 text

httpd.conf (threading model)

# worker MPM
# StartServers: initial number of server processes to start
# MaxClients: maximum number of simultaneous client connections
# MinSpareThreads: minimum number of worker threads which are kept spare
# MaxSpareThreads: maximum number of worker threads which are kept spare
# ThreadsPerChild: constant number of worker threads in each server process
# MaxRequestsPerChild: maximum number of requests a server process serves
StartServers          2
ServerLimit          32
MaxClients         2048
MinSpareThreads      32
MaxSpareThreads     128
ThreadsPerChild      64
MaxRequestsPerChild   0
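With the worker MPM, MaxClients cannot exceed ServerLimit multiplied by ThreadsPerChild, and it is normally a multiple of ThreadsPerChild. The following Python sketch checks the values from the slide against those two rules; the helper name `check_worker_mpm` is hypothetical and the check covers only these two rules, not every constraint Apache enforces.

```python
def check_worker_mpm(server_limit, threads_per_child, max_clients):
    """Return a list of warnings for a worker MPM sizing; empty means consistent."""
    warnings = []
    capacity = server_limit * threads_per_child
    if max_clients > capacity:
        warnings.append(
            f"MaxClients {max_clients} exceeds "
            f"ServerLimit * ThreadsPerChild = {capacity}"
        )
    if max_clients % threads_per_child != 0:
        warnings.append(
            f"MaxClients {max_clients} is not a multiple of "
            f"ThreadsPerChild {threads_per_child}"
        )
    return warnings


# Values from the slide: ServerLimit 32, ThreadsPerChild 64, MaxClients 2048.
print(check_worker_mpm(32, 64, 2048))  # prints []
```

Here 32 processes with 64 threads each give exactly 2048 worker threads, so the configured MaxClients uses the full capacity allowed by ServerLimit.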

Slide 19

Slide 19 text

mod_status (server-status) is cool!

Total accesses: 519889 - Total Traffic: 6.1 GB
CPU Usage: u129.93 s40.41 cu0 cs0 - 1.17% CPU load
35.8 requests/sec - 439.7 kB/second - 12.3 kB/request
101 requests currently being processed, 91 idle workers

RR___RR_R_RRRCRW____R_W_R_R_____RR_R_RRR_R__RR_RRRR___RRRRR_R_RW
RRRRRR__RR_RRWRCR_R______R__RRR_R_R__R___RC_RRR__R__R_____RRR___
RRR_C_R_R_R_R___RRRR_RR_RR__R__R_____WRR_RR_WRRRR__R__RR_R____R_
................................................................

Scoreboard Key:
"_" Waiting for Connection, "S" Starting up, "R" Reading Request,
"W" Sending Reply, "K" Keepalive (read), "D" DNS Lookup,
"C" Closing connection, "L" Logging, "G" Gracefully finishing,
"I" Idle cleanup of worker, "." Open slot with no current process

Srv PID   Acc        M CPU   SS Req Conn Child Slot  Client VHost Request
0-0 23585 0/218/2420 R 20.03 39 7   0.0  2.48  32.84 ?      ?     ..reading..
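The scoreboard is one character per worker slot, so it can be tallied directly. The sketch below is illustrative only: the function name `tally_scoreboard` is hypothetical, the sample string is made up rather than copied from the slide, and treating every state other than "_" and "." as active is a simplification of how mod_status itself counts busy workers.

```python
from collections import Counter


def tally_scoreboard(scoreboard):
    """Count worker states in a mod_status scoreboard string."""
    # Ignore line breaks and spaces so multi-line scoreboards can be pasted in.
    counts = Counter(ch for ch in scoreboard if not ch.isspace())
    idle = counts["_"]        # "_" Waiting for Connection
    open_slots = counts["."]  # "." Open slot with no current process
    active = sum(counts.values()) - idle - open_slots
    return {
        "idle": idle,
        "open": open_slots,
        "active": active,
        "by_state": dict(counts),
    }


sample = "RR_W_C.."  # hypothetical 8-slot scoreboard, not taken from the slide
summary = tally_scoreboard(sample)
print(summary["active"], summary["idle"], summary["open"])  # prints: 4 2 2
```

On the slide, most active workers are in state "R" (Reading Request), which this kind of tally makes easy to spot through the `by_state` counts.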

Slide 20

Slide 20 text

httpd.conf (server-status)

LoadModule status_module modules/mod_status.so

# ExtendedStatus controls whether Apache will generate "full" status
# information (ExtendedStatus On) or just basic information (ExtendedStatus
# Off) when the "server-status" handler is called. The default is Off.
#
ExtendedStatus On

<Location /server-status>
    SetHandler server-status
    Order deny,allow
    Deny from all
    Allow from [some list of networks or IP addresses]
</Location>
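mod_status also offers a machine-readable variant of this page when `?auto` is appended to the status URL, returned as plain `Key: value` lines. The Python sketch below parses such output from a string. The sample text is an assumption: it is shortened, reuses the counts shown on the previous slide, and is not a captured response; `parse_status_auto` is a hypothetical helper name.

```python
def parse_status_auto(text):
    """Parse 'Key: value' lines from server-status?auto into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that contain no colon
            fields[key.strip()] = value.strip()
    return fields


# Shortened, illustrative sample; a real response has more fields and a
# full-length Scoreboard line.
SAMPLE = """\
Total Accesses: 519889
BusyWorkers: 101
IdleWorkers: 91
Scoreboard: RR___RR_
"""

status = parse_status_auto(SAMPLE)
busy = int(status["BusyWorkers"])
idle = int(status["IdleWorkers"])
print(f"{busy} busy, {idle} idle, {busy / (busy + idle):.0%} of workers in use")
```

A monitoring script would fetch the text over HTTP from a host permitted by the Allow list before parsing it; that step is left out here to keep the sketch free of network access.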

Slide 21

Slide 21 text

Enhancement: Caching
- PHP provides APC (file and opcode cache)
- File system cache (VFS, backing systems)
- Database server cache (maybe)
- Applications frequently provide caching (e.g. WP-Cache, WP Super Cache, Joomla System Cache, Drupal Boost, ...)
- Others (memcached, Redis, ...): hard in bulk hosting environments
- Bottom line: It depends on the customer

Slide 22

Slide 22 text

APC - Alternative PHP Cache

Slide 23

Slide 23 text

APC (continued)
- APC does not really help in an suexec environment
- We have seen APC cause issues due to... caching!
- Plan on "doing something" to flush the cache whenever you deploy new PHP code.

Slide 24

Slide 24 text

Show some data!!!
- Number of customers: ~70
- Number of Apache servers: 18 prod (including Gatorlink webmail)
- Number of vhosts: ~375
- Busy site: http://www.ufl.edu
- www.ufl.edu visits per month: 1.3 million
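To put the monthly figure in perspective, it can be converted to an average rate. This is a rough back-of-the-envelope sketch: the 30-day month is an assumption, `average_per_second` is a hypothetical helper, and a visit is not the same thing as an HTTP request, so the result is not comparable to the requests/sec figure reported by mod_status.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_per_second(visits_per_month, days_per_month=30):
    """Average visits per second, assuming traffic is spread evenly over the month."""
    return visits_per_month / (days_per_month * SECONDS_PER_DAY)


# 1.3 million visits per month, from the slide.
rate = average_per_second(1_300_000)
print(f"{rate:.2f} visits/second on average")  # prints: 0.50 visits/second on average
```

Real traffic is bursty, so peak rates will be well above this average.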

Slide 25

Slide 25 text

Server Hardware
- All of our Apache servers are virtual machines (VMware vSphere 5)
- Each virtual machine has modest resources:
  - 2 processor cores
  - 4 GB RAM
  - 18 GB disk (plus NFS-mounted home dirs)

Slide 26

Slide 26 text

Versions
- Koolaid or homebrew? Apache 2.2.x, PHP 5.2.x
- Worth building packages?
  * security, maintainability, sustainability
- Bulk hosting has unique challenges
- Panels and 1-clicks and backups

Slide 27

Slide 27 text

Enhancement: Miscellaneous
- Other miscellaneous enhancements such as Red Hat upgrades, configuration changes, automation, etc.

Slide 28

Slide 28 text

Questions? Thank you! Please send Dan Stoner all of your questions.

Slide 29

Slide 29 text

Image sources
http://hikaru.tea-nifty.com/robo/cat1176097/index.html
http://vi.sualize.us/customer_customer_suggestion_box_picture_3Ukw.html
http://www.imdb.com/title/tt0046719/
http://profitablegrowth.com/you-must-understand-we-all-wear-many-hats-here/
http://www.rosehosting.com/blog/how-to-install-lamp-linux-apache-mysql-and-php-on-centos-6-with-phpmyadmin-and-apc-cache/
http://koldfusion.ca/wp/2007/06/trivia-for-hackers-the-movie/