Highly Available, Fault Tolerant, Distributed Apache for the Masses

Martin Smith
October 17, 2012

Transcript

  1. (Attempts at)
    Highly Available,
    Fault Tolerant,
    Distributed Apache
    for the Masses
    Martin Smith
    @martinb3
    Dan Stoner
    @thatlinuxbox

  2. Who we are
    Sysadmins, programmers, software
    engineers, system architects, customer
    service representatives, on-call staff,
    marketing, technical support, teachers

  3. What we're trying to do
    ("The Goal")
    - Insulate customers from each other
    - Provide a fault tolerant environment
    - Create the most performant environment
    - Provide bulk hosting
    - Create an architecture that is easy to
    administer, with fewer special cases

  4. How do we determine success
    - Websites are available, fast, reliable,
    consistent
    - Efficient utilization of resources
    - Keep system load averages below 5
    - Protect against massive resource
    consumption while still allowing burst
    - Maintainability is also an important factor
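
The load-average target above can be checked mechanically. This is a minimal sketch, assuming a Unix-like host where `os.getloadavg()` is available; the function name `load_ok` and the choice to test all three averages are illustrative assumptions, not the presenters' monitoring setup.

```python
import os

LOAD_THRESHOLD = 5.0  # target from the slide: keep load averages below 5


def load_ok(load_averages, threshold=LOAD_THRESHOLD):
    """Return True when every load average (1, 5, 15 min) is below the threshold."""
    return all(value < threshold for value in load_averages)


if __name__ == "__main__":
    # os.getloadavg() is only available on Unix-like systems.
    one, five, fifteen = os.getloadavg()
    status = "OK" if load_ok((one, five, fifteen)) else "HIGH"
    print(f"load averages: {one:.2f} {five:.2f} {fifteen:.2f} -> {status}")
```

Checking all three windows is a deliberately strict choice; a real alerting rule might look only at the 5- or 15-minute average to tolerate short bursts.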

  5. Despite what you may have heard,
    Apache is still popular and the de facto standard
    Source: http://news.netcraft.com/archives/2012/10/02/october-2012-web-server-survey.html

  6. Busy sites still run Apache
    Source: http://news.netcraft.com/archives/2012/10/02/october-2012-web-server-survey.html

  7. Starting point:
    Most basic Apache configuration
    - One server, multiple virtual hosts
    - Local disk, local logs
    - mod_php, mod_perl, mod_foo for CGI
    - suexec for anything that isn't one of the above

  8. Enhancement: Load balancing
    - Explain different load balancing technologies
    - Explain some of our previous and current
    decisions for load balancing
    - Explain the effects on customers, difficulty of
    clustering different software

  9. (no transcript text for this slide)

  10. Graphs -
    Apache Requests per Second (all nodes total)

  11. Graphs -
    Apache Requests per Second (single node)

  12. Graphs -
    System Load Average (single node)

  13. Graphs -
    System Memory Profile (single node)

  14. CNS Managed Apache Hosting Overview

  15. Enhancement: Document Roots
    - NFS shares, replication
    - Giant directory vs. home
    directories
    - Pitfalls and Problems
    with this approach
    - Fun Apache behavior

  16. httpd.conf
    - Default virtual hosts become important
    - CGI is the hardest problem

  17. Enhancement: CGI
    - mod_php, mod_perl
    - suexec
    - FastCGI for PHP, Perl
    - Other enhancements for speed
    - Pros and cons with these approaches
    - What do large scale hosting providers do?

  18. httpd.conf (threading model)
    # worker MPM
    # StartServers: initial number of server processes to start
    # ServerLimit: upper limit on the number of server processes
    # MaxClients: maximum number of simultaneous client connections
    # MinSpareThreads: minimum number of worker threads which are kept spare
    # MaxSpareThreads: maximum number of worker threads which are kept spare
    # ThreadsPerChild: constant number of worker threads in each server process
    # MaxRequestsPerChild: maximum number of requests a server process serves

    StartServers 2
    ServerLimit 32
    MaxClients 2048
    MinSpareThreads 32
    MaxSpareThreads 128
    ThreadsPerChild 64
    MaxRequestsPerChild 0
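
The worker MPM values above are related by simple arithmetic: the number of worker threads can never exceed ServerLimit times ThreadsPerChild, so MaxClients has to fit under that product. The sketch below only re-does that arithmetic with the slide's numbers; the helper name `worker_capacity` and its return keys are made up for illustration.

```python
def worker_capacity(server_limit, threads_per_child, max_clients, start_servers):
    """Summarize thread capacity for a worker MPM configuration."""
    ceiling = server_limit * threads_per_child  # most threads the process limit allows
    return {
        "thread_ceiling": ceiling,
        "max_clients_fits": max_clients <= ceiling,
        "processes_at_max_clients": max_clients // threads_per_child,
        "threads_at_startup": start_servers * threads_per_child,
    }


# Values copied from the slide.
summary = worker_capacity(
    server_limit=32, threads_per_child=64, max_clients=2048, start_servers=2
)
for name, value in summary.items():
    print(f"{name}: {value}")
```

With these settings the ceiling is exactly 32 x 64 = 2048, so MaxClients uses every process the ServerLimit permits, and the two processes started at boot provide 128 threads.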

  19. mod_status (server-status) is cool!
    Total accesses: 519889 - Total Traffic: 6.1 GB
    CPU Usage: u129.93 s40.41 cu0 cs0 - 1.17% CPU load
    35.8 requests/sec - 439.7 kB/second - 12.3 kB/request
    101 requests currently being processed, 91 idle workers
    RR___RR_R_RRRCRW____R_W_R_R_____RR_R_RRR_R__RR_RRRR___RRRRR_R_RW
    RRRRRR__RR_RRWRCR_R______R__RRR_R_R__R___RC_RRR__R__R_____RRR___
    RRR_C_R_R_R_R___RRRR_RR_RR__R__R_____WRR_RR_WRRRR__R__RR_R____R_
    ................................................................

    Scoreboard Key:
    "_" Waiting for Connection, "S" Starting up,
    "R" Reading Request, "W" Sending Reply,
    "K" Keepalive (read), "D" DNS Lookup,
    "C" Closing connection, "L" Logging,
    "G" Gracefully finishing, "I" Idle cleanup of worker,
    "." Open slot with no current process
    Srv PID Acc M CPU SS Req Conn Child Slot Client VHost Request
    0-0 23585 0/218/2420 R 20.03 39 7 0.0 2.48 32.84 ? ? ..reading..
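
The scoreboard is easy to summarize in a script. In the output above, 101 busy plus 91 idle workers add up to 192, which matches three running processes of 64 threads each. The following is a rough sketch that counts states in a scoreboard string using the key from the slide; the function name, the short sample string, and the simplification of treating every state other than "_" and "." as busy are assumptions, not how mod_status itself computes its totals.

```python
from collections import Counter

# Scoreboard key as shown on the slide.
STATE_NAMES = {
    "_": "waiting for connection",
    "S": "starting up",
    "R": "reading request",
    "W": "sending reply",
    "K": "keepalive (read)",
    "D": "DNS lookup",
    "C": "closing connection",
    "L": "logging",
    "G": "gracefully finishing",
    "I": "idle cleanup of worker",
    ".": "open slot with no current process",
}


def summarize_scoreboard(scoreboard):
    """Count worker states in a scoreboard string, ignoring unknown characters."""
    counts = Counter(ch for ch in scoreboard if ch in STATE_NAMES)
    idle = counts["_"]
    open_slots = counts["."]
    # Simplification: anything that is neither idle nor an open slot counts as busy.
    busy = sum(counts.values()) - idle - open_slots
    return {"busy": busy, "idle": idle, "open_slots": open_slots, "by_state": dict(counts)}


# Hypothetical short scoreboard, not the one from the slide.
sample = "RR__W_CK.."
result = summarize_scoreboard(sample)
print(result["busy"], "busy,", result["idle"], "idle,", result["open_slots"], "open slots")
```

Pasting the real scoreboard lines into `summarize_scoreboard` gives a per-state breakdown that can be graphed over time alongside the requests-per-second figures.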

  20. httpd.conf (server-status)
    LoadModule status_module modules/mod_status.so
    # ExtendedStatus controls whether Apache will generate "full" status
    # information (ExtendedStatus On) or just basic information (ExtendedStatus
    # Off) when the "server-status" handler is called. The default is Off.
    #
    ExtendedStatus On

    <Location /server-status>
        SetHandler server-status
        Order deny,allow
        Deny from all
        Allow from [some list of networks or IP addresses]
    </Location>
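
The effect of `Order deny,allow` with `Deny from all` is that only clients matching an `Allow` entry can reach the status page. The sketch below models that evaluation order for IPv4 addresses only; the function name and the example networks are hypothetical stand-ins for whatever list a site would actually use.

```python
import ipaddress


def allowed_deny_allow(client_ip, deny_nets, allow_nets):
    """Model `Order deny,allow`: a matching Allow overrides a matching Deny,
    and a client matching neither list is allowed."""
    ip = ipaddress.ip_address(client_ip)
    if any(ip in net for net in allow_nets):
        return True
    if any(ip in net for net in deny_nets):
        return False
    return True


# "Deny from all" modeled as the whole IPv4 space (IPv6 is left out of this sketch).
DENY_ALL = [ipaddress.ip_network("0.0.0.0/0")]
# Hypothetical management networks standing in for the elided Allow list.
ALLOW = [ipaddress.ip_network("10.0.0.0/8"), ipaddress.ip_network("192.0.2.0/24")]

for client in ("10.1.2.3", "203.0.113.9"):
    print(client, allowed_deny_allow(client, DENY_ALL, ALLOW))
```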

  21. Enhancement: Caching
    - PHP provides APC (file and opcode cache)
    - File System cache (VFS, backing systems)
    - Database server cache (maybe)
    - Applications frequently provide caching (e.g.
    WP-Cache, WP Super Cache, Joomla System
    Cache, Drupal Boost, ...)
    - Others (memcached, redis, ...):
    hard in bulk hosting environments
    - Bottom line: It depends on the customer

  22. APC - Alternative PHP Cache

  23. APC (continued)
    - APC does not really help in an suexec
    environment
    - We have seen APC cause issues due to...
    caching!
    - Plan on "doing something" when you deploy
    new PHP code to flush the cache.
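
One common way a code cache can serve stale code is when it does not re-check file modification times (APC has an `apc.stat` setting that controls this check). The toy model below is not APC and does not reproduce the issues the presenters saw; it is a sketch, with made-up class and method names, of why a deploy step might need an explicit flush.

```python
class ToyOpcodeCache:
    """Toy model of a path-keyed code cache; an illustration, not APC itself."""

    def __init__(self, check_mtime):
        # Loosely analogous to APC's apc.stat setting.
        self.check_mtime = check_mtime
        self._entries = {}  # path -> (mtime, compiled code)

    def load(self, path, files):
        """Return the cached 'compiled' code for path, recompiling when needed.

        `files` maps path -> (mtime, source) and stands in for the filesystem.
        """
        mtime, source = files[path]
        cached = self._entries.get(path)
        if cached is None or (self.check_mtime and cached[0] != mtime):
            cached = (mtime, f"compiled:{source}")
            self._entries[path] = cached
        return cached[1]

    def flush(self):
        """Drop every cached entry, as a deploy hook might do."""
        self._entries.clear()


files = {"index.php": (1, "v1")}
cache = ToyOpcodeCache(check_mtime=False)
before_deploy = cache.load("index.php", files)

files["index.php"] = (2, "v2")  # simulate deploying new code
stale = cache.load("index.php", files)  # still the old compiled code

cache.flush()
after_flush = cache.load("index.php", files)
print(before_deploy, stale, after_flush)
```

In a real PHP deployment the flush step would typically be a web server reload or a call to APC's `apc_clear_cache()`; which one fits depends on how the PHP processes are run.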

  24. Show some data!!!
    Number of Customers: ~70
    Number of Apache servers: 18 production (including
    Gatorlink webmail)
    Number of vhosts: ~375
    Busy site: http://www.ufl.edu
    www.ufl.edu visits per month: 1.3 million

  25. Server Hardware
    - All of our Apache servers are virtual machines
    (VMware vSphere 5)
    - Each virtual machine has modest resources:
    2 processor cores
    4 GB RAM
    18 GB disk (plus NFS-mounted home dirs)

  26. Versions - koolaid or homebrew?
    Apache 2.2.x, PHP 5.2.x
    - Worth building packages?
    * security, maintainability, sustainability
    - Bulk hosting has unique challenges
    - Panels and 1-clicks and backups

  27. Enhancement: Miscellaneous
    - Other miscellaneous enhancements such as Red Hat
    upgrades, configuration changes, automation,
    etc.

  28. Questions? Thank you!
    Please send Dan Stoner all of your questions.

  29. Image sources
    http://hikaru.tea-nifty.com/robo/cat1176097/index.html
    http://vi.sualize.us/customer_customer_suggestion_box_picture_3Ukw.html
    http://www.imdb.com/title/tt0046719/
    http://profitablegrowth.com/you-must-understand-we-all-wear-many-hats-here/
    http://www.rosehosting.com/blog/how-to-install-lamp-linux-apache-mysql-and-php-on-centos-6-with-phpmyadmin-and-apc-cache/
    http://koldfusion.ca/wp/2007/06/trivia-for-hackers-the-movie/
