Some simple Apache optimisations

Posted by Steve on Mon 18 Jul 2005 at 10:02

Apache is the world's most popular webserver, powering over half the websites on the internet. It is a stable and reliable platform, but sometimes it struggles under a lot of load. Here we'll look at a couple of simple changes to increase performance when handling a lot of traffic.

None of these tips are revolutionary, but combined they have allowed this site to stay up under two slashdottings. If you've not heard the term before, a Slashdotting is what happens when a popular website such as Slashdot links to a smaller site - suddenly there are thousands of visitors all coming to your site. The sudden and sustained increase in incoming requests can easily overload a server.

Frequently the Slashdot effect will knock a site over, either because there's insufficient bandwidth to handle the incoming connections, or because the webserver isn't set up to handle such a large load. This site has survived two such links; most recently, a single article received 16,000 readers in the space of a couple of hours.

So, what can we do to tune Apache? Well, there are several small and large changes that can be made - depending upon your server some of these may not be appropriate, but they've worked for me both here and on other sites I've set up. (At times like this I feel like pimping out my server handholding and remote maintenance services .. ;)

DNS Lookups

One of the biggest sources of slowdown in many webservers is the time required to perform DNS lookups.

Typically a webserver will record the full host name of each incoming client connection in its access.log. This resolving can eat a significant chunk of time, even with a DNS cache.

Disabling DNS lookups, by ensuring your Apache setup contains "HostnameLookups Off" inside either /etc/apache/httpd.conf or /etc/apache2/apache2.conf, can immediately make your server capable of handling more traffic.
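
As a quick check, the following sketch reports which HostnameLookups value is currently active in a given configuration file. The function name hostname_lookups_setting is made up for illustration, and it only looks at the single file you give it (it doesn't follow Include directives), so treat it as a rough check rather than a definitive answer.

```shell
#!/bin/sh
# Hypothetical helper: print the last active HostnameLookups value in a
# config file, or "unset" if the directive does not appear at all.
# Assumption: only the named file is inspected; Include'd files are ignored.
hostname_lookups_setting() {
    # Commented-out lines are skipped because their first field starts with "#".
    awk 'tolower($1) == "hostnamelookups" { value = $2 }
         END { if (value == "") print "unset"; else print value }' "$1"
}

# Example: hostname_lookups_setting /etc/apache2/apache2.conf
```

If this prints anything other than "Off" (or "unset" on a server whose default you have confirmed), the setting is worth a second look.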

You might be concerned that this will make your server log files less readable, and affect any log file analysis you might wish to perform. But thankfully the Debian Apache package ships with the logresolve tool - this will perform hostname lookups upon your log file, and write out a new one containing the resolved names.

If you use webalizer or Awstats you can use the logresolve tool to add in the host names before the stats are generated.

I use webalizer to produce my site's statistics and simply instruct it to read its logfile from access.log.resolved instead of the more typical access.log. I produce this file once a day, just before producing the statistics, with the following small script:

#!/bin/sh
# Resolve the addresses in each site's access log, then run webalizer on it.

for site in www.site1.com www.site2.com; do
    # Skip this site if its log directory is missing.
    cd "/home/www/$site/logs" || continue
    logresolve < access.log > access.log.resolved
    /usr/bin/webalizer -q
done

MaxClients

When Apache starts up it will create a number of listening processes, each of which will handle a given number of clients then exit.

(This process is complicated somewhat by the different MPM models available in Apache2 - but in general it's a fair statement.)

If you have a lot of incoming clients you can immediately handle more just by increasing the relevant counts.

If your server has reached the limit of what it can handle you'll see something like this in your error.log file:

[error] server reached MaxClients setting, consider raising the MaxClients setting
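
To see whether this has happened on your server, you can count the matching lines in the error log. This is a minimal sketch: the function name count_maxclients_hits is invented for illustration, and the log path in the example comment is an assumption that depends on your setup.

```shell
#!/bin/sh
# Hypothetical helper: count how many times Apache logged that it hit the
# MaxClients limit in the given error log.
count_maxclients_hits() {
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c 'server reached MaxClients setting' "$1"
}

# Example (path is an assumption): count_maxclients_hits /var/log/apache2/error.log
```

Bear in mind that Apache may only log the message once after each start, so a low count doesn't necessarily mean the limit was only reached briefly.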

The settings look like this, although if you're using Apache2 you'll discover that your apache2.conf file has multiple versions of these settings, one for each of the process models available:

StartServers         5
MinSpareServers      5
MaxSpareServers     10
MaxClients          35
MaxRequestsPerChild  0

The way to adjust these is to increase each number by a small amount. This should allow you to handle more simultaneous clients, at the expense of running more processes. There's a fine balance to be maintained between running enough processes to handle the traffic, and running so many that your server slows down due to increased load.

Adjusting these settings appropriately will almost certainly be the single most useful change you can make to your server, but it's hard to give appropriate numbers. It really will depend upon your server, and what else you're running.
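
One common rule of thumb (not something taken from this site's configuration) is to divide the memory you are willing to give Apache by the typical size of one Apache process. The sketch below just does that arithmetic; the function name and the example figures are made up, and the result is only a starting point for experimentation.

```shell
#!/bin/sh
# Hypothetical helper: estimate an upper bound for MaxClients.
#   $1 = memory you can spare for Apache, in MB (assumption: you pick this)
#   $2 = typical resident size of one Apache process, in MB
estimate_max_clients() {
    # Integer division: any fractional process is rounded down.
    echo $(( $1 / $2 ))
}

# Example: 512 MB spare and roughly 12 MB per process
#   estimate_max_clients 512 12    (prints 42)
```

On a Debian system something like "ps -C apache2 -o rss=" should list the resident size, in kilobytes, of each running Apache process, which gives you a figure for the second argument. Shared memory is counted in each process, so the estimate tends to be on the cautious side.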

KeepAlive

Using KeepAlive is closely related to the MaxClients setting above.

Essentially KeepAlive keeps each listening connection alive for a short time to receive a potential followup request. Assuming that a client wishes to make several requests to your server it can do so en masse without having to make multiple distinct connections.

In this scenario KeepAlive is a useful optimisation, but it can mean that you have a lot of connections open uselessly waiting for followup requests which never occur.

A possible solution here is to allow KeepAlive, but only for a few seconds. This means that any client which requests another page quickly will receive it, but if it doesn't then the listening will stop - allowing your server to handle another connection instead.

To do this use:

#
#  Keep connections alive, but only for two seconds.
#
KeepAlive On
KeepAliveTimeout 2

Deny OverRides

Another common source of slowdown in Apache is the use of .htaccess files to change Apache's behaviour.

Many settings can be altered on a per-directory basis using these files, but looking for them and reading them will cause the server to slow down, and do more work than it really needs to.

For example the following URL:

This file should be something that Apache can serve quickly; there's nothing (obviously) dynamic about it. But if you allow the use of "Override files" then Apache must look for, and process, a .htaccess file in every directory along the path to it.

Setting "AllowOverride None" inside any virtual hosts or directory directives you might have will disable this searching and reduce the amount of file testing and reading your server will need.

Of course many times you will discover that you need some directories to have specific processing - the solution here is to add such configuration settings inside your Apache setup directly.
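
Before turning overrides off it helps to know which .htaccess files you actually have, so that their contents can be moved into the main configuration. A minimal sketch, assuming your document root is passed as the argument (the function name and the example path are illustrative only):

```shell
#!/bin/sh
# Hypothetical helper: list every .htaccess file beneath a directory.
list_htaccess_files() {
    find "$1" -type f -name .htaccess
}

# Example (path is an assumption): list_htaccess_files /var/www
```

Each file it lists is a candidate for a matching Directory section in your Apache configuration.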

Compress Content

Compress your content with mod_deflate, or mod_gzip, if you can.

Whilst there's some CPU overhead in performing this compression, when serving a lot of mostly static content network saturation is usually a bigger problem than CPU overload.

If you do run into CPU load issues you can easily disable the compression once you spot them.
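
If you want a rough idea of how much bandwidth compression might save on a particular file, you can compress it by hand and compare the sizes. This sketch uses the gzip command as a stand-in for what mod_deflate or mod_gzip would do (the modules may compress slightly differently depending on their settings), and the function name is made up.

```shell
#!/bin/sh
# Hypothetical helper: print the original and gzip-compressed size of a file,
# in bytes, as "original compressed".
compare_compressed_size() {
    original=$(wc -c < "$1")
    compressed=$(gzip -c "$1" | wc -c)
    # Arithmetic expansion strips any padding some wc versions add.
    echo "$(( original )) $(( compressed ))"
}

# Example (path is an assumption): compare_compressed_size /var/www/index.html
```

Text such as HTML and CSS usually shrinks a lot; images and archives that are already compressed will barely change.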

Remove Debugging Logs

Many Apache modules such as mod_rewrite (used for making prettier URLs) or mod_security (a simple security module) allow you to set up logfiles useful for debugging problems.

If you're happy that your setup is working correctly then you no longer need these debugging logfiles, so entries such as the following should be removed:

RewriteLog        /tmp/rewrite.log
SecFilterDebugLog /var/log/apache2/modsec_debug_log
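
To track down any such directives still lurking in your configuration, a recursive grep is usually enough. A minimal sketch: the function name is made up, the directory in the example is an assumption, and only the two directives named above are searched for.

```shell
#!/bin/sh
# Hypothetical helper: show active (uncommented) debug-log directives
# anywhere beneath a configuration directory, with file name and line number.
# Assumption: your grep supports -r (GNU grep, as shipped with Debian, does).
find_debug_logs() {
    # -i because Apache directive names are case-insensitive; the trailing
    # whitespace match keeps RewriteLogLevel from being reported.
    grep -rniE '^[[:space:]]*(RewriteLog|SecFilterDebugLog)[[:space:]]' "$1"
}

# Example (path is an assumption): find_debug_logs /etc/apache2
```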

Hopefully those small tips will allow you to set up your server to handle more load, and perform more efficiently if you get slashdotted.

If you're routinely suffering from lots of load these tips might not be so useful; instead you might need to consider:

Both of these solutions will ease the load on your servers, but they are overkill for smaller sites.

If you have any tips of your own to share feel free to leave them in the comments!


This article can be found online at the Debian Administration website at the following bookmarkable URL (along with associated comments):

This article is copyright 2005 Steve - please ask for permission to republish or translate.