Simple webserver load balancing with pound

Posted by Steve on Tue 20 Sep 2005 at 11:38

There are times when having only a single webserver is insufficient to handle the amount of traffic, or load, you're receiving. In this situation you have several options. If you have the ability to add new webservers into your setup then using pound might be a good approach.

For load-balancing there are several common solutions, depending upon your requirements:

The first solution might be the best one, but if you don't have the money to spend on dedicated hardware (after buying another server) then a software-only solution might be your only option.

Choosing between the other solutions will be a matter of knowing why you're using load balancing:

Using round-robin DNS gives you the ability to set up a pair, or more, of machines and have users "randomly" connect to a different host. This is simple and reasonably effective, but it doesn't give you much redundancy. (If one machine fails then some users will still be sent to that host, and will receive errors.)
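To make the redundancy problem concrete, here is a minimal Python sketch. It assumes a strict rotation between hosts, which is a simplification of how real DNS resolvers and caches behave, and the addresses and the function name are made up for illustration.

```python
from itertools import cycle


def round_robin_outcomes(hosts, failed, requests):
    """Hand out hosts in strict rotation, as a simplified stand-in for round-robin DNS.

    Returns a list of (host, ok) pairs; ok is False when the chosen host is down.
    Nothing here checks host health, which is exactly the weakness being shown.
    """
    rotation = cycle(hosts)
    outcomes = []
    for _ in range(requests):
        host = next(rotation)
        outcomes.append((host, host not in failed))
    return outcomes


# Hypothetical addresses; the second machine is assumed to have failed.
hosts = ["192.168.1.1", "192.168.1.2", "192.168.1.3"]
outcomes = round_robin_outcomes(hosts, failed={"192.168.1.2"}, requests=9)
errors = sum(1 for _, ok in outcomes if not ok)
print(f"{errors} of {len(outcomes)} requests hit the failed host")
```

With three hosts and one of them down, a third of the requests in this model still land on the dead machine.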

A simple non-DNS based load balancing setup will look something like this:

[Figure: simple cluster diagram]

Here you can see there is a publicly visible host at the front. This will be the main machine to which users connect, www.example.com. Behind that is the actual cluster - incoming connections will be routed to one of those machines via "magic".

The magic involved might be a piece of load-balancing hardware, an Apache module, or something else. In our case it will be an installation of the Pound software.

pound is very simple to understand and use. It is configured with a list of machines in the cluster, and accepts incoming HTTP connections. When a request comes in it will be sent to one of the hosts in the pool.

If your server uses some form of state management, such as cookies, and it is important for a particular client to stay with a particular host for the duration of its connections then this can also be accommodated.

Installing the software is simple:

apt-get install pound

Once installed you can configure it by modifying the /etc/pound/pound.cfg file. Note that by default the package will be installed in a disabled state. Once you've configured the software appropriately you must enable it by editing the file /etc/default/pound and setting startup to 1.

The initial version looks like this:

# Defaults for pound initscript
# sourced by /etc/init.d/pound
# installed at /etc/default/pound by the maintainer scripts

# prevent startup with default configuration
# set the below varible to 1 in order to allow pound to start
startup=0

The configuration of pound comes in three parts:

The global options will likely be set up already to your satisfaction. The only thing you will probably have to change is the IP address to bind to. This can be set with something like this:

ListenHTTP 123.123.123.123,80

pound has the ability to inspect incoming HTTP requests and make decisions based upon the requested URI. This allows you to send some requests, such as all those beneath http://example.com/images, to a particular host. Here we will ignore this, and other options (such as SSL proxying).

Ignoring special handling, then, you'll define your list of machines via settings such as this:

UrlGroup ".*"
BackEnd 192.168.1.1,80,1
BackEnd 192.168.1.2,80,1
BackEnd 192.168.1.3,80,1
EndGroup

Here the UrlGroup prologue means that this setting applies to all incoming URLs (".*" is a regular expression applied against the incoming request URI). The BackEnd settings are a list of IP addresses, ports, and priorities.
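If it helps to picture how a URI is matched against group patterns, the following Python sketch models the idea. It is not pound's code: the first-match-wins ordering, the use of Python's re.search in place of pound's own regular expression matching, and the extra "/images" group with its address are all assumptions made for illustration.

```python
import re


def pick_group(groups, uri):
    """Return the backends of the first group whose pattern matches the URI.

    First-match-wins is an assumption of this sketch.
    """
    for pattern, backends in groups:
        if re.search(pattern, uri):
            return backends
    return None


groups = [
    # Hypothetical special case: send image requests to a dedicated host.
    (r"^/images", ["192.168.1.10"]),
    # The catch-all group from the article: ".*" matches every URI.
    (r".*", ["192.168.1.1", "192.168.1.2", "192.168.1.3"]),
]

image_backends = pick_group(groups, "/images/logo.png")
page_backends = pick_group(groups, "/index.html")
```

Because ".*" matches anything, a catch-all group like this only makes sense as the last entry in such a model.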

The priorities are used to express the relative power of the webserver at the given IP address. The acceptable values are 1-9, and those servers listed with a higher priority will receive more connections.
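As a rough illustration of the "address,port,priority" format, this Python sketch splits a BackEnd line into its three fields and enforces the 1-9 priority range described above. The parse_backend helper is hypothetical and is not part of pound.

```python
def parse_backend(line):
    """Parse a 'BackEnd address,port,priority' line into (address, port, priority)."""
    keyword, _, value = line.strip().partition(" ")
    if keyword != "BackEnd":
        raise ValueError(f"not a BackEnd line: {line!r}")
    # Unpacking raises ValueError if there are not exactly three fields.
    address, port, priority = (field.strip() for field in value.split(","))
    port, priority = int(port), int(priority)
    if not 1 <= priority <= 9:
        raise ValueError(f"priority must be between 1 and 9, got {priority}")
    return address, port, priority


parsed = parse_backend("BackEnd 192.168.1.1,80,1")
print(parsed)
```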

For example, if you have two hosts in your cluster (192.168.1.1 and 192.168.1.100), and the machine 192.168.1.100 is twice as powerful as the other, you could use the following to make sure it gets twice as many incoming connections:

UrlGroup ".*"
BackEnd 192.168.1.1,80,1
BackEnd 192.168.1.100,80,2
EndGroup
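The effect of those priorities can be sketched in a few lines of Python. This is a deterministic model that simply repeats each backend in proportion to its priority; it is an assumption made for illustration and not a description of how pound actually picks a backend.

```python
from collections import Counter
from itertools import cycle, islice


def weighted_rotation(backends):
    """Yield backend addresses forever, each repeated in proportion to its priority."""
    expanded = [addr for addr, priority in backends for _ in range(priority)]
    return cycle(expanded)


# The two hosts from the example above, with priorities 1 and 2.
backends = [("192.168.1.1", 1), ("192.168.1.100", 2)]
counts = Counter(islice(weighted_rotation(backends), 300))
print(counts["192.168.1.1"], counts["192.168.1.100"])  # prints: 100 200
```

Over 300 connections the more powerful machine receives 200 of them, twice as many as the other.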

pound will keep track of the status of each of the hosts in the cluster. This means it won't send requests to hosts which have failed. You can configure this checking period with a setting such as:

# Check backend machines every half-minute
Alive 30
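One plausible way to think about such a check is a periodic TCP connection attempt to each backend. The Python sketch below shows that idea only; whether pound probes its hosts in exactly this way is not something the article states, and the function names and the offline fake_probe are hypothetical.

```python
import socket


def tcp_probe(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def live_backends(backends, probe=tcp_probe):
    """Return only the (host, port) pairs that currently pass the probe."""
    return [(host, port) for host, port in backends if probe(host, port)]


def fake_probe(host, port):
    """Hypothetical stand-in so the example runs without a network."""
    return host != "192.168.1.2"


pool = [("192.168.1.1", 80), ("192.168.1.2", 80), ("192.168.1.3", 80)]
alive = live_backends(pool, probe=fake_probe)
print(alive)
```

In a real checker you would call live_backends(pool) with the default probe once per checking period, for instance every 30 seconds to mirror the setting above.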

The only other thing you need to do is to consider how to maintain state. HTTP is a stateless protocol, and to add the illusion of state to it there are several different options in common use:

pound can handle any of these options, but you must tell it which to use. In the case of cookie-based sessions you must also specify the name of the cookie which is being used.

To specify the session type you must add the Session setting to your UrlGroup stanza. The available options are:

IP

The session is kept based on client IP address. Specify this as follows:

Session IP N

BASIC

The session is based upon HTTP "Basic Authentication", use it as follows:

Session BASIC N

URL

The session is specified by a parameter appended to all URLs. You specify the name as follows:

Session URL phpsession N

COOKIE

The sessions are maintained by a cookie passed with each connection. You specify the cookie name as follows:

Session COOKIE cookie-name N

The "N" value is the length of time, in seconds, for which a session will be maintained. Once that time has passed the client may be sent to a different back-end machine.
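The timeout behaviour can be modelled with a small lookup table keyed on whatever identifies the session (a client IP, a cookie value and so on). The Python sketch below is such a model and not pound's implementation. In particular it assumes the timer restarts on every request, which the article does not confirm, and the StickyTable class, the session key and the addresses are made up for illustration.

```python
class StickyTable:
    """Map session keys to backends, forgetting a mapping after ttl idle seconds."""

    def __init__(self, backends, ttl):
        self.backends = list(backends)
        self.ttl = ttl
        self.sessions = {}   # session key -> (backend, time last seen)
        self.next_index = 0  # plain rotation for sessions without a live mapping

    def route(self, key, now):
        entry = self.sessions.get(key)
        if entry is not None and now - entry[1] <= self.ttl:
            backend = entry[0]  # session still live: stay on the same host
        else:
            backend = self.backends[self.next_index % len(self.backends)]
            self.next_index += 1
        # Assumption: every request restarts the idle timer.
        self.sessions[key] = (backend, now)
        return backend


table = StickyTable(["192.168.1.1", "192.168.1.2"], ttl=3600)
first = table.route("auth=abc", now=0)       # new session, gets a backend
second = table.route("auth=abc", now=1800)   # within an hour, same backend
later = table.route("auth=abc", now=5401)    # idle for more than an hour
print(first, second, later)
```

In this run the first two requests go to the same host, while the third arrives after the session has expired and is assigned afresh.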

A complete example, using the cookie name "auth" with sessions lasting for an hour, would look like this:

UrlGroup ".*"
BackEnd 192.168.0.11,20,1
BackEnd 192.168.0.11,21,1
Session COOKIE auth 3600
EndGroup

There are many more options you can tweak in pound and the man page does a good job of explaining them - especially combined with the homepage.

To read the man page run:

man pound

This article can be found online at the Debian Administration website at the following bookmarkable URL:

This article is copyright 2005 Steve - please ask for permission to republish or translate.