Avoiding greedy webclients with mod_bwshare
Posted by Steve on Fri 3 Nov 2006 at 09:36
If you're running a popular website you'll most likely notice that some clients are less well-behaved than others. Greedy clients can do anything from making numerous requests to attempting to spider your entire site. One simple way of preventing these clients from slowing down your server is with the mod_bwshare module for Apache2.
Unfortunately mod_bwshare is not yet packaged for Debian GNU/Linux; however, installing it from the source code is very straightforward.
First of all download the latest source code from the bwshare website.
Note
Annoyingly the site is set up so that you have to accept a cookie before you can view it. Still, it is a small price to pay for such good code.
Once you have the code you'll need to ensure that you have a compiler, etc., installed. This might not be an issue for most people, but I try to avoid installing compilers on production servers, so if you're like me you'll need to install one first:
root@secret:~# apt-get install build-essential unzip
In addition to the compiler (which you may have just installed) you'll also need the Apache2 development libraries and headers. You can install these by running:
root@secret:~# apt-get install apache2-dev
Now you're ready to begin the installation. Unpack your downloaded source code somewhere:
secret:~# unzip mod_bwshare-0.2.0.zip
Archive: mod_bwshare-0.2.0.zip
creating: mod_bwshare-0.2.0/
inflating: mod_bwshare-0.2.0/config.m4
..
..
inflating: mod_bwshare-0.2.0/mod_bwshare.c
secret:~# cd mod_bwshare-0.2.0
secret:~/mod_bwshare-0.2.0#
Now, using the apxs2 command, you can compile and install the module in one easy step:
secret:~/mod_bwshare-0.2.0# apxs2 -cia mod_bwshare.c
/usr/bin/libtool ....
...
...
...
----------------------------------------------------------------------
Libraries have been installed in:
/usr/lib/apache2/modules
...
...
[activating module `bwshare' in /etc/apache2/httpd.conf]
If all goes well the module will be compiled and the following line will be automatically added to /etc/apache2/httpd.conf:
LoadModule bwshare_module /usr/lib/apache2/modules/mod_bwshare.so
If we were following the standard Debian mechanism for working with Apache2 sites & modules we would expect to use the file /etc/apache2/mods-available/bwshare.load for loading the module. However, since we're compiling from source I like to keep things obviously different, so in this case we'll leave things as-is.
Before we restart our Apache2 server we should actually configure the module, since so far we've only caused it to be loaded when Apache2 (re)starts.
A minimal configuration can be placed in the file /etc/apache2/conf.d/bwshare.conf:
<IfModule mod_bwshare.c>
# 1.
<Location /bwshare-info>
SetHandler bwshare-info
</Location>
# 2.
<Location /bwshare-trace>
SetHandler bwshare-trace
</Location>
# 3.
# Some bandwidth control parameters.
<Directory />
BW_tx1debt_max 30
BW_tx1cred_rate 0.095
BW_tx2debt_max 3000000
BW_tx2cred_rate 2500
</Directory>
</IfModule>
This snippet does three things:
- Sets up a handler so that we can query information about the active client connections.
- Sets up another handler for viewing information about active/recent client connections.
- Sets up the limits which apply to the / directory and everything beneath it.
Now that we've enabled and configured the module we can restart our server with the following command:
secret:~# /etc/init.d/apache2 reload
Reloading web server config...done.
secret:~#
All being well your Apache2 server should restart and you should be able to view interesting information at the following two URIs:
- http://yourserver.example.com/bwshare-info
- http://yourserver.example.com/bwshare-trace
Now we need to explain the various magic numbers which we used to control the limits. If you look back at the configuration snippet we entered you'll see various names and numbers being used. Here is what they mean:
- BW_tx1cred_rate
This sets the maximum rate of serving files (files/second).
- BW_tx1debt_max
This sets the maximum number of files to serve in excess of BW_tx1cred_rate (files).
- BW_tx2cred_rate
This sets the maximum rate of serving bytes (bytes/second).
- BW_tx2debt_max
This sets the maximum number of bytes to serve in excess of BW_tx2cred_rate (bytes).
This might be a little bit hard to understand, but a little trial and error should make it clearer.
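If trial and error against a live server feels risky, a toy model can help build some intuition first. The Python sketch below is only my reading of the descriptions above and is not taken from the mod_bwshare source: I'm assuming each served file adds one unit of "debt", that debt is paid off continuously at the credit rate, and that a request is refused when it would push the debt past the maximum. The DebtBucket class and served_in_burst function are names I made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class DebtBucket:
    """Toy model of one limit pair (hypothetical, not the module's real code)."""
    debt_max: float    # e.g. BW_tx1debt_max (files) or BW_tx2debt_max (bytes)
    cred_rate: float   # e.g. BW_tx1cred_rate (files/s) or BW_tx2cred_rate (bytes/s)
    debt: float = 0.0
    last_seen: float = 0.0

    def request(self, now: float, cost: float = 1.0) -> bool:
        """Return True if a request of the given cost is served at time `now`."""
        # Credit earned since the previous request pays off some of the debt.
        elapsed = now - self.last_seen
        self.debt = max(0.0, self.debt - elapsed * self.cred_rate)
        self.last_seen = now
        if self.debt + cost > self.debt_max:
            # Assumption: a refused request does not add to the debt.
            return False
        self.debt += cost
        return True


def served_in_burst(bucket: DebtBucket, requests: int, interval: float) -> int:
    """Count how many of `requests` evenly spaced requests get served."""
    return sum(bucket.request(now=i * interval) for i in range(requests))


# File limits from the article: BW_tx1debt_max 30, BW_tx1cred_rate 0.095.
# A client hammering "Reload" ten times a second for ten seconds:
greedy = DebtBucket(debt_max=30, cred_rate=0.095)
print(served_in_burst(greedy, requests=100, interval=0.1))   # 30

# A polite client asking for one file every 11 seconds:
polite = DebtBucket(debt_max=30, cred_rate=0.095)
print(served_in_burst(polite, requests=100, interval=11.0))  # 100
```

Under these assumptions the greedy client gets its first 30 files and is then refused until enough credit has built up, while the polite client is never refused. Byte limits would work the same way with cost set to the size of each response.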
Open your main site in a browser and repeatedly hit "Reload"; after a while you will receive an error page "You've been greedy .. your next request will be honoured after XX seconds".
Once you've paused for the correct timeout take a look at the /bwshare-trace location and you'll see the figures that have been used - with the exceeded columns being coloured red.
Tweak the numbers as you see fit if the limits are either too high or too low.
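Before tweaking, it can help to turn the settings into more familiar units. The short Python sketch below does that arithmetic for the example values used in this article. It assumes, as the descriptions above suggest, that the credit rate is the long-run rate a single client may sustain and that the debt maximum is the size of the burst allowed on top of it; the two helper functions are hypothetical names of mine, not part of mod_bwshare.

```python
def sustained_interval(cred_rate: float) -> float:
    """Seconds between requests that a client could keep up indefinitely."""
    return 1.0 / cred_rate


def full_recovery_seconds(debt_max: float, cred_rate: float) -> float:
    """Seconds of silence needed to pay off a completely used-up allowance."""
    return debt_max / cred_rate


# File limits: BW_tx1debt_max 30, BW_tx1cred_rate 0.095
print(round(sustained_interval(0.095), 1))       # 10.5 seconds per file
print(round(full_recovery_seconds(30, 0.095)))   # 316 seconds

# Byte limits: BW_tx2debt_max 3000000, BW_tx2cred_rate 2500
print(full_recovery_seconds(3_000_000, 2500))    # 1200.0 seconds
```

So with the example file limits a client can burst up to 30 files, but over the long run only about one file every ten and a half seconds. If that looks too strict for pages that pull in many images or stylesheets, raising the file limits is probably the first thing to try.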
More details are available at the bwshare homepage - and there are more options you can experiment with, for example enabling/disabling the module on a per-virtualhost basis.
I'm very pleased with the performance and stability of the software and have found it an extremely simple way of preventing badly behaved clients from killing my server(s).
What I'd usually suggest is adding an ACL like this:
<Location /bwshare-trace>
SetHandler bwshare-trace
order deny,allow
deny from all
allow from 127.0.0.1
</Location>
I just left that out in the interests of simplicity.
You could start by taking a look at:
- mod_bwshare (you could just use this to limit the transfer speed and ignore the other settings.)
- mod_cband
- mod_throttle
- mod_bandwidth
- etc.
Personally I believe that if somebody is being abusive it makes more sense to block them, and explain it to them, rather than let them mirror your contents very slowly. I guess.
It suits my purposes better than this one.
But creating a user account and clicking through a few pages I quickly got stopped and had to wait 60+ seconds before I could try another request. I think the threshold should at least be high enough to allow for normal browsing.
I am also using kontact to receive the RSS feed for this site, and this module breaks my summary view for debian-administration, forcing me to restart kontact if I want to see the headlines of the latest stories.
Wouldn't it be better to slow down requests iff the server is terribly busy, and to never block requests altogether?
I will increase the limits since you seem to have been hit legitimately.
Still, you'd be amazed at the number of RSS readers that want to poll the feeds every minute - I definitely need something in place to avoid that kind of abuse.