Avoiding greedy webclients with mod_bwshare

Posted by Steve on Fri 3 Nov 2006 at 09:36

If you're running a popular website you'll most likely notice that some clients are less well-behaved than others. Greedy clients can do anything from making numerous requests in quick succession to attempting to spider your entire site. One simple way of preventing these clients from slowing down your server is the mod_bwshare module for Apache2.

Unfortunately mod_bwshare is not yet packaged for Debian GNU/Linux; however, installing it from the source code is very straightforward.

First of all download the latest source code from the bwshare website.

Note

Annoyingly the site is set up so that you have to accept a cookie before you can view it. Still, it is a small price to pay for such good code.

Once you have the code you'll need to ensure that you have a compiler and the usual build tools installed. This might not be an issue for most people, but I try to avoid installing compilers on production servers, so if you're like me you'll need to install one first:

root@secret:~# apt-get install build-essential unzip

In addition to the compiler, you'll also need the Apache2 development libraries and headers. You can install these by running:

root@secret:~# apt-get install apache2-dev

Now you're ready to begin the installation. Unpack your downloaded source code somewhere:

secret:~# unzip mod_bwshare-0.2.0.zip
Archive:  mod_bwshare-0.2.0.zip
   creating: mod_bwshare-0.2.0/
  inflating: mod_bwshare-0.2.0/config.m4
  ..
  ..
  inflating: mod_bwshare-0.2.0/mod_bwshare.c
secret:~# cd mod_bwshare-0.2.0
secret:~/mod_bwshare-0.2.0#

Now, using the apxs2 command, you can compile (-c), install (-i), and activate (-a) the module in one step:

secret:~/mod_bwshare-0.2.0# apxs2 -cia mod_bwshare.c
/usr/bin/libtool ....
...
...
...
----------------------------------------------------------------------
Libraries have been installed in:
   /usr/lib/apache2/modules
...
...
[activating module `bwshare' in /etc/apache2/httpd.conf]

If all goes well the module will be compiled and the following line will be automatically added to /etc/apache2/httpd.conf:

LoadModule bwshare_module     /usr/lib/apache2/modules/mod_bwshare.so

If we were following the standard Debian mechanism for working with Apache2 sites & modules we would expect to use the file /etc/apache2/mods-available/bwshare.load for loading the module. However, since we're compiling from source I like to keep things obviously different, so in this case we'll leave things as they are.
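For anyone who would rather follow the Debian convention, the idea could be sketched roughly as below. This is not what the rest of this article does. The helper name write_bwshare_load and its directory argument are made up for illustration, and the module path is simply the one apxs2 reported above.

```shell
#!/bin/sh
# Hypothetical helper: write a Debian-style bwshare.load file.
# $1 is the target directory; it defaults to the usual mods-available path.
write_bwshare_load() {
    dir="${1:-/etc/apache2/mods-available}"
    printf 'LoadModule bwshare_module /usr/lib/apache2/modules/mod_bwshare.so\n' \
        > "$dir/bwshare.load"
}

# Typical use, as root (not run here):
#   write_bwshare_load
#   a2enmod bwshare
# You would also remove the LoadModule line that apxs2 added to
# /etc/apache2/httpd.conf, so the module is not loaded twice.
```

The function only writes the one-line .load file; enabling it with a2enmod and tidying httpd.conf are left as manual steps because they need root and touch the live configuration.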

Before we restart our Apache2 server we should actually configure the module, since so far we've only caused it to be loaded when Apache2 (re)starts.

A minimal configuration can be placed in the file /etc/apache2/conf.d/bwshare.conf:

<IfModule mod_bwshare.c>
    # 1.
    <Location /bwshare-info>
        SetHandler bwshare-info
    </Location>

    # 2.
    <Location /bwshare-trace>
        SetHandler bwshare-trace
    </Location>

    # 3.
    # Some bandwidth control parameters.
    <Directory />
        BW_tx1debt_max          30
        BW_tx1cred_rate         0.095
        BW_tx2debt_max          3000000
        BW_tx2cred_rate         2500
    </Directory>
</IfModule>

This snippet does three things:

  1. Sets up a handler so that we can query information about the active client connections.
  2. Sets up another handler for viewing information about active/recent client connections.
  3. Sets up the limits which apply to the / directory and everything beneath it.

Now that we've enabled and configured the module we can have the server pick up the changes with the following command:

secret:~# /etc/init.d/apache2 reload
Reloading web server config...done.
secret:~#

All being well your Apache2 server should restart and you should be able to view interesting information at the following two URIs:

  • http://yourserver.example.com/bwshare-info
  • http://yourserver.example.com/bwshare-trace

Now we need to explain the various magic numbers used in the configuration. If you look back at the snippet we entered you'll see several directive names and values. Here is what they mean:

BW_tx1cred_rate

This sets the maximum rate of serving files (files/second).

BW_tx1debt_max

This sets the maximum files to serve in excess of BW_tx1cred_rate (files).

BW_tx2cred_rate

This sets the maximum rate of serving bytes (bytes/second).

BW_tx2debt_max

This sets the maximum bytes to serve in excess of BW_tx2cred_rate (bytes).
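As a rough worked example, here is some arithmetic on the sample values from the configuration above. It assumes that a client can run up "debt" to the configured maximum and then pays it off at the credit rate; that is my reading of the descriptions here, not something checked against the module's source. The helper name recovery_seconds is made up for illustration.

```shell
#!/bin/sh
# Rough arithmetic on the sample limits, ASSUMING a client may run up
# debt to the maximum and then pays it off at the credit rate.
# recovery_seconds is a hypothetical helper, not part of mod_bwshare.
recovery_seconds() {
    # $1 = debt maximum (files or bytes), $2 = credit rate (per second)
    awk -v debt="$1" -v rate="$2" 'BEGIN { printf "%.0f\n", debt / rate }'
}

echo "files: $(recovery_seconds 30 0.095) seconds to clear a full debt"
echo "bytes: $(recovery_seconds 3000000 2500) seconds to clear a full debt"
```

Under that assumption a client that has used its whole allowance of 30 files needs a little over five minutes (about 316 seconds) to get back to zero, and one that has used the full 3000000 bytes needs 1200 seconds. The sustained rate of 0.095 files/second works out to roughly one file every 10.5 seconds.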

This might be a little bit hard to understand, but a little trial and error should make it clearer.

Open your main site in a browser and hit "Reload" repeatedly; after a while you will receive an error page saying "You've been greedy .. your next request will be honoured after XX seconds".
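If you'd rather not wear out your Reload button, the same experiment could be sketched from the command line. This assumes curl is installed; the helper name hammer and the URL are placeholders. I haven't noted which HTTP status the module sends with its error page, so the sketch just tallies whatever codes come back and lets you see when they change.

```shell
#!/bin/sh
# Hypothetical helper: request a URL a number of times and tally the
# HTTP status codes returned. Requires curl.
hammer() {
    url="$1"
    count="${2:-40}"
    i=0
    while [ "$i" -lt "$count" ]; do
        # -s: quiet, -o /dev/null: discard the body, -w: print only the status
        curl -s -o /dev/null -w '%{http_code}\n' "$url"
        i=$((i + 1))
    done | sort | uniq -c
}

# Example (not run here; the hostname is a placeholder):
#   hammer http://yourserver.example.com/ 40
```

Each curl call fetches a single file, whereas a browser reload also pulls in images and stylesheets, so expect to need more requests than you did by hand before the limit kicks in.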

Once you've waited out the timeout, take a look at the /bwshare-trace location and you'll see the figures that have been used, with the exceeded columns coloured red.

Tweak the numbers as you see fit if the limits are either too high or too low.

More details are available at the bwshare homepage - and there are more options you can experiment with, for example enabling/disabling the module on a per-virtualhost basis.

I'm very pleased with the performance and stability of the software and have found it an extremely simple way of preventing badly behaved clients from killing my server(s).

 

 


Posted by Anonymous (62.49.xx.xx) on Fri 3 Nov 2006 at 11:34
You might want to add passwords to the bwshare-info and trace pages.


Posted by Steve (80.68.xx.xx) on Fri 3 Nov 2006 at 11:38

What I'd usually suggest is adding an ACL like this:

<Location /bwshare-trace>
    SetHandler bwshare-trace
    order deny,allow
    deny from all
    allow from 127.0.0.1
</Location>

I just left that out in the interests of simplicity.

Steve


Posted by JoshTriplett (66.93.xx.xx) on Fri 3 Nov 2006 at 21:19
Any way to just throttle them down, rather than responding with an error and forcing them to retry? I mostly just want a module which would move from the model where every request gets an equal amount of bandwidth, to a model where every user gets an equal amount of bandwidth spread over all their requests.


Posted by Steve (62.30.xx.xx) on Sat 4 Nov 2006 at 19:38

You could start by taking a look at:

  • mod_bwshare (you could just use this to limit the transfer speed and ignore the other settings.)
  • mod_cband
  • mod_throttle
  • mod_bandwidth
  • etc.

Personally I believe that if somebody is being abusive it makes more sense to block them, and explain it to them, rather than let them mirror your contents very slowly. I guess.

Steve


Posted by Anonymous (81.216.xx.xx) on Thu 30 Nov 2006 at 22:59
There's another nice one at http://ivn.cl/apache called bw_mod which supports throttling per vhost.

It suits my purposes better than this one.


Posted by superbrose (87.113.xx.xx) on Sat 23 Dec 2006 at 19:27
I think at least here on debian-administration mod_bwshare is too restrictive. I am not a spider and not abusive - I am just a human!

But creating a user account and clicking through a few pages I quickly got stopped and had to wait 60+ seconds before I could try another request. I think the threshold should at least be high enough to allow for normal browsing.

I am also using kontact to receive the RSS feed for this site, and this module breaks my summary view for debian-administration, forcing me to restart kontact if I want to see the headlines of the latest stories.

Wouldn't it be better to slow down requests iff the server is terribly busy, and to never block requests altogether?


Posted by Steve (62.30.xx.xx) on Sun 24 Dec 2006 at 12:49

I will increase the limits since you seem to have been hit legitimately.

Still, you'd be amazed at the number of RSS readers that want to poll the feeds every minute - I definitely need something in place to avoid that kind of abuse.

Steve


Posted by superbrose (87.112.xx.xx) on Sun 24 Dec 2006 at 18:43
Thanks a lot. I fully understand your reasoning.
