Saving bandwidth when serving files with Apache2

Posted by Steve on Wed 11 May 2005 at 18:29

If you run a website which mostly serves static files, you can save a lot of bandwidth, in exchange for a slightly higher CPU load, by compressing the files you send to visitors. Both Apache and Apache2 allow this to be set up easily, although the method differs. Here we'll explain how this is achieved with Apache2.

Traditionally mod_gzip has been used to compress files which are sent to client browsers on Debian platforms, and this process is well understood. But with the advent of Apache2 this module is no longer available, and a new technique is required.

For the Debian Apache2 package the alternative is to use mod_deflate.

The principle of both modules is the same: if a visiting web browser requests a page and tells the server that it supports compression (via the Accept-Encoding request header), then each of the documents sent back to that client will be compressed.
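As a rough illustration of that negotiation (not how mod_deflate itself is implemented), the Python sketch below compresses a response body only when the client's Accept-Encoding header offers gzip. The function name maybe_compress and the sample page are invented for this example, and q-values in the header are deliberately ignored.

```python
import gzip

def maybe_compress(body: bytes, accept_encoding: str):
    """Return (payload, extra_headers) for a response body.

    Simplified sketch: a real server also honours q-values
    such as "gzip;q=0", which this function ignores.
    """
    offered = [token.split(";")[0].strip().lower()
               for token in accept_encoding.split(",")]
    if "gzip" in offered:
        headers = {"Content-Encoding": "gzip", "Vary": "Accept-Encoding"}
        return gzip.compress(body), headers
    # Client did not offer gzip: send the body unchanged.
    return body, {}

# Hypothetical page content, repeated to give the compressor something to do.
page = b"<p>Hello, compression!</p>\n" * 200

payload, headers = maybe_compress(page, "gzip, deflate")
print(len(page), "bytes ->", len(payload), "bytes", headers)

payload, headers = maybe_compress(page, "identity")
print(len(page), "bytes ->", len(payload), "bytes", headers)
```

A client that never mentions gzip gets the original bytes back, which is why enabling compression is safe for browsers that don't ask for it.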

Naturally compressing documents as they are being served is going to result in a higher CPU usage than merely sending "plain" documents. But when the bandwidth savings are significant this tradeoff can be worth it.

To avoid compressing files which wouldn't benefit from it, both modules allow you to exclude particular file types. This avoids trying to compress image files, which are typically compressed already.
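To see why already-compressed files are worth excluding, here is a small Python sketch comparing how well gzip shrinks repetitive markup against pseudo-random bytes. The random bytes are only a stand-in for an already-compressed image; both sample inputs are made up for illustration.

```python
import gzip
import random

def gzip_ratio(data: bytes) -> float:
    """Compressed size as a fraction of the original size."""
    return len(gzip.compress(data)) / len(data)

# Repetitive markup, standing in for a typical HTML page (assumed sample).
html = b"<li><a href='/articles/'>An article title</a></li>\n" * 400

# Pseudo-random bytes, standing in for an already-compressed image.
rng = random.Random(0)
image_like = bytes(rng.getrandbits(8) for _ in range(len(html)))

print(f"text:       {gzip_ratio(html):.2%} of original size")
print(f"image-like: {gzip_ratio(image_like):.2%} of original size")
```

The text shrinks dramatically while the image-like data does not shrink at all, so compressing it would cost CPU time for no bandwidth gain.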

If you've installed one of the Apache2 packages then you'll already have mod_deflate available, although it is not enabled by default.

To use it you must enable it, and then set up the configuration directives.

To enable an arbitrary module with Apache2 you run:

a2enmod "modulename"

(a2enmod stands for "Apache2 enable module").

So to enable the deflate module you should run, as root:

root@lappy:~# a2enmod deflate
Module deflate installed; run /etc/init.d/apache2 force-reload to enable.

To complete this job you should reload the server, as the output message informs you. However, we still need to set up the configuration options, so we won't reload it just yet.

To configure the module in a basic fashion you merely need to give it a list of the types of files you wish to serve compressed (if the client browser supports it).

You can do this by adding the following example line to your global server configuration section, or to your virtual host in /etc/apache2/sites-enabled/*:

AddOutputFilterByType DEFLATE text/html text/plain text/xml

This tells the module to compress three MIME types: HTML files, plain text files, and XML files.

If you add this to the "global" section then it will apply to all virtual hosts on the server. Applying it to only one virtual host at a time is useful, as it allows you to quickly see whether the CPU overhead of the compression outweighs the benefits.

To allow you to keep track of the compression also add the following:

DeflateFilterNote Input instream
DeflateFilterNote Output outstream
DeflateFilterNote Ratio ratio

LogFormat '"%r" %{outstream}n/%{instream}n (%{ratio}n%%)' deflate
CustomLog /var/log/apache2/deflate_log deflate

This will log the deflation ratios to the file /var/log/apache2/deflate_log. A sample log looks something like this:

"GET / HTTP/1.0" 4407/19057 (23%)

Here we served only 4407 bytes to the client, instead of 19057 bytes. The compressed output is 23% of the original size, a saving of roughly 77%. Not bad for ten minutes' work!
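If you want to summarise such a log, a short script can pull the numbers out of each line. The Python sketch below assumes the exact LogFormat shown above; the function name parse_deflate_line is made up for this example, and any line that does not match the pattern (for instance a response that was not compressed) is simply skipped.

```python
import re

# Matches lines written by the "deflate" LogFormat shown above:
#   "REQUEST" OUTPUT_BYTES/INPUT_BYTES (RATIO%)
LOG_LINE = re.compile(
    r'^"(?P<request>[^"]*)" (?P<out>\d+)/(?P<inp>\d+) \((?P<ratio>\d+)%\)$'
)

def parse_deflate_line(line: str):
    """Return a dict of sizes for one log line, or None if it doesn't match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    sent = int(match.group("out"))
    original = int(match.group("inp"))
    return {
        "request": match.group("request"),
        "sent": sent,
        "original": original,
        # Percentage of bandwidth saved, e.g. 4407 of 19057 bytes sent.
        "saved_percent": 100.0 * (original - sent) / original,
    }

entry = parse_deflate_line('"GET / HTTP/1.0" 4407/19057 (23%)')
print(f'{entry["request"]}: sent {entry["sent"]} of {entry["original"]} bytes, '
      f'saved {entry["saved_percent"]:.1f}%')
```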

There is one caveat to note. Some older browsers don't support compressed documents unless they are HTML files. To work around this you should insert the following configuration directives:

BrowserMatch ^Mozilla/4 gzip-only-text/html
BrowserMatch ^Mozilla/4\.0[678] no-gzip
BrowserMatch \bMSIE !no-gzip !gzip-only-text/html

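These directives identify older versions of Netscape Navigator: the first line restricts compression to HTML files for Netscape 4.x, and the second disables compression entirely for versions 4.06 to 4.08. (The final line recognises Internet Explorer, which also sends 'Mozilla/4' as part of its user-agent string but can handle the compression properly, and undoes both restrictions for it.)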
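Because the three rules are applied in order, it can help to trace what each browser ends up with. The following Python sketch mimics that ordering with the same regular expressions; it is only a model of the behaviour, not Apache code, and the user-agent strings are illustrative examples rather than values taken from real logs.

```python
import re

# (pattern, variables to set, variables to unset), in directive order.
RULES = [
    (r"^Mozilla/4", {"gzip-only-text/html"}, set()),
    (r"^Mozilla/4\.0[678]", {"no-gzip"}, set()),
    (r"\bMSIE", set(), {"no-gzip", "gzip-only-text/html"}),
]

def compression_flags(user_agent: str) -> set:
    """Return the environment variables left set after all rules run."""
    flags = set()
    for pattern, to_set, to_unset in RULES:
        if re.search(pattern, user_agent):
            flags |= to_set
            flags -= to_unset
    return flags

# Illustrative user-agent strings (assumed, not from the article).
samples = [
    "Mozilla/4.06 [en] (Win98; I)",                        # old Netscape
    "Mozilla/4.75 [en] (X11; U; Linux)",                   # later Netscape 4.x
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",  # Internet Explorer
    "Mozilla/5.0 (X11; U; Linux i686) Gecko Firefox/1.0",  # modern browser
]
for agent in samples:
    print(sorted(compression_flags(agent)), "<-", agent)
```

An empty set means the browser is treated normally and receives compressed responses for every configured MIME type.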

If you wish to tweak your setup further there is comprehensive documentation on Apache's mod_deflate page.



Posted by sabin (62.99.xx.xx) on Fri 13 May 2005 at 05:08
Would you recommend it as well, even if I run a MySQL-based website which gets updated 2-4 times a week? It sounds pretty cool, so I wanted to try it using apache2 on sarge.

./sabin -s


Posted by Steve (82.41.xx.xx) on Fri 13 May 2005 at 11:33

It should still give you savings, but it does depend on the type of site you run - and whether you need it or not you'll only know if you start running out of bandwidth or CPU ;)



Posted by simonw (212.24.xx.xx) on Tue 22 Jul 2008 at 16:22
Testing here says enable it! We have a Perl FastCGI type app that is pretty CPU heavy, but the compression saving means much smaller responses, and a much snappier feeling application results from enabling compression at both ends. Of course that could be that we are often bandwidth constrained, but we see it as snappier even on a LAN connection to the server, I'm guessing ADSL users will see even bigger benefits.

The config given works fine for dynamic content, but it doesn't work in Debian Sarge (I know, ancient history). This seems to be a bug in the specific versions of the Apache modules shipped in Sarge; Etch is fine. Just one server left to upgrade.... just the really complicated one left!


Posted by mahdi (217.113.xx.xx) on Mon 6 Jun 2005 at 23:05
If you are serving static files (by static I mean really static—without any server side scripting) there is better option—Content Negotiation. This way you have best of both worlds—low CPU usage and bandwidth savings.

Just make sure you have MultiViews option enabled, and. . . you need to do little trickery :).

Let's suppose you have two files, example.html and example.html.gz. In an ideal world, when a browser asks for example.html and declares compression support, Apache2 would find the gzipped version and serve that one. But in this case Apache finds an exact match, so it doesn't use content negotiation at all and just returns the file it found (at least my little test suggests that).

Of course I'm writing this comment because there's a workaround: rename your files to example.html.en and example.html.en.gz (or another language extension that reflects the document's contents), but on other pages still refer to them by the name example.html. Now Apache cannot find such a file, so it starts content negotiation, finds both versions and chooses the smaller one. Everything works fine... not exactly. What if the user doesn't have en in the browser's preferred languages list? Apache will return "406 Not Acceptable" and ask the user which document (compressed or not) they prefer. To prevent this you need to force a language preference with the ForceLanguagePriority Fallback or ForceLanguagePriority Prefer directive (check which one suits you better). Now everything should work fine (ok, only when en is in Apache's LanguagePriority list :).

It can also be done other way—stay with default example.html and example.html.gz names and change references to them to not contain .html extension to force content negotiation. But for me mass renaming looks like easier way than parsing all the files that may contain links for change.

I should add that this is only roughly tested and I've never used it in real server. I had just thought about this method while reading the article and decided to do fast check if it works at all. So maybe something else should also be enabled or disabled to make it work properly in all cases. Anyway—looks like the article above should start with If you run website which is mostly serving dynamic files. . . (because for static pages you have better method ;-)

I almost forgot about one drawback: the price of this method is, of course, increased disk usage (you need to store both compressed and uncompressed files).


Posted by Anonymous (145.94.xx.xx) on Wed 5 Oct 2005 at 23:41
Excellent tips! It worked in less than five minutes. Perhaps you should give some more pointers for the less experienced where to put the lines exactly. Kudos to you for saving me some bandwidth ;-).


Posted by Anonymous (66.41.xx.xx) on Fri 28 Oct 2005 at 08:29

I get the log created but nothing is added to it.
I'm sure it does compress the targeted MIME content, but the log isn't growing.
What do you think I've missed?

Thanks in advance!


Posted by Anonymous (81.231.xx.xx) on Sat 20 May 2006 at 22:19
Hey. How can I check if it works?

I get nothing in the error log or the deflate log. The mod is enabled. Something tells me that it sends gz but doesn't log that.. but I'm not sure.

Maybe I can see it in Opera, Firefox or wget, if I get the contents in gzip?

Thank you for the very good articles. It's really keeping me from formatting Debian disks =)


Posted by Anonymous (80.203.xx.xx) on Thu 29 May 2008 at 18:41
You can test it here

Nils-Anders Nøttseter


Posted by Anonymous (114.48.xx.xx) on Tue 13 Jan 2009 at 07:42
Ditto here, the logging doesn't seem to work (results in an empty file) and I'm trying to track down why the AddOutputFilterByType statement didn't work either for text/xml files. The blanket statement "SetOutputFilter DEFLATE" works, though (but it causes everything to be compressed)...
