Transparent proxies via Squid.

Posted by Steve on Tue 4 Jan 2005 at 16:12

If you've set up a Linux machine as a gateway, standing between you and the internet, then you have a lot of options for tweaking it. One of the most common things to set up is transparent proxying via Squid.

If you've followed the previous guide on setting up a Linux gateway you will have one machine with two network interfaces: one for internal use (eth0) and one which is publicly connected to the internet (eth1).

This gateway machine has a collection of firewall rules which:

If we wish to set up a transparent proxy server to cache web pages, which would speed up browsing for the machines behind the gateway, we need to do two things: install and configure Squid, and then redirect outgoing web requests to it at the firewall.

The first is simple. As root run:

apt-get install squid

This will install the Squid caching proxy server. This is configured by the file /etc/squid/squid.conf and we will need to make several changes to it.

First of all we need to tell it that we only wish it to listen on the internal interface. Remember that this gateway machine has two networking interfaces, one for the internal LAN and one for the internet.

Pick the one which is internal and add it to the configuration file as follows:

http_port 127.0.0.1:8080
http_port 192.168.1.1:8080

(In my case the internal address is 192.168.1.1. Allowing the server to listen on the "loopback" address of 127.0.0.1 is a good idea too, and will be required later.)
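Since the point of these lines is to keep the proxy off the external interface, it can be worth double-checking which addresses the configuration really asks Squid to listen on. The following is only a small sketch: list_http_ports is a made-up helper name, and it assumes one http_port directive per line, as in the example above.

```shell
# Print the address:port of every active (uncommented) http_port line.
# list_http_ports is a hypothetical helper; the default path is the
# configuration file named in this article, pass another one as $1.
list_http_ports() {
    conf="${1:-/etc/squid/squid.conf}"
    # Commented-out lines have "#" or "#http_port" as their first field,
    # so they do not match.
    awk '$1 == "http_port" { print $2 }' "$conf"
}
```

Running list_http_ports with no arguments on the gateway should print only the loopback and internal addresses. If a bare port number shows up instead, Squid would be listening on every interface, including the external one.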

As well as that we'll need to tell the server what its hostname is, and which email address is in charge of it for error display, etc:

visible_hostname gateway.my.flat
cache_mgr        proxy@foo.com

We also need to add the support for the transparent proxying we will be using:

httpd_accel_host virtual
httpd_accel_port 80
httpd_accel_with_proxy on
httpd_accel_uses_host_header on
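These httpd_accel_* directives belong to the Squid 2.5 series. From Squid 2.6 they were replaced by a "transparent" option on the http_port line, and from Squid 3.1 that option is called "intercept". The sketch below prints the lines to use for a given version. intercept_directives is a hypothetical helper, and newer releases may well need further changes beyond this single line, so treat the output as a starting point rather than a finished configuration.

```shell
# Print the squid.conf lines that switch on interception for a given
# Squid version and listening address.
# intercept_directives is a hypothetical helper, not part of Squid.
intercept_directives() {
    version="$1"   # e.g. 2.5, 2.6, 3.1
    listen="$2"    # e.g. 192.168.1.1:8080

    case "$version" in
        2.5*)
            # Squid 2.5: the directives used in this article.
            printf '%s\n' \
                "http_port $listen" \
                "httpd_accel_host virtual" \
                "httpd_accel_port 80" \
                "httpd_accel_with_proxy on" \
                "httpd_accel_uses_host_header on"
            ;;
        2.6*|2.7*|3.0*)
            # Squid 2.6 to 3.0: an option on http_port replaces httpd_accel_*.
            printf '%s\n' "http_port $listen transparent"
            ;;
        3.[1-9]*|[4-9]*)
            # Squid 3.1 and later: the option is named "intercept".
            printf '%s\n' "http_port $listen intercept"
            ;;
        *)
            echo "unhandled Squid version: $version" >&2
            return 1
            ;;
    esac
}
```

For example, intercept_directives 3.1 192.168.1.1:8080 prints a single http_port line ending in "intercept". The installed version can be found with squid -v.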

The only remaining thing we need to do is to tell Squid which networks are allowed to connect to our proxy server; without this it will refuse all incoming requests.

For a network which is 192.168.1.x internally the following will be fine:

# INSERT YOUR OWN RULE(S) HERE TO ALLOW ACCESS FROM YOUR CLIENTS

# Example rule allowing access from your local networks. Adapt
# to list your (internal) IP networks from where browsing should
# be allowed
acl our_networks src 192.168.1.0/24
http_access allow our_networks
http_access allow localhost

# And finally deny all other access to this proxy
http_access deny all

If you're using a different network internally then you will need to adjust the addresses appropriately.
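If you're unsure whether a particular client will fall inside the network you've listed, the check the "src" ACL performs can be imitated in the shell. This is only an illustration of how a /24 style mask is applied, not Squid's own code. ip_to_int and in_network are made-up names, and the sketch assumes plain IPv4 addresses without leading zeros.

```shell
# Convert a dotted-quad IPv4 address to a single integer.
ip_to_int() {
    old_ifs="$IFS"
    IFS=.
    set -- $1            # split the address into its four octets
    IFS="$old_ifs"
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed (exit status 0) if address $1 lies inside network $2,
# where $2 is in CIDR notation such as 192.168.1.0/24.
in_network() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits="${2#*/}"
    # Build a mask with the top $bits bits set.
    mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
    [ $(( $addr & $mask )) -eq $(( $net & $mask )) ]
}

if in_network 192.168.1.50 192.168.1.0/24; then
    echo "192.168.1.50 would match our_networks"
fi
```

A client at 192.168.2.50 would fail the same test, and so would be caught by the final "http_access deny all" line.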

That's all the squid setup complete, so now we restart it:

/etc/init.d/squid restart

Now we have a caching proxy server which is listening on 192.168.1.1:8080. If you were to enter that as the proxy in your browser you should see it working, but what we are going to do next is make it transparent.

Nobody behind the gateway should need to do anything; instead it should just magically work (tm ;)

The way we do that is to add a rule to the firewall which will redirect outgoing requests to the web (port 80) so that they instead go via the proxy server we've set up on the gateway machine on port 8080.

Add the following towards the end of your firewall rules:

# Transparent proxying
iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 80 -j REDIRECT --to-port 8080

This says that anything coming from the internal interface (eth0) which has a destination port of 80 (web) should be redirected to the new squid installation we've made on port 8080.
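If your internal interface or proxy port differ from mine, the rule needs adjusting to match. As a rough sketch, the function below prints the rule for a given interface and port without running it, so it can be read over before it goes into the firewall script. redirect_rule is a hypothetical helper name.

```shell
# Print (but do not run) the REDIRECT rule for the given internal
# interface and proxy port. redirect_rule is a hypothetical helper.
redirect_rule() {
    iface="$1"        # internal interface, e.g. eth0
    proxy_port="$2"   # port Squid listens on, e.g. 8080
    echo "iptables -t nat -A PREROUTING -i $iface -p tcp --dport 80 -j REDIRECT --to-port $proxy_port"
}

# The values used throughout this article:
redirect_rule eth0 8080
```

Once the real rule is loaded, running iptables -t nat -L PREROUTING -n -v as root lists it along with packet counters, which should climb as machines on the LAN browse the web.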

Reload your firewall and you should have it in place.

To test it you simply need to watch the squid logfile as you browse the web from another machine.

On the gateway machine you can run:

tail -f /var/log/squid/access.log

If you see something like the following when you surf the web from a machine on your LAN, you know it worked:

1104854410.086    159 192.168.1.50 TCP_MISS/302 469 GET http://www.google.com/ - DIRECT/216.239.59.104 text/html
1104854410.217    128 192.168.1.50 TCP_MISS/200 1459 GET http://www.google.co.uk/ - DIRECT/216.239.59.99 text/html
1104854410.397    180 192.168.1.50 TCP_MISS/200 9022 GET http://www.google.co.uk/intl/en_uk/images/logo.gif - DIRECT/216.239.59.99 image/gif
1104854415.196    200 192.168.1.50 TCP_MISS/200 1459 GET http://www.google.co.uk/ - DIRECT/216.239.59.99 text/html
1104854415.271     74 192.168.1.50 TCP_REFRESH_HIT/304 235 GET http://www.google.co.uk/intl/en_uk/images/logo.gif - DIRECT/216.239.59.99 text/html
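The fourth field of each line shows what Squid did with the request: TCP_MISS means the object had to be fetched from the remote server, while codes containing HIT mean the cache was used in some way. To get a rough feel for how the cache is doing, the codes can be tallied. This is a minimal sketch assuming the log format shown above. summarise_results is a made-up name.

```shell
# Count how often each Squid result code (TCP_MISS, TCP_REFRESH_HIT, ...)
# appears. Field 4 of each log line is "RESULT_CODE/HTTP_STATUS".
# Reads the files given as arguments, or standard input if there are none.
summarise_results() {
    awk '{ split($4, part, "/"); count[part[1]]++ }
         END { for (code in count) print code, count[code] }' "$@" |
        LC_ALL=C sort
}
```

On the gateway this could be run as summarise_results /var/log/squid/access.log. For the five lines shown above it reports four TCP_MISS entries and one TCP_REFRESH_HIT.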

This article can be found online at the Debian Administration website at the following bookmarkable URL (along with associated comments):

This article is copyright 2005 Steve - please ask for permission to republish or translate.