Updating multiple machines on low bandwidth

Posted by Steve on Fri 16 Jun 2006 at 11:40

It is common to want to update several machines running Debian GNU/Linux while minimizing the bandwidth used for downloading packages and updates. There are several solutions to this problem; here we'll look at one of them: apt-proxy.

In my home setup I have three machines, all running Debian's unstable distribution, sid. It is wasteful to have each of these machines download the latest packages from the network, especially considering that each host has an almost identical list of installed packages.

One of the simplest solutions is to set up a caching proxy server through which each host fetches its packages. A package is downloaded from the network only the first time it is requested; when the other two machines request the same package it is served from the cache, using no external bandwidth at all!

Several such proxies are included in the Debian distribution; the one I like best is the apt-proxy package.

Installing the package upon a single host is very straightforward:

root@itchy:~# apt-get install apt-proxy

Once installed you can configure the software by editing the file /etc/apt-proxy/apt-proxy-v2.conf. In most environments you'll be fine with the defaults.

The main things you might consider changing are the port number the server listens upon, 9999 by default, and the location upon the host where the .deb files will be cached. These can be changed by the following entries in the configuration file:

;; Server port to listen on
port = 9999

;; Cache directory for apt-proxy
cache_dir = /var/cache/apt-proxy

(The cached files are stored in the same "pool structure" as they would be on Debian's mirrors, so saving them to /var/cache/apt/archives, which might seem sensible, won't do what you might expect.)
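
To check which values are currently set, the two entries can be pulled out of the configuration file with sed. This is only a minimal sketch: show_setting is a hypothetical helper name, not something apt-proxy provides, and it assumes the simple "key = value" layout shown above.

```shell
#!/bin/sh
# Sketch: print the value of one "key = value" entry from an
# apt-proxy style configuration file.
# show_setting is a hypothetical helper, not part of apt-proxy.
# Usage: show_setting KEY FILE
show_setting() {
    key=$1
    conf=$2
    # Only lines that start with the key are matched, so the ";;"
    # comment lines are skipped; the "key = " part is stripped and
    # the remainder (the value) is printed.
    sed -n "s/^$key[[:space:]]*=[[:space:]]*//p" "$conf"
}
```

For example, "show_setting port /etc/apt-proxy/apt-proxy-v2.conf" should print the port number in use, and "show_setting cache_dir /etc/apt-proxy/apt-proxy-v2.conf" the cache location.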

If you do choose to make some changes you'll need to restart the server for them to take effect:

root@itchy:~# /etc/init.d/apt-proxy restart
Stopping apt-proxy [wait 1].
Starting apt-proxy.

Now that you've set up the proxy, the next thing you must do is update your clients to actually use it. On each machine on your LAN you need to update the sources.list file, which apt-get uses to determine its download sources.

In my case the server I installed apt-proxy upon was called itchy (and each machine can find the IP address for that host), so I'll change each machine's /etc/apt/sources.list file from this:

#
#  /etc/apt/sources.list
#

#
# Unstable
#
deb     http://ftp.uk.debian.org/debian sid main contrib non-free
deb-src http://ftp.uk.debian.org/debian sid main contrib non-free

To this:

#
#  /etc/apt/sources.list
#

#
# Unstable, via apt-proxy running on itchy.
#
deb     http://itchy.my.flat:9999/debian sid main contrib non-free
deb-src http://itchy.my.flat:9999/debian sid main contrib non-free
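
With more than a couple of clients, making this edit by hand gets tedious. The change could be scripted roughly as follows. This is a sketch under assumptions: use_apt_proxy is a hypothetical helper name, and the mirror and proxy URLs are simply the ones from the listings above.

```shell
#!/bin/sh
# Sketch: point a sources.list file at the apt-proxy host.
# use_apt_proxy is a hypothetical helper, not part of apt or apt-proxy.
# Usage: use_apt_proxy FILE OLD_MIRROR_URL PROXY_URL
use_apt_proxy() {
    file=$1
    old=$2
    new=$3
    # Keep the original as FILE.orig, then rewrite the mirror URL on
    # "deb" and "deb-src" lines only; comment lines are left alone.
    cp "$file" "$file.orig" &&
    sed "/^deb\(-src\)\{0,1\}[[:space:]]/ s|$old|$new|" "$file.orig" > "$file"
}

# Example, run as root on each client (URLs taken from the article):
#   use_apt_proxy /etc/apt/sources.list \
#       http://ftp.uk.debian.org/debian http://itchy.my.flat:9999/debian
```

The untouched copy in sources.list.orig makes it easy to switch back if the proxy host is ever unavailable.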

Once this is done, running "apt-get update" on an updated machine looks like this:

root@desktop:~# apt-get update
Get: 1 http://itchy sid Release.gpg [189B]
Hit http://itchy sid Release
Ign http://itchy sid/main Packages/DiffIndex
Ign http://itchy sid/contrib Packages/DiffIndex
Ign http://itchy sid/non-free Packages/DiffIndex
Ign http://itchy sid/main Sources/DiffIndex
Ign http://itchy sid/contrib Sources/DiffIndex
Ign http://itchy sid/non-free Sources/DiffIndex
Hit http://itchy sid/main Packages
Hit http://itchy sid/contrib Packages
Hit http://itchy sid/non-free Packages
Hit http://itchy sid/main Sources
Hit http://itchy sid/contrib Sources
Hit http://itchy sid/non-free Sources
Fetched 189B in 3s (56B/s)
Reading package lists... Done

Here we see that we connected to itchy instead of ftp.uk.debian.org. As the clients update and install packages through the proxy, the cached files appear on itchy.

Remember that the .deb files are cached to /var/cache/apt-proxy by default. Looking in that directory we can see:

root@itchy:~# ls /var/cache/apt-proxy/debian/pool/main/
a  d  g  j  liba  libe  libh  libm  libp  libt  libw  m  p  s  v  y
b  e  h  k  libc  libf  libi  libn  libr  libu  libx  n  q  t  w  z
c  f  i  l  libd  libg  libl  libo  libs  libv  liby  o  r  u  x

For example in the a/ directory we have:

root@itchy:~# ls /var/cache/apt-proxy/debian/pool/main/a/
aalib        alsa-lib  alsa-tools  apache2    apmd  apt-proxy  arts
alsa-driver  alsa-oss  alsa-utils  apachetop  apt   aptitude   autoconf
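
The layout follows the usual Debian pool convention: each directory is named after a source package, filed under its first letter, or under its first four characters when the name starts with "lib". As an illustration only (pool_dir is a hypothetical helper, and the component defaults to "main" as an assumption), the expected location can be worked out like this:

```shell
#!/bin/sh
# Sketch: work out where a source package lives in a Debian-style pool.
# pool_dir is a hypothetical helper used for illustration.
# Usage: pool_dir SOURCE_PACKAGE [COMPONENT]
pool_dir() {
    src=$1
    component=${2:-main}
    case $src in
        # Names starting with "lib" are filed under their first four
        # characters (liba, libb, ...).
        lib?*) prefix=$(printf '%s' "$src" | cut -c1-4) ;;
        # Everything else is filed under its first character.
        *)     prefix=$(printf '%s' "$src" | cut -c1) ;;
    esac
    echo "pool/$component/$prefix/$src"
}

pool_dir apt-proxy   # prints pool/main/a/apt-proxy
pool_dir libxml2     # prints pool/main/libx/libxml2
```

Prefixing the result with /var/cache/apt-proxy/debian/ gives the path to look in on the proxy host.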

We can see the total space currently in use with the du command, with appropriate arguments:

root@itchy:~# du  --total --human-readable /var/cache/apt-proxy/ | grep total
762M    total

That represents a bandwidth saving of up to roughly 1.5GB: without the cache, most of these packages would have been downloaded three times rather than once. (The saving is not exact, since the package lists upon the hosts do differ somewhat.)
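
As a rough check on that kind of estimate, the arithmetic can be done in the shell. This sketch assumes the worst case in which every host would otherwise have downloaded everything in the cache, so the result is an upper bound; estimate_saving_mb is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: upper bound on the download traffic avoided by a shared cache.
# Assumes every host would otherwise fetch every cached package itself.
# Usage: estimate_saving_mb CACHE_SIZE_MB NUMBER_OF_HOSTS
estimate_saving_mb() {
    cache_mb=$1   # size of the cache in MB
    hosts=$2      # number of machines sharing the proxy
    # The first download still comes from the network; every further
    # host is served from the cache.
    echo $(( cache_mb * (hosts - 1) ))
}

estimate_saving_mb 762 3   # prints 1524
```

With a 762MB cache and three hosts this gives 1524MB, or about 1.5GB, as the most that could have been saved.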

The apt-proxy installation can also be used to cache the downloaded packages used by debootstrap and pbuilder if you use either of those tools. See /usr/share/doc/apt-proxy/README.gz for details.


This article was originally published on the Debian Administration website.

This article is copyright 2006 Steve - please ask for permission to republish or translate.