Automating ssh and scp across multiple hosts

Posted by Steve on Mon 2 Feb 2009 at 11:14

If you're like me you'll run Debian GNU/Linux upon a number of hosts and at times you'd like to run a command or two upon all of those hosts. There are several ways you can accomplish this, ranging from manually connecting to each host in turn, to the more complex solutions such as CFEngine or Puppet. Midway between the two you can use pssh to run commands upon multiple hosts.

The pssh package is one of a number of tools which allow you to open SSH connections in parallel across a number of machines.

Searching the Debian package repository shows several similar solutions:

me@home:~$ apt-cache search cluster ssh
clusterssh - administer multiple ssh or rsh shells simultaneously
dish - the diligence/distributed shell for parallel sysadmin
dsh - dancer's shell, or distributed shell
kanif - cluster management and administration swiss army knife
libtaktuk2 - C bindings for taktuk
libtaktuk2-dev - C bindings for taktuk (development files)
pssh - Parallel versions of SSH-based tools
taktuk - efficient, large scale, parallel remote execution of commands

Of these tools I've only used pssh, so I'm unable to compare and contrast the alternatives. Still, if you've got SSH access secured via private keys and wish to run multiple commands remotely, or copy files, then it may be ideal for you.

Once installed, the pssh package provides a number of new commands:

parallel-slurp

This command allows you to copy files from multiple remote hosts to the local system. We'll demonstrate the usage shortly.

parallel-ssh

This command allows you to run commands upon a number of systems in parallel. We'll also demonstrate this command shortly.

parallel-nuke

This command lets you kill processes on multiple remote systems.

parallel-scp

This is the opposite of parallel-slurp and allows you to copy a file, or files, to multiple remote systems.

General Usage

Each of the new commands installed by the pssh package will expect to read a list of hostnames from a text file. This makes automated usage a little bit more straightforward, and simplifies the command-line parsing.
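As a minimal sketch, the hosts file can be written by hand or straight from the shell. The two hostnames below are the example machines used throughout this article. One hostname per line is the simplest layout; anything beyond that (per-host users or ports, for instance) depends on your pssh version, so check its documentation first.

```shell
# Write one hostname per line; these are the example hosts from this article.
printf '%s\n' gold.my.flat silver.my.flat > hosts.txt

# Sanity check: show how many hosts the parallel-* commands will be given.
wc -l < hosts.txt
```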

Running Commands On Multiple Hosts

The most basic usage is to simply run a command upon each host, and not report upon the output. For example given the file hosts.txt containing a number of hostnames we can run:

me@home:~$ parallel-ssh  -h hosts.txt uptime
[1] 18:29:35 [SUCCESS] gold.my.flat 22
[2] 18:29:35 [SUCCESS] silver.my.flat 22

This command didn't show us the output of the "uptime" command as we'd expect. To do that you need to add the "-i" (inline) flag:

me@home:~$ parallel-ssh -i -h hosts.txt uptime
[1] 18:30:29 [SUCCESS] gold.my.flat 22
 18:30:29 up 36 days,  5:21,  5 users,  load average: 0.39, 0.24, 0.23
[2] 18:30:29 [SUCCESS] silver.my.flat 22
 18:30:29 up 36 days,  4:45,  1 user,  load average: 0.04, 0.02, 0.00

For most simple commands this will be fine, but there are times when you'll want to collect the output instead. To do that, specify an output directory with "-o path" - the output from each system will then be written to a file named after the host:

me@home:~$ parallel-ssh -o uptime.out -h hosts.txt uptime
[1] 18:32:30 [SUCCESS] gold.my.flat 22
[2] 18:32:30 [SUCCESS] silver.my.flat 22
me@home:~$ ls uptime.out/
gold.my.flat  silver.my.flat
me@home:~$ cat uptime.out/silver.my.flat
 18:32:30 up 36 days,  4:47,  1 user,  load average: 0.00, 0.01, 0.00
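
Since "-o" leaves one file per host, the results are easy to post-process with ordinary shell tools. The following is a small sketch rather than anything pssh provides: the function name summarise_output is made up for illustration, and it assumes the layout shown above (one file per host, named after the host).

```shell
# summarise_output DIR: print "hostname: first line of output" for each
# per-host file in DIR. Hypothetical helper, not something pssh provides.
summarise_output() {
    dir=$1
    for file in "$dir"/*; do
        [ -f "$file" ] || continue          # skip if the directory is empty
        host=${file##*/}                    # strip the directory part
        printf '%s: %s\n' "$host" "$(head -n 1 "$file")"
    done
}
```

Calling "summarise_output uptime.out" after the run above would print one line per host, which is handy when the list of hosts grows beyond a couple of machines.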

There are more options you can specify, such as the username to connect as, and the port number to use. To see these options just run parallel-ssh with no arguments.

Copying Files From Multiple Hosts

Copying files works in much the same way as the execution of commands we've just demonstrated. The only caveat is that you really do need to specify a local directory - so that the file copied from the remote host isn't repeatedly overwritten.

So, to copy the file /etc/motd from each host in our hosts.txt file to the local system we'd run this:

me@home:~$ parallel-slurp -h hosts.txt  -L local.dir /etc/motd  motd
[1] 18:39:39 [SUCCESS] gold.my.flat 22
[2] 18:39:39 [SUCCESS] silver.my.flat 22

This will give us the file /etc/motd copied to the local system with the name motd inside the directory local.dir:

me@home:~$ tree local.dir/
local.dir/
|-- gold.my.flat
|   `-- motd
`-- silver.my.flat
    `-- motd

2 directories, 2 files

As you can see there has been one level of subdirectory created for each hostname we copied from.
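
The per-host layout makes it simple to compare the copies you've collected. As a rough sketch (the function name compare_slurped is invented here, and it assumes the local.dir/HOST/FILE layout shown above), this reports any host whose copy differs from the first one found:

```shell
# compare_slurped DIR NAME: compare DIR/<host>/NAME across hosts, using the
# first copy found as the reference. Hypothetical helper for illustration.
compare_slurped() {
    dir=$1
    name=$2
    reference=
    for copy in "$dir"/*/"$name"; do
        [ -f "$copy" ] || continue
        if [ -z "$reference" ]; then
            reference=$copy                 # first host becomes the baseline
        elif ! cmp -s "$reference" "$copy"; then
            host=${copy%/*}                 # drop the trailing /NAME
            echo "${host##*/} differs from ${reference}"
        fi
    done
}
```

Running "compare_slurped local.dir motd" prints nothing when every host has an identical file.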

There are also several more options that might be useful to explore with this tool, including the -r (recursive) option which should be familiar to you from the scp command itself.

Note that parallel-slurp only copies from (multiple) systems. To copy files to (multiple) systems you'll want to use the parallel-scp tool.
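
For completeness, here is a hedged sketch of the reverse direction. It assumes parallel-scp accepts the same "-h" hosts file option, followed by a local source and a remote destination path. The wrapper function push_to_hosts and any file names are made up for this example, so check parallel-scp's own usage output before relying on it.

```shell
# push_to_hosts HOSTS_FILE SRC DEST: copy SRC to DEST on every listed host.
# Hypothetical wrapper; it only adds sanity checks before calling parallel-scp.
push_to_hosts() {
    hosts_file=$1
    src=$2
    dest=$3
    [ -r "$hosts_file" ] || { echo "cannot read $hosts_file" >&2; return 1; }
    [ -e "$src" ] || { echo "no such file: $src" >&2; return 1; }
    parallel-scp -h "$hosts_file" "$src" "$dest"
}
```

For example, "push_to_hosts hosts.txt motd /etc/motd" would attempt to install a local motd file on each host, given suitable permissions on the remote side.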

In conclusion, this tool is worth exploring if you manage multiple systems and already have SSH key-based authentication set up.


This article can be found online at the Debian Administration website at the following bookmarkable URL (along with associated comments):

This article is copyright 2009 Steve - please ask for permission to republish or translate.