Using strace to debug application errors

Posted by Steve on Wed 15 Feb 2006 at 11:58

Recently I inheritted ownership of an SVN server which was misbehaving. Trying to determine why it wasn't working correctly involved a few hours of testing, careful thought, and caffeine. Eventually I got it working correctly using the often-overlooked tool strace.

strace is a common tool upon many GNU/Linux systems including Debian. Put simply strace is a "system call tracer" - which is where it gets its name from.

Using strace, as root, you can monitor the system calls made by any process upon your system. This can be enormously beneficial when you have a misbehaving program.

If you don't have it installed you can install it as easily with apt-get (or aptitude):

root@itchy:~# apt-get install strace
Reading package lists... Done
Building dependency tree... Done
The following NEW packages will be installed
0 upgraded, 1 newly installed, 0 to remove and 1 not upgraded.
Need to get 92.7kB of archives.
After unpacking 213kB of additional disk space will be used.
Get: 1 sid/main strace 4.5.12-1 [92.7kB]
Fetched 92.7kB in 1s (81.8kB/s)
Selecting previously deselected package strace.
(Reading database ... 73032 files and directories currently installed.)
Unpacking strace (from .../strace_4.5.12-1_i386.deb) ...
Setting up strace (4.5.12-1) ...

Now that you have it installed you can experiment with using it. The usage is:

strace [strace options] program [program arguments]

You run strace giving it the name of a binary to run, and any arguments that binary might take. As a simple example take a look at running /usr/bin/uptime via strace:

strace /usr/bin/uptime

Output will immediately fill your screen, and will probably look something like this:

execve("/usr/bin/uptime", ["uptime"], [/* 15 vars */]) = 0
uname({sys="Linux", node="itchy", ...}) = 0
brk(0)                                  = 0x804a000
access("/etc/", F_OK)      = -1 ENOENT (No such file or directory)
access("/etc/", R_OK)      = -1 ENOENT (No such file or directory)
open("/proc/uptime", O_RDONLY)          = 3
lseek(3, 0, SEEK_SET)                   = 0
read(3, "114232.96 111223.25\n", 1023)  = 20
access("/var/run/utmpx", F_OK)          = -1 ENOENT (No such file or directory)
open("/var/run/utmp", O_RDWR)           = 4
fcntl64(4, F_GETFD)                     = 0
fcntl64(4, F_SETFD, FD_CLOEXEC)         = 0
_llseek(4, 0, [0], SEEK_SET)            = 0
open("/proc/loadavg", O_RDONLY)         = 4
lseek(4, 0, SEEK_SET)                   = 0
read(4, "0.03 0.07 0.02 1/122 11930\n", 1023) = 27
fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7fe8000
write(1, " 19:37:34 up 1 day,  7:43,  3 us"..., 69) = 69
munmap(0xb7fe8000, 4096)                = 0
exit_group(0)                           = ?

I've snipped the loading of several library files, but even with that truncated output you can see a lot of actions taking place such as:

  • The opening of files.
  • The return codes of things which failed.
  • The writing of text to the console.

Each of the lines of output we see correspond to the invocation of a particular kernel system call. These system-calls are the primitives which are used by the C runtime, and are the kernels interface to userspace.

For example the "open" systemcall is used to open files, etc. We can see several details of how the uname binary operates - because we can see that it opens several local files for reading and writing. Some of these fail and others succeed:

  • It tries to access /etc/ - which fails
  • It tries to acesss /etc/ - which also fails.
  • It opens /proc/loadavg which succeeds and returns the filehandle "4".
  • It writes to filehandle 1 the string "19:37...." which is the output of the program.

Because the output can be quite unwieldy dumped to the console it is common to redirect this text to a file, via the -o option. For example to record the system calls used by the ls command to the file /tmp/ls-trace we can execute:

strace -o /tmp/ls-trace /bin/ls

Another common option is the "-p", or PID, option. This allows you to connect to a running program and see its output. This is useful in the case of long-running daemons which you cannot restart easily - or which only need to be monitored very rarely.

For more details of the available options please see the manpage by running "man strace".

Returning back to the subversion problem we'll see how useful strace was to me. The basic problem was simple enough:

  • An SVN server was setup in the dim and distant past.
  • Clients running Microsoft Windows connected to this repository via the tortoise SVN client.
  • Randomly SVN clients would "hang" and once they did so nobody could access the SVN server again - not just the hung client all clients in the office would be unable to use SVN.
  • Rebooting the server "fixed" it, but not permanently.

Initially I tried looking for logfiles, but there weren't any.

Once I excluded that I tried to see how things were setup. The basic installation seemed simple enough. Clients had SSH access and the SVN client appeared to essentially run:

ssh username@svnRepository svnserve

This svnserve process was responsible for actually conducting the operations, and for some reason was hanging. Since there were no useful debugging options that I could see for altering svnserve I resorted to a hack.

Rather than asking the clients to adjust their setup to add logging, or verbosity (which might not have shown the problem anyway) I moved the svnserve binary to a safe name:

# mv /usr/bin/svnserve /usr/bin/svnserve.real

Once that was done I wrote small shell script which would run the real binary under strace so I could try to see why it was hanging, or stalling:

# temporary wrapper for svnserve, log output to /tmp
strace -o/tmp/svnserve.$$ /usr/bin/svnserve.real $*

This script, when made executable, meant that clients could still connect and run the svnserve command they expected to use - but that now I'd have a strace logfile generated in /tmp for each invocation.

Note that there is no quoting of the arguments to the child process - something that wasn't an issue here.

After a few clients had connected I could immediately see the problem as each logfile ended with the lines:

open("/dev/random", O_RDONLY)          = 5

Here we see that the client attempted to open the file /dev/random to read some random data - this open succeeded and the client was given the filehandle number 5. Unfortunately the reading of this filehandle failed. (We can see that it failed since the read call didn't return.)

/dev/random stalls if there isn't enough entropy available to generate a random number stream. This explains why rebooting the host fixed the problem temporarily - since rebooting would cause this pool of entropy to fill as the system started back up.

A real solution to this problem would be to investigate why there wasn't enough entropy available, but as a simple "hotfix" to allow things to work it was sufficient to link /dev/random to /dev/urandom - which doesnt stall when entropy isn't available.

In many cases this might give you security weaknesses, especially if you need a good random number stream, but for this particular server I don't believe there to be any significant problems.

Tracking down this problem and creating a fix would have been almost impossible without the ability to use strace or a debugger - and using strace is much simpler.

The idea of using a shell script as a wrapper around commands which you cannot modify is also a neat thing to remember. I first saw it used by a colleague a couple of years ago and thought it was an inspired solution at the time, and still do now..

(Just remember to put things back the way they were when you're done!)



Posted by grifferz (82.44.xx.xx) on Wed 15 Feb 2006 at 13:31
Let me guess, are you running this in a Xen domain?

I have a related issue of my own around entropy. One of my Xen domains based on etch occasionally has its sshd die. Last time I ran it foreground in debug mode to try and figure out why and I see a fatal error regarding not enough entropy in /dev/random.

Digging about a bit I see that /dev/random gets entropy from mouse, keyboard, display, audio and the disk subsystem. I think that since a Xen domain has no disk subsystem it may not get anything from there. It also has none of the other devices, so I can see why it might run low.

None of my other domains do this, but they are all sarge. Sure enough, etch has a much more up to date openssh. Something may have changed in openssh to make it more paranoid about entropy.

I would be interested in any possible solutions for this that do not involve installing a non-Debian package or recompiling a debian package.

egd doesn't appear to be packaged for debian and even if it was I think openssh would need to be recompiled to use it.


[ Parent | Reply to this comment ]

Posted by Steve (82.41.xx.xx) on Wed 15 Feb 2006 at 13:37
[ View Steve's Scratchpad | View Weblogs ]

Actually no this was on a Redhat Enterprise Server 3 machine.

Doing some googling afterwards I found a related RedHat bug report filed against a slightly earlier kernel concerning the failure to update the entropy pool once it had been depleted.

When I get back to work tomorrow I'll post the bug number for future reference.

I find the etch problem you describe strange since as far as the running kernel is concerned it does have a local disk. (Whatever is setup in /etc/xen/foo.cfg will be the device used.)

Since I mostly run Sarge under Xen I've not been bothered by this myself.Steve

[ Parent | Reply to this comment ]

Posted by grifferz (82.44.xx.xx) on Wed 15 Feb 2006 at 13:41
I suspect the entropy is updated based on receipt of hardware interrupts and I don't think a domU kernel is going to get those when it doesn't have any hardware. Just because it has an etho and a /dev/sda doesn't mean it has an ethernet card or scsi. Disable all ethernet and scsi drivers and it still works.

But I could be wrong. :)


[ Parent | Reply to this comment ]

Posted by Steve (82.41.xx.xx) on Wed 15 Feb 2006 at 13:43
[ View Steve's Scratchpad | View Weblogs ]

Ahh that's a good point!

You may well be right.


[ Parent | Reply to this comment ]

Posted by Anonymous (84.161.xx.xx) on Thu 16 Feb 2006 at 01:02
ln /dev/zero /dev/random

[ Parent | Reply to this comment ]

Posted by spiney (85.124.xx.xx) on Thu 16 Feb 2006 at 07:10
Kids, don't do this at home. ;)
Debian GNU/Linux on an IBM Thinkpad T43p

[ Parent | Reply to this comment ]

Posted by k8to (64.142.xx.xx) on Wed 15 Feb 2006 at 13:32
If anything I overuse strace. It's often one of the first things I reach for when a program is behaving mysteriously.

Applications which don't seem to be obeying the configfile? I strace to see if they're even looking for it as documented--they usually aren't.

Programs which spit out useless 'permission denied' without specifying what permission failed get an strace pretty much immediately.

I love seeing what that misbehaving program is really doing.

[ Parent | Reply to this comment ]

Posted by Steve (82.41.xx.xx) on Wed 15 Feb 2006 at 13:40
[ View Steve's Scratchpad | View Weblogs ]

I've done found myself doing that too!

I think the last time I used strace in a useful manner was when I'd tried to install a bunch of perl modules from one host to another - and I'd used rsync.

Once the transfer was over it turned out that none of the modules using a shared library would work - and it was my fault for not telling rsync to preserve permissions. Somehow I'd failed to spot this for around 40 minutes, despite looking at the .so files a lot.


[ Parent | Reply to this comment ]

Posted by mdione (201.231.xx.xx) on Fri 17 Feb 2006 at 12:13
You can also add the -ff option when using the -o one. This will make strace to generate on file per each process spawned by the command you analize. It's simpler than following in your head who's doing what by their PID.

[ Parent | Reply to this comment ]

Posted by hq4ever (87.69.xx.xx) on Fri 17 Feb 2006 at 15:59
By default strace will output to stderr (FD 2).
For a quick program analyzes like to pipe the output to grep and filter by open

The way to do this is by redirecting the stderr of strace back to stdout:

$ strace /usr/bin/uptime 2>&1 | grep open
open("/etc/", O_RDONLY) = 3
open("/lib/", O_RDONLY) = 3
open("/lib/tls/", O_RDONLY) = 3
open("/etc/mtab", O_RDONLY) = 3
open("/proc/stat", O_RDONLY) = 3
open("/etc/localtime", O_RDONLY) = 3
open("/proc/uptime", O_RDONLY) = 3
open("/var/run/utmp", O_RDWR) = -1 EACCES (Permission denied)
open("/var/run/utmp", O_RDONLY) = 4
open("/proc/loadavg", O_RDONLY) = 4

-- Maxim.

[ Parent | Reply to this comment ]

Posted by Anonymous (24.79.xx.xx) on Sat 18 Feb 2006 at 15:38

strace -eopen /usr/bin/uptime


[ Parent | Reply to this comment ]

Posted by stevenothing (212.36.xx.xx) on Mon 20 Feb 2006 at 13:25
strace has to be one of my favourite commands. I can't recommend the book 'Advanced Programming in the Unix Environment', by W. R. Stevens, highly enough for very in depth documentation on the behaviour and idiosyncrasies of the various Unix system calls. My usual usage would be something along the lines of:

strace -Ff -o myapp.trace $(pgrep myapp | xargs -n1 echo -p ) &

Probably worth mentioning that stracing setuid apps generally doesn't work properly. And also strace's library tracing cousin, ltrace.

[ Parent | Reply to this comment ]

Posted by Anonymous (221.222.xx.xx) on Sun 5 Mar 2006 at 13:52
I happened to have a similar issue and thanks to your 'inspired' solution I think at least I've found a possible way to solve it.

[ Parent | Reply to this comment ]

Posted by Anonymous (216.113.xx.xx) on Fri 16 Jun 2006 at 22:58
I read through your article, which was very good...however, I can't seem to attach strace to a perl process and have it trace it's children's children.

Here's the pstree output of my process that I'd like to watch (pid: 12431)

ratlperl,12431 ./
----`-ratlperl,12568 ./
---------`-cleartool,13258 deliver -cancel

However, strace -Tfv -p 12431 just hangs on "read(3, "
If I run strace -Tfv -p 12568, I see a few read()'s every other second or so
If I run strace -Tfv -p 13258, I see much more, with "[pid ] ..." pre-pended to children processes....

From 'man strace' docuemntation, it seems that -Ff -f and/or -F will follow children, however this doesn't seem to be the case. Any ideas?

I wonder if this is a glitch with using the backtick operator in perl vs. system?

Anthony Ettinger

[ Parent | Reply to this comment ]

Posted by Steve (62.30.xx.xx) on Fri 16 Jun 2006 at 23:06
[ View Steve's Scratchpad | View Weblogs ]

The backtick operator will ultimately end up calling the system syscall in the kernel, so it isn't significantly different from the system() function in perl.

Are you perhaps just seeing a loop withing your perl script which is not doing very much at the time you attach to the process?

I'd suggest instead of attaching to the process by PID running the script initially under strace and seeing what comes up..


[ Parent | Reply to this comment ]

Posted by Anonymous (216.113.xx.xx) on Fri 16 Jun 2006 at 23:33
...actually, i just did a small test, and noticed that the parent will pause if it's waiting for output.

for example, executing in series is not problem, but if the next command depends on the output of it's child, then it will hang with "read(3, " if I trace the parent.

[ Parent | Reply to this comment ]

Sign In







Current Poll

What do you use for configuration management?

( 75 votes ~ 0 comments )