A transient /var/log

Posted by mcortese on Thu 18 Nov 2010 at 20:01

So you're afraid of wearing out your SSD. Or you don't want to spin up a sleeping hard disk just to write a line of log. Or maybe you're just looking for some use for the disproportionate amount of RAM you've got. Whatever your case, what you need is a way to keep the log files in memory as long as the system is up, and write them back to regular storage just before shutdown.

Starting

The concept itself is pretty simple: mount a tmpfs on top of /var/log.

mount -t tmpfs -o nosuid,noexec,nodev,mode=0755,size=16M \
      transientlog /var/log
Feel free to tune the size and mode parameters according to your actual needs.
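
To pick a size, it may help to look at how much space /var/log takes up right now. What follows is only a minimal sketch: the helper name suggest_tmpfs_size is hypothetical (it is not part of the script described here), and both the doubling factor and the 16 MiB floor are arbitrary assumptions.

```shell
#!/bin/sh
# Hypothetical helper: suggest a tmpfs size (in MiB) for a log directory.
# The doubling factor and the 16 MiB floor are arbitrary assumptions.
suggest_tmpfs_size()
{
        dir=${1:-/var/log}
        # du -sk prints the usage in KiB, a tab, then the path
        used_kb=$(du -sk "$dir" 2>/dev/null | cut -f1)
        [ -n "$used_kb" ] || return 1
        # Round up to whole MiB, then double to leave room for growth
        size_mb=$(( (used_kb + 1023) / 1024 * 2 ))
        [ "$size_mb" -ge 16 ] || size_mb=16
        echo "${size_mb}M"
}
```

The printed value (for example 16M) can then be passed to the size= option of the mount command above. Remember that whatever the tmpfs holds is taken from RAM (or swap), so a generous margin is not free.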

The problem with mounting a new file system on top of a non-empty directory is that it hides the previous contents of that directory: they are still there, but you can no longer access them. To overcome this drawback, you have to make the original /var/log reachable from another place before mounting the tmpfs. The best way to do this is via a "bind mount".

mount --bind /var/log /var/my_true_log
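
On newer systems the bind mount may need one more step. Commenters below report that on Debian Jessie and later, both mount points ended up showing the tmpfs unless the bind mount was made private first. The sequence could be sketched as follows; the run wrapper and the DRY_RUN variable are hypothetical additions that only serve to preview the commands without being root.

```shell
#!/bin/sh
# Hypothetical wrapper: print the command when DRY_RUN is set, else run it.
run()
{
        if [ -n "$DRY_RUN" ]; then
                echo "$*"
        else
                "$@"
        fi
}

relocate_log()
{
        run mkdir -p /var/my_true_log || return 2
        run mount --bind /var/log /var/my_true_log || return 2
        # Reported as necessary on Debian Jessie and later: keeps later
        # mounts on /var/log from propagating to the bind mount.
        run mount --make-private /var/my_true_log || return 2
}
```

Calling relocate_log with DRY_RUN=1 set just lists the three commands. Whether the extra step is needed depends on how the system sets up mount propagation, so treat it as something to verify with mount | grep log rather than as a rule.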

With the bind mount in place and the tmpfs mounted on top of /var/log, the next thing to do is to populate the new, empty file system with the contents of the old directory, so that the logging facilities can continue to work as if nothing had happened.

cp -rfp /var/my_true_log -T /var/log

If you want to create a script to automate all this, you must be prepared to handle the exceptions: in case the copy should fail, just unmount everything in the reverse order. And before anything else, ensure that you are root, that the directories actually exist, and that the transient log is not already running. The latter can be handled with a lock file. So this will be the preamble to your script.

[ -f /var/lock/transientlog.lock ] && return 1
[ `id -u` -eq 0 ] || return 2
[ -d /var/log ] || return 2
[ -d /var/my_true_log ] || mkdir -p /var/my_true_log

And this will be the complete function.

do_start()
{
        # Return
        #   0 if transient log has been started
        #   1 if transient log was already running
        #   2 if transient log could not be started

        [ -f /var/lock/transientlog.lock ] && return 1
        [ `id -u` -eq 0 ] || return 2
        [ -d /var/log ] || return 2
        [ -d /var/my_true_log ] || mkdir -p /var/my_true_log
        mount --bind /var/log /var/my_true_log || return 2
        mount -t tmpfs -o nosuid,noexec,nodev,mode=0755,size=16M \
              transientlog /var/log
        if [ $? -eq 0 ]; then
                if cp -rfp /var/my_true_log -T /var/log; then
                        touch /var/lock/transientlog.lock
                        return 0
                fi
                # Roll back the tmpfs mount
                umount /var/log
        fi
        # Roll back the bind mount
        umount /var/my_true_log
        return 2
}
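
The lock file only records that do_start() once succeeded. As a cross-check, one could also ask the kernel whether a tmpfs is really mounted on /var/log. This is a sketch under assumptions: is_tmpfs_mounted is a hypothetical helper, not part of the original script, and it merely looks for a tmpfs entry at the given mount point in /proc/mounts.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a tmpfs is mounted on the given directory.
# The second argument (default /proc/mounts) exists only to ease testing.
is_tmpfs_mounted()
{
        dir=$1
        mounts=${2:-/proc/mounts}
        # /proc/mounts fields: device mountpoint fstype options dump pass
        while read -r dev mnt fstype rest; do
                if [ "$mnt" = "$dir" ] && [ "$fstype" = "tmpfs" ]; then
                        return 0
                fi
        done < "$mounts"
        return 1
}
```

A call such as is_tmpfs_mounted /var/log could sit next to the lock file test at the top of the function, so that a stale or missing lock file does not lead to a second tmpfs being stacked on /var/log.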

Stopping

At shutdown, the log files need to be copied back to the permanent storage before unmounting the tmpfs. Using cp with the option -u is an effective way to restrict the copy to only those files that are newer than their on-disk counterparts (or missing there), thus minimizing the writes to the disk. The option -l to umount, known as lazy unmount, tells the kernel to detach the file system from the directory tree right away and to delay the actual cleanup as long as there are processes with an open file on that file system.

cp -rfup /var/log -T /var/my_true_log
umount -l /var/log
umount -l /var/my_true_log

Be aware that a race condition can happen here, because a process may decide to start writing to /var/log right between the copy and the unmount. There is little you can do about it, except trying to stop the transient log after everything else. Be prepared to risk losing the last messages of those processes that are still running!
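
To see what the -u option does before trusting it with real logs, the merge can be tried on two scratch directories. The function below is a hypothetical demonstration only; it assumes GNU cp (for -T and -u) and GNU touch (for -d), and the file names are made up.

```shell
#!/bin/sh
# Hypothetical demonstration of "cp -u" on two scratch directories.
# Requires GNU cp (for -T and -u) and GNU touch (for -d).
demo_update_copy()
{
        ram=$(mktemp -d) && disk=$(mktemp -d) || return 1

        # Present on both sides, the "RAM" copy being the newer one
        echo "old" > "$disk/changed.log"
        touch -d '2010-11-18 20:00' "$disk/changed.log"
        echo "new" > "$ram/changed.log"

        # Present on both sides, the "disk" copy being the newer one
        echo "stale" > "$ram/idle.log"
        touch -d '2010-11-18 20:00' "$ram/idle.log"
        echo "kept" > "$disk/idle.log"
        touch -d '2010-11-18 20:01' "$disk/idle.log"

        # Same command as in the article, on the scratch directories
        cp -rfup "$ram" -T "$disk"

        cat "$disk/changed.log" "$disk/idle.log"
        rm -rf "$ram" "$disk"
}
```

The first file is overwritten because its source is newer; the second is left alone because its source is older than the destination. Note that the decision is based on modification times only, not on file contents.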

After adding the usual sanity checks, the final look of the stop routine is the following.

do_stop() {
        # Return
        #   0 if transient log has been stopped
        #   1 if transient log was already stopped
        #   2 if transient log could not be stopped
        #   other if a failure occurred

        [ -f /var/lock/transientlog.lock ] || return 1

        # Check if I am root
        [ `id -u` -eq 0 ] || return 2

        # Merge back to permanent storage
        cp -rfup /var/log -T /var/my_true_log

        umount -l /var/log
        umount -l /var/my_true_log
        rm -f /var/lock/transientlog.lock
        return 0
}

When a refresh is needed

Now you have two independent log directories, one static and one dynamic, that exchange data only at boot and shutdown. It would be a nice feature to allow the two directories to be synchronized at will. A wise administrator could then add a cron rule to periodically ensure consistency between the transient and permanent directories.

The idea is simple: just duplicate the concept seen for the stopping phase, leaving out the unmounts. Without any further delay, this is your whole routine (called do_reload() for reasons you will soon understand).

do_reload() {
        # Return
        #   0 if transient log has been reloaded
        #   1 if transient log was not running
        #   2 if transient log could not be reloaded

        [ -f /var/lock/transientlog.lock ] || return 1
        
        # Check if I am root
        [ `id -u` -eq 0 ] || return 2

        # Merge back to permanent storage
        cp -rfup /var/log -T /var/my_true_log
        touch /var/lock/transientlog.lock
        return 0
}

The initscript

In order to have the transient log automatically started at boot time and stopped at shutdown, you need to shape everything into an initscript. You can use the skeleton in /etc/init.d as a starting point and merge the three functions defined above, namely do_start(), do_stop() and do_reload().
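
The part of the initscript that maps its argument to those functions could look roughly like this. It is a sketch modelled on the usual initscript layout, not a copy of the finished script: the three functions are replaced by stubs so the fragment runs on its own, and the wrapper name dispatch is hypothetical (a real initscript would normally run the case statement at top level and exit with the resulting code).

```shell
#!/bin/sh
# Hypothetical sketch of the dispatch at the end of the initscript.
# do_start, do_stop and do_reload stand for the functions defined above;
# they are replaced by stubs here so the fragment can be run on its own.
do_start()  { echo "starting";  return 0; }
do_stop()   { echo "stopping";  return 0; }
do_reload() { echo "reloading"; return 0; }

dispatch()
{
        case "$1" in
        start)
                do_start
                ;;
        stop)
                do_stop
                ;;
        reload)
                do_reload
                ;;
        *)
                echo "Usage: transientlog {start|stop|reload}" >&2
                return 3
                ;;
        esac
}
```

Naming the synchronization routine after the reload action is what lets the routine be triggered later through the initscript itself rather than through a separate tool.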

What you do have to change, though, are the so-called LSB headers. These are nothing but comments at the beginning of an initscript, which declare its dependencies on other initscripts. Setting up the right headers saves you from the heavy task of finding the correct order of initscripts, because insserv will do it for you (it is, in fact, the only way to handle the init sequence from Debian Squeeze on).

Transient log must be started before and stopped after any service that writes to /var/log, and you can reasonably assume that the first of such services is syslog. The following lines express these relationships.

# X-Start-Before:  $syslog
# X-Stop-After:    $syslog

Now, this sounds reasonable, and is indeed the best approximation you can have. But for the sake of completeness, you must be aware that, while syslog is for sure the first program to access /var/log, there are so many processes writing log files that you can hardly tell which will be still running at shutdown. This issue has no easy solution, not under Debian's runlevels model. Things may change if Debian moves to other boot schemes, but for the moment you have to live with it. At least, the lazy unmount seen above prevents the system from hanging forever if some process keeps /var/log busy at shutdown.

Another LSB header that you might want to deploy is this Debian-specific keyword:

# X-Interactive:   yes
Although yours is definitely not an interactive script, this keyword instructs insserv that it has to be executed alone, not in parallel to other startup scripts. This is of course to avoid (or at least minimize) the race conditions mentioned above.

There should be no doubt about the runlevels in which transient log is supposed to run and those in which it is not:

# Default-Start:    2 3 4 5
# Default-Stop:     0 1 6

Add some descriptions and your whole LSB header will look like this:

### BEGIN INIT INFO
# Provides: transientlog
# X-Start-Before:       $syslog
# X-Stop-After:         $syslog
# X-Interactive:        yes
# Default-Start:        2 3 4 5
# Default-Stop:         0 1 6
# Short-Description:    Keeps /var/log in RAM
# Description: Moves the contents of /var/log to RAM during boot
#              and keeps it there until shutdown/reboot, when it
#              copies the contents back to permanent storage.
### END INIT INFO
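
As commenters report below, insserv on Debian 7 rejects this header because it lacks the Required-Start and Required-Stop entries, which must be present even if empty. A small pre-flight check can catch that before insserv does. This is only a sketch: missing_lsb_tags is a hypothetical helper, and its tag list is taken from the insserv error messages quoted in the comments, not from a complete specification.

```shell
#!/bin/sh
# Hypothetical pre-flight check: print the LSB tags that insserv was
# reported to insist on and that are absent from the given initscript.
missing_lsb_tags()
{
        script=$1
        for tag in Required-Start Required-Stop; do
                # Tags sit in comment lines such as "# Required-Start:"
                grep -q "^# *$tag:" "$script" || echo "$tag"
        done
}
```

If missing_lsb_tags /etc/init.d/transientlog prints anything, adding the lines "# Required-Start:" and "# Required-Stop:" right after the Provides line, as suggested in the comments, should make insserv accept the script.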

The rest of the script can be left mostly untouched. A finished version can be found here (Local Mirror). Save it as /etc/init.d/transientlog, make it executable, then let insserv create the symbolic links in the relevant runlevels.

# chmod a+x /etc/init.d/transientlog
# insserv transientlog

If you have an older Debian without insserv, you must work out reasonable sequence numbers for the start and stop actions yourself. Assuming 00 for start and 99 for stop, the following command will create the symbolic links that you need.

# update-rc.d transientlog start 00 2 3 4 5 . stop 99 0 1 6 . 

The cron job

The initscript seen in the previous section can be called with the reload argument in order to force the synchronization of the transient and permanent directories. You might want to invoke it periodically, for example once a day: just write a script like the following and put it in /etc/cron.daily.

#!/bin/sh

[ -x /etc/init.d/transientlog ] || exit 0
[ -f /var/lock/transientlog.lock ] || exit 0
/etc/init.d/transientlog reload

 

 


Posted by Anonymous (89.103.xx.xx) on Thu 18 Nov 2010 at 20:24
Thanks for your script. I wanted to give it a try, but insserv complains about missing entries Required-Start and Required-Stop. Are those important?

Posted by Anonymous (84.221.xx.xx) on Thu 18 Nov 2010 at 21:05
Instead of copying /var/log to the temporary dir why didn't you use a union mount?

Posted by mcortese (93.39.xx.xx) on Sun 28 Nov 2010 at 21:34
Sounds like a good idea. I need to elaborate on this...

Posted by banchieri (2a01:0xx:0xx:0xxx:0xxx:0xxx:xx) on Wed 19 Jan 2011 at 22:25
It is a good idea: Mount an aufs with a ramfs as writable top branch onto /var/log and during shutdown, use aubrsync to save the changes back to the original /var/log.

Posted by mcortese (20.142.xx.xx) on Mon 14 Feb 2011 at 16:59
I thought aufs had been rejected by the Linux kernel team.

Posted by lpenz (189.74.xx.xx) on Thu 18 Nov 2010 at 23:20
You can also completely disable logging.

busybox has an interesting solution: all log messages are stored in a circular buffer in RAM.

Posted by Anonymous (193.178.xx.xx) on Fri 19 Nov 2010 at 08:02
Correct me if I'm wrong: right after mounting over the existing /var/log, all running processes still hold file descriptors on the old partition (on disk), so they will keep writing to it until I restart the logging processes?

Posted by mcortese (93.39.xx.xx) on Sun 28 Nov 2010 at 22:22
Correct. That's why it has to be started before anything that writes to /var/log.

Posted by Anonymous (59.167.xx.xx) on Fri 19 Nov 2010 at 11:15
Why not configure the syslogd to not do synchronous writes and then use the commit= mount option (for ext3/4 filesystems) to allow the data to be stored in cache for longer periods and thus allow fewer and bigger writes?

The approach described here will lose all logs if the system crashes before writing back while configuring a filesystem to delay writes by a minute will just risk losing the last minute's worth of logs and not require any special cron jobs.

Posted by mcortese (93.39.xx.xx) on Sun 28 Nov 2010 at 22:17
Alas, syslogd is just one of the several programs which write to /var/log: configuring all of them can be painful. Furthermore your comment seems to assume that /var/log is a file system on its own, which is not usually true.

As in many similar situations, the risk of losing n minutes worth of logs is inversely proportional to the rate of writes to disk. You might go from entirely skipping the cron part (and risk losing all log files since last boot), to configuring syslogd to sync the disk after each write, and even to logging to two redundant remote machines, like someone else suggested in the comments. It depends on how much you value the log files and how you use them: it's up to you to tune the system and find the balance that suits you (someone even suggested you should get rid of log files in the first place).

Posted by Anonymous (190.3.xx.xx) on Fri 19 Nov 2010 at 14:06
Great article! I'm thinking of taking a similar approach for an embedded proxy server running Squid.
With enough RAM, you can store all of Squid's cache in RAM without installing a hard disk.

Thanks!

Posted by vegiVamp (81.246.xx.xx) on Mon 22 Nov 2010 at 13:09
Umm... What's the point of logs? Oh, right, to see what happened AFTER THINGS WENT HORRIBLY WRONG.

When things go horribly wrong, how likely would you say it is that your stopscript has a chance to run? Close to zero, I'd say.

No, if you want your logs to not bother your SSD or sleeping harddisk, simply get one of the teensie USB sticks - they go up to 128G these days - and mount that at /var/log. Alternatively, and it is what I do, dedicate a machine (or two if you want redundancy) to remote syslogging, and turn off local logging. All log entries get sent over UDP to a single destination, where they can additionally easily be parsed by something like logcheck, which then mails you any anomalies.

Posted by Anonymous (93.125.xx.xx) on Mon 22 Nov 2010 at 20:38
I'm with vegiVamp. Matteo, you must be mad to want to implement a transient /var/log. This is the worst idea ever.

Posted by Anonymous (87.127.xx.xx) on Fri 26 Nov 2010 at 08:16
> Umm... What's the point of logs?

Erm ... debug, info, notice, warning, warn, error, err, crit, alert, emerg;
plenty there not worth panicking about.

Posted by mcortese (93.39.xx.xx) on Sun 28 Nov 2010 at 21:58
Well, if you really think that log files are only useful in case of disaster, then yes, this solution is not for you. And neither is the one you propose about an external USB stick: you really need log files recorded on a different machine, like you finally suggest. Unfortunately, that solution may be difficult to implement if you are not continuously connected to a well-known, stable network or for a home installation without a second PC to spare.

Granted, the solution I propose is NOT safe in a lot of circumstances, including if your hardware catches fire and so on. I would not recommend it for a server or a mission-critical system.

But if you have a netbook for personal use and just browse through log files now and then, for example to see why that new application failed to start, or what that new driver is complaining about, then it may perfectly suit your needs.

Posted by vegiVamp (81.246.xx.xx) on Mon 29 Nov 2010 at 11:53
I never said they were useful *only* for the worst-case, but as an administrator, that *is* their primary purpose for me, yes. What will you do if the new driver complains by causing a kernel panic?

Things aren't black and white; the USB stick solution is a good intermediate where you both don't wear out the SSD *and* don't lose your logs in the event of catastrophic failure. Well, unless the failure is more than catastrophic and destroys your actual hardware, at which point it's probably more a matter of finding out who dropped your laptop into the large hadron collider.

Drives like this aren't even in the way on a laptop: http://www.2dayblog.com/2009/03/04/elecoms-mf-su2-series-thumbdrive/

Posted by linulin (91.202.xx.xx) on Sun 12 Dec 2010 at 13:56

> What will you do if the new driver complains by causing a kernel panic?

You want to have console access in this case, (either physical, or connected to some machine via serial port). Kernel panic logs may not reach your /var/log, regardless of how it is mounted and where syslog is configured to send logs to.

--
...Bye..Dmitry.

Posted by Anonymous (206.212.xx.xx) on Thu 24 Feb 2011 at 20:14
Unless I am mistaken, it looks like the 'stop' and 'reload' commands expect the disk log (/var/my_true_log in article, $VARLOGPERM in sample script) to remain mounted while the tmpfs is in use. However, the last command of do_start() unmounts the disk log. Is this a bug?

Posted by mcortese (20.142.xx.xx) on Fri 25 Feb 2011 at 15:56
No. The umount command of do_start() is in the rollback path: it is executed only if something wrong happens before. If everything works as expected, the do_start() function is exited (return 0) after touching the lock file.

Posted by Anonymous (128.176.xx.xx) on Tue 15 Nov 2011 at 15:38
Why did you use cp and not rsync? I would guess an "incremental backup" with rsync would wear down the SSD even less than a cp, which might just overwrite the whole block.

Posted by mf_user (188.217.xx.xx) on Fri 5 Oct 2012 at 18:47
Greetings,

thanks for the script and the explanation. I see in the script that it is inspired by ramlog (http://www.tremende.com/ramlog). May I ask what the main differences between this solution and ramlog are, and in which scenarios, in your opinion, one is better (or worse) than the other?

Thanks a lot! MF

Posted by mcortese (85.158.xx.xx) on Mon 22 Oct 2012 at 13:41
If I remember correctly, ramlog was not tailored for Debian, and had some very complex mechanisms to install/remove it.

Posted by Anonymous (178.196.xx.xx) on Sun 2 Feb 2014 at 12:45
Calling insserv transientlog returns an error.

root@XXXX:/etc/init.d#insserv transientlog
insserv: Script transientlog is broken: incomplete LSB comment.
insserv: missing `Required-Start:' entry: please add even if empty.
insserv: missing `Required-Stop:' entry: please add even if empty.
root@XXXX:cat /etc/debian_version
7.2

Just add both entries after #Provides like:

# Provides: transientlog
# Required-Start:
# Required-Stop:
# X-Start-Before: $syslog
# X-Stop-After: $syslog
# X-Interactive: yes

Posted by Anonymous (88.156.xx.xx) on Sun 31 Aug 2014 at 09:37

Hi Matteo, thanks for this great script. I wanted to point out that in Debian Jessie (current testing release) I had to add

mount --make-private $VARLOGPERM

after the mount --bind, otherwise after calling mount -t tmpfs... the 2 mountpoints would share the same content.

Posted by Anonymous (96.127.xx.xx) on Mon 15 Sep 2014 at 06:00
Hey thanks for this, I was wondering why it didn't work since I dist-upgraded from wheezy to sid.

For those who might be wondering about this as well, the symptoms are that without the mount --make-private, the mount -t tmpfs ... transientlog /var/log operation actually makes *both* /var/log and /var/my_true_log tmpfs mounts. You'd see something like this:

$ mount | grep log
/dev/mapper/sda5_crypt on /var/my_true_log type ext4 (rw,relatime,errors=remount-ro,data=ordered)
transientlog on /var/log type tmpfs (rw,nosuid,nodev,noexec,relatime,size=65536k,mode=755)
transientlog on /var/my_true_log type tmpfs (rw,nosuid,nodev,noexec,relatime,size=65536k,mode=755)


I've also just discovered the existence of systemd.journald, which by default keeps logs only in RAM. So I'm wondering if one might not simply uninstall rsyslogd and let systemd.journald handle the logging -- but that might lead to dependency issues for packages (I'm not sure yet).
