Posted by jeld on Mon 24 Oct 2005 at 14:09
Every system administrator worth his pay knows, that one's system should log as much as possible and that logs should be checked for errors, warnings, suspicious activities and other things. Unfortunately, by default, logging is set up in a way that makes this job difficult, bordering on impossible.
There is an old problem with the default system log daemon syslogd. It can only log things according to two parameters passed to the syslog() function call by the program doing the logging.
These parameters are called facilities and priorities. There are eight standard priorities defined: emergency, alert, critical, error, warning, notice, info and debug.
This generally should give any program a sufficient control over how badly we want to pass our message to the human in charge. But priority at which a particular message is logged, determined by the creator(s) of the program in question and their opinion of seriousness of the situation is not always the same as that of the system administrator (albeit arguably if the developer of the program considers a particular message critical you might want to see it at least once).
This is far from ideal. On the other hand, facilities have been inherited from the olden UNIX times and for reasons unknown have been passed down the generations. There are twelve standard facilities (auth, authpriv, cron, daemon, ftp, kernel, lpr, mail, news, syslog, user, uucp), plus eight of so called local facilities ranging from local0 to local7.
Just reading the list of default facilities you might see the problem. Where my samba server would log things? Daemon? What about SSH? Auth or user? What about my LDAP server? How many people on earth are still running UUCP that there is a separate facility for it? How do I make my DNS server log into a separate file?
Obviously a lot of programs provide their own logging and do not use syslog for anything except startup and shutdown messages (if that). A good example would be Apache web server. But centralized logging control is a nice thing to have and not everything in this world has its own logging facilities.
What can we do? One of the solutions would be to setup the logging according to priorities and meticulously check anything that gets logged with priority higher then, say, error. The problem with this approach is that not every message that goes there will really be important to you and you can still miss things that you wanted to know and the developers didn't think were important enough.
The solution presented here involves three distinct steps.
Replacing syslogd
There are a few different flavors of syslogd replacements out there, you can find them by running:
apt-cache search syslog daemon
For our purposes we will use the one called syslog-ng (ng stands for new generation). So lets start by installing it.
dpkg -P sysklogd klogd apt-get install syslog-ng
By default syslog-ng will remove sysklogd and the kernel logger, but we do not want their configs to sit in our /etc till the underverse comes and therefore we purge them explicitly (personally I added APT::Get::Purge "true" to my apt.conf so that anything I uninstall gets purged, but apt settings are not for this article).
Customizing syslog-ng config
The default config file for the syslog-ng is called, quite unsurprisingly, in /etc/syslog-ng/syslog-ng.conf. Take a look at it. Scared? Its OK, I was scared too the first time around. OK, lets stop the daemon and customize the file.
/etc/init.d/syslog-ng stop cp /etc/syslog-ng/syslog-ng.conf /etc/syslog-ng/syslog-ng.original vim /etc/syslog-ng/syslog-ng.confThe basic rules of the syntax for the syslog-ng.conf are that everything ends in semicolon and everything is enclosed in curly braces. So, the first thing of order is to have everything log into a single file, then we can look at the result and split the logs into more files. Mind you, considering that we are about to setup a logging monitor a single file would do just as well, but just to show you how much nicer syslog-ng is, compared to sysklogd we will do some log splitting.
Also, once in a while (generally when there is a problem) you might want to look at complete logs, and then it is very useful to have them sorted in a nice logical fashion. The initial file will look like this:
options {
chain_hostnames(0);
time_reopen(10);
time_reap(360);
log_fifo_size(2048);
create_dirs(yes);
owner(root);
group(root);
perm(0644);
dir_perm(0755);
use_dns(no);
};
source int { internal(); };
source main { unix-stream("/dev/log"); };
source kernel { file("/proc/kmsg" log_prefix("kernel: ")); };
destination messages { file("/var/log/messages"); };
log { source(main); source(int); source(kernel); destination(messages); };
Confused? In syslog-ng we define four main things. The sources of log messages, destinations for the log messages (for example log files), filters defining what messages out of the source actually go to a particular destination and log definitions where we put those three together to form a logging mechanism. In this case we don't have any filters (everything goes into /var/log/messages) we only have one destination and we get messages from three standard sources. We can start the syslog-ng daemon back and watch it dump all the stuff into the messages file.
/etc/init.d/syslog-ng start tail -f /var/log/messages
Now we have to fix something. You see, by default our logs get rotated by logrotate program running from a cronjob every night. If logrotate misses a few files it will still run, but it will complain.
So, lets go and edit /etc/logrotate.d/syslog-ng. Just remove everything and leave the messages file. Be careful to leave around the postrotate script.
The resulting file should look something like this:
/var/log/messages {
rotate 4
weekly
missingok
notifempty
compress
postrotate
/etc/init.d/syslog-ng reload >/dev/null
endscript
}
OK, now that we have logging and rotation under control lets split the logs into several files.
This is where personal taste and system functionality plays the role. For example, on my systems I found that kernel messages are generally something I actually want to see separately and that information from the mail server clutter the messages file a lot. So, for that I add a filter separating mail messages from the rest and setup a couple of new destinations. So the end of the file now looks like this:
filter mail { facility(mail); };
destination mail { file("/var/log/mail.log"); };
destination kernel { file("/var/log/kernel.log"); };
destination messages { file("/var/log/messages"); };
log { source(kernel); destination(kernel); };
log { source(main); filter(mail); destination(mail); flags(final); };
log { source(main); source(int); destination(messages); };
Notice the final flag on the mail log? That insures, that the message processing actually stops at that entry of the log message matches the filter. This way although everything still goes into messages, mail logs only go into mail log file.
As you can see, I use the standard mail facility to filter out the mail logs and you might be thinking, why would he go into the trouble of replacing sysklogd if he still uses standard filtering?
Here are a few things you cannot do with standard sysklogd. Since my systems are serving web content, there is no reason for anyone except me to log into them. On the other hand I know there are people out there who will constantly try. Therefore anything that SSH daemon has to tell me I consider important.
destination sshd { file("/var/log/ssh.log"); };
filter sshd { program("ssh");
log { source(main); filter(sshd); destination(sshd); flags(final); }
Lets say that you are running a very active mail server and even daily rotation of logs makes them impossible to parse. How about rotating logs when they reach certain size? Sure, there is a program in apache2-utils package called rotatelogs. So, all we have to do, to make our mail logs rotate when they reach 500MB in size is to change our mail destination to something like
destination mail { program("/usr/sbin/rotatelogs /var/log/mail 500M"); };
Get the idea? And this is not all. There are a lot of useful options in syslog-ng that might make your log files more comprehensible. To see more things you can do with syslog-ng, check out its web site. Remember, that every time you change one of the standard log definitions you might want to check the logrotate config and add or remove appropriate entries. And obviously, before attempting doing any of these steps you want to RTFM, so
man syslog-ng man syslog-ng.conf man logrotate man logrotate.conf
Setting up log monitoring
OK, so all your logs go into appropriate places and don't mix with one another and rotate properly and all the other nice things from the previous chapter.
You might even wade through the important ones every morning looking for anything that might be out of order.
That process quickly gets tedious. And what if something bad happens that you would want to know right away or at least in a matter of minutes, not the next morning?
Yes, there is a solution. First, I will digress a little. A few weeks ago a rather controversial article was published by a security professional Marcus Ranum. The title of the article was The Six Dumbest Ideas in Computer Security. The article discusses some common approaches to computer and network security and discusses (in rather harsh terms) why is it not a good idea to follow them. In my opinion Mr. Ranum, being a very intelligent person, is very convincing, but not always very realistic. But one of the discussed practices struck me as very true and very bad. Marcus calls it "enumerating badness". In this particular case, topic of this article being logging, I was looking at different programs to monitor logs for signs of bad things. As I soon found out, most log watcher programs out there are doing exactly what Marcus Ranum warns us against. The want you to have a list of patterns of bad stuff and they will tell you when they see such a pattern appear in one of your logs.
Why is it bad? Because there is a lot of bad stuff that can possibly happen and you cannot be sure to include every possible pattern in your watch lists. The right approach is of course weed out good stuff. That is if you see a message and you know that it is not important, filter it out.
Soon, the only messages you get are the one worth paying attention. Enter logcheck. Logcheck is a bunch of clever shell scripts that do exactly that. Even better, Debian users are blessed, since a lot of Debian packages provide logcheck pattern files with patterns for common unimportant messages. So, as usual the first step, is to install:
apt-get install logcheck
Now we need edit a few things. First lets define what files we want to watch. This is defined in /etc/logcheck/logcheck.logfiles.
My logcheck.logfiles looks like this:
/var/log/messages /var/log/kernel.log /var/log/mail.log
Now lets look at the actual config file. Generally speaking the only thing you need to specify here is the address where you would like to receive the alerts, but there are a few things here that might warrant changing. One of them is the report level.
Logcheck maintains three distinct sets of patterns, designed for different system and different administrators. These sets are called "server", "workstation" and "paranoid".
It is easy to guess, that workstation is most liberal at weeding out things, server is a bit more conservative and paranoid will start sending alerts at the slightest provocation.
Choosing a particular level is up to you, but I found myself quite comfortable using the default server setting. The file is very well commented and should give you no trouble. Here is what I did:
INTRO=0 REPORTLEVEL="server" SENDMAILTO="myemail@company.com" ADDTAG="yes" ATTACKSUBJECT="Attack Alerts" SECURITYSUBJECT="Security Events" EVENTSSUBJECT="System Events"
The ADDTAG allows me to easily filter logcheck alerts to a separate folder, and INTRO=0 makes logcheck skip silly message headers and footers.
Now you want to make sure that your MTA is actually capable of delivering mail (the default setup of exim should be enough for Debian users, personally I replaced it with postfix, but this is again out of the scope of this article).
And there you go. Logcheck runs from cron. The file controlling its frequency is located in /etc/cron.d/logcheck. By default it runs every hour.
I found that every hour is enough for me to keep track of potential issues. One thing about logcheck and syslog-ng are permissions on the log files. If you look at my samples of syslog-ng.conf again, you will notice (or already notices) that I set default log file permissions to 644. This is rather important, since logcheck doesn't run as root (it creates its own logcheck user) and with default permissions of 640 it is unable to access the log files.
Another way of solving this is to set the default group ownership of log files to the logcheck group.
Customizing logcheck rules
So, now you open your mail and see all the interesting log messages neatly sitting in your mail. No more wading through endless log files and no more scrolling through thousands of similar unimportant log entries. What next?
Right. Albeit default logcheck rules weed out the most common stuff, in the real world and on a real system there are still a lot of repeating messages that you consider to be non-critical.
This is where things get tricky. Logcheck uses grep (or to be precise egrep) to weed out log messages, so logcheck pattern files are actually just lists of extended grep regular expressions.
If you are good with grep and something like the following makes perfect sense to you, you are in luck.
^\w{3} [ :0-9]{11} [._[:alnum:]-]+ jabberd/s2s\[[0-9]+\]: \[[0-9]+\] \[[0-9.]{7,15}, port=[0-9]+\]
If not, remember Douglas Adams. Do Not Panic.
Log entries produced by syslog-ng (and by sysklogd and by some other logger daemons) all start more or less the same. They have date and time, name of the host system usually continued by the name of the program doing the logging and by the message itself.
The pattern files for logcheck are located in /etc/logcheck/ignore.d.[report level] directories.
If you followed my lead and didn't change the default report level, the patterns used will be in /etc/logcheck/ignore.d.server. Since we do not want to interfere with software packaging, we might want to create a separate file for our custom patterns. I use /etc/logcheck/ignore.d.server/local.
Since describing proper writing of regular expressions is a big job, I will give you two examples. Soon after setting up logcheck, I found that due to some reasons, my system is attempting to send mail messages with no recipients.
After checking out the source of the problem, I found that nothing sinister is going on and decided to filter these postfix messages out. The message in question looks like this (here and in other sample logs and patterns I inserted line breaks for better readability. In the actual files these are single lines):
Oct 19 13:08:19 hostname postfix/cleanup[6560]: E366374E63: to=, relay=none, delay=1, status=bounced (No recipients specified)
Since I didn't want to come up with a regular expression for this from scratch, I looked at the existing pattern list for postfix. A quick grep gave me a sample:
grep clean postfix
^\w{3} [ :0-9]{11} [._[:alnum:]-]+ postfix/cleanup\[[0-9]+\]: [[:alnum:]]+: message-id=.*$
^\w{3} [ :0-9]{11} [._[:alnum:]-]+ postfix/cleanup\[[0-9]+\]: [[:alnum:]]+: resent-message-id=<.+>$
Now all I have to do is copy the start of these patterns and add something that would be unique for the message I want to filter. The result I came up with looks like this: ^\w{3} [ :0-9]{11} [._[:alnum:]-]+ postfix/cleanup\[[0-9]+\]: [[:alnum:]]+: [[:print]]+ \(No recipients specified\)
After I added this line to my custom patterns logcheck stopped bothering me about mails with no recipients being bounced.
After that, I installed the shorewall firewall on some of my system and immediately started receiving a lot of messages about packets being dropped by the firewall. A sample message looks like this:
Oct 21 11:05:24 hostname kernel: Shorewall:net2all:DROP:IN=eth0 OUT= MAC=00:06:5b:0f:3e:ba:00:0e:38:57:d1:80:08:00 SRC=[Source IP] DST=[My IP] LEN=1067 TOS=0x00 PREC=0x00 TTL=114 ID=11979 PROTO=UDP SPT=50606 DPT=80 LEN=1047
Now, if the packet was dropped, I think I am reasonably safe. So I don't want logcheck bother me every hour with loads of these lines.
I looked at the set of kernel patterns, and came up with the following:
^\w{3} [ :0-9]{11} [._[:alnum:]-]+ kernel: Shorewall:net2all:DROP:IN=eth0
This was sufficient and so far I am a happy system administrator, knowing what is going on on his systems and not having to go too far to obtain that knowledge.
Obviously this is not the only measure that I took on my way to true enlightenment, but it is one of the important ones.
If this article is well received, I will continue to document some of the methods I have come up with to keep my systems safe and secure. Till then, good luck to you all.
Dmitriy Kropivnitskiy
Happy System Administrator
This article can be found online at the Debian Administration website at the following bookmarkable URL (along with associated comments):
This article is copyright 2005 jeld - please ask for permission to republish or translate.