More hardware monitoring: IPMI
Posted by simonw on Fri 27 Jan 2006 at 13:40
Many higher-end servers have an Intelligent Platform Management Interface (IPMI), which lets you observe a whole host of hardware parameters. Usually these systems also support plug-in remote management cards (for example DELL RAC cards) that allow remote resets and other remote diagnostics.
This software used to be a pain to install, as it required kernel patches or extra modules, but we needed some thermal monitoring added in a hurry here due to air conditioning problems, and it seems it is now much simpler.
On a DELL 2650 running Debian Sarge with the stock 2.6 Sarge kernel:
# apt-get install ipmitool
# /usr/share/ipmitool/ipmi.init.basic
# ipmitool -I open sensor list
If the last two commands work and produce useful output, all you need to do is make it work after the next reboot (the device file created by the init script may need a different major device number) and find some way of handling the output. The tools also allow management over the network. For the reboot I went with the old /etc/rc.boot directory, just sticking the ipmi.init.basic script in there (see /etc/init.d/rcS).
For monitoring we've gone with a simple Perl script that checks everything is okay and pages us if it isn't. We tested it by setting the upper non-critical (unc) temperature threshold below ambient temperature.
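The Perl script itself isn't shown here, so what follows is only a rough shell sketch of the same idea, not the script we actually run. It assumes the "sensor list" output is pipe-separated with the sensor name in column 1, the unit in column 3 and the status in column 4, that healthy analogue sensors report "ok" (or "na" when there is no reading), and that rows with the unit "discrete" carry a hex state rather than a status and can be skipped. The function name failing_sensors is made up for illustration.

```shell
#!/bin/sh
# Sketch: read `ipmitool sensor list`-style rows on stdin and print every
# analogue sensor whose status column is neither "ok" nor "na".
# Exit status is 1 if any such sensor was found, 0 otherwise.
# Assumed column layout: name | reading | unit | status | thresholds...
failing_sensors() {
    awk -F'|' '
        NF >= 4 {
            name = $1; unit = $3; status = $4
            # strip the padding ipmitool puts around each column
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", unit)
            gsub(/^[ \t]+|[ \t]+$/, "", status)
            if (unit == "discrete") next    # assumption: hex state, not a status
            if (status != "ok" && status != "na") {
                printf "%s: %s\n", name, status
                bad = 1
            }
        }
        END { exit (bad ? 1 : 0) }
    '
}
```

From cron it could be used along the lines of "ipmitool -I open sensor list | failing_sensors", with whatever paging or mail command you prefer run when it exits non-zero.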
ipmitool also lets you adjust the thresholds. We figured early warning of temperature issues is kind of important to us right now, so we tweaked down the upper non-critical thresholds.
ipmitool -I open sensor thresh "ESM Frt I/O Temp" unc 40
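To confirm a threshold change took effect, the value can be read back from the sensor listing. This is a small sketch under the assumption that the upper non-critical threshold is the eighth pipe-separated column of the "sensor list" output; the function name unc_threshold is hypothetical.

```shell
#!/bin/sh
# Sketch: print the upper non-critical (unc) threshold of one sensor,
# reading `ipmitool sensor list`-style rows on stdin.
# Assumption: column 8 holds the unc value.
# Exits non-zero if the named sensor is not present in the input.
unc_threshold() {
    awk -F'|' -v want="$1" '
        {
            name = $1
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            if (name == want) {
                unc = $8
                gsub(/^[ \t]+|[ \t]+$/, "", unc)
                print unc
                found = 1
            }
        }
        END { exit (found ? 0 : 1) }
    '
}
```

For the sensor above that would be something like: ipmitool -I open sensor list | unc_threshold "ESM Frt I/O Temp"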
IPMI also allows watchdog checking for operating system crashes, but I'll likely ignore that for now; crashes really aren't a big problem.
Anyone familiar with this technology going to tell me what I should have done? And how it fits with the other free software for such tools?
These are usually found only on Intel's server boards, so give it a try in that case. I guess it's more useful than the usual "sensors".
On Ubuntu with the 2.6.15-26-server kernel, the ipmitool init script seems to fail with a DRAC3, which doesn't work with the KCS module.
Here are some instructions that worked for Ubuntu, or might be required for newer versions of Debian on machines with a DRAC3 card:
Install IPMI
apt-get install ipmitool
/usr/share/ipmitool/ipmi.init.basic
If this throws an error as follows ...
Setting up OpenIPMI driver...
FATAL: Module ipmi_kcs_drv not found.
... then make a new file for debian/ubuntu:
nano /usr/share/ipmitool/ipmi.init.basic
The top section looks like this:
# load the ipmi modules
modprobe ipmi_msghandler
modprobe ipmi_devintf
if ! modprobe ipmi_kcs_drv ; then
    modprobe ipmi_si # try new module name
fi
This check doesn't work: it throws an error that it can't load the KCS driver.
Change it to:
modprobe ipmi_msghandler
modprobe ipmi_devintf
modprobe ipmi_si # try new module name
Save as: /usr/share/ipmitool/ipmi.init.debian
chmod 755 /usr/share/ipmitool/ipmi.init.debian
Run this script from startup as described in the article above.
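If you would rather keep one script that copes with both module names, something like the following might do. It is only a sketch: it relies on modprobe's -q option to stay quiet when a module isn't found, and the function name load_first_module and the LOADER override (there purely so the logic can be dry-run without root) are made up for illustration.

```shell
#!/bin/sh
# Sketch: try each candidate module name in turn, print the first one
# that loads and return 0; return 1 if none of them load.
# LOADER defaults to "modprobe -q" (quiet about modules it cannot find)
# and can be overridden with another command for a dry run.
load_first_module() {
    for mod in "$@"; do
        # $LOADER is deliberately unquoted so "modprobe -q" splits into words
        if ${LOADER:-modprobe -q} "$mod" 2>/dev/null; then
            echo "$mod"
            return 0
        fi
    done
    return 1
}
```

In the init script the third modprobe line would then become: load_first_module ipmi_kcs_drv ipmi_si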
Many of the commands don't seem to work with the DRAC3, but the one I was after was ipmitool sel elist, and that works fine now.
Once again, thanks for the info.
Neale Rudd
Metawerx Pty Ltd
I'm now off in search of the source for the ipmi_kcs_drv module to see if that gives more info.
Thanks for the article, and thanks for the informative comments above.
There is a typo in the first "code" block: "ipmitools" should be "ipmitool".
Thanks again, I keep on reading & trying it :)