System hangs while accessing USB 2.0 disks
Posted by rmcgowan on Thu 22 Dec 2005 at 02:18
I've recently upgraded my system to the new Debian 3.1r0a, using kernel version 2.6.8-2-686-smp. I purchased my system in late 2001 (November, I believe). My initial efforts to install with the 2.6 kernel failed, because of what later turned out to be APIC-related problems. Booting with pci=noapic allowed me to use the 2.6 kernel.
But I wanted the full functionality of the 2.6 kernel, so I obtained a BIOS upgrade for my motherboard from ESupport.com. I can now boot the 2.6 kernel without the pci=noapic option.
However, during boot and on examination of /var/log/messages, I see a rather large number (153 in a file of 4096 total lines) of the error message "APIC error on CPU0: 00(40)". There are minor variations in the strings listed, some referring to CPU2, some with the last 'word' as "40(40)". The '00(40)' is always first. The sequence is the same for both CPUs.
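To see how the messages break down per CPU and per code, the log can be tallied with a short script. This is only a sketch: the function name count_apic_errors is made up for illustration, and the pattern assumes the messages look exactly like the one quoted above ("APIC error on CPU0: 00(40)").

```python
import re
from collections import Counter

# Matches text such as "APIC error on CPU0: 00(40)" anywhere in a log line.
APIC_RE = re.compile(r"APIC error on CPU(\d+): ([0-9a-fA-F]+)\(([0-9a-fA-F]+)\)")


def count_apic_errors(lines):
    """Return a Counter keyed by (cpu number, 'xx(yy)' code string)."""
    counts = Counter()
    for line in lines:
        match = APIC_RE.search(line)
        if match:
            cpu, first, second = match.groups()
            counts[(int(cpu), f"{first}({second})")] += 1
    return counts
```

Passing an open handle on /var/log/messages to count_apic_errors gives one count per CPU and code combination, which makes it easier to compare the two processors.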
Dumping /proc/cpuinfo gives the following details.
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 8
model name : Pentium III (Coppermine)
stepping : 10
cpu MHz : 1003.681
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips : 1986.56

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 8
model name : Pentium III (Coppermine)
stepping : 10
cpu MHz : 1003.681
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips : 2002.94
Searching the web for this error message returned lots of hits, most being similar to this one. One I found seemed to indicate the error is innocuous and I can ignore it.
But is it? I also have the problem of my system hanging during fairly heavy I/O on the USB bus (3 data streams: two cp runs of ~500MB files, and a cpio restore of a several-gigabyte file). Is it possible the two are related? Do I need to consider purchasing newer CPUs (or should I just go back to pci=noapic)?
Thanks,
Bob
[ Parent | Reply to this comment ]
Falko
http://www.smi-softmark.de
[ Parent | Reply to this comment ]
No, I have not yet tried to build a new kernel. Mostly due to lack of time, right now ;-) 'tis the season for lots of other duties...
I will give it a go next week, if I'm lucky, or early Jan., and see if that fixes the problem with the disks.
[ Parent | Reply to this comment ]
Well, I got lucky. Here's the lowdown.
- I downloaded and compiled kernel 2.6.14.4, the latest stable version as of last week. In the 'make xconfig' session, I did not change anything from the installed kernel config defaults, I just saved the configuration and exited.
- I used the Debian kernel-package to build a .deb file and installed it using dpkg. The reboot failed. The pertinent parts of the error message were 'not syncing' and 'unable to mount root fs on unknown-block(0,0)'.
- The result of a Google search on this error was the discovery that there was no 'initrd.img' for the new kernel. Creating one resolved the boot problem.
- I'm now able to run at least 4 concurrent I/O streams from the USB disk. One of them was a file of a couple of GB, which ran for about 10 minutes before I killed it (the original problem manifested within 2 or 3 minutes). So it looks like the system hang problem is fixed.
- And, as a by-product, no more 'APIC error on CPU# ...' either.
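The missing initrd is easy to overlook after a custom build, so a quick check of /boot before rebooting may save a failed boot. The sketch below assumes the usual Debian naming (vmlinuz-VERSION paired with initrd.img-VERSION); the function name kernels_missing_initrd is made up for illustration. A kernel with all the needed drivers built in can boot without an initrd, so a hit here is a hint, not proof of a problem.

```python
import os


def kernels_missing_initrd(boot_dir="/boot"):
    """List kernel versions that have a vmlinuz-* image but no initrd.img-* file."""
    names = set(os.listdir(boot_dir))
    missing = []
    for name in sorted(names):
        if name.startswith("vmlinuz-"):
            version = name[len("vmlinuz-"):]
            # Assumed Debian convention: the initrd carries the same version suffix.
            if f"initrd.img-{version}" not in names:
                missing.append(version)
    return missing
```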
[ Parent | Reply to this comment ]
If I transfer 40 files of 700 kB each, the USB (SCSI) device goes offline (I may post the /var/log/messages errors here later), but if I write a one-liner script that sleeps 4 seconds between file copies, everything goes OK.
I was using Debian Etch with a 2.6.12 kernel, and the system is about 3 months out of date, since the machine is completely offline.
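The pause-between-copies workaround described above could look roughly like this. It is a sketch only: the name copy_with_pause and its parameters are hypothetical, and the 4-second default simply mirrors the delay mentioned in the comment.

```python
import os
import shutil
import time


def copy_with_pause(sources, dest_dir, pause_seconds=4.0):
    """Copy each file into dest_dir, sleeping between copies; return the new paths."""
    copied = []
    for index, src in enumerate(sources):
        if index > 0:
            time.sleep(pause_seconds)  # wait before starting the next copy
        dst = os.path.join(dest_dir, os.path.basename(src))
        shutil.copyfile(src, dst)  # copies file contents only, not permissions
        copied.append(dst)
    return copied
```

This only works around the symptom by spacing out the writes; it does not explain why the device goes offline.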
--
Andreyev
PS: excuse me for any language mistakes, I am still learning English.
[ Parent | Reply to this comment ]
The file system corruption is probably due to issues with VFAT support, and not related in any way to the USB bus hanging.
When I installed the hotplug package, one of the things I found, in the /etc/hotplug/*conf file, was the comment that VFAT does not yet fully support the options required to allow hotplugging the device. Specifically having to do with "auto syncing" (I don't have access to a Debian system right now so I can't look up the exact name/config values).
To use VFAT devices with USB hotplugging, you will need to be sure that you use the 'sync' command from a terminal window before you unplug, otherwise you will have data and/or filesystem corruption.
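For scripts that write to such a device, the same flush can be done programmatically. The following is a minimal sketch, assuming a Unix system with Python 3.3 or later (where os.sync is available); the function name write_and_flush is made up for illustration.

```python
import os


def write_and_flush(path, data):
    """Write bytes to path and ask the kernel to flush them to the device."""
    with open(path, "wb") as f:
        written = f.write(data)
        f.flush()              # empty Python's own buffer
        os.fsync(f.fileno())   # flush this file's data to the device
    os.sync()                  # same request as the 'sync' command, system-wide
    return written
```

The device should still be unmounted before it is unplugged; the flush only reduces the amount of data left waiting in memory.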
[ Parent | Reply to this comment ]
Well, my comment needs fixing. There is a configuration file that talks about VFAT and it talks about lack of support for "sync-mounting". But this is not a 'hotplug subsystem' configuration file. It is:
/etc/usbmount/usbmount.conf
My apologies for any confusion I caused (I was beginning to think I'd lost my mind, 'cause looking for this file where I thought I'd found it was coming up completely empty).
[ Parent | Reply to this comment ]