Weblog entry #4 for forrest

my linux instability hell
Posted by forrest on Fri 8 Dec 2006 at 04:53
My main machine, my pride and joy, my desktop ... has slowly become more and more unstable, to the point where I really can't ignore it any more.

These are more notes to myself to try and keep track of what I have and haven't tried, but if anyone has a suggestion I'd be glad to hear it.

This machine, gemini, is a dual-Athlon box with a Tyan S2466N-4M mobo.

I have an ATI graphics card and a ViewSonic VX900 LCD monitor: a combo that acts funky in ways I may get around to describing once I've covered the system hangs.

When I got this machine I was running a 2.4.x SMP kernel, and even then it hung occasionally. The hangs seemed to coincide with a 3-D screensaver being chosen.

So, I just didn't do that.

The SMP kernels seemed to work fine up to 2.6.10. I didn't upgrade for a long time and the next kernel I tried was 2.6.16. With SMP turned on, I could get as far as logging into my GNOME session, but it would hang pretty quickly whenever I started doing anything. I found I could consistently get it to hang by opening a terminal window, playing an ogg file with ogg123, and trying to drag the terminal window. Usually the window would only move a little bit before the system locked up.

I tried turning off "Preempt the Big Kernel Lock" because that sounded particularly ominous, but the result was the same.

Tragically, I'm running a uniprocessor 2.6.16 now, wasting half my processing power.

When I tried upgrading to 2.6.18, I couldn't even run a uniprocessor kernel. The modes of failure were different this time ...

I guess you could say that the uniprocessor 2.6.18 kernel didn't crash: it boots all the way up, but when GDM is supposed to fire up, the screen goes blank. I logged in remotely from another Linux box I have and "top" told me that Xorg was taking over 90% of the CPU!
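
For the record, here is a minimal sketch of how that remote check could be scripted instead of eyeballing "top" over ssh. It is Python and assumes a procps-style `ps` that accepts `-eo pcpu,pid,comm`. The names `parse_ps` and `busy_processes` and the 50% threshold are made up for illustration. Note that `ps` reports CPU time averaged over the life of the process, not the instantaneous figure "top" shows, so treat the number as a rough indicator only.

```python
import subprocess


def parse_ps(output, threshold=50.0):
    """Return (cpu_percent, pid, command) tuples at or above threshold, busiest first.

    `output` is the text printed by `ps -eo pcpu,pid,comm` (header line included).
    """
    hogs = []
    for line in output.strip().splitlines()[1:]:  # skip the header line
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue  # malformed line, ignore it
        try:
            cpu = float(parts[0])
            pid = int(parts[1])
        except ValueError:
            continue  # not a data line
        if cpu >= threshold:
            hogs.append((cpu, pid, parts[2].strip()))
    return sorted(hogs, reverse=True)


def busy_processes(threshold=50.0):
    """Run ps on the local machine and report the CPU-hungry processes."""
    result = subprocess.run(
        ["ps", "-eo", "pcpu,pid,comm"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_ps(result.stdout, threshold)


if __name__ == "__main__":
    for cpu, pid, command in busy_processes():
        print(f"{command} (pid {pid}) is using {cpu:.1f}% CPU")
```

Run it on the misbehaving box from the ssh session. If Xorg shows up in the list while the screen is blank, that at least confirms the machine itself is alive and that X is the thing spinning.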

... more later ...

 

Comments on this Entry

Posted by Anonymous (59.178.xx.xx) on Fri 8 Dec 2006 at 06:58
Is it an X thing? Is it a GDM thing? Memtest86 check done? Heat? (do the fan checks).

Posted by ajt (204.193.xx.xx) on Fri 8 Dec 2006 at 12:09
Is this an X thing? If you run the machine without starting X, is it stable?

NVIDIA and ATI kernel drivers are buggy as hell and can make X very unstable. Try the stock X.org drivers.

I've heard a lot of people say that APM/ACPI can lead to a lot of instability. Is either turned on?

--
"It's Not Magic, It's Work"
Adam

Posted by Utumno (211.72.xx.xx) on Fri 8 Dec 2006 at 18:01
Like people have said already, I would try booting the thing without X and seeing if it is stable. You can then try your ogg123 test.

If it is stable then, try using stock X.org graphics drivers.

Posted by Anonymous (216.113.xx.xx) on Thu 21 Dec 2006 at 06:56
I'm trying to run Debian Sarge with a 2.6.8 kernel on an IBM xSeries 330 with dual P3 866MHz processors, and I find that the system crashes anywhere from every 2 hours to every 3 weeks. No idea what triggers the crash. No X running (text-only console). It's not hardware: I have three identical units and all show the problem. Ran like a champ with a uniprocessor kernel.

Still trying to get to the bottom of this one.
