Weblog entry #1 for rsteuer
Being the newb that I am, I have no idea how to get past this. The machine ran 3.1 without issue, so I'm guessing something in the kernel has changed that causes the hang. Any clues?
Comments on this Entry
I found the same condition occurs if I try installing any kernel above 2.4.27-2-386.
[ Parent | Reply to this comment ]
[ Send Message | View dkg's Scratchpad | View Weblogs ]
If you post a few more lines of the pre-hang output (instead of just the final line), it might give a better sense of the context where the hang is happening. Using a serial console makes it much easier to cut and paste this output accurately.
Also, if it's a stock debian install with a modern initramfs, you might try adding a break=XXX parameter to the kernel command line to get a (very limited) shell during the boot process. XXX should be one of top, modules, premount, mount, bottom, or init, depending on the point in the process where you want the shell so you can poke around.
[ Parent | Reply to this comment ]
I don't have a clue how to capture the info with a serial console during the boot process. If you can point me in the right direction for instructions on running a serial console, I'll get the info requested and post.
[ Parent | Reply to this comment ]
[ Send Message | View dkg's Scratchpad | View Weblogs ]
The basic idea is that the machine you're booting (the Host) uses a serial line instead of a video terminal/keyboard for all its console interactions. You hook another machine (the "Monitoring Machine", or MM) up to that serial port and can log the entire transaction.
The key steps are:
- connect the MM to the Host via a null-modem serial cable (no affiliation with the vendor, just pulled from a google search). This works best if you've got two machines each with a built-in RS232 (DB9) serial port, but you only really need one built-in on the Host. If you've got a laptop or a "legacy-free" machine without a serial port as the MM, you might want a USB to serial adapter (not affiliated with this vendor either). If you go this route, make sure you get one with a chipset that's supported by the kernel running on the MM. The Prolific chipset (handled by the pl2303 kernel module) is a good bet.
- On the MM, start up a logged session connected to the serial port. /usr/bin/screen is good for this. If the MM's serial port is /dev/ttyS0, you would run:
screen -L /dev/ttyS0 115200
This assumes your user has read/write access to /dev/ttyS0, and write access to the current directory, where the log (screenlog.0) will be written. You should see just a blank screen in that window on the MM.
- Boot the Host. In the bootloader, pass the Host's kernel an additional parameter of console=ttyS0,115200n8 (this assumes you've connected the serial cable to the first serial port on the Host).
- You should see the familiar boot messages scroll past in the screen session on the MM now, and you should see screenlog.0 starting to fill up.
The Remote Serial Console HOWTO has a lot of other good information (like how to get your bootloader running over the serial console as well), but for this particular project, the above steps should be what you need to get going. Once you have this, though, you might not want to go back to using a monitor/keyboard on your servers. Having remotely-stored clean, logged boot messages is too nice to pass up! And once you get the bootloader running over the serial console, you can do remote reboots without touching the box with much greater confidence.
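The steps above can be sketched as a couple of small shell helpers. This is only a sketch: the function names console_param and check_serial are made up for illustration, and the defaults (ttyS0, 115200) simply mirror the values used in the steps.

```shell
# Build the console= kernel parameter for the Host.
# Defaults match the steps above; pass other values if your port differs.
console_param() {
    port="${1:-ttyS0}"
    speed="${2:-115200}"
    printf 'console=%s,%sn8\n' "$port" "$speed"
}

# Check the MM before starting screen: the device must exist as a
# character device, be readable and writable by the current user, and
# the current directory must be writable so screenlog.0 can be created.
check_serial() {
    dev="${1:-/dev/ttyS0}"
    if [ ! -c "$dev" ]; then
        echo "missing: $dev"
        return 1
    fi
    if [ ! -r "$dev" ] || [ ! -w "$dev" ]; then
        echo "no read/write access: $dev"
        return 1
    fi
    if [ ! -w . ]; then
        echo "cannot write screenlog.0 in $(pwd)"
        return 1
    fi
    echo "ok: $dev"
}
```

On the MM you could then run check_serial /dev/ttyS0 && screen -L /dev/ttyS0 115200, and on the Host append the output of console_param to the kernel line in the bootloader.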
[ Parent | Reply to this comment ]
Detecting hardware: agpgart parport_pc e100 via82cxxx aic7xxx usb_uhci
Loading agpgart module
Linux agpgart interface v0.100 (c) Dave Jones
Skipping already loaded module parport_pc.
Skipping already loaded module e100.
Skipping already loaded module via82cxxx.
Skipping already loaded module aic7xxx.
Skipping already loaded module uhci_hcd
Running 0dns_down to make sure resolv.conf is ok...done
Setting up networking...done
Starting hotplug subsystem: pci
agpgart: Detected VIA Apollo Pro 133 chipset
agpgart: Maximum main memory to use for agp memory: 203M
The last line is actually further along than I've seen in the past. Usually, it hangs at the next to last line.
As mentioned previously, if I let 3.1 install with the default, or if I select the 2.4 kernel in expert mode, the system will boot properly and all is well.
Any comments would be greatly appreciated.
[ Parent | Reply to this comment ]
[ Send Message | View dkg's Scratchpad | View Weblogs ]
How long have you let it wait at this hang point? Is it possible that the machine is just really, really slow as it's probing some unusual piece of hardware?
What alternate kernel parameters have you tried during boot time?
Another tack: stock etch kernels rely on udev, which really prefers that the hotplug package be purged. Have you tried purging hotplug, discover, and any other hardware-scanning packages? They're handy, but if they're getting in the way of a boot, it'd be better to get the machine booting first. Then you can load modules by hand, since you know which modules you need, and maybe try adding back in some of the packages if you feel you need them.
[ Parent | Reply to this comment ]
The machine has been left overnight and also been rebooted and left for hours - all to no avail.
I haven't specified any other parameters for the kernel, only selected 2.4 or 2.6 when prompted.
I wouldn't know where to begin to purge hotplug packages. This is getting to be more complicated than it should be just to get an install completed.
[ Parent | Reply to this comment ]
[ Send Message | View dkg's Scratchpad | View Weblogs ]
apt-get remove --purge hotplug discover discover1
dpkg --purge hotplug
dpkg --purge discover
dpkg --purge discover1
apt-get -f install
The three dpkg lines are to ensure that packages that have already been removed (but not purged) actually get purged.
You could also try booting the new kernel with an init=/bin/sh parameter, just to check that the system does in fact work, and that the initscripts are what is causing the failure. You could then invoke the scripts in /etc/rcS.d sequentially by hand (e.g. /etc/rcS.d/S01glibc.sh start), note which one of them immediately precedes the hang, and remove its symlink on a second boot with init=/bin/sh.
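Stepping through the scripts by hand can be sketched roughly like this. The function names are made up for illustration, and the sketch assumes each start script can be run directly with a start argument, as in the example above.

```shell
# Print the start scripts (S*) in a runlevel directory, in the sorted
# order the shell glob produces. Kill scripts (K*) are ignored.
list_start_scripts() {
    dir="${1:-/etc/rcS.d}"
    for script in "$dir"/S*; do
        [ -e "$script" ] || continue
        printf '%s\n' "$script"
    done
}

# Run each start script in turn, announcing it first. If the machine
# hangs, the last name printed is the script to look at.
# Assumes the script paths contain no whitespace.
run_one_by_one() {
    for script in $(list_start_scripts "$1"); do
        echo "=== running: $script"
        "$script" start
    done
}
```

From the init=/bin/sh shell, list_start_scripts shows what would run, and run_one_by_one /etc/rcS.d walks through them. With a serial console attached, the "=== running" lines end up in the log as well.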
Please report back here what you find, so that other folks who hit this same bump can learn from you!
[ Parent | Reply to this comment ]
I will try your suggestion on the init parameter and let you know what I find.
[ Parent | Reply to this comment ]
[ Send Message | View Steve's Scratchpad | View Weblogs ]
As a quick test I'd suggest adding "noapic acpi=off irqpoll" to your kernel command line, just in case this is an ACPI/APIC-related problem. I've certainly seen enough of those in my time.
You can edit the grub command line as the system is booting to append them to your "kernel ...." line.
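Once the system is up, you can confirm that the parameters actually reached the kernel by looking at /proc/cmdline. A minimal sketch (has_boot_param is a hypothetical helper name, not an existing tool):

```shell
# Succeed if the given word appears on a kernel command line.
# With no second argument, the running kernel's /proc/cmdline is used.
has_boot_param() {
    param="$1"
    cmdline="${2:-$(cat /proc/cmdline)}"
    for word in $cmdline; do
        if [ "$word" = "$param" ]; then
            return 0
        fi
    done
    return 1
}
```

For example, has_boot_param noapic && echo "noapic is active" checks the running system, while passing a string as the second argument lets you check a kernel line before you commit it to the grub menu.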
[ Parent | Reply to this comment ]
[ Send Message | View dkg's Scratchpad | View Weblogs ]
Do you have a good link for documentation of these kernel options? I'd love to read more to understand what they really do. /usr/share/doc/linux-doc-2.6.18/Documentation/kernel-parameters.txt.gz only gives a very limited description of each option, and doesn't talk about the tradeoffs associated with their use.
[ Parent | Reply to this comment ]
My name is David, I'm a Windows admin making the move to Debian to try out some OSS software for our startup company. I know computers, but I barely know linux. I got Ubuntu and Debian running on some new machines, but I'm stopped by this same issue. First boot after install hangs on this very same line:
agpgart: via apollo 133 chipset detected.
I have the same problem on an ibm eSeries x300, with debian etch 4r0. Were you ever able to solve it?
I have tried everything I could find in this thread, including:
adding the noacpi, acpi=off, and several other boot params
I tried to blacklist agpgart, lm-sensors, via686a, and a couple other things by adding ...
blacklist agpgart
blacklist via686a
blacklist xyzpdq
..to the blacklist file in the modprobe.d directory, but everything I reference still loads. How do I determine what is launching agpgart, or anything else?
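One way to see what is pulling a module in is to look at the "used by" column of /proc/modules on a system that does boot (for example under the 2.4 kernel or an init=/bin/sh shell after loading the module by hand). As far as I know, a blacklist entry only stops alias-based autoloading; it does not stop a module that is requested by name or needed as a dependency of another module. A small sketch, where module_users is a made-up helper name:

```shell
# Print the modules that depend on the named module, as recorded in the
# fourth column of /proc/modules ("-" means nothing depends on it).
# A different file can be passed as the second argument.
module_users() {
    name="$1"
    table="${2:-/proc/modules}"
    awk -v m="$name" '$1 == m { print $4 }' "$table" | sed 's/,$//'
}
```

Running module_users agpgart would list any chipset-specific AGP driver that depends on agpgart; that driver would be a candidate to blacklist as well.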
I looked at the .sh scripts in the rcS.d folder and determined the S90 script is what hangs my boot, but I cannot decipher what it's actually loading. Even if I knew, I couldn't stop it.
I found the conf file for x, and replaced the 'savage?' driver with 'vesa'. Same result.
I've tried other things that I cannot remember due to my brain's natural response of blocking out unpleasant experiences...
I've installed Ubuntu 7.04 server, desktop, alternate, and finally debian etch before writing this post. All hanging on the same line....
All these things, and still my boot hangs with the last line of
agpgart: via apollo 133 chipset detected.
I've invested 3 days in this, learning way more about the linux boot process than I ever wanted, but I find that the more I learn, the more I realize is not documented anyplace I can find. I'd be willing to pay $50 an hour to any Debian guru who can convince me they know more than what they can find in a google search to solve this undoubtedly simple problem and boot linux on my old, but certainly not useless, server.
Thanks for reading, gurus unite and rescue this poor Windows admin who is running at top speed to catch up to the FOSS bandwagon...
-David
[ Parent | Reply to this comment ]
I got a solution here for the next guy, thanks to the folks in #debian
If you install debian on an IBM eSeries 330 or similar, you might need to do this...
In a nutshell, you need to disable agpgart.ko and then rebuild the initrd.
[stuff in single quotes was typed at the command line, without the quotes]
I booted to a shell by adding 'init=/bin/sh' to the boot params in grub. [look up how to add boot params to grub if you don't know how]
found agpgart.ko under /lib/modules/.... and renamed it to agpgart.ko.disabled
had to mount the boot partition with 'mount /boot'
ran dpkg-reconfigure linux-image-$whateverversion, my version is 2.6.18-4-686
it does a lot of stuff, rebuilding initrd
I did 3 finger salute and it booted all the way to the desktop.
SWEET!
I learned a lot, run man initrd to see the way debian boots, pretty cool.
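For anyone following along, here is a rough sketch of those steps as shell helpers. It assumes the module file is named agpgart.ko (2.6 kernel modules use the .ko suffix), the helper names are made up, and the remount line is my own assumption: with init=/bin/sh the root filesystem is normally mounted read-only unless you boot with rw. The second helper only prints the commands so you can review them before typing them in.

```shell
# Locate a module file by name under a modules tree
# (default: the tree of the running kernel).
find_module() {
    name="$1"
    root="${2:-/lib/modules/$(uname -r)}"
    find "$root" -name "$name.ko" -print
}

# Print (do not run) the steps described above for a given module path
# and kernel version.
print_disable_steps() {
    path="$1"
    version="${2:-$(uname -r)}"
    # Assumption: root is read-only under init=/bin/sh, so remount it first.
    echo "mount -o remount,rw /"
    echo "mv $path $path.disabled"
    echo "mount /boot"
    echo "dpkg-reconfigure linux-image-$version"
}
```

Usage would be something like print_disable_steps "$(find_module agpgart)" from the init=/bin/sh shell, then running the printed lines one at a time.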
email me at debian AT atsfl.com if you have questions.
-david
[ Parent | Reply to this comment ]
Debian IBM eSeries x300 problem with Apollo 133 agpgart
I had to edit the kernel line in grub before booting: highlight the kernel entry you want and press e, then move to the line starting with "kernel" (mine was: kernel /boot/vmlinuz-2.6.18-5-686 root=/dev/sda1 ro) and press e again to edit it.
After editing, mine looked like this:
kernel /boot/vmlinuz-2.6.18-5-686 root=/dev/sda1 rw init=/bin/sh
Press Enter and then b to boot the kernel with these settings.
Because dpkg-reconfigure didn't work for me, I used the following commands instead.
First, set up the PATH in the environment:
export PATH=/usr/local/bin:/usr/local/sbin:/sbin:/bin:/usr/sbin:/usr/bin
renamed the agpgart.ko module:
cd /lib/modules/`uname -r`/kernel/drivers/char/agp/
mv agpgart.ko agpgart.ko.disabled
then updating initrd with:
update-initramfs -k `uname -r` -u
After that I rebooted and everything was ok.
Thank you for your help, hope that this alternate way will help someone.
cheers
Dan
mail me: danba AT suppcom DOT cz
[ Parent | Reply to this comment ]