[PlanetCCRMA] SMP Boot Problem Report

Michael Gurevich gurevich@ccrma.Stanford.EDU
Tue Dec 20 19:44:01 2005


On Tue, 20 Dec 2005, Fernando Lopez-Lezcano wrote:

> On Tue, 2005-12-20 at 08:03 -0800, Timothy Polashek wrote:
> > Hello-
> > 
> > I just tried booting the latest edge FC4 kernels after installing them on a
> > fresh install of FC4.
> > 
> > Booting with 2.6.14-0.10.rrt.rhfc4.ccrma works just fine.
> > 
> > However, booting with 2.6.14-0.10.rrt.rhfc4.ccrmasmp always freezes the machine
> > at the "Starting networking" point of the boot process.
> > 
> > This problem persisted even after performing an "apt-get dist-upgrade" command.
> > 
> > I'd prefer to run the smp version to get the most out of both CPUs.  Has
> > anyone else experienced this problem with this kernel?  Or better yet, found a
> > way to fix it?
> 
> No, I have not seen this problem. 
> 
> It would seem that loading one of the drivers is hanging the machine. Did
> you try to boot without the graphical boot interface to see if any
> messages show up? (edit out the "rhgb quiet" part of the boot line
> in grub). Do you see any messages in /var/log/messages from the failed
> smp boot when you boot into the up kernel? What kind of hardware do
> you have?
> 
> Even if you manage to get to the actual error messages I would not be of
> much help - if it is indeed a problem with some particular driver Ingo
> Molnar and the linux kernel mailing list would be the ones to contact
> with error messages and such. 
> 

I reported this problem to the planetccrma list about a month ago, for the 
previous two smp kernels. I just tried the latest and get the same result 
unfortunately. I turned off the graphical and quiet boot, and when it 
locks up (after the ethernet modules appear to be loaded ok) I do get 
error messages like:
irq 209: nobody cared (try booting with the "irqpoll" option)

followed by a stack trace or something. I tried adding the irqpoll 
option in my grub.conf, but that makes things much worse. The messages 
look similar to ones I get in /var/log/messages when 
2.6.13-0.3.rdt.rhfc4.ccrma (not smp) crashes (also reported previously, 
errors copied below). I've booted into the 2.6.14 uni-processor kernel and 
will wait to see if I get the same crashes. 

Nov 28 15:18:23 localhost kernel: irq 21: nobody cared (try booting with the "irqpoll" option)
Nov 28 15:18:23 localhost kernel:  [<c01471b4>] __report_bad_irq+0x24/0x90 (8)
Nov 28 15:18:23 localhost kernel:  [<c01472c2>] note_interrupt+0x72/0xd0 (20)
Nov 28 15:18:23 localhost kernel:  [<c0146ec4>] do_hardirq+0xe4/0x110 (20)
Nov 28 15:18:23 localhost kernel:  [<c0146ef0>] do_irqd+0x0/0x90 (20)
Nov 28 15:18:23 localhost kernel:  [<c0146f52>] do_irqd+0x62/0x90 (8)
Nov 28 15:18:23 localhost kernel:  [<c0137078>] kthread+0x98/0xa0 (24)
Nov 28 15:18:23 localhost kernel:  [<c0136fe0>] kthread+0x0/0xa0 (12)
Nov 28 15:18:23 localhost kernel:  [<c01013b1>] kernel_thread_helper+0x5/0x14 (12)
Nov 28 15:18:23 localhost kernel: handlers:
Nov 28 15:18:23 localhost kernel: [<c02b08d0>] (usb_hcd_irq+0x0/0x80)
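
In case it helps anyone compare logs across kernels, here is a rough Python
sketch for pulling the "nobody cared" reports out of /var/log/messages. It only
assumes the line layout seen in the excerpt above; the function name
find_unhandled_irqs and the SAMPLE list are made up for illustration, and other
kernels may format these messages differently.

```python
import re

# Matches the report header, e.g. "kernel: irq 21: nobody cared (...)".
IRQ_RE = re.compile(r"kernel: irq (\d+): nobody cared")
# Matches a handler entry, e.g. "kernel: [<c02b08d0>] (usb_hcd_irq+0x0/0x80)".
HANDLER_RE = re.compile(
    r"kernel: \[<[0-9a-f]+>\] \((\w+)\+0x[0-9a-f]+/0x[0-9a-f]+\)"
)


def find_unhandled_irqs(lines):
    """Return one dict per "nobody cared" report: IRQ number plus handler names."""
    reports = []
    in_handlers = False
    for line in lines:
        match = IRQ_RE.search(line)
        if match:
            # A new report starts; handlers are listed later in the same report.
            reports.append({"irq": int(match.group(1)), "handlers": []})
            in_handlers = False
            continue
        if not reports:
            continue
        if line.rstrip().endswith("kernel: handlers:"):
            in_handlers = True
            continue
        if in_handlers:
            handler = HANDLER_RE.search(line)
            if handler:
                reports[-1]["handlers"].append(handler.group(1))
            else:
                # Anything else ends the handler list for this report.
                in_handlers = False
    return reports


# Hypothetical sample, abridged from the log excerpt in this message.
SAMPLE = [
    'Nov 28 15:18:23 localhost kernel: irq 21: nobody cared '
    '(try booting with the "irqpoll" option)',
    "Nov 28 15:18:23 localhost kernel:  [<c01471b4>] __report_bad_irq+0x24/0x90 (8)",
    "Nov 28 15:18:23 localhost kernel: handlers:",
    "Nov 28 15:18:23 localhost kernel: [<c02b08d0>] (usb_hcd_irq+0x0/0x80)",
]

if __name__ == "__main__":
    for report in find_unhandled_irqs(SAMPLE):
        print(report["irq"], report["handlers"])
```

To run it against a real log, pass an open file instead of SAMPLE, for example
find_unhandled_irqs(open("/var/log/messages")). Knowing which handler is
registered on the unclaimed IRQ (usb_hcd_irq in my case) may help narrow down
which driver to mention when reporting upstream.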