[PlanetCCRMA] Shutdown Problems

Nigel Henry cave.dnb@tiscali.fr
Sat Jun 17 09:00:02 2006


On Saturday 17 June 2006 03:04, Fernando Lopez-Lezcano wrote:
> On Thu, 2006-06-15 at 14:01 +0200, nigel henry wrote:
> > On Thursday 15 June 2006 03:19, Fernando Lopez-Lezcano wrote:
> > > On Thu, 2006-06-15 at 01:09 +0200, nigel henry wrote:
> > > > Hi Dave. This may not have anything to do with your problem.
> > > >
> > > > Installed FC5, did the updates, which also installed a new
> > > > kernel-2.6.16-1.2122_fc5. Then installed the planetccrma repos to
> > > > yum, then installed planetccrma core/core-edge, can't remember which,
> > > > but ended up with kernel-2.6.16-1.2080.13.rrt.rhfc5.ccrma, and the
> > > > kernel-modules-alsa package that goes with it. No problems with that,
> > > > and the sounds work fine.
> > > >
> > > > Some days later, more updates brought kernel-2.6.16-1.2133_fc5. This
> > > > one seems to be creating some problems, and conflicts with the
> > > > kernel-modules-alsa package for the planetccrma kernel.
> > > >
> > > > If I bootup using the 2133 kernel, it boots up ok, but there are
> > > > problems shutting down, lots of errors then a trace runs, and ends up
> > > > with a segfault and the shutdown stalled.
> > > >
> > > > I removed the 2133 kernel completely, then reinstalled it, getting 3
> > > > warnings about it not being able to work with ccrma's
> > > > kernel-modules-alsa package.
> > >
> > > Could you please send the exact stuff you got? (if you still have it,
> > > of course). Perhaps yum is complaining about not finding the
> > > kernel-modules-alsa packages for kernel 2133, which is to be expected.
> >
> > Hi Fernando. Yes that is what the warnings were about. "unable to find
> > kernel-modules-alsa for kernel-2133".
>
> You can ignore those, it is the yum system trying to "complete" the
> installation of 2133 with any kernel modules that were installed on
> previous kernels.

Ok. Fair enough.
>
> > I've put the shutdown failure message at the bottom.
> >
> > > > Shut down the machine, booted up with the 2133 kernel, shut down and
> > > > same problems again with a segfault and the shutdown stalling.
> > > >
> > > > I have another install of FC5 on the same machine, which does not
> > > > have any planetccrma software on it. There are no problems with the
> > > > 2133 kernel on this, and shutdown proceeds as it should.
> > > >
> > > > It may be worth removing the 2133 kernel, reinstalling the
> > > > planetccrma stuff, including the kernel and kernel-modules-alsa
> > > > package, and seeing if the machine shuts down ok.
> > > >
> > > > I do have the messages from the halted shutdown if you want to see
> > > > them Fernando.
> > >
> > > Yes, please send any info you have on this.
> > > Sounds weird as both kernels should not interfere with each other.
> >
> > Jun 13 18:59:57 localhost gdm[2149]: Master halting...
> > Jun 13 18:59:58 localhost shutdown[2149]: shutting down for system halt
> > Jun 13 18:59:58 localhost init: Switching to runlevel: 0
> > Jun 13 18:59:59 localhost gconfd (djmons-2369): Received signal 15,
> > shutting down cleanly
> > Jun 13 18:59:59 localhost gconfd (djmons-2369): Exiting
> > Jun 13 18:59:59 localhost avahi-daemon[2003]: Got SIGTERM, quitting.
> > Jun 13 18:59:59 localhost avahi-daemon[2003]: Leaving mDNS multicast
> > group on interface eth0.IPv4 with address 192.168.0.234.
> > Jun 13 19:00:03 localhost kernel: List corruption. next->prev should be
> > cf40ae48, but was cf71ba48
> > Jun 13 19:00:03 localhost kernel: ------------[ cut here ]------------
> > Jun 13 19:00:03 localhost kernel: kernel BUG at include/linux/list.h:58!
> > Jun 13 19:00:03 localhost kernel: invalid opcode: 0000 [#1]
> > Jun 13 19:00:03 localhost kernel: last sysfs file: /block/hda/hda1/size
> > Jun 13 19:00:03 localhost kernel: Modules linked in: appletalk ipx p8023
> > ipv6 autofs4 ip_conntrack_ftp ip_conntrack_netbios_ns ipt_REJECT xt_state
> > ip_conntrack nfnetlink xt_tcpudp iptable_filter ip_tables x_tables vfat
> > fat dm_mirror dm_mod video button battery ac lp parport_pc parport floppy
> > nvram usblp uhci_hcd 3c59x mii gameport snd_seq_dummy snd_seq_oss
> > snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss
> > snd_pcm snd_timer snd soundcore snd_page_alloc i2c_piix4 i2c_core ext3
> > jbd
> > Jun 13 19:00:03 localhost kernel: CPU:    0
> > Jun 13 19:00:03 localhost kernel: EIP:    0060:[<d08cf1fa>]    Not
> > tainted VLI
> > Jun 13 19:00:03 localhost kernel: EFLAGS: 00010082
> > (2.6.16-1.2133_FC5 #1)
> > Jun 13 19:00:03 localhost kernel: EIP is at
> > snd_seq_delete_all_ports+0x74/0x17d [snd_seq]
>
> So, it looks like something is happening when the snd-seq module is
> being removed on shutdown.
>
> Hmmmm.... I think I know what the problem might be, or rather why this
> is happening only when Planet CCRMA is installed. Planet CCRMA's
> alsa-driver package includes an ALSA startup and shutdown script which
> is activated by default (/etc/rc.d/init.d/alsasound). When it is active
> (check with "/sbin/chkconfig --list alsasound") it will stop the alsa
> subsystem as part of the normal shutdown of the computer.
>
> I bet that is triggering a bug in the ALSA sequencer kernel module
> included in the 2133 kernel that only happens on module unload.
>
> Try disabling the alsasound script:
>   /sbin/chkconfig alsasound off
> most probably you will find the shutdown proceeds normally (but the bug
> is still there, it is just not being tickled :-)
>
> Or, while logged in to 2133 (and after saving work just in case), do a:
>   /etc/rc.d/init.d/alsasound stop

It didn't like it at all when I did /etc/rc.d/init.d/alsasound stop. 100% CPU 
with mouse locked up for a couple of minutes, then a load of output on the 
Konsole from syslog. Message below.

[root@localhost djmons]# /etc/rc.d/init.d/alsasound stop
Shutting down sound driver/etc/rc.d/init.d/alsasound: line 215:  3438 
Segmentation fault      /sbin/rmmod `echo $line | cut -d ' ' -f 1`

Message from syslogd@localhost at Sat Jun 17 16:09:30 2006 ...
localhost kernel: ------------[ cut here ]------------

Message from syslogd@localhost at Sat Jun 17 16:09:30 2006 ...
localhost kernel: kernel BUG at include/linux/list.h:58!

Message from syslogd@localhost at Sat Jun 17 16:09:30 2006 ...
localhost kernel: invalid opcode: 0000 [#1]

Message from syslogd@localhost at Sat Jun 17 16:09:31 2006 ...
localhost kernel: CPU:    0

Message from syslogd@localhost at Sat Jun 17 16:09:31 2006 ...
localhost kernel: EIP is at snd_seq_delete_all_ports+0x74/0x17d [snd_seq]

Message from syslogd@localhost at Sat Jun 17 16:09:31 2006 ...
localhost kernel: eax: 00000044   ebx: cf446448   ecx: c38c9f1c   edx: 
d08ef433

Message from syslogd@localhost at Sat Jun 17 16:09:31 2006 ...
localhost kernel: esi: cf446448   edi: c12e39c0   ebp: c12e3a48   esp: 
c38c9f18

Message from syslogd@localhost at Sat Jun 17 16:09:31 2006 ...
localhost kernel: ds: 007b   es: 007b   ss: 0068

Message from syslogd@localhost at Sat Jun 17 16:09:31 2006 ...
localhost kernel: Process rmmod (pid: 3438, threadinfo=c38c9000 task=c646faa0)

Message from syslogd@localhost at Sat Jun 17 16:09:31 2006 ...
localhost kernel: Stack: <0>d08ef433 cf446448 c12e3a48 c12e3a50 00000246 
c12e3a5c c2853b1c 005590d0

Message from syslogd@localhost at Sat Jun 17 16:09:31 2006 ...
localhost kernel:        c12e39c0 00000000 bfc58960 c38c9000 d08e91bc c12e39c0 
d08e927d c12e39c0

Message from syslogd@localhost at Sat Jun 17 16:09:31 2006 ...
localhost kernel:        d08eb573 d0858c80 c01323d6 5f646e73 5f716573 6d6d7564 
c55b0079 ca4a2e8c

Message from syslogd@localhost at Sat Jun 17 16:09:32 2006 ...
localhost kernel: Call Trace:

Message from syslogd@localhost at Sat Jun 17 16:09:32 2006 ...
localhost kernel:  [<d08e91bc>] seq_free_client1+0x8/0x7e [snd_seq]     
[<d08e927d>] seq_free_client+0x4b/0x80 [snd_seq]

Message from syslogd@localhost at Sat Jun 17 16:09:32 2006 ...
localhost kernel:  [<d08eb573>] snd_seq_delete_kernel_client+0x1a/0x2c 
[snd_seq]     [<c01323d6>] sys_delete_module+0x191/0x1ce

Message from syslogd@localhost at Sat Jun 17 16:09:32 2006 ...
localhost kernel:  [<c02e466a>] do_page_fault+0x189/0x51d     [<c0102be9>] 
syscall_call+0x7/0xb

Message from syslogd@localhost at Sat Jun 17 16:09:32 2006 ...
localhost kernel: Code: 88 00 00 00 39 af 88 00 00 00 74 63 8b 9f 88 00 00 00 
8b b7 8c 00 00 00 8b 43 04 39 f0 74 17 50 56 68 33 f4 8e d0 e8 43 dc 82 ef 
<0f> 0b 3a 00 1e f4 8e d0 83 c4 0c 8b 06 39 d8 74 17 50 53 68 e8
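
For anyone who wants to pull just the key facts out of a dump like the one
above, here is a rough Python sketch. It is only an illustration: the name
summarize_oops and the regular expressions are assumptions of mine, written
against the 2.6.16-style lines quoted in this thread and nothing else. It
strips mail quote markers, undoes the line wrapping, and picks out the BUG
location, the faulting function and the process that triggered it.

```python
import re

# Patterns for the oops lines quoted in this thread (2.6.16-era i386 format).
BUG_RE = re.compile(r"kernel BUG at (\S+):(\d+)!")
EIP_RE = re.compile(
    r"EIP is at\s+(\w+)\+0x([0-9a-f]+)/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)
PROC_RE = re.compile(r"Process (\S+) \(pid: (\d+)")


def flatten(text):
    """Drop leading '>' quote markers and undo mail line wrapping."""
    lines = (re.sub(r"^(?:>\s*)+", "", line) for line in text.splitlines())
    return " ".join(" ".join(lines).split())


def summarize_oops(text):
    """Return a dict with whatever BUG/EIP/process details were found."""
    flat = flatten(text)
    summary = {}
    m = BUG_RE.search(flat)
    if m:
        summary["bug_at"] = (m.group(1), int(m.group(2)))
    m = EIP_RE.search(flat)
    if m:
        summary["function"] = m.group(1)
        summary["offset"] = int(m.group(2), 16)
        summary["module"] = m.group(3)  # None for core kernel code
    m = PROC_RE.search(flat)
    if m:
        summary["process"] = m.group(1)
        summary["pid"] = int(m.group(2))
    return summary


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python oops_summary.py saved_log.txt
    with open(sys.argv[1]) as handle:
        print(summarize_oops(handle.read()))
```

Fed the output above, it should report include/linux/list.h line 58,
snd_seq_delete_all_ports in the snd_seq module, and rmmod as the process.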

Trying a shutdown at this stage, 2133 just hangs at "Stopping sound driver", 
and I let it hang for quite a few minutes, just in case.

Perhaps strangely, the 2122 kernel, which also came down with updates, and was 
installed after the planetccrma one, has no such problems on shutdown.

I'm not bothered about the 2133 kernel myself, as I use the planetccrma one. I 
don't know what changed between the 2122 and 2133 kernels. Both boot up and 
shut down ok on the other install of FC5 on the same machine, which does not 
have the planetccrma kernel or the kernel-modules-alsa package on it.

I'm not clued up enough to know whether this is a genuine bug in the 2133 
kernel. You've had many more years of experience with this sort of thing than 
me.

I'm using Apt again now on FC5 for the updates, and Yum only for the 
planetccrma stuff. Any ideas what I need to do to stop the Fedora kernel 
updates with Apt? That way I could remove the 2133 kernel, and neither it nor 
later versions would be reinstalled. It's a pity you've moved to Yum for the 
planetccrma packages. Yumex is so slow to load up compared to Synaptic when 
you're looking for individual packages.

Have a nice weekend. Nigel.
>
> You will probably see the bug being triggered in /var/log/messages.. or
> just a:
>   /sbin/rmmod snd-seq
> should be enough...
>
> -- Fernando
>
> > Jun 13 19:00:03 localhost kernel: eax: 00000044   ebx: cf40ae48   ecx:
> > cbbeef1c   edx: d08d0433
> > Jun 13 19:00:03 localhost kernel: esi: cf40ae48   edi: cf71b9c0   ebp:
> > cf71ba48   esp: cbbeef18
> > Jun 13 19:00:03 localhost kernel: ds: 007b   es: 007b   ss: 0068
> > Jun 13 19:00:03 localhost kernel: Process rmmod (pid: 3599,
> > threadinfo=cbbee000 task=c5974000)
> > Jun 13 19:00:03 localhost kernel: Stack: <0>d08d0433 cf40ae48 cf71ba48
> > cf71ba50 00000246 cf71ba5c c470f7ac 005590d0
> > Jun 13 19:00:03 localhost kernel:        cf71b9c0 00000000 bff2e2e0
> > cbbee000 d08ca1bc cf71b9c0 d08ca27d cf71b9c0
> > Jun 13 19:00:03 localhost kernel:        d08cc573 d0870c80 c01323d6
> > 5f646e73 5f716573 6d6d7564 c8300079 c5ca8334
> > Jun 13 19:00:03 localhost kernel: Call Trace:
> > Jun 13 19:00:03 localhost kernel:  [<d08ca1bc>] seq_free_client1+0x8/0x7e
> > [snd_seq]     [<d08ca27d>] seq_free_client+0x4b/0x80 [snd_seq]
> > Jun 13 19:00:03 localhost kernel:  [<d08cc573>]
> > snd_seq_delete_kernel_client+0x1a/0x2c [snd_seq]     [<c01323d6>]
> > sys_delete_module+0x191/0x1ce
> > Jun 13 19:00:03 localhost kernel:  [<c02e466a>] do_page_fault+0x189/0x51d
> > [<c0102be9>] syscall_call+0x7/0xb
> > Jun 13 19:00:03 localhost kernel: Code: 88 00 00 00 39 af 88 00 00 00 74
> > 63 8b 9f 88 00 00 00 8b b7 8c 00 00 00 8b 43 04 39 f0 74 17 50 56 68 33
> > 04 8d d0 e8 43 cc 84 ef <0f> 0b 3a 00 1e 04 8d d0 83 c4 0c 8b 06 39 d8 74
> > 17 50 53 68 e8
> > Jun 13 19:00:03 localhost kernel: Continuing in 120 seconds.
> > Continuing in 119 seconds.
> > Continuing in 118 seconds.
> > Continuing in 117 seconds.
> >
> > <snip>
> >
> > Continuing in 4 seconds.
> > Continuing in 3 seconds.
> > Continuing in 2 seconds.
> > Continuing in 1 seconds.
> > Jun 13 19:00:03 localhost kernel:  <3>Debug: sleeping function called
> > from invalid context at include/linux/rwsem.h:43
> > Jun 13 19:00:03 localhost kernel: in_atomic():0, irqs_disabled():1
> > Jun 13 19:00:03 localhost kernel:  [<c011c70b>]
> > profile_task_exit+0x13/0x3e [<c011dcbb>] do_exit+0x1c/0x717
> > Jun 13 19:00:03 localhost kernel:  [<c010405f>]
> > register_die_notifier+0x0/0x2f [<c01045ae>] do_invalid_op+0x0/0x9d
> > Jun 13 19:00:03 localhost kernel:  [<c010463f>] do_invalid_op+0x91/0x9d
> > [<d08cf1fa>] snd_seq_delete_all_ports+0x74/0x17d [snd_seq]
> > Jun 13 19:00:03 localhost kernel:  [<c011bcb0>] vprintk+0x16d/0x2fa
> > [<c013f3ed>] get_page_from_freelist+0x2f3/0x364
> > Jun 13 19:00:03 localhost kernel:  [<c01c9420>] vsnprintf+0x422/0x461
> > [<c01036a3>] error_code+0x4f/0x54
> > Jun 13 19:00:03 localhost kernel:  [<d08cf1fa>]
> > snd_seq_delete_all_ports+0x74/0x17d [snd_seq]     [<d08ca1bc>]
> > seq_free_client1+0x8/0x7e [snd_seq]
> > Jun 13 19:00:03 localhost kernel:  [<d08ca27d>] seq_free_client+0x4b/0x80
> > [snd_seq]     [<d08cc573>] snd_seq_delete_kernel_client+0x1a/0x2c
> > [snd_seq]
> > Jun 13 19:00:03 localhost kernel:  [<c01323d6>]
> > sys_delete_module+0x191/0x1ce [<c02e466a>] do_page_fault+0x189/0x51d
> >  (this is the line I got the segmentation fault on)
> >
> >
> > Jun 13 19:00:40 localhost shutdown[3635]: shutting down for system halt
> > (this is where I pressed and held the softstart button)
> >
> >
> > Jun 13 19:27:55 localhost syslogd 1.4.1: restart.
> > Jun 13 19:27:55 localhost kernel: klogd 1.4.1, log source = /proc/kmsg
> > started.
> > Jun 13 19:27:55 localhost kernel: Linux version
> > 2.6.16-1.2080.13.rrt.rhfc5.ccrma (machbuild@planetforge.stanford.edu)
> > (gcc version 4.1.0 20060304 (Red Hat 4.1.0-3)) #1 PREEMPT Sat Apr 22
> > 19:05:27 EDT 2006
> >
> > Nigel.
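
The Jun 13 shutdown trace and the Jun 17 trace from running the alsasound
script by hand list the same functions at the same offsets; only the module
load addresses differ. A rough Python sketch for checking that mechanically
follows. The names parse_frames and same_crash_path are hypothetical, and the
frame pattern is an assumption based only on the "[<address>] function+0xNN/0xNN
[module]" layout seen in this thread.

```python
import re

# One frame as printed in the traces above, for example:
#   [<d08ca1bc>] seq_free_client1+0x8/0x7e [snd_seq]
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\w+)\+0x([0-9a-f]+)/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)


def flatten(text):
    """Drop leading '>' quote markers and undo mail line wrapping."""
    lines = (re.sub(r"^(?:>\s*)+", "", line) for line in text.splitlines())
    return " ".join(" ".join(lines).split())


def parse_frames(text):
    """Return (function, offset, module) per frame, ignoring load addresses.

    module is None when the frame belongs to the core kernel.
    """
    return [
        (m.group(1), int(m.group(2), 16), m.group(3))
        for m in FRAME_RE.finditer(flatten(text))
    ]


def same_crash_path(trace_a, trace_b):
    """True when both traces contain the same non-empty frame sequence."""
    frames_a = parse_frames(trace_a)
    return bool(frames_a) and frames_a == parse_frames(trace_b)
```

Comparing only function, offset and module is a deliberate choice here,
since the addresses change each time the module is loaded.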