[PlanetCCRMA] Shutdown Problems

Fernando Lopez-Lezcano nando@ccrma.Stanford.EDU
Fri Jun 16 18:05:02 2006


On Thu, 2006-06-15 at 14:01 +0200, nigel henry wrote:
> On Thursday 15 June 2006 03:19, Fernando Lopez-Lezcano wrote:
> > On Thu, 2006-06-15 at 01:09 +0200, nigel henry wrote:
> > > Hi Dave. This may not have anything to do with your problem.
> > >
> > > Installed FC5, did the updates, which also installed a new
> > > kernel-2.6.16-1.2122_fc5. Then installed the planetccrma repos to yum,
> > > then installed planetccrma core/core-edge, can't remember which, but
> > > ended up with kernel-2.6.16-1.2080.13.rrt.rhfc5.ccrma, and the
> > > kernel-modules-alsa package that goes with it. No problems with that, and
> > > the sounds work fine.
> > >
> > > Some days later, more updates brought kernel-2.6.16-1.2133_fc5. This one
> > > seems to be creating some problems, and conflicts with the
> > > kernel-modules-alsa package for the planetccrma kernel.
> > >
> > > If I bootup using the 2133 kernel, it boots up ok, but there are problems
> > > shutting down, lots of errors then a trace runs, and ends up with a
> > > segfault and the shutdown stalled.
> > >
> > > I removed the 2133 kernel completely, then reinstalled it, getting 3
> > > warnings about it not being able to work with ccrma's kernel-modules-alsa
> > > package.
> >
> > Could you please send the exact stuff you got? (if you still have it, of
> > course). Perhaps yum is complaining about not finding the
> > kernel-modules-alsa packages for kernel 2133, which is to be expected.
> 
> Hi Fernando. Yes that is what the warnings were about. "unable to find 
> kernel-modules-alsa for kernel-2133".

You can ignore those; it is yum trying to "complete" the
installation of 2133 with any kernel modules that were installed for
previous kernels. 

> I've put the shutdown failure message at the bottom.
> >
> > > Shut down the machine, booted up with the 2133 kernel, shut down and same
> > > problems again with a segfault and the shutdown stalling.
> > >
> > > I have another install of FC5 on the same machine, which does not have
> > > any planetccrma software on it. There are no problems with the 2133
> > > kernel on this, and shutdown proceeds as it should.
> > >
> > > It may be worth removing the 2133 kernel, reinstalling the planetccrma
> > > stuff, including the kernel, and kernel-modules-alsa package, and see if
> > > the machine shuts down ok.
> > >
> > > I do have the messages from the halted shutdown if you want to see them
> > > Fernando.
> >
> > Yes, please send any info you have on this.
> > Sounds weird as both kernels should not interfere with each other.
> 
> Jun 13 18:59:57 localhost gdm[2149]: Master halting...
> Jun 13 18:59:58 localhost shutdown[2149]: shutting down for system halt
> Jun 13 18:59:58 localhost init: Switching to runlevel: 0
> Jun 13 18:59:59 localhost gconfd (djmons-2369): Received signal 15, shutting 
> down cleanly
> Jun 13 18:59:59 localhost gconfd (djmons-2369): Exiting
> Jun 13 18:59:59 localhost avahi-daemon[2003]: Got SIGTERM, quitting.
> Jun 13 18:59:59 localhost avahi-daemon[2003]: Leaving mDNS multicast group on 
> interface eth0.IPv4 with address 192.168.0.234.
> Jun 13 19:00:03 localhost kernel: List corruption. next->prev should be 
> cf40ae48, but was cf71ba48
> Jun 13 19:00:03 localhost kernel: ------------[ cut here ]------------
> Jun 13 19:00:03 localhost kernel: kernel BUG at include/linux/list.h:58!
> Jun 13 19:00:03 localhost kernel: invalid opcode: 0000 [#1]
> Jun 13 19:00:03 localhost kernel: last sysfs file: /block/hda/hda1/size
> Jun 13 19:00:03 localhost kernel: Modules linked in: appletalk ipx p8023 ipv6 
> autofs4 ip_conntrack_ftp ip_conntrack_netbios_ns ipt_REJECT xt_state 
> ip_conntrack nfnetlink xt_tcpudp iptable_filter ip_tables x_tables vfat fat 
> dm_mirror dm_mod video button battery ac lp parport_pc parport floppy nvram 
> usblp uhci_hcd 3c59x mii gameport snd_seq_dummy snd_seq_oss 
> snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm 
> snd_timer snd soundcore snd_page_alloc i2c_piix4 i2c_core ext3 jbd
> Jun 13 19:00:03 localhost kernel: CPU:    0
> Jun 13 19:00:03 localhost kernel: EIP:    0060:[<d08cf1fa>]    Not tainted VLI
> Jun 13 19:00:03 localhost kernel: EFLAGS: 00010082   (2.6.16-1.2133_FC5 #1) 
> Jun 13 19:00:03 localhost kernel: EIP is at 
> snd_seq_delete_all_ports+0x74/0x17d [snd_seq]

So, it looks like something is happening when the snd-seq module is
being removed on shutdown. 

Hmmmm.... I think I know what the problem might be, or rather why this
is happening only when Planet CCRMA is installed. Planet CCRMA's
alsa-driver package includes an ALSA startup and shutdown script which
is activated by default (/etc/rc.d/init.d/alsasound). When it is active
(check with "/sbin/chkconfig --list alsasound") it will stop the alsa
subsystem as part of the normal shutdown of the computer. 
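
A rough sketch of how the chkconfig check could be scripted. It assumes the
usual "name 0:off 1:off 2:on ..." layout of "chkconfig --list" output, and the
function name enabled_runlevels is made up for illustration:

```shell
# Print the runlevels in which a service is switched on, given one line of
# "chkconfig --list <service>" output on stdin, for example:
#   alsasound  0:off 1:off 2:on 3:on 4:on 5:on 6:off
# (enabled_runlevels is a hypothetical helper, not part of chkconfig.)
enabled_runlevels() {
    awk '{
        out = ""
        # Field 1 is the service name; the rest are "runlevel:state" pairs.
        for (i = 2; i <= NF; i++) {
            split($i, kv, ":")
            if (kv[2] == "on") out = (out == "" ? kv[1] : out " " kv[1])
        }
        print out
    }'
}

# Usage on the affected machine:
#   /sbin/chkconfig --list alsasound | enabled_runlevels
```

An empty result would mean the script is off in every runlevel.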

I bet that is triggering a bug in the ALSA sequencer kernel module
included in the 2133 kernel that only happens on module unload. 
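
To confirm that the sequencer module is loaded under 2133, and to see which
other modules currently use it, something like the following could help. It is
only a sketch: it assumes the standard /proc/modules layout (name, size, use
count, comma-separated users or "-"), and module_users is a made-up name:

```shell
# List the modules that currently use a given module, one space-separated line.
# Reads /proc/modules by default; a different file can be passed as $2.
# (module_users is a hypothetical helper written for this example.)
module_users() {
    awk -v m="$1" '
        $1 == m {
            users = $4
            sub(/,$/, "", users)      # drop the trailing comma
            gsub(/,/, " ", users)     # remaining commas become spaces
            if (users != "-") print users
        }
    ' "${2:-/proc/modules}"
}

# Usage on the affected machine:
#   module_users snd_seq
```

No output means either the module is not loaded or nothing else is using it.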

Try disabling the alsasound script:
  /sbin/chkconfig alsasound off
Most probably you will find that the shutdown proceeds normally (but the bug
is still there, it is just not being tickled :-)

Or, while logged in to 2133 (and after saving work just in case), do a:
  /etc/rc.d/init.d/alsasound stop

You will probably see the bug being triggered in /var/log/messages. Or
just a:
  /sbin/rmmod snd-seq
should be enough...
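
A small sketch for picking the relevant lines out of the log afterwards. The
patterns are taken from the trace quoted in this thread; find_seq_bug is a
made-up name, and /var/log/messages is normally readable by root only:

```shell
# Print the list-corruption, kernel BUG and faulting-function (EIP) lines
# from a syslog file. Defaults to /var/log/messages; pass another path as $1.
# (find_seq_bug is a hypothetical helper written for this example.)
find_seq_bug() {
    grep -E 'List corruption|kernel BUG at|EIP is at' "${1:-/var/log/messages}"
}

# Usage on the affected machine, after stopping alsasound or running rmmod:
#   find_seq_bug
```

If the bug was tickled, the EIP line should name a function in [snd_seq], as
in the shutdown trace below.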

-- Fernando



> Jun 13 19:00:03 localhost kernel: eax: 00000044   ebx: cf40ae48   ecx: 
> cbbeef1c   edx: d08d0433
> Jun 13 19:00:03 localhost kernel: esi: cf40ae48   edi: cf71b9c0   ebp: 
> cf71ba48   esp: cbbeef18
> Jun 13 19:00:03 localhost kernel: ds: 007b   es: 007b   ss: 0068
> Jun 13 19:00:03 localhost kernel: Process rmmod (pid: 3599, 
> threadinfo=cbbee000 task=c5974000)
> Jun 13 19:00:03 localhost kernel: Stack: <0>d08d0433 cf40ae48 cf71ba48 
> cf71ba50 00000246 cf71ba5c c470f7ac 005590d0 
> Jun 13 19:00:03 localhost kernel:        cf71b9c0 00000000 bff2e2e0 cbbee000 
> d08ca1bc cf71b9c0 d08ca27d cf71b9c0 
> Jun 13 19:00:03 localhost kernel:        d08cc573 d0870c80 c01323d6 5f646e73 
> 5f716573 6d6d7564 c8300079 c5ca8334 
> Jun 13 19:00:03 localhost kernel: Call Trace:
> Jun 13 19:00:03 localhost kernel:  [<d08ca1bc>] seq_free_client1+0x8/0x7e 
> [snd_seq]     [<d08ca27d>] seq_free_client+0x4b/0x80 [snd_seq]
> Jun 13 19:00:03 localhost kernel:  [<d08cc573>] 
> snd_seq_delete_kernel_client+0x1a/0x2c [snd_seq]     [<c01323d6>] 
> sys_delete_module+0x191/0x1ce
> Jun 13 19:00:03 localhost kernel:  [<c02e466a>] do_page_fault+0x189/0x51d     
> [<c0102be9>] syscall_call+0x7/0xb
> Jun 13 19:00:03 localhost kernel: Code: 88 00 00 00 39 af 88 00 00 00 74 63 8b 
> 9f 88 00 00 00 8b b7 8c 00 00 00 8b 43 04 39 f0 74 17 50 56 68 33 04 8d d0 e8 
> 43 cc 84 ef <0f> 0b 3a 00 1e 04 8d d0 83 c4 0c 8b 06 39 d8 74 17 50 53 68 e8 
> Jun 13 19:00:03 localhost kernel: Continuing in 120 seconds. 
> Continuing in 119 seconds. 
> Continuing in 118 seconds. 
> Continuing in 117 seconds. 
> 
> <snip>
> 
> Continuing in 4 seconds. 
> Continuing in 3 seconds. 
> Continuing in 2 seconds. 
> Continuing in 1 seconds. 
> Jun 13 19:00:03 localhost kernel:  <3>Debug: sleeping function called from 
> invalid context at include/linux/rwsem.h:43
> Jun 13 19:00:03 localhost kernel: in_atomic():0, irqs_disabled():1
> Jun 13 19:00:03 localhost kernel:  [<c011c70b>] profile_task_exit+0x13/0x3e     
> [<c011dcbb>] do_exit+0x1c/0x717
> Jun 13 19:00:03 localhost kernel:  [<c010405f>] register_die_notifier+0x0/0x2f     
> [<c01045ae>] do_invalid_op+0x0/0x9d
> Jun 13 19:00:03 localhost kernel:  [<c010463f>] do_invalid_op+0x91/0x9d     
> [<d08cf1fa>] snd_seq_delete_all_ports+0x74/0x17d [snd_seq]
> Jun 13 19:00:03 localhost kernel:  [<c011bcb0>] vprintk+0x16d/0x2fa     
> [<c013f3ed>] get_page_from_freelist+0x2f3/0x364
> Jun 13 19:00:03 localhost kernel:  [<c01c9420>] vsnprintf+0x422/0x461     
> [<c01036a3>] error_code+0x4f/0x54
> Jun 13 19:00:03 localhost kernel:  [<d08cf1fa>] 
> snd_seq_delete_all_ports+0x74/0x17d [snd_seq]     [<d08ca1bc>] 
> seq_free_client1+0x8/0x7e [snd_seq]
> Jun 13 19:00:03 localhost kernel:  [<d08ca27d>] seq_free_client+0x4b/0x80 
> [snd_seq]     [<d08cc573>] snd_seq_delete_kernel_client+0x1a/0x2c [snd_seq]
> Jun 13 19:00:03 localhost kernel:  [<c01323d6>] sys_delete_module+0x191/0x1ce     
> [<c02e466a>] do_page_fault+0x189/0x51d       (this is the line I got the 
> segmentation fault on)
> 
> 
> Jun 13 19:00:40 localhost shutdown[3635]: shutting down for system halt   
> (this is where I pressed and held the softstart button)
> 
> 
> Jun 13 19:27:55 localhost syslogd 1.4.1: restart.
> Jun 13 19:27:55 localhost kernel: klogd 1.4.1, log source = /proc/kmsg 
> started.
> Jun 13 19:27:55 localhost kernel: Linux version 
> 2.6.16-1.2080.13.rrt.rhfc5.ccrma (machbuild@planetforge.stanford.edu) (gcc 
> version 4.1.0 20060304 (Red Hat 4.1.0-3)) #1 PREEMPT Sat Apr 22 19:05:27 EDT 
> 2006
> 
> Nigel.