[PlanetCCRMA] CCRMA SMP FC4 crashes

Fernando Lopez-Lezcano nando@ccrma.Stanford.EDU
Mon Dec 5 10:06:01 2005


On Mon, 2005-12-05 at 12:11 -0500, Roman Katzer wrote:
> Hi Fernando, hi all,
> 
> I built a FC4 CCRMA box on a P4/Intel chipset machine.
> I left it running over the weekend and came back to the office to find
> it crashed. I rebooted and took note of these lines in
> /var/log/messages [1]
> 
> It crashed again right before I wrote this email. Again, the log has
> these lines [2]
> 
> As you can see, the errors don't seem to be immediately related to the
> crashes (entries after the error lines).
> When the box crashed, I had X (and kate, konsole and ksysguard) running.
> 
> Any ideas as to what could be causing this?

A bug in the realtime preemption patch that manifests on your hardware. 
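
To see how often these traces recur, a rough sketch like the one below
could pull out only the "BUG:" header lines with their timestamps. It
assumes the standard syslog layout shown in your excerpts (month, day,
time, host, then "kernel: BUG: ..."); list_kernel_bugs is a hypothetical
helper name, not an existing tool.

```shell
# Sketch: list when each kernel "BUG:" message appears in a syslog file.
# list_kernel_bugs is a hypothetical helper name used for illustration.
list_kernel_bugs() {
    # Default to /var/log/messages when no file is given.
    grep 'kernel: BUG:' "${1:-/var/log/messages}" |
        awk '{
            # Fields 1-3 are the syslog timestamp: month, day, time.
            ts = $1 " " $2 " " $3
            # Drop everything up to and including the "kernel: BUG: " prefix.
            sub(/^.*kernel: BUG: /, "")
            print ts "  " $0
        }'
}
```

Run as root (the log is usually not world readable):
  list_kernel_bugs /var/log/messages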

> I was not actively doing anything on the system at the time of either crash.
> 
> uname -a gives
> Linux lab-fc4v2 2.6.12-0.21.rdt.rhfc4.ccrmasmp #1 SMP Mon Jul 11
> 16:37:45 EDT 2005 i686 i686 i386 GNU/Linux

You could try a more recent kernel that I never officially released but
many people are successfully using. You need to activate a special
"planetedge" repository to install it. Edit:
  /etc/apt/sources.list.d/planetccrma.list
duplicate the line that ends in "planetcore" and replace "planetcore"
with "planetedge" in the additional line, then
  apt-get update
  apt-get install planetccrma-core-edge-smp
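
If you would rather not edit the file by hand, the duplicate-and-replace
step could be scripted roughly as follows. This is only a sketch: it
assumes the repository line really ends in the word "planetcore", and
add_planetedge is a hypothetical helper name. It leaves the file alone
if a "planetedge" line is already there.

```shell
# Sketch: after every line ending in "planetcore", add a copy of that
# line ending in "planetedge" instead. add_planetedge is a hypothetical
# helper name used for illustration.
add_planetedge() {
    list="$1"
    # Nothing to do if a planetedge line already exists (safe to re-run).
    if grep -q 'planetedge[[:space:]]*$' "$list"; then
        return 0
    fi
    tmp=$(mktemp) || return 1
    # Print every line unchanged; for lines ending in "planetcore",
    # also print an edited copy right below the original.
    awk '{ print }
         /planetcore[ \t]*$/ { sub(/planetcore[ \t]*$/, "planetedge"); print }' \
        "$list" > "$tmp" &&
        cat "$tmp" > "$list"    # rewrite in place, keeping ownership and mode
    status=$?
    rm -f "$tmp"
    return $status
}
```

As root, the call would be:
  add_planetedge /etc/apt/sources.list.d/planetccrma.list
followed by the apt-get update and apt-get install commands above.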

I hope to be able to release a new kernel for fc4 in the next few days.
I thought I had a good one last week, but then started to have recurring
crashes on evolution startup (the email client). A tentative patch was
posted on lkml and I'm trying it right now; it seems to have solved _that_
problem... we'll see...

-- Fernando

> [1]:
> Dec  3 01:32:24 lab-fc4v2 kernel: BUG: scheduling with irqs disabled: X/0x00000001/2594
> Dec  3 01:32:24 lab-fc4v2 kernel: caller is del_timer_sync+0x85/0xc0
> Dec  3 01:32:24 lab-fc4v2 kernel:  [<c0355c94>] schedule+0x64/0x100 (8)
> Dec  3 01:32:24 lab-fc4v2 kernel:  [<c01306a5>] del_timer_sync+0x85/0xc0 (28)
> Dec  3 01:32:24 lab-fc4v2 kernel:  [<c013d2e0>] autoremove_wake_function+0x0/0x50 (12)
> Dec  3 01:32:24 lab-fc4v2 kernel:  [<c012b33e>] do_setitimer+0x9e/0x630 (28)
> Dec  3 01:32:24 lab-fc4v2 kernel:  [<c014ba44>] audit_syscall_entry+0x1a4/0x200 (60)
> Dec  3 01:32:24 lab-fc4v2 kernel:  [<c014bc70>] audit_syscall_exit+0x1d0/0x2f0 (8)
> Dec  3 01:32:24 lab-fc4v2 kernel:  [<c01f0eae>] copy_from_user+0x4e/0xc0 (16)
> Dec  3 01:32:24 lab-fc4v2 kernel:  [<c012b90e>] sys_setitimer+0x3e/0xa0 (28)
> Dec  3 01:32:24 lab-fc4v2 kernel:  [<c0108fe9>] do_syscall_trace+0x109/0x13d (12)
> Dec  3 01:32:24 lab-fc4v2 kernel:  [<c01043a1>] syscall_call+0x7/0xb (40)
> Dec  3 01:32:24 lab-fc4v2 kernel: BUG: scheduling while atomic: X/0x00000001/2594
> Dec  3 01:32:24 lab-fc4v2 kernel: caller is schedule+0x85/0x100
> Dec  3 01:32:24 lab-fc4v2 kernel:  [<c035595c>] __schedule+0x63c/0x910 (8)
> Dec  3 01:32:24 lab-fc4v2 kernel:  [<c01043a1>] syscall_call+0x7/0xb (20)
> Dec  3 01:32:24 lab-fc4v2 kernel:  [<c013a0bc>] __kernel_text_address+0x1c/0x30 (16)
> Dec  3 01:32:24 lab-fc4v2 kernel:  [<c01051f4>] show_trace+0x34/0x90 (8)
> Dec  3 01:32:24 lab-fc4v2 kernel:  [<c0355cb5>] schedule+0x85/0x100 (36)
> Dec  3 01:32:24 lab-fc4v2 kernel:  [<c01306a5>] del_timer_sync+0x85/0xc0 (28)
> Dec  3 01:32:24 lab-fc4v2 kernel:  [<c013d2e0>] autoremove_wake_function+0x0/0x50 (12)
> Dec  3 01:32:24 lab-fc4v2 kernel:  [<c012b33e>] do_setitimer+0x9e/0x630 (28)
> Dec  3 01:32:24 lab-fc4v2 kernel:  [<c014ba44>] audit_syscall_entry+0x1a4/0x200 (60)
> Dec  3 01:32:24 lab-fc4v2 kernel:  [<c014bc70>] audit_syscall_exit+0x1d0/0x2f0 (8)
> Dec  3 01:32:24 lab-fc4v2 kernel:  [<c01f0eae>] copy_from_user+0x4e/0xc0 (16)
> Dec  3 01:32:24 lab-fc4v2 kernel:  [<c012b90e>] sys_setitimer+0x3e/0xa0 (28)
> Dec  3 01:32:24 lab-fc4v2 kernel:  [<c0108fe9>] do_syscall_trace+0x109/0x13d (12)
> Dec  3 01:32:24 lab-fc4v2 kernel:  [<c01043a1>] syscall_call+0x7/0xb (40)
> 
> 
> 
> [2]:
> Dec  5 10:13:34 lab-fc4v2 ntpd[2302]: kernel time sync enabled 0001
> Dec  5 10:18:05 lab-fc4v2 kernel: BUG: scheduling with irqs disabled: X/0x00000001/2652
> Dec  5 10:18:05 lab-fc4v2 kernel: caller is del_timer_sync+0x85/0xc0
> Dec  5 10:18:05 lab-fc4v2 kernel:  [<c0355c94>] schedule+0x64/0x100 (8)
> Dec  5 10:18:05 lab-fc4v2 kernel:  [<c01306a5>] del_timer_sync+0x85/0xc0 (28)
> Dec  5 10:18:05 lab-fc4v2 kernel:  [<c013d2e0>] autoremove_wake_function+0x0/0x50 (12)
> Dec  5 10:18:05 lab-fc4v2 kernel:  [<c012b33e>] do_setitimer+0x9e/0x630 (28)
> Dec  5 10:18:05 lab-fc4v2 kernel:  [<c014ba44>] audit_syscall_entry+0x1a4/0x200 (60)
> Dec  5 10:18:05 lab-fc4v2 kernel:  [<c014bc70>] audit_syscall_exit+0x1d0/0x2f0 (8)
> Dec  5 10:18:05 lab-fc4v2 kernel:  [<c01f0eae>] copy_from_user+0x4e/0xc0 (16)
> Dec  5 10:18:05 lab-fc4v2 kernel:  [<c012b90e>] sys_setitimer+0x3e/0xa0 (28)
> Dec  5 10:18:05 lab-fc4v2 kernel:  [<c0108fe9>] do_syscall_trace+0x109/0x13d (12)
> Dec  5 10:18:05 lab-fc4v2 kernel:  [<c01043a1>] syscall_call+0x7/0xb (40)
> Dec  5 10:18:05 lab-fc4v2 kernel: BUG: scheduling while atomic: X/0x00000001/2652
> Dec  5 10:18:05 lab-fc4v2 kernel: caller is schedule+0x85/0x100
> Dec  5 10:18:05 lab-fc4v2 kernel:  [<c035595c>] __schedule+0x63c/0x910 (8)
> Dec  5 10:18:05 lab-fc4v2 kernel:  [<c01043a1>] syscall_call+0x7/0xb (20)
> Dec  5 10:18:05 lab-fc4v2 kernel:  [<c013a0bc>] __kernel_text_address+0x1c/0x30 (16)
> Dec  5 10:18:05 lab-fc4v2 kernel:  [<c01051f4>] show_trace+0x34/0x90 (8)
> Dec  5 10:18:05 lab-fc4v2 kernel:  [<c0355cb5>] schedule+0x85/0x100 (36)
> Dec  5 10:18:05 lab-fc4v2 kernel:  [<c01306a5>] del_timer_sync+0x85/0xc0 (28)
> Dec  5 10:18:05 lab-fc4v2 kernel:  [<c013d2e0>] autoremove_wake_function+0x0/0x50 (12)
> Dec  5 10:18:05 lab-fc4v2 kernel:  [<c012b33e>] do_setitimer+0x9e/0x630 (28)
> Dec  5 10:18:05 lab-fc4v2 kernel:  [<c014ba44>] audit_syscall_entry+0x1a4/0x200 (60)
> Dec  5 10:18:05 lab-fc4v2 kernel:  [<c014bc70>] audit_syscall_exit+0x1d0/0x2f0 (8)
> Dec  5 10:18:05 lab-fc4v2 kernel:  [<c01f0eae>] copy_from_user+0x4e/0xc0 (16)
> Dec  5 10:18:05 lab-fc4v2 kernel:  [<c012b90e>] sys_setitimer+0x3e/0xa0 (28)
> Dec  5 10:18:05 lab-fc4v2 kernel:  [<c0108fe9>] do_syscall_trace+0x109/0x13d (12)
> Dec  5 10:18:05 lab-fc4v2 kernel:  [<c01043a1>] syscall_call+0x7/0xb (40)
> Dec  5 10:20:16 lab-fc4v2 kernel: BUG: scheduling with irqs disabled: X/0x00000001/2652
> Dec  5 10:20:16 lab-fc4v2 kernel: caller is del_timer_sync+0x85/0xc0
> Dec  5 10:20:16 lab-fc4v2 kernel:  [<c0355c94>] schedule+0x64/0x100 (8)
> Dec  5 10:20:16 lab-fc4v2 kernel:  [<c01306a5>] del_timer_sync+0x85/0xc0 (28)
> Dec  5 10:20:16 lab-fc4v2 kernel:  [<c013d2e0>] autoremove_wake_function+0x0/0x50 (12)
> Dec  5 10:20:16 lab-fc4v2 kernel:  [<c012b33e>] do_setitimer+0x9e/0x630 (28)
> Dec  5 10:20:16 lab-fc4v2 kernel:  [<c014ba44>] audit_syscall_entry+0x1a4/0x200 (60)
> Dec  5 10:20:16 lab-fc4v2 kernel:  [<c014bc70>] audit_syscall_exit+0x1d0/0x2f0 (8)
> Dec  5 10:20:16 lab-fc4v2 kernel:  [<c01f0eae>] copy_from_user+0x4e/0xc0 (16)
> Dec  5 10:20:16 lab-fc4v2 kernel:  [<c012b90e>] sys_setitimer+0x3e/0xa0 (28)
> Dec  5 10:20:16 lab-fc4v2 kernel:  [<c0108fe9>] do_syscall_trace+0x109/0x13d (12)
> Dec  5 10:20:16 lab-fc4v2 kernel:  [<c01043a1>] syscall_call+0x7/0xb (40)
> Dec  5 10:20:16 lab-fc4v2 kernel: BUG: scheduling while atomic: X/0x00000001/2652
> Dec  5 10:20:16 lab-fc4v2 kernel: caller is schedule+0x85/0x100
> Dec  5 10:20:16 lab-fc4v2 kernel:  [<c035595c>] __schedule+0x63c/0x910 (8)
> Dec  5 10:20:16 lab-fc4v2 kernel:  [<c01043a1>] syscall_call+0x7/0xb (20)
> Dec  5 10:20:16 lab-fc4v2 kernel:  [<c013a0bc>] __kernel_text_address+0x1c/0x30 (16)
> Dec  5 10:20:16 lab-fc4v2 kernel:  [<c01051f4>] show_trace+0x34/0x90 (8)
> Dec  5 10:20:16 lab-fc4v2 kernel:  [<c0355cb5>] schedule+0x85/0x100 (36)
> Dec  5 10:20:16 lab-fc4v2 kernel:  [<c01306a5>] del_timer_sync+0x85/0xc0 (28)
> Dec  5 10:20:16 lab-fc4v2 kernel:  [<c013d2e0>] autoremove_wake_function+0x0/0x50 (12)
> Dec  5 10:20:16 lab-fc4v2 kernel:  [<c012b33e>] do_setitimer+0x9e/0x630 (28)
> Dec  5 10:20:16 lab-fc4v2 kernel:  [<c014ba44>] audit_syscall_entry+0x1a4/0x200 (60)
> Dec  5 10:20:16 lab-fc4v2 kernel:  [<c014bc70>] audit_syscall_exit+0x1d0/0x2f0 (8)
> Dec  5 10:20:16 lab-fc4v2 kernel:  [<c01f0eae>] copy_from_user+0x4e/0xc0 (16)
> Dec  5 10:20:16 lab-fc4v2 kernel:  [<c012b90e>] sys_setitimer+0x3e/0xa0 (28)
> Dec  5 10:20:16 lab-fc4v2 kernel:  [<c0108fe9>] do_syscall_trace+0x109/0x13d (12)
> Dec  5 10:20:16 lab-fc4v2 kernel:  [<c01043a1>] syscall_call+0x7/0xb (40)
> Dec  5 11:01:01 lab-fc4v2 crond(pam_unix)[3410]: session opened for user root by (uid=0)
> Dec  5 11:01:01 lab-fc4v2 crond(pam_unix)[3410]: session closed for user root
> Dec  5 11:54:15 lab-fc4v2 syslogd 1.4.1: restart.