
Re: sable smp



It looks like part of the problem was a bad processor board. Removing the 
board fixed the kernel panic, but the kernel now hangs after reporting that 
the remaining 2 processors are stuck. I didn't see the 'hit bug in handle_ipi' 
message from the patch below during boot.

Waiting on wait_init_idle(map=0x0)
All processors have done init_idle
alpha_fp_emul: Invalid FP insn 0x6392c0 at 0x0


> Hi,
> 
> >Has anyone had luck getting an smp kernel running on a 4/275? 
> >A standard 
> >non-smp kernel works, but an smp kernel starts, says it cannot 
> >start the 
> >remaining 3 processors then panics. I'm trying 2.4.12 and RH 
> >7.1. Is there a 
> >more stable earlier kernel that works?
> >
> 
> I could manage to boot 2.4.4 with the following patches:
> 
> --- linux-2.4.4/arch/alpha/kernel/sys_sable.c   Fri Oct 27 13:55:01 2000
> +++ linux-2.4.4_mf/arch/alpha/kernel/sys_sable.c        Thu Jun  7 10:09:26 2001
> @@ -96,7 +96,7 @@
>  static inline void
>  sable_update_irq_hw(unsigned long bit, unsigned long mask)
>  {
> -       int port = 0x536;
> +       int port = 0x537;
>  
>         if (bit >= 16) {
>                 port = 0x53d;
> @@ -121,7 +121,7 @@
>         } else if (bit >= 8) {
>                 port = 0x53a;
>                 val1 = 0xE0 | (bit - 8);
> -               val2 = 0xE0 | 2;
> +               val2 = 0xE0 | 3;
>         } else {
>                 port = 0x536;
>                 val1 = 0xE0 | (bit - 0);
> 
> The patch above is against 2.4.4; it has been included in newer kernel
> versions.
> With just the patch above the interrupt handling works fine again, but
> there still seems to be a race. I hit a NULL pointer dereference
> on line 758 of arch/alpha/kernel/smp.c (still 2.4.4).
> I added a validation check as follows:
> --- linux.mcore/arch/alpha/kernel/smp.c Fri Jun  8 09:49:30 2001
> +++ linux.sable/arch/alpha/kernel/smp.c Fri Jun  8 09:54:54 2001
> @@ -767,6 +767,7 @@
>                         int wait;
>  
>                         data = smp_call_function_data;
> +                       if (!data) printk("hit bug in handle_ipi\n");
>                         func = data->func;
>                         info = data->info;
>                         wait = data->wait;
> 
> and the problem went away. I never saw the message printed, but 'data'
> was no longer NULL. I played around with memory barriers, but that
> did not help. The line numbers in the patch are slightly off since it is
> diffed against 2.4.4 with the in-core-memory-dump patch from
> Mission Critical Linux applied.
> The system is still not stable. I don't see any panics anymore,
> but when a new program is launched it sometimes segfaults somewhere
> within ld.so. I used a plain Red Hat 6.2 with NFS root to test.
> 2.2.18 worked fine (at least for a couple of hours; I did not test
> any longer).
> 
> I don't have access to the systems anymore, so passing the information
> out is probably all I can do right now. But any feedback is appreciated.
> 
> Good luck,
> 
> Martin Frey
> 
> -- 
> Supercomputing Systems AG       email: frey@scs.ch
> Martin Frey                     web:   http://www.scs.ch/~frey/
> Technoparkstrasse 1             phone: +41 1 445 16 00
> CH-8005 Zuerich                 fax:   +41 1 445 16 10
> 
> 
> 
> _______________________________________________
> Axp-list mailing list
> Axp-list@redhat.com
> https://listman.redhat.com/mailman/listinfo/axp-list
> 
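
One note on the check in the quoted smp.c patch: it only prints a message and
then dereferences 'data' anyway, so a NULL pointer would still fault on the
next line. Below is a minimal user-space sketch of a guard that returns early
instead. It is not kernel code and does not address the race itself. The names
call_data, call_data_ptr, handle_call_function and bump are hypothetical; only
the field names func, info and wait and the message text come from the patch.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the descriptor used in the quoted patch;
 * only the field names (func, info, wait) are taken from it. */
struct call_data {
    void (*func)(void *info);
    void *info;
    int wait;
};

/* Shared pointer, playing the role of smp_call_function_data. */
static struct call_data *volatile call_data_ptr;

/* Returns 0 if the callback ran, -1 if the pointer was NULL. */
static int handle_call_function(void)
{
    /* Read the shared pointer exactly once into a local copy. */
    struct call_data *data = call_data_ptr;

    if (!data) {
        fprintf(stderr, "hit bug in handle_ipi\n");
        return -1;              /* skip the dereference entirely */
    }

    data->func(data->info);
    return 0;
}

/* Example callback: increments the int that info points to. */
static void bump(void *info)
{
    ++*(int *)info;
}
```

With call_data_ptr set to NULL, handle_call_function() logs the message and
returns -1. With it pointing at a descriptor whose func is bump, it runs the
callback once and returns 0.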





