
RE: sable smp



Hi,

>Has anyone had luck getting an smp kernel running on a 4/275? A standard
>non-smp kernel works, but an smp kernel starts, says it cannot start the
>remaining 3 processors then panics. I'm trying 2.4.12 and RH 7.1. Is there a
>more stable earlier kernel that works?
>

I managed to boot 2.4.4 with the following patches:

--- linux-2.4.4/arch/alpha/kernel/sys_sable.c   Fri Oct 27 13:55:01 2000
+++ linux-2.4.4_mf/arch/alpha/kernel/sys_sable.c        Thu Jun  7 10:09:26 2001
@@ -96,7 +96,7 @@
 static inline void
 sable_update_irq_hw(unsigned long bit, unsigned long mask)
 {
-       int port = 0x536;
+       int port = 0x537;
 
        if (bit >= 16) {
                port = 0x53d;
@@ -121,7 +121,7 @@
        } else if (bit >= 8) {
                port = 0x53a;
                val1 = 0xE0 | (bit - 8);
-               val2 = 0xE0 | 2;
+               val2 = 0xE0 | 3;
        } else {
                port = 0x536;
                val1 = 0xE0 | (bit - 0);

The patch above is against 2.4.4; it has been included in the newer kernel
version.
With just the patch above, interrupt handling works fine again, but a race
seems to remain: I hit a NULL pointer dereference
on line 758 of arch/alpha/kernel/smp.c (still 2.4.4).
I added a validation check as follows:
--- linux.mcore/arch/alpha/kernel/smp.c Fri Jun  8 09:49:30 2001
+++ linux.sable/arch/alpha/kernel/smp.c Fri Jun  8 09:54:54 2001
@@ -767,6 +767,7 @@
                        int wait;
 
                        data = smp_call_function_data;
+                       if (!data) printk("hit bug in handle_ipi\n");
                        func = data->func;
                        info = data->info;
                        wait = data->wait;

and the problem went away. I never saw the message printed, but 'data'
was no longer NULL. I played around with memory barriers, but that
did not help. The line numbers in the patch are slightly off because it is
diffed against 2.4.4 with the in-core-memory-dump patch from
Mission Critical Linux applied.
The system is still not stable. I no longer see any panics,
but when a new program is launched it sometimes segfaults somewhere
within ld.so. I tested with a plain Red Hat 6.2 with NFS root.
2.2.18 worked fine (at least for a couple of hours; I did not test
any longer).
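
For illustration only: the check added to smp.c above prints a message but
would still dereference 'data' if it were NULL. A minimal user-space sketch
of a check that bails out instead might look like the following. It is not
kernel code and is not known to fix the race described here. The names
struct call_data, call_slot, publish_call, handle_call and bump are all
hypothetical, and the sketch assumes a C11 compiler with <stdatomic.h>.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the descriptor that handle_ipi reads. */
struct call_data {
    void (*func)(void *info);
    void *info;
    int wait;
};

/* Shared slot playing the role of smp_call_function_data (assumption). */
static _Atomic(struct call_data *) call_slot = NULL;

/* Publish a descriptor, or pass NULL to clear the slot. */
static void publish_call(struct call_data *d)
{
    atomic_store_explicit(&call_slot, d, memory_order_release);
}

/*
 * Take one snapshot of the shared pointer and bail out if it is NULL,
 * instead of printing and dereferencing anyway.
 * Returns 1 if the callback ran, 0 if the slot was empty.
 */
static int handle_call(void)
{
    struct call_data *data =
        atomic_load_explicit(&call_slot, memory_order_acquire);

    if (data == NULL)
        return 0;

    data->func(data->info);
    return 1;
}

/* Example callback: increments the int that info points to. */
static void bump(void *info)
{
    ++*(int *)info;
}
```

Calling handle_call() before anything is published returns 0 and touches
nothing; after publish_call() with a descriptor whose func is bump, it runs
the callback once and returns 1.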

I don't have access to the systems anymore, so passing this information
on is probably all I can do right now. Any feedback is appreciated.

Good luck,

Martin Frey

-- 
Supercomputing Systems AG       email: frey@scs.ch
Martin Frey                     web:   http://www.scs.ch/~frey/
Technoparkstrasse 1             phone: +41 1 445 16 00
CH-8005 Zuerich                 fax:   +41 1 445 16 10




