[Crash-utility] Running idle threads show wrong CPU numbers

Michael Holzheu holzheu at linux.vnet.ibm.com
Wed Feb 10 13:32:29 UTC 2010


Hello Dave,

On Fri, 2010-01-22 at 09:32 -0500, Dave Anderson wrote:
> ----- "Michael Holzheu" <holzheu at linux.vnet.ibm.com> wrote:
> 
> > Hi Dave,
> > 
> > I have a problem with a dump where I have defined five CPUs and two of
> > them are offline. In fact the logical CPUs are defined as follows:
> > 
> >   0 on
> >   1 on
> >   2 off
> >   3 off
> >   4 on
> > 
> > The CPU online map looks correct:
> > 
> > crash> print/x *cpu_online_mask
> > $4 = {
> >   bits = {0x13} ---> b10011
> > }
> > 
> > When I issue "ps" I see that all running tasks are idle, but the CPU
> > numbers are not correct (0,1,2 and not 0,1,4):
> > 
> >    PID    PPID  CPU       TASK        ST  %MEM     VSZ    RSS  COMM
> > >     0      0   0       800ef0       RU   0.0       0      0  [swapper]
> > >     0      0   1      18c24240      RU   0.0       0      0  [swapper]
> > >     0      0   2      18c2c340      RU   0.0       0      0  [swapper]
> > 
> > I tried to debug the problem, but got stuck somewhere in "task.c". I
> > think there is a problem with the idle threads initialization, where the
> > online map is not considered.
> > 
> > Maybe you can see the bug immediately. Otherwise I will have to spend more
> > effort debugging that problem. I hope not :-)
> 
> Does "sys" show 5 or 3 cpus?  I'm guessing it shows 3, but should show 5.

Yes it shows 3.

> It looks like the s390/s390x files need to use "get_highest_cpu_online()+1"
> (like x86_64 and ppc64) in order to determine the number of cpus to account
> for.  As it is now, they do this, and would therefore only account for the
> first 3 cpus:
> 
> int
> s390x_get_smp_cpus(void)
> {
>         return get_cpus_online();
> }
> 
> int
> s390_get_smp_cpus(void)
> {
>         return get_cpus_online();
> }

Hmmm ok...

When I change get_smp_cpus() to return "get_highest_cpu_online() + 1" I
see five swapper idle tasks when using "ps". The problem I now have is
that I have to provide a backtrace for the offline cpus. But the offline
CPUs do not have any stack on s390. Is there a way to tell crash that
there is no backtrace available? Probably I overlooked something...
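
To illustrate why the two counts differ for this dump, here is a small
standalone sketch. It is not crash code: the helper names below are
hypothetical stand-ins that work on a plain bitmask, and only mimic what
I assume get_cpus_online() and get_highest_cpu_online() compute from the
online map.

```c
#include <stdio.h>

/* Hypothetical helper: number of set bits in the online mask
 * (what a "count of online CPUs" would report). */
static int count_cpus_online(unsigned long mask)
{
        int count = 0;

        for (; mask; mask >>= 1)
                count += (int)(mask & 1UL);
        return count;
}

/* Hypothetical helper: index of the highest set bit in the online
 * mask, or -1 if no CPU is online. */
static int highest_cpu_online(unsigned long mask)
{
        int cpu, highest = -1;

        for (cpu = 0; mask; mask >>= 1, cpu++)
                if (mask & 1UL)
                        highest = cpu;
        return highest;
}

/* Hypothetical helper: nonzero if the given logical CPU is online. */
static int cpu_is_online(unsigned long mask, int cpu)
{
        return (int)((mask >> cpu) & 1UL);
}

/* Print both counts and the per-CPU state for a given online mask. */
static void show_cpu_counts(unsigned long mask)
{
        int cpu, ncpus = highest_cpu_online(mask) + 1;

        printf("online count: %d, highest online + 1: %d\n",
               count_cpus_online(mask), ncpus);
        for (cpu = 0; cpu < ncpus; cpu++)
                printf("cpu %d: %s\n", cpu,
                       cpu_is_online(mask, cpu) ? "on" : "off");
}
```

With the mask from this dump, show_cpu_counts(0x13) reports an online
count of 3 and "highest online + 1" of 5, and lists CPUs 2 and 3 as off.
Those two are the ones for which no stack exists.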

Michael



