[Crash-utility] Re: invalid kernel virtual address: cc08 type: "cpu number (per_cpu)"

Wed Nov 11 14:52:52 UTC 2009

----- "Bob Montgomery" <bob.montgomery at hp.com> wrote:

> I have a dump from a 2.6.31-based x86_64 system where the number of
> "possible" cpus equals the system's NR_CPUS (32).  
> On that system, the __per_cpu_offset table in the kernel consists of 32
> valid offset pointers.
> 
> When crash loads this table into its __per_cpu_offset[NR_CPUS=4096]
> array in struct kernel_table, it knows the length of the kernel's array
> (32*sizeof(long)), and copies the 32 pointers, leaving the rest of its
> (much longer) array full of 0x0s.
> 
> (This happens in kernel.c)
> 
>       if (symbol_exists("__per_cpu_offset")) {
>               if (LKCD_KERNTYPES())
>                       i = get_cpus_possible();
>               else
>                       i = get_array_length("__per_cpu_offset", NULL, 0);
>               get_symbol_data("__per_cpu_offset",
>                       sizeof(long)*((i && (i <= NR_CPUS)) ? i : NR_CPUS),
>                       &kt->__per_cpu_offset[0]);
>               kt->flags |= PER_CPU_OFF;
>       }
> 
> Later, in a couple of places, crash checks for the maximum valid
> __per_cpu_offset by reading the cpu_number value out of each per_cpu
> area and comparing it to the expected number until the comparison fails.
> (Remember NR_CPUS in crash is much larger than the kernel's NR_CPUS, and
> that's OK).
> 
> From x86_64.c:
>   
>             for (i = cpus = 0; i < NR_CPUS; i++) {
>                     readmem(symbol_value("per_cpu__cpu_number") +
>                             kt->__per_cpu_offset[i], KVADDR,
>                             &cpunumber, sizeof(int),
>                             "cpu number (per_cpu)", FAULT_ON_ERROR);
>                     if (cpunumber != cpus)
>                             break;
>                     cpus++;
>             }
> 
> This works well when the kernel's array has fewer real per_cpu_offsets
> than its own NR_CPUS, since the kernel preloads its array with a pointer
> (BOOT_PERCPU_OFFSET) and when this loop runs past the real
> per_cpu_offset pointers and tries to use the BOOT_PERCPU_OFFSET, it
> reads a bogus value for cpunumber and terminates.
> 
> But when the kernel's table is full of valid per_cpu_offset pointers,
> this loop continues off the end of that into the part of crash's
> __per_cpu_offset array that has the 0x0 initial values, and dies with:
> 
> crash: invalid kernel virtual address: cc08  type: "cpu number (per_cpu)"
> 
> The cc08 comes from the symbol_value of per_cpu__cpu_number:
> 000000000000cc08 D per_cpu__cpu_number
> 
> Bottom line:  Crash relies on an insufficient termination condition for
> the kernel's __per_cpu_offset array (an entry that points to an invalid
> cpu_number).
> 
> The included patch adds an additional loop termination so that crash
> doesn't run off the end of what it loaded from the dump.  It just checks
> for a NULL (0x0) value in kt->__per_cpu_offset[i].
> 
> Bob Montgomery,
> Working at HP

I have a similar-but-different fix queued for this.  Instead of
checking for a NULL kt->__per_cpu_offset[i] entry, it changes the
readmem() call to use RETURN_ON_ERROR|QUIET instead of FAULT_ON_ERROR,
like this:

                if (!readmem(symbol_value("per_cpu__cpu_number") +
                    kt->__per_cpu_offset[i],
                    KVADDR, &cpunumber, sizeof(int),
                    "cpu number (per_cpu)", QUIET|RETURN_ON_ERROR))
                        break;

That should prevent the failure you're seeing.

But another question is what happens in the (extremely) rare
circumstance of a non-CONFIG_SMP kernel.  In that case, the
kt->__per_cpu_offset[] array would be all NULL, and the
symbol_value("per_cpu__cpu_number") call would return the qualified
unity-mapped address.  So the virtual address calculation should work
in x86_64_per_cpu_init(), and the loop wouldn't even be entered in
x86_64_get_smp_cpus().

That being said, I don't think I've seen a recent x86_64 kernel
that was not compiled CONFIG_SMP, so I can't confirm that it's
ever been tested.  

So for sanity's sake, maybe your patch should also be applied,
but with the NULL check made only when the "i" index is non-zero?
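
As an illustration only, the following is a minimal, self-contained
sketch of a counting loop that combines both terminations discussed
above: the NULL-entry check guarded by a non-zero index, and a read
that returns on error instead of faulting.  It is not the crash
source.  mock_readmem(), load_offsets(), count_cpus(), CRASH_NR_CPUS
and the FAKE_* offsets are hypothetical stand-ins, and the mock assumes
that a read through a zero offset succeeds only when no per-cpu offsets
were loaded at all (the all-NULL non-SMP case).

```c
#include <assert.h>
#include <string.h>

#define CRASH_NR_CPUS 64          /* hypothetical stand-in for crash's NR_CPUS */
#define FAKE_BASE     0x10000UL   /* made-up per-cpu area base */
#define FAKE_STRIDE   0x1000UL    /* made-up per-cpu area size */

static unsigned long per_cpu_offset[CRASH_NR_CPUS];
static int kernel_cpus;           /* how many entries the "dump" provided */

/* Mimic the load in kernel.c: copy n offsets, leave the rest zeroed. */
static void load_offsets(int n)
{
    assert(n >= 0 && n <= CRASH_NR_CPUS);
    memset(per_cpu_offset, 0, sizeof(per_cpu_offset));
    for (int i = 0; i < n; i++)
        per_cpu_offset[i] = FAKE_BASE + (unsigned long)i * FAKE_STRIDE;
    kernel_cpus = n;
}

/* Hypothetical readmem() stand-in: returns 1 on success, 0 on a bad address. */
static int mock_readmem(unsigned long offset, int *cpunumber)
{
    if (offset == 0) {
        if (kernel_cpus != 0)
            return 0;             /* like the invalid "cc08" address */
        *cpunumber = 0;           /* assumed non-SMP case: unity-mapped read */
        return 1;
    }
    if (offset < FAKE_BASE || (offset - FAKE_BASE) % FAKE_STRIDE)
        return 0;
    unsigned long idx = (offset - FAKE_BASE) / FAKE_STRIDE;
    if (idx >= (unsigned long)kernel_cpus)
        return 0;
    *cpunumber = (int)idx;
    return 1;
}

/* Count cpus, stopping at a zero entry (except index 0) or a failed read. */
static int count_cpus(void)
{
    int i, cpus, cpunumber;

    for (i = cpus = 0; i < CRASH_NR_CPUS; i++) {
        if (i && per_cpu_offset[i] == 0)
            break;                /* ran past what was loaded from the dump */
        if (!mock_readmem(per_cpu_offset[i], &cpunumber))
            break;                /* return-on-error instead of faulting */
        if (cpunumber != cpus)
            break;
        cpus++;
    }
    return cpus;
}
```

In this model, calling load_offsets(32) and then count_cpus() stops at
index 32 on the first zero entry, while load_offsets(0) still reads
index 0 and stops at index 1.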

Thanks,
  Dave