[Crash-utility] Question on online/present/possible CPUS

Fri Sep 24 13:15:27 UTC 2010

----- "Jeffrey Hagen" <Jeffrey.Hagen at teradata.com> wrote:

> Paranoia is usually a good thing in this industry and you know this code
> far better than I do...
> 
> For the older kernels that don't have cpu_present_map, if they still
> have the x8664_pda structure, the code my patch changes shouldn't get
> executed.  It's the deprecation of the x8664_pda structure (between
> SLES10 and SLES11 in our case) that exposes this issue.

True...

> 
> The setting of the other CPUs to offline (IPI REBOOT_VECTOR) is done in
> native_smp_send_stop [arch/x86/kernel/smp.c] called by panic().  Note
> that the SLES11 version of the 2.6.32 kernel allows calling
> crash_kexec() after calling atomic_notifier_call_chain() in panic().

Ah-ha!  That makes sense -- I was under the impression that all of the
other distros would follow upstream with crash_kexec() being called
before, and therefore preventing, the subsequent smp_send_stop() call.
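
To make the ordering concrete, here is a rough user-space model of the
two panic() flows.  It is only a sketch and not kernel code: the sim_*
names, the 4-CPU count, and the kexec_after_notifiers flag are all made
up for illustration, and the real functions do far more than this.

```c
#include <stdbool.h>

/* Simplified user-space model of the two orderings; NOT kernel code. */
#define SIM_NR_CPUS 4   /* hypothetical CPU count for the model */

struct sim_state {
    bool online[SIM_NR_CPUS];  /* stand-in for the online CPU mask */
    bool dump_taken;
    int  online_at_dump;       /* CPUs still online when the dump was taken */
};

static int sim_count_online(const struct sim_state *s)
{
    int n = 0;
    for (int i = 0; i < SIM_NR_CPUS; i++)
        if (s->online[i])
            n++;
    return n;
}

/* Stand-in for crash_kexec(): snapshot the state that ends up in the dump. */
static void sim_crash_kexec(struct sim_state *s)
{
    if (s->dump_taken)
        return;
    s->dump_taken = true;
    s->online_at_dump = sim_count_online(s);
}

/* Stand-in for smp_send_stop(): every CPU except the caller goes offline. */
static void sim_smp_send_stop(struct sim_state *s, int self)
{
    for (int i = 0; i < SIM_NR_CPUS; i++)
        if (i != self)
            s->online[i] = false;
}

/*
 * Stand-in for panic() running on CPU 0.  Returns the number of CPUs
 * that the dump would show as online.
 */
static int sim_panic(bool kexec_after_notifiers)
{
    struct sim_state s = { .dump_taken = false, .online_at_dump = 0 };

    for (int i = 0; i < SIM_NR_CPUS; i++)
        s.online[i] = true;

    if (!kexec_after_notifiers) {
        /* Upstream ordering: the dump is taken first, so the
         * smp_send_stop() step below is never reached. */
        sim_crash_kexec(&s);
        return s.online_at_dump;
    }

    /* Ordering as described for SLES11: other CPUs are stopped, the
     * notifier chain runs (a no-op here), then the dump is taken. */
    sim_smp_send_stop(&s, 0);
    sim_crash_kexec(&s);
    return s.online_at_dump;
}
```

In this model sim_panic(false) reports all 4 CPUs online in the dump,
while sim_panic(true) reports only the panicking CPU, which is the
situation the patch has to cope with.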

So given that this would happen whenever panic() gets called directly
in a SLES kernel, is the SLES version of the crash utility patched to do
something similar to your patch?  

Petr?

> The flow during an oops or keyboard-induced crash does not use this same
> code.  In this case crash_kexec() is called by oops_end() which is
> called by die().

OK, I'm going to give your patch a run-through with the 150 or so x86_64
dumpfiles I've kept as examples over the years, and see if anything
interesting happens.

Thanks Jeff,
  Dave