[Crash-utility] [RFC][PATCH]: crash aborts with cannot determine idle task

Dave Anderson anderson at redhat.com
Tue Jun 9 20:15:43 UTC 2009


----- "Dave Anderson" <anderson at redhat.com> wrote:

> This one looks good.  The only change that I will make is
> in the map_cpu_prstatus() function -- which should just return
> immediately if get_cpus_online() is equal to nd->num_prstatus_notes.

Sorry -- that's not what I meant...  

What I want to avoid is screwing around with the prstatus notes bookkeeping
unless it is absolutely necessary, i.e., where there had been some cpus offlined
prior to the crash.  The original thread back in April 2008 mentioned something
to the effect that your test system only had cpus 12 and 13 online at the time of
the crash.  When that is the case, is kt->cpus equal to 14?  I.e., what
does the "sys" command show for "CPUS:"?

I ask because this is the way I'd prefer to go:

void
map_cpus_to_prstatus(void)
{
        void **nt_ptr;
        int online, i, j, nrcpus;
        size_t size;

        /* Nothing to remap unless some cpus were offlined before the crash */
        if (!(online = get_cpus_online()) || (online == kt->cpus))
                return;

        if (CRASHDEBUG(1))
                error(INFO,
                    "cpus: %d online: %d NT_PRSTATUS notes: %d (remapping)\n",
                        kt->cpus, online, nd->num_prstatus_notes);

        size = NR_CPUS * sizeof(void *);

        nt_ptr = (void **)GETBUF(size);
        BCOPY(nd->nt_prstatus_percpu, nt_ptr, size);
        BZERO(nd->nt_prstatus_percpu, size);

        /*
         *  Re-populate the array with the notes mapping to online cpus
         */
        nrcpus = (kt->kernel_NR_CPUS ? kt->kernel_NR_CPUS : NR_CPUS);

        for (i = 0, j = 0; i < nrcpus; i++) {
                if (in_cpu_map(ONLINE, i))
                        nd->nt_prstatus_percpu[i] = nt_ptr[j++];
        }

        FREEBUF(nt_ptr);
}
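
For anyone who wants to see the remapping in isolation, here is a minimal
standalone sketch of the same idea.  It is not crash code: MAX_CPUS,
cpu_is_online() and remap_notes() are hypothetical stand-ins for NR_CPUS,
in_cpu_map(ONLINE, i) and the function above, and the extra bound on the
number of notes is an assumption of the sketch that the function above does
not make.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CPUS 16     /* hypothetical small bound, stands in for NR_CPUS */

/*
 *  Hypothetical stand-in for in_cpu_map(ONLINE, cpu): bit N set in
 *  online_mask means that cpu N was online.
 */
static int
cpu_is_online(unsigned long online_mask, int cpu)
{
        return (int)((online_mask >> cpu) & 1UL);
}

/*
 *  On entry, percpu[] holds the notes packed into slots 0..num_notes-1,
 *  in the order they were found.  On return, the k-th note sits in the
 *  slot of the k-th online cpu and every other slot is NULL.
 *  Returns the number of notes that were placed.
 */
static int
remap_notes(void *percpu[MAX_CPUS], int num_notes, unsigned long online_mask)
{
        void *packed[MAX_CPUS];
        int i, j;

        assert(percpu != NULL);
        assert(num_notes >= 0 && num_notes <= MAX_CPUS);

        /* Save the packed notes, then clear the per-cpu array */
        memcpy(packed, percpu, sizeof(packed));
        for (i = 0; i < MAX_CPUS; i++)
                percpu[i] = NULL;

        /* Hand the saved notes out to the online cpus, lowest cpu first */
        for (i = 0, j = 0; i < MAX_CPUS && j < num_notes; i++) {
                if (cpu_is_online(online_mask, i))
                        percpu[i] = packed[j++];
        }

        return j;
}
```

With two notes and only bits 12 and 13 set in online_mask, the first note
moves from slot 0 to slot 12, the second from slot 1 to slot 13, and slots
0 and 1 are left NULL.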

And since kt->cpus may not be fully initialized until after
kernel_init() has run, I moved the call to map_cpus_to_prstatus()
to here in task_init():

        if (ACTIVE()) {
                active_pid = REMOTE() ? pc->server_pid : pc->program_pid;
                set_context(NO_TASK, active_pid);
                tt->this_task = pid_to_task(active_pid);
        }
        else {
                if (KDUMP_DUMPFILE())
                        map_cpus_to_prstatus();
                please_wait("determining panic task");
                set_context(get_panic_context(), NO_PID);
                please_wait_done();
        }

Can you test the map_cpus_to_prstatus() function above, along with the
movement of the call to it from kernel_init() to task_init()?

Thanks,
  Dave



