[Crash-utility] Crash shows - Invalid Memory size

Sharyathi Nagesh sharyath at in.ibm.com
Tue Nov 21 07:38:27 UTC 2006


Hi

We ran into a problem where crash reports an invalid memory size when run
on a live machine:
======================================
[root@venuslp11 ~]# free -m
             total       used       free     shared    buffers     cached
Mem:          2151       1594        557          0        583        795
             ^^^^^^
-/+ buffers/cache:        215       1935
Swap:         1983          0       1983
[root@venuslp11 ~]# cat /proc/ppc64/lparcfg | grep DesMem
DesMem=2304
      ^^^^^^
[root@venuslp11 ~]# rpm -q crash
crash-4.0-3.7
[root@venuslp11 ~]# crash
...
      KERNEL: /usr/lib/debug/lib/modules/2.6.18-1.2732.el5/vmlinux
    DUMPFILE: /dev/mem
        CPUS: 2
        DATE: Fri Oct 27 07:58:54 2006
      UPTIME: 01:12:34
LOAD AVERAGE: 1.52, 1.05, 0.48
       TASKS: 98
    NODENAME: venuslp11.upt.austin.ibm.com
     RELEASE: 2.6.18-1.2732.el5
     VERSION: #1 SMP Tue Oct 17 18:24:27 EDT 2006
     MACHINE: ppc64  (2301 Mhz)
      MEMORY: 3.2 GB
             ^^^^^^^^
         PID: 25097
     COMMAND: "crash"
        TASK: c000000000fedbe0  [THREAD_INFO: c00000000b190000]
         CPU: 0
       STATE: TASK_RUNNING (ACTIVE)

crash>
==================================
Looking into the code, I found the following:
	The difference arises because /proc and crash calculate total memory
in different ways.
The /proc/meminfo code traverses memory counting each page, and it has
separate routines to count the pages in highmem, the init section, bootmem, etc.
That may be difficult to implement in crash.
  Instead we can look at the sysfs
implementation (/sys/devices/system/node/node<n>/meminfo). There the total page
count is taken not from node_spanned_pages but from node_present_pages.
The definition of node_present_pages says 'total number of physical pages',
while that of node_spanned_pages says 'total size of physical page range, including holes'.
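
As a rough illustration of the sysfs approach, the sketch below sums the
MemTotal line of each node's meminfo text. The sample values are hypothetical
(they were picked only so that the total matches the DesMem figure above), and
the helper names are made up for this example; this is not code from crash or
the kernel.

```python
import glob
import re

# Hypothetical per-node contents in the format of
# /sys/devices/system/node/node<n>/meminfo (values are made up).
SAMPLE = [
    "Node 0 MemTotal:       262144 kB\nNode 0 MemFree:         65536 kB\n",
    "Node 1 MemTotal:       262144 kB\nNode 1 MemFree:         32768 kB\n",
    "Node 2 MemTotal:      1835008 kB\nNode 2 MemFree:        524288 kB\n",
]


def node_mem_total_kb(text):
    """Return the MemTotal value (in kB) found in one node's meminfo text."""
    match = re.search(r"^Node\s+\d+\s+MemTotal:\s+(\d+)\s+kB", text, re.MULTILINE)
    if match is None:
        raise ValueError("no MemTotal line found")
    return int(match.group(1))


def total_mem_mb(per_node_texts):
    """Sum MemTotal over all nodes and convert kB to MB."""
    return sum(node_mem_total_kb(t) for t in per_node_texts) // 1024


def read_live_nodes():
    """Read the per-node meminfo files on a live Linux system, if present."""
    texts = []
    for path in sorted(glob.glob("/sys/devices/system/node/node*/meminfo")):
        with open(path) as f:
            texts.append(f.read())
    return texts


if __name__ == "__main__":
    print("sample total:", total_mem_mb(SAMPLE), "MB")
```

On a live machine, total_mem_mb(read_live_nodes()) would give the figure to
compare against the MEMORY line printed by crash.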
	The discrepancy shows up because of the way node 2 is laid out on this
machine: its starting pfn (page frame number) is 0, while nodes 0 and 1
start at pfns 4096 and 8192 respectively. So that node's spanned_pages
value also covers the pages of node 0 and node 1, which are then counted
twice. Hence I feel it is better to use present_pages, which counts only the
pages belonging to the node, excluding the holes.
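
The double counting can be shown with a small model. The start pfns below are
the ones quoted above; the page counts and the 64 KB page size are assumptions
made up for illustration, not values read from this machine, and the Node
tuple is only a stand-in for the kernel's per-node data.

```python
from collections import namedtuple

# Simplified stand-in for the per-node fields discussed above.
Node = namedtuple("Node", "nid start_pfn spanned_pages present_pages")

PAGE_KB = 64  # assumed page size in kB

# Node 2 starts at pfn 0 and its span covers the ranges of nodes 0 and 1,
# so its spanned_pages includes 8192 pages that belong to the other nodes.
NODES = [
    Node(nid=0, start_pfn=4096, spanned_pages=4096, present_pages=4096),
    Node(nid=1, start_pfn=8192, spanned_pages=4096, present_pages=4096),
    Node(nid=2, start_pfn=0, spanned_pages=36864, present_pages=28672),
]


def total_mb(nodes, field):
    """Sum the given page-count field over all nodes and convert to MB."""
    pages = sum(getattr(node, field) for node in nodes)
    return pages * PAGE_KB // 1024


if __name__ == "__main__":
    print("spanned total:", total_mb(NODES, "spanned_pages"), "MB")
    print("present total:", total_mb(NODES, "present_pages"), "MB")
```

With these made-up numbers, summing spanned_pages overstates the total by
exactly the pages of nodes 0 and 1, while summing present_pages does not.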

======================================
A patch to fix the problem is attached.
Let me know your opinion.


-------------- next part --------------
A non-text attachment was scrubbed...
Name: node_present_page.patch
Type: text/x-patch
Size: 1270 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/crash-utility/attachments/20061121/48ad7f88/attachment.bin>

