[Crash-utility] Fwd: s390x fixes

Michael Holzheu holzheu at linux.vnet.ibm.com
Wed May 2 12:58:36 UTC 2012


Hi Dave,

On Tue, 01 May 2012 15:24:17 -0400 (EDT)
Dave Anderson <anderson at redhat.com> wrote:

> With respect to the 3rd "vm -p" bug, I did some cursory debugging, and
> here's what I found.
> 
> In all cases, the readmem() failure occurs in
> _kl_pg_table_deref_s390x() as a result of transitioning from one page
> of PTEs to the next, because the pointer to the "next" page of PTES
> contains 0x20, which looks to be _SEGMENT_ENTRY_INV or
> _REGION_ENTRY_INV? (not sure of the s390x nomenclature...)
> 
> So you'll see something like this in the page table that points
> to the pages of PTEs:
>          
>          ...
>          c6386e0:  0000000000000020
> 0000000000000020   ....... ....... c6386f0:  000000001608c800
> 0000000000000020   ............... c638700:  0000000000000020
> 0000000000000020   ....... ....... ...
> 
> The vaddr's in the page of PTEs pointed to by c6386f0 (at
> 000000001608c800) all resolve as expected, but when the virtual
> address bumps it to c6386f8, it reads the 0x20, and passes it to
> _kl_pg_table_deref_s390x().  The user vaddr(s) that resolve to that
> next page of PTEs are legitimate, given that they are in the virtual
> region defined by the vm_area_struct.  But they certainly may not be
> mapped.
> 
> Anyway, it seems that there should be something that catches the
> invalid entry in s390x_vtop() -- prior to calling
> _kl_pg_table_deref_s390x()--  and return FALSE at that point.  
> 
> So if I make this kludge:
> 
>         ...
> 
>         /* Check if this is a large page. */
>         if (entry & 0x400ULL) {
>                 /* Add the 1MB page offset and return the final
> value. */ *phys_addr = table + (vaddr & 0xfffffULL);
>                 return TRUE;
>         }
> 
> ======> if (entry == 0x20) return FALSE;
> 
>         /* Get the page table entry */
>         entry = _kl_pg_table_deref_s390x(vaddr, entry & ~0x7ffULL);
>         if (!entry)
>                 return FALSE;
> 
>         /* Isolate the page origin from the page table entry. */
>         paddr = entry & ~0xfffULL;
> 
>         /* Add the page offset and return the final value. */
>         *phys_addr = paddr + (vaddr & 0xfffULL);
> 
>         return TRUE;
> }
> 
> then everything seems to work OK.  
> 
> So unless the calculation of the next page of PTEs is incorrect,
> which seems unlikely, it seems that the 0x20 is legitimate, and
> should be recognized?  What do you think?

0x20 means that the invalid bit of a region table entry is set (bit 58).
The problem is a wrong check in _kl_rsg_table_deref_s390x() where we
check for 0x40 (instead of 0x20).

The following is the correct fix for the problem:
---
 s390x.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/s390x.c
+++ b/s390x.c
@@ -568,7 +568,7 @@ static ulong _kl_rsg_table_deref_s390x(u
 	if ((entry & 0xcULL) != (level << 2))
 		return 0;
 	/* Check if the region table entry has the invalid bit set. */
-	if (entry & 0x40ULL)
+	if (entry & 0x20ULL)
 		return 0;
 	/* Region table entry is valid and well formed. */
 	return entry;





More information about the Crash-utility mailing list