rpms/kernel/F-8 linux-2.6-block-submit_bh-discards-barrier-flag.patch, NONE, 1.1 linux-2.6-mm-dirty-page-tracking-race-fix.patch, NONE, 1.1 linux-2.6-x86-32-amd-c1e-force-timer-broadcast-late.patch, NONE, 1.1 kernel.spec, 1.521, 1.522

Chuck Ebbert cebbert at fedoraproject.org
Sun Aug 31 05:18:36 UTC 2008


Author: cebbert

Update of /cvs/pkgs/rpms/kernel/F-8
In directory cvs1.fedora.phx.redhat.com:/tmp/cvs-serv30661

Modified Files:
	kernel.spec 
Added Files:
	linux-2.6-block-submit_bh-discards-barrier-flag.patch 
	linux-2.6-mm-dirty-page-tracking-race-fix.patch 
	linux-2.6-x86-32-amd-c1e-force-timer-broadcast-late.patch 
Log Message:
x86-32: amd c1e force timer broadcast late
  (fixes failure to disable local apic timer)
mm: dirty page tracking race fix
block: submit_bh() inadvertently discards barrier flag on a sync write

linux-2.6-block-submit_bh-discards-barrier-flag.patch:

--- NEW FILE linux-2.6-block-submit_bh-discards-barrier-flag.patch ---
From: Jens Axboe <jens.axboe at oracle.com>
Date: Fri, 22 Aug 2008 08:00:36 +0000 (+0200)
Subject: block: submit_bh() inadvertently discards barrier flag on a sync write
X-Git-Tag: v2.6.27-rc5~19^2~4
X-Git-Url: http://git.kernel.org/?p=linux%2Fkernel%2Fgit%2Ftorvalds%2Flinux-2.6.git;a=commitdiff_plain;h=48fd4f93a00eac844678629f2f00518e146ed30d

block: submit_bh() inadvertently discards barrier flag on a sync write

As reported by Milan Broz <mbroz at redhat.com>, commit 18ce3751
inadvertently made submit_bh() discard the barrier bit for a WRITE_SYNC
request. Fix that up.
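
Editor's illustration (not part of the upstream commit): a minimal
user-space sketch of why the equality test loses the barrier bit for a
sync write. The DEMO_* constants and the rw_before_patch(),
rw_after_patch() and demo_barrier_check() names are hypothetical, and the
bit positions are assumed for illustration only; the real values are
defined in the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag layout only; not taken from the kernel headers. */
enum {
	DEMO_WRITE         = 1 << 0,
	DEMO_BARRIER_BIT   = 1 << 2,
	DEMO_SYNC_BIT      = 1 << 4,
	DEMO_WRITE_SYNC    = DEMO_WRITE | DEMO_SYNC_BIT,
	DEMO_WRITE_BARRIER = DEMO_WRITE | DEMO_BARRIER_BIT,
};

/* Old logic: only an exact WRITE is upgraded, so WRITE_SYNC slips through. */
static int rw_before_patch(int rw, bool ordered)
{
	if (ordered && (rw == DEMO_WRITE))
		rw = DEMO_WRITE_BARRIER;
	return rw;
}

/* New logic: any request with the write bit set has the barrier bit ORed in. */
static int rw_after_patch(int rw, bool ordered)
{
	if (ordered && (rw & DEMO_WRITE))
		rw |= DEMO_WRITE_BARRIER;
	return rw;
}

/* Small self-check that contrasts the two versions on an ordered buffer. */
void demo_barrier_check(void)
{
	/* Plain WRITE works either way. */
	assert(rw_before_patch(DEMO_WRITE, true) == DEMO_WRITE_BARRIER);
	assert(rw_after_patch(DEMO_WRITE, true) == DEMO_WRITE_BARRIER);

	/* WRITE_SYNC: the old test leaves the barrier bit clear. */
	assert((rw_before_patch(DEMO_WRITE_SYNC, true) & DEMO_BARRIER_BIT) == 0);

	/* WRITE_SYNC: the new test keeps the sync bit and adds the barrier bit. */
	assert(rw_after_patch(DEMO_WRITE_SYNC, true) ==
	       (DEMO_WRITE_SYNC | DEMO_BARRIER_BIT));
}
```

Calling demo_barrier_check() from a test program exercises both paths.
The same bitmask test is what lets the patch simplify the later
clear_buffer_write_io_error() condition to (rw & WRITE).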

Signed-off-by: Jens Axboe <jens.axboe at oracle.com>
---

diff --git a/fs/buffer.c b/fs/buffer.c
index 38653e3..ac78d4c 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -2926,14 +2926,17 @@ int submit_bh(int rw, struct buffer_head * bh)
 	BUG_ON(!buffer_mapped(bh));
 	BUG_ON(!bh->b_end_io);
 
-	if (buffer_ordered(bh) && (rw == WRITE))
-		rw = WRITE_BARRIER;
+	/*
+	 * Mask in barrier bit for a write (could be either a WRITE or a
+	 * WRITE_SYNC)
+	 */
+	if (buffer_ordered(bh) && (rw & WRITE))
+		rw |= WRITE_BARRIER;
 
 	/*
-	 * Only clear out a write error when rewriting, should this
-	 * include WRITE_SYNC as well?
+	 * Only clear out a write error when rewriting
 	 */
-	if (test_set_buffer_req(bh) && (rw == WRITE || rw == WRITE_BARRIER))
+	if (test_set_buffer_req(bh) && (rw & WRITE))
 		clear_buffer_write_io_error(bh);
 
 	/*

linux-2.6-mm-dirty-page-tracking-race-fix.patch:

--- NEW FILE linux-2.6-mm-dirty-page-tracking-race-fix.patch ---
From: Nick Piggin <npiggin at suse.de>
Date: Wed, 20 Aug 2008 21:09:18 +0000 (-0700)
Subject: mm: dirty page tracking race fix
X-Git-Tag: v2.6.27-rc4~6
X-Git-Url: http://git.kernel.org/?p=linux%2Fkernel%2Fgit%2Ftorvalds%2Flinux-2.6.git;a=commitdiff_plain;h=479db0bf408e65baa14d2a9821abfcbc0804b847

mm: dirty page tracking race fix

There is a race with dirty page accounting where a page may not properly
be accounted for.

clear_page_dirty_for_io() calls page_mkclean; then TestClearPageDirty.

page_mkclean walks the rmaps for that page, and for each one it cleans and
write protects the pte if it was dirty.  It uses page_check_address to
find the pte.  That function has a shortcut to avoid the ptl if the pte is
not present.  Unfortunately, the pte can be switched to not-present then
back to present by other code while holding the page table lock -- this
should not be a signal for page_mkclean to ignore that pte, because it may
be dirty.

For example, powerpc64's set_pte_at will clear a previously present pte
before setting it to the desired value.  There may also be other code in
core mm or in arch that does similar things.

The consequence of the bug is loss of data integrity due to msync, and
loss of dirty page accounting accuracy.  XIP's __xip_unmap could easily
also be unreliable (depending on the exact XIP locking scheme), which can
lead to data corruption.

Fix this by having an option to always take ptl to check the pte in
page_check_address.

It's possible to retain this optimization for page_referenced and
try_to_unmap.
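
Editor's illustration (not part of the upstream commit): a
single-threaded toy model of the race described above. Everything here
(struct toy_pte, struct toy_mm, toy_lock() and the other toy_* helpers)
is hypothetical user-space code, not the kernel implementation. "Taking
the lock" is modelled as waiting for another CPU's in-flight pte update
to finish.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a page table entry; not the kernel's pte_t. */
struct toy_pte {
	bool present;
	bool dirty;
};

struct toy_mm {
	struct toy_pte pte;
	bool update_in_flight;	/* another CPU holds the lock mid-update */
	struct toy_pte pending;	/* value that CPU installs before unlocking */
};

/* Acquiring the lock means the in-flight update has completed. */
static void toy_lock(struct toy_mm *mm)
{
	if (mm->update_in_flight) {
		mm->pte = mm->pending;
		mm->update_in_flight = false;
	}
}

/* Toy analogue of page_check_address() with the new sync argument. */
static struct toy_pte *toy_check_address(struct toy_mm *mm, int sync)
{
	if (!sync && !mm->pte.present)
		return NULL;	/* racy shortcut: skips the lock */
	toy_lock(mm);
	return mm->pte.present ? &mm->pte : NULL;
}

/* Toy analogue of page_mkclean_one(): true if a dirty pte was cleaned. */
static bool toy_mkclean(struct toy_mm *mm, int sync)
{
	struct toy_pte *pte = toy_check_address(mm, sync);

	if (!pte || !pte->dirty)
		return false;
	pte->dirty = false;
	return true;
}

/* State seen while another CPU has cleared the pte and not yet rewritten it. */
static struct toy_mm toy_mid_update(void)
{
	struct toy_mm mm = {
		.pte = { .present = false, .dirty = false },
		.update_in_flight = true,
		.pending = { .present = true, .dirty = true },
	};
	return mm;
}

void toy_race_demo(void)
{
	struct toy_mm racy = toy_mid_update();
	struct toy_mm safe = toy_mid_update();

	/* sync == 0: the shortcut fires and the dirty pte is never cleaned. */
	assert(!toy_mkclean(&racy, 0));
	toy_lock(&racy);		/* the other CPU finishes its update */
	assert(racy.pte.present && racy.pte.dirty);

	/* sync == 1: the check waits for the lock and cleans the pte. */
	assert(toy_mkclean(&safe, 1));
	assert(!safe.pte.dirty);
}
```

In this model the sync == 0 path corresponds to the callers that keep
the optimization (page_referenced and try_to_unmap), where a missed pte
is acceptable, while page_mkclean and __xip_unmap pass 1.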

Signed-off-by: Nick Piggin <npiggin at suse.de>
Cc: Jared Hulbert <jaredeh at gmail.com>
Cc: Carsten Otte <cotte at freenet.de>
Cc: Hugh Dickins <hugh at veritas.com>
Acked-by: Peter Zijlstra <a.p.zijlstra at chello.nl>
Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>
---

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 69407f8..fed6f5e 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -102,7 +102,7 @@ int try_to_unmap(struct page *, int ignore_refs);
  * Called from mm/filemap_xip.c to unmap empty zero page
  */
 pte_t *page_check_address(struct page *, struct mm_struct *,
-				unsigned long, spinlock_t **);
+				unsigned long, spinlock_t **, int);
 
 /*
  * Used by swapoff to help locate where page is expected in vma.
diff --git a/mm/filemap_xip.c b/mm/filemap_xip.c
index 380ab40..8b710ca 100644
--- a/mm/filemap_xip.c
+++ b/mm/filemap_xip.c
@@ -185,7 +185,7 @@ __xip_unmap (struct address_space * mapping,
 		address = vma->vm_start +
 			((pgoff - vma->vm_pgoff) << PAGE_SHIFT);
 		BUG_ON(address < vma->vm_start || address >= vma->vm_end);
-		pte = page_check_address(page, mm, address, &ptl);
+		pte = page_check_address(page, mm, address, &ptl, 1);
 		if (pte) {
 			/* Nuke the page table entry. */
 			flush_cache_page(vma, address, pte_pfn(*pte));
diff --git a/mm/rmap.c b/mm/rmap.c
index 0597747..0383acf 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -224,10 +224,14 @@ unsigned long page_address_in_vma(struct page *page, struct vm_area_struct *vma)
 /*
  * Check that @page is mapped at @address into @mm.
  *
+ * If @sync is false, page_check_address may perform a racy check to avoid
+ * the page table lock when the pte is not present (helpful when reclaiming
+ * highly shared pages).
+ *
  * On success returns with pte mapped and locked.
  */
 pte_t *page_check_address(struct page *page, struct mm_struct *mm,
-			  unsigned long address, spinlock_t **ptlp)
+			  unsigned long address, spinlock_t **ptlp, int sync)
 {
 	pgd_t *pgd;
 	pud_t *pud;
@@ -249,7 +253,7 @@ pte_t *page_check_address(struct page *page, struct mm_struct *mm,
 
 	pte = pte_offset_map(pmd, address);
 	/* Make a quick check before getting the lock */
-	if (!pte_present(*pte)) {
+	if (!sync && !pte_present(*pte)) {
 		pte_unmap(pte);
 		return NULL;
 	}
@@ -281,7 +285,7 @@ static int page_referenced_one(struct page *page,
 	if (address == -EFAULT)
 		goto out;
 
-	pte = page_check_address(page, mm, address, &ptl);
+	pte = page_check_address(page, mm, address, &ptl, 0);
 	if (!pte)
 		goto out;
 
@@ -450,7 +454,7 @@ static int page_mkclean_one(struct page *page, struct vm_area_struct *vma)
 	if (address == -EFAULT)
 		goto out;
 
-	pte = page_check_address(page, mm, address, &ptl);
+	pte = page_check_address(page, mm, address, &ptl, 1);
 	if (!pte)
 		goto out;
 
@@ -704,7 +708,7 @@ static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 	if (address == -EFAULT)
 		goto out;
 
-	pte = page_check_address(page, mm, address, &ptl);
+	pte = page_check_address(page, mm, address, &ptl, 0);
 	if (!pte)
 		goto out;
 

linux-2.6-x86-32-amd-c1e-force-timer-broadcast-late.patch:

--- NEW FILE linux-2.6-x86-32-amd-c1e-force-timer-broadcast-late.patch ---
x86-32: amd c1e force timer broadcast late

In kernel 2.6.26 the 32-bit x86 timers are started earlier than before.
This breaks AMD c1e detection trying to force timer broadcast for the
local apic timer. Copy the code from the 64-bit kernel to force timer
broadcast late.

This patch is not needed in 2.6.27 because it has new c1e-aware idle code.

Signed-off-by: Chuck Ebbert <cebbert at redhat.com>

--- linux-2.6.26.noarch.orig/arch/x86/kernel/apic_32.c
+++ linux-2.6.26.noarch/arch/x86/kernel/apic_32.c
@@ -552,8 +552,31 @@ void __init setup_boot_APIC_clock(void)
 	setup_APIC_timer();
 }
 
-void __devinit setup_secondary_APIC_clock(void)
+/*
+ * AMD C1E enabled CPUs have a real nasty problem: Some BIOSes set the
+ * C1E flag only in the secondary CPU, so when we detect the wreckage
+ * we already have enabled the boot CPU local apic timer. Check, if
+ * disable_apic_timer is set and the DUMMY flag is cleared. If yes,
+ * set the DUMMY flag again and force the broadcast mode in the
+ * clockevents layer.
+ */
+static void __cpuinit check_boot_apic_timer_broadcast(void)
+{
+	if (!local_apic_timer_disabled ||
+	    (lapic_clockevent.features & CLOCK_EVT_FEAT_DUMMY))
+		return;
+
+	lapic_clockevent.features |= CLOCK_EVT_FEAT_DUMMY;
+
+	local_irq_enable();
+	clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_FORCE,
+			   &boot_cpu_physical_apicid);
+	local_irq_disable();
+}
+
+void __cpuinit setup_secondary_APIC_clock(void)
 {
+	check_boot_apic_timer_broadcast();
 	setup_APIC_timer();
 }
 


Index: kernel.spec
===================================================================
RCS file: /cvs/pkgs/rpms/kernel/F-8/kernel.spec,v
retrieving revision 1.521
retrieving revision 1.522
diff -u -r1.521 -r1.522
--- kernel.spec	31 Aug 2008 05:09:14 -0000	1.521
+++ kernel.spec	31 Aug 2008 05:18:36 -0000	1.522
@@ -581,6 +581,7 @@
 Patch88: linux-2.6-x86-64-fix-overlap-of-modules-and-fixmap-areas.patch
 Patch89: linux-2.6-x86-fdiv-bug-detection-fix.patch
 Patch91: linux-2.6-x86-fix-oprofile-and-hibernation-issues.patch
+Patch92: linux-2.6-x86-32-amd-c1e-force-timer-broadcast-late.patch
 
 #ALSA
 
@@ -615,6 +616,7 @@
 
 Patch410: linux-2.6-bio-fix-__bio_copy_iov-handling-of-bv_len.patch
 Patch411: linux-2.6-bio-fix-bio_copy_kern-handling-of-bv_len.patch
+Patch412: linux-2.6-block-submit_bh-discards-barrier-flag.patch
 
 # filesystem patches
 Patch421: linux-2.6-squashfs.patch
@@ -657,6 +659,8 @@
 Patch820: linux-2.6-cpuidle-2-menu-governor-fix-wrong-usage-of-measured_us.patch
 Patch830: linux-2.6-cpuidle-3-make-ladder-governor-honor-latency-requirements.patch
 
+Patch900: linux-2.6-mm-dirty-page-tracking-race-fix.patch
+
 Patch1101: linux-2.6-default-mmf_dump_elf_headers.patch
 
 Patch1308: linux-2.6-usb-ehci-hcd-respect-nousb.patch
@@ -989,6 +993,8 @@
 # bio patches queued for -stable
 ApplyPatch linux-2.6-bio-fix-__bio_copy_iov-handling-of-bv_len.patch
 ApplyPatch linux-2.6-bio-fix-bio_copy_kern-handling-of-bv_len.patch
+# don't discard barrier flags
+ApplyPatch linux-2.6-block-submit_bh-discards-barrier-flag.patch
 
 # Nouveau DRM + drm fixes
 ApplyPatch nouveau-drm.patch
@@ -1011,6 +1017,8 @@
 ApplyPatch linux-2.6-x86-fdiv-bug-detection-fix.patch
 # oprofile / hibernation fix
 ApplyPatch linux-2.6-x86-fix-oprofile-and-hibernation-issues.patch
+# fix failure to disable local apic on AMD c1e-enabled machines
+ApplyPatch linux-2.6-x86-32-amd-c1e-force-timer-broadcast-late.patch
 
 #
 # PowerPC
@@ -1177,6 +1185,10 @@
 ApplyPatch linux-2.6-cpuidle-2-menu-governor-fix-wrong-usage-of-measured_us.patch
 ApplyPatch linux-2.6-cpuidle-3-make-ladder-governor-honor-latency-requirements.patch
 
+# mm
+# possible data corruption, esp. on ppc
+ApplyPatch linux-2.6-mm-dirty-page-tracking-race-fix.patch
+
 # dm / md
 
 # ACPI
@@ -1801,6 +1813,12 @@
 
 
 %changelog
+* Sat Aug 30 2008 Chuck Ebbert <cebbert at redhat.com> 2.6.26.3-12
+- x86-32: amd c1e force timer broadcast late
+  (fixes failure to disable local apic timer)
+- mm: dirty page tracking race fix
+- block: submit_bh() inadvertently discards barrier flag on a sync write
+
 * Sat Aug 30 2008 Chuck Ebbert <cebbert at redhat.com> 2.6.26.3-11
 - Fix cpuidle misbehavior. (F9#459214)
 



