rpms/kernel/F-11 linux-2.6-iwlwifi_-fix-TX-queue-race.patch, NONE, 1.1.2.2 kernel.spec, 1.1679.2.4, 1.1679.2.5

John W. Linville linville at fedoraproject.org
Wed Aug 12 18:13:11 UTC 2009


Author: linville

Update of /cvs/pkgs/rpms/kernel/F-11
In directory cvs1.fedora.phx.redhat.com:/tmp/cvs-serv15900

Modified Files:
      Tag: private-fedora-11-2_6_29_6
	kernel.spec 
Added Files:
      Tag: private-fedora-11-2_6_29_6
	linux-2.6-iwlwifi_-fix-TX-queue-race.patch 
Log Message:
iwlwifi: fix TX queue race

linux-2.6-iwlwifi_-fix-TX-queue-race.patch:
 iwl-tx.c |   12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

--- NEW FILE linux-2.6-iwlwifi_-fix-TX-queue-race.patch ---
Backport of the following upstream commit...

commit 3995bd9332a51b626237d6671cfeb7235e6c1305
Author: Johannes Berg <johannes at sipsolutions.net>
Date:   Fri Jul 24 11:13:14 2009 -0700

    iwlwifi: fix TX queue race
    
    I had a problem on 4965 hardware (well, probably other hardware too,
    but others don't survive my stress testing right now, unfortunately)
    where the driver was sending invalid commands to the device, but no
    such thing could be seen from the driver's point of view. I could
    reproduce this fairly easily by sending multiple TCP streams with
    iperf on different TIDs, though sometimes a single iperf stream was
    sufficient. It even happened with a single core, but I have forced
    preemption turned on.
    
    The culprit was a queue overrun, where we advanced the queue's write
    pointer over the read pointer. After careful analysis I've come to
    the conclusion that the cause is a race condition between iwlwifi
    and mac80211.
    
    mac80211, of course, checks whether the queue is stopped, before
    transmitting a frame. This effectively looks like this:
    
            lock(queues)
            if (stopped(queue)) {
                    unlock(queues)
                    return busy;
            }
            unlock(queues)
            ...             <-- this place will be important
                                there is some more code here
            drv_tx(frame)
    
    The driver, on the other hand, can stop and start queues, which does
    
            lock(queues)
            mark_running/stopped(queue)
            unlock(queues)
    
            [if marked running: wake up tasklet to send pending frames]
    
    Now, however, once the driver starts the queue, mac80211 can see that
    and end up at the marked place above, at which point for some reason the
    driver seems to stop the queue again (I don't understand that) and then
    we end up transmitting while the queue is actually full.
    
    Now, this shouldn't actually matter much, but for some reason I've seen
    it happen multiple times in a row and the queue actually overflows, at
    which point the queue bites itself in the tail and things go completely
    wrong.
    
    This patch fixes this by just dropping the packet should this have
    happened, and making the lock in iwlwifi cover everything so iwlwifi
    can't race against itself (dropping the lock there might make it more
    likely, but it did seem to happen without that too).
    
    Since we can't hold the lock across drv_tx() above, I see no way to fix
    this in mac80211, but I also don't understand why I haven't seen this
    before -- maybe I just never stress tested it this badly.
    
    With this patch, the device has survived many minutes of simultaneously
    sending two iperf streams on different TIDs with combined throughput
    of about 60 Mbps.
    
    Signed-off-by: Johannes Berg <johannes at sipsolutions.net>
    Signed-off-by: Reinette Chatre <reinette.chatre at intel.com>
    Signed-off-by: John W. Linville <linville at tuxdriver.com>

diff -up linux-2.6.29.noarch/drivers/net/wireless/iwlwifi/iwl-tx.c.orig linux-2.6.29.noarch/drivers/net/wireless/iwlwifi/iwl-tx.c
--- linux-2.6.29.noarch/drivers/net/wireless/iwlwifi/iwl-tx.c.orig	2009-03-23 19:12:14.000000000 -0400
+++ linux-2.6.29.noarch/drivers/net/wireless/iwlwifi/iwl-tx.c	2009-08-12 11:56:04.000000000 -0400
@@ -876,8 +876,6 @@ int iwl_tx_skb(struct iwl_priv *priv, st
 		goto drop_unlock;
 	}
 
-	spin_unlock_irqrestore(&priv->lock, flags);
-
 	hdr_len = ieee80211_hdrlen(fc);
 
 	/* Find (or create) index into station table for destination station */
@@ -885,7 +883,7 @@ int iwl_tx_skb(struct iwl_priv *priv, st
 	if (sta_id == IWL_INVALID_STATION) {
 		IWL_DEBUG_DROP("Dropping - INVALID STATION: %pM\n",
 			       hdr->addr1);
-		goto drop;
+		goto drop_unlock;
 	}
 
 	IWL_DEBUG_TX("station Id %d\n", sta_id);
@@ -904,14 +902,17 @@ int iwl_tx_skb(struct iwl_priv *priv, st
 		/* aggregation is on for this <sta,tid> */
 		if (info->flags & IEEE80211_TX_CTL_AMPDU)
 			txq_id = priv->stations[sta_id].tid[tid].agg.txq_id;
-		priv->stations[sta_id].tid[tid].tfds_in_queue++;
 	}
 
 	txq = &priv->txq[txq_id];
 	q = &txq->q;
 	txq->swq_id = swq_id;
 
-	spin_lock_irqsave(&priv->lock, flags);
+	if (unlikely(iwl_queue_space(q) < q->high_mark))
+		goto drop_unlock;
+
+	if (ieee80211_is_data_qos(fc))
+		priv->stations[sta_id].tid[tid].tfds_in_queue++;
 
 	/* Set up first empty TFD within this queue's circular TFD buffer */
 	tfd = &txq->tfds[q->write_ptr];
@@ -1043,7 +1044,6 @@ int iwl_tx_skb(struct iwl_priv *priv, st
 
 drop_unlock:
 	spin_unlock_irqrestore(&priv->lock, flags);
-drop:
 	return -1;
 }
 EXPORT_SYMBOL(iwl_tx_skb);

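Editor's sketch (not part of the patch): the window described in the commit
message can be replayed deterministically in user space. The model below is
an assumption-laden illustration, not kernel code. Every name in it
(model_queue, stack_may_transmit, driver_queue_one, driver_tx) is
hypothetical and exists in neither mac80211 nor iwlwifi. It only shows that a
"queue not stopped" answer can be stale by the time the transmit call runs,
and that a space check inside the transmit path turns the would-be overrun
into a dropped frame.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-threaded model of the race; not kernel code. */
struct model_queue {
    bool stopped;    /* flag the stack tests under its own lock */
    int  space;      /* free descriptors left in the driver's ring */
    int  high_mark;  /* driver stops the queue below this much space */
    int  queued;     /* frames accepted so far */
    int  dropped;    /* frames rejected by the guard */
};

/* Step 1: the stack's check. Its lock is released once this returns. */
static bool stack_may_transmit(const struct model_queue *q)
{
    return !q->stopped;
}

/* Event that can land in the unlocked window: another frame is queued
 * and the driver marks the queue stopped because space ran low. */
static void driver_queue_one(struct model_queue *q)
{
    q->space--;
    q->queued++;
    if (q->space < q->high_mark)
        q->stopped = true;
}

/* Step 2: the driver's transmit entry point.
 * guard == true models the patched behaviour (re-check space, drop);
 * guard == false models the old behaviour (trust the caller's check). */
static int driver_tx(struct model_queue *q, bool guard)
{
    if (guard && q->space < q->high_mark) {
        q->dropped++;
        return -1;
    }
    driver_queue_one(q);
    return 0;
}
```

Replaying the sequence "check passes, queue fills and stops, transmit runs"
with guard set to false consumes space below high_mark; repeated often
enough, space would go negative, which is the model's stand-in for the write
pointer passing the read pointer. With guard set to true the late frame is
counted in dropped and space is left alone.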

Index: kernel.spec
===================================================================
RCS file: /cvs/pkgs/rpms/kernel/F-11/kernel.spec,v
retrieving revision 1.1679.2.4
retrieving revision 1.1679.2.5
diff -u -p -r1.1679.2.4 -r1.1679.2.5
--- kernel.spec	10 Aug 2009 18:32:32 -0000	1.1679.2.4
+++ kernel.spec	12 Aug 2009 18:13:10 -0000	1.1679.2.5
@@ -687,6 +687,7 @@ Patch687: mac80211-don-t-drop-nullfunc-f
 Patch690: iwl3945-release-resources-before-shutting-down.patch
 Patch691: iwl3945-add-debugging-for-wrong-command-queue.patch
 Patch692: iwl3945-fix-rfkill-sw-and-hw-mishmash.patch
+Patch693: linux-2.6-iwlwifi_-fix-TX-queue-race.patch
 
 Patch700: linux-2.6-dma-debug-fixes.patch
 
@@ -1411,6 +1412,9 @@ ApplyPatch iwl3945-release-resources-bef
 ApplyPatch iwl3945-add-debugging-for-wrong-command-queue.patch
 ApplyPatch iwl3945-fix-rfkill-sw-and-hw-mishmash.patch
 
+# iwlwifi: fix TX queue race
+ApplyPatch linux-2.6-iwlwifi_-fix-TX-queue-race.patch
+
 # Fix up DMA debug code
 ApplyPatch linux-2.6-dma-debug-fixes.patch
 
@@ -2131,6 +2135,9 @@ fi
 # and build.
 
 %changelog
+* Wed Aug 12 2009 John W. Linville <linville at redhat.com> 2.6.29.6-217.2.5
+- iwlwifi: fix TX queue race
+
 * Mon Aug 10 2009 Jarod Wilson <jarod at redhat.com> 2.6.29.6-217.2.4
 - Add tunable pad threshold support to lirc_imon
 - Blacklist all iMON devices in usbhid driver so lirc_imon can bind

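Editor's sketch (not part of the commit): the guard the patch adds to
iwl_tx_skb(), dropping the frame when iwl_queue_space(q) is below
q->high_mark, can be illustrated on a plain circular buffer. This is a
minimal sketch under stated assumptions: struct ring, ring_space() and
ring_tx() are hypothetical names, the size is fixed at 8 slots, high_mark is
assumed to be at least 1, and the space calculation is a generic ring-buffer
formula rather than the driver's own.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical ring; not the driver's struct iwl_queue. */
#define RING_SIZE 8u   /* power of two, so indices wrap with a mask */

struct ring {
    unsigned read_ptr;   /* next slot the consumer will reclaim */
    unsigned write_ptr;  /* next slot the producer will fill */
    unsigned high_mark;  /* refuse frames when free space is below this;
                          * assumed >= 1 in this sketch */
};

/* Free slots. One slot always stays empty so that a full ring and an
 * empty ring do not both look like read_ptr == write_ptr. */
static unsigned ring_space(const struct ring *r)
{
    unsigned used = (r->write_ptr - r->read_ptr) & (RING_SIZE - 1);
    return RING_SIZE - 1 - used;
}

/* Queue one frame. Returns true if a slot was taken, false if the frame
 * was dropped because the ring is too full. In the real driver the check
 * and the pointer update both happen with priv->lock held; this sketch is
 * single-threaded and has no lock. */
static bool ring_tx(struct ring *r)
{
    if (ring_space(r) < r->high_mark)
        return false;                      /* drop instead of overrunning */

    r->write_ptr = (r->write_ptr + 1) & (RING_SIZE - 1);
    assert(r->write_ptr != r->read_ptr);   /* write must never catch read */
    return true;
}
```

With high_mark set to 2, repeated calls to ring_tx() accept six frames and
then start returning false; once the consumer advances read_ptr, frames are
accepted again. The point of the design is that the full check lives in the
same critical section as the pointer update, so a stale "not stopped" answer
from the caller cannot push write_ptr past read_ptr.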


