xhci: Fix failed enqueue in the middle of isoch TD.

When an isochronous transfer is enqueued, xhci_queue_isoc_tx_prepare() will ensure that there is enough room on the transfer rings for all of the isochronous TDs for that URB. However, when xhci_queue_isoc_tx() is enqueueing individual isoc TDs, the prepare_transfer() function can fail if the endpoint state has changed to disabled, error, or some other unknown state. With the current code, if Nth TD (not the first TD) fails, the ring is left in a sorry state. The partially enqueued TDs are left on the ring, and the first TRB of the TD is not given back to the hardware. The enqueue pointer is left on the TRB after the last successfully enqueued TD. This means the ring is basically useless. Any new transfers will be enqueued after the failed TDs, which the hardware will never read because the cycle bit indicates it does not own them. The ring will fill up with untransferred TDs, and the endpoint will be basically unusable. The untransferred TDs will also remain on the TD list. Since the td_list is a FIFO, this basically means the ring handler will be waiting on TDs that will never be completed (or worse, dereference memory that doesn't exist any more). Change the code to clean up the isochronous ring after a failed transfer. If the first TD failed, simply return and allow the xhci_urb_enqueue function to free the urb_priv. If the Nth TD failed, first remove the TDs from the td_list. Then convert the TRBs that were enqueued into No-op TRBs. Make sure to flip the cycle bit on all enqueued TRBs (including any link TRBs in the middle or between TDs), but leave the cycle bit of the first TRB (which will show software-owned) intact. Then move the ring enqueue pointer back to the first TRB and make sure to change the xhci_ring's cycle state to what is appropriate for that ring segment. This ensures that the No-op TRBs will be overwritten by subsequent TDs, and the hardware will not start executing random TRBs because the cycle bit was left as hardware-owned. This bug is unlikely to be hit, but it was something I noticed while tracking down the watchdog timer issue. I verified that the fix works by injecting some errors on the 250th isochronous URB queued, although I could not verify that the ring is in the correct state because uvcvideo refused to talk to the device after the first usb_submit_urb() failed. Ring debugging shows that the ring looks correct, however. This patch should be backported to kernels as old as 2.6.36. Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Cc: Andiry Xu <andiry.xu@amd.com> Cc: stable@kernel.org

xhci: Fix failed enqueue in the middle of isoch TD.
When an isochronous transfer is enqueued, xhci_queue_isoc_tx_prepare() will ensure that there is enough room on the transfer rings for all of the isochronous TDs for that URB. However, when xhci_queue_isoc_tx() is enqueueing individual isoc TDs, the prepare_transfer() function can fail if the endpoint state has changed to disabled, error, or some other unknown state. With the current code, if Nth TD (not the first TD) fails, the ring is left in a sorry state. The partially enqueued TDs are left on the ring, and the first TRB of the TD is not given back to the hardware. The enqueue pointer is left on the TRB after the last successfully enqueued TD. This means the ring is basically useless. Any new transfers will be enqueued after the failed TDs, which the hardware will never read because the cycle bit indicates it does not own them. The ring will fill up with untransferred TDs, and the endpoint will be basically unusable. The untransferred TDs will also remain on the TD list. Since the td_list is a FIFO, this basically means the ring handler will be waiting on TDs that will never be completed (or worse, dereference memory that doesn't exist any more). Change the code to clean up the isochronous ring after a failed transfer. If the first TD failed, simply return and allow the xhci_urb_enqueue function to free the urb_priv. If the Nth TD failed, first remove the TDs from the td_list. Then convert the TRBs that were enqueued into No-op TRBs. Make sure to flip the cycle bit on all enqueued TRBs (including any link TRBs in the middle or between TDs), but leave the cycle bit of the first TRB (which will show software-owned) intact. Then move the ring enqueue pointer back to the first TRB and make sure to change the xhci_ring's cycle state to what is appropriate for that ring segment. This ensures that the No-op TRBs will be overwritten by subsequent TDs, and the hardware will not start executing random TRBs because the cycle bit was left as hardware-owned. This bug is unlikely to be hit, but it was something I noticed while tracking down the watchdog timer issue. I verified that the fix works by injecting some errors on the 250th isochronous URB queued, although I could not verify that the ring is in the correct state because uvcvideo refused to talk to the device after the first usb_submit_urb() failed. Ring debugging shows that the ring looks correct, however. This patch should be backported to kernels as old as 2.6.36. Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Cc: Andiry Xu <andiry.xu@amd.com> Cc: stable@kernel.org
522989a2 · Sarah Sharp · d13565c1 · 522989a2
Commit 522989a2 authored Jul 29, 2011 by Sarah Sharp
Show whitespace changes
Inline Side-by-side

Showing with 44 additions and 6 deletions

drivers/usb/host/xhci-ring.c drivers/usb/host/xhci-ring.c +44 -6

No files found.
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -514,8 +514,12 @@ void xhci_find_new_dequeue_state(struct xhci_hcd *xhci,
 			(unsigned long long) addr);
 }

+/* flip_cycle means flip the cycle bit of all but the first and last TRB.
+ * (The last TRB actually points to the ring enqueue pointer, which is not part
+ * of this TD.)  This is used to remove partially enqueued isoc TDs from a ring.
+ */
 static void td_to_noop(struct xhci_hcd *xhci, struct xhci_ring *ep_ring,
-		struct xhci_td *cur_td)
+		struct xhci_td *cur_td, bool flip_cycle)
 {
 	struct xhci_segment *cur_seg;
 	union xhci_trb *cur_trb;
@@ -528,6 +532,12 @@ static void td_to_noop(struct xhci_hcd *xhci, struct xhci_ring *ep_ring,
 			 * leave the pointers intact.
 			 */
 			cur_trb->generic.field[3] &= cpu_to_le32(~TRB_CHAIN);
+			/* Flip the cycle bit (link TRBs can't be the first
+			 * or last TRB).
+			 */
+			if (flip_cycle)
+				cur_trb->generic.field[3] ^=
+					cpu_to_le32(TRB_CYCLE);
 			xhci_dbg(xhci, "Cancel (unchain) link TRB\n");
 			xhci_dbg(xhci, "Address = %p (0x%llx dma); "
 					"in seg %p (0x%llx dma)\n",
@@ -541,6 +551,11 @@ static void td_to_noop(struct xhci_hcd *xhci, struct xhci_ring *ep_ring,
 			cur_trb->generic.field[2] = 0;
 			/* Preserve only the cycle bit of this TRB */
 			cur_trb->generic.field[3] &= cpu_to_le32(TRB_CYCLE);
+			/* Flip the cycle bit except on the first or last TRB */
+			if (flip_cycle && cur_trb != cur_td->first_trb &&
+					cur_trb != cur_td->last_trb)
+				cur_trb->generic.field[3] ^=
+					cpu_to_le32(TRB_CYCLE);
 			cur_trb->generic.field[3] |= cpu_to_le32(
 				TRB_TYPE(TRB_TR_NOOP));
 			xhci_dbg(xhci, "Cancel TRB %p (0x%llx dma) "
@@ -719,7 +734,7 @@ static void handle_stopped_endpoint(struct xhci_hcd *xhci,
 					cur_td->urb->stream_id,
 					cur_td, &deq_state);
 		else
-			td_to_noop(xhci, ep_ring, cur_td);
+			td_to_noop(xhci, ep_ring, cur_td, false);
 remove_finished_td:
 		/*
 		 * The event handler won't see a completion for this TD anymore,
@@ -3223,6 +3238,7 @@ static int xhci_queue_isoc_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
 	start_trb = &ep_ring->enqueue->generic;
 	start_cycle = ep_ring->cycle_state;

+	urb_priv = urb->hcpriv;
 	/* Queue the first TRB, even if it's zero-length */
 	for (i = 0; i < num_tds; i++) {
 		unsigned int total_packet_count;
@@ -3246,12 +3262,13 @@ static int xhci_queue_isoc_tx(struct xhci_hcd *xhci, gfp_t mem_flags,

 		ret = prepare_transfer(xhci, xhci->devs[slot_id], ep_index,
 				urb->stream_id, trbs_per_td, urb, i, mem_flags);
-		if (ret < 0)
+		if (ret < 0) {
+			if (i == 0)
 				return ret;
+			goto cleanup;
+		}

-		urb_priv = urb->hcpriv;
 		td = urb_priv->td[i];
-
 		for (j = 0; j < trbs_per_td; j++) {
 			u32 remainder = 0;
 			field = TRB_TBC(burst_count) | TRB_TLBPC(residue);
@@ -3341,6 +3358,27 @@ static int xhci_queue_isoc_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
 	giveback_first_trb(xhci, slot_id, ep_index, urb->stream_id,
 			start_cycle, start_trb);
 	return 0;
+cleanup:
+	/* Clean up a partially enqueued isoc transfer. */
+
+	for (i--; i >= 0; i--)
+		list_del(&urb_priv->td[i]->td_list);
+
+	/* Use the first TD as a temporary variable to turn the TDs we've queued
+	 * into No-ops with a software-owned cycle bit. That way the hardware
+	 * won't accidentally start executing bogus TDs when we partially
+	 * overwrite them.  td->first_trb and td->start_seg are already set.
+	 */
+	urb_priv->td[0]->last_trb = ep_ring->enqueue;
+	/* Every TRB except the first & last will have its cycle bit flipped. */
+	td_to_noop(xhci, ep_ring, urb_priv->td[0], true);
+
+	/* Reset the ring enqueue back to the first TRB and its cycle bit. */
+	ep_ring->enqueue = urb_priv->td[0]->first_trb;
+	ep_ring->enq_seg = urb_priv->td[0]->start_seg;
+	ep_ring->cycle_state = start_cycle;
+	usb_hcd_unlink_urb_from_ep(bus_to_hcd(urb->dev->bus), urb);
+	return ret;
 }

 /*