Commit 40ce7919 authored by Marvin Liu's avatar Marvin Liu Committed by Michael S. Tsirkin

virtio_ring: fix stalls for packed rings

When VIRTIO_F_RING_EVENT_IDX is negotiated, virtio devices can
use virtqueue_enable_cb_delayed_packed to reduce the number of device
interrupts.  At the moment, this is the case for virtio-net when the
napi_tx module parameter is set to false.

In this case, the virtio driver selects an event offset and expects that
the device will send a notification when rolling over the event offset
in the ring.  However, if this roll-over happens before the event
suppression structure update, the notification won't be sent. To address
this race condition the driver needs to check wether the device rolled
over the offset after updating the event suppression structure.

With VIRTIO_F_RING_PACKED, the virtio driver did this by reading the
flags field of the descriptor at the specified offset.

Unfortunately, checking at the event offset isn't reliable: if
descriptors are chained (e.g. when INDIRECT is off) not all descriptors
are overwritten by the device, so it's possible that the device skipped
the specific descriptor driver is checking when writing out used
descriptors. If this happens, the driver won't detect the race condition
and will incorrectly expect the device to send a notification.

For virtio-net, the result will be a TX queue stall, with the
transmission getting blocked forever.

With the packed ring, it isn't easy to find a location which is
guaranteed to change upon the roll-over, except the next device
descriptor, as described in the spec:

        Writes of device and driver descriptors can generally be
        reordered, but each side (driver and device) are only required to
        poll (or test) a single location in memory: the next device descriptor after
        the one they processed previously, in circular order.

while this might be sub-optimal, let's do exactly this for now.

Cc: stable@vger.kernel.org
Cc: Jason Wang <jasowang@redhat.com>
Fixes: f51f9826 ("virtio_ring: leverage event idx in packed ring")
Signed-off-by: default avatarMarvin Liu <yong.liu@intel.com>
Signed-off-by: default avatarMichael S. Tsirkin <mst@redhat.com>
parent 7d194c21
...@@ -1499,9 +1499,6 @@ static bool virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq) ...@@ -1499,9 +1499,6 @@ static bool virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq)
* counter first before updating event flags. * counter first before updating event flags.
*/ */
virtio_wmb(vq->weak_barriers); virtio_wmb(vq->weak_barriers);
} else {
used_idx = vq->last_used_idx;
wrap_counter = vq->packed.used_wrap_counter;
} }
if (vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DISABLE) { if (vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DISABLE) {
...@@ -1518,7 +1515,9 @@ static bool virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq) ...@@ -1518,7 +1515,9 @@ static bool virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq)
*/ */
virtio_mb(vq->weak_barriers); virtio_mb(vq->weak_barriers);
if (is_used_desc_packed(vq, used_idx, wrap_counter)) { if (is_used_desc_packed(vq,
vq->last_used_idx,
vq->packed.used_wrap_counter)) {
END_USE(vq); END_USE(vq);
return false; return false;
} }
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment