Commit 35b52c70 authored by Tina Yang's avatar Tina Yang Committed by Andy Grover

RDS: Fix corrupted rds_mrs

On second look at this bug (OFED #2002), it seems that the
collision is not with the retransmission queue (packet acked
by the peer), but with the local send completion.  A theoretical
sequence of events (from time t0 to t3) is thought to be as
follows,

Thread #1
t0:
    sock_release
    rds_release
    rds_send_drop_to /* wait on send completion */
t2:
    rds_rdma_drop_keys()   /* destroy & free all mrs */

Thread #2
t1:
    rds_ib_send_cq_comp_handler
    rds_ib_send_unmap_rm
    rds_message_unmapped   /* wake up #1 @ t0 */
t3:
    rds_message_put
    rds_message_purge
    rds_mr_put   /* memory corruption detected */

The problem with the rds_rdma_drop_keys() is it could
remove a mr's refcount more than its due (i.e. repeatedly
as long as it still remains in the tree (mr->r_refcount > 0)).
Theoretically it should remove only one reference - reference
by the tree.

        /* Release any MRs associated with this socket */
        while ((node = rb_first(&rs->rs_rdma_keys))) {
                mr = container_of(node, struct rds_mr, r_rb_node);
                if (mr->r_trans == rs->rs_transport)
                        mr->r_invalidate = 0;
                rds_mr_put(mr);
        }

I think the correct way of doing it is to remove the mr from
the tree and rds_destroy_mr it first, then a rds_mr_put()
to decrement its reference count by one.  Whichever thread
holds the last reference will free the mr via rds_mr_put().
Signed-off-by: default avatarTina Yang <tina.yang@oracle.com>
Signed-off-by: default avatarAndy Grover <andy.grover@oracle.com>
parent 9e2effba
...@@ -130,14 +130,22 @@ void rds_rdma_drop_keys(struct rds_sock *rs) ...@@ -130,14 +130,22 @@ void rds_rdma_drop_keys(struct rds_sock *rs)
{ {
struct rds_mr *mr; struct rds_mr *mr;
struct rb_node *node; struct rb_node *node;
unsigned long flags;
/* Release any MRs associated with this socket */ /* Release any MRs associated with this socket */
spin_lock_irqsave(&rs->rs_rdma_lock, flags);
while ((node = rb_first(&rs->rs_rdma_keys))) { while ((node = rb_first(&rs->rs_rdma_keys))) {
mr = container_of(node, struct rds_mr, r_rb_node); mr = container_of(node, struct rds_mr, r_rb_node);
if (mr->r_trans == rs->rs_transport) if (mr->r_trans == rs->rs_transport)
mr->r_invalidate = 0; mr->r_invalidate = 0;
rb_erase(&mr->r_rb_node, &rs->rs_rdma_keys);
RB_CLEAR_NODE(&mr->r_rb_node);
spin_unlock_irqrestore(&rs->rs_rdma_lock, flags);
rds_destroy_mr(mr);
rds_mr_put(mr); rds_mr_put(mr);
spin_lock_irqsave(&rs->rs_rdma_lock, flags);
} }
spin_unlock_irqrestore(&rs->rs_rdma_lock, flags);
if (rs->rs_transport && rs->rs_transport->flush_mrs) if (rs->rs_transport && rs->rs_transport->flush_mrs)
rs->rs_transport->flush_mrs(); rs->rs_transport->flush_mrs();
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment