• Konrad Rzeszutek Wilk's avatar
    xen/blkback: Check for insane amounts of request on the ring (v6). · 8e3f8755
    Konrad Rzeszutek Wilk authored
    Check that the ring does not have an insane amount of requests
    (more than there could fit on the ring).
    
    If we detect this case we will stop processing the requests
    and wait until the XenBus disconnects the ring.
    
    The existing check RING_REQUEST_CONS_OVERFLOW which checks for how
    many responses we have created in the past (rsp_prod_pvt) vs
    requests consumed (req_cons) and whether said difference is greater or
    equal to the size of the ring, does not catch this case.
    
    Wha the condition does check if there is a need to process more
    as we still have a backlog of responses to finish. Note that both
    of those values (rsp_prod_pvt and req_cons) are not exposed on the
    shared ring.
    
    To understand this problem a mini crash course in ring protocol
    response/request updates is in place.
    
    There are four entries: req_prod and rsp_prod; req_event and rsp_event
    to track the ring entries. We are only concerned about the first two -
    which set the tone of this bug.
    
    The req_prod is a value incremented by frontend for each request put
    on the ring. Conversely the rsp_prod is a value incremented by the backend
    for each response put on the ring (rsp_prod gets set by rsp_prod_pvt when
    pushing the responses on the ring).  Both values can
    wrap and are modulo the size of the ring (in block case that is 32).
    Please see RING_GET_REQUEST and RING_GET_RESPONSE for the more details.
    
    The culprit here is that if the difference between the
    req_prod and req_cons is greater than the ring size we have a problem.
    Fortunately for us, the '__do_block_io_op' loop:
    
    	rc = blk_rings->common.req_cons;
    	rp = blk_rings->common.sring->req_prod;
    
    	while (rc != rp) {
    
    		..
    		blk_rings->common.req_cons = ++rc; /* before make_response() */
    
    	}
    
    will loop up to the point when rc == rp. The macros inside of the
    loop (RING_GET_REQUEST) is smart and is indexing based on the modulo
    of the ring size. If the frontend has provided a bogus req_prod value
    we will loop until the 'rc == rp' - which means we could be processing
    already processed requests (or responses) often.
    
    The reason the RING_REQUEST_CONS_OVERFLOW is not helping here is
    b/c it only tracks how many responses we have internally produced
    and whether we would should process more. The astute reader will
    notice that the macro RING_REQUEST_CONS_OVERFLOW provides two
    arguments - more on this later.
    
    For example, if we were to enter this function with these values:
    
           	blk_rings->common.sring->req_prod =  X+31415 (X is the value from
    		the last time __do_block_io_op was called).
            blk_rings->common.req_cons = X
            blk_rings->common.rsp_prod_pvt = X
    
    The RING_REQUEST_CONS_OVERFLOW(&blk_rings->common, blk_rings->common.req_cons)
    is doing:
    
    	req_cons - rsp_prod_pvt >= 32
    
    Which is,
    	X - X >= 32 or 0 >= 32
    
    And that is false, so we continue on looping (this bug).
    
    If we re-use said macro RING_REQUEST_CONS_OVERFLOW and pass in the rp
    instead (sring->req_prod) of rc, the this macro can do the check:
    
         req_prod - rsp_prov_pvt >= 32
    
    Which is,
           X + 31415 - X >= 32 , or 31415 >= 32
    
    which is true, so we can error out and break out of the function.
    
    Unfortunatly the difference between rsp_prov_pvt and req_prod can be
    at 32 (which would error out in the macro). This condition exists when
    the backend is lagging behind with the responses and still has not finished
    responding to all of them (so make_response has not been called), and
    the rsp_prov_pvt + 32 == req_cons. This ends up with us not being able
    to use said macro.
    
    Hence introducing a new macro called RING_REQUEST_PROD_OVERFLOW which does
    a simple check of:
    
        req_prod - rsp_prod_pvt > RING_SIZE
    
    And with the X values from above:
    
       X + 31415 - X > 32
    
    Returns true. Also not that if the ring is full (which is where
    the RING_REQUEST_CONS_OVERFLOW triggered), we would not hit the
    same condition:
    
       X + 32 - X > 32
    
    Which is false.
    
    Lets use that macro.
    Note that in v5 of this patchset the macro was different - we used an
    earlier version.
    
    Cc: stable@vger.kernel.org
    [v1: Move the check outside the loop]
    [v2: Add a pr_warn as suggested by David]
    [v3: Use RING_REQUEST_CONS_OVERFLOW as suggested by Jan]
    [v4: Move wake_up after kthread_stop as suggested by Jan]
    [v5: Use RING_REQUEST_PROD_OVERFLOW instead]
    [v6: Use RING_REQUEST_PROD_OVERFLOW - Jan's version]
    Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    Reviewed-by: default avatarJan Beulich <jbeulich@suse.com>
    
    gadsa
    8e3f8755
common.h 15.6 KB