    cfq-iosched: fix cfq_cic_link() race condition · 5eb46851
    Yasuaki Ishimatsu authored
    cfq_cic_link() has a race condition. When several processes that share an
    ioc issue I/O to the same block device simultaneously, cfq_cic_link()
    sometimes returns -EEXIST. The race can stall I/O through the following steps:
    
    step  1: Process A: Issue an I/O to /dev/sda
    step  2: Process A: Get an ioc (iocA here) in get_io_context() which is not
    		    yet linked with a cic for the device
    step  3: Process A: Get a new cic for the device (cicA here) in
    		    cfq_alloc_io_context()
    
    step  4: Process B: Issue an I/O to /dev/sda
    step  5: Process B: Get iocA in get_io_context() since process A and B share the
    		    same ioc
    step  6: Process B: Get a new cic for the device (cicB here) in
    		    cfq_alloc_io_context() since iocA has not been linked with a
    		    cic for the device yet
    
    step  7: Process A: Link cicA to iocA in cfq_cic_link()
    step  8: Process A: Dispatch I/O to driver and finish it
    
    step  9: Process B: Try to link cicB to iocA in cfq_cic_link()
    		    This fails and prints the "cfq: cic link failed!" kernel
    		    message, since iocA was already linked with cicA at step 7.
    step 10: Process B: Wait for I/O to finish in get_request_wait()
    		    Nothing wakes the function up when there is no I/O to the
    		    device.
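
    The interleaving above can be replayed deterministically in user space.
    The following is a minimal sketch, not kernel code: struct model_ioc,
    struct model_cic, model_cic_link() and replay_race() are hypothetical
    stand-ins that assume one device slot per ioc. It only shows why the
    second link attempt at step 9 sees -EEXIST.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for the kernel objects (assumption). */
struct model_cic { char owner; };                /* per-device context, tagged by creator */
struct model_ioc { struct model_cic *linked; };  /* shared I/O context, one device slot */

/* Models the behaviour described for cfq_cic_link(): a second link is refused. */
static int model_cic_link(struct model_ioc *ioc, struct model_cic *cic)
{
    if (ioc->linked)
        return -EEXIST;
    ioc->linked = cic;
    return 0;
}

/* Replays steps 1-9 in order and returns the result of process B's link attempt. */
static int replay_race(void)
{
    struct model_ioc iocA = { NULL };
    struct model_cic cicA = { 'A' };  /* step 3: A allocates while iocA is unlinked */
    struct model_cic cicB = { 'B' };  /* step 6: B allocates, iocA is still unlinked */
    int ret;

    ret = model_cic_link(&iocA, &cicA);   /* step 7: A links first and succeeds */
    assert(ret == 0);
    (void)ret;

    return model_cic_link(&iocA, &cicB);  /* step 9: B loses the race */
}
```

    Calling replay_race() returns -EEXIST, which corresponds to the failure
    at step 9.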
    
    When cfq_cic_link() returns -EEXIST, it means the ioc has already been linked
    with a cic. So when cfq_cic_link() returns -EEXIST, retry cfq_cic_lookup().
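
    The retry can be illustrated with a small user-space sketch. This is not
    the patch itself: every sketch_* name below is hypothetical, the ioc is
    assumed to hold a single device slot, and the one-shot hook exists only
    to let "process A" run between process B's allocation and link.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins; not the kernel's data structures. */
struct sketch_cic { char owner; };
struct sketch_ioc { struct sketch_cic *linked; };

static struct sketch_cic *sketch_lookup(struct sketch_ioc *ioc)
{
    return ioc->linked;
}

static int sketch_link(struct sketch_ioc *ioc, struct sketch_cic *cic)
{
    if (ioc->linked)
        return -EEXIST;
    ioc->linked = cic;
    return 0;
}

/* One-shot hook that runs "another process" between alloc and link. */
static void (*sketch_race_hook)(struct sketch_ioc *);

/* Look up, else allocate and link; on -EEXIST drop our copy and look up again. */
static struct sketch_cic *sketch_get_cic(struct sketch_ioc *ioc, char owner)
{
    for (;;) {
        struct sketch_cic *cic = sketch_lookup(ioc);
        int ret;

        if (cic)
            return cic;

        cic = malloc(sizeof(*cic));
        if (!cic)
            return NULL;
        cic->owner = owner;

        if (sketch_race_hook) {
            void (*hook)(struct sketch_ioc *) = sketch_race_hook;
            sketch_race_hook = NULL;   /* fire once */
            hook(ioc);
        }

        ret = sketch_link(ioc, cic);
        if (ret == 0)
            return cic;

        free(cic);
        if (ret != -EEXIST)
            return NULL;
        /* -EEXIST: another sharer linked first, so the next lookup will hit. */
    }
}

/* Test helper: process A links its cic while B sits between alloc and link. */
static void sketch_process_a_links(struct sketch_ioc *ioc)
{
    static struct sketch_cic cicA = { 'A' };
    int ret = sketch_link(ioc, &cicA);

    assert(ret == 0);
    (void)ret;
}

/* Returns the owner tag of the cic that process B ends up using. */
static char sketch_owner_seen_by_b(void)
{
    struct sketch_ioc iocA = { NULL };
    struct sketch_cic *cic;

    sketch_race_hook = sketch_process_a_links;
    cic = sketch_get_cic(&iocA, 'B');
    return cic ? cic->owner : '?';
}
```

    In this model process B discards its own cic after -EEXIST and continues
    with the one process A linked, instead of failing the request.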

    Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
    Cc: stable@kernel.org
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
cfq-iosched.c 108 KB