1. 13 Dec, 2011 7 commits
    • block: make ioc get/put interface more conventional and fix race on allocation · 6e736be7
      Tejun Heo authored
      Ignoring copy_io() during fork, io_context can be allocated from two
      places - current_io_context() and set_task_ioprio().  The former is
      always called from the local task while the latter can be called from
      a different task.  The synchronization between them is peculiar and
      dubious.
      
      * current_io_context() doesn't grab task_lock() and assumes that if it
        saw %NULL ->io_context, it would stay that way until allocation and
        assignment is complete.  It has smp_wmb() between alloc/init and
        assignment.
      
      * set_task_ioprio() grabs task_lock() for assignment and does
        smp_read_barrier_depends() between "ioc = task->io_context" and "if
        (ioc)".  Unfortunately, this doesn't achieve anything - the latter
        is not a dependent load of the former.  i.e., if ioc itself were
        being dereferenced as "ioc->xxx", it would mean something (not sure
        what, though), but as the code currently stands, the dependent read
        barrier is a noop.
      
      As only one of the two test-assignment sequences is task_lock()
      protected, task_lock() can't do much about the race between the two.
      Nothing prevents current_io_context() and set_task_ioprio() from each
      allocating its own ioc for the same task and overwriting the other's.
      
      Also, set_task_ioprio() can race with an exiting task and create a
      new ioc after exit_io_context() has finished.
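
      A minimal sketch of the racy pre-patch pattern, with both paths
      reduced to their check-then-install skeletons (error handling and
      most of the init elided; this illustrates the description above, not
      the exact code):

          /* current_io_context(): lockless, only ever called on %current */
          struct io_context *current_io_context(gfp_t gfp, int node)
          {
                  struct io_context *ioc = current->io_context;

                  if (ioc)
                          return ioc;

                  ioc = alloc_io_context(gfp, node);      /* A1 */
                  smp_wmb();
                  current->io_context = ioc;              /* A2 */
                  return ioc;
          }

          /* set_task_ioprio(): may run concurrently for the same task */
          int set_task_ioprio(struct task_struct *task, int ioprio)
          {
                  struct io_context *ioc = task->io_context;

                  if (!ioc) {
                          ioc = alloc_io_context(GFP_ATOMIC, -1);  /* B1 */
                          task_lock(task);
                          task->io_context = ioc;                  /* B2 */
                          task_unlock(task);
                  }
                  ioc->ioprio = ioprio;
                  return 0;
          }

      If B1/B2 executes between A1 and A2 (or the other way around), each
      path installs its own ioc for the same task and one silently
      overwrites the other's, since only one of the two sequences holds
      task_lock().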
      
      ioc get/put doesn't have any reason to be complex.  The only hot path
      is accessing the existing ioc of %current, which is simple to achieve
      given that ->io_context is never destroyed as long as the task is
      alive.  All other paths can happily go through task_lock() like all
      other task substructures without impacting anything.
      
      This patch updates ioc get/put so that it becomes more conventional.
      
      * alloc_io_context() is replaced with get_task_io_context().  This is
        the only interface which can acquire access to the ioc of another
        task.  On return, the caller has an explicit reference to the
        object, which should be dropped with put_io_context() afterwards
        (see the usage sketch after this list).
      
      * The functionality of current_io_context() remains the same but when
        creating a new ioc, it shares the code path with
        get_task_io_context() and always goes through task_lock().
      
      * get_io_context() now means incrementing ref on an ioc which the
        caller already has access to (be that an explicit refcnt or implicit
        %current one).
      
      * PF_EXITING inhibits creation of new io_context and once
        exit_io_context() is finished, it's guaranteed that both ioc
        acquisition functions return %NULL.
      
      * All users are updated.  Most are trivial but
        smp_read_barrier_depends() removal from cfq_get_io_context() needs a
        bit of explanation.  I suppose the original intention was to ensure
        ioc->ioprio is visible when set_task_ioprio() allocates new
        io_context and installs it; however, this wouldn't have worked
        because set_task_ioprio() doesn't have wmb between init and install.
        There are other problems with this which will be fixed in another
        patch.
      
      * While at it, use NUMA_NO_NODE instead of -1 for wildcard node
        specification.
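
      A usage sketch of the new interface (the gfp/node arguments follow
      the description above; treat the exact parameter list as an
      assumption):

          struct io_context *ioc;

          /* Take an explicit reference on another task's ioc, creating
           * it if needed.  Returns NULL once the task is PF_EXITING and
           * exit_io_context() has finished. */
          ioc = get_task_io_context(task, GFP_NOIO, NUMA_NO_NODE);
          if (ioc) {
                  /* ... inspect or modify ioc ... */
                  put_io_context(ioc);    /* drop the explicit reference */
          }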
      
      -v2: Vivek spotted contamination from debug patch.  Removed.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block: misc ioc cleanups · 42ec57a8
      Tejun Heo authored
      * The int return from put_io_context() wasn't used by anybody.  Make
        it return void like other put functions and docbook-fy the function
        comment.
      
      * Reorder dummy declarations for !CONFIG_BLOCK case a bit.
      
      * Make alloc_io_context() use __GFP_ZERO allocation, move the init
        out of the if block, and drop the explicit zeroing.
      
      * Docbook-fy current_io_context() comment.
      
      This patch doesn't introduce any functional change.
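
      A sketch of the allocation change (cache name per blk-ioc.c; the
      surrounding details are simplified):

          /* __GFP_ZERO zeroes the whole struct up front, so the
           * field-by-field 0'ing in the old if block can go away;
           * only non-zero initialization remains. */
          ioc = kmem_cache_alloc_node(iocontext_cachep,
                                      gfp_flags | __GFP_ZERO, node);
          if (ioc)
                  atomic_long_set(&ioc->refcount, 1);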
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block, cfq: move cfqd->cic_index to q->id · a73f730d
      Tejun Heo authored
      cfq allocates a per-queue id using ida and uses it to index the cic
      radix tree from io_context.  Move it to q->id, allocating it on queue
      init and freeing it on queue release.  This simplifies cfq a bit and
      will allow further improvements to io_context life-cycle management.
      
      This patch doesn't introduce any functional difference.
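
      A sketch of what the q->id side might look like (ida helper usage
      assumed; blk_queue_ida is a hypothetical name here):

          static DEFINE_IDA(blk_queue_ida);

          /* queue init */
          q->id = ida_simple_get(&blk_queue_ida, 0, 0, GFP_KERNEL);
          if (q->id < 0)
                  goto fail;

          /* queue release */
          ida_simple_remove(&blk_queue_ida, q->id);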
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block: add missing blk_queue_dead() checks · 8ba61435
      Tejun Heo authored
      blk_insert_cloned_request(), blk_execute_rq_nowait() and
      blk_flush_plug_list() either didn't check whether the queue was dead
      or did it without holding queue_lock.  Update them so that dead state
      is checked while holding queue_lock.
      
      AFAICS, this plugs all holes (requeue doesn't matter as the request is
      transitioning atomically from in_flight to queued).
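
      The pattern, roughly (a sketch based on the description; -ENXIO and
      the end_io handling are assumptions):

          spin_lock_irq(q->queue_lock);
          if (unlikely(blk_queue_dead(q))) {
                  spin_unlock_irq(q->queue_lock);
                  /* fail the request instead of queueing to a dead queue */
                  rq->errors = -ENXIO;
                  if (rq->end_io)
                          rq->end_io(rq, rq->errors);
                  return;
          }
          __elv_add_request(q, rq, where);
          __blk_run_queue(q);
          spin_unlock_irq(q->queue_lock);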
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block: fix drain_all condition in blk_drain_queue() · 481a7d64
      Tejun Heo authored
      When trying to drain all requests, blk_drain_queue() checked only
      q->rq.count[]; however, this only tracks REQ_ALLOCED requests.  This
      patch updates blk_drain_queue() such that it looks at all the counters
      and queues so that request_queue is actually empty on completion.
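
      A sketch of the broader emptiness test inside the drain loop (the
      exact set of counters checked is an assumption based on the
      description):

          bool drain = false;
          int i;

          spin_lock_irq(q->queue_lock);
          drain |= q->rq.elvpriv;
          if (drain_all) {
                  drain |= !list_empty(&q->queue_head);
                  for (i = 0; i < 2; i++) {
                          drain |= q->rq.count[i];     /* REQ_ALLOCED */
                          drain |= q->in_flight[i];    /* issued to driver */
                          drain |= !list_empty(&q->flush_queue[i]);
                  }
          }
          spin_unlock_irq(q->queue_lock);
          /* loop (with a short sleep) until nothing is pending */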
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block: add blk_queue_dead() · 34f6055c
      Tejun Heo authored
      There are a number of open-coded QUEUE_FLAG_DEAD tests.  Add a
      blk_queue_dead() macro and use it.
      
      This patch doesn't introduce any functional difference.
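
      Presumably the helper follows the pattern of the other blk_queue_*()
      flag tests in blkdev.h:

          #define blk_queue_dead(q)  test_bit(QUEUE_FLAG_DEAD, &(q)->queue_flags)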
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block, sx8: kill blk_insert_request() · 1ba64ede
      Tejun Heo authored
      The only user left for blk_insert_request() is sx8, and it can be
      trivially switched to use blk_execute_rq_nowait() - special requests
      aren't included in io stats and sx8 doesn't use block layer tagging.
      Switch sx8 and kill blk_insert_request().
      
      This patch doesn't introduce any functional difference.
      
      Only compile tested.
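
      A sketch of the conversion (q, rq and data stand in for sx8's actual
      variables):

          /* before: the legacy helper marked the request special
           * and queued it in one call */
          blk_insert_request(q, rq, 1, data);

          /* after: set ->special yourself and execute without waiting */
          rq->special = data;
          blk_execute_rq_nowait(q, NULL, rq, 1 /* at_head */, NULL);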
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Jeff Garzik <jgarzik@pobox.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  2. 09 Dec, 2011 33 commits