1. 11 May, 2007 2 commits
    • When stacked block devices are in-use (e.g. md or dm), the recursive calls · d89d8796
      Neil Brown authored
      
      When stacked block devices are in use (e.g. md or dm), the recursive calls
      to generic_make_request can use up a lot of stack space, and we would
      rather they didn't.
      
      As generic_make_request is a void function, and as it is generally not
      expected that it will have any effect immediately, it is safe to delay any
      call to generic_make_request until there is sufficient stack space
      available.
      
      As ->bi_next is reserved for the driver to use, it can have no valid value
      when generic_make_request is called, and as __make_request implicitly
      assumes it will be NULL (ELEVATOR_BACK_MERGE fork of switch) we can be
      certain that all callers set it to NULL.  We can therefore safely use
      bi_next to link pending requests together, providing we clear it before
      making the real call.
      
      So, we choose to allow each thread to only be active in one
      generic_make_request at a time.  If a subsequent (recursive) call is made,
      the bio is linked into a per-thread list, and is handled when the active
      call completes.
      
      As the list of pending bios is per-thread, there are no locking issues to
      worry about.
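
The deferral scheme described above can be sketched in user space as follows. This is a toy model under stated assumptions, not the kernel patch itself: `toy_bio`, `toy_submit`, and `toy_make_request` are hypothetical stand-ins for `struct bio`, `generic_make_request`, and a stacked driver's `make_request_fn`, and the `next` field plays the role of `bi_next`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct bio; only the link field matters here. */
struct toy_bio {
    struct toy_bio *next;  /* plays the role of bi_next */
    int depth;             /* layers of stacked devices still to traverse */
};

/* Per-thread state, so no locking is needed. */
static _Thread_local struct toy_bio *pending_head;
static _Thread_local struct toy_bio **pending_tail;
static _Thread_local int active;       /* nonzero while a submit is running */

/* Bookkeeping for the demonstration only. */
static _Thread_local int handled;      /* calls to toy_make_request */
static _Thread_local int cur_nesting;  /* current nesting of those calls */
static _Thread_local int max_nesting;  /* deepest nesting observed */

static void toy_submit(struct toy_bio *bio);

/* Hypothetical stacked driver: remap the request and pass it down a layer. */
static void toy_make_request(struct toy_bio *bio)
{
    handled++;
    if (++cur_nesting > max_nesting)
        max_nesting = cur_nesting;
    if (bio->depth > 0) {
        bio->depth--;
        toy_submit(bio);   /* "recursive" call: queued, not nested */
    }
    cur_nesting--;
}

static void toy_submit(struct toy_bio *bio)
{
    assert(bio->next == NULL);   /* callers must leave the link unused */

    if (active) {
        /* Already inside a submit on this thread: append in FIFO order. */
        *pending_tail = bio;
        pending_tail = &bio->next;
        return;
    }

    active = 1;
    pending_head = NULL;
    pending_tail = &pending_head;
    do {
        bio->next = NULL;        /* clear the link before the real call */
        toy_make_request(bio);
        bio = pending_head;      /* pick up whatever that call queued */
        if (bio) {
            pending_head = bio->next;
            if (pending_head == NULL)
                pending_tail = &pending_head;
        }
    } while (bio);
    active = 0;
}
```

Submitting a `toy_bio` with `depth` set to 5 runs the driver function six times, but never more than one level deep, because each resubmission is appended to the per-thread list and handled after the active call returns. The list is drained first-in, first-out, so requests are processed in the order in which they were submitted.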
      
      I say above that it is "safe to delay any call...".  There are, however,
      some behaviours of a make_request_fn which would make it unsafe.  These
      include any behaviour that assumes anything will have changed after a
      recursive call to generic_make_request.
      
      These could include:
       - waiting for that call to finish and call its bi_end_io function.
         md used to do this sometimes (marking the superblock dirty before
         completing a write) but doesn't any more.
       - inspecting the bio for fields that generic_make_request might
         change, such as bi_sector or bi_bdev.  It is hard to see a good
         reason for this, and I don't think anyone actually does it.
       - inspecting the queue to see if, e.g. it is 'full' yet.  Again, I
         think this is very unlikely to be useful, or to be done.

      Signed-off-by: Neil Brown <neilb@suse.de>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: <dm-devel@redhat.com>
      
      Alasdair G Kergon <agk@redhat.com> said:
      
       I can see nothing wrong with this in principle.
      
       For device-mapper at the moment though it's essential that, while the bio
       mappings may now get delayed, they still get processed in exactly
       the same order as they were passed to generic_make_request().
      
       My main concern is whether the timing changes implicit in this patch
       will make the rare data-corrupting races in the existing snapshot code
       more likely. (I'm working on a fix for these races, but the unfinished
       patch is already several hundred lines long.)
      
       It would be helpful if some people on this mailing list would test
       this patch in various scenarios and report back.

      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • locks: fix F_GETLK regression (failure to find conflicts) · 129a84de
      J. Bruce Fields authored
      In 9d6a8c5c we changed posix_test_lock
      to modify its single file_lock argument instead of taking separate input
      and output arguments.  This makes it no longer safe to set the output
      lock's fl_type to F_UNLCK before looking for a conflict, since that
      means searching for a conflict against a lock with type F_UNLCK.
      
      This fixes a regression which causes F_GETLK to incorrectly report no
      conflict on most filesystems (including any filesystem that doesn't do
      its own locking).
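
For reference, the user-visible behaviour at stake can be checked with a small program along these lines. It is a minimal sketch, assuming a POSIX system with a writable /tmp; the helper names `probe_lock_type` and `child_sees_conflict` are made up for illustration. The parent takes a whole-file write lock and a forked child probes the same file with F_GETLK; on a correctly working kernel the child should be told about the conflicting lock rather than getting F_UNLCK back.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Ask whether a whole-file write lock could be placed on fd.
 * Returns the l_type filled in by F_GETLK, or -1 on error. */
static int probe_lock_type(int fd)
{
    struct flock fl = {0};
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;                /* 0 means "to end of file" */
    if (fcntl(fd, F_GETLK, &fl) == -1)
        return -1;
    return fl.l_type;
}

/* Returns 1 if a child process sees the parent's write lock via F_GETLK,
 * 0 if the child is told there is no conflict, -1 on any error. */
static int child_sees_conflict(void)
{
    char path[] = "/tmp/getlk-demo-XXXXXX";
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);                /* the file lives on while fd is open */

    struct flock held = {0};
    held.l_type = F_WRLCK;
    held.l_whence = SEEK_SET;
    if (fcntl(fd, F_SETLK, &held) == -1) {
        close(fd);
        return -1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* POSIX record locks are per-process and are not inherited
         * across fork(), so the child is a genuine contender. */
        int type = probe_lock_type(fd);
        _exit(type == F_WRLCK ? 1 : type == F_UNLCK ? 0 : 2);
    }

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    close(fd);
    if (waited == -1 || !WIFEXITED(status) || WEXITSTATUS(status) == 2)
        return -1;
    return WEXITSTATUS(status);
}
```

With the regression described above, a check like this would be expected to report no conflict on filesystems that rely on the generic locking code. A process probing a lock that it holds itself always gets F_UNLCK, which is why the probe is made from a child.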
      
      Also fix posix_lock_to_flock() to copy the lock type.  This isn't
      strictly necessary, since the caller already does this; but it seems
      less likely to cause confusion in the future.
      
      Thanks to Doug Chapman for the bug report.

      Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
      Acked-by: Doug Chapman <doug.chapman@hp.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 10 May, 2007 38 commits