Commits · d54142c71f05b608b7360d80bdab74eed0f17a98 · Kirill Smelkov / linux

07 Aug, 2010 40 commits

blkfront: Klog the unclean release path · d54142c7

Daniel Stodden authored Aug 07, 2010

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

d54142c7

blkfront: Remove obsolete info->users · 7b32d104

Daniel Stodden authored Apr 30, 2010

This is just bd_openers, protected by the bd_mutex.
Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

7b32d104

blkfront: Remove obsolete info->users · acfca3c6

Daniel Stodden authored Aug 07, 2010

This is just bd_openers, protected by the bd_mutex.
Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

acfca3c6

blkfront: Lock blockfront_info during xbdev removal · fa1bd359

Daniel Stodden authored Apr 30, 2010

Same approach as blkfront_closing:
 * Grab the bdev safely, holding the info mutex.
 * Zap xbdev safely, holding the info mutex.
 * Try bdev removal safely, holding bd_mutex.
Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

fa1bd359

blkfront: Fix blkfront backend switch race (bdev release) · 7fd152f4

Daniel Stodden authored Aug 07, 2010

We cannot read backend state within bdev operations, because it risks
grabbing the state change before xenbus gets to do it.

Fixed by tracking deferral with a frontend switch to Closing. State
exposure isn't strictly necessary, but the backends won't mind.

For a 'clean' deferral this seems actually a more decent protocol than
raising errors.
Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

7fd152f4

blkfront: Fix blkfront backend switch race (bdev open) · 13961743

Daniel Stodden authored Aug 07, 2010

We need not mind if users grab a late handle on a closing disk. We
probably even should not. But we have to make sure it's not a dead
one already

Let the bdev deal with a gendisk deleted under its feet. Takes the
info mutex to decide a race against backend closing.
Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

13961743

blkfront: Lock blkfront_info when closing · b70f5fa0

Daniel Stodden authored Apr 30, 2010

The bdev .open/.release fops race against backend switches to Closing,
handled by the XenBus thread.

The original code attempted to serialize block device holders and
xenbus only via bd_mutex. This is insufficient, the info->bd pointer
may already be stale (or null) while xenbus tries to bump up the
refcount.

Protect blkfront_info with a dedicated mutex.
Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

b70f5fa0

blkfront: Clean up vbd release · a66b5aeb

Daniel Stodden authored Aug 07, 2010

 * Current blkfront_closing is rather a xlvbd_release_gendisk.
   Renamed in preparation of later patches (need the name again).

 * Removed the misleading comment -- this only applied to the backend
   switch handler, and the queue is already flushed btw.

 * Break out the xenbus call, callers know better when to switch
   frontend state.
Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

a66b5aeb

blkfront: Fix gendisk leak · 9897cb53

Daniel Stodden authored Apr 30, 2010

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

9897cb53

blkfront: Fix backtrace in del_gendisk · 89de1669

Daniel Stodden authored Apr 30, 2010

The call to del_gendisk follows an non-refcounted gd->queue
pointer. We release the last ref in blk_cleanup_queue. Fixed by
reordering releases accordingly.
Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

89de1669

xenbus: Make xenbus_switch_state transactional · 5b61cb90

Daniel Stodden authored Apr 30, 2010

According to the comments, this was how it's been done years ago, but
apparently took an xbt pointer from elsewhere back then. The code was
removed because of consistency issues: cancellation wont't roll back
the saved xbdev->state.

Still, unsolicited writes to the state field remain an issue,
especially if device shutdown takes thread synchronization, and subtle
races cause accidental recreation of the device node.

Fixed by reintroducing the transaction. An internal one is sufficient,
so the xbdev->state value remains consistent.

Also fixes the original hack to prevent infinite recursion. Instead of
bailing out on the first attempt to switch to Closing, checks call
depth now.
Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

5b61cb90

xen/blkfront: revalidate after setting capacity · 2def141e

K. Y. Srinivasan authored Mar 18, 2010

Signed-off-by: K. Y. Srinivasan <ksrinivasan@novell.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

2def141e

xen/blkfront: avoid compiler warning from missing cases · b4dddb49

Jeremy Fitzhardinge authored Mar 11, 2010

Fix:
drivers/block/xen-blkfront.c: In function ‘blkfront_connect’:
drivers/block/xen-blkfront.c:933: warning: enumeration value ‘BLKIF_STATE_DISCONNECTED’ not handled in switch
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

b4dddb49

xen/front: Propagate changed size of VBDs · 1fa73be6

K. Y. Srinivasan authored Mar 11, 2010

Support dynamic resizing of virtual block devices. This patch supports
both file backed block devices as well as physical devices that can be
dynamically resized on the host side.
Signed-off-by: K. Y. Srinivasan <ksrinivasan@novell.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

1fa73be6

blkfront: don't access freed struct xenbus_device · 5d7ed20e

Jan Beulich authored Aug 07, 2010

Unfortunately commit "blkfront: fixes for 'xm block-detach ... --force'"
still wasn't quite right - there was a reference to freed memory left
from blkfront_closing().
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

5d7ed20e

blkfront: fixes for 'xm block-detach ... --force' · 0e345826

Jan Beulich authored Aug 07, 2010

Prevent prematurely freeing 'struct blkfront_info' instances (when the
xenbus data structures are gone, but the Linux ones are still needed).

Prevent adding a disk with the same (major, minor) [and hence the same
name and sysfs entries, which leads to oopses] when the previous
instance wasn't fully de-allocated yet.

This still doesn't address all issues resulting from forced detach:
I/O submitted after the detach still blocks forever, likely preventing
subsequent un-mounting from completing. It's not clear to me (not
knowing much about the block layer) how this can be avoided.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

0e345826

xen: use less generic names in blkfront driver. · 203fd61f

Ian Campbell authored Dec 04, 2009

All Xen frontend drivers have a couple of identically named functions which
makes figuring out which device went wrong from a stacktrace harder than it
needs to be. Rename them to something specificto the device type.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

203fd61f

writeback.h: needs linux/device.h · 96dccab1

Randy Dunlap authored Jul 19, 2010

include/trace/events/writeback.h uses dev_name(), so it needs to
include linux/device.h.

include/trace/events/writeback.h:12: error: implicit declaration of function 'dev_name'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

96dccab1

block: fix problem with sending down discard that isn't of correct granularity · 10d1f9e2

Jens Axboe authored Jul 15, 2010

If the queue doesn't have a limit set, or it just set UINT_MAX like
we default to, we coud be sending down a discard request that isn't
of the correct granularity if the block size is > 512b.

Fix this by adjusting max_discard_sectors down to the proper
alignment.
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

10d1f9e2

blkdev: check for valid request queue before issuing flush · f10d9f61

Dave Chinner authored Jul 13, 2010

Issuing a blkdev_issue_flush() on an unconfigured loop device causes a panic as
q->make_request_fn is not configured. This can occur when trying to mount the
unconfigured loop device as an XFS filesystem. There are no guards that catch
the bio before the request function is called because we don't add a payload to
the bio. Instead, manually check this case as soon as we have a pointer to the
queue to flush.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

f10d9f61

block: fix for block tracing build error · 2669b19f

Stephen Rothwell authored Jul 09, 2010

block/compat_ioctl.c: In function 'compat_blkdev_ioctl':
block/compat_ioctl.c:754: error: 'BLKTRACESETUP32' undeclared (first use in this function)
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

2669b19f

scsi/i2o: restore ioctl changes · 2daa672b

Arnd Bergmann authored Jul 08, 2010

This restores the changes from "scsi/i2o_block: cleanup ioctl
handling", which accidentally got reverted.

Origignal changelog:
      This fixes the ioctl function of the i2o_block driver, which
      has multiple problems:

      * The BLKI2OSRSTRAT and BLKI2OSWSTRAT commands always return
        -ENOTTY on success, where they should return 0.
      * Support for 32 bit compat is missing
      * The driver should use the .ioctl function and because
        .locked_ioctl is going away.

      The use of the big kernel lock remains for now, but gets
      made explictit in the ioctl function.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

2daa672b

scsi/sd: remove big kernel lock · 409f3499

Arnd Bergmann authored Jul 07, 2010

Every user of the BKL in the sd driver is the
result of the pushdown from the block layer
into the open/close/ioctl functions.

The only place that used to rely on the BKL is
the sdkp->openers variable, which gets converted
into an atomic_t.

Nothing else seems to rely on the BKL, since the
functions do not touch global data without holding
another lock, and the open/close functions are
still protected from concurrent execution using
the bdev->bd_mutex.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: linux-scsi@vger.kernel.org
Cc: "James E.J. Bottomley" <James.Bottomley@suse.de>
Acked-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

409f3499

block: remove BKL from partition ioctls · 15392efb

Arnd Bergmann authored Jul 07, 2010

The blkpg_ioctl and blkdev_reread_part access fields of
the bdev and gendisk structures, yet they always do so
under the protection of bdev->bd_mutex, which seems
sufficient.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
cked-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

15392efb

block: remove BKL from BLKROSET and BLKFLSBUF · 6de43703

Arnd Bergmann authored Jul 07, 2010

We only call the functions set_device_ro(),
invalidate_bdev(), sync_filesystem() and sync_blockdev()
while holding the BKL in these commands. All
of these are also done in other code paths without
the BKL, which leads me to the conclusion that
the BKL is not needed here either.

The reason we hold it here is that it was originally
pushed down into the ioctl function from vfs_ioctl.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

6de43703

block: push BKL into blktrace ioctls · 62c2a7d9

Arnd Bergmann authored Jul 07, 2010

The blktrace driver currently needs the BKL, but
we should not need to take that in the block layer,
so just push it down into the driver itself.

It is quite likely that the BKL is not actually
required in blktrace code and could be removed
in a follow-on patch.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

62c2a7d9

block: push down BKL into .open and .release · 6e9624b8

Arnd Bergmann authored Aug 07, 2010

The open and release block_device_operations are currently
called with the BKL held. In order to change that, we must
first make sure that all drivers that currently rely
on this have no regressions.

This blindly pushes the BKL into all .open and .release
operations for all block drivers to prepare for the
next step. The drivers can subsequently replace the BKL
with their own locks or remove it completely when it can
be shown that it is not needed.

The functions blkdev_get and blkdev_put are the only
remaining users of the big kernel lock in the block
layer, besides a few uses in the ioctl code, none
of which need to serialize with blkdev_{get,put}.

Most of these two functions is also under the protection
of bdev->bd_mutex, including the actual calls to
->open and ->release, and the common code does not
access any global data structures that need the BKL.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

6e9624b8

block: push down BKL into .locked_ioctl · 8a6cfeb6

Arnd Bergmann authored Jul 08, 2010

As a preparation for the removal of the big kernel
lock in the block layer, this removes the BKL
from the common ioctl handling code, moving it
into every single driver still using it.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

8a6cfeb6

scsi/i2o_block: cleanup ioctl handling · 34484062

Arnd Bergmann authored Jul 07, 2010

This fixes the ioctl function of the i2o_block driver, which
has multiple problems:

* The BLKI2OSRSTRAT and BLKI2OSWSTRAT commands always return
  -ENOTTY on success, where they should return 0.
* Support for 32 bit compat is missing
* The driver should use the .ioctl function and because
  .locked_ioctl is going away.

The use of the big kernel lock remains for now, but gets
made explictit in the ioctl function.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

34484062

scsi: fix discard page leak · 610a6349

FUJITA Tomonori authored Jul 08, 2010

We leak a page allocated for discard on some error conditions
(e.g. scsi_prep_state_check returns BLKPREP_DEFER in
scsi_setup_blk_pc_cmnd).

We unprep on requests that weren't prepped in the error path of
scsi_init_io. It makes the error path to clean up scsi commands messy.

Let's strictly apply the rule that we can't unprep on a request that
wasn't prepped.

Calling just scsi_put_command() in the error path of scsi_init_io() is
enough. We don't set REQ_DONTPREP yet.

scsi_setup_discard_cmnd can safely free a page on the error case with
the above rule.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

610a6349

writeback: Add tracing to write_cache_pages · 9e094383

Dave Chinner authored Jul 07, 2010

Add a trace event to the ->writepage loop in write_cache_pages to give
visibility into how the ->writepage call is changing variables within the
writeback control structure. Of most interest is how wbc->nr_to_write changes
from call to call, especially with filesystems that write multiple pages
in ->writepage.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

9e094383

writeback: Add tracing to balance_dirty_pages · 028c2dd1

Dave Chinner authored Jul 07, 2010

Tracing high level background writeback events is good, but it doesn't
give the entire picture. Add visibility into write throttling to catch IO
dispatched by foreground throttling of processing dirtying lots of pages.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

028c2dd1

writeback: Initial tracing support · 455b2864

Dave Chinner authored Jul 07, 2010

Trace queue/sched/exec parts of the writeback loop. This provides
insight into when and why flusher threads are scheduled to run. e.g
a sync invocation leaves traces like:

sync-[...]: writeback_queue: bdi 8:0: sb_dev 8:1 nr_pages=7712 sync_mode=0 kupdate=0 range_cyclic=0 background=0
flush-8:0-[...]: writeback_exec: bdi 8:0: sb_dev 8:1 nr_pages=7712 sync_mode=0 kupdate=0 range_cyclic=0 background=0

This also lays the foundation for adding more writeback tracing to
provide deeper insight into the whole writeback path.

The original tracing code is from Jens Axboe, though this version is
a rewrite as a result of the code being traced changing
significantly.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

455b2864

block: remove unused REQ_TYPE_LINUX_BLOCK · a89f5c89

FUJITA Tomonori authored Jul 06, 2010

Nobody uses REQ_TYPE_LINUX_BLOCK (and its REQ_LB_OP_*).
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

a89f5c89

scsi: need to reset unprep_rq_fn in sd_remove · 82b6d57f

FUJITA Tomonori authored Jul 03, 2010

This is for block's for-2.6.36.

We need to reset q->unprep_rq_fn in sd_remove. Otherwise we hit kernel
oops if we access to a scsi disk device via sg after removing scsi
disk module.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

82b6d57f

block: remove q->prepare_flush_fn completely · 00fff265

FUJITA Tomonori authored Jul 03, 2010

This removes q->prepare_flush_fn completely (changes the
blk_queue_ordered API).
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

00fff265

ide: stop using q->prepare_flush_fn · afc23068

FUJITA Tomonori authored Jul 03, 2010

use REQ_FLUSH flag instead.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

afc23068

virtio_blk: stop using q->prepare_flush_fn · dd40e456

FUJITA Tomonori authored Jul 03, 2010

use REQ_FLUSH flag instead.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

dd40e456

dm: stop using q->prepare_flush_fn · 144d6ed5

FUJITA Tomonori authored Jul 03, 2010

use REQ_FLUSH flag instead.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Alasdair G Kergon <agk@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

144d6ed5

ps3disk: stop using q->prepare_flush_fn · 98d8c8f4

FUJITA Tomonori authored Jul 03, 2010

REQ_FLUSH flag enables us to kill ps3disk_prepare_flush().
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

98d8c8f4