- 01 May, 2017 6 commits
-
-
Christoph Hellwig authored
This untangles the DM_MAPIO_* values returned from ->clone_and_map_rq from the error codes used by the block layer. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Christoph Hellwig authored
Instead of returning either a DM_ENDIO_* constant or an error code, add a new DM_ENDIO_DONE value that means keep errno as is. This allows us to easily keep the existing error code in case where we can't push back, and it also preparares for the new block level status codes with strict type checking. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Christoph Hellwig authored
This simplifies the I/O completion path a bit. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Mike Snitzer authored
-
Mikulas Patocka authored
dm-bufio checks a watermark when it allocates a new buffer in __bufio_new(). However, it doesn't check the watermark when the user changes /sys/module/dm_bufio/parameters/max_cache_size_bytes. This may result in a problem - if the watermark is high enough so that all possible buffers are allocated and if the user lowers the value of "max_cache_size_bytes", the watermark will never be checked against the new value because no new buffer would be allocated. To fix this, change __evict_old_buffers() so that it checks the watermark. __evict_old_buffers() is called every 30 seconds, so if the user reduces "max_cache_size_bytes", dm-bufio will react to this change within 30 seconds and decrease memory consumption. Depends-on: 1b0fb5a5 ("dm bufio: avoid a possible ABBA deadlock") Cc: stable@vger.kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Mikulas Patocka authored
__get_memory_limit() tests if dm_bufio_cache_size changed and calls __cache_size_refresh() if it did. It takes dm_bufio_clients_lock while it already holds the client lock. However, lock ordering is violated because in cleanup_old_buffers() dm_bufio_clients_lock is taken before the client lock. This results in a possible deadlock and lockdep engine warning. Fix this deadlock by changing mutex_lock() to mutex_trylock(). If the lock can't be taken, it will be re-checked next time when a new buffer is allocated. Also add "unlikely" to the if condition, so that the optimizer assumes that the condition is false. Cc: stable@vger.kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
- 28 Apr, 2017 6 commits
-
-
Dan Williams authored
Commit 99e6608c "block: Add badblock management for gendisks" allowed for drivers like pmem and software-raid to advertise a list of bad media areas. However, it inadvertently added a 'badblocks' to all block devices. Lets clean this up by having the 'badblocks' attribute not be visible when the driver has not populated a 'struct badblocks' instance in the gendisk. Cc: Jens Axboe <axboe@fb.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Martin K. Petersen <martin.petersen@oracle.com> Reported-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Jens Axboe authored
The only difference between ->run_work and ->delay_work, is that the latter is used to defer running a queue. This is done by marking the queue stopped, and scheduling ->delay_work to run sometime in the future. While the queue is stopped, direct runs or runs through ->run_work will not run the queue. If we combine the handlers, then we need to handle two things: 1) If a delayed/stopped run is scheduled, then we should not run the queue before that has been completed. 2) If a queue is delayed/stopped, the handler needs to restart the queue. Normally a run of a queue with the stopped bit set would be a no-op. Case 1 is handled by modifying a currently pending queue run to the deadline set by the caller of blk_mq_delay_queue(). Subsequent attempts to queue a queue run will find the work item already pending, and direct runs will see a stopped queue as before. Case 2 is handled by adding a new bit, BLK_MQ_S_START_ON_RUN, that tells the work handler that it should clear a stopped queue and run the handler. Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Jens Axboe authored
This modifies (or adds, if not currently pending) an existing delayed work item. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Jens Axboe authored
They serve the exact same purpose. Get rid of the non-delayed work variant, and just run it without delay for the normal case. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Josef Bacik authored
list_for_each_entry() isn't super safe if we're freeing the objects while we traverse the list. Also don't bother taking the extra reference, the module refcounting stuff will save us from having anybody messing with the device while we're trying to unload. Reported-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Ulf Hansson authored
Seems like this was forgotten in the bfq-series from Paolo. Let's do it now so people don't miss out involving Paolo for any future changes or when reporting bugs. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Jens Axboe <axboe@fb.com>
-
- 27 Apr, 2017 17 commits
-
-
Bart Van Assche authored
I/O errors triggered by multipathd incorrectly not enabling the no-flush flag for DM_DEVICE_SUSPEND or DM_DEVICE_RESUME are hard to debug. Add more logging to make it easier to debug this. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Bart Van Assche authored
No functional change but makes the code easier to read. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Bart Van Assche authored
Instead of checking MPATHF_QUEUE_IF_NO_PATH, MPATHF_SAVED_QUEUE_IF_NO_PATH and the no_flush flag to decide whether or not to push back a request (or bio) if there are no paths available, only clear MPATHF_QUEUE_IF_NO_PATH in queue_if_no_path() if no_flush has not been set. The result is that only a single bit has to be tested in the hot path to decide whether or not a request must be pushed back and also that m->lock does not have to be taken in the hot path. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Bart Van Assche authored
Introduce an enumeration type for the queue mode. This patch does not change any functionality but makes the DM code easier to read. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Bart Van Assche authored
Verify at runtime that __pg_init_all_paths() is called with multipath.lock held if lockdep is enabled. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Bart Van Assche authored
Ensure that the assumptions about the caller holding suspend_lock are checked at runtime if lockdep is enabled. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Bart Van Assche authored
The 'cache_size' argument of dm_block_manager_create() has never been used. Remove it along with the definitions of the constants passed as the 'cache_size' argument. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Bart Van Assche authored
Otherwise the request-based DM blk-mq request_queue will be put into service without being properly exported via sysfs. Cc: stable@vger.kernel.org Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Bart Van Assche authored
Requeuing a request immediately while path initialization is ongoing causes high CPU usage, something that is undesired. Hence delay requeuing while path initialization is in progress. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Cc: Christoph Hellwig <hch@lst.de> Cc: <stable@vger.kernel.org> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Bart Van Assche authored
If blk_get_request() fails, check whether the failure is due to a path being removed. If that is the case, fail the path by triggering a call to fail_path(). This avoids that the following scenario can be encountered while removing paths: * CPU usage of a kworker thread jumps to 100%. * Removing the DM device becomes impossible. Delay requeueing if blk_get_request() returns -EBUSY or -EWOULDBLOCK, and the queue is not dying, because in these cases immediate requeuing is inappropriate. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Christoph Hellwig <hch@lst.de> Cc: <stable@vger.kernel.org> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Bart Van Assche authored
activate_path() is renamed to activate_path_work() which now calls activate_or_offline_path(). activate_or_offline_path() will be used by the next commit. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Christoph Hellwig <hch@lst.de> Cc: <stable@vger.kernel.org> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Adrian Salido authored
When calling a dm ioctl that doesn't process any data (IOCTL_FLAGS_NO_PARAMS), the contents of the data field in struct dm_ioctl are left initialized. Current code is incorrectly extending the size of data copied back to user, causing the contents of kernel stack to be leaked to user. Fix by only copying contents before data and allow the functions processing the ioctl to override. Cc: stable@vger.kernel.org Signed-off-by: Adrian Salido <salidoa@google.com> Reviewed-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Mikulas Patocka authored
The log2 of sectors_per_block was already calculated, so we don't have to use the ilog2 function. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Mikulas Patocka authored
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Andy Shevchenko authored
There is no need to have a duplication of the generic library, i.e. hex2bin(). Replace the open coded variant. Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com> Tested-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Jens Axboe authored
At least one driver, mtip32xx, has a hard coded dependency on the value of the reserved tag used for internal commands. While that should really be fixed up, for now let's ensure that we just bypass the scheduler tags an allocation marked as reserved. They are used for house keeping or error handling, so we can safely ignore them in the scheduler. Tested-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Ming Lei authored
mtip32xx supposes that 'request_idx' passed to .init_request() is tag of the request, and use that as request's tag to initialize command header. After MQ IO scheduler is in, request tag assigned isn't same with the request index anymore, so cause strange hardware failure on mtip32xx, even whole system panic is triggered. This patch fixes the issue by initializing command header via request's real tag. Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
- 26 Apr, 2017 11 commits
-
-
Bart Van Assche authored
Show the SCSI CDB for pending SCSI commands in /sys/kernel/debug/block/*/mq/*/dispatch and */rq_list. An example of how SCSI commands are displayed by this code: ffff8801703245c0 {.op=READ, .cmd_flags=META PRIO, .rq_flags=DONTPREP IO_STAT STATS, .tag=14, .internal_tag=-1, .cmd=Read(10) 28 00 2a 81 1b 30 00 00 08 00} Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Hannes Reinecke <hare@suse.com> Cc: <linux-scsi@vger.kernel.org> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Bart Van Assche authored
This new callback function will be used in the next patch to show more information about SCSI requests. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Cc: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Bart Van Assche authored
Show the operation name, .cmd_flags and .rq_flags as names instead of numbers. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Bart Van Assche authored
This patch does not change any functionality but makes it possible to produce a single line of output with multiple flag-to-name translations. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Bart Van Assche authored
Move the "state" attribute from the top level to the "mq" directory as requested by Omar. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Bart Van Assche authored
We currently call blk_mq_free_queue() from blk_cleanup_queue() before we unregister the debugfs attributes for that queue in blk_release_queue(). This leaves a window open during which accessing most of the mq debugfs attributes would cause a use-after-free. Additionally, the "state" attribute allows running the queue, which we should not do after the queue has entered the "dead" state. Fix both cases by unregistering the debugfs attributes before freeing queue resources starts. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Bart Van Assche authored
Hctx unregistration involves calling kobject_del(). kobject_del() must not be called if kobject_add() has not been called. Hence in the error path only unregister hctxs for which registration succeeded. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Omar Sandoval <osandov@fb.com> Cc: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Bart Van Assche authored
Since the blk_mq_debugfs_*register_hctxs() functions register and unregister all attributes under the "mq" directory, rename these into blk_mq_debugfs_*register_mq(). Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Bart Van Assche authored
A later patch will move the call of blk_mq_debugfs_register() to a function to which the queue name is not passed as an argument. To avoid having to add a 'name' argument to multiple callers, let blk_mq_debugfs_register() look up the queue name. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Bart Van Assche authored
A later patch in this series will modify blk_mq_debugfs_register() such that it uses q->kobj.parent to determine the name of a request queue. Hence make sure that that pointer is initialized before blk_mq_debugfs_register() is called. To avoid lock inversion, protect sysfs / debugfs registration with the queue sysfs_lock instead of the global mutex all_q_mutex. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Christoph Hellwig authored
The caller only looks at the scsi_request result field anyway. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-