Commits · 4e781b498ee5008ede91362d91404a362e7a46b3 · Kirill Smelkov / linux

22 Sep, 2016 2 commits

dm cache: speed up writing of the hint array · 4e781b49

Joe Thornber authored Sep 15, 2016

It's far quicker to always delete the hint array and recreate with
dm_array_new() because we avoid the copying caused by mutation.

Also simplifies the policy interface, replacing the walk_hints() with
the simpler get_hint().
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

4e781b49

dm array: add dm_array_new() · dd6a77d9

Joe Thornber authored Sep 15, 2016

dm_array_new() creates a new, populated array more efficiently than
starting with an empty one and resizing.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

dd6a77d9

15 Sep, 2016 4 commits

dm mpath: delay the requeue of blk-mq requests while all paths down · b88efd43

Mike Snitzer authored Sep 09, 2016

Return DM_MAPIO_DELAY_REQUEUE from .clone_and_map_rq.  Also, return
false from .busy, if all paths are down, so that blk-mq requests get
mapped via .clone_and_map_rq -- which results in DM_MAPIO_DELAY_REQUEUE
being returned to dm-rq.

This change allows for a noticeable reduction in cpu utilization
(reduced kworker load) while all paths are down, e.g.:

system CPU idleness (as measured by fio's --idle-prof=system):
before: system: 86.58%
after:  system: 98.60%
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>

b88efd43

dm mpath: use dm_mq_kick_requeue_list() · 7e48c768

Mike Snitzer authored Sep 14, 2016

When reinstating a path the blk-mq request_queue's requeue_list should
get kicked.  It makes sense to kick the requeue_list as part of the
existing hook (previously only used by bio-based support).

Rename process_queued_bios_list to process_queued_io_list.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>

7e48c768

dm rq: introduce dm_mq_kick_requeue_list() · e0c10752

Mike Snitzer authored Sep 14, 2016

Make it possible for a request-based target to kick the DM device's
blk-mq request_queue's requeue_list.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>

e0c10752

dm rq: reduce arguments passed to map_request() and dm_requeue_original_request() · fbc39b4c

Mike Snitzer authored Sep 13, 2016

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>

fbc39b4c

14 Sep, 2016 23 commits

dm rq: add DM_MAPIO_DELAY_REQUEUE to delay requeue of blk-mq requests · a8ac51e4

Mike Snitzer authored Sep 09, 2016

Otherwise blk-mq will immediately dispatch requests that are requeued
via a BLK_MQ_RQ_QUEUE_BUSY return from blk_mq_ops .queue_rq.

Delayed requeue is implemented using blk_mq_delay_kick_requeue_list()
with a delay of 5 secs. In the context of DM multipath (all paths down)
it doesn't make any sense to requeue more quickly.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

a8ac51e4

dm: convert wait loops to use autoremove_wake_function() · 9f4c3f87

Bart Van Assche authored Aug 31, 2016

Use autoremove_wake_function() instead of default_wake_function()
to make the dm wait loops more similar to other wait loops in the
kernel.  This patch does not change any functionality.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

9f4c3f87

dm: use signal_pending_state() in dm_wait_for_completion() · e3fabdfd

Bart Van Assche authored Aug 31, 2016

Use signal_pending_state() instead of open-coding it. This patch does
not change any functionality but makes it possible to pass TASK_KILLABLE
as the second argument of dm_wait_for_completion(). See also commit
16882c1e ("sched: fix TASK_WAKEKILL vs SIGKILL race").

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

e3fabdfd
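
The effect of passing a task state rather than a boolean can be illustrated with a small userspace model. This is only a sketch: the `DEMO_` constants and `struct demo_task` are hypothetical stand-ins, their values are illustrative, and the function mirrors the shape of the kernel helper as commonly described rather than reproducing kernel code.

```c
/* Illustrative task-state bits; not guaranteed to match any kernel version. */
#define DEMO_TASK_INTERRUPTIBLE   0x01
#define DEMO_TASK_UNINTERRUPTIBLE 0x02
#define DEMO_TASK_WAKEKILL        0x80
#define DEMO_TASK_KILLABLE (DEMO_TASK_WAKEKILL | DEMO_TASK_UNINTERRUPTIBLE)

/* Hypothetical stand-in for the per-task signal bookkeeping. */
struct demo_task {
	int signal_pending;        /* any signal is pending */
	int fatal_signal_pending;  /* a fatal signal (e.g. SIGKILL) is pending */
};

/*
 * Returns nonzero if a sleeper in 'state' should stop waiting:
 * - uninterruptible sleeps ignore all signals,
 * - interruptible sleeps react to any pending signal,
 * - killable sleeps react only to fatal signals.
 */
int demo_signal_pending_state(long state, const struct demo_task *t)
{
	if (!(state & (DEMO_TASK_INTERRUPTIBLE | DEMO_TASK_WAKEKILL)))
		return 0;
	if (!t->signal_pending)
		return 0;
	return (state & DEMO_TASK_INTERRUPTIBLE) || t->fatal_signal_pending;
}
```

With this shape, a caller that passes `DEMO_TASK_KILLABLE` keeps waiting through ordinary signals and wakes only for a fatal one, which a boolean 'interruptible' argument could not express.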

dm: rename task state function arguments · b48633f8

Bart Van Assche authored Aug 31, 2016

Rename 'interruptible' to 'task_state' to make it clear that this
argument is a task state instead of a boolean.  Also, change type from
int to long.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

b48633f8

dm: add two lockdep_assert_held() statements · 5a8f1f80

Bart Van Assche authored Aug 31, 2016

Document the locking assumptions for the __bind() and __dm_suspend()
functions.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

5a8f1f80

dm rq: simplify dm_old_stop_queue() · c533f249

Bart Van Assche authored Aug 31, 2016

This patch does not change any functionality.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

c533f249

dm mpath: check if path's request_queue is dying in activate_path() · f10e06b7

Mike Snitzer authored Sep 01, 2016

If pg_init_retries is set and a request is queued against a multipath
device with all underlying block device request_queues in the "dying"
state then an infinite loop is triggered because activate_path() never
succeeds and hence never calls pg_init_done().

This change prevents device removal from triggering an infinite loop by
failing activate_path(), which causes the "dying" path to be failed.
Reported-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org

f10e06b7

dm rq: take request_queue lock while clearing QUEUE_FLAG_STOPPED · 9dbeaeab

Mike Snitzer authored Sep 01, 2016

Every call of queue_flag_clear_unlocked() after block device
initialization has finished is wrong if blk_cleanup_queue() can be
called concurrently.  Convert queue_flag_clear_unlocked() into
queue_flag_clear() and protect it by the block layer queue lock.

Also, factor out dm_mq_start_queue().
Reported-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org

9dbeaeab

dm rq: factor out dm_mq_stop_queue() · 2397a15a

Bart Van Assche authored Aug 31, 2016

Also, check that the blk-mq request_queue isn't already stopped.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

2397a15a

dm: mark request_queue dead before destroying the DM device · 3b785fbc

Bart Van Assche authored Aug 31, 2016

This prevents new requests from being queued while __dm_destroy() is in
progress.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org

3b785fbc

dm: return correct error code in dm_resume()'s retry loop · 8dc23658

Minfei Huang authored Sep 06, 2016

dm_resume() will return success (0) rather than -EINVAL if
!dm_suspended_md() upon retry within dm_resume().

Reset the error code at the start of dm_resume()'s retry loop.
Also, remove a useless assignment at the end of dm_resume().

Fixes: ffcc3936 ("dm: enhance internal suspend and resume interface")
Cc: stable@vger.kernel.org # 3.19+
Signed-off-by: Minfei Huang <mnghuan@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

8dc23658
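
The bug class fixed here (a stale success code surviving a retry) can be sketched in userspace. Everything below is hypothetical: `struct demo_dev`, `demo_wait()`, and `demo_resume()` are simplified stand-ins, not the kernel's dm_resume().

```c
#include <errno.h>

/* Hypothetical device state; not the kernel's struct mapped_device. */
struct demo_dev {
	int suspended;             /* device is currently suspended */
	int internally_suspended;  /* resume must wait and retry first */
	int resumed_by_other;      /* another caller resumes it during our wait */
};

/* Models a successful wait; it returns 0, which is what left 'r' stale. */
static int demo_wait(struct demo_dev *d)
{
	d->internally_suspended = 0;
	if (d->resumed_by_other)
		d->suspended = 0;
	return 0;
}

int demo_resume(struct demo_dev *d)
{
	int r;

retry:
	r = -EINVAL;  /* reset on every pass so a retry cannot inherit 0 */

	if (!d->suspended)
		return r;  /* nothing to resume: report the error */

	if (d->internally_suspended) {
		r = demo_wait(d);  /* r becomes 0 on a successful wait */
		if (r)
			return r;
		goto retry;
	}

	d->suspended = 0;
	return 0;
}
```

If the assignment sat only at the declaration instead of after the `retry` label, a device that was resumed by someone else during the wait would take the early return with `r == 0` and report success.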

blk-mq: introduce blk_mq_delay_kick_requeue_list() · 2849450a

Mike Snitzer authored Sep 14, 2016

blk_mq_delay_kick_requeue_list() provides the ability to kick the
q->requeue_list after a specified time.  To do this the request_queue's
'requeue_work' member was changed to a delayed_work.

blk_mq_delay_kick_requeue_list() allows DM to defer processing requeued
requests while it doesn't make sense to immediately requeue them
(e.g. when all paths in a DM multipath have failed).
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

2849450a

block: remove IOPRIO_BITS · c5c5ca77

Christoph Hellwig authored Sep 11, 2016

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

c5c5ca77

bio.h: remove a very outdated comment · fc95db3e

Christoph Hellwig authored Sep 11, 2016

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

fc95db3e

block: remove bio_destructor_t · 3f7c624a

Christoph Hellwig authored Sep 11, 2016

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

3f7c624a

block: Improve bio_set_op_attrs() robustness · 3e1de31b

Bart Van Assche authored Sep 14, 2016

Since REQ_OP_BITS == 3 and __REQ_NR_BITS == 30 it is not that hard
to pass an op_flags argument to bio_set_op_attrs() that is larger
than the number of bits reserved for the op_flags argument. Complain
if this happens. Additionally, ensure that negative arguments trigger
a complaint (1 << ... is signed while 1U << ... is unsigned; adding
0U to an integer expression causes it to be promoted to an unsigned
type).
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Damien Le Moal <damien.lemoal@hgst.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

3e1de31b
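
The signed/unsigned remark in this commit message can be illustrated with a minimal sketch. `DEMO_FLAG_BITS` and `demo_flags_fit()` are hypothetical names chosen for the example; the width is illustrative and the code is not the kernel's macro.

```c
/* Hypothetical width of the flags field; not the kernel's actual value. */
#define DEMO_FLAG_BITS 29

/*
 * Returns 1 if 'flags' fits in DEMO_FLAG_BITS bits, 0 otherwise.
 *
 * 'flags + 0U' converts the int operand to unsigned int, so a negative
 * value becomes a very large unsigned value. Comparing against the
 * unsigned constant '1U << DEMO_FLAG_BITS' therefore rejects both
 * negative and oversized arguments with a single test.
 */
int demo_flags_fit(int flags)
{
	return (flags + 0U) < (1U << DEMO_FLAG_BITS);
}
```

Had the comparison been written with a signed `1 << DEMO_FLAG_BITS` and no `0U`, a negative argument would compare as smaller than the limit and slip through.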

block, dm-crypt, btrfs: Introduce bio_flags() · 4382e33a

Bart Van Assche authored Sep 14, 2016

Introduce the bio_flags() macro. Ensure that the second argument of
bio_set_op_attrs() only contains flags and no operation. This patch
does not change any functionality.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Chris Mason <clm@fb.com> (maintainer:BTRFS FILE SYSTEM)
Cc: Josef Bacik <jbacik@fb.com> (maintainer:BTRFS FILE SYSTEM)
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Damien Le Moal <damien.lemoal@hgst.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

4382e33a

block: Document that bio_op() uses the data type of bio.bi_opf · 637ca77b

Bart Van Assche authored Sep 14, 2016

Make it clear that the sizeof(unsigned int) expression in BIO_OP_SHIFT
refers to the bi_opf member of struct bio.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Damien Le Moal <damien.lemoal@hgst.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

637ca77b

block: remove remnant refs to hardsect · a441b0d0

Linus Walleij authored Sep 14, 2016

Commit e1defc4f ("block: Do away with the notion of hardsect_size")
removed the notion of "hardware sector size" from the kernel in favor
of logical block size, but references remain in comments and
documentation.

Update the remaining sites mentioning hardsect.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

a441b0d0

block: remove blk_mq_alloc_single_hw_queue() prototype · abe47114

Linus Walleij authored Sep 14, 2016

The blk_mq_alloc_single_hw_queue() prototype is an artifact that should
have been removed with commit cdef54dd ("blk-mq: remove alloc_hctx and
free_hctx methods"), where its last users were deleted.

Fixes: cdef54dd ("blk-mq: remove alloc_hctx and free_hctx methods")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

abe47114

block_dev: remove DAX leftovers · 22375701

Christoph Hellwig authored Sep 14, 2016

DAX support for block devices was removed in commits 03cdad
("block: disable block device DAX by default") and 99a01cdf
("block: remove BLK_DEV_DAX config option"), but we still kept a call to
dax_do_io and some unneeded i_flags manipulations introduced in commit
bbab37 ("block: Add support for DAX reads/writes to block devices").

Remove those leftovers.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

22375701

block: enable zeroing of io_poll statistics · d21ea4bc

Stephen Bates authored Sep 13, 2016

Allow the io_poll statistics to be zeroed to make for easier logging
of polling events.
Signed-off-by: Stephen Bates <sbates@raithlin.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

d21ea4bc

block: add poll_considered statistic · 6e219353

Stephen Bates authored Sep 13, 2016

In order to help determine the effectiveness of polling in a running
system it is useful to determine the ratio of how often the poll
function is called vs how often the completion is checked. For this
reason we add a poll_considered variable and add it to the sysfs entry
for io_poll.
Signed-off-by: Stephen Bates <sbates@raithlin.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

6e219353

08 Sep, 2016 4 commits

nbd: allow block mq to deal with timeouts · 0eadf37a

Josef Bacik authored Sep 08, 2016

Instead of rolling our own timer, just utilize the blk mq req timeout and do the
disconnect if any of our commands time out.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

0eadf37a

nbd: use flags instead of bool · 9b4a6ba9

Josef Bacik authored Sep 08, 2016

In preparation for some future changes, change a few of the state bools over to
normal bits to set/clear properly.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

9b4a6ba9

nbd: don't shutdown sock with irq's disabled · c2611898

Josef Bacik authored Sep 08, 2016

We hit a warning when shutting down the nbd connection because we have irq's
disabled.  We don't really need to do the shutdown under the lock, just clear
the nbd->sock.  So do the shutdown outside of the irq.  This gets rid of the
warning.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

c2611898

nbd: convert to blkmq · fd8383fd

Josef Bacik authored Sep 08, 2016

This moves NBD over to using blkmq, which allows us to get rid of the NBD
wide queue lock and the async submit kthread.  We will start with 1 hw
queue for now, but I plan to add multiple tcp connection support in the
future and we'll fix how we set the hwqueue's.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

fd8383fd

29 Aug, 2016 6 commits

mtip32xx: mark symbols static where possible · 99e6b87e

Baoyou Xie authored Aug 26, 2016

We get 1 warning when building the kernel with W=1:
drivers/block/mtip32xx/mtip32xx.c:3689:6: warning: no previous prototype for
 'mtip_block_release' [-Wmissing-prototypes]

In fact, this function is only used in the file in which it is declared
and doesn't need a declaration, but can be made static.
So this patch marks it 'static'.
Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
Signed-off-by: Jens Axboe <axboe@fb.com>

99e6b87e

blk-mq: prefetch request in blk_mq_tag_to_rq() · 88c7b2b7

Jens Axboe authored Aug 25, 2016

When drivers or the core call this function, they usually
dereference the request shortly thereafter. Prefetch the first
cache line.

Profiling IO workloads shows that this is the most common cache
miss on the block side of things.
Signed-off-by: Jens Axboe <axboe@fb.com>

88c7b2b7
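
The idea of prefetching at lookup time can be sketched in userspace. This is an assumption-laden illustration: `struct demo_rq` and `demo_tag_to_rq()` are hypothetical, and `__builtin_prefetch()` is the GCC/Clang builtin used here in place of the kernel's own prefetch helper.

```c
#include <stddef.h>

/* Hypothetical request object; not the kernel's struct request. */
struct demo_rq {
	int tag;
	char payload[60];
};

/*
 * Look up a request by tag. The prefetch is only a hint: it asks the
 * CPU to start loading the first cache line of the request so that the
 * caller's imminent dereference is less likely to miss in the cache.
 */
struct demo_rq *demo_tag_to_rq(struct demo_rq **rqs, unsigned int nr_tags,
			       unsigned int tag)
{
	if (tag < nr_tags) {
		__builtin_prefetch(rqs[tag]);
		return rqs[tag];
	}
	return NULL;  /* tag out of range */
}
```

The hint does not change the function's result; whether it helps depends on how soon the caller touches the returned request, so it is worth confirming with profiling.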

blk-mq: improve layout of blk_mq_hw_ctx · 8d354f13

Jens Axboe authored Aug 25, 2016

Various cache line optimizations:

- Move delay_work towards the end. It's huge, and we don't use it
  a lot (only SCSI).

- Move the atomic state into the same cacheline as the dispatch
  list and lock.

- Rearrange a few members to pack it better.

- Shrink the max-order for dispatch accounting from 10 to 7. This
  means that ->dispatched[] and ->run now take up their own
  cacheline.

This shrinks struct blk_mq_hw_ctx down to 8 cachelines.
Signed-off-by: Jens Axboe <axboe@fb.com>

8d354f13

blk-mq: turn hctx->run_work into a regular work struct · 27489a3c

Jens Axboe authored Aug 24, 2016

We don't need the larger delayed work struct, since we always run it
immediately.
Signed-off-by: Jens Axboe <axboe@fb.com>

27489a3c

block: add kblockd_schedule_work_on() · ee63cfa7

Jens Axboe authored Aug 24, 2016

Add a helper to schedule a regular struct work on a particular CPU.
Signed-off-by: Jens Axboe <axboe@fb.com>

ee63cfa7

workqueue: add cancel_work() · f72b8792

Jens Axboe authored Aug 24, 2016

Like cancel_delayed_work(), but for regular work.
Signed-off-by: Jens Axboe <axboe@fb.com>
Mehed-by: Tejun Heo <tj@kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>

f72b8792

28 Aug, 2016 1 commit

Linux 4.8-rc4 · 3eab887a

Linus Torvalds authored Aug 28, 2016

3eab887a