Commits · 8e53624d44c1de31b1b0d4f500703669418a4c67 · nexedi / linux

29 Nov, 2016 20 commits

lightnvm: eliminate nvm_lun abstraction in mm · 8e53624d

Javier González authored Nov 28, 2016

In order to naturally support multi-target instances on an Open-Channel
SSD, targets should own the LUNs they get blocks from and manage
provisioning internally. This is done in several steps.

Since targets own the LUNs the are instantiated on top of and manage the
free block list internally, there is no need for a LUN abstraction in
the media manager. LUNs are intrinsically managed as in the physical
layout (ch:0,lun:0, ..., ch:0,lun:n, ch:1,lun:0, ch:1,lun:n, ...,
ch:m,lun:0, ch:m,lun:n) and given to the targets based on the target
creation ioctl. This simplifies LUN management and clears the path for a
partition manager to sit directly underneath LightNVM targets.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>

8e53624d

lightnvm: eliminate nvm_block abstraction on mm · 2a02e627

Javier González authored Nov 28, 2016

In order to naturally support multi-target instances on an Open-Channel
SSD, targets should own the LUNs they get blocks from and manage
provisioning internally. This is done in several steps.

A part of this transformation is that targets manage their blocks
internally. This patch eliminates the nvm_block abstraction and moves
block management to the target logic. The rrpc target is transformed.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>

2a02e627

lightnvm: remove debug lun statistics from gennvm · eec44565

Javier González authored Nov 28, 2016

Since LUNs are managed internally on targets, the media manager has no
access to the free LUN lists. Thus, debug functions that show LUN
information on the device should not be implemented on the media
manager, but rather on the target in itself.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>

eec44565

lightnvm: remove get_lun operation on gennvm · 0ac4072e

Javier González authored Nov 28, 2016

Since LUNs are managed internally on the target, there is no need for
the media manager to implement a get_lun operation.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>

0ac4072e

lightnvm: move block provisioning to targets · 8e79b5cb

Javier González authored Nov 28, 2016

In order to naturally support multi-target instances on an Open-Channel
SSD, targets should own the LUNs they get blocks from and manage
provisioning internally. This is done in several steps.

This patch moves the block provisioning inside of the target and removes
the get/put block interface from the media manager.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>

8e79b5cb

lightnvm: manage lun partitions internally in mm · 8176117b

Javier González authored Nov 28, 2016

LUNs are exclusively owned by targets implementing a block device FTL.
Doing this reservation requires at the moment a 2-way callback gennvm
<-> target. The reason behind this is that LUNs were not assumed to
always be exclusively owned by targets. However, this design decision
goes against I/O determinism QoS (two targets would mix I/O on the same
parallel unit in the device).

This patch makes LUN reservation as part of the target creation on the
media manager. This makes that LUNs are always exclusively owned by the
target instantiated on top of them. LUN stripping and/or sharing should
be implemented on the target itself or the layers on top.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>

8176117b

lightnvm: remove gen_lun abstraction · de93434f

Javier González authored Nov 28, 2016

The gen_lun abstraction in the generic media manager was conceived on
the assumption that a single target would instantiated on top of it.
This has complicated target design to implement multi-instances. Remove
this abstraction and move its logic to nvm_lun, which manages physical
lun geometry and operations.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>

de93434f

lightnvm: use constant name instead of value · 98379a12

Javier González authored Nov 28, 2016

There is a constant to refer to free blocks. Use it when marking bad
blocks instead of using a constant value
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>

98379a12

lightnvm: remove unnecessary variables in rrpc · eb00352b

Javier González authored Nov 28, 2016

Before vectored I/Os were supported on rrpc, the physical address was
stored as part of the nvm_rqd request. This variable become obsolete
when the ppa_list was introduced. Cleanup this variable.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>

eb00352b

lightnvm: make address conversion functions global · 0e5c3246

Javier González authored Nov 28, 2016

Targets are assumed to used the same generic ppa format, where the
address is partitioned on ch:lun:block:pg:pl:sec. Thus, make the
function in charge of transforming the ppa address from a linear format
to the generic one available to all targets.

This function will be needed by the media manager in order to do target
mapping translations when targets are divided on different physical
partitions.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>

0e5c3246

lightnvm: cleanup unused target operations · 7e4f64a9

Javier González authored Nov 28, 2016

Cleanup definition leftovers from old gennvm interface
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>

7e4f64a9

lightnvm: remove sysfs configuration interface · 17b25cfc

Javier González authored Nov 28, 2016

LightNVM used to be managed and configured through sysfs. Since the
introduction of management ioctls this interface is redundant and
outdated. Get rid of it.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>

17b25cfc

lightnvm: rrpc: split bios of size > 256kb · f0b01b6a

Javier González authored Nov 28, 2016

rrpc cannot handle bios of size > 256kb due to NVMe using a 64 bit
bitmap to signal I/O completion. If a larger bio comes, split it
explicitly.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>

f0b01b6a

lightnvm: add ECC error codes · 402ab9a8

Javier González authored Nov 28, 2016

Add ECC error codes to enable the appropriate handling in the target.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>

402ab9a8

lightnvm: export set bad block table · a24ba464

Javier González authored Nov 28, 2016

Bad blocks should be managed by block owners. This would be either
targets for data blocks or sysblk for system blocks.

In order to support this, export two functions: One to mark a block as
an specific type (e.g., bad block) and another to update the bad block
table on the device.

Move bad block management to rrpc.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>

a24ba464

lightnvm: do not protect block 0 · 8a3c95ab

Javier González authored Nov 28, 2016

Device blocks should be marked by the device and considered as bad
blocks by the media manager. Thus, do not make assumptions on which
blocks are going to be used by the device. In doing so we might lose
valid blocks from the free list.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>

8a3c95ab

lightnvm: enable to send hint to erase command · bb314979

Javier González authored Nov 28, 2016

Erases might be subject to host hints. An example is multi-plane
programming to erase blocks in parallel. Enable targets to specify this
hint.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>

bb314979

nvme: lightnvm: attach lightnvm sysfs to nvme block device · 3dc87dd0

Matias Bjørling authored Nov 28, 2016

Previously, LBA read and write were not supported in the lightnvm
specification. Now that it supports it, lets use the traditional
NVMe gendisk, and attach the lightnvm sysfs geometry export.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>

3dc87dd0

nvme: lightnvm: frees wrong cmd structure · 7498e99f

Matias Bjørling authored Nov 28, 2016

When struct nvme_request was introduced, the nvme_nvm_submit_io was
converted to the new interface. The interface moves nvme_nvm_command
data structure into the struct request pdu. On io completion, rq->cmd is
freed, which should have been the dereferenced pdu nvme_request->cmd.

Fixes: d49187e9 "nvme: introduce struct nvme_request"
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>

7498e99f

blk-mq: Drop explicit timeout sync in hotplug · 415d3dab

Gabriel Krisman Bertazi authored Nov 28, 2016

After commit 287922eb ("block: defer timeouts to a workqueue"),
deleting the timeout work after freezing the queue shouldn't be
necessary, since the synchronization is already enforced by the
acquisition of a q_usage_counter reference in blk_mq_timeout_work.
Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Reviewed-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

415d3dab

28 Nov, 2016 3 commits

blk-wbt: allow wbt to be enabled always through sysfs · d62118b6

Jens Axboe authored Nov 28, 2016

Currently there's no way to enable wbt if it's not enabled in the
kernel config by default for a device. Allow a write to the
'wbt_lat_usec' queue sysfs file to enable wbt.

This is useful for both the kernel config case, but also if the
device is CFQ managed and it was turned off by default.
Signed-off-by: Jens Axboe <axboe@fb.com>

d62118b6

blk-wbt: cleanup disable-by-default for CFQ · fa224eed

Jens Axboe authored Nov 28, 2016

Make it clear that we are disabling wbt for the specified queued,
if it was enabled by default. This is in preparation for allowing
users to re-enable wbt, and not have it disabled automatically
again.
Signed-off-by: Jens Axboe <axboe@fb.com>

fa224eed

blk-wbt: allow reset of default latency through sysfs · 80e091d1

Jens Axboe authored Nov 28, 2016

Allow a write of '-1' to reset the default latency target for
a given device. This removes knowledge of the different default
settings for rotational vs non-rotational from user space.
Signed-off-by: Jens Axboe <axboe@fb.com>

80e091d1

23 Nov, 2016 1 commit

nbd: fix setting of 'error' in NBD_DO_IT ioctl · feffa5cc

Jens Axboe authored Nov 22, 2016

Multiple paths don't set it properly, ensure that we do.

Fixes: 9561a7ad ("nbd: add multi-connection support")
Signed-off-by: Jens Axboe <axboe@fb.com>

feffa5cc

22 Nov, 2016 14 commits

nbd: move multi-connection bit to unused value · 63db89ea

Jens Axboe authored Nov 22, 2016

Bit #7 is already used, move to bit #8 which is the first unused
one.

Fixes: 9561a7ad ("nbd: add multi-connection support")
Signed-off-by: Jens Axboe <axboe@fb.com>

63db89ea

nbd: add multi-connection support · 9561a7ad

Josef Bacik authored Nov 22, 2016

NBD can become contended on its single connection.  We have to serialize all
writes and we can only process one read response at a time.  Fix this by
allowing userspace to provide multiple connections to a single nbd device.  This
coupled with block-mq drastically increases performance in multi-process cases.
Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

9561a7ad

block,blkcg: use __GFP_NOWARN for best-effort allocations in blkcg · e00f4f4d

Tejun Heo authored Nov 21, 2016

blkcg allocates some per-cgroup data structures with GFP_NOWAIT and
when that fails falls back to operations which aren't specific to the
cgroup.  Occassional failures are expected under pressure and falling
back to non-cgroup operation is the right thing to do.

Unfortunately, I forgot to add __GFP_NOWARN to these allocations and
these expected failures end up creating a lot of noise.  Add
__GFP_NOWARN.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Marc MERLIN <marc@merlins.org>
Reported-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>