- 29 Nov, 2016 20 commits
-
-
Javier González authored
In order to naturally support multi-target instances on an Open-Channel SSD, targets should own the LUNs they get blocks from and manage provisioning internally. This is done in several steps. Since targets own the LUNs the are instantiated on top of and manage the free block list internally, there is no need for a LUN abstraction in the media manager. LUNs are intrinsically managed as in the physical layout (ch:0,lun:0, ..., ch:0,lun:n, ch:1,lun:0, ch:1,lun:n, ..., ch:m,lun:0, ch:m,lun:n) and given to the targets based on the target creation ioctl. This simplifies LUN management and clears the path for a partition manager to sit directly underneath LightNVM targets. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Javier González authored
In order to naturally support multi-target instances on an Open-Channel SSD, targets should own the LUNs they get blocks from and manage provisioning internally. This is done in several steps. A part of this transformation is that targets manage their blocks internally. This patch eliminates the nvm_block abstraction and moves block management to the target logic. The rrpc target is transformed. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Javier González authored
Since LUNs are managed internally on targets, the media manager has no access to the free LUN lists. Thus, debug functions that show LUN information on the device should not be implemented on the media manager, but rather on the target in itself. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Javier González authored
Since LUNs are managed internally on the target, there is no need for the media manager to implement a get_lun operation. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Javier González authored
In order to naturally support multi-target instances on an Open-Channel SSD, targets should own the LUNs they get blocks from and manage provisioning internally. This is done in several steps. This patch moves the block provisioning inside of the target and removes the get/put block interface from the media manager. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Javier González authored
LUNs are exclusively owned by targets implementing a block device FTL. Doing this reservation requires at the moment a 2-way callback gennvm <-> target. The reason behind this is that LUNs were not assumed to always be exclusively owned by targets. However, this design decision goes against I/O determinism QoS (two targets would mix I/O on the same parallel unit in the device). This patch makes LUN reservation as part of the target creation on the media manager. This makes that LUNs are always exclusively owned by the target instantiated on top of them. LUN stripping and/or sharing should be implemented on the target itself or the layers on top. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Javier González authored
The gen_lun abstraction in the generic media manager was conceived on the assumption that a single target would instantiated on top of it. This has complicated target design to implement multi-instances. Remove this abstraction and move its logic to nvm_lun, which manages physical lun geometry and operations. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Javier González authored
There is a constant to refer to free blocks. Use it when marking bad blocks instead of using a constant value Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Javier González authored
Before vectored I/Os were supported on rrpc, the physical address was stored as part of the nvm_rqd request. This variable become obsolete when the ppa_list was introduced. Cleanup this variable. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Javier González authored
Targets are assumed to used the same generic ppa format, where the address is partitioned on ch:lun:block:pg:pl:sec. Thus, make the function in charge of transforming the ppa address from a linear format to the generic one available to all targets. This function will be needed by the media manager in order to do target mapping translations when targets are divided on different physical partitions. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Javier González authored
Cleanup definition leftovers from old gennvm interface Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Javier González authored
LightNVM used to be managed and configured through sysfs. Since the introduction of management ioctls this interface is redundant and outdated. Get rid of it. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Javier González authored
rrpc cannot handle bios of size > 256kb due to NVMe using a 64 bit bitmap to signal I/O completion. If a larger bio comes, split it explicitly. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Javier González authored
Add ECC error codes to enable the appropriate handling in the target. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Javier González authored
Bad blocks should be managed by block owners. This would be either targets for data blocks or sysblk for system blocks. In order to support this, export two functions: One to mark a block as an specific type (e.g., bad block) and another to update the bad block table on the device. Move bad block management to rrpc. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Javier González authored
Device blocks should be marked by the device and considered as bad blocks by the media manager. Thus, do not make assumptions on which blocks are going to be used by the device. In doing so we might lose valid blocks from the free list. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Javier González authored
Erases might be subject to host hints. An example is multi-plane programming to erase blocks in parallel. Enable targets to specify this hint. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Matias Bjørling authored
Previously, LBA read and write were not supported in the lightnvm specification. Now that it supports it, lets use the traditional NVMe gendisk, and attach the lightnvm sysfs geometry export. Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Matias Bjørling authored
When struct nvme_request was introduced, the nvme_nvm_submit_io was converted to the new interface. The interface moves nvme_nvm_command data structure into the struct request pdu. On io completion, rq->cmd is freed, which should have been the dereferenced pdu nvme_request->cmd. Fixes: d49187e9 "nvme: introduce struct nvme_request" Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Gabriel Krisman Bertazi authored
After commit 287922eb ("block: defer timeouts to a workqueue"), deleting the timeout work after freezing the queue shouldn't be necessary, since the synchronization is already enforced by the acquisition of a q_usage_counter reference in blk_mq_timeout_work. Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com> Reviewed-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
- 28 Nov, 2016 3 commits
-
-
Jens Axboe authored
Currently there's no way to enable wbt if it's not enabled in the kernel config by default for a device. Allow a write to the 'wbt_lat_usec' queue sysfs file to enable wbt. This is useful for both the kernel config case, but also if the device is CFQ managed and it was turned off by default. Signed-off-by: Jens Axboe <axboe@fb.com>
-
Jens Axboe authored
Make it clear that we are disabling wbt for the specified queued, if it was enabled by default. This is in preparation for allowing users to re-enable wbt, and not have it disabled automatically again. Signed-off-by: Jens Axboe <axboe@fb.com>
-
Jens Axboe authored
Allow a write of '-1' to reset the default latency target for a given device. This removes knowledge of the different default settings for rotational vs non-rotational from user space. Signed-off-by: Jens Axboe <axboe@fb.com>
-
- 23 Nov, 2016 1 commit
-
-
Jens Axboe authored
Multiple paths don't set it properly, ensure that we do. Fixes: 9561a7ad ("nbd: add multi-connection support") Signed-off-by: Jens Axboe <axboe@fb.com>
-
- 22 Nov, 2016 14 commits
-
-
Jens Axboe authored
Bit #7 is already used, move to bit #8 which is the first unused one. Fixes: 9561a7ad ("nbd: add multi-connection support") Signed-off-by: Jens Axboe <axboe@fb.com>
-
Josef Bacik authored
NBD can become contended on its single connection. We have to serialize all writes and we can only process one read response at a time. Fix this by allowing userspace to provide multiple connections to a single nbd device. This coupled with block-mq drastically increases performance in multi-process cases. Thanks, Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Tejun Heo authored
blkcg allocates some per-cgroup data structures with GFP_NOWAIT and when that fails falls back to operations which aren't specific to the cgroup. Occassional failures are expected under pressure and falling back to non-cgroup operation is the right thing to do. Unfortunately, I forgot to add __GFP_NOWARN to these allocations and these expected failures end up creating a lot of noise. Add __GFP_NOWARN. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Marc MERLIN <marc@merlins.org> Reported-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Ming Lei authored
The check on bio->bi_vcnt doesn't make sense in erase_end_io(). Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <tom.leiming@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Ming Lei authored
Also code gets simplified a bit. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <tom.leiming@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Ming Lei authored
Also this patch simplify the code a bit. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <tom.leiming@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Ming Lei authored
Always bio_add_page() is the standard and preferred way to do the task. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <tom.leiming@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Ming Lei authored
Instead we use standard iterator way to do that. Signed-off-by: Ming Lei <tom.leiming@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Ming Lei authored
When the bio is full, bio_add_pc_page() will return zero, so use this information tell when the bio is full. Also replace access to .bi_vcnt for pr_debug() with bio_segments(). Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <tom.leiming@gmail.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Ming Lei authored
Signed-off-by: Ming Lei <tom.leiming@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Ming Lei authored
For a non-cloned bio, bio_add_page() only returns failure when the io vec table is full, but in that case, bio->bi_vcnt can't be zero at all. So remove the impossible failure handling. Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Ming Lei <tom.leiming@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Ming Lei authored
Some drivers often use external bvec table, so introduce this helper for this case. It is always safe to access the bio->bi_io_vec in this way for this case. After converting to this usage, it will becomes a bit easier to evaluate the remaining direct access to bio->bi_io_vec, so it can help to prepare for the following multipage bvec support. Signed-off-by: Ming Lei <tom.leiming@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Fixed up the new O_DIRECT cases. Signed-off-by: Jens Axboe <axboe@fb.com>
-
Jens Axboe authored
We store the bits in the bdev sector size locally, but we don't use the calculation anymore. All we do with it is shift it back up to the bdev sector size. So let's just use that directly and kill the variable and bits calculation. Signed-off-by: Jens Axboe <axboe@fb.com>
-
Damien Le Moal authored
A direct I/O alignment must be always checked against the device blocks size, but the I/O offset (bio->bi_iter.bi_sector must always use 512B sector unit, and not the actual logical block size. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
-
- 21 Nov, 2016 2 commits
-
-
Shaun Tancheff authored
If a ZBC device is partitioned and operations are performed on the partition the zone information is rebased to the partition, however the zone reset is not mapped from the partition to device as are other operations. This causes the API (report zones / reset zone) to be unbalanced in this regard. Checking for the zone reset op code explicitly will balance the API. Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Christoph Hellwig authored
Since commit 87374179 ("block: add a proper block layer data direction encoding") we only or the new op and flags into bi_opf in bio_set_op_attrs instead of clearing the old value. I've not seen any breakage with the new behavior, but it seems dangerous. Also convert it to an inline function to make the argument passing safer. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
-