Commit b25c6644 authored by Linus Torvalds

Merge tag 'for-5.8/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper updates from Mike Snitzer:

 - The largest change for this cycle is the DM zoned target's metadata
   version 2 feature that adds support for pairing regular block devices
   with a zoned device to ease the performance impact associated with
   the finite number of random zones on a zoned device.

   The changes came in three batches: the first prepared for and then
   added the ability to pair a single regular block device, the second
   was a batch of fixes to improve zoned's reclaim heuristic, and the
   third removed the limitation of only adding a single additional
   regular block device to allow many devices.

   Testing has shown linear scaling as more devices are added.

 - Add new emulated block size (ebs) target that emulates a smaller
   logical_block_size than a block device supports.

   The primary use-case is to emulate "512e" devices that have 512 byte
   logical_block_size and 4KB physical_block_size. This is useful to
   some legacy applications that otherwise wouldn't be able to be used
   on 4K devices because they depend on issuing IO in 512 byte
   granularity.

 - Add discard interfaces to DM bufio. First consumer of the interface
   is the dm-ebs target that makes heavy use of dm-bufio.

 - Fix DM crypt's block queue_limits stacking to not truncate
   logical_block_size.

 - Add Documentation for DM integrity's status line.

 - Switch DMDEBUG from a compile time config option to instead use
   dynamic debug via pr_debug.

 - Fix DM multipath target's heuristic for how it manages
   "queue_if_no_path" state internally.

   DM multipath now avoids disabling "queue_if_no_path" unless it is
   actually needed (e.g. in response to a configured timeout or an
   explicit "fail_if_no_path" message).

   This fixes reports of spurious -EIO being returned to userspace
   applications during fault tolerance testing with an NVMe backend.
   Added various dynamic DMDEBUG messages to assist with debugging
   queue_if_no_path in the future.

 - Add a new DM multipath "Historical Service Time" Path Selector.

 - Fix DM multipath's dm_blk_ioctl() to switch paths on IO error.

 - Improve DM writecache target performance by using explicit cache
   flushing for the target's single-threaded use case, plus a small
   cleanup to remove an unnecessary test in persistent_memory_claim.

 - Other small cleanups in DM core, dm-persistent-data, and DM
   integrity.

* tag 'for-5.8/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (62 commits)
  dm crypt: avoid truncating the logical block size
  dm mpath: add DM device name to Failing/Reinstating path log messages
  dm mpath: enhance queue_if_no_path debugging
  dm mpath: restrict queue_if_no_path state machine
  dm mpath: simplify __must_push_back
  dm zoned: check superblock location
  dm zoned: prefer full zones for reclaim
  dm zoned: select reclaim zone based on device index
  dm zoned: allocate zone by device index
  dm zoned: support arbitrary number of devices
  dm zoned: move random and sequential zones into struct dmz_dev
  dm zoned: per-device reclaim
  dm zoned: add metadata pointer to struct dmz_dev
  dm zoned: add device pointer to struct dm_zone
  dm zoned: allocate temporary superblock for tertiary devices
  dm zoned: convert to xarray
  dm zoned: add a 'reserved' zone flag
  dm zoned: improve logging messages for reclaim
  dm zoned: avoid unnecessary device recalulation for secondary superblock
  dm zoned: add debugging message for reading superblocks
  ...
parents 818dbde7 64611a15
======
dm-ebs
======
This target is similar to the linear target except that it emulates
a smaller logical block size on a device with a larger logical block
size. Its main purpose is to provide emulation of 512 byte sectors on
devices that do not provide this emulation (i.e. 4K native disks).
Supported emulated logical block sizes are 512, 1024, 2048 and 4096 bytes.
The underlying block size can be set to > 4K to test buffering larger units.
Table parameters
----------------
<dev path> <offset> <emulated sectors> [<underlying sectors>]
Mandatory parameters:
<dev path>:
Full pathname to the underlying block-device,
or a "major:minor" device-number.
<offset>:
Starting sector within the device;
has to be a multiple of <emulated sectors>.
<emulated sectors>:
Number of sectors defining the logical block size to be emulated;
1, 2, 4, 8 sectors of 512 bytes supported.
Optional parameter:
<underlying sectors>:
Number of sectors defining the logical block size of <dev path>.
2^N supported, e.g. 8 = emulate 8 sectors of 512 bytes = 4KiB.
If not provided, the logical block size of <dev path> will be used.
Examples:
Emulate 1 sector = 512 bytes logical block size on /dev/sda starting at
offset 1024 sectors with the underlying device's block size set automatically:
ebs /dev/sda 1024 1
Emulate 2 sectors = 1KiB logical block size on /dev/sda starting at
offset 128 sectors, enforcing a 2KiB underlying device block size.
This presumes a logical block size of 2KiB or less on /dev/sda to work:
ebs /dev/sda 128 2 4
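The table constraints and the two examples above can be modeled in a few lines of userspace C. This is only an illustrative sketch, not the dm-ebs implementation: the names `ebs_geom`, `ebs_valid`, `ebs_block` and `ebs_block_offset` are hypothetical, and the mapping assumes that a 512-byte sector simply lands in the underlying block that contains it.

```c
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SIZE 512u

/* Hypothetical model of one ebs table line; not a kernel structure. */
struct ebs_geom {
	uint64_t offset;	/* <offset>, in 512-byte sectors */
	unsigned int e_bs;	/* <emulated sectors>: 1, 2, 4 or 8 */
	unsigned int u_bs;	/* <underlying sectors>: a power of two */
};

/* Check the constraints listed under "Table parameters". */
bool ebs_valid(const struct ebs_geom *g)
{
	if (g->e_bs != 1 && g->e_bs != 2 && g->e_bs != 4 && g->e_bs != 8)
		return false;
	if (g->u_bs == 0 || (g->u_bs & (g->u_bs - 1)) != 0)
		return false;
	/* <offset> has to be a multiple of <emulated sectors>. */
	return g->offset % g->e_bs == 0;
}

/* Index of the underlying block that holds a target-relative sector. */
uint64_t ebs_block(const struct ebs_geom *g, uint64_t sector)
{
	return (g->offset + sector) / g->u_bs;
}

/* Byte offset of that sector inside its underlying block. */
unsigned int ebs_block_offset(const struct ebs_geom *g, uint64_t sector)
{
	return (unsigned int)((g->offset + sector) % g->u_bs) * SECTOR_SIZE;
}
```

For the second example (`ebs /dev/sda 128 2 4`), sector 1 of the target falls into underlying block 32 at byte offset 512, which is why such an access needs a read-modify-write of the larger block.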
......@@ -193,6 +193,14 @@ should not be changed when reloading the target because the layout of disk
data depend on them and the reloaded target would be non-functional.
Status line:
1. the number of integrity mismatches
2. provided data sectors - that is the number of sectors that the user
could use
3. the current recalculating position (or '-' if we didn't recalculate)
The layout of the formatted block device:
* reserved sectors
......
......@@ -37,9 +37,13 @@ Algorithm
dm-zoned implements an on-disk buffering scheme to handle non-sequential
write accesses to the sequential zones of a zoned block device.
Conventional zones are used for caching as well as for storing internal
metadata.
metadata. It can also use a regular block device together with the zoned
block device; in that case the regular block device will be split logically
in zones with the same size as the zoned block device. These zones will be
placed in front of the zones from the zoned block device and will be handled
just like conventional zones.
The zones of the device are separated into 2 types:
The zones of the device(s) are separated into 2 types:
1) Metadata zones: these are conventional zones used to store metadata.
Metadata zones are not reported as useable capacity to the user.
......@@ -127,6 +131,13 @@ resumed. Flushing metadata thus only temporarily delays write and
discard requests. Read requests can be processed concurrently while
metadata flush is being executed.
If a regular device is used in conjunction with the zoned block device,
a third set of metadata (without the zone bitmaps) is written to the
start of the zoned block device. This metadata has a generation counter of
'0' and will never be updated during normal operation; it just serves for
identification purposes. The first and second copy of the metadata
are located at the start of the regular block device.
Usage
=====
......@@ -138,9 +149,46 @@ Ex::
dmzadm --format /dev/sdxx
For a formatted device, the target can be created normally with the
dmsetup utility. The only parameter that dm-zoned requires is the
underlying zoned block device name. Ex::
echo "0 `blockdev --getsize ${dev}` zoned ${dev}" | \
dmsetup create dmz-`basename ${dev}`
If two drives are to be used, both devices must be specified, with the
regular block device as the first device.
Ex::
dmzadm --format /dev/sdxx /dev/sdyy
Formatted device(s) can be started with the dmzadm utility, too:
Ex::
dmzadm --start /dev/sdxx /dev/sdyy
Information about the internal layout and current usage of the zones can
be obtained with the 'status' callback from dmsetup:
Ex::
dmsetup status /dev/dm-X
will return a line
0 <size> zoned <nr_zones> zones <nr_unmap_rnd>/<nr_rnd> random <nr_unmap_seq>/<nr_seq> sequential
where <nr_zones> is the total number of zones, <nr_unmap_rnd> is the number
of unmapped (i.e. free) random zones, <nr_rnd> the total number of random
zones, <nr_unmap_seq> the number of unmapped sequential zones, and <nr_seq>
the total number of sequential zones.
Normally the reclaim process will be started once fewer than 50 percent
of the random zones are free. In order to start the reclaim process manually
even before reaching this threshold, the 'dmsetup message' function can be
used:
Ex::
dmsetup message /dev/dm-X 0 reclaim
will start the reclaim process and random zones will be moved to sequential
zones.
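As a rough illustration of how the status line relates to the reclaim threshold, the following userspace C sketch parses a line and checks whether fewer than 50 percent of the random zones are free. It assumes the line has exactly the format documented above; `dmz_status`, `dmz_parse_status` and `dmz_below_reclaim_threshold` are hypothetical names, not part of dmsetup or dm-zoned.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical container for the fields of one dm-zoned status line. */
struct dmz_status {
	unsigned long long start, size;
	unsigned int nr_zones;
	unsigned int unmap_rnd, nr_rnd;
	unsigned int unmap_seq, nr_seq;
};

/* Returns 0 if all seven numeric fields were parsed, -1 otherwise. */
int dmz_parse_status(const char *line, struct dmz_status *st)
{
	int n = sscanf(line,
		       "%llu %llu zoned %u zones %u/%u random %u/%u sequential",
		       &st->start, &st->size, &st->nr_zones,
		       &st->unmap_rnd, &st->nr_rnd,
		       &st->unmap_seq, &st->nr_seq);

	return n == 7 ? 0 : -1;
}

/* True when fewer than 50 percent of the random zones are unmapped. */
bool dmz_below_reclaim_threshold(const struct dmz_status *st)
{
	return st->nr_rnd != 0 && 2ULL * st->unmap_rnd < st->nr_rnd;
}
```

A monitoring script could use such a check to decide when to send the `reclaim` message manually instead of waiting for the target to start reclaim on its own.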
......@@ -269,6 +269,7 @@ config DM_UNSTRIPED
config DM_CRYPT
tristate "Crypt target support"
depends on BLK_DEV_DM
depends on (ENCRYPTED_KEYS || ENCRYPTED_KEYS=n)
select CRYPTO
select CRYPTO_CBC
select CRYPTO_ESSIV
......@@ -336,6 +337,14 @@ config DM_WRITECACHE
The writecache target doesn't cache reads because reads are supposed
to be cached in standard RAM.
config DM_EBS
tristate "Emulated block size target (EXPERIMENTAL)"
depends on BLK_DEV_DM
select DM_BUFIO
help
dm-ebs emulates smaller logical block size on backing devices
with larger ones (e.g. 512 byte sectors on 4K native disks).
config DM_ERA
tristate "Era target (EXPERIMENTAL)"
depends on BLK_DEV_DM
......@@ -443,6 +452,17 @@ config DM_MULTIPATH_ST
If unsure, say N.
config DM_MULTIPATH_HST
tristate "I/O Path Selector based on historical service time"
depends on DM_MULTIPATH
help
This path selector is a dynamic load balancer which selects
the path expected to complete the incoming I/O in the shortest
time by comparing estimated service time (based on historical
service time).
If unsure, say N.
config DM_DELAY
tristate "I/O delaying target"
depends on BLK_DEV_DM
......
......@@ -17,6 +17,7 @@ dm-thin-pool-y += dm-thin.o dm-thin-metadata.o
dm-cache-y += dm-cache-target.o dm-cache-metadata.o dm-cache-policy.o \
dm-cache-background-tracker.o
dm-cache-smq-y += dm-cache-policy-smq.o
dm-ebs-y += dm-ebs-target.o
dm-era-y += dm-era-target.o
dm-clone-y += dm-clone-target.o dm-clone-metadata.o
dm-verity-y += dm-verity-target.o
......@@ -54,6 +55,7 @@ obj-$(CONFIG_DM_FLAKEY) += dm-flakey.o
obj-$(CONFIG_DM_MULTIPATH) += dm-multipath.o dm-round-robin.o
obj-$(CONFIG_DM_MULTIPATH_QL) += dm-queue-length.o
obj-$(CONFIG_DM_MULTIPATH_ST) += dm-service-time.o
obj-$(CONFIG_DM_MULTIPATH_HST) += dm-historical-service-time.o
obj-$(CONFIG_DM_SWITCH) += dm-switch.o
obj-$(CONFIG_DM_SNAPSHOT) += dm-snapshot.o
obj-$(CONFIG_DM_PERSISTENT_DATA) += persistent-data/
......@@ -65,6 +67,7 @@ obj-$(CONFIG_DM_THIN_PROVISIONING) += dm-thin-pool.o
obj-$(CONFIG_DM_VERITY) += dm-verity.o
obj-$(CONFIG_DM_CACHE) += dm-cache.o
obj-$(CONFIG_DM_CACHE_SMQ) += dm-cache-smq.o
obj-$(CONFIG_DM_EBS) += dm-ebs.o
obj-$(CONFIG_DM_ERA) += dm-era.o
obj-$(CONFIG_DM_CLONE) += dm-clone.o
obj-$(CONFIG_DM_LOG_WRITES) += dm-log-writes.o
......
......@@ -256,12 +256,35 @@ static struct dm_buffer *__find(struct dm_bufio_client *c, sector_t block)
if (b->block == block)
return b;
n = (b->block < block) ? n->rb_left : n->rb_right;
n = block < b->block ? n->rb_left : n->rb_right;
}
return NULL;
}
static struct dm_buffer *__find_next(struct dm_bufio_client *c, sector_t block)
{
struct rb_node *n = c->buffer_tree.rb_node;
struct dm_buffer *b;
struct dm_buffer *best = NULL;
while (n) {
b = container_of(n, struct dm_buffer, node);
if (b->block == block)
return b;
if (block <= b->block) {
n = n->rb_left;
best = b;
} else {
n = n->rb_right;
}
}
return best;
}
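The lookup in `__find_next()` is a "smallest key greater than or equal to the target" search. A minimal userspace sketch of the same idea on a plain binary search tree follows; `struct node` and `find_next` are hypothetical stand-ins for the rb-tree node and the kernel helper, and no locking or balancing is modeled.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical plain BST node; the kernel code uses struct rb_node. */
struct node {
	uint64_t block;
	struct node *left, *right;
};

/*
 * Return the node with the smallest block number >= block,
 * or NULL if every node in the tree is below block.
 */
struct node *find_next(struct node *root, uint64_t block)
{
	struct node *best = NULL;

	while (root) {
		if (root->block == block)
			return root;
		if (block < root->block) {
			/* Candidate: remember it, then look for a closer one. */
			best = root;
			root = root->left;
		} else {
			root = root->right;
		}
	}
	return best;
}
```

Each step either finds an exact match or narrows the search to one subtree, so the cost is proportional to the tree height.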
static void __insert(struct dm_bufio_client *c, struct dm_buffer *b)
{
struct rb_node **new = &c->buffer_tree.rb_node, *parent = NULL;
......@@ -276,8 +299,8 @@ static void __insert(struct dm_bufio_client *c, struct dm_buffer *b)
}
parent = *new;
new = (found->block < b->block) ?
&((*new)->rb_left) : &((*new)->rb_right);
new = b->block < found->block ?
&found->node.rb_left : &found->node.rb_right;
}
rb_link_node(&b->node, parent, new);
......@@ -631,6 +654,19 @@ static void use_bio(struct dm_buffer *b, int rw, sector_t sector,
submit_bio(bio);
}
static inline sector_t block_to_sector(struct dm_bufio_client *c, sector_t block)
{
sector_t sector;
if (likely(c->sectors_per_block_bits >= 0))
sector = block << c->sectors_per_block_bits;
else
sector = block * (c->block_size >> SECTOR_SHIFT);
sector += c->start;
return sector;
}
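The arithmetic factored out into `block_to_sector()` can be tried in userspace with the sketch below. It assumes, as the `>= 0` test in the hunk suggests, that a negative `sectors_per_block_bits` marks a block size that is not a power of two; `bufio_geom` and `compute_block_bits` are hypothetical names introduced for illustration.

```c
#include <stdint.h>

#define SECTOR_SHIFT 9

/* Hypothetical subset of the client fields used by the conversion. */
struct bufio_geom {
	unsigned int block_size;	/* bytes per block */
	int sectors_per_block_bits;	/* log2(sectors per block), or -1 */
	uint64_t start;			/* first sector of the buffered area */
};

/* log2 of the sectors per block, or -1 if that is not a power of two. */
int compute_block_bits(unsigned int block_size)
{
	unsigned int sectors = block_size >> SECTOR_SHIFT;
	int bits = 0;

	if (sectors == 0 || (sectors & (sectors - 1)) != 0)
		return -1;
	while ((1u << bits) < sectors)
		bits++;
	return bits;
}

/* Shift when possible, multiply otherwise, then add the start offset. */
uint64_t block_to_sector(const struct bufio_geom *c, uint64_t block)
{
	uint64_t sector;

	if (c->sectors_per_block_bits >= 0)
		sector = block << c->sectors_per_block_bits;
	else
		sector = block * (c->block_size >> SECTOR_SHIFT);
	return sector + c->start;
}
```

With 4096-byte blocks and a start offset of 8 sectors, block 2 begins at sector 24; with a 1536-byte block size the multiply path is taken instead.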
static void submit_io(struct dm_buffer *b, int rw, void (*end_io)(struct dm_buffer *, blk_status_t))
{
unsigned n_sectors;
......@@ -639,11 +675,7 @@ static void submit_io(struct dm_buffer *b, int rw, void (*end_io)(struct dm_buff
b->end_io = end_io;
if (likely(b->c->sectors_per_block_bits >= 0))
sector = b->block << b->c->sectors_per_block_bits;
else
sector = b->block * (b->c->block_size >> SECTOR_SHIFT);
sector += b->c->start;
sector = block_to_sector(b->c, b->block);
if (rw != REQ_OP_WRITE) {
n_sectors = b->c->block_size >> SECTOR_SHIFT;
......@@ -1325,6 +1357,30 @@ int dm_bufio_issue_flush(struct dm_bufio_client *c)
}
EXPORT_SYMBOL_GPL(dm_bufio_issue_flush);
/*
* Use dm-io to send a discard request to the device.
*/
int dm_bufio_issue_discard(struct dm_bufio_client *c, sector_t block, sector_t count)
{
struct dm_io_request io_req = {
.bi_op = REQ_OP_DISCARD,
.bi_op_flags = REQ_SYNC,
.mem.type = DM_IO_KMEM,
.mem.ptr.addr = NULL,
.client = c->dm_io,
};
struct dm_io_region io_reg = {
.bdev = c->bdev,
.sector = block_to_sector(c, block),
.count = block_to_sector(c, count),
};
BUG_ON(dm_bufio_in_request());
return dm_io(&io_req, 1, &io_reg, NULL);
}
EXPORT_SYMBOL_GPL(dm_bufio_issue_discard);
/*
* We first delete any other buffer that may be at that new location.
*
......@@ -1401,6 +1457,14 @@ void dm_bufio_release_move(struct dm_buffer *b, sector_t new_block)
}
EXPORT_SYMBOL_GPL(dm_bufio_release_move);
static void forget_buffer_locked(struct dm_buffer *b)
{
if (likely(!b->hold_count) && likely(!b->state)) {
__unlink_buffer(b);
__free_buffer_wake(b);
}
}
/*
* Free the given buffer.
*
......@@ -1414,15 +1478,36 @@ void dm_bufio_forget(struct dm_bufio_client *c, sector_t block)
dm_bufio_lock(c);
b = __find(c, block);
if (b && likely(!b->hold_count) && likely(!b->state)) {
__unlink_buffer(b);
__free_buffer_wake(b);
}
if (b)
forget_buffer_locked(b);
dm_bufio_unlock(c);
}
EXPORT_SYMBOL_GPL(dm_bufio_forget);
void dm_bufio_forget_buffers(struct dm_bufio_client *c, sector_t block, sector_t n_blocks)
{
struct dm_buffer *b;
sector_t end_block = block + n_blocks;
while (block < end_block) {
dm_bufio_lock(c);
b = __find_next(c, block);
if (b) {
block = b->block + 1;
forget_buffer_locked(b);
}
dm_bufio_unlock(c);
if (!b)
break;
}
}
EXPORT_SYMBOL_GPL(dm_bufio_forget_buffers);
void dm_bufio_set_minimum_buffers(struct dm_bufio_client *c, unsigned n)
{
c->minimum_buffers = n;
......
......@@ -34,7 +34,9 @@
#include <crypto/aead.h>
#include <crypto/authenc.h>
#include <linux/rtnetlink.h> /* for struct rtattr and RTA macros only */
#include <linux/key-type.h>
#include <keys/user-type.h>
#include <keys/encrypted-type.h>
#include <linux/device-mapper.h>
......@@ -212,7 +214,7 @@ struct crypt_config {
struct mutex bio_alloc_lock;
u8 *authenc_key; /* space for keys in authenc() format (if used) */
u8 key[0];
u8 key[];
};
#define MIN_IOS 64
......@@ -2215,12 +2217,47 @@ static bool contains_whitespace(const char *str)
return false;
}
static int set_key_user(struct crypt_config *cc, struct key *key)
{
const struct user_key_payload *ukp;
ukp = user_key_payload_locked(key);
if (!ukp)
return -EKEYREVOKED;
if (cc->key_size != ukp->datalen)
return -EINVAL;
memcpy(cc->key, ukp->data, cc->key_size);
return 0;
}
#if defined(CONFIG_ENCRYPTED_KEYS) || defined(CONFIG_ENCRYPTED_KEYS_MODULE)
static int set_key_encrypted(struct crypt_config *cc, struct key *key)
{
const struct encrypted_key_payload *ekp;
ekp = key->payload.data[0];
if (!ekp)
return -EKEYREVOKED;
if (cc->key_size != ekp->decrypted_datalen)
return -EINVAL;
memcpy(cc->key, ekp->decrypted_data, cc->key_size);
return 0;
}
#endif /* CONFIG_ENCRYPTED_KEYS */
static int crypt_set_keyring_key(struct crypt_config *cc, const char *key_string)
{
char *new_key_string, *key_desc;
int ret;
struct key_type *type;
struct key *key;
const struct user_key_payload *ukp;
int (*set_key)(struct crypt_config *cc, struct key *key);
/*
* Reject key_string with whitespace. dm core currently lacks code for
......@@ -2236,16 +2273,26 @@ static int crypt_set_keyring_key(struct crypt_config *cc, const char *key_string
if (!key_desc || key_desc == key_string || !strlen(key_desc + 1))
return -EINVAL;
if (strncmp(key_string, "logon:", key_desc - key_string + 1) &&
strncmp(key_string, "user:", key_desc - key_string + 1))
if (!strncmp(key_string, "logon:", key_desc - key_string + 1)) {
type = &key_type_logon;
set_key = set_key_user;
} else if (!strncmp(key_string, "user:", key_desc - key_string + 1)) {
type = &key_type_user;
set_key = set_key_user;
#if defined(CONFIG_ENCRYPTED_KEYS) || defined(CONFIG_ENCRYPTED_KEYS_MODULE)
} else if (!strncmp(key_string, "encrypted:", key_desc - key_string + 1)) {
type = &key_type_encrypted;
set_key = set_key_encrypted;
#endif
} else {
return -EINVAL;
}
new_key_string = kstrdup(key_string, GFP_KERNEL);
if (!new_key_string)
return -ENOMEM;
key = request_key(key_string[0] == 'l' ? &key_type_logon : &key_type_user,
key_desc + 1, NULL);
key = request_key(type, key_desc + 1, NULL);
if (IS_ERR(key)) {
kzfree(new_key_string);
return PTR_ERR(key);
......@@ -2253,23 +2300,14 @@ static int crypt_set_keyring_key(struct crypt_config *cc, const char *key_string
down_read(&key->sem);
ukp = user_key_payload_locked(key);
if (!ukp) {
up_read(&key->sem);
key_put(key);
kzfree(new_key_string);
return -EKEYREVOKED;
}
if (cc->key_size != ukp->datalen) {
ret = set_key(cc, key);
if (ret < 0) {
up_read(&key->sem);
key_put(key);
kzfree(new_key_string);
return -EINVAL;
return ret;
}
memcpy(cc->key, ukp->data, cc->key_size);
up_read(&key->sem);
key_put(key);
......@@ -2323,7 +2361,7 @@ static int get_key_size(char **key_string)
return (*key_string[0] == ':') ? -EINVAL : strlen(*key_string) >> 1;
}
#endif
#endif /* CONFIG_KEYS */
static int crypt_set_key(struct crypt_config *cc, char *key)
{
......@@ -3274,7 +3312,7 @@ static void crypt_io_hints(struct dm_target *ti, struct queue_limits *limits)
limits->max_segment_size = PAGE_SIZE;
limits->logical_block_size =
max_t(unsigned short, limits->logical_block_size, cc->sector_size);
max_t(unsigned, limits->logical_block_size, cc->sector_size);
limits->physical_block_size =
max_t(unsigned, limits->physical_block_size, cc->sector_size);
limits->io_min = max_t(unsigned, limits->io_min, cc->sector_size);
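Why the `unsigned short` variant truncates can be shown with a small userspace sketch. It imitates `max_t()` by casting both operands before comparing and assumes a 16-bit `unsigned short`, modeled here with `uint16_t`; the function names and the 65536-byte value are chosen purely for illustration.

```c
#include <stdint.h>

/* Imitates max_t(unsigned short, a, b): both operands are cast first. */
unsigned int max_as_ushort(unsigned int a, unsigned int b)
{
	uint16_t x = (uint16_t)a;	/* 65536 becomes 0 here */
	uint16_t y = (uint16_t)b;

	return x > y ? x : y;
}

/* Imitates max_t(unsigned, a, b): no narrowing takes place. */
unsigned int max_as_uint(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}
```

A stacked logical block size of 65536 compared against a 512-byte sector size yields 512 with the narrow cast and 65536 with the wide one, which is the kind of truncation the hunk avoids.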
......@@ -3282,7 +3320,7 @@ static void crypt_io_hints(struct dm_target *ti, struct queue_limits *limits)
static struct target_type crypt_target = {
.name = "crypt",
.version = {1, 20, 0},
.version = {1, 21, 0},
.module = THIS_MODULE,
.ctr = crypt_ctr,
.dtr = crypt_dtr,
......
This diff is collapsed.
......@@ -92,7 +92,7 @@ struct journal_entry {
} s;
__u64 sector;
} u;
commit_id_t last_bytes[0];
commit_id_t last_bytes[];
/* __u8 tag[0]; */
};
......@@ -1553,8 +1553,6 @@ static void integrity_metadata(struct work_struct *w)
char checksums_onstack[max((size_t)HASH_MAX_DIGESTSIZE, MAX_TAG_SIZE)];
sector_t sector;
unsigned sectors_to_process;
sector_t save_metadata_block;
unsigned save_metadata_offset;
if (unlikely(ic->mode == 'R'))
goto skip_io;
......@@ -1605,8 +1603,6 @@ static void integrity_metadata(struct work_struct *w)
goto skip_io;
}
save_metadata_block = dio->metadata_block;
save_metadata_offset = dio->metadata_offset;
sector = dio->range.logical_sector;
sectors_to_process = dio->range.n_sectors;
......
......@@ -127,7 +127,7 @@ struct pending_block {
char *data;
u32 datalen;
struct list_head list;
struct bio_vec vecs[0];
struct bio_vec vecs[];
};
struct per_bio_data {
......
......@@ -439,7 +439,7 @@ static struct pgpath *choose_pgpath(struct multipath *m, size_t nr_bytes)
}
/*
* dm_report_EIO() is a macro instead of a function to make pr_debug()
* dm_report_EIO() is a macro instead of a function to make pr_debug_ratelimited()
* report the function name and line number of the function from which
* it has been invoked.
*/
......@@ -447,43 +447,25 @@ static struct pgpath *choose_pgpath(struct multipath *m, size_t nr_bytes)
do { \
struct mapped_device *md = dm_table_get_md((m)->ti->table); \
\
pr_debug("%s: returning EIO; QIFNP = %d; SQIFNP = %d; DNFS = %d\n", \
dm_device_name(md), \
test_bit(MPATHF_QUEUE_IF_NO_PATH, &(m)->flags), \
test_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &(m)->flags), \
dm_noflush_suspending((m)->ti)); \
DMDEBUG_LIMIT("%s: returning EIO; QIFNP = %d; SQIFNP = %d; DNFS = %d", \
dm_device_name(md), \
test_bit(MPATHF_QUEUE_IF_NO_PATH, &(m)->flags), \
test_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &(m)->flags), \
dm_noflush_suspending((m)->ti)); \
} while (0)
/*
* Check whether bios must be queued in the device-mapper core rather
* than here in the target.
*
* If MPATHF_QUEUE_IF_NO_PATH and MPATHF_SAVED_QUEUE_IF_NO_PATH hold
* the same value then we are not between multipath_presuspend()
* and multipath_resume() calls and we have no need to check
* for the DMF_NOFLUSH_SUSPENDING flag.
*/
static bool __must_push_back(struct multipath *m, unsigned long flags)
static bool __must_push_back(struct multipath *m)
{
return ((test_bit(MPATHF_QUEUE_IF_NO_PATH, &flags) !=
test_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &flags)) &&
dm_noflush_suspending(m->ti));
return dm_noflush_suspending(m->ti);
}
/*
* Following functions use READ_ONCE to get atomic access to
* all m->flags to avoid taking spinlock
*/
static bool must_push_back_rq(struct multipath *m)
{
unsigned long flags = READ_ONCE(m->flags);
return test_bit(MPATHF_QUEUE_IF_NO_PATH, &flags) || __must_push_back(m, flags);
}
static bool must_push_back_bio(struct multipath *m)
{
unsigned long flags = READ_ONCE(m->flags);
return __must_push_back(m, flags);
return test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags) || __must_push_back(m);
}
/*
......@@ -567,7 +549,8 @@ static void multipath_release_clone(struct request *clone,
if (pgpath && pgpath->pg->ps.type->end_io)
pgpath->pg->ps.type->end_io(&pgpath->pg->ps,
&pgpath->path,
mpio->nr_bytes);
mpio->nr_bytes,
clone->io_start_time_ns);
}
blk_put_request(clone);
......@@ -619,7 +602,7 @@ static int __multipath_map_bio(struct multipath *m, struct bio *bio,
return DM_MAPIO_SUBMITTED;
if (!pgpath) {
if (must_push_back_bio(m))
if (__must_push_back(m))
return DM_MAPIO_REQUEUE;
dm_report_EIO(m);
return DM_MAPIO_KILL;
......@@ -709,15 +692,38 @@ static void process_queued_bios(struct work_struct *work)
* If we run out of usable paths, should we queue I/O or error it?
*/
static int queue_if_no_path(struct multipath *m, bool queue_if_no_path,
bool save_old_value)
bool save_old_value, const char *caller)
{
unsigned long flags;
bool queue_if_no_path_bit, saved_queue_if_no_path_bit;
const char *dm_dev_name = dm_device_name(dm_table_get_md(m->ti->table));
DMDEBUG("%s: %s caller=%s queue_if_no_path=%d save_old_value=%d",
dm_dev_name, __func__, caller, queue_if_no_path, save_old_value);
spin_lock_irqsave(&m->lock, flags);
assign_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags,
(save_old_value && test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags)) ||
(!save_old_value && queue_if_no_path));
queue_if_no_path_bit = test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags);
saved_queue_if_no_path_bit = test_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags);
if (save_old_value) {
if (unlikely(!queue_if_no_path_bit && saved_queue_if_no_path_bit)) {
DMERR("%s: QIFNP disabled but saved as enabled, saving again loses state, not saving!",
dm_dev_name);
} else
assign_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags, queue_if_no_path_bit);
} else if (!queue_if_no_path && saved_queue_if_no_path_bit) {
/* due to "fail_if_no_path" message, need to honor it. */
clear_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags);
}
assign_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags, queue_if_no_path);
DMDEBUG("%s: after %s changes; QIFNP = %d; SQIFNP = %d; DNFS = %d",
dm_dev_name, __func__,
test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags),
test_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags),
dm_noflush_suspending(m->ti));
spin_unlock_irqrestore(&m->lock, flags);
if (!queue_if_no_path) {
......@@ -738,7 +744,7 @@ static void queue_if_no_path_timeout_work(struct timer_list *t)
struct mapped_device *md = dm_table_get_md(m->ti->table);
DMWARN("queue_if_no_path timeout on %s, failing queued IO", dm_device_name(md));
queue_if_no_path(m, false, false);
queue_if_no_path(m, false, false, __func__);
}
/*
......@@ -1078,7 +1084,7 @@ static int parse_features(struct dm_arg_set *as, struct multipath *m)
argc--;
if (!strcasecmp(arg_name, "queue_if_no_path")) {
r = queue_if_no_path(m, true, false);
r = queue_if_no_path(m, true, false, __func__);
continue;
}
......@@ -1279,7 +1285,9 @@ static int fail_path(struct pgpath *pgpath)
if (!pgpath->is_active)
goto out;
DMWARN("Failing path %s.", pgpath->path.dev->name);
DMWARN("%s: Failing path %s.",
dm_device_name(dm_table_get_md(m->ti->table)),
pgpath->path.dev->name);
pgpath->pg->ps.type->fail_path(&pgpath->pg->ps, &pgpath->path);
pgpath->is_active = false;
......@@ -1318,7 +1326,9 @@ static int reinstate_path(struct pgpath *pgpath)
if (pgpath->is_active)
goto out;
DMWARN("Reinstating path %s.", pgpath->path.dev->name);
DMWARN("%s: Reinstating path %s.",
dm_device_name(dm_table_get_md(m->ti->table)),
pgpath->path.dev->name);
r = pgpath->pg->ps.type->reinstate_path(&pgpath->pg->ps, &pgpath->path);
if (r)
......@@ -1617,7 +1627,8 @@ static int multipath_end_io(struct dm_target *ti, struct request *clone,
struct path_selector *ps = &pgpath->pg->ps;
if (ps->type->end_io)
ps->type->end_io(ps, &pgpath->path, mpio->nr_bytes);
ps->type->end_io(ps, &pgpath->path, mpio->nr_bytes,
clone->io_start_time_ns);
}
return r;
......@@ -1640,7 +1651,7 @@ static int multipath_end_io_bio(struct dm_target *ti, struct bio *clone,
if (atomic_read(&m->nr_valid_paths) == 0 &&
!test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags)) {
if (must_push_back_bio(m)) {
if (__must_push_back(m)) {
r = DM_ENDIO_REQUEUE;
} else {
dm_report_EIO(m);
......@@ -1661,23 +1672,27 @@ static int multipath_end_io_bio(struct dm_target *ti, struct bio *clone,
struct path_selector *ps = &pgpath->pg->ps;
if (ps->type->end_io)
ps->type->end_io(ps, &pgpath->path, mpio->nr_bytes);
ps->type->end_io(ps, &pgpath->path, mpio->nr_bytes,
dm_start_time_ns_from_clone(clone));
}
return r;
}
/*
* Suspend can't complete until all the I/O is processed so if
* the last path fails we must error any remaining I/O.
* Note that if the freeze_bdev fails while suspending, the
* queue_if_no_path state is lost - userspace should reset it.
* Suspend with flush can't complete until all the I/O is processed
* so if the last path fails we must error any remaining I/O.
* - Note that if the freeze_bdev fails while suspending, the
* queue_if_no_path state is lost - userspace should reset it.
* Otherwise, during noflush suspend, queue_if_no_path will not change.
*/
static void multipath_presuspend(struct dm_target *ti)
{
struct multipath *m = ti->private;
queue_if_no_path(m, false, true);
/* FIXME: bio-based shouldn't need to always disable queue_if_no_path */
if (m->queue_mode == DM_TYPE_BIO_BASED || !dm_noflush_suspending(m->ti))
queue_if_no_path(m, false, true, __func__);
}
static void multipath_postsuspend(struct dm_target *ti)
......@@ -1698,8 +1713,16 @@ static void multipath_resume(struct dm_target *ti)
unsigned long flags;
spin_lock_irqsave(&m->lock, flags);
assign_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags,
test_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags));
if (test_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags)) {
set_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags);
clear_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags);
}
DMDEBUG("%s: %s finished; QIFNP = %d; SQIFNP = %d",
dm_device_name(dm_table_get_md(m->ti->table)), __func__,
test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags),
test_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags));
spin_unlock_irqrestore(&m->lock, flags);
}
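The flag handling in the reworked `queue_if_no_path()` and `multipath_resume()` can be followed more easily in a stripped-down model. The sketch below keeps only the two bits and the branch structure seen in the hunks above; `qifnp_state`, `model_queue_if_no_path` and `model_resume` are hypothetical names, and locking, logging and the queued-I/O handling are left out.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two MPATHF_* bits in m->flags. */
struct qifnp_state {
	bool queue;	/* models MPATHF_QUEUE_IF_NO_PATH */
	bool saved;	/* models MPATHF_SAVED_QUEUE_IF_NO_PATH */
};

/* Mirrors the flag updates in the reworked queue_if_no_path(). */
void model_queue_if_no_path(struct qifnp_state *s, bool queue_if_no_path,
			    bool save_old_value)
{
	if (save_old_value) {
		/* Saving again while disabled would lose a saved "enabled". */
		if (!(!s->queue && s->saved))
			s->saved = s->queue;
	} else if (!queue_if_no_path && s->saved) {
		/* Explicit "fail_if_no_path": drop the saved value as well. */
		s->saved = false;
	}
	s->queue = queue_if_no_path;
}

/* Mirrors multipath_resume(): restore only if something was saved. */
void model_resume(struct qifnp_state *s)
{
	if (s->saved) {
		s->queue = true;
		s->saved = false;
	}
}
```

In this model two presuspend calls in a row followed by a resume bring queueing back, while a "fail_if_no_path" issued during the suspend leaves queueing disabled after resume.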
......@@ -1859,13 +1882,13 @@ static int multipath_message(struct dm_target *ti, unsigned argc, char **argv,
if (argc == 1) {
if (!strcasecmp(argv[0], "queue_if_no_path")) {
r = queue_if_no_path(m, true, false);
r = queue_if_no_path(m, true, false, __func__);
spin_lock_irqsave(&m->lock, flags);
enable_nopath_timeout(m);
spin_unlock_irqrestore(&m->lock, flags);
goto out;
} else if (!strcasecmp(argv[0], "fail_if_no_path")) {
r = queue_if_no_path(m, false, false);
r = queue_if_no_path(m, false, false, __func__);
disable_nopath_timeout(m);
goto out;
}
......@@ -1918,7 +1941,7 @@ static int multipath_prepare_ioctl(struct dm_target *ti,
int r;
current_pgpath = READ_ONCE(m->current_pgpath);
if (!current_pgpath)
if (!current_pgpath || !test_bit(MPATHF_QUEUE_IO, &m->flags))
current_pgpath = choose_pgpath(m, 0);
if (current_pgpath) {
......
......@@ -74,7 +74,7 @@ struct path_selector_type {
int (*start_io) (struct path_selector *ps, struct dm_path *path,
size_t nr_bytes);
int (*end_io) (struct path_selector *ps, struct dm_path *path,
size_t nr_bytes);
size_t nr_bytes, u64 start_time);
};
/* Register a path selector */
......
......@@ -227,7 +227,7 @@ static int ql_start_io(struct path_selector *ps, struct dm_path *path,
}
static int ql_end_io(struct path_selector *ps, struct dm_path *path,
size_t nr_bytes)
size_t nr_bytes, u64 start_time)
{
struct path_info *pi = path->pscontext;
......
......@@ -254,7 +254,7 @@ struct raid_set {
int mode;
} journal_dev;
struct raid_dev dev[0];
struct raid_dev dev[];
};
static void rs_config_backup(struct raid_set *rs, struct rs_layout *l)
......
......@@ -83,7 +83,7 @@ struct mirror_set {
struct work_struct trigger_event;
unsigned nr_mirrors;
struct mirror mirror[0];
struct mirror mirror[];
};
DECLARE_DM_KCOPYD_THROTTLE_WITH_MODULE_PARM(raid1_resync_throttle,
......
......@@ -309,7 +309,7 @@ static int st_start_io(struct path_selector *ps, struct dm_path *path,
}
static int st_end_io(struct path_selector *ps, struct dm_path *path,
size_t nr_bytes)
size_t nr_bytes, u64 start_time)
{
struct path_info *pi = path->pscontext;
......
......@@ -56,7 +56,7 @@ struct dm_stat {
size_t percpu_alloc_size;
size_t histogram_alloc_size;
struct dm_stat_percpu *stat_percpu[NR_CPUS];
struct dm_stat_shared stat_shared[0];
struct dm_stat_shared stat_shared[];
};
#define STAT_PRECISE_TIMESTAMPS 1
......
......@@ -41,7 +41,7 @@ struct stripe_c {
/* Work struct used for triggering events*/
struct work_struct trigger_event;
struct stripe stripe[0];
struct stripe stripe[];
};
/*
......
......@@ -53,7 +53,7 @@ struct switch_ctx {
/*
* Array of dm devices to switch between.
*/
struct switch_path path_list[0];
struct switch_path path_list[];
};
static struct switch_ctx *alloc_switch_ctx(struct dm_target *ti, unsigned nr_paths,
......
......@@ -234,10 +234,6 @@ static int persistent_memory_claim(struct dm_writecache *wc)
wc->memory_vmapped = false;
if (!wc->ssd_dev->dax_dev) {
r = -EOPNOTSUPP;
goto err1;
}
s = wc->memory_map_size;
p = s >> PAGE_SHIFT;
if (!p) {
......@@ -1143,6 +1139,42 @@ static int writecache_message(struct dm_target *ti, unsigned argc, char **argv,
return r;
}
static void memcpy_flushcache_optimized(void *dest, void *source, size_t size)
{
/*
* clflushopt performs better with block size 1024, 2048, 4096
* non-temporal stores perform better with block size 512
*
* block size 512 1024 2048 4096
* movnti 496 MB/s 642 MB/s 725 MB/s 744 MB/s
* clflushopt 373 MB/s 688 MB/s 1.1 GB/s 1.2 GB/s
*
* We see that movnti performs better for 512-byte blocks, and
* clflushopt performs better for 1024-byte and larger blocks. So, we
* prefer clflushopt for sizes >= 768.
*
* NOTE: this happens to be the case now (with dm-writecache's single
* threaded model) but re-evaluate this once memcpy_flushcache() is
* enabled to use movdir64b which might invalidate this performance
* advantage seen with cache-allocating-writes plus flushing.
*/
#ifdef CONFIG_X86
if (static_cpu_has(X86_FEATURE_CLFLUSHOPT) &&
likely(boot_cpu_data.x86_clflush_size == 64) &&
likely(size >= 768)) {
do {
memcpy((void *)dest, (void *)source, 64);
clflushopt((void *)dest);
dest += 64;
source += 64;
size -= 64;
} while (size >= 64);
return;
}
#endif
memcpy_flushcache(dest, source, size);
}
static void bio_copy_block(struct dm_writecache *wc, struct bio *bio, void *data)
{
void *buf;
@@ -1168,7 +1200,7 @@ static void bio_copy_block(struct dm_writecache *wc, struct bio *bio, void *data
}
} else {
flush_dcache_page(bio_page(bio));
memcpy_flushcache(data, buf, size);
memcpy_flushcache_optimized(data, buf, size);
}
bvec_kunmap_irq(buf, &flags);
......
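The selection logic in `memcpy_flushcache_optimized()` above can be illustrated with a portable userspace sketch. It keeps the same 768-byte threshold and 64-byte cache-line stride, but replaces the flush with a plain `memcpy()`, so it only demonstrates the control flow, not the performance effect. The names `copy_strategy`, `pick_strategy` and `copy_by_cache_line` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define CACHE_LINE_SIZE	64	/* mirrors the x86_clflush_size == 64 check */
#define FLUSH_THRESHOLD	768	/* midpoint between 512 and 1024 bytes */

/* Hypothetical strategy names, for illustration only. */
enum copy_strategy {
	COPY_NON_TEMPORAL,	/* stands in for memcpy_flushcache() */
	COPY_THEN_FLUSH,	/* stands in for memcpy() plus clflushopt */
};

/* Choose a strategy the same way the kernel condition does. */
static enum copy_strategy pick_strategy(bool has_clflushopt,
					size_t line_size, size_t size)
{
	if (has_clflushopt && line_size == CACHE_LINE_SIZE &&
	    size >= FLUSH_THRESHOLD)
		return COPY_THEN_FLUSH;
	return COPY_NON_TEMPORAL;
}

/*
 * Copy size bytes one cache line at a time and return the number of
 * lines copied. A real implementation would flush each destination
 * line after copying it; that step is omitted here.
 */
static size_t copy_by_cache_line(void *dest, const void *source, size_t size)
{
	unsigned char *d = dest;
	const unsigned char *s = source;
	size_t lines = 0;

	/*
	 * The do/while loop drops any tail shorter than one cache line,
	 * so this sketch requires a whole number of lines.
	 */
	assert(size >= CACHE_LINE_SIZE && size % CACHE_LINE_SIZE == 0);

	do {
		memcpy(d, s, CACHE_LINE_SIZE);
		d += CACHE_LINE_SIZE;
		s += CACHE_LINE_SIZE;
		size -= CACHE_LINE_SIZE;
		lines++;
	} while (size >= CACHE_LINE_SIZE);

	return lines;
}
```

With these definitions a 512-byte copy selects `COPY_NON_TEMPORAL` and a 1024-byte copy selects `COPY_THEN_FLUSH`, matching the benchmark table in the comment.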
@@ -45,34 +45,50 @@
#define dmz_bio_block(bio) dmz_sect2blk((bio)->bi_iter.bi_sector)
#define dmz_bio_blocks(bio) dmz_sect2blk(bio_sectors(bio))
struct dmz_metadata;
struct dmz_reclaim;
/*
* Zoned block device information.
*/
struct dmz_dev {
struct block_device *bdev;
struct dmz_metadata *metadata;
struct dmz_reclaim *reclaim;
char name[BDEVNAME_SIZE];
uuid_t uuid;
sector_t capacity;
unsigned int dev_idx;
unsigned int nr_zones;
unsigned int zone_offset;
unsigned int flags;
sector_t zone_nr_sectors;
unsigned int zone_nr_sectors_shift;
sector_t zone_nr_blocks;
sector_t zone_nr_blocks_shift;
unsigned int nr_rnd;
atomic_t unmap_nr_rnd;
struct list_head unmap_rnd_list;
struct list_head map_rnd_list;
unsigned int nr_seq;
atomic_t unmap_nr_seq;
struct list_head unmap_seq_list;
struct list_head map_seq_list;
};
#define dmz_bio_chunk(dev, bio) ((bio)->bi_iter.bi_sector >> \
(dev)->zone_nr_sectors_shift)
#define dmz_chunk_block(dev, b) ((b) & ((dev)->zone_nr_blocks - 1))
#define dmz_bio_chunk(zmd, bio) ((bio)->bi_iter.bi_sector >> \
dmz_zone_nr_sectors_shift(zmd))
#define dmz_chunk_block(zmd, b) ((b) & (dmz_zone_nr_blocks(zmd) - 1))
/* Device flags. */
#define DMZ_BDEV_DYING (1 << 0)
#define DMZ_CHECK_BDEV (2 << 0)
#define DMZ_BDEV_REGULAR (4 << 0)
/*
* Zone descriptor.
@@ -81,12 +97,18 @@ struct dm_zone {
/* For listing the zone depending on its state */
struct list_head link;
/* Device containing this zone */
struct dmz_dev *dev;
/* Zone type and state */
unsigned long flags;
/* Zone activation reference count */
atomic_t refcount;
/* Zone id */
unsigned int id;
/* Zone write pointer block (relative to the zone start block) */
unsigned int wp_block;
@@ -109,6 +131,7 @@ struct dm_zone {
*/
enum {
/* Zone write type */
DMZ_CACHE,
DMZ_RND,
DMZ_SEQ,
@@ -120,22 +143,28 @@ enum {
DMZ_META,
DMZ_DATA,
DMZ_BUF,
DMZ_RESERVED,
/* Zone internal state */
DMZ_RECLAIM,
DMZ_SEQ_WRITE_ERR,
DMZ_RECLAIM_TERMINATE,
};
/*
* Zone data accessors.
*/
#define dmz_is_cache(z) test_bit(DMZ_CACHE, &(z)->flags)
#define dmz_is_rnd(z) test_bit(DMZ_RND, &(z)->flags)
#define dmz_is_seq(z) test_bit(DMZ_SEQ, &(z)->flags)
#define dmz_is_empty(z) ((z)->wp_block == 0)
#define dmz_is_offline(z) test_bit(DMZ_OFFLINE, &(z)->flags)
#define dmz_is_readonly(z) test_bit(DMZ_READ_ONLY, &(z)->flags)
#define dmz_in_reclaim(z) test_bit(DMZ_RECLAIM, &(z)->flags)
#define dmz_is_reserved(z) test_bit(DMZ_RESERVED, &(z)->flags)
#define dmz_seq_write_err(z) test_bit(DMZ_SEQ_WRITE_ERR, &(z)->flags)
#define dmz_reclaim_should_terminate(z) \
test_bit(DMZ_RECLAIM_TERMINATE, &(z)->flags)
#define dmz_is_meta(z) test_bit(DMZ_META, &(z)->flags)
#define dmz_is_buf(z) test_bit(DMZ_BUF, &(z)->flags)
@@ -158,13 +187,11 @@ enum {
#define dmz_dev_debug(dev, format, args...) \
DMDEBUG("(%s): " format, (dev)->name, ## args)
struct dmz_metadata;
struct dmz_reclaim;
/*
* Functions defined in dm-zoned-metadata.c
*/
int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **zmd);
int dmz_ctr_metadata(struct dmz_dev *dev, int num_dev,
struct dmz_metadata **zmd, const char *devname);
void dmz_dtr_metadata(struct dmz_metadata *zmd);
int dmz_resume_metadata(struct dmz_metadata *zmd);
@@ -175,23 +202,38 @@ void dmz_unlock_metadata(struct dmz_metadata *zmd);
void dmz_lock_flush(struct dmz_metadata *zmd);
void dmz_unlock_flush(struct dmz_metadata *zmd);
int dmz_flush_metadata(struct dmz_metadata *zmd);
const char *dmz_metadata_label(struct dmz_metadata *zmd);
unsigned int dmz_id(struct dmz_metadata *zmd, struct dm_zone *zone);
sector_t dmz_start_sect(struct dmz_metadata *zmd, struct dm_zone *zone);
sector_t dmz_start_block(struct dmz_metadata *zmd, struct dm_zone *zone);
unsigned int dmz_nr_chunks(struct dmz_metadata *zmd);
bool dmz_check_dev(struct dmz_metadata *zmd);
bool dmz_dev_is_dying(struct dmz_metadata *zmd);
#define DMZ_ALLOC_RND 0x01
#define DMZ_ALLOC_RECLAIM 0x02
#define DMZ_ALLOC_CACHE 0x02
#define DMZ_ALLOC_SEQ 0x04
#define DMZ_ALLOC_RECLAIM 0x10
struct dm_zone *dmz_alloc_zone(struct dmz_metadata *zmd, unsigned long flags);
struct dm_zone *dmz_alloc_zone(struct dmz_metadata *zmd,
unsigned int dev_idx, unsigned long flags);
void dmz_free_zone(struct dmz_metadata *zmd, struct dm_zone *zone);
void dmz_map_zone(struct dmz_metadata *zmd, struct dm_zone *zone,
unsigned int chunk);
void dmz_unmap_zone(struct dmz_metadata *zmd, struct dm_zone *zone);
unsigned int dmz_nr_rnd_zones(struct dmz_metadata *zmd);
unsigned int dmz_nr_unmap_rnd_zones(struct dmz_metadata *zmd);
unsigned int dmz_nr_zones(struct dmz_metadata *zmd);
unsigned int dmz_nr_cache_zones(struct dmz_metadata *zmd);
unsigned int dmz_nr_unmap_cache_zones(struct dmz_metadata *zmd);
unsigned int dmz_nr_rnd_zones(struct dmz_metadata *zmd, int idx);
unsigned int dmz_nr_unmap_rnd_zones(struct dmz_metadata *zmd, int idx);
unsigned int dmz_nr_seq_zones(struct dmz_metadata *zmd, int idx);
unsigned int dmz_nr_unmap_seq_zones(struct dmz_metadata *zmd, int idx);
unsigned int dmz_zone_nr_blocks(struct dmz_metadata *zmd);
unsigned int dmz_zone_nr_blocks_shift(struct dmz_metadata *zmd);
unsigned int dmz_zone_nr_sectors(struct dmz_metadata *zmd);
unsigned int dmz_zone_nr_sectors_shift(struct dmz_metadata *zmd);
/*
* Activate a zone (increment its reference count).
@@ -201,26 +243,10 @@ static inline void dmz_activate_zone(struct dm_zone *zone)
atomic_inc(&zone->refcount);
}
/*
* Deactivate a zone. This decrements the zone reference counter
* indicating that all BIOs to the zone have completed when the count is 0.
*/
static inline void dmz_deactivate_zone(struct dm_zone *zone)
{
atomic_dec(&zone->refcount);
}
/*
* Test if a zone is active, that is, has a refcount > 0.
*/
static inline bool dmz_is_active(struct dm_zone *zone)
{
return atomic_read(&zone->refcount);
}
int dmz_lock_zone_reclaim(struct dm_zone *zone);
void dmz_unlock_zone_reclaim(struct dm_zone *zone);
struct dm_zone *dmz_get_zone_for_reclaim(struct dmz_metadata *zmd);
struct dm_zone *dmz_get_zone_for_reclaim(struct dmz_metadata *zmd,
unsigned int dev_idx, bool idle);
struct dm_zone *dmz_get_chunk_mapping(struct dmz_metadata *zmd,
unsigned int chunk, int op);
@@ -244,8 +270,7 @@ int dmz_merge_valid_blocks(struct dmz_metadata *zmd, struct dm_zone *from_zone,
/*
* Functions defined in dm-zoned-reclaim.c
*/
int dmz_ctr_reclaim(struct dmz_dev *dev, struct dmz_metadata *zmd,
struct dmz_reclaim **zrc);
int dmz_ctr_reclaim(struct dmz_metadata *zmd, struct dmz_reclaim **zrc, int idx);
void dmz_dtr_reclaim(struct dmz_reclaim *zrc);
void dmz_suspend_reclaim(struct dmz_reclaim *zrc);
void dmz_resume_reclaim(struct dmz_reclaim *zrc);
@@ -258,4 +283,22 @@ void dmz_schedule_reclaim(struct dmz_reclaim *zrc);
bool dmz_bdev_is_dying(struct dmz_dev *dmz_dev);
bool dmz_check_bdev(struct dmz_dev *dmz_dev);
/*
* Deactivate a zone. This decrements the zone reference counter
* indicating that all BIOs to the zone have completed when the count is 0.
*/
static inline void dmz_deactivate_zone(struct dm_zone *zone)
{
dmz_reclaim_bio_acc(zone->dev->reclaim);
atomic_dec(&zone->refcount);
}
/*
* Test if a zone is active, that is, has a refcount > 0.
*/
static inline bool dmz_is_active(struct dm_zone *zone)
{
return atomic_read(&zone->refcount);
}
#endif /* DM_ZONED_H */
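The activate/deactivate/is-active helpers above implement a simple activity counter on each zone. A minimal userspace sketch of the same pattern using C11 atomics follows; `zone_ref` and the `zone_*` functions are hypothetical names. The extra `dmz_reclaim_bio_acc()` call that the relocated `dmz_deactivate_zone()` makes is left out, since it depends on reclaim state that this sketch does not model.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the refcount field of struct dm_zone. */
struct zone_ref {
	atomic_int refcount;
};

/* Mark the zone as in use by one more BIO. */
static inline void zone_activate(struct zone_ref *z)
{
	atomic_fetch_add(&z->refcount, 1);
}

/* Drop one user; the zone is idle again once the count reaches 0. */
static inline void zone_deactivate(struct zone_ref *z)
{
	int old = atomic_fetch_sub(&z->refcount, 1);

	/* Catch unbalanced deactivation in this sketch. */
	assert(old > 0);
	(void)old;
}

/* A zone is active while its count is non-zero. */
static inline bool zone_is_active(struct zone_ref *z)
{
	return atomic_load(&z->refcount) != 0;
}
```

The kernel version returns the raw `atomic_read()` value from a `bool` function, which converts any non-zero count to `true`; the explicit `!= 0` here expresses the same test.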
@@ -676,6 +676,15 @@ static bool md_in_flight(struct mapped_device *md)
return md_in_flight_bios(md);
}
u64 dm_start_time_ns_from_clone(struct bio *bio)
{
struct dm_target_io *tio = container_of(bio, struct dm_target_io, clone);
struct dm_io *io = tio->io;
return jiffies_to_nsecs(io->start_time);
}
EXPORT_SYMBOL_GPL(dm_start_time_ns_from_clone);
static void start_io_acct(struct dm_io *io)
{
struct mapped_device *md = io->md;
@@ -2610,7 +2619,7 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map,
if (noflush)
set_bit(DMF_NOFLUSH_SUSPENDING, &md->flags);
else
pr_debug("%s: suspending with flush\n", dm_device_name(md));
DMDEBUG("%s: suspending with flush", dm_device_name(md));
/*
* This gets reverted if there's an error later and the targets
......
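`dm_start_time_ns_from_clone()` in the hunk above recovers the enclosing `dm_target_io` from a pointer to its embedded `clone` bio, then converts the recorded start time to nanoseconds. The sketch below shows the same two steps in userspace. All `fake_*` types, `TICKS_PER_SEC` and `ticks_to_ns()` are hypothetical stand-ins; the kernel uses its own `container_of()` and `jiffies_to_nsecs()`, whose exact arithmetic may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified form of the container_of() idiom. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-ins for struct bio, struct dm_io, struct dm_target_io. */
struct fake_bio {
	int dummy;
};

struct fake_io {
	uint64_t start_time;	/* in ticks */
};

struct fake_target_io {
	struct fake_io *io;
	struct fake_bio clone;	/* embedded, not a pointer */
};

#define TICKS_PER_SEC 250ULL	/* assumed tick rate, illustration only */

/* Convert ticks to nanoseconds for a rate that divides 1e9 evenly. */
static uint64_t ticks_to_ns(uint64_t ticks)
{
	return ticks * (1000000000ULL / TICKS_PER_SEC);
}

/* Walk from the embedded clone back to its owner, then to the shared io. */
static uint64_t start_time_ns_from_clone(struct fake_bio *bio)
{
	struct fake_target_io *tio =
		container_of(bio, struct fake_target_io, clone);

	assert(tio->io != NULL);
	return ticks_to_ns(tio->io->start_time);
}
```

The lookup is only valid for a bio that really is the `clone` member of a target io, which is why the kernel helper is named after the clone.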
@@ -38,7 +38,7 @@ struct node_header {
struct btree_node {
struct node_header header;
__le64 keys[0];
__le64 keys[];
} __packed;
@@ -68,7 +68,7 @@ struct ro_spine {
};
void init_ro_spine(struct ro_spine *s, struct dm_btree_info *info);
int exit_ro_spine(struct ro_spine *s);
void exit_ro_spine(struct ro_spine *s);
int ro_step(struct ro_spine *s, dm_block_t new_child);
void ro_pop(struct ro_spine *s);
struct btree_node *ro_node(struct ro_spine *s);
......
@@ -132,15 +132,13 @@ void init_ro_spine(struct ro_spine *s, struct dm_btree_info *info)
s->nodes[1] = NULL;
}
int exit_ro_spine(struct ro_spine *s)
void exit_ro_spine(struct ro_spine *s)
{
int r = 0, i;
int i;
for (i = 0; i < s->count; i++) {
unlock_block(s->info, s->nodes[i]);
}
return r;
}
int ro_step(struct ro_spine *s, dm_block_t new_child)
......
@@ -332,6 +332,8 @@ void *dm_per_bio_data(struct bio *bio, size_t data_size);
struct bio *dm_bio_from_per_bio_data(void *data, size_t data_size);
unsigned dm_bio_get_target_bio_nr(const struct bio *bio);
u64 dm_start_time_ns_from_clone(struct bio *bio);
int dm_register_target(struct target_type *t);
void dm_unregister_target(struct target_type *t);
@@ -557,13 +559,8 @@ void *dm_vcalloc(unsigned long nmemb, unsigned long elem_size);
#define DMINFO(fmt, ...) pr_info(DM_FMT(fmt), ##__VA_ARGS__)
#define DMINFO_LIMIT(fmt, ...) pr_info_ratelimited(DM_FMT(fmt), ##__VA_ARGS__)
#ifdef CONFIG_DM_DEBUG
#define DMDEBUG(fmt, ...) printk(KERN_DEBUG DM_FMT(fmt), ##__VA_ARGS__)
#define DMDEBUG(fmt, ...) pr_debug(DM_FMT(fmt), ##__VA_ARGS__)
#define DMDEBUG_LIMIT(fmt, ...) pr_debug_ratelimited(DM_FMT(fmt), ##__VA_ARGS__)
#else
#define DMDEBUG(fmt, ...) no_printk(fmt, ##__VA_ARGS__)
#define DMDEBUG_LIMIT(fmt, ...) no_printk(fmt, ##__VA_ARGS__)
#endif
#define DMEMIT(x...) sz += ((sz >= maxlen) ? \
0 : scnprintf(result + sz, maxlen - sz, x))
......
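The `DMDEBUG` hunk above drops the compile-time `CONFIG_DM_DEBUG` switch in favour of `pr_debug()`, which can be turned on at run time through the kernel's dynamic debug facility. The userspace sketch below imitates that idea with a plain boolean switch. `sketch_debug_enabled`, `SKETCH_FMT` and `SKETCH_DMDEBUG` are hypothetical and much simpler than the kernel macros; the format wrapper appends the newline itself, which is the likely reason the converted call site in `__dm_suspend()` no longer passes `"\n"`.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical runtime switch, standing in for a dynamic debug setting. */
static bool sketch_debug_enabled;

/* Simplified format wrapper: adds a prefix and the trailing newline. */
#define SKETCH_FMT(fmt) "device-mapper: " fmt "\n"

/*
 * Emit the message only when debugging is enabled at run time.
 * Evaluates to the number of characters written, or 0 when disabled.
 * Uses the GNU ##__VA_ARGS__ extension, as the kernel macros do.
 */
#define SKETCH_DMDEBUG(fmt, ...) \
	(sketch_debug_enabled ? \
	 fprintf(stderr, SKETCH_FMT(fmt), ##__VA_ARGS__) : 0)
```

Because the check happens at run time, the message can be enabled on a running system without rebuilding, which the old `#ifdef` arrangement did not allow.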
@@ -118,6 +118,11 @@ int dm_bufio_write_dirty_buffers(struct dm_bufio_client *c);
*/
int dm_bufio_issue_flush(struct dm_bufio_client *c);
/*
* Send a discard request to the underlying device.
*/
int dm_bufio_issue_discard(struct dm_bufio_client *c, sector_t block, sector_t count);
/*
* Like dm_bufio_release but also move the buffer to the new
* block. dm_bufio_write_dirty_buffers is needed to commit the new block.
@@ -131,6 +136,13 @@ void dm_bufio_release_move(struct dm_buffer *b, sector_t new_block);
*/
void dm_bufio_forget(struct dm_bufio_client *c, sector_t block);
/*
* Free the given range of buffers.
* This is just a hint, if the buffer is in use or dirty, this function
* does nothing.
*/
void dm_bufio_forget_buffers(struct dm_bufio_client *c, sector_t block, sector_t n_blocks);
/*
* Set the minimum number of buffers before cleanup happens.
*/
......