Commit 883069e3 authored by Matthew Sakai, committed by Mike Snitzer

dm vdo: add the slab summary

The slab depot maintains an additional small data structure, the "slab
summary," which is used to reduce the amount of work needed to come back
online after a crash. The slab summary maintains an entry for each slab
indicating whether or not the slab has ever been used, whether it is clean
(i.e. all of its reference count updates have been persisted to storage),
and approximately how full it is. During recovery, each physical zone will
attempt to recover at least one slab, stopping whenever it has recovered a
slab which has some free blocks. Once each zone has some space (or has
determined that none is available), the target can resume normal operation
in a degraded mode. Read and write requests can be serviced, perhaps with
degraded performance, while the remainder of the dirty slabs are recovered.

Co-developed-by: J. corwin Coburn <corwin@hurlbutnet.net>
Signed-off-by: J. corwin Coburn <corwin@hurlbutnet.net>
Co-developed-by: Michael Sclafani <dm-devel@lists.linux.dev>
Signed-off-by: Michael Sclafani <dm-devel@lists.linux.dev>
Co-developed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>
Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>
Signed-off-by: Matthew Sakai <msakai@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
parent 09eff388
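
The recovery behaviour described in the commit message can be illustrated with a minimal sketch. The `slab_hint` type, its fields, and `recover_until_space()` are hypothetical names invented for this illustration; they are not the structures or functions added by the commit, and journal replay is reduced here to flipping a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified per-slab hint; NOT the on-disk summary entry. */
struct slab_hint {
	bool ever_used;           /* has the slab ever been used? */
	bool is_clean;            /* are all reference count updates persisted? */
	unsigned int free_blocks; /* assumed: free blocks found once recovered */
};

/*
 * Walk one zone's slabs, recovering dirty ones until a recovered slab turns
 * out to have free blocks. Returns true if such a slab was found, false if
 * every dirty slab was recovered without finding free space. The number of
 * slabs recovered is stored in *recovered_count.
 */
bool recover_until_space(struct slab_hint *slabs, size_t count,
			 size_t *recovered_count)
{
	size_t i;

	assert(recovered_count != NULL);
	assert(slabs != NULL || count == 0);

	*recovered_count = 0;
	for (i = 0; i < count; i++) {
		struct slab_hint *slab = &slabs[i];

		/* Never-used and clean slabs need no recovery in this sketch. */
		if (!slab->ever_used || slab->is_clean)
			continue;

		/* Stand-in for replaying the slab journal. */
		slab->is_clean = true;
		(*recovered_count)++;

		/* Stop early: this zone now has space to allocate from. */
		if (slab->free_blocks > 0)
			return true;
	}

	return false;
}
```

In this model, any slabs still dirty when the function returns true are the ones that would be recovered in the background while the target runs in degraded mode.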
@@ -24,6 +24,23 @@
#include "vio.h"
#include "wait-queue.h"
/*
 * A slab_depot is responsible for managing all of the slabs and block allocators of a VDO. It has
 * a single array of slabs in order to eliminate the need for additional math in order to compute
 * which physical zone a PBN is in. It also has a block_allocator per zone.
 *
 * Load operations are required to be performed on a single thread. Normal operations are assumed
 * to be performed in the appropriate zone. Allocations and reference count updates must be done
 * from the thread of their physical zone. Requests to commit slab journal tail blocks from the
 * recovery journal must be done on the journal zone thread. Save operations are required to be
 * launched from the same thread as the original load operation.
 */

enum {
	/* The number of vios in the vio pool is proportional to the throughput of the VDO. */
	BLOCK_ALLOCATOR_VIO_POOL_SIZE = 128,
};

/*
 * Represents the possible status of a block.
 */
@@ -240,6 +257,63 @@ struct vdo_slab {
	struct reference_block *reference_blocks;
};
/*
 * The slab_summary provides hints during load and recovery about the state of the slabs in order
 * to avoid the need to read the slab journals in their entirety before a VDO can come online.
 *
 * The information in the summary for each slab includes the rough number of free blocks (which is
 * used to prioritize scrubbing), the cleanliness of a slab (so that clean slabs containing free
 * space will be used on restart), and the location of the tail block of the slab's journal.
 *
 * The slab_summary has its own partition at the end of the volume which is sized to allow for a
 * complete copy of the summary for each of up to 16 physical zones.
 *
 * During resize, the slab_summary moves its backing partition and is saved once moved; the
 * slab_summary is not permitted to overwrite the previous recovery journal space.
 *
 * The slab_summary does not have its own version information, but relies on the VDO volume version
 * number.
 */

/*
 * A slab status is a very small structure for use in determining the ordering of slabs in the
 * scrubbing process.
 */
struct slab_status {
	slab_count_t slab_number;
	bool is_clean;
	u8 emptiness;
};
struct slab_summary_block {
	/* The block_allocator to which this block belongs */
	struct block_allocator *allocator;
	/* The index of this block in its zone's summary */
	block_count_t index;
	/* Whether this block has a write outstanding */
	bool writing;
	/* Ring of updates waiting on the outstanding write */
	struct wait_queue current_update_waiters;
	/* Ring of updates waiting on the next write */
	struct wait_queue next_update_waiters;
	/* The active slab_summary_entry array for this block */
	struct slab_summary_entry *entries;
	/* The vio used to write this block */
	struct vio vio;
	/* The packed entries, one block long, backing the vio */
	char *outgoing_entries;
};
/*
 * The statistics for all the slab summary zones owned by this slab summary. These fields are all
 * mutated only by their physical zone threads, but are read by other threads when gathering
 * statistics for the entire depot.
 */
struct atomic_slab_summary_statistics {
	/* Number of blocks written */
	atomic64_t blocks_written;
};

bool __must_check vdo_attempt_replay_into_slab(struct vdo_slab *slab,
					       physical_block_number_t pbn,
					       enum journal_operation operation,
......
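
The comments in this hunk say that `struct slab_status` determines the order in which slabs are scrubbed, using cleanliness and a rough measure of free space. The sketch below shows one plausible ordering under stated assumptions: clean slabs first, then emptier slabs, with the slab number as a tie-breaker. The comparison the commit actually uses is not visible in this excerpt, so `slab_status_sketch`, `compare_slab_status()`, and `sort_slab_statuses()` are hypothetical userspace stand-ins, with `uint32_t` and `uint8_t` assumed in place of the kernel's `slab_count_t` and `u8`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace mirror of struct slab_status; the field widths are assumptions. */
struct slab_status_sketch {
	uint32_t slab_number;
	bool is_clean;
	uint8_t emptiness; /* larger value = more free space (assumed) */
};

/* qsort() comparator: clean first, then emptier first, then lower number. */
int compare_slab_status(const void *a, const void *b)
{
	const struct slab_status_sketch *x = a;
	const struct slab_status_sketch *y = b;

	if (x->is_clean != y->is_clean)
		return x->is_clean ? -1 : 1;
	if (x->emptiness != y->emptiness)
		return (x->emptiness > y->emptiness) ? -1 : 1;
	if (x->slab_number != y->slab_number)
		return (x->slab_number < y->slab_number) ? -1 : 1;
	return 0;
}

/* Sort an array of statuses in place using the assumed ordering above. */
void sort_slab_statuses(struct slab_status_sketch *statuses, size_t count)
{
	assert(statuses != NULL || count == 0);

	if (count > 1)
		qsort(statuses, count, sizeof(*statuses), compare_slab_status);
}
```

Including the slab number as the final key makes the comparator a total order, so the result does not depend on how a particular `qsort()` implementation treats equal elements.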