- 16 Dec, 2013 10 commits
-
-
Kent Overstreet authored
The old writeback PD controller could get into states where it had throttled all the way down and take way too long to recover - it was too complicated to really understand what it was doing. This rewrites a good chunk of it to hopefully be simpler and make more sense, and it also pays more attention to units which should make the behaviour a bit easier to understand. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
Kent Overstreet authored
There is a possibility for a bucket to be invalidated by the allocator while moving_gc was copying it's contents to another bucket, if the bucket only held cached data. To prevent this moving checks for a stale ptr (to an invalidated bucket), before and after reads. It it finds one, it simply ignores moving that data. This only affects bcache if the moving_gc was turned on, note that it's off by default. Signed-off-by: Nicholas Swenson <nks@daterainc.com> Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
Nicholas Swenson authored
Garbage collector needs to check keys in the writeback keybuf to make sure it's not invalidating buckets to which the writeback keys point to. Signed-off-by: Nicholas Swenson <nks@daterainc.com> Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
Nicholas Swenson authored
Removed gc_move_threshold because picking buckets only by threshold could lead moving extra buckets (ei. if there are buckets at the threshold that aren't supposed to be moved do to space considerations). This is replaced by a GC_MOVE bit in the gc_mark bitmask. Now only marked buckets get moved. Signed-off-by: Nicholas Swenson <nks@daterainc.com> Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
Nicholas Swenson authored
Signed-off-by: Nicholas Swenson <nks@daterainc.com> Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
Nicholas Swenson authored
Signed-off-by: Nicholas Swenson <nks@daterainc.com> Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
Nicholas Swenson authored
Signed-off-by: Nicholas Swenson <nks@daterainc.com> Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
Kent Overstreet authored
Dirty data accounting wasn't quite right - firstly, we were adding the key we're inserting after it could have merged with another dirty key already in the btree, and secondly we could sometimes pass the wrong offset to bcache_dev_sectors_dirty_add() for dirty data we were overwriting - which is important when tracking dirty data by stripe. NOTE FOR BACKPORTERS: For 3.10 (and 3.11?) there's other accounting fixes necessary that got squashed in with other patches; the full patch against 3.10 is 408cc2f47eeac93a, available at: git://evilpiepirate.org/~kent/linux-bcache.git bcache-3.10-writeback-fixes Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: linux-stable <stable@vger.kernel.org> # >= v3.10 diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c index 2a46036..4a12b2f 100644 --- a/drivers/md/bcache/btree.c +++ b/drivers/md/bcache/btree.c @@ -1817,7 +1817,8 @@ static bool fix_overlapping_extents(struct btree *b, struct bkey *insert, if (KEY_START(k) > KEY_START(insert) + sectors_found) goto check_failed; - if (KEY_PTRS(replace_key) != KEY_PTRS(k)) + if (KEY_PTRS(k) != KEY_PTRS(replace_key) || + KEY_DIRTY(k) != KEY_DIRTY(replace_key)) goto check_failed; /* skip past gen */
-
Kent Overstreet authored
We're just waiting on kthread_should_stop(), nothing else, so interruptible sleep was wrong here. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
Stefan Priebe authored
at the beginning (schedule_timout_interuptible) and others do his on their own This prevents wrong load average calculation (load of 1 per thread) Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
- 29 Nov, 2013 1 commit
-
-
Wei Yongjun authored
Fixes the following sparse warning: drivers/md/bcache/btree.c:2220:5: warning: symbol 'btree_insert_fn' was not declared. Should it be static? Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
- 11 Nov, 2013 29 commits
-
-
Kees Cook authored
Just to be safe, call the error reporting function with "%s" to avoid any possible future format string leak. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
Kent Overstreet authored
More testing ftw! Also, now verify mode doesn't break if you read dirty data. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
Kent Overstreet authored
Never saw a profile of bset_search_tree() where it wasn't bottlenecked on memory until I got my new Haswell machine, but when I tried it there it was suddenly burning 20% of the cpu in the inner loop on shrd... Turns out, the version of shrd that takes 64 bit operands has a 9 cycle latency. hah. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
Kent Overstreet authored
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
Kent Overstreet authored
Whoops. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
Kent Overstreet authored
The old scanning-by-stripe code burned too much CPU, this should be better. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
Kent Overstreet authored
The flow control in btree_insert_node() was... fragile... before, this'll use more stack (but since our btrees are never more than depth 1, that shouldn't matter) and it should be significantly clearer and less fragile. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
Kent Overstreet authored
Minor cleanup. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
Kent Overstreet authored
It never really made sense to expose this, so just kill it. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
Kent Overstreet authored
This dates from before the btree iterator, and now it's finally gone Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
Kent Overstreet authored
Not a complete fix - we could still deadlock if btree_insert_node() has to split... Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
Kent Overstreet authored
Big garbage collection rewrite; now, garbage collection uses the same mechanisms as used elsewhere for inserting/updating btree node pointers, instead of rewriting interior btree nodes in place. This makes the code significantly cleaner and less fragile, and means we can now make garbage collection incremental - it doesn't have to hold a write lock on the root of the btree for the entire duration of garbage collection. This means that there's less of a latency hit for doing garbage collection, which means we can gc more frequently (and do a better job of reclaiming from the cache), and we can coalesce across more btree nodes (improving our space efficiency). Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
Kent Overstreet authored
Refactoring, prep work for incremental garbage collection. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
Kent Overstreet authored
More refactoring - mostly making the interfaces more explicit about what we actually want to do. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
Kent Overstreet authored
btree_insert_key() was open coding this, this is just refactoring. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
Kent Overstreet authored
Trying to treat btree pointers and leaf node pointers the same way was a mistake - going to start being more explicit about the type of key/pointer we're dealing with. This is the first part of that refactoring; this patch shouldn't change any actual behaviour. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
Kent Overstreet authored
The bucket refcount (dropped with bkey_put()) is only needed to prevent the newly allocated bucket from being garbage collected until we've added a pointer to it somewhere. But for btree node allocations, the fact that we have btree nodes locked is enough to guard against races with garbage collection. Eventually the per bucket refcount is going to be replaced with something specific to bch_alloc_sectors(). Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
Kent Overstreet authored
Couple changes: * Consolidate bch_check_keys() and bch_check_key_order(), and move the checks that only check_key_order() could do to bch_btree_iter_next(). * Get rid of CONFIG_BCACHE_EDEBUG - now, all that code is compiled in when CONFIG_BCACHE_DEBUG is enabled, and there's now a sysfs file to flip on the EDEBUG checks at runtime. * Dropped an old not terribly useful check in rw_unlock(), and refactored/improved a some of the other debug code. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
Kent Overstreet authored
Previously, bch_ptr_bad() could return false when there was a pointer to a nonexistant device... it only filtered out keys with PTR_CHECK_DEV pointers. This behaviour was intended for multiple cache device support; for that, just because the device for one of the pointers has gone away doesn't mean we want to filter out the rest of the pointers. But we don't yet explicitly filter/check individual pointers, so without that this behaviour was wrong - a corrupt bkey with a bad device pointer could cause us to deref a bad pointer. Doh. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
Kent Overstreet authored
Now, the on disk data structures are in a header that can be exported to userspace - and having them all centralized is nice too. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
Kent Overstreet authored
Just reorganizing things a bit. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
Kent Overstreet authored
With all the recent refactoring around struct btree op struct search has gotten rather large. But we can now easily break it up in a different way - we break out struct btree_insert_op which is for inserting data into the cache, and that's now what the copying gc code uses - struct search is now specific to request.c Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
Kent Overstreet authored
Last of the btree_map() conversions. Main visible effect is bch_btree_insert() is no longer taking a struct btree_op as an argument anymore - there's no fancy state machine stuff going on, it's just a normal function. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
Kent Overstreet authored
When we convert bch_btree_insert() to bch_btree_map_leaf_nodes(), we won't be passing struct btree_op to bch_btree_insert() anymore - so we need a different way of returning whether there was a collision (really, a replace collision). Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
Kent Overstreet authored
This is prep work for converting bch_btree_insert to bch_btree_map_leaf_nodes() - we have to convert all its arguments to actual arguments. Bunch of churn, but should be straightforward. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
Kent Overstreet authored
With a the recent bcache refactoring, some of the closure code isn't needed anymore. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
Kent Overstreet authored
This isn't used for waiting asynchronously anymore - so this is a fairly trivial refactoring. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
Kent Overstreet authored
Eventual goal is for struct btree_op to contain only what is necessary for traversing the btree. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-
Kent Overstreet authored
There was some looping in submit_partial_cache_hit() and submit_partial_cache_hit() that isn't needed anymore - originally, we wouldn't necessarily process the full hit or miss all at once because when splitting the bio, we took into account the restrictions of the device we were sending it to. But, device bio size restrictions are now handled elsewhere, with a wrapper around generic_make_request() - so that looping has been unnecessary for awhile now and we can now do quite a bit of cleanup. And if we trim the key we're reading from to match the subset we're actually reading, we don't have to explicitly calculate bi_sector anymore. Neat. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
-