Commits · 947174476701fbc84ea8c7ec9664270f9d80b076 · Kirill Smelkov / linux

29 Jan, 2014 1 commit

bcache: fix BUG_ON due to integer overflow with GC_SECTORS_USED · 94717447

Darrick J. Wong authored Jan 28, 2014

The BUG_ON at the end of __bch_btree_mark_key can be triggered due to
an integer overflow error:

BITMASK(GC_SECTORS_USED, struct bucket, gc_mark, 2, 13);
...
SET_GC_SECTORS_USED(g, min_t(unsigned,
	     GC_SECTORS_USED(g) + KEY_SIZE(k),
	     (1 << 14) - 1));
BUG_ON(!GC_SECTORS_USED(g));

In bcache.h, the SECTORS_USED bitfield is defined to be 13 bits wide.
While the SET_ code tries to ensure that the field doesn't overflow by
clamping it to (1<<14)-1 == 16383, this is incorrect because 16383
requires 14 bits.  Therefore, if GC_SECTORS_USED() + KEY_SIZE() =
8192, the SET_ statement tries to store 8192 into a 13-bit field.  In
a 13-bit field, 8192 becomes zero, thus triggering the BUG_ON.

Therefore, create a field width constant and a max value constant, and
use those to create the bitfield and check the inputs to
SET_GC_SECTORS_USED.  Arguably the BITMASK() template ought to have
BUG_ON checks for too-large values, but that's a separate patch.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

94717447
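
The arithmetic in this message can be checked in userspace. The sketch below does not use the kernel's BITMASK() macro: `FIELD_BITS`, `FIELD_MAX`, `store_field()`, `add_buggy()` and `add_clamped()` are hypothetical names that only imitate storing a value into a 13-bit field, so treat it as an illustration of the overflow rather than the patch itself.

```c
/* Hypothetical stand-ins for illustration; not the bcache definitions. */
#define FIELD_BITS 13u
#define FIELD_MAX  ((1u << FIELD_BITS) - 1)	/* 8191 */

/* Storing into an N-bit field keeps only the low N bits. */
static unsigned store_field(unsigned v)
{
	return v & FIELD_MAX;
}

/* Clamp as in the quoted code: (1 << 14) - 1 needs 14 bits, one too many. */
static unsigned add_buggy(unsigned used, unsigned size)
{
	unsigned sum = used + size;
	unsigned limit = (1u << 14) - 1;

	return store_field(sum < limit ? sum : limit);
}

/* Clamp derived from the field width, so the stored value always fits. */
static unsigned add_clamped(unsigned used, unsigned size)
{
	unsigned sum = used + size;

	return store_field(sum < FIELD_MAX ? sum : FIELD_MAX);
}
```

With `used = 8000` and `size = 192` the sum is 8192: `add_buggy()` stores 0, which is the condition the BUG_ON trips on, while `add_clamped()` stores 8191.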

08 Jan, 2014 35 commits

bcache: Fix auxiliary search trees for key size > cacheline size · 9dd6358a

Kent Overstreet authored Dec 17, 2013

Signed-off-by: Kent Overstreet <kmo@daterainc.com>

9dd6358a

bcache: Don't return -EINTR when insert finished · 3b3e9e50

Kent Overstreet authored Dec 07, 2013

We need to return -EINTR after a split because we invalidated iterators
(and freed the btree node) - but if we were finished inserting, we don't
want to redo the traversal.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

3b3e9e50

bcache: Improve bucket_prio() calculation · e0a985a4

Kent Overstreet authored Nov 12, 2013

When deciding what order to reuse buckets we take into account both the bucket's
priority (which indicates lru order) and also the amount of live data in that
bucket. The way they were scaled together wasn't as correct as it could be...
this patch improves and documents it.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

e0a985a4
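
As a rough illustration of combining LRU priority with live data, the sketch below multiplies an offset priority by the number of live sectors. The struct, the `floor` parameter and the weighting are assumptions made for this example; the message does not spell out the formula the patch uses.

```c
/* Hypothetical bucket summary; field names are illustrative only. */
struct bucket_info {
	unsigned prio;		/* higher means more recently used */
	unsigned sectors_used;	/* amount of live data in the bucket */
};

/*
 * Assumed weighting: shift the priority so that it never reaches zero,
 * then scale it by the live data. In this sketch a lower score means
 * the bucket is a better candidate for reuse.
 * Precondition: b->prio >= min_prio.
 */
static unsigned long reuse_score(const struct bucket_info *b,
				 unsigned min_prio, unsigned floor)
{
	return (unsigned long)(b->prio - min_prio + floor) * b->sectors_used;
}
```

Under this weighting a recently used bucket holding little live data can still score lower than an old bucket that is nearly full, which is the kind of trade-off the message describes.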

bcache: Add bch_bkey_equal_header() · 3bdad1e4

Nicholas Swenson authored Nov 11, 2013

Checks if two keys have equivalent header fields.
(good enough for replacement or merging)

Used in bch_bkey_try_merge, and replacing a key
in the btree.
Signed-off-by: Nicholas Swenson <nks@daterainc.com>
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

3bdad1e4
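
A header comparison of this kind might look like the sketch below. The struct and its three fields are assumptions chosen for illustration (real bkeys pack their header into bitfields, and the message does not list which fields are compared).

```c
#include <stdbool.h>

/* Hypothetical, simplified key header. */
struct key_header {
	unsigned dirty;	/* data not yet written back */
	unsigned ptrs;	/* number of pointers carried by the key */
	unsigned csum;	/* checksum type */
};

/* True when two keys agree on every header field that matters for merging. */
static bool header_equal(const struct key_header *l,
			 const struct key_header *r)
{
	return l->dirty == r->dirty &&
	       l->ptrs == r->ptrs &&
	       l->csum == r->csum;
}
```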

bcache: update bch_bkey_try_merge · 0f49cf3d

Nicholas Swenson authored Oct 14, 2013

Added generic header checks to bch_bkey_try_merge,
which then calls the bkey specific function

Removed extraneous checks from bch_extent_merge
Signed-off-by: Nicholas Swenson <nks@daterainc.com>

0f49cf3d
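
The split between a generic check and a key-specific merge can be sketched with a small ops table. All names here (`struct key`, `struct key_ops`, `extent_merge()`, `try_merge()`) are hypothetical and only show the dispatch shape, not the bcache code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical key: a header word plus a half-open range [start, end). */
struct key {
	unsigned flags;
	unsigned long start, end;
};

struct key_ops {
	/* Type-specific merge; NULL if this key type cannot be merged. */
	bool (*key_merge)(struct key *l, const struct key *r);
};

/* Extent-specific part: only adjacent ranges can be joined. */
static bool extent_merge(struct key *l, const struct key *r)
{
	if (l->end != r->start)
		return false;
	l->end = r->end;
	return true;
}

/* Generic part: run the header check once, then defer to the key type. */
static bool try_merge(const struct key_ops *ops, struct key *l,
		      const struct key *r)
{
	if (!ops->key_merge)
		return false;
	if (l->flags != r->flags)
		return false;
	return ops->key_merge(l, r);
}
```

Keeping the header check in the generic wrapper means each key type's merge function only has to handle what is specific to it.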

bcache: Move insert_fixup() to btree_keys_ops · 829a60b9

Kent Overstreet authored Nov 11, 2013

Now handling overlapping extents/keys is a method that's specific to what the
btree node contains.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

829a60b9

bcache: Convert sorting to btree_keys · 89ebb4a2

Kent Overstreet authored Nov 11, 2013

More work to disentangle various code from struct btree
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

89ebb4a2

bcache: Convert debug code to btree_keys · dc9d98d6

Kent Overstreet authored Dec 17, 2013

More work to disentangle various code from struct btree
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

dc9d98d6

bcache: Convert btree_iter to struct btree_keys · c052dd9a

Kent Overstreet authored Nov 11, 2013

More work to disentangle bset.c from struct btree
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

c052dd9a

bcache: Refactor bset_tree sysfs stats · f67342dd

Kent Overstreet authored Nov 11, 2013

We're in the process of turning bset.c into library code, so none of the code in
that file should know about struct cache_set or struct btree - so, move the
btree traversal part of the stats code to sysfs.c.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

f67342dd

bcache: Add bch_btree_keys_u64s_remaining() · 59158fde

Kent Overstreet authored Nov 11, 2013

Helper function to explicitly check how much space is free in a btree node
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

59158fde
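
A helper of this kind could be as small as the sketch below. `struct node_space` and both functions are hypothetical; the real helper works on struct btree_keys, whose layout is not shown in this log.

```c
#include <stddef.h>

/* Hypothetical node bookkeeping, measured in 64-bit words (u64s). */
struct node_space {
	size_t capacity_u64s;	/* total room for keys */
	size_t used_u64s;	/* words already occupied */
};

/* Free space left in the node, never negative. */
static size_t u64s_remaining(const struct node_space *n)
{
	return n->used_u64s >= n->capacity_u64s
		? 0 : n->capacity_u64s - n->used_u64s;
}

/* Explicit check a caller can make before inserting a key. */
static int key_fits(const struct node_space *n, size_t key_u64s)
{
	return key_u64s <= u64s_remaining(n);
}
```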

bcache: Add struct btree_keys · a85e968e

Kent Overstreet authored Dec 20, 2013

Soon, bset.c won't need to depend on struct btree.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

a85e968e

bcache: Abstract out stuff needed for sorting · 65d45231

Kent Overstreet authored Dec 20, 2013

Signed-off-by: Kent Overstreet <kmo@daterainc.com>

65d45231

bcache: Rename/shuffle various code around · ee811287

Kent Overstreet authored Dec 17, 2013

More work to disentangle bset.c from the rest of the code.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

ee811287

bcache: Add struct bset_sort_state · 67539e85

Kent Overstreet authored Sep 10, 2013

More disentangling bset.c from the rest of the bcache code - soon, the
sorting routines won't have any dependencies on any outside structs.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

67539e85

bcache: Split out sort_extent_cmp() · 911c9610

Kent Overstreet authored Jul 28, 2013

Only use extent comparison for comparing extents, so we're not using
START_KEY() on other key types (i.e. btree pointers)
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

911c9610
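
For context, a bcache extent key records the end of the extent as its offset, so the start is offset minus size. A comparison by start position might look like the sketch below; `struct extent` and both functions are simplified, hypothetical stand-ins rather than the kernel's sort_extent_cmp().

```c
/* Hypothetical extent: offset is the END of the extent, size its length. */
struct extent {
	unsigned long long inode;
	unsigned long long offset;
	unsigned long long size;
};

static unsigned long long extent_start(const struct extent *e)
{
	return e->offset - e->size;
}

/* Order by inode, then by start position; returns -1, 0 or 1. */
static int extent_start_cmp(const struct extent *l, const struct extent *r)
{
	unsigned long long ls, rs;

	if (l->inode != r->inode)
		return l->inode < r->inode ? -1 : 1;

	ls = extent_start(l);
	rs = extent_start(r);
	return (ls > rs) - (ls < rs);
}
```

A comparison like this only makes sense for keys that describe a range, which is why it should not be applied to btree pointers.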

bcache: Bkey indexing renaming · fafff81c

Kent Overstreet authored Dec 17, 2013

More refactoring:

node() -> bset_bkey_idx()
end() -> bset_bkey_last()
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

fafff81c
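
The new names describe what the helpers return: a key at a given word index, and the position just past the last key. A minimal sketch of that idea follows, using a hypothetical `struct keyset` rather than the kernel's struct bset.

```c
#include <stdint.h>

/* Hypothetical flat key set: variable-length keys stored as u64 words. */
struct keyset {
	unsigned nr_u64s;	/* words currently in use */
	uint64_t d[64];
};

/* Pointer to the word at offset idx (compare bset_bkey_idx()). */
static uint64_t *keyset_idx(struct keyset *s, unsigned idx)
{
	return s->d + idx;
}

/* One past the last used word (compare bset_bkey_last()). */
static uint64_t *keyset_last(struct keyset *s)
{
	return keyset_idx(s, s->nr_u64s);
}
```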

bcache: Make bch_keylist_realloc() take u64s, not nptrs · 085d2a3d

Kent Overstreet authored Nov 11, 2013

Getting away from KEY_PTRS and moving toward KEY_U64s - and getting rid of magic
2s

Also - split out the part that checks against journal entry size so as to avoid
a dependency on struct cache_set in bset.c
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

085d2a3d

bcache: Remove/fix some header dependencies · 9a02b7ee

Kent Overstreet authored Dec 20, 2013

In the process of disentangling/libraryizing bset.c from the rest of the
bcache code.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

9a02b7ee

bcache: Use a mempool for mergesort temporary space · 0a451145

Kent Overstreet authored Dec 18, 2013

It was a single element mempool before, it's slightly cleaner to just use a real
mempool.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

0a451145

bcache: Btree verify code improvements · 78b77bf8

Kent Overstreet authored Dec 17, 2013

Used this fixed code to find and fix the bug fixed by
a4d885097b0ac0cd1337f171f2d4b83e946094d4.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

78b77bf8

bcache: kill index() · 88b9f8c4

Kent Overstreet authored Dec 17, 2013

That was a terrible name for a macro, add some better helpers to replace it.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

88b9f8c4

bcache: Trivial error handling fix · 5c41c8a7

Kent Overstreet authored Jul 08, 2013

Signed-off-by: Kent Overstreet <kmo@daterainc.com>

5c41c8a7

bcache/md: Use raid stripe size · c78afc62

Kent Overstreet authored Jul 11, 2013

Now that we've got code for raid5/6 stripe awareness, bcache just needs
to know about the stripes and when writing partial stripes is expensive
- we probably don't want to enable this optimization for raid1 or 10,
even though they have stripes. So add a flag to queue_limits.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

c78afc62

bcache: Do bkey_put() in btree_split() error path · 5f5837d2

Kent Overstreet authored Dec 16, 2013

This error path shouldn't have been hit in practice... and we've got reworked
reserve code coming soon so that it shouldn't _ever_ be hit... but if we've got
code for this error path it should be correct.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

5f5837d2

bcache: Rework allocator reserves · 78365411

Kent Overstreet authored Dec 17, 2013

We need a reserve for allocating buckets for new btree nodes - and now that
we've got multiple btrees, it really needs to be per btree.

This reworks the reserves so we've got separate freelists for each reserve
instead of watermarks, which seems to make things a bit cleaner, and it adds
some code so that btree_split() can make sure the reserve is available before it
starts.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

78365411
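
The difference from watermarks is that each reserve owns its own list of free buckets. The sketch below is a userspace illustration under assumed names (`enum reserve`, `struct freelist`, `alloc_bucket()`, `reserve_available()`); the order in which lists are tried is also an assumption.

```c
#include <stddef.h>

/* Hypothetical reserve classes; RES_NONE is the general-purpose list. */
enum reserve { RES_BTREE, RES_PRIO, RES_NONE, RES_NR };

#define FREELIST_MAX 8

struct freelist {
	long buckets[FREELIST_MAX];
	size_t nr;
};

struct allocator {
	struct freelist free[RES_NR];	/* one freelist per reserve */
};

/* Take a bucket from the general list first, then from the caller's own. */
static long alloc_bucket(struct allocator *a, enum reserve r)
{
	if (a->free[RES_NONE].nr)
		return a->free[RES_NONE].buckets[--a->free[RES_NONE].nr];
	if (a->free[r].nr)
		return a->free[r].buckets[--a->free[r].nr];
	return -1;	/* nothing available for this caller */
}

/* Lets a caller such as a node split check its reserve before starting. */
static int reserve_available(const struct allocator *a, enum reserve r,
			     size_t n)
{
	return a->free[r].nr >= n;
}
```

Because a caller can only dip into its own reserve, buckets set aside for new btree nodes cannot be consumed by another class of allocation.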

bcache: kill closure locking code · 1dd13c8d

Kent Overstreet authored Dec 20, 2013

Also flesh out the documentation a bit
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

1dd13c8d

bcache: kill closure locking usage · cb7a583e

Kent Overstreet authored Dec 16, 2013

Signed-off-by: Kent Overstreet <kmo@daterainc.com>

cb7a583e

bcache: Zero less memory · a5ae4300

Kent Overstreet authored Sep 10, 2013

Another minor performance optimization
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

a5ae4300

bcache: Don't touch bucket gen for dirty ptrs · d56d000a

Kent Overstreet authored Aug 09, 2013

Unnecessary since a bucket that has dirty pointers pointing to it can
never be invalidated - and skipping it is a measurable performance
boost, since the bucket gen will usually be a cache miss.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

d56d000a

bcache: Minor btree cache fix · b0f32a56

Kent Overstreet authored Dec 10, 2013

Signed-off-by: Kent Overstreet <kmo@daterainc.com>

b0f32a56

bcache: Performance fix for when journal entry is full · 5775e213

Kent Overstreet authored Dec 10, 2013

We were unnecessarily waiting on a journal write to complete when we just needed
to start a journal write and start setting up the next one.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

5775e213

bcache: Minor journal fix · b3fa7e77

Kent Overstreet authored Aug 05, 2013

The real fix is where we check the bytes we need against how much is
remaining - we also need to check for a journal entry bigger than our
buffer, we'll never write those and it would be bad if we tried to read
one.

Also improve the diagnostic messages.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

b3fa7e77
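
The two checks mentioned here can be pictured as below. This is a hypothetical sketch (`enum jset_check` and `check_entry()` are invented for the example), not the journal code itself.

```c
#include <stddef.h>

enum jset_check { JSET_OK, JSET_TOO_BIG, JSET_PAST_END };

/*
 * entry_bytes: size claimed by the journal entry
 * bytes_left:  space remaining in the region being read
 * buf_bytes:   size of the buffer entries are written from
 */
static enum jset_check check_entry(size_t entry_bytes, size_t bytes_left,
				   size_t buf_bytes)
{
	if (entry_bytes > buf_bytes)
		return JSET_TOO_BIG;	/* such an entry is never written */
	if (entry_bytes > bytes_left)
		return JSET_PAST_END;	/* would run past what remains */
	return JSET_OK;
}
```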

bcache: Data corruption fix · ef71ec00

Kent Overstreet authored Dec 17, 2013

The code that handles overlapping extents that we've just read back in from disk
was depending on the behaviour of the code that handles overlapping extents as
we're inserting into a btree node in the case of an insert that forced an
existing extent to be split: on insert, if we had to split we'd also insert a
new extent to represent the top part of the old extent - and then that new
extent would get written out.

The code that read the extents back in thus did not bother with splitting extents -
if it saw an extent that overlapped in the middle of an older extent, it would
trim the old extent to only represent the bottom part, assuming that the
original insert would've inserted a new extent to represent the top part.

I still haven't figured out _how_ it can happen, but I'm now pretty convinced
(and testing has confirmed) that there's some kind of an obscure corner case
(probably involving extent merging, and multiple overwrites in different sets)
that breaks this. The fix is to change the mergesort fixup code to split extents
itself when required.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10

ef71ec00
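
The case that matters is a newer extent landing in the middle of an older one: the older extent has to survive as two pieces, a bottom and a top. The sketch below shows that split on plain half-open ranges; `struct ext` and `cut_overlap()` are hypothetical and much simpler than the mergesort fixup code.

```c
/* Hypothetical extent as a half-open range [start, end). */
struct ext {
	unsigned long start, end;
};

/*
 * Remove the range covered by `newer` from `older`.
 * Writes the surviving pieces of `older` to out[] and returns how many
 * there are: 0 (fully overwritten), 1 (trimmed or untouched) or
 * 2 (split into a bottom and a top part).
 */
static int cut_overlap(struct ext older, struct ext newer, struct ext out[2])
{
	int n = 0;

	if (newer.end <= older.start || newer.start >= older.end) {
		out[n++] = older;	/* no overlap at all */
		return n;
	}
	if (older.start < newer.start)	/* bottom part survives */
		out[n++] = (struct ext){ older.start, newer.start };
	if (newer.end < older.end)	/* top part survives */
		out[n++] = (struct ext){ newer.end, older.end };
	return n;
}
```

Trimming the old extent to its bottom part only, as the read path used to do, corresponds to dropping the second piece whenever this function would return 2.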

Merge branch 'for-3.14/core' into for-3.14/drivers · 54a387cb

Jens Axboe authored Jan 08, 2014

We need the updated code to make bcache easier to merge.

54a387cb

03 Jan, 2014 2 commits

blk-mq: fix initializing request's start time · 0fec08b4

Ming Lei authored Jan 03, 2014

blk_rq_init() is called in the request's completion handler to initialize
the request, so the start_time and start_time_ns members may be
inaccurate by the time the request is allocated again.

The patch initializes the two members in blk_mq_rq_ctx_init() to
fix the problem.
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

0fec08b4

pktcdvd: fix error return code · 8586ea96

Julia Lawall authored Dec 29, 2013

Set the return variable to an error code as done elsewhere in the function.

A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
(
if@p1 (\(ret < 0\|ret != 0\))
 { ... return ret; }
|
ret@p1 = 0
)
... when != ret = e1
    when != &ret
*if(...)
{
  ... when != ret = e2
      when forall
 return ret;
}

// </smpl>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

8586ea96
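
The pattern the semantic match looks for is a failure branch that returns `ret` while it still holds 0 from an earlier successful step. The sketch below is a userspace reduction with hypothetical function names; the allocation is simulated with a static buffer and it is not the pktcdvd code.

```c
#include <errno.h>
#include <stddef.h>

static char storage[16];	/* stands in for a real allocation */

/* Buggy shape: the failure path returns ret, which is still 0. */
static int setup_buggy(int fail_alloc)
{
	int ret = 0;		/* an earlier step succeeded */
	void *buf = fail_alloc ? NULL : storage;

	if (!buf)
		goto out;	/* BUG: caller sees success */
	/* ... use buf ... */
out:
	return ret;
}

/* Fixed shape: set an error code before taking the failure path. */
static int setup_fixed(int fail_alloc)
{
	int ret = 0;
	void *buf = fail_alloc ? NULL : storage;

	if (!buf) {
		ret = -ENOMEM;
		goto out;
	}
	/* ... use buf ... */
out:
	return ret;
}
```

When the simulated allocation fails, `setup_buggy()` returns 0 and `setup_fixed()` returns `-ENOMEM`.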

31 Dec, 2013 2 commits

block: blk-mq: don't export blk_mq_free_queue() · 3edcc0ce

Ming Lei authored Dec 26, 2013

blk_mq_free_queue() is called from the release handler of the
queue kobject, so drivers do not need to call it.

Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

3edcc0ce

block: blk-mq: make blk_sync_queue support mq · f04c1fe7

Ming Lei authored Dec 26, 2013

This patch moves synchronization on mq->delay_work
from blk_mq_free_queue() to blk_sync_queue(), so that
blk_sync_queue can work on mq.

Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

f04c1fe7