Commits · 1ddd5049545e0aa1a0ed19bca4d9c9c3ce1ac8a2 · nexedi / linux

23 Mar, 2011 1 commit

cciss: fix lost command issue · 1ddd5049

Bud Brown authored Mar 23, 2011

Under certain workloads a command may seem to get lost. IOW, the Smart Array
thinks all commands have been completed but we still have commands in our
completion queue. This may lead to system instability, filesystems going
read-only, or even panics depending on the affected filesystem. We add an
extra read to force the write to complete.

Testing shows this extra read avoids the problem.
Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
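
The "extra read" described above is the usual way to flush a posted MMIO write: a read from the same device cannot complete until earlier writes have reached it. The following is a minimal userspace sketch of that pattern, not the actual patch. The struct `fake_regs`, the helpers `reg_write32`/`reg_read32` (stand-ins for the kernel's `writel()`/`readl()`), and the choice of register to read back are all assumptions for illustration.

```c
#include <stdint.h>

/* Hypothetical register block standing in for the controller's MMIO window. */
struct fake_regs {
    volatile uint32_t request_port;
};

/* Userspace stand-ins for the kernel's writel()/readl() accessors. */
static void reg_write32(volatile uint32_t *reg, uint32_t val)
{
    *reg = val;
}

static uint32_t reg_read32(const volatile uint32_t *reg)
{
    return *reg;
}

/*
 * Submit a command, then read a register back and discard the value.
 * On real hardware the read forces the preceding posted write to complete.
 */
static void submit_command(struct fake_regs *regs, uint32_t busaddr)
{
    reg_write32(&regs->request_port, busaddr);
    (void)reg_read32(&regs->request_port);
}
```

In plain memory the read-back has no visible effect; the ordering guarantee only matters on a real bus, which is why this is a sketch of the shape of the fix rather than a reproduction of it.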

17 Mar, 2011 1 commit

drbd: need include for bitops functions declarations · f0ff1357

Stephen Rothwell authored Mar 17, 2011

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

12 Mar, 2011 6 commits

Revert "cciss: Add missing allocation in scsi_cmd_stack_setup and corresponding deallocation" · b6653801

Jens Axboe authored Mar 12, 2011

This reverts commit 978eb516.

The commit was broken, relying on other changes that have not been
committed yet.
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

cciss: fix missed command status value CMD_UNABORTABLE · 6d9a4f9e

Stephen M. Cameron authored Mar 12, 2011

and fix a nearby typo, "do" that should have been "due"
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

cciss: remove unnecessary casts · fcab1c11

Stephen M. Cameron authored Mar 12, 2011

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

cciss: Mask off error bits of c->busaddr in cmd_special_free when calling pci_free_consistent · 16011131

Stephen M. Cameron authored Mar 12, 2011

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

cciss: Inform controller we are using 32-bit tags. · 0498cc2a

Stephen M. Cameron authored Mar 12, 2011

Controller will DMA only 32-bits of the tag per command
on completion if it knows we are only using 32-bit tags.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

cciss: hoist tag masking out of loop · 4a765046

Stephen M. Cameron authored Mar 12, 2011

In process_nonindexed_cmd, hoist figuring of masked tag out of loop since
it is the same throughout.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
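
The hoisting described above can be sketched as follows: the masked tag does not depend on the loop variable, so it is computed once before the search. This is an illustration only, not the driver's code. The names `struct cmd`, `find_cmd`, and the mask value `TAG_ERROR_BITS` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mask: assume the low bits of a raw tag carry status flags. */
#define TAG_ERROR_BITS 0x3u

/* Hypothetical command record, identified by its bus address. */
struct cmd {
    uint32_t busaddr;
};

/* Return the command whose bus address matches the masked tag, or NULL. */
static struct cmd *find_cmd(struct cmd *cmds, size_t n, uint32_t raw_tag)
{
    /* Loop-invariant: compute the masked tag once, outside the loop. */
    const uint32_t tag = raw_tag & ~TAG_ERROR_BITS;

    for (size_t i = 0; i < n; i++) {
        if (cmds[i].busaddr == tag)
            return &cmds[i];
    }
    return NULL;
}
```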

11 Mar, 2011 2 commits

cciss: Add missing allocation in scsi_cmd_stack_setup and corresponding deallocation · 978eb516

Stephen M. Cameron authored Mar 11, 2011

This bit got lost somewhere along the way.  Without this, panic.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

cciss: export resettable host attribute · 957c2ec5

Stephen M. Cameron authored Mar 11, 2011

This attribute, requested by Red Hat, allows kexec-tools to know
whether the controller can honor the reset_devices kernel parameter
and actually reset the controller. For kdump to work properly it
is necessary that the reset_devices parameter be honored. This
attribute enables kexec-tools to warn the user if they attempt to
designate a non-resettable controller as the dump device.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

10 Mar, 2011 30 commits

drbd: drop code present under #ifdef which is relevant to 2.6.28 and below · 03567812

Or Gerlitz authored Jan 13, 2011

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: Fixed handling of read errors on a 'VerifyS' node · 7961243b

Philipp Reisner authored Mar 02, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: Fixed handling of read errors on a 'VerifyT' node · 8f21420e

Philipp Reisner authored Mar 01, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: Implemented real timeout checking for request processing time · 7fde2be9

Philipp Reisner authored Mar 01, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: Remove unused function atodb_endio() · c5a91619

Andreas Gruenbacher authored Jan 25, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: improve log message if received sector offset exceeds local capacity · fdda6544

Lars Ellenberg authored Jan 24, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: kill dead code · e99dc367

Lars Ellenberg authored Jan 24, 2011

This code became obsolete and unused last December with
 drbd: bitmap keep track of changes vs on-disk bitmap
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: don't BUG_ON, if bio_add_page of a single page to an empty bio fails · 10f6d992

Lars Ellenberg authored Jan 24, 2011

Deal with it more gracefully if we fail to add even a single page
to an empty bio. We used to BUG_ON() there, but this failure has been
observed in some Xen deployments, so we need to handle it more robustly.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: Removed left over, now wrong comments · 039312b6

Philipp Reisner authored Jan 21, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: serialize admin requests for new verify run with pending bitmap io · 873b0d5f

Lars Ellenberg authored Jan 21, 2011

This is an addendum to
 drbd: serialize admin requests for new resync with pending bitmap io

It avoids a race that could trigger "FIXME" assert log messages.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: fix potential imbalance of ap_in_flight · e636db5b

Lars Ellenberg authored Jan 21, 2011

When we receive a barrier ack, we walk the ring list of drbd requests
in the transfer log of the respective epoch, do some housekeeping,
and free those objects.

We tried to keep epochs of mirrored and unmirrored drbd requests
separate, and assert that no local-only requests are present in a
barrier_acked epoch.

It turns out that this has quite a number of corner cases and would
add bloated code without functional benefit.

We now revert the (insufficient) commits
 drbd: Fixed an issue with AHEAD -> SYNC_SOURCE transitions
 drbd: Ensure that an epoch contains only requests of one kind
and instead fix the processing of barrier acks to cope with
a mix of local-only and mirrored requests.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: silence some noisy log messages during disconnect · 0ddc5549

Lars Ellenberg authored Jan 21, 2011

If we fail to send the information that we lost our disk,
we have no connection, and no disk: no access to data anymore.
That is either expected (deconfiguration), or there will be so much
noise in the logs that "Sending state failed" is not useful at all.
Drop it.

If the reason for a shorter than expected receive was a signal,
which we sent because we already decided to disconnect,
these additional log messages are confusing and useless.

This patch follows this pattern:
 - dev_warn(DEV, "short read expecting header on sock: r=%d\n", r);
 + if (!signal_pending(current))
 + 	dev_warn(DEV, "short read expecting header on sock: r=%d\n", r);

Also make them all dev_warn for consistency.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: describe bitmap locking for bulk operation in finer detail · 20ceb2b2

Lars Ellenberg authored Jan 21, 2011

Now that we no longer endian-swap the bitmap in place, we allow
selected bitmap operations (testing bits, sometimes even setting bits)
during some bulk operations.

This caused us to hit a lot of FIXME asserts similar to
	FIXME asender in drbd_bm_count_bits,
	bitmap locked for 'write from resync_finished' by worker
Which now is nonsense: looking at the bitmap is perfectly legal
as long as it is not being resized.

This cosmetic patch defines some flags to describe expectations in finer
detail, so the asserts in e.g. bm_change_bits_to() can be skipped if
appropriate.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: log UUIDs whenever they change · 62b0da3a

Lars Ellenberg authored Jan 20, 2011

All decisions about sync, sync direction, and whether or not to
allow a connect or attach are based on our set of UUIDs to tag a
data generation.

Log changes to the UUIDs whenever they occur,
logging "new current UUID P:Q:R:S" is more useful
than "Creating new current UUID".
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: We can not process BIOs with a size of 0 · d07c9c10

Philipp Reisner authored Jan 20, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: Provide hints with the error message when clearing the sync pause flag · cd88d030

Philipp Reisner authored Jan 20, 2011

When the user clears the sync-pause flag and sync stays in pause
state, give the user hints as to why it is still in pause state.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: queue bitmap writeout more intelligently · 79a30d2d

Lars Ellenberg authored Jan 20, 2011

The "lazy writeout" of cleared bitmap pages happens during resync, and
should happen again once the resync finishes cleanly, or is aborted.

If resync finished cleanly, or was aborted because of peer disk
failure, we trigger the writeout from worker context in the after
state change work.

If resync was aborted because of connection failure, we should not
immediately trigger bitmap writeout, but rather postpone the
writeout to after the connection cleanup happened.  We now do it
in the receiver context from drbd_disconnect().

If resync was aborted because of local disk failure, well, there
is nothing to write to anymore.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: don't pointlessly queue bitmap send, if we lost connection · 54b956ab

Lars Ellenberg authored Jan 20, 2011

This is a minor optimization and cleanup,
and also considerably reduces some harmless (but noisy) race with
the connection cleanup code.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: serialize admin requests for new resync with pending bitmap io · 194bfb32

Lars Ellenberg authored Jan 18, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: only generate and send a new sync uuid after a successful state change · 6c922ed5

Lars Ellenberg authored Jan 12, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: cleaned up __set_current_state() followed by schedule_timeout() calls · 20ee6390

Philipp Reisner authored Jan 18, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: Ensure that an epoch contains only requests of one kind · 6a35c45f

Philipp Reisner authored Jan 17, 2011

The assert in drbd_req.c:755 forces us to have only requests of
one kind in an epoch. The two kinds we distinguish here are:
local-only or mirrored.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: Fixed P_NEG_ACK processing for protocol A and B · 2deb8336

Philipp Reisner authored Jan 17, 2011

Protocol A has no P_WRITE_ACKs, but has P_NEG_ACKs.
The master bio might already be completed, therefore the
request is no longer in the collision hash.
=> Do not try to validate block_id as request

In Protocol B we might already have got a P_RECV_ACK
but then get a P_NEG_ACK afterwards.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: Killed an assert that is no longer valid · 94f2b05f

Philipp Reisner authored Jan 17, 2011

The point is that drbd_disconnect() can be called with a cstate of
WFConnection.

That happens if the user issues "drbdsetup disconnect" while the
drbd_connect() function executes. Then drbdd_init() will call
drbdd(), which in turn will return without receiving any
packets. Then drbdd_init() will end up calling drbd_disconnect()
with a cstate of WFConnection.

Bottom line: This assertion is wrong as it is, and we do not
see value in fixing it. => Removing it.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: Do not drop net config if sending in drbd_send_protocol() fails · 148efa16

Philipp Reisner authored Jan 15, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: Work on the Ahead -> SyncSource transition · 370a43e7

Philipp Reisner authored Jan 14, 2011

The test for rs_pending_cnt == 0 was too weak. Test for
unacked_cnt == 0 instead, since unacked_cnt is already increased
when a P_RS_DATA_REQ comes in. Moved that test into the worker.

Also using a timer to make Ahead -> SyncSource -> Ahead cycles
slower...
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: Nothing should stop SyncSource -> Ahead transitions · 71c78cfb

Philipp Reisner authored Jan 14, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: Do not full sync if a P_SYNC_UUID packet gets lost · 4a23f264

Philipp Reisner authored Jan 11, 2011

See also commit from 2009-08-15
"drbd_uuid_compare(): Do not full sync in case a P_SYNC_UUID packet gets lost."

We saw cases where the History UUIDs were not as expected. So the
detection of the special case did not trigger. With the sync UUID
no longer being a random number, but deducible from the previous
bitmap UUID, the detection of this special case becomes more
reliable.

The SyncUUID now is the previous bitmap UUID + 0x1000000000000.

Rule 5a:
Cs = H1p & H1p + Offset = Bp
  Connection was lost before SyncUUID Packet came through.
  Correct (peer) UUIDs:
   Bp = H1p
   H1p = H2p
   H2p = 0
  Become Sync target.

Rule 7a:
Cp = H1s & H1s + Offset = Bs
  Connection was lost before SyncUUID Packet came through.
  Correct (own) UUIDs:
   Bs = H1s
   H1s = H2s
   H2s = 0
  Become Sync source.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
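
Rule 5a above can be sketched in C as a check followed by a correction of the peer's UUIDs. This is a reading of the rule as written in the commit message, not the driver's code. The struct `uuid_set`, the function names, and the macro name `SYNC_UUID_OFFSET` are hypothetical; only the offset value and the rule itself come from the message. The suffixes follow the message: `s` is the local node's own set, `p` is the peer's.

```c
#include <stdbool.h>
#include <stdint.h>

/* Offset quoted in the commit message; the macro name is hypothetical. */
#define SYNC_UUID_OFFSET 0x1000000000000ULL

/* Hypothetical container for one node's UUIDs. */
struct uuid_set {
    uint64_t current;  /* C  */
    uint64_t bitmap;   /* B  */
    uint64_t hist1;    /* H1 */
    uint64_t hist2;    /* H2 */
};

/* Rule 5a condition: Cs = H1p and H1p + Offset = Bp. */
static bool rule_5a_matches(const struct uuid_set *self,
                            const struct uuid_set *peer)
{
    return self->current == peer->hist1 &&
           peer->hist1 + SYNC_UUID_OFFSET == peer->bitmap;
}

/* Rule 5a correction of the peer UUIDs: Bp = H1p, H1p = H2p, H2p = 0. */
static void rule_5a_correct_peer(struct uuid_set *peer)
{
    peer->bitmap = peer->hist1;
    peer->hist1 = peer->hist2;
    peer->hist2 = 0;
}
```

Rule 7a is the mirror image: swap the roles of the own and peer sets, and the local node becomes sync source instead of sync target.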

drbd: Corrected off-by-one error in DRBD_MINOR_COUNT_MAX · 2b8a90b5

Philipp Reisner authored Jan 10, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: Remove useless / wrong comments · 110a204a

Andreas Gruenbacher authored Jan 03, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
