Commits · 3156d267016627fe427a6b0d4ed8a9678557e91e · Kirill Smelkov / linux

26 Jun, 2006 40 commits

ocfs2: move dlm work to a private work queue · 3156d267

Kurt Hackel authored May 01, 2006

The work that is done can block for long periods of time and so is not
appropriate for keventd.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

3156d267

ocfs2: fix incorrect error returns · 495ac96e

Kurt Hackel authored May 01, 2006

Use DLM_REJECTED instead of DLM_RECOVERING.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

495ac96e

ocfs2: tune down some noisy messages during dlm recovery · 3b3b84a8

Kurt Hackel authored May 01, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

3b3b84a8

ocfs2: display message before waiting for recovery to complete · 56a7c104

Kurt Hackel authored May 01, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

56a7c104

ocfs2: mlog in dlm_convert_lock_handler() should be ML_ERROR · 44a7f1d0

Kurt Hackel authored May 01, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

44a7f1d0

ocfs2: retry operations when a lock is marked in recovery · b220532a

Kurt Hackel authored May 01, 2006

Before checking for a nonexistent lock, make sure the lockres is not marked
RECOVERING. The caller will just retry and the state should be fixed up when
recovery completes.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

b220532a
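
The ordering described in the commit above (check the recovery state first, then check whether the lock exists, and let the caller retry) can be illustrated with a small userspace sketch. This is not the ocfs2 code: every name here (`sketch_status`, `sketch_lockres`, `sketch_handle`, `sketch_call_with_retry`) is hypothetical, and the "recovery completes" step is simulated inline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical status codes; they only echo the wording of the commit. */
enum sketch_status { SK_NORMAL, SK_RECOVERING, SK_NOLOCK };

/* Hypothetical stand-in for a lock resource. */
struct sketch_lockres {
    bool recovering; /* set while the resource is being recovered */
    bool has_lock;   /* whether the requested lock currently exists */
};

/* Handler side: test the recovery flag before the existence check, so a
 * lock that is merely missing during recovery is not reported as an error. */
static enum sketch_status sketch_handle(const struct sketch_lockres *res)
{
    assert(res != NULL);
    if (res->recovering)
        return SK_RECOVERING; /* caller is expected to retry */
    if (!res->has_lock)
        return SK_NOLOCK;     /* genuine "no such lock" */
    return SK_NORMAL;
}

/* Caller side: retry while the handler reports RECOVERING. */
static enum sketch_status sketch_call_with_retry(struct sketch_lockres *res,
                                                 int max_tries, int *tries_used)
{
    enum sketch_status st = SK_RECOVERING;
    int tries = 0;

    while (tries < max_tries) {
        st = sketch_handle(res);
        tries++;
        if (st != SK_RECOVERING)
            break;
        /* Simulation only: pretend recovery finished and fixed up the state. */
        res->recovering = false;
        res->has_lock = true;
    }
    if (tries_used != NULL)
        *tries_used = tries;
    return st;
}
```

With the flag checked first, a request that arrives mid-recovery gets a retryable status instead of a hard "lock not found" error.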

ocfs2: use cond_resched() in dlm_thread() · f85cd47a

Kurt Hackel authored May 01, 2006

yield() does not yield.  cond_resched() does.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

f85cd47a

ocfs2: use GFP_NOFS in some dlm operations · ad8100e0

Kurt Hackel authored May 01, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ad8100e0

ocfs2: wait for recovery when starting lock mastery · b7084ab5

Kurt Hackel authored May 01, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

b7084ab5

ocfs2: continue recovery when a dead node is encountered · c27069e6

Kurt Hackel authored May 01, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

c27069e6

ocfs2: remove unneccesary spin_unlock() in dlm_remaster_locks() · 67a18741

Kurt Hackel authored May 01, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

67a18741

ocfs2: dlm_remaster_locks() should never exit without completing · 6a413211

Kurt Hackel authored May 01, 2006

We cannot restart recovery. Once we begin to recover a node, keep the state
of the recovery intact and follow through, regardless of any other node
deaths that may occur.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

6a413211

ocfs2: special case recovery lock in dlmlock_remote() · c8df412e

Kurt Hackel authored May 01, 2006

If the previous master of the recovery lock dies, let calc_usage take it
down completely and let the caller completely redo the dlmlock() call.
Otherwise, there will never be an opportunity to re-master the lockres and
recovery won't be able to progress.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

c8df412e

ocfs2: pending mastery asserts and migrations should block each other · 36407488

Kurt Hackel authored May 01, 2006

Use the existing structure for blocking migrations when ASTs are pending to
achieve the same result. If we can catch the assert before it goes on the
wire, just cancel it and let the migration continue.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

36407488

ocfs2: temporarily disable automatic lock migration · c87a9ae7

Kurt Hackel authored May 01, 2006

Now we never change the owner of a lock resource until unmount or node
death. This will be re-enabled once some issues in the algorithm used have
been resolved.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

c87a9ae7

ocfs2: do not unconditionally purge the lockres in dlmlock_remote() · 2abaf97e

Kurt Hackel authored May 01, 2006

In dlmlock_remote(), do not call purge_lockres until the lock resource
actually changes. Otherwise, the mastery info on the lockres will go away
underneath the caller.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

2abaf97e

ocfs2: increase backoff before waiting for recovery · aa087b84

Kurt Hackel authored May 01, 2006

When mastering non-recovery lock resources, additional time was frequently
needed to allow the disk heartbeat to catch up with the network timeout. The
recovery lock resource is time critical and avoids this path.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

aa087b84

ocfs2: have dlm_pre_master_reco_lockres() ignore dead nodes · f42a100b

Kurt Hackel authored May 01, 2006

Recovery will spin in dlm_pre_master_reco_lockres if we do not ignore
timed-out network responses from dead nodes.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

f42a100b
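
As a rough illustration of why ignoring timeouts from dead nodes prevents the spin described above, here is a userspace sketch of a query loop. It does not reproduce `dlm_pre_master_reco_lockres()`; the liveness map, `sk_query()`, and `sk_query_all()` are hypothetical stand-ins, and the network is simulated so that a dead node always times out.

```c
#include <assert.h>
#include <stdbool.h>

#define SK_MAX_NODES 8

enum sk_reply { SK_REPLY_OK, SK_REPLY_TIMEOUT };

/* Hypothetical cluster view: which nodes are currently alive. */
struct sk_cluster {
    bool alive[SK_MAX_NODES];
};

/* Simulated network query: a dead node never answers. */
static enum sk_reply sk_query(const struct sk_cluster *c, int node)
{
    return c->alive[node] ? SK_REPLY_OK : SK_REPLY_TIMEOUT;
}

/* Query every node. A timeout from a dead node is ignored and the loop
 * moves on; without that check the inner loop would retry forever.
 * Returns the number of nodes that answered, or -1 if a live node kept
 * timing out for max_retries attempts. */
static int sk_query_all(const struct sk_cluster *c, int max_retries)
{
    int answered = 0;

    assert(c != 0);
    for (int node = 0; node < SK_MAX_NODES; node++) {
        int attempt = 0;

        for (;;) {
            if (sk_query(c, node) == SK_REPLY_OK) {
                answered++;
                break;
            }
            if (!c->alive[node])
                break; /* dead node: ignore the timeout */
            if (++attempt >= max_retries)
                return -1;
        }
    }
    return answered;
}
```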

ocfs2: give the dlm dirty list a reference on the lockres · 6ff06a93

Kurt Hackel authored May 01, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

6ff06a93

ocfs2: teach dlm_restart_lock_mastery() to wait on recovery · e7e69eb3

Kurt Hackel authored May 01, 2006

Change behavior of dlm_restart_lock_mastery() when a node goes down.  Dump
all responses that have been collected and start over.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

e7e69eb3

ocfs2: gracefully handle stale create_lock messages. · e4eb0368

Kurt Hackel authored May 01, 2006

This is an error on the sending side, so gracefully error out on the
receiving end.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

e4eb0368

ocfs2: update lvb immediately during recovery · ccd8b1f9

Kurt Hackel authored May 01, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ccd8b1f9

ocfs2: do not send master requests to localhost · 588e0090

Kurt Hackel authored May 01, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

588e0090

ocfs2: purge lockres' sooner · 8b219809

Kurt Hackel authored May 01, 2006

Immediately purge a lockres that the local node is not the master of.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

8b219809

ocfs2: dump mismatching migrated lvbs before BUG() · 343e26a4

Kurt Hackel authored May 01, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

343e26a4

ocfs2: make dlm recovery finalization 2 stage · 466d1a45

Kurt Hackel authored May 01, 2006

Makes it easier for the recovery process to deal with node death.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

466d1a45

ocfs2: dlm recovery / lockres reference count fix · 69d72b06

Kurt Hackel authored May 01, 2006

Take a reference on lockres structures while they are on the recovery list.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

69d72b06
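
The idea in the commit above, that membership on a list should itself hold a reference, can be sketched in plain userspace C. This is only an illustration under assumptions: the kernel code uses its own reference-counting and list primitives, while `sk_lockres`, `sk_get()`, `sk_put()`, `sk_reco_add()`, and `sk_reco_pop()` below are hypothetical names, and no locking is shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lock resource with a plain integer reference count. */
struct sk_lockres {
    int refs;                     /* number of outstanding references */
    int on_reco_list;             /* nonzero while linked on the list */
    struct sk_lockres *next_reco; /* singly linked recovery list */
};

static void sk_get(struct sk_lockres *r)
{
    assert(r->refs > 0);
    r->refs++;
}

/* Drops one reference and returns how many remain. */
static int sk_put(struct sk_lockres *r)
{
    assert(r->refs > 0);
    return --r->refs;
}

/* Adding to the recovery list takes a reference owned by the list, so the
 * resource cannot be released while recovery still points at it. */
static void sk_reco_add(struct sk_lockres **head, struct sk_lockres *r)
{
    if (r->on_reco_list)
        return;
    sk_get(r);
    r->next_reco = *head;
    *head = r;
    r->on_reco_list = 1;
}

/* Removing from the list hands the list's reference to the caller, who
 * must call sk_put() once finished with the resource. */
static struct sk_lockres *sk_reco_pop(struct sk_lockres **head)
{
    struct sk_lockres *r = *head;

    if (r == NULL)
        return NULL;
    *head = r->next_reco;
    r->next_reco = NULL;
    r->on_reco_list = 0;
    return r;
}
```

In this sketch the original owner can drop its reference while the resource is still queued for recovery, and the count only reaches zero after the list's reference is released too.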

ocfs2: better error handling during assert master message · a9ee4c8a

Kurt Hackel authored Apr 27, 2006

Handle errors during lock assert master by killing either the local node or
the other node.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

a9ee4c8a

ocfs2: dump lockres info before we BUG() on a bad reference · a7f90d83

Kurt Hackel authored Apr 27, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

a7f90d83

ocfs2: do LVB puts in place · c0a8520c

Mark Fasheh authored Apr 27, 2006

Don't wait until the AST will be fired to do the LVB copy into the lock
resource.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

c0a8520c

ocfs2: mle ref count debugging · aa852354

Kurt Hackel authored Apr 27, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

aa852354

ocfs2: allow for an assert message during lock mastery · dc2ed195

Kurt Hackel authored Apr 27, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

dc2ed195

ocfs2: take mle reference during migration · 2d1a868c

Kurt Hackel authored Apr 27, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

2d1a868c

ocfs2: properly initialize the mle structure · 41b8c8a1

Kurt Hackel authored Apr 27, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

41b8c8a1

ocfs2: detach mle from heartbeat events · da01ad05

Kurt Hackel authored Apr 27, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

da01ad05

ocfs2: mle ref counting fixes · a2bf0477

Kurt Hackel authored Apr 27, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

a2bf0477

ocfs2: better mle debugging · 95883719

Kurt Hackel authored Apr 27, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

95883719

ocfs2: clean up recovery related messages · d6dea6e9

Kurt Hackel authored Apr 27, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

d6dea6e9

ocfs2: handle network errors during recovery · 29c0fa0f

Kurt Hackel authored Apr 27, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

29c0fa0f

ocfs2: only recover one dead node at a time · c3187ce5

Kurt Hackel authored Apr 27, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

c3187ce5