Commits · f30cb757f680f965ba8a2e53cb3588052a01aeb5 · Kirill Smelkov / linux

21 Apr, 2017 6 commits

NFS: Always wait for I/O completion before unlock · f30cb757

Benjamin Coddington authored Apr 11, 2017

NFS attempts to wait for read and write completion before unlocking in
order to ensure that the data returned was protected by the lock.  When
this waiting is interrupted by a signal, the unlock may be skipped, and
messages similar to the following are seen in the kernel ring buffer:

[20.167876] Leaked locks on dev=0x0:0x2b ino=0x8dd4c3:
[20.168286] POSIX: fl_owner=ffff880078b06940 fl_flags=0x1 fl_type=0x0 fl_pid=20183
[20.168727] POSIX: fl_owner=ffff880078b06680 fl_flags=0x1 fl_type=0x0 fl_pid=20185

For NFSv3, the missing unlock will cause the server to refuse conflicting
locks indefinitely.  For NFSv4, the leftover lock will be removed by the
server after the lease timeout.

This patch fixes this issue by skipping the usual wait in
nfs_iocounter_wait if the FL_CLOSE flag is set when signaled.  Instead, the
wait happens in the unlock RPC task on the NFS UOC rpc_waitqueue.

For NFSv3, use lockd's new nlmclnt_operations along with
nfs_async_iocounter_wait to defer NLM's unlock task until the lock
context's iocounter reaches zero.

For NFSv4, call nfs_async_iocounter_wait() directly from unlock's
current rpc_call_prepare.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

f30cb757

lockd: Introduce nlmclnt_operations · b1ece737

Benjamin Coddington authored Apr 11, 2017

NFS would enjoy the ability to modify the behavior of the NLM client's
unlock RPC task in order to delay the transmission of the unlock until IO
that was submitted under that lock has completed.  This ability can ensure
that the NLM client will always complete the transmission of an unlock even
if the waiting caller has been interrupted with fatal signal.

For this purpose, a pointer to a struct nlmclnt_operations can be assigned
in a nfs_module's nfs_rpc_ops that will install those nlmclnt_operations on
the nlm_host.  The struct nlmclnt_operations defines three callback
operations that will be used in a following patch:

nlmclnt_alloc_call - used to call back after a successful allocation of
	a struct nlm_rqst in nlmclnt_proc().

nlmclnt_unlock_prepare - used to call back during NLM unlock's
	rpc_call_prepare.  The NLM client defers calling rpc_call_start()
	until this callback returns false.

nlmclnt_release_call - used to call back when the NLM client's struct
	nlm_rqst is freed.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

b1ece737

NFS: Add an iocounter wait function for async RPC tasks · 7d6ddf88

Benjamin Coddington authored Apr 11, 2017

By sleeping on a new NFS Unlock-On-Close waitqueue, rpc tasks may wait for
a lock context's iocounter to reach zero. The rpc waitqueue is only woken
when the open_context has the NFS_CONTEXT_UNLOCK flag set in order to
mitigate spurious wake-ups for any iocounter reaching zero.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

7d6ddf88

locks: Set FL_CLOSE when removing flock locks on close() · 50f2112c

Benjamin Coddington authored Apr 11, 2017

Set FL_CLOSE in fl_flags as in locks_remove_posix() when clearing locks.
NFS will check for this flag to ensure an unlock is sent in a following
patch.

Fuse handles flock and posix locks differently for FL_CLOSE, and so
requires a fixup to retain the existing behavior for flock.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

50f2112c

NFS: Move the flock open mode check into nfs_flock() · e1293727

Benjamin Coddington authored Apr 11, 2017

We only need to check lock exclusive/shared types against open mode when
flock() is used on NFS, so move it into the flock-specific path instead of
checking it for all locks.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

e1293727

NFS4: remove a redundant lock range check · 12a16d15

Benjamin Coddington authored Apr 11, 2017

flock64_to_posix_lock() is already doing this check
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Jeff Layton <jeff.layton@primarydata.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

12a16d15

20 Apr, 2017 34 commits

pNFS: unexport nfs4_pnfs_v3_ds_connect_unload · 675e508f

Trond Myklebust authored Apr 20, 2017

It is not used outside the NFSv4 module.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

675e508f

pNFS: Unexport pnfs_put_lseg_locked and _pnfs_return_layout · b9419688
Trond Myklebust authored Apr 20, 2017
```
They are not used outside the NFSv4 module.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
```
b9419688

pNFS: Remove unused layout driver callbacks · 73504740

Trond Myklebust authored Apr 20, 2017

encode_layoutreturn and encode_layoutcommit are now unused. Let's
remove them.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

73504740

nfs: remove the objlayout driver · 6d22323b

Christoph Hellwig authored Apr 12, 2017

The objlayout code has been in the tree, but it's been unmaintained and
no server product for it actually ever shipped.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

6d22323b

pNFS/flexfiles: Check the result of nfs4_pnfs_ds_connect · 260f32ad

Trond Myklebust authored Apr 20, 2017

The check in nfs4_ff_layout_prepare_ds() seems to be missing.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Fixes: a33e4b03 ("pNFS: return status from nfs4_pnfs_ds_connect")
Cc: Weston Andros Adamson <dros@primarydata.com>
Cc: stable@vger.kernel.org # v4.11

260f32ad

NFSv4: Fix a hang in OPEN related to server reboot · 56e0d71e

Trond Myklebust authored Apr 15, 2017

If the server fails to return the attributes as part of an OPEN
reply, and then reboots, we can end up hanging. The reason is that
the client attempts to send a GETATTR in order to pick up the
missing OPEN call, but fails to release the slot first, causing
reboot recovery to deadlock.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Fixes: 2e80dbe7 ("NFSv4.1: Close callback races for OPEN, LAYOUTGET...")
Cc: stable@vger.kernel.org # v4.8+

56e0d71e

NFS: move rw_mode to nfs_pageio_header · fbe77c30

Benjamin Coddington authored Apr 19, 2017

Let's try to have it in a cacheline in nfs4_proc_pgio_rpc_prepare().
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

fbe77c30

NFS: move nfs_pgarray_set() to open code · 8ef9b0b9

Benjamin Coddington authored Apr 19, 2017

Since commit 00bfa30a ("NFS: Create a common pgio_alloc and
pgio_release function"), nfs_pgarray_set() has only a single caller.  Let's
open code it.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

8ef9b0b9

NFS: Use GFP_NOIO for two allocations in writeback · ae97aa52

Benjamin Coddington authored Apr 19, 2017

Prevent a deadlock that can occur if we wait on allocations
that try to write back our pages.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Fixes: 00bfa30a ("NFS: Create a common pgio_alloc and pgio_release...")
Cc: stable@vger.kernel.org # 3.16+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

ae97aa52

NFS: Fix use after free in write error path · 1f84ccdf

Fred Isaman authored Apr 14, 2017

Signed-off-by: Fred Isaman <fred.isaman@gmail.com>
Fixes: 0bcbf039 ("nfs: handle request add failure properly")
Cc: stable@vger.kernel.org # v4.5+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

1f84ccdf

NFS: Fix missing pg_cleanup after nfs_pageio_cond_complete() · 43b7d964

Benjamin Coddington authored Apr 14, 2017

Commit a7d42ddb ("nfs: add mirroring
support to pgio layer") moved pg_cleanup out of the path when there was
non-sequental I/O that needed to be flushed.  The result is that for
layouts that have more than one layout segment per file, the pg_lseg is not
cleared, so we can end up hitting the WARN_ON_ONCE(req_start >= seg_end) in
pnfs_generic_pg_test since the pg_lseg will be pointing to that
previously-flushed layout segment.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Fixes: a7d42ddb ("nfs: add mirroring support to pgio layer")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

43b7d964

sunrpc: don't check for failure from mempool_alloc() · 62b2417e

NeilBrown authored Apr 10, 2017

When mempool_alloc() is allowed to sleep (GFP_NOIO allows
sleeping) it cannot fail.
So rpc_alloc_task() cannot fail, so rpc_new_task doesn't need
to test for failure.
Consequently rpc_new_task() cannot fail, so the callers
don't need to test.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

62b2417e

NFS: fix usage of mempools. · 518662e0

NeilBrown authored Apr 10, 2017

When passed GFP flags that allow sleeping (such as
GFP_NOIO), mempool_alloc() will never return NULL, it will
wait until memory is available.

This means that we don't need to handle failure, but that we
do need to ensure one thread doesn't call mempool_alloc()
twice on the one pool without queuing or freeing the first
allocation.  If multiple threads did this during times of
high memory pressure, the pool could be exhausted and a
deadlock could result.

pnfs_generic_alloc_ds_commits() attempts to allocate from
the nfs_commit_mempool while already holding an allocation
from that pool.  This is not safe.  So change
nfs_commitdata_alloc() to take a flag that indicates whether
failure is acceptable.

In pnfs_generic_alloc_ds_commits(), accept failure and
handle it as we currently do.  Else where, do not accept
failure, and do not handle it.

Even when failure is acceptable, we want to succeed if
possible.  That means both
 - using an entry from the pool if there is one
 - waiting for direct reclaim is there isn't.

We call mempool_alloc(GFP_NOWAIT) to achieve the first, then
kmem_cache_alloc(GFP_NOIO|__GFP_NORETRY) to achieve the
second.  Each of these can fail, but together they do the
best they can without blocking indefinitely.

The objects returned by kmem_cache_alloc() will still be freed
by mempool_free().  This is safe as mempool_alloc() uses
exactly the same function to allocate objects (since the mempool
was created with mempool_create_slab_pool()).  The object returned
by mempool_alloc() and kmem_cache_alloc() are indistinguishable
so mempool_free() will handle both identically, either adding to the
pool or calling kmem_cache_free().

Also, don't test for failure when allocating from
nfs_wdata_mempool.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

518662e0

NFS: Clean up nfs4_proc_get_lease_time() · f6148713

Anna Schumaker authored Apr 07, 2017

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

f6148713

NFS: Clean up _nfs4_proc_exchange_id() · e917f0d1

Anna Schumaker authored Apr 07, 2017

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

e917f0d1

NFS: Clean up nfs4_proc_bind_one_conn_to_session() · c7ae7639

Anna Schumaker authored Apr 07, 2017

Returning errors directly even lets us remove the goto
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

c7ae7639

NFS: Remove extra dprintk()s from nfs4namespace.c · 3183783b

Anna Schumaker authored Apr 07, 2017

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

3183783b

NFS: Clean up nfs4_get_rootfh() · 539fd1d1

Anna Schumaker authored Apr 07, 2017

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

539fd1d1

NFS: Remove extra dprintk()s from nfs4client.c · 4fe6b366

Anna Schumaker authored Apr 07, 2017

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

4fe6b366

NFS: Clean up nfs4_init_server() · 1073d9b4

Anna Schumaker authored Apr 07, 2017

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

1073d9b4

NFS: Clean up nfs4_set_client() · 2dc42c0d

Anna Schumaker authored Apr 07, 2017

If we cut out the dprintk()s, then we can return error codes directly
and cut out the goto.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

2dc42c0d

NFS: Clean up nfs4_check_server_scope() · 8da0f934

Anna Schumaker authored Apr 07, 2017

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

8da0f934

NFS: Clean up nfs4_check_serverowner_major_id() · ddfa0d48

Anna Schumaker authored Apr 07, 2017

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

ddfa0d48

NFS: Create a common nfs4_match_client() function · 14d1bbb0

Anna Schumaker authored Apr 07, 2017

This puts all the common code in a single place for the
walk_client_list() functions.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

14d1bbb0

NFS: Clean up nfs4_check_serverowner_minor_id() · 5b6d3ff6

Anna Schumaker authored Apr 07, 2017

Once again, we can remove the function and compare integer values
directly.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

5b6d3ff6

NFS: Clean up nfs4_match_clientids() · f251fd9e

Anna Schumaker authored Apr 07, 2017

If we cut out the dprintk()s, then we don't even need this to be a
separate function.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

f251fd9e

NFS: Clean up nfs42_layoutstat_done() · 5be1810a

Anna Schumaker authored Apr 07, 2017

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

5be1810a

NFS: Remove extra dprintk()s from namespace.c · e36d48e9

Anna Schumaker authored Apr 07, 2017

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

e36d48e9

NFS: Clean up nfs_direct_commit_complete() · fe4f844d

Anna Schumaker authored Apr 07, 2017

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

fe4f844d

NFS: Remove nfs_direct_readpage_release() · beeb5338

Anna Schumaker authored Apr 07, 2017

Just remove the function and have the caller use nfs_release_request()
instead.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

beeb5338

NFS: Clean up extra dprintk()s in client.c · 4cbb9768

Anna Schumaker authored Apr 07, 2017

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

4cbb9768

NFS: Clean up nfs_init_client() · 2844b6ae

Anna Schumaker authored Apr 07, 2017

We always call nfs_mark_client_ready() even if nfs_create_rpc_client()
returns an error, so we can rearrange nfs_init_client() to mark the
client ready from a single place.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

2844b6ae

NFS: Remove extra dprintk()s from callback_xdr.c · 36718a66

Anna Schumaker authored Apr 07, 2017

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

36718a66

NFS: Clean up encode_cb_sequence_res() · 3d0bfaa6

Anna Schumaker authored Apr 07, 2017

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

3d0bfaa6