Commits · 5a01c805441bdc86e7af206d8a03735cc9394ffb · Kirill Smelkov / linux

14 Nov, 2022 1 commit

NFSD: Fix trace_nfsd_fh_verify_err() crasher · 5a01c805

Chuck Lever authored Nov 12, 2022

Now that the nfsd_fh_verify_err() tracepoint is always called on
error, it needs to handle cases where the filehandle is not yet
fully formed.

Fixes: 93c128e7 ("nfsd: ensure we always call fh_verify_error tracepoint")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>

5a01c805

08 Nov, 2022 1 commit

nfsd: put the export reference in nfsd4_verify_deleg_dentry · 50256e47

Jeff Layton authored Nov 08, 2022

nfsd_lookup_dentry returns an export reference in addition to the dentry
ref. Ensure that we put it too.

Link: https://bugzilla.redhat.com/show_bug.cgi?id=2138866
Fixes: 876c553c ("NFSD: verify the opened dentry after setting a delegation")
Reported-by: Yongcheng Yang <yoyang@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

50256e47

05 Nov, 2022 1 commit

nfsd: fix use-after-free in nfsd_file_do_acquire tracepoint · bdd6b562

Jeff Layton authored Nov 05, 2022

When we fail to insert into the hashtable with a non-retryable error,
we'll free the object and then goto out_status. If the tracepoint is
enabled, it'll end up accessing the freed object when it tries to
grab the fields out of it.

Set nf to NULL after freeing it to avoid the issue.

Fixes: 243a5263 ("nfsd: rework hashtable handling in nfsd_do_file_acquire")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

bdd6b562
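
The pattern behind the bdd6b562 fix can be illustrated in user space. The following is a minimal sketch, not the kernel code: `struct demo_file`, `demo_trace()` and `demo_acquire()` are hypothetical stand-ins for the nfsd_file object, the tracepoint, and the acquire path. It shows why clearing the pointer after freeing keeps the common exit path from touching freed memory.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct nfsd_file; not the kernel definition. */
struct demo_file {
	int inode_no;
};

/* Stand-in for a tracepoint: reads fields only when the object exists. */
int demo_trace(const struct demo_file *nf)
{
	return nf ? nf->inode_no : -1;
}

/*
 * Mimics the error path: on insert failure the object is freed and the
 * pointer is cleared before reaching the common exit, where the
 * "tracepoint" runs.
 */
int demo_acquire(int insert_fails)
{
	struct demo_file *nf = malloc(sizeof(*nf));
	int traced;

	assert(nf != NULL);
	nf->inode_no = 42;

	if (insert_fails) {
		free(nf);
		nf = NULL;	/* without this, demo_trace() would read freed memory */
	}

	traced = demo_trace(nf);
	free(nf);		/* free(NULL) is a no-op */
	return traced;
}
```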

01 Nov, 2022 1 commit

nfsd: fix net-namespace logic in __nfsd_file_cache_purge · d3aefd2b

Jeff Layton authored Oct 31, 2022

If the namespace doesn't match the one in "net", then we'll continue,
but that doesn't cause another rhashtable_walk_next call, so it will
loop infinitely.

Fixes: ce502f81 ("NFSD: Convert the filecache to use rhashtable")
Reported-by: Petr Vorel <pvorel@suse.cz>
Link: https://lore.kernel.org/ltp/Y1%2FP8gDAcWC%2F+VR3@pevik/
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

d3aefd2b
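
The loop shape that avoids the hang described in d3aefd2b can be sketched as follows. This is an illustration under assumptions, not the kernel code: `demo_walk_next()` is a hypothetical analogue of rhashtable_walk_next() over a plain array, and `net_id` stands in for the net namespace. The point is that the iterator advances at the bottom of every pass, including passes that skip a non-matching entry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cache entry tagged with the namespace it belongs to. */
struct demo_entry {
	int net_id;
	int purged;
};

struct demo_iter {
	struct demo_entry *tab;
	size_t len;
	size_t pos;
};

/* Returns the next entry, or NULL once the walk is finished. */
struct demo_entry *demo_walk_next(struct demo_iter *it)
{
	return it->pos < it->len ? &it->tab[it->pos++] : NULL;
}

/* Purges entries that belong to net_id and returns how many were purged. */
int demo_purge(struct demo_entry *tab, size_t len, int net_id)
{
	struct demo_iter it = { tab, len, 0 };
	struct demo_entry *e;
	int purged = 0;

	assert(tab != NULL || len == 0);

	e = demo_walk_next(&it);
	while (e != NULL) {
		if (e->net_id == net_id) {
			e->purged = 1;
			purged++;
		}
		/* Always advance, even when the entry was skipped. */
		e = demo_walk_next(&it);
	}
	return purged;
}
```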

13 Oct, 2022 1 commit

nfsd: ensure we always call fh_verify_error tracepoint · 93c128e7

Jeff Layton authored Oct 12, 2022

This is a conditional tracepoint. Call it every time, not just when
nfs_permission fails.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

93c128e7

11 Oct, 2022 1 commit

NFSD: unregister shrinker when nfsd_init_net() fails · bd86c69d

Tetsuo Handa authored Oct 10, 2022

syzbot is reporting UAF read at register_shrinker_prepared() [1], for
commit 7746b32f ("NFSD: add shrinker to reap courtesy clients on
low memory condition") missed that nfsd4_leases_net_shutdown() from
nfsd_exit_net() is called only when nfsd_init_net() succeeded.
If nfsd_init_net() fails due to nfsd_reply_cache_init() failure,
register_shrinker() from nfsd4_init_leases_net() has to be undone
before nfsd_init_net() returns.

Link: https://syzkaller.appspot.com/bug?extid=ff796f04613b4c84ad89 [1]
Reported-by: syzbot <syzbot+ff796f04613b4c84ad89@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Fixes: 7746b32f ("NFSD: add shrinker to reap courtesy clients on low memory condition")
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

bd86c69d
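
The unwinding rule described in bd86c69d (undo earlier setup steps when a later one fails, because the exit hook only runs after a successful init) can be sketched in user space. All names here are hypothetical stand-ins; the flag simply models whether the shrinker is registered.

```c
#include <errno.h>

/* Models whether the shrinker is currently registered. */
int demo_shrinker_registered;

int demo_init_leases(void)
{
	demo_shrinker_registered = 1;
	return 0;
}

void demo_shutdown_leases(void)
{
	demo_shrinker_registered = 0;
}

/* Stand-in for a later init step that may fail. */
int demo_reply_cache_init(int fail)
{
	return fail ? -ENOMEM : 0;
}

/*
 * If a later step fails, the earlier registration is undone before
 * returning, since no exit hook will run for a failed init.
 */
int demo_init_net(int cache_fails)
{
	int ret;

	ret = demo_init_leases();
	if (ret)
		goto out;
	ret = demo_reply_cache_init(cache_fails);
	if (ret)
		goto out_leases;
	return 0;

out_leases:
	demo_shutdown_leases();
out:
	return ret;
}
```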

05 Oct, 2022 2 commits

nfsd: rework hashtable handling in nfsd_do_file_acquire · 243a5263

Jeff Layton authored Oct 04, 2022

nfsd_file is RCU-freed, so we need to hold the rcu_read_lock long enough
to get a reference after finding it in the hash. Take the
rcu_read_lock() and call rhashtable_lookup directly.

Switch to using rhashtable_lookup_insert_key as well, and use the usual
retry mechanism if we hit an -EEXIST. Rename the "retry" bool to
open_retry, and eliminate the insert_err goto target.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

243a5263

nfsd: fix nfsd_file_unhash_and_dispose · 8d0d254b

Jeff Layton authored Sep 30, 2022

nfsd_file_unhash_and_dispose() is called for two reasons:

We're either shutting down and purging the filecache, or we've gotten a
notification about a file delete, so we want to go ahead and unhash it
so that it'll get cleaned up when we close.

We're either walking the hashtable or doing a lookup in it and we
don't take a reference in either case. What we want to do in both cases
is to try and unhash the object and put it on the dispose list if that
was successful. If it's no longer hashed, then we don't want to touch
it, with the assumption being that something else is already cleaning
up the sentinel reference.

Instead of trying to selectively decrement the refcount in this
function, just unhash it, and if that was successful, move it to the
dispose list. Then, the disposal routine will just clean that up as
usual.

Also, just make this a void function, drop the WARN_ON_ONCE, and the
comments about deadlocking since the nature of the purported deadlock
is no longer clear.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

8d0d254b
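
The simplified logic described in 8d0d254b (unhash, and only on success move the object to the dispose list) can be sketched as below. This is a user-space illustration with hypothetical names; a counter stands in for the dispose list, and no locking or refcounting is modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cache object; flags stand in for list membership. */
struct demo_node {
	bool hashed;
	bool on_dispose;
};

/*
 * If the node is still hashed, unhash it and queue it for disposal.
 * If it is no longer hashed, leave it alone: something else is
 * assumed to be cleaning it up already.
 */
void demo_unhash_and_queue(struct demo_node *n, int *dispose_count)
{
	assert(n != NULL && dispose_count != NULL);

	if (!n->hashed)
		return;
	n->hashed = false;
	n->on_dispose = true;
	(*dispose_count)++;
}
```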

26 Sep, 2022 32 commits

nfsd: extra checks when freeing delegation stateids · 895ddf5e

Jeff Layton authored Sep 26, 2022

We've had some reports of problems in the refcounting for delegation
stateids that we've yet to track down. Add some extra checks to ensure
that we've removed the object from various lists before freeing it.

Link: https://bugzilla.redhat.com/show_bug.cgi?id=2127067
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

895ddf5e

nfsd: make nfsd4_run_cb a bool return function · b95239ca

Jeff Layton authored Sep 26, 2022

queue_work can return false and not queue anything, if the work is
already queued. If that happens in the case of a CB_RECALL, we'll have
taken an extra reference to the stid that will never be put. Ensure we
throw a warning in that case.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

b95239ca
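
The situation described in b95239ca can be modeled in a few lines. This is a sketch under assumptions: `demo_queue_work()` imitates only the "already pending" behaviour of queue_work, and a counter stands in for the kernel warning. It shows how a boolean return lets the caller notice that the reference it took will never be put.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_work {
	bool pending;
};

/* Hypothetical stateid with a reference count and a recall work item. */
struct demo_stid {
	int refcount;
	struct demo_work recall;
	int warnings;
};

/* Returns false without queueing anything if the work is already pending. */
bool demo_queue_work(struct demo_work *w)
{
	if (w->pending)
		return false;
	w->pending = true;
	return true;
}

/* Returning bool (rather than void) propagates the outcome to the caller. */
bool demo_run_cb(struct demo_work *w)
{
	return demo_queue_work(w);
}

void demo_recall(struct demo_stid *s)
{
	assert(s != NULL);

	s->refcount++;			/* reference held for the queued callback */
	if (!demo_run_cb(&s->recall))
		s->warnings++;		/* stand-in for a kernel warning: ref is leaked */
}
```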

nfsd: fix comments about spinlock handling with delegations · 25fbe1fc

Jeff Layton authored Sep 26, 2022

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

25fbe1fc

nfsd: only fill out return pointer on success in nfsd4_lookup_stateid · 4d01416a

Jeff Layton authored Sep 26, 2022

In the case of a revoked delegation, we still fill out the pointer even
when returning an error, which is bad form. Only overwrite the pointer
on success.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

4d01416a
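
The convention adopted in 4d01416a (leave the caller's pointer untouched unless the lookup succeeds) can be sketched as follows. The names and error codes are illustrative assumptions, not the nfsd ones.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stateid with a single "revoked" flag. */
struct demo_stid {
	int revoked;
};

/*
 * Writes *out only on success. On any error the caller's pointer keeps
 * whatever value it had before the call.
 */
int demo_lookup(struct demo_stid *found, struct demo_stid **out)
{
	assert(out != NULL);

	if (found == NULL)
		return -ENOENT;
	if (found->revoked)
		return -ESTALE;	/* error: *out is deliberately not written */
	*out = found;
	return 0;
}
```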

NFSD: fix use-after-free on source server when doing inter-server copy · 019805fe

Dai Ngo authored Sep 26, 2022

A use-after-free occurred when the laundromat tried to free an expired
cpntf_state entry on the s2s_cp_stateids list after inter-server
copy completed. The sc_cp_list that the expired copy state was
inserted on was already freed.

When COPY completes, the Linux client normally sends LOCKU(lock_state x),
FREE_STATEID(lock_state x) and CLOSE(open_state y) to the source server.
The nfs4_put_stid call from nfsd4_free_stateid cleans up the copy state
from the s2s_cp_stateids list before freeing the lock state's stid.

However, sometimes the CLOSE was sent before the FREE_STATEID request.
When this happens, the nfsd4_close_open_stateid call from nfsd4_close
frees all lock states on its st_locks list without cleaning up the copy
state on the sc_cp_list list. By the time the FREE_STATEID arrives, the
server returns BAD_STATEID since the lock state was freed. This causes
the use-after-free error to occur when the laundromat tries to free
the expired cpntf_state.

This patch adds a call to nfs4_free_cpntf_statelist in
nfsd4_close_open_stateid to clean up the copy state before calling
free_ol_stateid_reaplist to free the lock state's stid on the reaplist.
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

019805fe

NFSD: Cap rsize_bop result based on send buffer size · 76ce4dce

Chuck Lever authored Sep 01, 2022

Since before the git era, NFSD has conserved the number of pages
held by each nfsd thread by combining the RPC receive and send
buffers into a single array of pages. This works because there are
no cases where an operation needs a large RPC Call message and a
large RPC Reply at the same time.

Once an RPC Call has been received, svc_process() updates
svc_rqst::rq_res to describe the part of rq_pages that can be
used for constructing the Reply. This means that the send buffer
(rq_res) shrinks when the received RPC record containing the RPC
Call is large.

Add an NFSv4 helper that computes the size of the send buffer. It
replaces svc_max_payload() in spots where svc_max_payload() returns
a value that might be larger than the remaining send buffer space.
Callers who need to know the transport's actual maximum payload size
will continue to use svc_max_payload().
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

76ce4dce
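
The capping idea in 76ce4dce can be expressed as a small calculation. This is a sketch only: `demo_max_reply()` and its parameters are hypothetical and do not reflect the helper the patch adds. It assumes one shared buffer of `buf_total` bytes, of which `buf_used` were consumed by the received Call.

```c
#include <stddef.h>

/*
 * Returns the largest reply payload that fits: the transport maximum,
 * capped by whatever space the received Call left in the shared buffer.
 */
size_t demo_max_reply(size_t transport_max, size_t buf_total, size_t buf_used)
{
	size_t remaining = buf_used < buf_total ? buf_total - buf_used : 0;

	return remaining < transport_max ? remaining : transport_max;
}
```

With a small Call the transport maximum is the limit. With a large Call the remaining buffer space becomes the limit instead.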

NFSD: Rename the fields in copy_stateid_t · 781fde1a

Chuck Lever authored Sep 22, 2022

Code maintenance: The name of the copy_stateid_t::sc_count field
collides with the sc_count field in struct nfs4_stid, making the
latter difficult to grep for when auditing stateid reference
counting.

No behavior change expected.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

781fde1a

nfsd: use DEFINE_SHOW_ATTRIBUTE to define nfsd_file_cache_stats_fops · 1342f9dd

ChenXiaoSong authored Sep 23, 2022

Use DEFINE_SHOW_ATTRIBUTE helper macro to simplify the code.
Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

1342f9dd

nfsd: use DEFINE_SHOW_ATTRIBUTE to define nfsd_reply_cache_stats_fops · 64776611

ChenXiaoSong authored Sep 23, 2022

Use DEFINE_SHOW_ATTRIBUTE helper macro to simplify the code.

nfsd_net is converted from seq_file->file instead of seq_file->private in
nfsd_reply_cache_stats_show().
Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com>
[ cel: reduce line length ]
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

64776611

nfsd: use DEFINE_SHOW_ATTRIBUTE to define client_info_fops · 1d7f6b30

ChenXiaoSong authored Sep 23, 2022

Use DEFINE_SHOW_ATTRIBUTE helper macro to simplify the code.

inode is converted from seq_file->file instead of seq_file->private in
client_info_show().
Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

1d7f6b30

nfsd: use DEFINE_SHOW_ATTRIBUTE to define export_features_fops and supported_enctypes_fops · 9beeaab8

ChenXiaoSong authored Sep 23, 2022

Use DEFINE_SHOW_ATTRIBUTE helper macro to simplify the code.
Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com>
[ cel: reduce line length ]
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

9beeaab8

nfsd: use DEFINE_PROC_SHOW_ATTRIBUTE to define nfsd_proc_ops · 0cfb0c42

ChenXiaoSong authored Sep 23, 2022

Use DEFINE_PROC_SHOW_ATTRIBUTE helper macro to simplify the code.
Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

0cfb0c42

NFSD: Pack struct nfsd4_compoundres · 9f553e61

Chuck Lever authored Sep 12, 2022

Remove a couple of 4-byte holes on platforms with 64-bit pointers.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

9f553e61
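
The kind of hole that 9f553e61 removes can be reproduced with two toy structs. These are hypothetical layouts, not struct nfsd4_compoundres. On a typical platform with 64-bit pointers, each `int` that sits directly before a pointer is followed by 4 bytes of padding, and grouping the `int` fields together would be expected to remove both holes. The exact sizes are implementation-defined.

```c
#include <stddef.h>

/* Each int is followed by padding so the next pointer is aligned. */
struct demo_loose {
	int   a;
	void *p;
	int   b;
	void *q;
};

/* Same fields, reordered so the two ints share one 8-byte slot. */
struct demo_packed {
	void *p;
	void *q;
	int   a;
	int   b;
};

/* Bytes saved by reordering; depends on the platform's ABI. */
size_t demo_bytes_saved(void)
{
	return sizeof(struct demo_loose) - sizeof(struct demo_packed);
}
```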

NFSD: Remove unused nfsd4_compoundargs::cachetype field · 77e378cf

Chuck Lever authored Sep 12, 2022

This field was added by commit 1091006c ("nfsd: turn on reply
cache for NFSv4") but was never put to use.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

77e378cf

NFSD: Remove "inline" directives on op_rsize_bop helpers · 6604148c

Chuck Lever authored Sep 12, 2022

These helpers are always invoked indirectly, so the compiler can't
inline these anyway. While we're updating the synopses of these
helpers, defensively convert their parameters to const pointers.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

6604148c

NFSD: Clean up nfs4svc_encode_compoundres() · 9993a663

Chuck Lever authored Sep 12, 2022

In today's Linux NFS server implementation, the NFS dispatcher
initializes each XDR result stream, and the NFSv4 .pc_func and
.pc_encode methods all use xdr_stream-based encoding. This keeps
rq_res.len automatically updated. There is no longer a need for
the WARN_ON_ONCE() check in nfs4svc_encode_compoundres().
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

9993a663

SUNRPC: Fix typo in xdr_buf_subsegment's kdoc comment · b8ab2a6f

Chuck Lever authored Sep 12, 2022

Fix a typo.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

b8ab2a6f

NFSD: Clean up WRITE arg decoders · d4da5baa

Chuck Lever authored Sep 12, 2022

xdr_stream_subsegment() already returns a boolean value.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

d4da5baa

NFSD: Use xdr_inline_decode() to decode NFSv3 symlinks · c3d2a04f

Chuck Lever authored Sep 12, 2022

Replace the check for buffer over/underflow with a helper that is
commonly used for this purpose. The helper also sets xdr->nwords
correctly after successfully linearizing the symlink argument into
the stream's scratch buffer.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

c3d2a04f

NFSD: Refactor common code out of dirlist helpers · 98124f5b

Chuck Lever authored Sep 12, 2022

The dust has settled a bit and it's become obvious what code is
totally common between nfsd_init_dirlist_pages() and
nfsd3_init_dirlist_pages(). Move that common code to SUNRPC.

The new helper brackets the existing xdr_init_decode_pages() API.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

98124f5b

SUNRPC: Clarify comment that documents svc_max_payload() · f18d8afb

Chuck Lever authored Sep 12, 2022

Note the function returns a per-transport value, not a per-request
value (e.g., one that is related to the size of the available send or
receive buffer space).
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

f18d8afb

NFSD: Reduce amount of struct nfsd4_compoundargs that needs clearing · 3fdc5464

Chuck Lever authored Sep 12, 2022

Have SunRPC clear everything except for the iops array. Then have
each NFSv4 XDR decoder clear its own argument before decoding.

Now individual operations may have a large argument struct while not
penalizing the vast majority of operations with a small struct.

And, clearing the argument structure occurs as the argument fields
are initialized, enabling the CPU to do write combining on that
memory. In some cases, clearing is not even necessary because all
of the fields in the argument structure are initialized by the
decoder.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

3fdc5464

SUNRPC: Parametrize how much of argsize should be zeroed · 103cc1fa

Chuck Lever authored Sep 12, 2022

Currently, SUNRPC clears the whole of .pc_argsize before processing
each incoming RPC transaction. Add an extra parameter to struct
svc_procedure to enable upper layers to reduce the amount of each
operation's argument structure that is zeroed by SUNRPC.

The size of struct nfsd4_compoundargs, in particular, is a lot to
clear on each incoming RPC Call. A subsequent patch will cut this
down to something closer to what NFSv2 and NFSv3 use.

This patch should cause no behavior changes.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

103cc1fa
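
The partial-zeroing idea from 103cc1fa and 3fdc5464 can be sketched with `offsetof`. The struct layouts and field names below are assumptions for illustration, not the SUNRPC or NFSD definitions. The per-procedure `argzero` value says how many leading bytes of the argument structure the generic layer clears, leaving the large trailing array to the individual decoders.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical compound arguments: small header, large operations array. */
struct demo_args {
	int opcnt;
	int minorversion;
	char iops[8][64];
};

/* Hypothetical procedure descriptor with the extra "how much to zero" field. */
struct demo_procedure {
	size_t argsize;	/* full size of the argument structure */
	size_t argzero;	/* leading bytes the generic layer clears */
};

/* Clears only the first argzero bytes instead of the whole structure. */
void demo_clear_args(const struct demo_procedure *proc, void *argp)
{
	assert(proc->argzero <= proc->argsize);
	memset(argp, 0, proc->argzero);
}
```

A descriptor for this toy struct would set `argzero` to `offsetof(struct demo_args, iops)`, so everything except the operations array is cleared up front.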

SUNRPC: Optimize svc_process() · 81593c4d

Chuck Lever authored Sep 12, 2022

Move exception handling code out of the hot path, and avoid the need
for a bswap of a non-constant.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

81593c4d

NFSD: add shrinker to reap courtesy clients on low memory condition · 7746b32f

Dai Ngo authored Sep 14, 2022

Add courtesy_client_reaper to react to low memory condition triggered
by the system memory shrinker.

The delayed_work for the courtesy_client_reaper is scheduled on
the shrinker's count callback using the laundry_wq.

The shrinker's scan callback is not used for expiring the courtesy
clients due to potential deadlocks.
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

7746b32f

NFSD: keep track of the number of courtesy clients in the system · 3a4ea23d

Dai Ngo authored Sep 14, 2022

Add counter nfs4_courtesy_client_count to nfsd_net to keep track
of the number of courtesy clients in the system.
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

3a4ea23d

NFSD: Return nfserr_serverfault if splice_ok but buf->pages have data · 06981d56

Anna Schumaker authored Sep 13, 2022

This was discussed with Chuck as part of this patch set. Returning
nfserr_resource was decided to not be the best error message here, and
he suggested changing to nfserr_serverfault instead.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Link: https://lore.kernel.org/linux-nfs/20220907195259.926736-1-anna@kernel.org/T/#t
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

06981d56

NFSD: Make nfsd4_remove() wait before returning NFS4ERR_DELAY · 5f5f8b6d

Chuck Lever authored Sep 08, 2022

nfsd_unlink() can kick off a CB_RECALL (via
vfs_unlink() -> leases_conflict()) if a delegation is present.
Before returning NFS4ERR_DELAY, give the client holding that
delegation a chance to return it and then retry the nfsd_unlink()
again, once.

Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=354
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>

5f5f8b6d
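
The "wait, then retry once" behaviour described in 5f5f8b6d can be modeled in user space. This is a sketch under assumptions: the functions and `struct demo_ctx` are hypothetical, and the wait is reduced to a flag saying whether the client returned its delegation in time. Only the value 10008 for NFS4ERR_DELAY is taken from the NFSv4 protocol.

```c
#include <stdbool.h>

#define DEMO_ERR_DELAY 10008	/* numeric value of NFS4ERR_DELAY */

/* Hypothetical test context: how many conflicts remain, and whether
 * the client returns its delegation before the wait times out. */
struct demo_ctx {
	int attempts;
	int conflicts;
	bool client_returns;
};

/* Stand-in for the unlink step: fails while a delegation conflicts. */
int demo_unlink(struct demo_ctx *c)
{
	c->attempts++;
	if (c->conflicts > 0) {
		c->conflicts--;
		return DEMO_ERR_DELAY;
	}
	return 0;
}

/* Stand-in for the wait: true if the delegation came back in time. */
bool demo_wait_for_delegreturn(const struct demo_ctx *c)
{
	return c->client_returns;
}

/* Retries exactly once, and only if the wait was not a timeout. */
int demo_remove(struct demo_ctx *c)
{
	int status = demo_unlink(c);

	if (status == DEMO_ERR_DELAY && demo_wait_for_delegreturn(c))
		status = demo_unlink(c);
	return status;
}
```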

NFSD: Make nfsd4_rename() wait before returning NFS4ERR_DELAY · 68c522af

Chuck Lever authored Sep 08, 2022

nfsd_rename() can kick off a CB_RECALL (via
vfs_rename() -> leases_conflict()) if a delegation is present.
Before returning NFS4ERR_DELAY, give the client holding that
delegation a chance to return it and then retry the nfsd_rename()
again, once.

This version of the patch handles renaming an existing file,
but does not deal with renaming onto an existing file. That
case will still always trigger an NFS4ERR_DELAY.

Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=354
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>

68c522af

NFSD: Make nfsd4_setattr() wait before returning NFS4ERR_DELAY · 34b91dda

Chuck Lever authored Sep 08, 2022

nfsd_setattr() can kick off a CB_RECALL (via
notify_change() -> break_lease()) if a delegation is present. Before
returning NFS4ERR_DELAY, give the client holding that delegation a
chance to return it and then retry the nfsd_setattr() again, once.

Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=354
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>

34b91dda

NFSD: Refactor nfsd_setattr() · c0aa1913

Chuck Lever authored Sep 08, 2022

Move code that will be retried (in a subsequent patch) into a helper
function.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>

c0aa1913

NFSD: Add a mechanism to wait for a DELEGRETURN · c035362e

Chuck Lever authored Sep 08, 2022

Subsequent patches will use this mechanism to wake up an operation
that is waiting for a client to return a delegation.

The new tracepoint records whether the wait timed out or was
properly awoken by the expected DELEGRETURN:

            nfsd-1155  [002] 83799.493199: nfsd_delegret_wakeup: xid=0x14b7d6ef fh_hash=0xf6826792 (timed out)
Suggested-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>

c035362e