Commits · b6c71c66b0ad8f2b59d9bc08c7a5079b110bec01 · Kirill Smelkov / linux

02 Jun, 2022 1 commit

NFSD: Fix potential use-after-free in nfsd_file_put() · b6c71c66

Chuck Lever authored May 31, 2022

nfsd_file_put_noref() can free @nf, so don't dereference @nf
immediately upon return from nfsd_file_put_noref().
Suggested-by: Trond Myklebust <trondmy@hammerspace.com>
Fixes: 99939792 ("nfsd: Clean up nfsd_file_put()")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

b6c71c66

29 May, 2022 1 commit

MAINTAINERS: reciprocal co-maintainership for file locking and nfsd · 9ff9f77f

Jeff Layton authored May 24, 2022

Chuck has agreed to backstop me as maintainer of the file locking code,
and I'll do the same for him on knfsd.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

9ff9f77f

26 May, 2022 4 commits

NFSD: nfsd_file_put() can sleep · 08af54b3

Chuck Lever authored May 11, 2022

Now that there are no more callers of nfsd_file_put() that might
hold a spin lock, ensure the lockdep infrastructure can catch
newly introduced calls to nfsd_file_put() made while a spinlock
is held.

Link: https://lore.kernel.org/linux-nfs/ece7fd1d-5fb3-5155-54ba-347cfc19bd9a@oracle.com/T/#mf1855552570cf9a9c80d1e49d91438cd9085aadaSigned-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>

08af54b3

NFSD: Add documenting comment for nfsd4_release_lockowner() · 043862b0

Chuck Lever authored May 22, 2022

And return explicit nfserr values that match what is documented in the
new comment / API contract.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

043862b0

NFSD: Modernize nfsd4_release_lockowner() · bd8fdb6e

Chuck Lever authored May 22, 2022

Refactor: Use existing helpers that other lock operations use. This
change removes several automatic variables, so re-organize the
variable declarations for readability.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

bd8fdb6e

NFSD: Fix possible sleep during nfsd4_release_lockowner() · ce3c4ad7

Chuck Lever authored May 21, 2022

nfsd4_release_lockowner() holds clp->cl_lock when it calls
check_for_locks(). However, check_for_locks() calls nfsd_file_get()
/ nfsd_file_put() to access the backing inode's flc_posix list, and
nfsd_file_put() can sleep if the inode was recently removed.

Let's instead rely on the stateowner's reference count to gate
whether the release is permitted. This should be a reliable
indication of locks-in-use since file lock operations and
->lm_get_owner take appropriate references, which are released
appropriately when file locks are removed.
Reported-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: stable@vger.kernel.org

ce3c4ad7

23 May, 2022 10 commits

nfsd: destroy percpu stats counters after reply cache shutdown · fd5e363e

Julian Schroeder authored May 23, 2022

Upon nfsd shutdown any pending DRC cache is freed. DRC cache use is
tracked via a percpu counter. In the current code the percpu counter
is destroyed before. If any pending cache is still present,
percpu_counter_add is called with a percpu counter==NULL. This causes
a kernel crash.
The solution is to destroy the percpu counter after the cache is freed.

Fixes: e567b98c (“nfsd: protect concurrent access to nfsd stats counters”)
Signed-off-by: Julian Schroeder <jumaco@amazon.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

fd5e363e

nfsd: Fix null-ptr-deref in nfsd_fill_super() · 6f6f84aa

Zhang Xiaoxu authored May 21, 2022

KASAN report null-ptr-deref as follows:

  BUG: KASAN: null-ptr-deref in nfsd_fill_super+0xc6/0xe0 [nfsd]
  Write of size 8 at addr 000000000000005d by task a.out/852

  CPU: 7 PID: 852 Comm: a.out Not tainted 5.18.0-rc7-dirty #66
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014
  Call Trace:
   <TASK>
   dump_stack_lvl+0x34/0x44
   kasan_report+0xab/0x120
   ? nfsd_mkdir+0x71/0x1c0 [nfsd]
   ? nfsd_fill_super+0xc6/0xe0 [nfsd]
   nfsd_fill_super+0xc6/0xe0 [nfsd]
   ? nfsd_mkdir+0x1c0/0x1c0 [nfsd]
   get_tree_keyed+0x8e/0x100
   vfs_get_tree+0x41/0xf0
   __do_sys_fsconfig+0x590/0x670
   ? fscontext_read+0x180/0x180
   ? anon_inode_getfd+0x4f/0x70
   do_syscall_64+0x35/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xae

This can be reproduce by concurrent operations:
	1. fsopen(nfsd)/fsconfig
	2. insmod/rmmod nfsd

Since the nfsd file system is registered before than nfsd_net allocated,
the caller may get the file_system_type and use the nfsd_net before it
allocated, then null-ptr-deref occurred.

So init_nfsd() should call register_filesystem() last.

Fixes: bd5ae928 ("nfsd: register pernet ops last, unregister first")
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

6f6f84aa

nfsd: Unregister the cld notifier when laundry_wq create failed · 62fdb65e

Zhang Xiaoxu authored May 21, 2022

If laundry_wq create failed, the cld notifier should be unregistered.
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

62fdb65e

SUNRPC: Use RMW bitops in single-threaded hot paths · 28df0988

Chuck Lever authored Apr 29, 2022

I noticed CPU pipeline stalls while using perf.

Once an svc thread is scheduled and executing an RPC, no other
processes will touch svc_rqst::rq_flags. Thus bus-locked atomics are
not needed outside the svc thread scheduler.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

28df0988

NFSD: Clean up the show_nf_flags() macro · bb283ca1

Chuck Lever authored Mar 27, 2022

The flags are defined using C macros, so TRACE_DEFINE_ENUM is
unnecessary.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

bb283ca1

NFSD: Trace filecache opens · 0122e882

Chuck Lever authored Mar 27, 2022

Instrument calls to nfsd_open_verified() to get a sense of the
filecache hit rate.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

0122e882

NFSD: Move documenting comment for nfsd4_process_open2() · 7e2ce0cc

Chuck Lever authored Mar 23, 2022

Clean up nfsd4_open() by converting a large comment at the only
call site for nfsd4_process_open2() to a kerneldoc comment in
front of that function.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

7e2ce0cc

NFSD: Fix whitespace · 26320d7e

Chuck Lever authored Mar 21, 2022

Clean up: Pull case arms back one tab stop to conform every other
switch statement in fs/nfsd/nfs4proc.c.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

26320d7e

NFSD: Remove dprintk call sites from tail of nfsd4_open() · f67a16b1

Chuck Lever authored Mar 30, 2022

Clean up: These relics are not likely to benefit server
administrators.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

f67a16b1

NFSD: Instantiate a struct file when creating a regular NFSv4 file · fb70bf12

Chuck Lever authored Mar 30, 2022

There have been reports of races that cause NFSv4 OPEN(CREATE) to
return an error even though the requested file was created. NFSv4
does not provide a status code for this case.

To mitigate some of these problems, reorganize the NFSv4
OPEN(CREATE) logic to allocate resources before the file is actually
created, and open the new file while the parent directory is still
locked.

Two new APIs are added:

+ Add an API that works like nfsd_file_acquire() but does not open
the underlying file. The OPEN(CREATE) path can use this API when it
already has an open file.

+ Add an API that is kin to dentry_open(). NFSD needs to create a
file and grab an open "struct file *" atomically. The
alloc_empty_file() has to be done before the inode create. If it
fails (for example, because the NFS server has exceeded its
max_files limit), we avoid creating the file and can still return
an error to the NFS client.

BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=382Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: JianHong Yin <jiyin@redhat.com>

fb70bf12

20 May, 2022 7 commits

NFSD: Clean up nfsd_open_verified() · f4d84c52

Chuck Lever authored Mar 27, 2022

Its only caller always passes S_IFREG as the @type parameter. As an
additional clean-up, add a kerneldoc comment.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

f4d84c52

NFSD: Remove do_nfsd_create() · 1c388f27

Chuck Lever authored Mar 28, 2022

Now that its two callers have their own version-specific instance of
this function, do_nfsd_create() is no longer used.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

1c388f27

NFSD: Refactor NFSv4 OPEN(CREATE) · 254454a5

Chuck Lever authored Mar 28, 2022

Copy do_nfsd_create() to nfs4proc.c and remove NFSv3-specific logic.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

254454a5

NFSD: Refactor NFSv3 CREATE · df9606ab

Chuck Lever authored Mar 28, 2022

The NFSv3 CREATE and NFSv4 OPEN(CREATE) use cases are about to
diverge such that it makes sense to split do_nfsd_create() into one
version for NFSv3 and one for NFSv4.

As a first step, copy do_nfsd_create() to nfs3proc.c and remove
NFSv4-specific logic.

One immediate legibility benefit is that the logic for handling
NFSv3 createhow is now quite straightforward. NFSv4 createhow
has some subtleties that IMO do not belong in generic code.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

df9606ab

NFSD: Refactor nfsd_create_setattr() · 5f46e950

Chuck Lever authored Mar 28, 2022

I'd like to move do_nfsd_create() out of vfs.c. Therefore
nfsd_create_setattr() needs to be made publicly visible.

Note that both call sites in vfs.c commit both the new object and
its parent directory, so just combine those common metadata commits
into nfsd_create_setattr().
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

5f46e950

NFSD: Avoid calling fh_drop_write() twice in do_nfsd_create() · 14ee45b7

Chuck Lever authored Mar 28, 2022

Clean up: The "out" label already invokes fh_drop_write().

Note that fh_drop_write() is already careful not to invoke
mnt_drop_write() if either it has already been done or there is
nothing to drop. Therefore no change in behavior is expected.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

14ee45b7

NFSD: Clean up nfsd3_proc_create() · e6156859

Chuck Lever authored Mar 25, 2022

As near as I can tell, mode bit masking and setting S_IFREG is
already done by do_nfsd_create() and vfs_create(). The NFSv4 path
(do_open_lookup), for example, does not bother with this special
processing.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

e6156859

19 May, 2022 15 commits

SUNRPC: Simplify synopsis of svc_pool_for_cpu() · 2059b698

Chuck Lever authored May 10, 2022

Clean up: There is one caller. The @cpu argument can be made
implicit now that a get_cpu/put_cpu pair is no longer needed.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

2059b698

SUNRPC: Don't disable preemption while calling svc_pool_for_cpu(). · 586095d3

Sebastian Andrzej Siewior authored May 10, 2022

svc_xprt_enqueue() disables preemption via get_cpu() and then asks
for a pool of a specific CPU (current) via svc_pool_for_cpu().
While preemption is disabled, svc_xprt_enqueue() acquires
svc_pool::sp_lock with bottom-halfs disabled, which can sleep on
PREEMPT_RT.

Disabling preemption is not required here. The pool is protected with a
lock so the following list access is safe even cross-CPU. The following
iteration through svc_pool::sp_all_threads is under RCU-readlock and
remaining operations within the loop are atomic and do not rely on
disabled-preemption.

Use raw_smp_processor_id() as the argument for the requested CPU in
svc_pool_for_cpu().
Reported-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

586095d3

NFSD: Show state of courtesy client in client info · e9488d5a

Dai Ngo authored May 02, 2022

Update client_info_show to show state of courtesy client
and seconds since last renew.
Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

e9488d5a

NFSD: add support for lock conflict to courteous server · 27431aff

Dai Ngo authored May 02, 2022

This patch allows expired client with lock state to be in COURTESY
state. Lock conflict with COURTESY client is resolved by the fs/lock
code using the lm_lock_expirable and lm_expire_lock callback in the
struct lock_manager_operations.

If conflict client is in COURTESY state, set it to EXPIRABLE and
schedule the laundromat to run immediately to expire the client. The
callback lm_expire_lock waits for the laundromat to flush its work
queue before returning to caller.
Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

27431aff

fs/lock: add 2 callbacks to lock_manager_operations to resolve conflict · 2443da22

Dai Ngo authored May 02, 2022

Add 2 new callbacks, lm_lock_expirable and lm_expire_lock, to
lock_manager_operations to allow the lock manager to take appropriate
action to resolve the lock conflict if possible.

A new field, lm_mod_owner, is also added to lock_manager_operations.
The lm_mod_owner is used by the fs/lock code to make sure the lock
manager module such as nfsd, is not freed while lock conflict is being
resolved.

lm_lock_expirable checks and returns true to indicate that the lock
conflict can be resolved else return false. This callback must be
called with the flc_lock held so it can not block.

lm_expire_lock is called to resolve the lock conflict if the returned
value from lm_lock_expirable is true. This callback is called without
the flc_lock held since it's allowed to block. Upon returning from
this callback, the lock conflict should be resolved and the caller is
expected to restart the conflict check from the beginnning of the list.

Lock manager, such as NFSv4 courteous server, uses this callback to
resolve conflict by destroying lock owner, or the NFSv4 courtesy client
(client that has expired but allowed to maintains its states) that owns
the lock.
Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>

2443da22

fs/lock: add helper locks_owner_has_blockers to check for blockers · 591502c5

Dai Ngo authored May 02, 2022

Add helper locks_owner_has_blockers to check if there is any blockers
for a given lockowner.
Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>

591502c5

NFSD: move create/destroy of laundry_wq to init_nfsd and exit_nfsd · d76cc46b

Dai Ngo authored May 02, 2022

This patch moves create/destroy of laundry_wq from nfs4_state_start
and nfs4_state_shutdown_net to init_nfsd and exit_nfsd to prevent
the laundromat from being freed while a thread is processing a
conflicting lock.
Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

d76cc46b

NFSD: add support for share reservation conflict to courteous server · 3d694271

Dai Ngo authored May 02, 2022

This patch allows expired client with open state to be in COURTESY
state. Share/access conflict with COURTESY client is resolved by
setting COURTESY client to EXPIRABLE state, schedule laundromat
to run and returning nfserr_jukebox to the request client.
Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

3d694271

NFSD: add courteous server support for thread with only delegation · 66af2579

Dai Ngo authored May 02, 2022

This patch provides courteous server support for delegation only.
Only expired client with delegation but no conflict and no open
or lock state is allowed to be in COURTESY state.

Delegation conflict with COURTESY/EXPIRABLE client is resolved by
setting it to EXPIRABLE, queue work for the laundromat and return
delay to the caller. Conflict is resolved when the laudromat runs
and expires the EXIRABLE client while the NFS client retries the
OPEN request. Local thread request that gets conflict is doing the
retry in _break_lease.

Client in COURTESY or EXPIRABLE state is allowed to reconnect and
continues to have access to its state. Access to the nfs4_client by
the reconnecting thread and the laundromat is serialized via the
client_lock.
Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

66af2579

SUNRPC: Remove svc_rqst::rq_xprt_hlen · 983084b2

Chuck Lever authored Apr 06, 2022

Clean up: This field is now always set to zero.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

983084b2

SUNRPC: Remove dead code in svc_tcp_release_rqst() · 4af8b42e

Chuck Lever authored Apr 03, 2022

Clean up: svc_tcp_sendto() always sets rq_xprt_ctxt to NULL.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

4af8b42e

SUNRPC: Make cache_req::thread_wait an unsigned long · 0b6c14bd

Chuck Lever authored Apr 01, 2022

The second parameter of wait_for_completion_interruptible_timeout()
is a jiffies value whose type is "unsigned long". Avoid an
unnecessary and potentially fraught implicit type conversion at the
wait_for_completion_interruptible_timeout() call site in
cache_wait_req().
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

0b6c14bd

SUNRPC: Cache deferral injection · 37324e6b

Chuck Lever authored Mar 31, 2022

Cache deferral injection stress-tests the cache deferral logic as
well as upper layer protocol deferred request handlers. This
facility is for developers and professional testers to ensure
coverage of the rqst deferral code paths. To date, we haven't
had an adequate way to ensure these code paths are covered
during testing, short of temporary code changes to force their
use.

A file called /sys/kernel/debug/fail_sunrpc/ignore-cache-wait
enables administrators to disable cache deferral injection while
allowing other types of sunrpc errors to be injected. The default
setting is that cache deferral injection is enabled (ignore=false).

To enable support for cache deferral injection,
CONFIG_FAULT_INJECTION, CONFIG_FAULT_INJECTION_DEBUG_FS, and
CONFIG_SUNRPC_DEBUG must all be set to "Y".
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

37324e6b

SUNRPC: Clean up svc_deferred_class trace events · 45cb7955

Chuck Lever authored Apr 14, 2022

Replace the temporary fix from commit 4d500445 ("SUNRPC: Fix the
svc_deferred_event trace class") with the use of __sockaddr and
friends, which is the preferred solution (but only available in 5.18
and newer).
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

45cb7955

NFSD: Clean up nfsd_splice_actor() · 91e23b1c

Chuck Lever authored Apr 07, 2022

nfsd_splice_actor() checks that the page being spliced does not
match the previous element in the svc_rqst::rq_pages array. We
believe this is to prevent a double put_page() in cases where the
READ payload is partially contained in the xdr_buf's head buffer.

However, the NFSD READ proc functions no longer place any part of
the READ payload in the head buffer, in order to properly support
NFS/RDMA READ with Write chunks. Therefore, simplify the logic in
nfsd_splice_actor() to remove this unnecessary check.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

91e23b1c

16 May, 2022 1 commit
- Linux 5.18-rc7 · 42226c98
  Linus Torvalds authored May 15, 2022
  
  42226c98
15 May, 2022 1 commit

Merge tag 'driver-core-5.18-rc7' of... · 0cdd776e

Linus Torvalds authored May 15, 2022

Merge tag 'driver-core-5.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core fixes from Greg KH:
 "Here is one fix, and three documentation updates for 5.18-rc7.

  The fix is for the firmware loader which resolves a long-reported
  problem where the credentials of the firmware loader could be set to a
  userspace process without enough permissions to actually load the
  firmware image. Many Android vendors have been reporting this for
  quite some time.

  The documentation updates are for the embargoed-hardware-issues.rst
  file to add a new entry, change an existing one, and sort the list to
  make changes easier in the future.

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'driver-core-5.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  Documentation/process: Update ARM contact for embargoed hardware issues
  Documentation/process: Add embargoed HW contact for Ampere Computing
  Documentation/process: Make groups alphabetical and use tabs consistently
  firmware_loader: use kernel credentials when reading firmware

0cdd776e