1. 25 Jun, 2024 2 commits
  2. 17 Jun, 2024 2 commits
    • NFSD: grab nfsd_mutex in nfsd_nl_rpc_status_get_dumpit() · da2c8fef
      Lorenzo Bianconi authored
      Grab the nfsd_mutex lock in the nfsd_nl_rpc_status_get_dumpit() routine
      and remove nfsd_nl_rpc_status_get_start() and
      nfsd_nl_rpc_status_get_done(). This patch fixes the hung task shown in
      the syzbot log below:
      
      INFO: task syz-executor.1:17770 blocked for more than 143 seconds.
            Not tainted 6.10.0-rc3-syzkaller-00022-gcea2a265 #0
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      task:syz-executor.1  state:D stack:23800 pid:17770 tgid:17767 ppid:11381  flags:0x00000006
      Call Trace:
       <TASK>
       context_switch kernel/sched/core.c:5408 [inline]
       __schedule+0x17e8/0x4a20 kernel/sched/core.c:6745
       __schedule_loop kernel/sched/core.c:6822 [inline]
       schedule+0x14b/0x320 kernel/sched/core.c:6837
       schedule_preempt_disabled+0x13/0x30 kernel/sched/core.c:6894
       __mutex_lock_common kernel/locking/mutex.c:684 [inline]
       __mutex_lock+0x6a4/0xd70 kernel/locking/mutex.c:752
       nfsd_nl_listener_get_doit+0x115/0x5d0 fs/nfsd/nfsctl.c:2124
       genl_family_rcv_msg_doit net/netlink/genetlink.c:1115 [inline]
       genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
       genl_rcv_msg+0xb16/0xec0 net/netlink/genetlink.c:1210
       netlink_rcv_skb+0x1e5/0x430 net/netlink/af_netlink.c:2564
       genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219
       netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
       netlink_unicast+0x7ec/0x980 net/netlink/af_netlink.c:1361
       netlink_sendmsg+0x8db/0xcb0 net/netlink/af_netlink.c:1905
       sock_sendmsg_nosec net/socket.c:730 [inline]
       __sock_sendmsg+0x223/0x270 net/socket.c:745
       ____sys_sendmsg+0x525/0x7d0 net/socket.c:2585
       ___sys_sendmsg net/socket.c:2639 [inline]
       __sys_sendmsg+0x2b0/0x3a0 net/socket.c:2668
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
      RIP: 0033:0x7f24ed27cea9
      RSP: 002b:00007f24ee0080c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 00007f24ed3b3f80 RCX: 00007f24ed27cea9
      RDX: 0000000000000000 RSI: 0000000020000100 RDI: 0000000000000005
      RBP: 00007f24ed2ebff4 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
      
      Fixes: 1bd773b4 ("nfsd: hold nfsd_mutex across entire netlink operation")
      Fixes: bd9d6a3e ("NFSD: add rpc_status netlink support")
      Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
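The locking change described in this commit (take the mutex inside the dump callback itself rather than in separate start and done hooks) can be modeled in user space. The sketch below is an illustration only: `toy_mutex`, `demo_mutex`, and `demo_dumpit()` are hypothetical names, not kernel APIs, and the toy mutex merely tracks whether it is held.

```c
#include <assert.h>
#include <stddef.h>

/* Toy mutex (assumption, demo only): records whether it is held. */
struct toy_mutex { int held; };

static void toy_lock(struct toy_mutex *m)   { assert(!m->held); m->held = 1; }
static void toy_unlock(struct toy_mutex *m) { assert(m->held);  m->held = 0; }

static struct toy_mutex demo_mutex;

/*
 * Hypothetical dump callback: copies up to 'cap' entries from 'src',
 * resuming at *cursor. The lock is taken and released inside the same
 * call, so it is never held after the callback returns.
 */
static int demo_dumpit(const int *src, size_t n, size_t *cursor,
		       int *out, size_t cap)
{
	int written = 0;

	toy_lock(&demo_mutex);
	while (*cursor < n && (size_t)written < cap)
		out[written++] = src[(*cursor)++];
	toy_unlock(&demo_mutex);

	return written;
}
```

Calling `demo_dumpit()` repeatedly with the same cursor resumes where the previous call stopped, and `demo_mutex.held` is 0 between calls.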
    • nfsd: fix oops when reading pool_stats before server is started · 8e948c36
      Jeff Layton authored
      Sourabh reported an oops that is triggerable by trying to read the
      pool_stats procfile before nfsd has been started. Move the check for a
      NULL serv in svc_pool_stats_start above the mutex acquisition, and fix
      the stop routine so that it does not unlock the mutex if there is no
      serv yet.
      
      Fixes: 7b207ccd ("svc: don't hold reference for poolstats, only mutex.")
      Reported-by: Sourabh Jain <sourabhjain@linux.ibm.com>
      Signed-off-by: Jeff Layton <jlayton@kernel.org>
      Tested-by: Sourabh Jain <sourabhjain@linux.ibm.com>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
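The start/stop ordering described in this commit can be sketched in user-space C. This is a model under stated assumptions, not the kernel code: `toy_mutex`, `stats_info`, `stats_start()`, and `stats_stop()` are hypothetical names, and the toy mutex asserts on an unbalanced unlock to stand in for the failure.

```c
#include <assert.h>
#include <stddef.h>

/* Toy mutex (assumption, demo only): records whether it is held. */
struct toy_mutex { int held; };

static void toy_lock(struct toy_mutex *m)   { assert(!m->held); m->held = 1; }
static void toy_unlock(struct toy_mutex *m) { assert(m->held);  m->held = 0; }

/* Hypothetical per-file state; the field names are assumptions. */
struct stats_info {
	struct toy_mutex *mutex;
	void **serv;	/* *serv stays NULL until the server is started */
};

/* Start: check for a missing server before touching the mutex. */
static void *stats_start(struct stats_info *si)
{
	if (*si->serv == NULL)
		return NULL;
	toy_lock(si->mutex);
	return *si->serv;
}

/* Stop: unlock only when start would have taken the lock. */
static void stats_stop(struct stats_info *si)
{
	if (*si->serv != NULL)
		toy_unlock(si->mutex);
}
```

With `*serv` still NULL, a `stats_start()` / `stats_stop()` pair never touches the mutex; once a server exists, the pair locks and unlocks exactly once.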
  3. 03 Jun, 2024 1 commit
  4. 09 May, 2024 2 commits
    • NFSD: Force all NFSv4.2 COPY requests to be synchronous · 8d915bbf
      Chuck Lever authored
      We've discovered that delivering a CB_OFFLOAD operation can be
      unreliable in some pretty unremarkable situations. Examples
      include:
      
       - The server dropped the connection because it lost a forechannel
         NFSv4 request and wishes to force the client to retransmit
       - The GSS sequence number window under-flowed
       - A network partition occurred
      
      When that happens, all pending callback operations, including
      CB_OFFLOAD, are lost. NFSD does not retransmit them.
      
      Moreover, the Linux NFS client does not yet support sending an
      OFFLOAD_STATUS operation to probe whether an asynchronous COPY
      operation has finished. Thus, on Linux NFS clients, when a
      CB_OFFLOAD is lost, asynchronous COPY can hang until manually
      interrupted.
      
      I've tried a couple of remedies, but so far the side-effects are
      worse than the disease and they have had to be reverted. So
      temporarily force COPY operations to be synchronous so that the use
      of CB_OFFLOAD is avoided entirely. This is a fix that can easily be
      backported to LTS kernels. I am working on client patches that
      introduce an implementation of OFFLOAD_STATUS.
      
      Note that NFSD arbitrarily limits the size of a copy_file_range
      to 4MB to avoid indefinitely blocking an nfsd thread. A short
      COPY result is returned in that case, and the client can present
      a fresh COPY request for the remainder.
      
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
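The short-COPY behavior mentioned at the end of this commit message (the server copies at most 4MB per request and the client asks again for the remainder) can be illustrated with a small simulation. Everything here is a hypothetical model: `MAX_SYNC_COPY`, `server_copy()`, and `client_copy_all()` are names invented for the sketch and do not correspond to NFSD or client functions.

```c
#include <assert.h>
#include <stdint.h>

/* 4MB cap, taken from the commit message; the macro name is an assumption. */
#define MAX_SYNC_COPY (4u * 1024u * 1024u)

/* Hypothetical server side: copy at most the cap, report bytes copied. */
static uint64_t server_copy(uint64_t count)
{
	return count < MAX_SYNC_COPY ? count : MAX_SYNC_COPY;
}

/*
 * Hypothetical client side: keep issuing requests for the remainder
 * until everything is copied. Returns the number of requests sent.
 */
static unsigned int client_copy_all(uint64_t total)
{
	uint64_t offset = 0;
	unsigned int requests = 0;

	while (offset < total) {
		uint64_t done = server_copy(total - offset);

		assert(done > 0);	/* a zero-length result would loop forever */
		offset += done;
		requests++;
	}
	return requests;
}
```

Under this model a copy of exactly 4MB needs one request, and a single extra byte costs a second request.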
    • SUNRPC: Fix gss_free_in_token_pages() · bafa6b4d
      Chuck Lever authored
      Dan Carpenter says:
      > Commit 5866efa8 ("SUNRPC: Fix svcauth_gss_proxy_init()") from Oct
      > 24, 2019 (linux-next), leads to the following Smatch static checker
      > warning:
      >
      > 	net/sunrpc/auth_gss/svcauth_gss.c:1039 gss_free_in_token_pages()
      > 	warn: iterator 'i' not incremented
      >
      > net/sunrpc/auth_gss/svcauth_gss.c
      >     1034 static void gss_free_in_token_pages(struct gssp_in_token *in_token)
      >     1035 {
      >     1036         u32 inlen;
      >     1037         int i;
      >     1038
      > --> 1039         i = 0;
      >     1040         inlen = in_token->page_len;
      >     1041         while (inlen) {
      >     1042                 if (in_token->pages[i])
      >     1043                         put_page(in_token->pages[i]);
      >                                                          ^
      > This puts page zero over and over.
      >
      >     1044                 inlen -= inlen > PAGE_SIZE ? PAGE_SIZE : inlen;
      >     1045         }
      >     1046
      >     1047         kfree(in_token->pages);
      >     1048         in_token->pages = NULL;
      >     1049 }
      
      Based on the way that the ->pages[] array is constructed in
      gss_read_proxy_verf(), we know that once the loop encounters a NULL
      page pointer, the remaining array elements must also be NULL.
      
      Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
      Suggested-by: Trond Myklebust <trondmy@hammerspace.com>
      Fixes: 5866efa8 ("SUNRPC: Fix svcauth_gss_proxy_init()")
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
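The observation in this commit message (after the first NULL page pointer, the rest of the array is NULL too) suggests a loop that advances the index and stops at the first NULL entry. The following is a user-space sketch of that idea, not the kernel patch itself: `in_token_model`, `put_page_stub()`, and `free_in_token_pages_model()` are hypothetical stand-ins, and the sketch assumes the array always ends with at least one NULL slot.

```c
#include <assert.h>
#include <stdlib.h>

/* Counts released pages so the behavior can be observed (demo only). */
static int pages_put;

/* Hypothetical stand-in for put_page(): frees the buffer and counts it. */
static void put_page_stub(void *page)
{
	free(page);
	pages_put++;
}

/* Hypothetical model of the token; only the pages array is kept. */
struct in_token_model {
	void **pages;	/* assumed to end with at least one NULL slot */
};

/* Release each page once, advancing i, and stop at the first NULL. */
static void free_in_token_pages_model(struct in_token_model *tok)
{
	int i = 0;

	assert(tok->pages != NULL);
	while (tok->pages[i])
		put_page_stub(tok->pages[i++]);

	free(tok->pages);
	tok->pages = NULL;
}
```

Unlike the loop in the quoted report, the index moves forward on every iteration, so each page is released exactly once.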
  5. 06 May, 2024 27 commits
  6. 05 May, 2024 6 commits