1. 06 Jan, 2023 2 commits
    • Chuck Lever's avatar
      Revert "SUNRPC: Use RMW bitops in single-threaded hot paths" · 7827c81f
      Chuck Lever authored
      The premise that "Once an svc thread is scheduled and executing an
      RPC, no other processes will touch svc_rqst::rq_flags" is false.
      svc_xprt_enqueue() examines the RQ_BUSY flag in scheduled nfsd
      threads when determining which thread to wake up next.
      
      Found via KCSAN.
      
      Fixes: 28df0988 ("SUNRPC: Use RMW bitops in single-threaded hot paths")
      Reviewed-by: default avatarJeff Layton <jlayton@kernel.org>
      Signed-off-by: default avatarChuck Lever <chuck.lever@oracle.com>
      7827c81f
    • Jeff Layton's avatar
      nfsd: fix handling of cached open files in nfsd4_open codepath · 0b3a551f
      Jeff Layton authored
      Commit fb70bf12 ("NFSD: Instantiate a struct file when creating a
      regular NFSv4 file") added the ability to cache an open fd over a
      compound. There are a couple of problems with the way this currently
      works:
      
      It's racy, as a newly-created nfsd_file can end up with its PENDING bit
      cleared while the nf is hashed, and the nf_file pointer is still zeroed
      out. Other tasks can find it in this state and they expect to see a
      valid nf_file, and can oops if nf_file is NULL.
      
      Also, there is no guarantee that we'll end up creating a new nfsd_file
      if one is already in the hash. If an extant entry is in the hash with a
      valid nf_file, nfs4_get_vfs_file will clobber its nf_file pointer with
      the value of op_file and the old nf_file will leak.
      
      Fix both issues by making a new nfsd_file_acquirei_opened variant that
      takes an optional file pointer. If one is present when this is called,
      we'll take a new reference to it instead of trying to open the file. If
      the nfsd_file already has a valid nf_file, we'll just ignore the
      optional file and pass the nfsd_file back as-is.
      
      Also rework the tracepoints a bit to allow for an "opened" variant and
      don't try to avoid counting acquisitions in the case where we already
      have a cached open file.
      
      Fixes: fb70bf12 ("NFSD: Instantiate a struct file when creating a regular NFSv4 file")
      Cc: Trond Myklebust <trondmy@hammerspace.com>
      Reported-by: default avatarStanislav Saner <ssaner@redhat.com>
      Reported-and-Tested-by: default avatarRuben Vestergaard <rubenv@drcmr.dk>
      Reported-and-Tested-by: default avatarTorkil Svensgaard <torkil@drcmr.dk>
      Signed-off-by: default avatarJeff Layton <jlayton@kernel.org>
      Signed-off-by: default avatarChuck Lever <chuck.lever@oracle.com>
      0b3a551f
  2. 02 Jan, 2023 1 commit
  3. 22 Dec, 2022 1 commit
  4. 14 Dec, 2022 1 commit
  5. 12 Dec, 2022 1 commit
    • Dan Aloni's avatar
      nfsd: under NFSv4.1, fix double svc_xprt_put on rpc_create failure · 3bc8edc9
      Dan Aloni authored
      On error situation `clp->cl_cb_conn.cb_xprt` should not be given
      a reference to the xprt otherwise both client cleanup and the
      error handling path of the caller call to put it. Better to
      delay handing over the reference to a later branch.
      
      [   72.530665] refcount_t: underflow; use-after-free.
      [   72.531933] WARNING: CPU: 0 PID: 173 at lib/refcount.c:28 refcount_warn_saturate+0xcf/0x120
      [   72.533075] Modules linked in: nfsd(OE) nfsv4(OE) nfsv3(OE) nfs(OE) lockd(OE) compat_nfs_ssc(OE) nfs_acl(OE) rpcsec_gss_krb5(OE) auth_rpcgss(OE) rpcrdma(OE) dns_resolver fscache netfs grace rdma_cm iw_cm ib_cm sunrpc(OE) mlx5_ib mlx5_core mlxfw pci_hyperv_intf ib_uverbs ib_core xt_MASQUERADE nf_conntrack_netlink nft_counter xt_addrtype nft_compat br_netfilter bridge stp llc nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set overlay nf_tables nfnetlink crct10dif_pclmul crc32_pclmul ghash_clmulni_intel xfs serio_raw virtio_net virtio_blk net_failover failover fuse [last unloaded: sunrpc]
      [   72.540389] CPU: 0 PID: 173 Comm: kworker/u16:5 Tainted: G           OE     5.15.82-dan #1
      [   72.541511] Hardware name: Red Hat KVM/RHEL-AV, BIOS 1.16.0-3.module+el8.7.0+1084+97b81f61 04/01/2014
      [   72.542717] Workqueue: nfsd4_callbacks nfsd4_run_cb_work [nfsd]
      [   72.543575] RIP: 0010:refcount_warn_saturate+0xcf/0x120
      [   72.544299] Code: 55 00 0f 0b 5d e9 01 50 98 00 80 3d 75 9e 39 08 00 0f 85 74 ff ff ff 48 c7 c7 e8 d1 60 8e c6 05 61 9e 39 08 01 e8 f6 51 55 00 <0f> 0b 5d e9 d9 4f 98 00 80 3d 4b 9e 39 08 00 0f 85 4c ff ff ff 48
      [   72.546666] RSP: 0018:ffffb3f841157cf0 EFLAGS: 00010286
      [   72.547393] RAX: 0000000000000026 RBX: ffff89ac6231d478 RCX: 0000000000000000
      [   72.548324] RDX: ffff89adb7c2c2c0 RSI: ffff89adb7c205c0 RDI: ffff89adb7c205c0
      [   72.549271] RBP: ffffb3f841157cf0 R08: 0000000000000000 R09: c0000000ffefffff
      [   72.550209] R10: 0000000000000001 R11: ffffb3f841157ad0 R12: ffff89ac6231d180
      [   72.551142] R13: ffff89ac6231d478 R14: ffff89ac40c06180 R15: ffff89ac6231d4b0
      [   72.552089] FS:  0000000000000000(0000) GS:ffff89adb7c00000(0000) knlGS:0000000000000000
      [   72.553175] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   72.553934] CR2: 0000563a310506a8 CR3: 0000000109a66000 CR4: 0000000000350ef0
      [   72.554874] Call Trace:
      [   72.555278]  <TASK>
      [   72.555614]  svc_xprt_put+0xaf/0xe0 [sunrpc]
      [   72.556276]  nfsd4_process_cb_update.isra.11+0xb7/0x410 [nfsd]
      [   72.557087]  ? update_load_avg+0x82/0x610
      [   72.557652]  ? cpuacct_charge+0x60/0x70
      [   72.558212]  ? dequeue_entity+0xdb/0x3e0
      [   72.558765]  ? queued_spin_unlock+0x9/0x20
      [   72.559358]  nfsd4_run_cb_work+0xfc/0x270 [nfsd]
      [   72.560031]  process_one_work+0x1df/0x390
      [   72.560600]  worker_thread+0x37/0x3b0
      [   72.561644]  ? process_one_work+0x390/0x390
      [   72.562247]  kthread+0x12f/0x150
      [   72.562710]  ? set_kthread_struct+0x50/0x50
      [   72.563309]  ret_from_fork+0x22/0x30
      [   72.563818]  </TASK>
      [   72.564189] ---[ end trace 031117b1c72ec616 ]---
      [   72.566019] list_add corruption. next->prev should be prev (ffff89ac4977e538), but was ffff89ac4763e018. (next=ffff89ac4763e018).
      [   72.567647] ------------[ cut here ]------------
      
      Fixes: a4abc6b1 ("nfsd: Fix svc_xprt refcnt leak when setup callback client failed")
      Cc: Xiyu Yang <xiyuyang19@fudan.edu.cn>
      Cc: J. Bruce Fields <bfields@redhat.com>
      Signed-off-by: default avatarDan Aloni <dan.aloni@vastdata.com>
      Reviewed-by: default avatarJeff Layton <jlayton@kernel.org>
      Signed-off-by: default avatarChuck Lever <chuck.lever@oracle.com>
      3bc8edc9
  6. 11 Dec, 2022 1 commit
    • Jeff Layton's avatar
      nfsd: rework refcounting in filecache · ac3a2585
      Jeff Layton authored
      The filecache refcounting is a bit non-standard for something searchable
      by RCU, in that we maintain a sentinel reference while it's hashed. This
      in turn requires that we have to do things differently in the "put"
      depending on whether its hashed, which we believe to have led to races.
      
      There are other problems in here too. nfsd_file_close_inode_sync can end
      up freeing an nfsd_file while there are still outstanding references to
      it, and there are a number of subtle ToC/ToU races.
      
      Rework the code so that the refcount is what drives the lifecycle. When
      the refcount goes to zero, then unhash and rcu free the object. A task
      searching for a nfsd_file is allowed to bump its refcount, but only if
      it's not already 0. Ensure that we don't make any other changes to it
      until a reference is held.
      
      With this change, the LRU carries a reference. Take special care to deal
      with it when removing an entry from the list, and ensure that we only
      repurpose the nf_lru list_head when the refcount is 0 to ensure
      exclusive access to it.
      Signed-off-by: default avatarJeff Layton <jlayton@kernel.org>
      Signed-off-by: default avatarChuck Lever <chuck.lever@oracle.com>
      ac3a2585
  7. 10 Dec, 2022 21 commits
  8. 28 Nov, 2022 12 commits