Commits · daab110e47f8d7aa6da66923e3ac1a8dbd2b2a72 · Kirill Smelkov / linux

09 Dec, 2020 12 commits

nfsd: add a new EXPORT_OP_NOWCC flag to struct export_operations · daab110e

Jeff Layton authored Nov 30, 2020

With NFSv3, nfsd will always attempt to send WCC (weak cache consistency)
data to the client. This generally involves saving the in-core inode
information before doing the operation on the given filehandle, and then
issuing a vfs_getattr on it after the op.

Some filesystems (particularly clustered or networked ones) have an
expensive ->getattr inode operation. Atomicity is also often difficult
or impossible to guarantee on such filesystems. For those, we're best
off not trying to provide WCC information to the client at all, and to
simply allow it to poll for that information as needed with a GETATTR
RPC.

This patch adds a new flags field to struct export_operations, and
defines a new EXPORT_OP_NOWCC flag that filesystems can use to indicate
that nfsd should not attempt to provide WCC info in NFSv3 replies. It
also adds a blurb about the new flags field and flag to the exporting
documentation.

The server will now also skip collecting this information for NFSv2,
since that info is never used there anyway.
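
As a rough illustration of the decision described above, here is a minimal userspace sketch. The struct is a stripped-down stand-in for struct export_operations, the flag value is an assumption, and skip_wcc() is a hypothetical helper, not a function from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag name taken from the commit message; the value is an assumption. */
#define EXPORT_OP_NOWCC 0x1UL

/* Simplified stand-in for struct export_operations: only the new field. */
struct export_operations {
	unsigned long flags;
};

/*
 * Hypothetical helper: decide whether the server should skip collecting
 * pre/post-operation (WCC) attributes for a request of the given version.
 */
static bool skip_wcc(const struct export_operations *ops, int nfs_vers)
{
	assert(ops != NULL);

	if (nfs_vers == 2)
		return true;	/* NFSv2 replies never carry WCC data */
	if (nfs_vers == 3)
		return (ops->flags & EXPORT_OP_NOWCC) != 0;
	return false;		/* other versions are out of scope here */
}
```

A filesystem would opt in by setting the flag in its export operations, for example `.flags = EXPORT_OP_NOWCC`.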

Note that this patch does not add this flag to any filesystem
export_operations structures. This was originally developed to allow
reexporting nfs via nfsd.

Other filesystems may want to consider enabling this flag too. It's hard
to tell, however, which ones have export operations to enable export via
knfsd and which ones mostly rely on them for open-by-filehandle support,
so I'm leaving that up to the individual maintainers to decide. I am
cc'ing the relevant lists for those filesystems that I think may want to
consider adding this though.

Cc: HPDD-discuss@lists.01.org
Cc: ceph-devel@vger.kernel.org
Cc: cluster-devel@redhat.com
Cc: fuse-devel@lists.sourceforge.net
Cc: ocfs2-devel@oss.oracle.com
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Lance Shelton <lance.shelton@hammerspace.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

Revert "nfsd4: support change_attr_type attribute" · 1631087b

J. Bruce Fields authored Nov 30, 2020

This reverts commit a8585763.

We're still factoring ctime into our change attribute even in the
IS_I_VERSION case.  If someone sets the system time backwards, a client
could see the change attribute go backwards.  Maybe we can just say
"well, don't do that", but there's some question whether that's good
enough, or whether we need a better guarantee.

Also, the client still isn't actually using the attribute.

While we're still figuring this out, let's just stop returning this
attribute.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

nfsd4: don't query change attribute in v2/v3 case · 942b20dc

J. Bruce Fields authored Nov 30, 2020

inode_query_iversion() has side effects, and there's no point calling it
when we're not even going to use it.

We check whether we're currently processing a v4 request by checking
fh_maxsize, which is arguably a little hacky; we could add a flag to
svc_fh instead.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
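
A minimal userspace sketch of the fh_maxsize check described in this message. The filehandle size constants are the usual protocol maximums; the mock_* structures and functions are hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Maximum filehandle sizes per protocol version, in bytes. */
#define NFS_FHSIZE   32		/* NFSv2 */
#define NFS3_FHSIZE  64
#define NFS4_FHSIZE  128

/* Simplified stand-ins; the real structures carry far more state. */
struct mock_inode {
	uint64_t i_version;
	unsigned int query_count;	/* counts side-effecting queries */
};

struct mock_svc_fh {
	int fh_maxsize;
	uint64_t fh_pre_change;
};

/* Stand-in for a query that has side effects on the inode. */
static uint64_t mock_query_iversion(struct mock_inode *inode)
{
	inode->query_count++;
	return inode->i_version;
}

/* Save the pre-operation change attribute only for NFSv4 requests. */
static void mock_fill_pre_attrs(struct mock_svc_fh *fhp,
				struct mock_inode *inode)
{
	bool v4 = (fhp->fh_maxsize == NFS4_FHSIZE);

	assert(inode != NULL);
	if (v4)
		fhp->fh_pre_change = mock_query_iversion(inode);
}
```

With this shape, a v2 or v3 request never triggers the query, so its side effects are avoided entirely.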

nfsd: minor nfsd4_change_attribute cleanup · 4b03d997

J. Bruce Fields authored Nov 30, 2020

Minor cleanup, no change in behavior.

Also pull out a common helper that'll be useful elsewhere.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

nfsd: simplify nfsd4_change_info · b2140338

J. Bruce Fields authored Nov 30, 2020

It doesn't make sense to carry all these extra fields around.  Just
convert everything into a change attribute from the start.

This is just cleanup, there should be no change in behavior.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

nfsd: only call inode_query_iversion in the I_VERSION case · 70b87f77

J. Bruce Fields authored Nov 30, 2020

inode_query_iversion() can modify i_version.  Depending on the exported
filesystem, that may not be safe.  For example, if you're re-exporting
NFS, NFS stores the server's change attribute in i_version and does not
expect it to be modified locally.  This has been observed causing
unnecessary cache invalidations.

The way a filesystem indicates that it's OK to call
inode_query_iversion() is by setting SB_I_VERSION.

So, move the I_VERSION check out of encode_change(), where it's used
only in GETATTR responses, to nfsd4_change_attribute(), which is
also called for pre- and post-operation attributes.

(Note we could also pull the NFSEXP_V4ROOT case into
nfsd4_change_attribute().  That would actually be a no-op,
since pre/post attrs are only used for metadata-modifying operations,
and V4ROOT exports are read-only.  But we might make the change in
the future just for simplicity.)

Reported-by: Daire Byrne <daire@dneg.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
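
The gating described above can be sketched in userspace roughly as follows. Everything here is a simplified assumption: the mock_inode layout, the SB_I_VERSION bit value, and the way ctime is folded into a 64-bit value are illustrative and not copied from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative superblock flag bit; the numeric value is an assumption. */
#define SB_I_VERSION (1UL << 23)

struct mock_inode {
	unsigned long sb_flags;		/* flags of the owning superblock */
	uint64_t i_version;
	uint64_t ctime_sec;
	uint32_t ctime_nsec;
	unsigned int query_count;	/* how often i_version was queried */
};

/* Stand-in for a query that may modify i_version state as a side effect. */
static uint64_t mock_query_iversion(struct mock_inode *inode)
{
	inode->query_count++;
	return inode->i_version;
}

/* Fold ctime into a 64-bit value (one possible encoding). */
static uint64_t mock_time_to_chattr(const struct mock_inode *inode)
{
	return (inode->ctime_sec << 30) + inode->ctime_nsec;
}

/*
 * Query i_version only when the filesystem opted in via SB_I_VERSION;
 * otherwise derive the change attribute from ctime alone.
 */
static uint64_t mock_change_attribute(struct mock_inode *inode)
{
	uint64_t chattr;

	assert(inode != NULL);
	chattr = mock_time_to_chattr(inode);
	if (inode->sb_flags & SB_I_VERSION)
		chattr += mock_query_iversion(inode);
	return chattr;
}
```

In the re-export case, the inode's superblock would not carry the flag, so the stored server-side value is left untouched.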

nfs_common: need lock during iterate through the list · 4a9d81ca

Cheng Lin authored Dec 01, 2020

If an element is deleted while the list is being iterated, the
iteration can fall into an endless loop.

kernel: NMI watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [nfsd:17137]

PID: 17137  TASK: ffff8818d93c0000  CPU: 4   COMMAND: "nfsd"
    [exception RIP: __state_in_grace+76]
    RIP: ffffffffc00e817c  RSP: ffff8818d3aefc98  RFLAGS: 00000246
    RAX: ffff881dc0c38298  RBX: ffffffff81b03580  RCX: ffff881dc02c9f50
    RDX: ffff881e3fce8500  RSI: 0000000000000001  RDI: ffffffff81b03580
    RBP: ffff8818d3aefca0   R8: 0000000000000020   R9: ffff8818d3aefd40
    R10: ffff88017fc03800  R11: ffff8818e83933c0  R12: ffff8818d3aefd40
    R13: 0000000000000000  R14: ffff8818e8391068  R15: ffff8818fa6e4000
    CS: 0010  SS: 0018
 #0 [ffff8818d3aefc98] opens_in_grace at ffffffffc00e81e3 [grace]
 #1 [ffff8818d3aefca8] nfs4_preprocess_stateid_op at ffffffffc02a3e6c [nfsd]
 #2 [ffff8818d3aefd18] nfsd4_write at ffffffffc028ed5b [nfsd]
 #3 [ffff8818d3aefd80] nfsd4_proc_compound at ffffffffc0290a0d [nfsd]
 #4 [ffff8818d3aefdd0] nfsd_dispatch at ffffffffc027b800 [nfsd]
 #5 [ffff8818d3aefe08] svc_process_common at ffffffffc02017f3 [sunrpc]
 #6 [ffff8818d3aefe70] svc_process at ffffffffc0201ce3 [sunrpc]
 #7 [ffff8818d3aefe98] nfsd at ffffffffc027b117 [nfsd]
 #8 [ffff8818d3aefec8] kthread at ffffffff810b88c1
 #9 [ffff8818d3aeff50] ret_from_fork at ffffffff816d1607

The troublemaking element:
crash> lock_manager ffff881dc0c38298
struct lock_manager {
  list = {
    next = 0xffff881dc0c38298,
    prev = 0xffff881dc0c38298
  },
  block_opens = false
}
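
The dump above shows an element whose next and prev both point back at itself, which is what a removed-and-reinitialised list entry looks like; an iterator standing on it never gets back to the list head. A minimal userspace sketch of holding a lock across both modification and iteration follows. The pthread mutex stands in for a kernel spinlock, and grace_start(), grace_end() and in_grace() are hypothetical names with simplified signatures.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct list_head {
	struct list_head *next, *prev;
};

struct lock_manager {
	struct list_head list;
	bool block_opens;
};

static struct list_head grace_list = { &grace_list, &grace_list };
static pthread_mutex_t grace_lock = PTHREAD_MUTEX_INITIALIZER;

static void grace_start(struct lock_manager *lm)
{
	assert(lm != NULL);
	pthread_mutex_lock(&grace_lock);
	lm->list.next = grace_list.next;
	lm->list.prev = &grace_list;
	grace_list.next->prev = &lm->list;
	grace_list.next = &lm->list;
	pthread_mutex_unlock(&grace_lock);
}

static void grace_end(struct lock_manager *lm)
{
	pthread_mutex_lock(&grace_lock);
	lm->list.prev->next = lm->list.next;
	lm->list.next->prev = lm->list.prev;
	/* Re-initialise: the removed entry now points at itself. */
	lm->list.next = lm->list.prev = &lm->list;
	pthread_mutex_unlock(&grace_lock);
}

/* Walk the list under the same lock that writers take. */
static bool in_grace(bool open)
{
	struct list_head *p;
	bool ret = false;

	pthread_mutex_lock(&grace_lock);
	for (p = grace_list.next; p != &grace_list; p = p->next) {
		struct lock_manager *lm = (struct lock_manager *)
			((char *)p - offsetof(struct lock_manager, list));

		if (!open || lm->block_opens) {
			ret = true;
			break;
		}
	}
	pthread_mutex_unlock(&grace_lock);
	return ret;
}
```

Because removal and iteration serialise on grace_lock, the walker can never be left standing on an entry that was unlinked underneath it.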

Fixes: c87fb4a3 ("lockd: NLM grace period shouldn't block NFSv4 opens")
Signed-off-by: Cheng Lin <cheng.lin130@zte.com.cn>
Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

NFSD: Fix 5 seconds delay when doing inter server copy · ca9364dd

Dai Ngo authored Nov 30, 2020

Since commit b4868b44 ("NFSv4: Wait for stateid updates after
CLOSE/OPEN_DOWNGRADE"), every inter-server copy operation suffers a
5-second delay regardless of the size of the copy. The delay comes from
nfs_set_open_stateid_locked when the check by nfs_stateid_is_sequential
fails because the seqids in both nfs4_state and nfs4_stateid are 0.

Fix by modifying nfs4_init_cp_state to return the stateid with seqid 1
instead of 0. This also conforms to section 4.8 of RFC 7862.

Here is the relevant paragraph from section 4.8 of RFC 7862:

   A copy offload stateid's seqid MUST NOT be zero.  In the context of a
   copy offload operation, it is inappropriate to indicate "the most
   recent copy offload operation" using a stateid with a seqid of zero
   (see Section 8.2.2 of [RFC5661]).  It is inappropriate because the
   stateid refers to internal state in the server and there may be
   several asynchronous COPY operations being performed in parallel on
   the same file by the server.  Therefore, a copy offload stateid with
   a seqid of zero MUST be considered invalid.
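
A minimal sketch of the rule quoted above, assuming a simplified stateid layout (a 32-bit seqid followed by 12 opaque bytes). The struct and both function names are hypothetical; this is not the nfsd implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STATEID_OTHER_SIZE 12

/* Simplified copy offload stateid: seqid plus opaque "other" bytes. */
struct copy_stateid {
	uint32_t seqid;
	unsigned char other[STATEID_OTHER_SIZE];
};

/* Initialise a copy stateid; the seqid starts at 1, never 0. */
static void init_copy_stateid(struct copy_stateid *st,
			      const unsigned char *other)
{
	assert(st != NULL && other != NULL);
	st->seqid = 1;
	memcpy(st->other, other, STATEID_OTHER_SIZE);
}

/* A copy offload stateid with a seqid of zero is treated as invalid. */
static bool copy_stateid_is_valid(const struct copy_stateid *st)
{
	return st->seqid != 0;
}
```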

Fixes: ce0887ac ("NFSD add nfs4 inter ssc to nfsd4_copy")
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

NFSD: Fix sparse warning in nfs4proc.c · eb162e17

Chuck Lever authored Nov 30, 2020

linux/fs/nfsd/nfs4proc.c:1542:24: warning: incorrect type in assignment (different base types)
linux/fs/nfsd/nfs4proc.c:1542:24: expected restricted __be32 [assigned] [usertype] status
linux/fs/nfsd/nfs4proc.c:1542:24: got int

Clean-up: The dup_copy_fields() function returns only zero, so make
it return void for now, and get rid of the return code check.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Remove XDRBUF_SPARSE_PAGES flag in gss_proxy upcall · 5e54dafb

Chuck Lever authored Nov 24, 2020

There's no need to defer allocation of pages for the receive buffer.

- This upcall is quite infrequent
- gssp_alloc_receive_pages() can allocate the pages with GFP_KERNEL,
  unlike the transport
- gssp_alloc_receive_pages() knows exactly how many pages are needed

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Olga Kornievskaia <kolga@netapp.com>

sunrpc: clean-up cache downcall · 4b5cff7e

Roberto Bergantinos Corpas authored Nov 27, 2020

We can simplify the code around cache_downcall by unifying memory
allocations using kvmalloc. This has the benefit of getting rid of
cache_slow_downcall (and queue_io_mutex), and also matches userland
allocation size and limits.

Signed-off-by: Roberto Bergantinos Corpas <rbergant@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

nfsd: Fix message level for normal termination · 4420440c

kazuo ito authored Nov 27, 2020

The warning message from nfsd terminating normally
can confuse system administrators or monitoring software.

Though it's not exactly fair to pin-point a commit where it
originated, the current form in the current place started
to appear in:

Fixes: e096bbc6 ("knfsd: remove special handling for SIGHUP")
Signed-off-by: kazuo ito <kzpn200@gmail.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

30 Nov, 2020 28 commits
- NFSD: Remove macros that are no longer used · 5cfc822f
  Chuck Lever authored Nov 04, 2020
```
Now that all the NFSv4 decoder functions have been converted to
make direct calls to the xdr helpers, remove the unused C macros.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
```
- NFSD: Replace READ* macros in nfsd4_decode_compound() · d9b74bda
  Chuck Lever authored Nov 04, 2020
```
And clean-up: Now that we have removed the DECODE_TAIL macro from
nfsd4_decode_compound(), we observe that there's no benefit for
nfsd4_decode_compound() to return nfs_ok or nfserr_bad_xdr only to
have its sole caller convert those values to one or zero,
respectively. Have nfsd4_decode_compound() return 1/0 instead.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
```
- NFSD: Make nfsd4_ops::opnum a u32 · 3a237b4a
  Chuck Lever authored Nov 22, 2020
```
Avoid passing a "pointer to int" argument to xdr_stream_decode_u32.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
```
- NFSD: Replace READ* macros in nfsd4_decode_listxattrs() · 2212036c
  Chuck Lever authored Nov 04, 2020
```
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
```
- NFSD: Replace READ* macros in nfsd4_decode_setxattr() · 403366a7
  Chuck Lever authored Nov 04, 2020
```
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
```
- NFSD: Replace READ* macros in nfsd4_decode_xattr_name() · 830c7150
  Chuck Lever authored Nov 04, 2020
```
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
```
- NFSD: Replace READ* macros in nfsd4_decode_clone() · 3dfd0b0e
  Chuck Lever authored Nov 04, 2020
```
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
```
- NFSD: Replace READ* macros in nfsd4_decode_seek() · 9d32b412
  Chuck Lever authored Nov 04, 2020
```
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
```
- NFSD: Replace READ* macros in nfsd4_decode_offload_status() · 2846bb05
  Chuck Lever authored Nov 21, 2020
```
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
```
- NFSD: Replace READ* macros in nfsd4_decode_copy_notify() · f9a953fb
  Chuck Lever authored Nov 21, 2020
```
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
```
- NFSD: Replace READ* macros in nfsd4_decode_copy() · e8febea7
  Chuck Lever authored Nov 04, 2020
```
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
```
- NFSD: Replace READ* macros in nfsd4_decode_nl4_server() · f49e4b4d
  Chuck Lever authored Nov 16, 2020
```
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
```
- NFSD: Replace READ* macros in nfsd4_decode_fallocate() · 6aef27aa
  Chuck Lever authored Nov 04, 2020
```
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
```
- NFSD: Replace READ* macros in nfsd4_decode_reclaim_complete() · 0d646784
  Chuck Lever authored Nov 03, 2020
```
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
```
- NFSD: Replace READ* macros in nfsd4_decode_destroy_clientid() · c95f2ec3
  Chuck Lever authored Nov 04, 2020
```
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
```
- NFSD: Replace READ* macros in nfsd4_decode_test_stateid() · b7a0c8f6
  Chuck Lever authored Nov 03, 2020
```
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
```
- NFSD: Replace READ* macros in nfsd4_decode_sequence() · cf907b11
  Chuck Lever authored Nov 03, 2020
```
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
```
- NFSD: Replace READ* macros in nfsd4_decode_secinfo_no_name() · 53d70873
  Chuck Lever authored Nov 03, 2020
```
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
```
- NFSD: Replace READ* macros in nfsd4_decode_layoutreturn() · 645fcad3
  Chuck Lever authored Nov 04, 2020
```
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
```
- NFSD: Replace READ* macros in nfsd4_decode_layoutget() · c8e88e3a
  Chuck Lever authored Nov 03, 2020
```
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
```
- NFSD: Replace READ* macros in nfsd4_decode_layoutcommit() · 5185980d
  Chuck Lever authored Nov 04, 2020
```
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
```
- NFSD: Replace READ* macros in nfsd4_decode_getdeviceinfo() · 04495971
  Chuck Lever authored Nov 03, 2020
```
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
```
- NFSD: Replace READ* macros in nfsd4_decode_free_stateid() · aec387d5
  Chuck Lever authored Nov 01, 2020
```
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
```
- NFSD: Replace READ* macros in nfsd4_decode_destroy_session() · 94e254af
  Chuck Lever authored Nov 04, 2020
```
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
```
- NFSD: Replace READ* macros in nfsd4_decode_create_session() · 81243e3f
  Chuck Lever authored Nov 03, 2020
```
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
```
- NFSD: Add a helper to decode channel_attrs4 · 3a3f1fba
  Chuck Lever authored Nov 16, 2020
```
De-duplicate some code.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
```
- NFSD: Add a helper to decode nfs_impl_id4 · 10ff8422
  Chuck Lever authored Nov 16, 2020
```
Refactor for clarity.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
```
- NFSD: Add a helper to decode state_protect4_a · 523ec6ed
  Chuck Lever authored Nov 02, 2020
```
Refactor for clarity.

Also, remove a stale comment. Commit ed941643 ("nfsd: implement
machine credential support for some operations") added support for
SP4_MACH_CRED, so state_protect_a is no longer completely ignored.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
```
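
The decoder conversions in this series replace READ* macros with direct calls to XDR helpers, pass opnum as a u32, and have the compound decoder return 1/0. A minimal userspace sketch of that calling pattern follows. The mock_xdr stream, mock_decode_u32() and mock_decode_op() are hypothetical stand-ins and do not reproduce the kernel's xdr_stream API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decode cursor over a received buffer. */
struct mock_xdr {
	const unsigned char *p;
	const unsigned char *end;
};

/* Decode one big-endian 32-bit word; 0 on success, -1 if the buffer is short. */
static int mock_decode_u32(struct mock_xdr *xdr, uint32_t *out)
{
	assert(xdr != NULL && out != NULL);
	if ((size_t)(xdr->end - xdr->p) < 4)
		return -1;
	*out = ((uint32_t)xdr->p[0] << 24) | ((uint32_t)xdr->p[1] << 16) |
	       ((uint32_t)xdr->p[2] << 8) | (uint32_t)xdr->p[3];
	xdr->p += 4;
	return 0;
}

struct mock_op {
	uint32_t opnum;	/* unsigned 32-bit, so no pointer-to-int mismatch */
};

/* Returns 1 on success and 0 on a decode error. */
static int mock_decode_op(struct mock_xdr *xdr, struct mock_op *op)
{
	if (mock_decode_u32(xdr, &op->opnum) < 0)
		return 0;
	return 1;
}
```

Declaring opnum as an unsigned 32-bit field lets its address be passed straight to the decode helper without a cast.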