Commits · 3803d6721baff3d5dd6cd6b8c7294e54d124bc41 · Kirill Smelkov / linux

03 Feb, 2020 14 commits

NFS: Use kmemdup_nul() in nfs_readdir_make_qstr() · 3803d672

Trond Myklebust authored Feb 02, 2020

The directory strings stored in the readdir cache may be used with
printk(), so it is better to ensure they are nul-terminated.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

3803d672

NFS: Directory page cache pages need to be locked when read · 114de382

Trond Myklebust authored Feb 02, 2020

When a NFS directory page cache page is removed from the page cache,
its contents are freed through a call to nfs_readdir_clear_array().
To prevent the removal of the page cache entry until after we've
finished reading it, we must take the page lock.

Fixes: 11de3b11 ("NFS: Fix a memory leak in nfs_readdir")
Cc: stable@vger.kernel.org # v2.6.37+
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

114de382

NFS: Fix memory leaks and corruption in readdir · 4b310319

Trond Myklebust authored Feb 02, 2020

nfs_readdir_xdr_to_array() must not exit without having initialised
the array, so that the page cache deletion routines can safely
call nfs_readdir_clear_array().
Furthermore, we should ensure that if we exit nfs_readdir_filler()
with an error, we free up any page contents to prevent a leak
if we try to fill the page again.

Fixes: 11de3b11 ("NFS: Fix a memory leak in nfs_readdir")
Cc: stable@vger.kernel.org # v2.6.37+
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

4b310319

SUNRPC: Use kmemdup_nul() in rpc_parse_scope_id() · 7ccbddbe

Trond Myklebust authored Feb 02, 2020

Using kmemdup_nul() is more efficient when the length is known.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

7ccbddbe

NFS: Replace various occurrences of kstrndup() with kmemdup_nul() · a8bd9ddf

Trond Myklebust authored Feb 02, 2020

When we already know the string length, it is more efficient to
use kmemdup_nul().
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
[Anna - Changes to super.c were already made during fscontext conversion]
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

a8bd9ddf

NFSv4: Limit the total number of cached delegations · 10717f45

Trond Myklebust authored Jan 27, 2020

Delegations can be expensive to return, and can cause scalability issues
for the server. Let's therefore try to limit the number of inactive
delegations we hold.
Once the number of delegations is above a certain threshold, start
to return them on close.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

10717f45

NFSv4: Add accounting for the number of active delegations held · d2269ea1

Trond Myklebust authored Jan 27, 2020

In order to better manage our delegation caching, add a counter
to track the number of active delegations.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

d2269ea1

NFSv4: Try to return the delegation immediately when marked for return on close · b7b7dac6

Trond Myklebust authored Jan 27, 2020

Add a routine to return the delegation immediately upon close of the
file if it was marked for return-on-close.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

b7b7dac6

NFS: Clear NFS_DELEGATION_RETURN_IF_CLOSED when the delegation is returned · 0d104167

Trond Myklebust authored Jan 27, 2020

If a delegation is marked as needing to be returned when the file is
closed, then don't clear that marking until we're ready to return
it.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

0d104167

NFSv4: nfs_inode_evict_delegation() should set NFS_DELEGATION_RETURNING · f885ea64

Trond Myklebust authored Jan 27, 2020

In particular, the pnfs return-on-close code will check for that flag,
so ensure we set it appropriately.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

f885ea64

NFS: nfs_find_open_context() should use cred_fscmp() · 65f51603

Trond Myklebust authored Jan 26, 2020

We want to find open contexts that match our filesystem access
properties. They don't have to exactly match the cred.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

65f51603

NFS: nfs_access_get_cached_rcu() should use cred_fscmp() · 9a206de2

Trond Myklebust authored Jan 26, 2020

We do not need to have the rcu lookup method fail in the case where
the fsuid/fsgid and supplemental groups match.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

9a206de2

NFSv4: pnfs_roc() must use cred_fscmp() to compare creds · 38712247

Trond Myklebust authored Jan 26, 2020

When comparing two 'struct cred' for equality w.r.t. behaviour under
filesystem access, we need to use cred_fscmp().

Fixes: a52458b4 ("NFS/NFSD/SUNRPC: replace generic creds with 'struct cred'.")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

38712247

NFS: remove unused macros · c0399cf6

Alex Shi authored Jan 21, 2020

MNT_fhs_status_sz/MNT_fhandle3_sz are never used after they were
introduced. So better to remove them.
Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: Anna Schumaker <anna.schumaker@netapp.com>
Cc: linux-nfs@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

c0399cf6

24 Jan, 2020 3 commits

nfs: Return EINVAL rather than ERANGE for mount parse errors · 3a21409a

David Howells authored Jan 17, 2020

Return EINVAL rather than ERANGE for mount parse errors as the userspace
mount command doesn't necessarily understand what to do with anything other
than EINVAL.

The old code returned -ERANGE as an intermediate error that then get
converted to -EINVAL, whereas the new code returns -ERANGE.

This was induced by passing minorversion=1 to a v4 mount where
CONFIG_NFS_V4_1 was disabled in the kernel build.

Fixes: 68f65ef40e1e ("NFS: Convert mount option parsing to use functionality from fs_parser.h")
Reported-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

3a21409a

NFS: allow deprecation of NFS UDP protocol · b24ee6c6

Olga Kornievskaia authored Dec 16, 2019

Add a kernel config CONFIG_NFS_DISABLE_UDP_SUPPORT to disallow NFS
UDP mounts and enable it by default.
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

b24ee6c6

NFS: Add softreval behaviour to nfs_lookup_revalidate() · f7b37b8b

Trond Myklebust authored Jan 14, 2020

If the server is unavaliable, we want to allow the revalidating
lookup to time out, and to default to validating the cached dentry
if the 'softreval' mount option is set.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

f7b37b8b

15 Jan, 2020 23 commits

NFSv3: FIx bug when using chacl and chmod to change acl · fe1e8dbe

Su Yanjun authored Dec 25, 2019

We find a bug when running test under nfsv3  as below.
1)
chacl u::r--,g::rwx,o:rw- file1
2)
chmod u+w file1
3)
chacl -l file1

We expect u::rw-, but it shows u::r--, more likely it returns the
cached acl in inode.

We dig the code find that the code path is different.

chacl->..->__nfs3_proc_setacls->nfs_zap_acl_cache
Then nfs_zap_acl_cache clears the NFS_INO_INVALID_ACL in
NFS_I(inode)->cache_validity.

chmod->..->nfs3_proc_setattr
Because NFS_INO_INVALID_ACL has been cleared by chacl path,
nfs_zap_acl_cache wont be called.

nfs_setattr_update_inode will set NFS_INO_INVALID_ACL so let it
before nfs_zap_acl_cache call.
Signed-off-by: Su Yanjun <suyanjun218@gmail.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

fe1e8dbe

NFSv4.x recover from pre-mature loss of openstateid · d826e5b8

Olga Kornievskaia authored Dec 18, 2019

Ever since the commit 0e0cb35b, it's possible to lose an open stateid
while retrying a CLOSE due to ERR_OLD_STATEID. Once that happens,
operations that require openstateid fail with EAGAIN which is propagated
to the application then tests like generic/446 and generic/168 fail with
"Resource temporarily unavailable".

Instead of returning this error, initiate state recovery when possible to
recover the open stateid and then try calling nfs4_select_rw_stateid()
again.

Fixes: 0e0cb35b ("NFSv4: Handle NFS4ERR_OLD_STATEID in CLOSE/OPEN_DOWNGRADE")
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

d826e5b8

NFSv4 fix acl retrieval over krb5i/krb5p mounts · 62a1573f

Olga Kornievskaia authored Jan 02, 2020

For the krb5i and krb5p mount, it was problematic to truncate the
received ACL to the provided buffer because an integrity check
could not be preformed.

Instead, provide enough pages to accommodate the largest buffer
bounded by the largest RPC receive buffer size.

Note: I don't think it's possible for the ACL to be truncated now.
Thus NFS4_ACL_TRUNC flag and related code could be possibly
removed but since I'm unsure, I'm leaving it.

v2: needs +1 page.
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

62a1573f

NFS: Add mount option 'softreval' · c74dfe97

Trond Myklebust authored Jan 06, 2020

Add a mount option 'softreval' that allows attribute revalidation 'getattr'
calls to time out, and causes them to fall back to using the cached
attributes.
The use case for this option is for ensuring that we can still (slowly)
traverse paths and use cached information even when the server is down.
Once the server comes back up again, the getattr calls start succeeding,
and the caches will revalidate as usual.

The 'softreval' mount option is automatically enabled if you have
specified 'softerr'.  It can be turned off using the options
'nosoftreval', or 'hard'.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

c74dfe97

NFS: Trust cached access if we've already revalidated the inode once · 5c965db8

Trond Myklebust authored Jan 06, 2020

If we've already revalidated the inode once then don't distrust the
access cache unless the NFS_INO_INVALID_ACCESS flag is actually set.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

5c965db8

NFS: Fix nfs_direct_write_reschedule_io() · 4daaeba9

Trond Myklebust authored Jan 06, 2020

The 'hdr->good_bytes' is defined as the number of bytes we expect to
read or write starting at offset hdr->io_start. In the case of a partial
read/write we may end up adjusting hdr->args.offset and hdr->args.count
to skip I/O for data that was already read/written, and so we must ensure
the calculation takes that into account.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

4daaeba9

NFS: When resending after a short write, reset the reply count to zero · 8c9cb714

Trond Myklebust authored Jan 06, 2020

If we're resending a write due to a short read or write, ensure we
reset the reply count to zero.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

8c9cb714

NFS: Improve tracing of permission calls · e8194b7d

Trond Myklebust authored Jan 06, 2020

On exit from nfs_do_access(), record the mask representing the requested
permissions, as well as the server-supplied set of access rights for
this user.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

e8194b7d

pNFS/flexfiles: Add tracing for layout errors · 088f3e68

Trond Myklebust authored Jan 06, 2020

Trace layout errors for pNFS/flexfiles on read/write/commit operations.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

088f3e68

NFS: Clean up generic file commit tracepoint · 7bdd297e

Trond Myklebust authored Jan 06, 2020

Clean up the generic file commit tracepoints to use a 64-bit value
for the verifier, and to display the pNFS filehandle, if it exists.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

7bdd297e

NFS: Clean up generic writeback tracepoints · 5bb2a7cb

Trond Myklebust authored Jan 06, 2020

Clean up the generic writeback tracepoints so they do pass the
full structures as arguments. Also ensure we report the number
of bytes actually written.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

5bb2a7cb

NFS: Clean up generic file read tracepoints · 2343172d

Trond Myklebust authored Jan 06, 2020

Clean up the generic file read tracepoints so they do pass the
full structures as arguments. Also ensure we report the number
of bytes actually read.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

2343172d

pNFS/flexfiles: Record resend attempts on I/O failure · 0722dc9f

Trond Myklebust authored Jan 06, 2020

If the attempt to do pNFS fails, then record what action we
take to recover (resend, reset to pnfs or reset to mds).
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

0722dc9f

NFS: Fix fix of show_nfs_errors · 118b6292

Trond Myklebust authored Jan 06, 2020

Casting a negative value to an unsigned long is not the same as
converting it to its absolute value.

Fixes: 96650e2e ("NFS: Fix show_nfs_errors macros again")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

118b6292

NFSv4: Improve read/write/commit tracing · 25925b00

Trond Myklebust authored Jan 06, 2020

Ensure we always return the number of bytes read/written. Also display
the pnfs filehandle if it is in use.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

25925b00

NFS/pnfs: Fix pnfs_generic_prepare_to_resend_writes() · 221203ce

Trond Myklebust authored Jan 06, 2020

Instead of making assumptions about the commit verifier contents, change
the commit code to ensure we always check that the verifier was set
by the XDR code.

Fixes: f54bcf2e ("pnfs: Prepare for flexfiles by pulling out common code")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

221203ce

NFS: Fix up fsync() when the server rebooted · 2197e9b0

Trond Myklebust authored Jan 06, 2020

Don't clear the NFS_CONTEXT_RESEND_WRITES flag until after calling
nfs_commit_inode(). Otherwise, if nfs_commit_inode() returns an
error, we end up with dirty pages in the page cache, but no tag
to tell us that those pages need resending.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

2197e9b0

SUNRPC: Remove broken gss_mech_list_pseudoflavors() · b32d2855

Trond Myklebust authored Jan 06, 2020

Remove gss_mech_list_pseudoflavors() and its callers. This is part of
an unused API, and could leak an RCU reference if it were ever called.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

b32d2855

NFS: Revalidate the file mapping on all fatal writeback errors · b8946d7b

Trond Myklebust authored Jan 06, 2020

If a write or commit failed, and the mapping sees a fatal error, we
need to revalidate the contents of that mapping.

Fixes: 06c9fdf3 ("NFS: On fatal writeback errors, we need to call nfs_inode_remove_request()")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

b8946d7b

NFS: Revalidate the file size on a fatal write error · 0df68ced

Trond Myklebust authored Jan 06, 2020

If we suffer a fatal error upon writing a file, which causes us to
need to revalidate the entire mapping, then we should also revalidate
the file size.

Fixes: d2ceb7e5 ("NFS: Don't use page_file_mapping after removing the page")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

0df68ced

xprtrdma: DMA map rr_rdma_buf as each rpcrdma_rep is created · e515dd9d

Chuck Lever authored Jan 03, 2020

Clean up: This simplifies the logic in rpcrdma_post_recvs.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

e515dd9d

xprtrdma: Destroy reps from previous connection instance · b7ff0185

Chuck Lever authored Jan 03, 2020

To safely get rid of all rpcrdma_reps from a particular connection
instance, xprtrdma has to wait until each of those reps is finished
being used. A rep may be backing the rq_rcv_buf of an RPC that has
just completed, for example.

Since it is safe to invoke rpcrdma_rep_destroy() only in the Receive
completion handler, simply mark reps remaining in the rb_all_reps
list after the transport is drained. These will then be deleted as
rpcrdma_post_recvs pulls them off the rep free list.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

b7ff0185

xprtrdma: Destroy rpcrdma_rep when Receive is flushed · 85810388

Chuck Lever authored Jan 03, 2020

This reduces the hardware and memory footprint of an unconnected
transport.

At some point in the future, transport reconnect will allow
resolving the destination IP address through a different device. The
current change enables reps for the new connection to be allocated
on whichever NUMA node the new device affines to after a reconnect.

Note that this does not destroy _all_ the transport's reps... there
will be a few that are still part of a running RPC completion.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

85810388