Commits · c80c98f0dc5dc709b04254b5f30145c6ab8800a4 · Kirill Smelkov / linux

07 Feb, 2020 11 commits

ceph_parse_param(), ceph_parse_mon_ips(): switch to passing fc_log · c80c98f0

Al Viro authored Dec 21, 2019

... and now errorf() et.al. are never called with NULL fs_context,
so we can get rid of conditional in those.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

c80c98f0

new primitive: __fs_parse() · 7f5d3814

Al Viro authored Dec 20, 2019

fs_parse() analogue taking p_log instead of fs_context.
fs_parse() turned into a wrapper, callers in ceph_common and rbd
switched to __fs_parse().

As the result, fs_parse() never gets NULL fs_context and neither
do fs_context-based logging primitives
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

7f5d3814

switch rbd and libceph to p_log-based primitives · 2c3f3dc3
Al Viro authored Dec 20, 2019
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
2c3f3dc3
struct p_log, variants of warnf() et.al. taking that one instead · 3fbb8d55
Al Viro authored Dec 20, 2019
```
primitives for prefixed logging
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
3fbb8d55
teach logfc() to handle prefices, give it saner calling conventions · 9f09f649
Al Viro authored Dec 20, 2019
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
9f09f649

get rid of cg_invalf() · fbc2d168

Al Viro authored Dec 20, 2019

pointless alias for invalf()...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

fbc2d168

get rid of fs_value_is_filename_empty · aa1918f9

Al Viro authored Dec 17, 2019

Its behaviour is identical to that of fs_value_is_filename.
It makes no sense, anyway - LOOKUP_EMPTY affects nothing
whatsoever once the pathname has been imported from userland.
And both fs_value_is_filename and fs_value_is_filename_empty
carry an already imported pathname.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

aa1918f9

don't bother with explicit length argument for __lookup_constant() · 34264ae3

Al Viro authored Dec 16, 2019

Have the arrays of constant_table self-terminated (by NULL ->name
in the final entry).  Simplifies lookup_constant() and allows to
reuse the search for enum params as well.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

34264ae3

fold struct fs_parameter_enum into struct constant_table · 5eede625
Al Viro authored Dec 16, 2019
```
no real difference now
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
5eede625

fs_parse: get rid of ->enums · 2710c957

Al Viro authored Sep 06, 2019

Don't do a single array; attach them to fsparam_enum() entry
instead.  And don't bother trying to embed the names into those -
it actually loses memory, with no real speedup worth mentioning.

Simplifies validation as well.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

2710c957

Pass consistent param->type to fs_parse() · 0f89589a

Al Viro authored Dec 17, 2019

As it is, vfs_parse_fs_string() makes "foo" and "foo=" indistinguishable;
both get fs_value_is_string for ->type and NULL for ->string.  To make
it even more unpleasant, that combination is impossible to produce with
fsconfig().

Much saner rules would be
        "foo"           => fs_value_is_flag, NULL
	"foo="          => fs_value_is_string, ""
	"foo=bar"       => fs_value_is_string, "bar"
All cases are distinguishable, all results are expressable by fsconfig(),
->has_value checks are much simpler that way (to the point of the field
being useless) and quite a few regressions go away (gfs2 has no business
accepting -o nodebug=, for example).

Partially based upon patches from Miklos.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

0f89589a

04 Feb, 2020 4 commits

NFSv4.0: nfs4_do_fsinfo() should not do implicit lease renewals · 7dc2993a

Robert Milkowski authored Jan 30, 2020

Currently, each time nfs4_do_fsinfo() is called it will do an implicit
NFS4 lease renewal, which is not compliant with the NFS4 specification.
This can result in a lease being expired by an NFS server.

Commit 83ca7f5a ("NFS: Avoid PUTROOTFH when managing leases")
introduced implicit client lease renewal in nfs4_do_fsinfo(),
which can result in the NFSv4.0 lease to expire on a server side,
and servers returning NFS4ERR_EXPIRED or NFS4ERR_STALE_CLIENTID.

This can easily be reproduced by frequently unmounting a sub-mount,
then stat'ing it to get it mounted again, which will delay or even
completely prevent client from sending RENEW operations if no other
NFS operations are issued. Eventually nfs server will expire client's
lease and return an error on file access or next RENEW.

This can also happen when a sub-mount is automatically unmounted
due to inactivity (after nfs_mountpoint_expiry_timeout), then it is
mounted again via stat(). This can result in a short window during
which client's lease will expire on a server but not on a client.
This specific case was observed on production systems.

This patch removes the implicit lease renewal from nfs4_do_fsinfo().

Fixes: 83ca7f5a ("NFS: Avoid PUTROOTFH when managing leases")
Signed-off-by: Robert Milkowski <rmilkowski@gmail.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

7dc2993a

NFSv4: try lease recovery on NFS4ERR_EXPIRED · 924491f2

Robert Milkowski authored Jan 28, 2020

Currently, if an nfs server returns NFS4ERR_EXPIRED to open(),
we return EIO to applications without even trying to recover.

Fixes: 272289a3 ("NFSv4: nfs4_do_handle_exception() handle revoke/expiry of a single stateid")
Signed-off-by: Robert Milkowski <rmilkowski@gmail.com>
Reviewed-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

924491f2

NFS: Fix memory leaks · 123c23c6

Wenwen Wang authored Feb 03, 2020

In _nfs42_proc_copy(), 'res->commit_res.verf' is allocated through
kzalloc() if 'args->sync' is true. In the following code, if
'res->synchronous' is false, handle_async_copy() will be invoked. If an
error occurs during the invocation, the following code will not be executed
and the error will be returned . However, the allocated
'res->commit_res.verf' is not deallocated, leading to a memory leak. This
is also true if the invocation of process_copy_commit() returns an error.

To fix the above leaks, redirect the execution to the 'out' label if an
error is encountered.
Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

123c23c6

nfs: optimise readdir cache page invalidation · 227823d2

Dai Ngo authored Jan 22, 2020

When the directory is large and it's being modified by one client
while another client is doing the 'ls -l' on the same directory then
the cache page invalidation from nfs_force_use_readdirplus causes
the reading client to keep restarting READDIRPLUS from cookie 0
which causes the 'ls -l' to take a very long time to complete,
possibly never completing.

Currently when nfs_force_use_readdirplus is called to switch from
READDIR to READDIRPLUS, it invalidates all the cached pages of the
directory. This cache page invalidation causes the next nfs_readdir
to re-read the directory content from cookie 0.

This patch is to optimise the cache invalidation in
nfs_force_use_readdirplus by only truncating the cached pages from
last page index accessed to the end the file. It also marks the
inode to delay invalidating all the cached page of the directory
until the next initial nfs_readdir of the next 'ls' instance.
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Reviewed-by: Trond Myklebust <trond.myklebust@hammerspace.com>
[Anna - Fix conflicts with Trond's readdir patches]
[Anna - Remove redundant call to nfs_zap_mapping()]
[Anna - Replace d_inode(file_dentry(desc->file)) with file_inode(desc->file)]
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

227823d2

03 Feb, 2020 15 commits

NFS: Switch readdir to using iterate_shared() · 93a6ab7b

Trond Myklebust authored Feb 02, 2020

Now that the page cache locking is repaired, we should be able to
switch to using iterate_shared() for improved concurrency when
doing readdir().
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

93a6ab7b

NFS: Use kmemdup_nul() in nfs_readdir_make_qstr() · 3803d672

Trond Myklebust authored Feb 02, 2020

The directory strings stored in the readdir cache may be used with
printk(), so it is better to ensure they are nul-terminated.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

3803d672

NFS: Directory page cache pages need to be locked when read · 114de382

Trond Myklebust authored Feb 02, 2020

When a NFS directory page cache page is removed from the page cache,
its contents are freed through a call to nfs_readdir_clear_array().
To prevent the removal of the page cache entry until after we've
finished reading it, we must take the page lock.

Fixes: 11de3b11 ("NFS: Fix a memory leak in nfs_readdir")
Cc: stable@vger.kernel.org # v2.6.37+
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

114de382

NFS: Fix memory leaks and corruption in readdir · 4b310319

Trond Myklebust authored Feb 02, 2020

nfs_readdir_xdr_to_array() must not exit without having initialised
the array, so that the page cache deletion routines can safely
call nfs_readdir_clear_array().
Furthermore, we should ensure that if we exit nfs_readdir_filler()
with an error, we free up any page contents to prevent a leak
if we try to fill the page again.

Fixes: 11de3b11 ("NFS: Fix a memory leak in nfs_readdir")
Cc: stable@vger.kernel.org # v2.6.37+
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

4b310319

SUNRPC: Use kmemdup_nul() in rpc_parse_scope_id() · 7ccbddbe

Trond Myklebust authored Feb 02, 2020

Using kmemdup_nul() is more efficient when the length is known.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

7ccbddbe

NFS: Replace various occurrences of kstrndup() with kmemdup_nul() · a8bd9ddf

Trond Myklebust authored Feb 02, 2020

When we already know the string length, it is more efficient to
use kmemdup_nul().
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
[Anna - Changes to super.c were already made during fscontext conversion]
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

a8bd9ddf

NFSv4: Limit the total number of cached delegations · 10717f45

Trond Myklebust authored Jan 27, 2020

Delegations can be expensive to return, and can cause scalability issues
for the server. Let's therefore try to limit the number of inactive
delegations we hold.
Once the number of delegations is above a certain threshold, start
to return them on close.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

10717f45

NFSv4: Add accounting for the number of active delegations held · d2269ea1

Trond Myklebust authored Jan 27, 2020

In order to better manage our delegation caching, add a counter
to track the number of active delegations.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

d2269ea1

NFSv4: Try to return the delegation immediately when marked for return on close · b7b7dac6

Trond Myklebust authored Jan 27, 2020

Add a routine to return the delegation immediately upon close of the
file if it was marked for return-on-close.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

b7b7dac6

NFS: Clear NFS_DELEGATION_RETURN_IF_CLOSED when the delegation is returned · 0d104167

Trond Myklebust authored Jan 27, 2020

If a delegation is marked as needing to be returned when the file is
closed, then don't clear that marking until we're ready to return
it.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

0d104167

NFSv4: nfs_inode_evict_delegation() should set NFS_DELEGATION_RETURNING · f885ea64

Trond Myklebust authored Jan 27, 2020

In particular, the pnfs return-on-close code will check for that flag,
so ensure we set it appropriately.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

f885ea64

NFS: nfs_find_open_context() should use cred_fscmp() · 65f51603

Trond Myklebust authored Jan 26, 2020

We want to find open contexts that match our filesystem access
properties. They don't have to exactly match the cred.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

65f51603

NFS: nfs_access_get_cached_rcu() should use cred_fscmp() · 9a206de2

Trond Myklebust authored Jan 26, 2020

We do not need to have the rcu lookup method fail in the case where
the fsuid/fsgid and supplemental groups match.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

9a206de2

NFSv4: pnfs_roc() must use cred_fscmp() to compare creds · 38712247

Trond Myklebust authored Jan 26, 2020

When comparing two 'struct cred' for equality w.r.t. behaviour under
filesystem access, we need to use cred_fscmp().

Fixes: a52458b4 ("NFS/NFSD/SUNRPC: replace generic creds with 'struct cred'.")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

38712247

NFS: remove unused macros · c0399cf6

Alex Shi authored Jan 21, 2020

MNT_fhs_status_sz/MNT_fhandle3_sz are never used after they were
introduced. So better to remove them.
Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: Anna Schumaker <anna.schumaker@netapp.com>
Cc: linux-nfs@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

c0399cf6

24 Jan, 2020 3 commits

nfs: Return EINVAL rather than ERANGE for mount parse errors · 3a21409a

David Howells authored Jan 17, 2020

Return EINVAL rather than ERANGE for mount parse errors as the userspace
mount command doesn't necessarily understand what to do with anything other
than EINVAL.

The old code returned -ERANGE as an intermediate error that then get
converted to -EINVAL, whereas the new code returns -ERANGE.

This was induced by passing minorversion=1 to a v4 mount where
CONFIG_NFS_V4_1 was disabled in the kernel build.

Fixes: 68f65ef40e1e ("NFS: Convert mount option parsing to use functionality from fs_parser.h")
Reported-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

3a21409a

NFS: allow deprecation of NFS UDP protocol · b24ee6c6

Olga Kornievskaia authored Dec 16, 2019

Add a kernel config CONFIG_NFS_DISABLE_UDP_SUPPORT to disallow NFS
UDP mounts and enable it by default.
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

b24ee6c6

NFS: Add softreval behaviour to nfs_lookup_revalidate() · f7b37b8b

Trond Myklebust authored Jan 14, 2020

If the server is unavaliable, we want to allow the revalidating
lookup to time out, and to default to validating the cached dentry
if the 'softreval' mount option is set.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

f7b37b8b

15 Jan, 2020 7 commits

NFSv3: FIx bug when using chacl and chmod to change acl · fe1e8dbe

Su Yanjun authored Dec 25, 2019

We find a bug when running test under nfsv3  as below.
1)
chacl u::r--,g::rwx,o:rw- file1
2)
chmod u+w file1
3)
chacl -l file1

We expect u::rw-, but it shows u::r--, more likely it returns the
cached acl in inode.

We dig the code find that the code path is different.

chacl->..->__nfs3_proc_setacls->nfs_zap_acl_cache
Then nfs_zap_acl_cache clears the NFS_INO_INVALID_ACL in
NFS_I(inode)->cache_validity.

chmod->..->nfs3_proc_setattr
Because NFS_INO_INVALID_ACL has been cleared by chacl path,
nfs_zap_acl_cache wont be called.

nfs_setattr_update_inode will set NFS_INO_INVALID_ACL so let it
before nfs_zap_acl_cache call.
Signed-off-by: Su Yanjun <suyanjun218@gmail.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

fe1e8dbe

NFSv4.x recover from pre-mature loss of openstateid · d826e5b8

Olga Kornievskaia authored Dec 18, 2019

Ever since the commit 0e0cb35b, it's possible to lose an open stateid
while retrying a CLOSE due to ERR_OLD_STATEID. Once that happens,
operations that require openstateid fail with EAGAIN which is propagated
to the application then tests like generic/446 and generic/168 fail with
"Resource temporarily unavailable".

Instead of returning this error, initiate state recovery when possible to
recover the open stateid and then try calling nfs4_select_rw_stateid()
again.

Fixes: 0e0cb35b ("NFSv4: Handle NFS4ERR_OLD_STATEID in CLOSE/OPEN_DOWNGRADE")
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

d826e5b8

NFSv4 fix acl retrieval over krb5i/krb5p mounts · 62a1573f

Olga Kornievskaia authored Jan 02, 2020

For the krb5i and krb5p mount, it was problematic to truncate the
received ACL to the provided buffer because an integrity check
could not be preformed.

Instead, provide enough pages to accommodate the largest buffer
bounded by the largest RPC receive buffer size.

Note: I don't think it's possible for the ACL to be truncated now.
Thus NFS4_ACL_TRUNC flag and related code could be possibly
removed but since I'm unsure, I'm leaving it.

v2: needs +1 page.
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

62a1573f

NFS: Add mount option 'softreval' · c74dfe97

Trond Myklebust authored Jan 06, 2020

Add a mount option 'softreval' that allows attribute revalidation 'getattr'
calls to time out, and causes them to fall back to using the cached
attributes.
The use case for this option is for ensuring that we can still (slowly)
traverse paths and use cached information even when the server is down.
Once the server comes back up again, the getattr calls start succeeding,
and the caches will revalidate as usual.

The 'softreval' mount option is automatically enabled if you have
specified 'softerr'.  It can be turned off using the options
'nosoftreval', or 'hard'.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

c74dfe97

NFS: Trust cached access if we've already revalidated the inode once · 5c965db8

Trond Myklebust authored Jan 06, 2020

If we've already revalidated the inode once then don't distrust the
access cache unless the NFS_INO_INVALID_ACCESS flag is actually set.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

5c965db8

NFS: Fix nfs_direct_write_reschedule_io() · 4daaeba9

Trond Myklebust authored Jan 06, 2020

The 'hdr->good_bytes' is defined as the number of bytes we expect to
read or write starting at offset hdr->io_start. In the case of a partial
read/write we may end up adjusting hdr->args.offset and hdr->args.count
to skip I/O for data that was already read/written, and so we must ensure
the calculation takes that into account.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

4daaeba9

NFS: When resending after a short write, reset the reply count to zero · 8c9cb714

Trond Myklebust authored Jan 06, 2020

If we're resending a write due to a short read or write, ensure we
reset the reply count to zero.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

8c9cb714