Commits · 2825a7f90753012babe7ee292f4a1eadd3706f92 · Kirill Smelkov / linux

30 May, 2014 9 commits

nfsd4: allow encoding across page boundaries · 2825a7f9

J. Bruce Fields authored Aug 26, 2013

After this we can handle, for example, getattr of very large ACLs.

Read, readdir, readlink are still special cases with their own limits.

Also we can't handle a new operation starting close to the end of a
page.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


nfsd4: size-checking cleanup · a8095f7e

J. Bruce Fields authored Mar 11, 2014

Better variable name, some comments, etc.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


nfsd4: remove redundant encode buffer size checking · ea8d7720

J. Bruce Fields authored Mar 08, 2014

Now that all op encoders can handle running out of space, we no longer
need to check the remaining size for every operation; only nonidempotent
operations need that check, and that can be done by
nfsd4_check_resp_size.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


nfsd4: nfsd4_check_resp_size needn't recalculate length · 67492c99

J. Bruce Fields authored Mar 08, 2014

We're keeping the length updated as we go now, so there's no need for
the extra calculation here.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


nfsd4: reserve space before inlining 0-copy pages · 4e21ac4b

J. Bruce Fields authored Mar 22, 2014

Once we've included page-cache pages in the encoding it's difficult to
remove them and restart encoding.  (xdr_truncate_encode doesn't handle
that case.)  So, make sure we'll have adequate space to finish the
operation first.

For now COMPOUND_SLACK_SPACE checks should prevent this case happening,
but we want to remove those checks.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


nfsd4: teach encoders to handle reserve_space failures · d0a381dd

J. Bruce Fields authored Jan 30, 2014

We've tried to prevent running out of space with COMPOUND_SLACK_SPACE
and special checking in those operations (getattr) whose result can vary
enormously.

However:
	- COMPOUND_SLACK_SPACE may be difficult to maintain as we add
	  more protocol.
	- BUG_ON or page faulting on failure seems overly fragile.
	- Especially in the 4.1 case, we prefer not to fail compounds
	  just because the returned result came *close* to session
	  limits.  (Though perfect enforcement here may be difficult.)
	- I'd prefer encoding to be uniform for all encoders instead of
	  having special exceptions for encoders containing, for
	  example, attributes.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


nfsd4: "backfill" using write_bytes_to_xdr_buf · 082d4bd7

J. Bruce Fields authored Aug 29, 2013

Normally xdr encoding proceeds in a single pass from start of a buffer
to end, but sometimes we have to write a few bytes to an earlier
position.

Use write_bytes_to_xdr_buf for these cases rather than saving a pointer
to write to.  We plan to rewrite xdr_reserve_space to handle encoding
across page boundaries using a scratch buffer, and don't want to risk
writing to a pointer that was contained in a scratch buffer.

Also it will no longer be safe to calculate lengths by subtracting two
pointers, so use xdr_buf offsets instead.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
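
The backfill idea can be illustrated with a small userspace sketch. This is not the kernel code: `toy_buf`, `toy_append`, `toy_write_at`, `put_be32` and `toy_encode_counted` are hypothetical names invented here, the buffer is a single flat array rather than a paged xdr_buf, and XDR padding is left out. It only shows a length field being filled in afterwards by offset, with the length itself computed from offsets rather than by subtracting pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flat buffer standing in for an xdr_buf; names are illustrative. */
struct toy_buf {
	unsigned char data[128];
	size_t len;		/* bytes encoded so far */
};

/* Append nbytes and return the offset at which they were written. */
static size_t toy_append(struct toy_buf *b, const void *src, size_t nbytes)
{
	size_t off = b->len;

	assert(nbytes <= sizeof(b->data) - b->len);
	memcpy(b->data + off, src, nbytes);
	b->len += nbytes;
	return off;
}

/* Overwrite bytes at an earlier position, addressed by offset, not pointer. */
static void toy_write_at(struct toy_buf *b, size_t off, const void *src,
			 size_t nbytes)
{
	assert(off <= b->len && nbytes <= b->len - off);
	memcpy(b->data + off, src, nbytes);
}

/* Store a 32-bit value in big-endian (XDR) byte order. */
static void put_be32(unsigned char out[4], uint32_t v)
{
	out[0] = (unsigned char)(v >> 24);
	out[1] = (unsigned char)(v >> 16);
	out[2] = (unsigned char)(v >> 8);
	out[3] = (unsigned char)v;
}

/* Encode <length><payload>, filling in the length only after the payload. */
static size_t toy_encode_counted(struct toy_buf *b, const char *payload)
{
	unsigned char word[4] = { 0 };
	size_t len_off = toy_append(b, word, 4);	/* placeholder */
	size_t start = b->len;

	toy_append(b, payload, strlen(payload));
	put_be32(word, (uint32_t)(b->len - start));	/* length from offsets */
	toy_write_at(b, len_off, word, 4);		/* backfill */
	return b->len - len_off;
}
```

Because only offsets are remembered between the placeholder and the backfill, the sketch would keep working if the underlying storage were moved or staged through a scratch area in between.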


nfsd4: use xdr_truncate_encode · 1fcea5b2

J. Bruce Fields authored Feb 26, 2014

Now that lengths are reliable, we can use xdr_truncate_encode instead of
open-coding it everywhere.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


rpc: xdr_truncate_encode · 3e19ce76

J. Bruce Fields authored Feb 25, 2014

This will be used in the server side in a few cases:
	- when certain operations (read, readdir, readlink) fail after
	  encoding a partial response.
	- when we run out of space after encoding a partial response.
	- in readlink, where we initially reserve PAGE_SIZE bytes for
	  data, then truncate to the actual size.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
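
A rough userspace sketch of the reserve-then-truncate pattern from the readlink case above. It assumes a single flat buffer; `toy_stream`, `toy_reserve`, `toy_truncate` and `toy_encode_link` are hypothetical stand-ins and do not reproduce the signatures of xdr_reserve_space or xdr_truncate_encode.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified encode buffer; not the kernel's xdr_stream. */
struct toy_stream {
	unsigned char buf[256];
	size_t len;		/* bytes encoded so far */
};

/* Reserve nbytes at the end of the stream; returns NULL when out of space. */
static unsigned char *toy_reserve(struct toy_stream *s, size_t nbytes)
{
	unsigned char *p;

	if (nbytes > sizeof(s->buf) - s->len)
		return NULL;
	p = s->buf + s->len;
	s->len += nbytes;
	return p;
}

/* Shrink the stream back to new_len, discarding anything encoded after it. */
static void toy_truncate(struct toy_stream *s, size_t new_len)
{
	assert(new_len <= s->len);
	s->len = new_len;
}

/*
 * Readlink-style encoding: reserve the worst case first, copy the data,
 * then truncate to the size actually used.
 */
static int toy_encode_link(struct toy_stream *s, const char *target,
			   size_t max)
{
	size_t start = s->len;
	size_t n = strlen(target);
	unsigned char *p = toy_reserve(s, max);

	if (!p)
		return -1;		/* no room: nothing was encoded */
	if (n > max) {
		toy_truncate(s, start);	/* undo the partial response */
		return -1;
	}
	memcpy(p, target, n);
	toy_truncate(s, start + n);
	return 0;
}
```

In this sketch a failed call always leaves `len` where it started, which is the property the three cases listed above rely on.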


28 May, 2014 5 commits

nfsd4: keep xdr buf length updated · 6ac90391

J. Bruce Fields authored Feb 26, 2014

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd4: no need for encode_compoundres to adjust lengths · dd97fdde

J. Bruce Fields authored Feb 26, 2014

xdr_reserve_space should now be calculating the length correctly as we
go, so there's no longer any need to fix it up here.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


nfsd4: remove ADJUST_ARGS · f46d382a

J. Bruce Fields authored Jan 31, 2014

It's just uninteresting debugging code at this point.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


nfsd4: use xdr_stream throughout compound encoding · d3f627c8

J. Bruce Fields authored Feb 26, 2014

Note this makes ADJUST_ARGS useless; we'll remove it in the following
patch.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


nfsd4: use xdr_reserve_space in attribute encoding · ddd1ea56

J. Bruce Fields authored Aug 27, 2013

This is a cosmetic change for now; no change in behavior.

Note we're just depending on xdr_reserve_space to do the bounds checking
for us; we're not really depending on its adjustment of iovec or xdr_buf
lengths yet, as those are fixed up as necessary after the fact by
read-like operations and by nfs4svc_encode_compoundres.  However we do
have to update xdr->iov on read-like operations to prevent
xdr_reserve_space from messing with the already-fixed-up length of the
head.

When the attribute encoding fails partway through we have to undo the
length adjustments made so far.  We do it manually for now, but later
patches will add an xdr_truncate_encode() helper to handle cases like
this.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


27 May, 2014 2 commits

nfsd4: allow space for final error return · 5f4ab945

J. Bruce Fields authored Mar 07, 2014

This post-encoding check should be taking into account the need to
encode at least an out-of-space error to the following op (if any).
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


nfsd4: fix encoding of out-of-space replies · 07d1f802

J. Bruce Fields authored Mar 06, 2014

If nfsd4_check_resp_size() returns an error then we should really be
truncating the reply here, otherwise we may leave extra garbage at the
end of the rpc reply.

Also add a warning to catch any cases where our reply-size estimates may
be wrong in the case of a non-idempotent operation.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


23 May, 2014 9 commits

nfsd4: reserve head space for krb5 integ/priv info · 1802a678

J. Bruce Fields authored Jan 21, 2014

Currently if the nfs-level part of a reply would be too large, we'll
return an error to the client.  But if the nfs-level part fits and
leaves no room for krb5p or krb5i stuff, then we just drop the request
entirely.

That's no good.  Instead, reserve some slack space at the end of the
buffer and make sure we fail outright if we'd come close.

The slack space here is a massive overestimate of what's required; we
should probably try for a tighter limit at some point.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
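
The slack-space check amounts to simple arithmetic, sketched below under stated assumptions: `toy_reply` and `toy_reply_fits` are hypothetical names, and the sizes are left to the caller because the commit text does not give the actual slack value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reply accounting; field names are illustrative only. */
struct toy_reply {
	size_t capacity;	/* total reply buffer size */
	size_t used;		/* bytes of NFS-level reply encoded so far */
	size_t slack;		/* bytes held back for krb5i/krb5p data */
};

/* True if nbytes more of NFS-level reply fit without eating into the slack. */
static bool toy_reply_fits(const struct toy_reply *r, size_t nbytes)
{
	assert(r->slack <= r->capacity);
	assert(r->used <= r->capacity - r->slack);
	return nbytes <= r->capacity - r->slack - r->used;
}
```

With a capacity of 1000, 900 bytes used and 64 bytes of slack, 36 more bytes fit and 37 do not, so the encoder can fail with an error instead of dropping the request later.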


nfsd4: move proc_compound xdr encode init to helper · 2d124dfa

J. Bruce Fields authored Jan 15, 2014

Mechanical transformation with no change of behavior.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


nfsd4: tweak nfsd4_encode_getattr to take xdr_stream · d5184658

J. Bruce Fields authored Aug 26, 2013

Just change the nfsd4_encode_getattr api.  Not changing any code or
adding any new functionality yet.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


nfsd4: embed xdr_stream in nfsd4_compoundres · 4aea24b2

J. Bruce Fields authored Jan 15, 2014

This is a mechanical transformation with no change in behavior.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


nfsd4: decoding errors can still be cached and require space · e372ba60

J. Bruce Fields authored May 19, 2014

Currently a non-idempotent op reply may be cached if it fails in the
proc code but not if it fails at xdr decoding.  I doubt there are any
xdr-decoding-time errors that would make this a problem in practice, so
this probably isn't a serious bug.

The space estimates should also take into account space required for
encoding of error returns.  Again, not a practical problem, though it
would become one after future patches which will tighten the space
estimates.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


nfsd4: fix write reply size estimate · f34e432b

J. Bruce Fields authored May 16, 2014

The write reply also includes count and stable_how.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


nfsd4: read size estimate should include padding · 622f560e

J. Bruce Fields authored May 16, 2014

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd4: allow larger 4.1 session drc slots · 24906f32

J. Bruce Fields authored Mar 12, 2014

The client is actually asking for 2532 bytes. I suspect that's a
mistake. But maybe we can allow some more. In theory lock needs more
if it might return a maximum-length lockowner in the denied case.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


nfsd4: READ, READDIR, etc., are idempotent · 5b648699

J. Bruce Fields authored Mar 07, 2014

OP_MODIFIES_SOMETHING flags operations that we should be careful not to
initiate without being sure we have the buffer space to encode a reply.

None of these ops fall into that category.

We could probably remove a few more, but this isn't a very important
problem at least for ops whose reply size is easy to estimate.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


22 May, 2014 7 commits

nfsd: Only set PF_LESS_THROTTLE when really needed. · 8658452e

NeilBrown authored May 12, 2014

PF_LESS_THROTTLE has a very specific use case: to avoid deadlocks
and live-locks while writing to the page cache in a loop-back
NFS mount situation.

It therefore makes sense to *only* set PF_LESS_THROTTLE in this
situation.
We now know when a request came from the local-host so it could be a
loop-back mount.  We already know when we are handling write requests,
and when we are doing anything else.

So combine those two to allow nfsd to still be throttled (like any
other process) in every situation except when it is known to be
problematic.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
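
The combined condition can be sketched as follows. This is an illustration only: `TOY_LESS_THROTTLE`, `toy_request` and `toy_flags_for` are hypothetical, the flag value is made up, and the sketch does not show how nfsd actually sets or restores task flags.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bit; the kernel's PF_LESS_THROTTLE value is not used here. */
#define TOY_LESS_THROTTLE 0x1u

struct toy_request {
	bool local;	/* request arrived from the local host (cf. rq_local) */
	bool is_write;	/* request is a write */
};

/* Return the task flags to use while handling req. */
static unsigned int toy_flags_for(const struct toy_request *req,
				  unsigned int flags)
{
	assert(req);
	if (req->local && req->is_write)
		return flags | TOY_LESS_THROTTLE;
	return flags & ~TOY_LESS_THROTTLE;
}
```

Only a local write gets the relaxed throttling; remote writes and all non-write requests are throttled like any other process.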


SUNRPC: track whether a request is coming from a loop-back interface. · ef11ce24

NeilBrown authored May 12, 2014

If an incoming NFS request is coming from the local host, then
nfsd will need to perform some special handling.  So detect that
possibility and make the source visible in rq_local.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


SUNRPC: Fix a module reference leak in svc_handle_xprt · c789102c

Trond Myklebust authored May 18, 2014

If the accept() call fails, we need to put the module reference.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


NFSD: Ignore client's source port on RDMA transports · 16e4d93f

Chuck Lever authored May 19, 2014

An NFS/RDMA client's source port is meaningless for RDMA transports.
The transport layer typically sets the source port value on the
connection to a random ephemeral port.

Currently, NFS server administrators must specify the "insecure"
export option to enable clients to access exports via RDMA.

But this means NFS clients can access such an export via IP using an
ephemeral port, which may not be desirable.

This patch eliminates the need to specify the "insecure" export
option to allow NFS/RDMA clients access to an export.

BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=250
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


nfsd: remove nfsd4_free_slab · abf1135b

Christoph Hellwig authored May 21, 2014

No need for a kmem_cache_destroy wrapper in nfsd, just do proper
goto based unwinding.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


nfsd: Remove assignments inside conditions · d40aa337

Benoit Taine authored May 22, 2014

Assignments should not happen inside an if conditional, but in the line
before. This issue was reported by checkpatch.

The semantic patch that makes this change is as follows
(http://coccinelle.lip6.fr/):

// <smpl>

@@
identifier i1;
expression e1;
statement S;
@@
-if(!(i1 = e1)) S
+i1 = e1;
+if(!i1)
+S

// </smpl>

It has been tested by compilation.
Signed-off-by: Benoit Taine <benoit.taine@lip6.fr>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
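
For readers unfamiliar with SmPL, the effect of the semantic patch on ordinary C looks roughly like this. The two functions are hypothetical examples written for illustration, not code taken from nfsd.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Before: assignment buried in the condition (what checkpatch flags). */
static int copy_name_before(const char *src, char **out)
{
	char *p;

	assert(src && out);
	if (!(p = malloc(strlen(src) + 1)))
		return -1;
	strcpy(p, src);
	*out = p;
	return 0;
}

/* After: same behavior, with the assignment on its own line. */
static int copy_name_after(const char *src, char **out)
{
	char *p;

	assert(src && out);
	p = malloc(strlen(src) + 1);
	if (!p)
		return -1;
	strcpy(p, src);
	*out = p;
	return 0;
}
```

The rule matches `if(!(i1 = e1)) S` and splits it into the assignment followed by `if(!i1) S`, so the statement guarded by the condition is unchanged.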


Merge 3.15 bugfixes for 3.16 · f35ea0d4

J. Bruce Fields authored May 22, 2014

21 May, 2014 4 commits

nfsd4: fix delegation cleanup on error · cbf7a75b

J. Bruce Fields authored Mar 03, 2014

We're not cleaning up everything we need to on error.  In particular,
we're not removing our lease.  Among other problems this can cause the
struct nfs4_file used as fl_owner to be referenced after it has been
destroyed.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


NFSD: Don't clear SUID/SGID after root writing data · 368fe39b

Kinglong Mee authored Apr 19, 2014

We're clearing the SUID/SGID bits on write by hand in nfsd_vfs_write,
even though the subsequent vfs_writev() call will end up doing this for
us (through file system write methods eventually calling
file_remove_suid(), e.g., from __generic_file_aio_write).

So, remove the redundant nfsd code.

The only change in behavior is when the write is by root, in which case
we previously cleared SUID/SGID, but will now leave it alone.  The new
behavior is the behavior of every filesystem we've checked.

It seems better to be consistent with local filesystem behavior.  And
the security advantage seems limited as root could always restore these
bits by hand if it wanted.

SUID/SGID is not cleared after writing data with (root, local ext4),
   File: ‘test’
   Size: 0               Blocks: 0          IO Block: 4096   regular empty file
Device: 803h/2051d      Inode: 1200137     Links: 1
Access: (4777/-rwsrwxrwx)  Uid: (    0/    root)   Gid: (    0/    root)
Context: unconfined_u:object_r:admin_home_t:s0
Access: 2014-04-18 21:36:31.016029014 +0800
Modify: 2014-04-18 21:36:31.016029014 +0800
Change: 2014-04-18 21:36:31.026030285 +0800
  Birth: -
   File: ‘test’
   Size: 5               Blocks: 8          IO Block: 4096   regular file
Device: 803h/2051d      Inode: 1200137     Links: 1
Access: (4777/-rwsrwxrwx)  Uid: (    0/    root)   Gid: (    0/    root)
Context: unconfined_u:object_r:admin_home_t:s0
Access: 2014-04-18 21:36:31.016029014 +0800
Modify: 2014-04-18 21:36:31.040032065 +0800
Change: 2014-04-18 21:36:31.040032065 +0800
  Birth: -

With no_root_squash, (root, remote ext4), SUID/SGID are cleared,
   File: ‘test’
   Size: 0               Blocks: 0          IO Block: 262144 regular empty file
Device: 24h/36d Inode: 786439      Links: 1
Access: (4777/-rwsrwxrwx)  Uid: ( 1000/    test)   Gid: ( 1000/    test)
Context: system_u:object_r:nfs_t:s0
Access: 2014-04-18 21:45:32.155805097 +0800
Modify: 2014-04-18 21:45:32.155805097 +0800
Change: 2014-04-18 21:45:32.168806749 +0800
  Birth: -
   File: ‘test’
   Size: 5               Blocks: 8          IO Block: 262144 regular file
Device: 24h/36d Inode: 786439      Links: 1
Access: (0777/-rwxrwxrwx)  Uid: ( 1000/    test)   Gid: ( 1000/    test)
Context: system_u:object_r:nfs_t:s0
Access: 2014-04-18 21:45:32.155805097 +0800
Modify: 2014-04-18 21:45:32.184808783 +0800
Change: 2014-04-18 21:45:32.184808783 +0800
  Birth: -
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


nfsd4: warn on finding lockowner without stateid's · 27b11428

J. Bruce Fields authored May 08, 2014

The current code assumes a one-to-one lockowner<->lock stateid
correspondence.

Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


nfsd4: remove lockowner when removing lock stateid · a1b8ff4c

J. Bruce Fields authored May 20, 2014

The nfsv4 state code has always assumed a one-to-one correspondence
between lock stateid's and lockowners even if it appears not to in some
places.

We may actually change that, but for now when FREE_STATEID releases a
lock stateid it also needs to release the parent lockowner.

Symptoms were a subsequent LOCK crashing in find_lockowner_str when it
calls same_lockowner_ino on a lockowner that unexpectedly has an empty
so_stateids list.

Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


15 May, 2014 1 commit

nfsd4: fix corruption on setting an ACL. · 5513a510

J. Bruce Fields authored May 14, 2014

As of 06f9cc12 "nfsd4: don't create unnecessary mask acl", any
non-trivial ACL will be left with an uninitialized entry, and a trivial
ACL may write one entry beyond what's allocated.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


08 May, 2014 3 commits

NFSD: Get rid of empty function nfs4_state_init · 9fa1959e

Kinglong Mee authored Apr 08, 2014

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


NFSD: Use simple_read_from_buffer for copying data to userspace · f3e41ec5

Kinglong Mee authored Apr 08, 2014

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

SUNRPC: Fix printk that is not only for nfsd · ecca063b

Kinglong Mee authored Apr 15, 2014

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
