- 15 Jan, 2020 40 commits
-
-
Trond Myklebust authored
Add a mount option 'softreval' that allows attribute revalidation 'getattr' calls to time out, and causes them to fall back to using the cached attributes. The use case for this option is for ensuring that we can still (slowly) traverse paths and use cached information even when the server is down. Once the server comes back up again, the getattr calls start succeeding, and the caches will revalidate as usual. The 'softreval' mount option is automatically enabled if you have specified 'softerr'. It can be turned off using the options 'nosoftreval', or 'hard'. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Trond Myklebust authored
If we've already revalidated the inode once then don't distrust the access cache unless the NFS_INO_INVALID_ACCESS flag is actually set. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Trond Myklebust authored
The 'hdr->good_bytes' is defined as the number of bytes we expect to read or write starting at offset hdr->io_start. In the case of a partial read/write we may end up adjusting hdr->args.offset and hdr->args.count to skip I/O for data that was already read/written, and so we must ensure the calculation takes that into account. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Trond Myklebust authored
If we're resending a write due to a short read or write, ensure we reset the reply count to zero. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Trond Myklebust authored
On exit from nfs_do_access(), record the mask representing the requested permissions, as well as the server-supplied set of access rights for this user. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Trond Myklebust authored
Trace layout errors for pNFS/flexfiles on read/write/commit operations. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Trond Myklebust authored
Clean up the generic file commit tracepoints to use a 64-bit value for the verifier, and to display the pNFS filehandle, if it exists. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Trond Myklebust authored
Clean up the generic writeback tracepoints so they do pass the full structures as arguments. Also ensure we report the number of bytes actually written. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Trond Myklebust authored
Clean up the generic file read tracepoints so they do pass the full structures as arguments. Also ensure we report the number of bytes actually read. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Trond Myklebust authored
If the attempt to do pNFS fails, then record what action we take to recover (resend, reset to pnfs or reset to mds). Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Trond Myklebust authored
Casting a negative value to an unsigned long is not the same as converting it to its absolute value. Fixes: 96650e2e ("NFS: Fix show_nfs_errors macros again") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Trond Myklebust authored
Ensure we always return the number of bytes read/written. Also display the pnfs filehandle if it is in use. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Trond Myklebust authored
Instead of making assumptions about the commit verifier contents, change the commit code to ensure we always check that the verifier was set by the XDR code. Fixes: f54bcf2e ("pnfs: Prepare for flexfiles by pulling out common code") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Trond Myklebust authored
Don't clear the NFS_CONTEXT_RESEND_WRITES flag until after calling nfs_commit_inode(). Otherwise, if nfs_commit_inode() returns an error, we end up with dirty pages in the page cache, but no tag to tell us that those pages need resending. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Trond Myklebust authored
Remove gss_mech_list_pseudoflavors() and its callers. This is part of an unused API, and could leak an RCU reference if it were ever called. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Trond Myklebust authored
If a write or commit failed, and the mapping sees a fatal error, we need to revalidate the contents of that mapping. Fixes: 06c9fdf3 ("NFS: On fatal writeback errors, we need to call nfs_inode_remove_request()") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Trond Myklebust authored
If we suffer a fatal error upon writing a file, which causes us to need to revalidate the entire mapping, then we should also revalidate the file size. Fixes: d2ceb7e5 ("NFS: Don't use page_file_mapping after removing the page") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Chuck Lever authored
Clean up: This simplifies the logic in rpcrdma_post_recvs. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Chuck Lever authored
To safely get rid of all rpcrdma_reps from a particular connection instance, xprtrdma has to wait until each of those reps is finished being used. A rep may be backing the rq_rcv_buf of an RPC that has just completed, for example. Since it is safe to invoke rpcrdma_rep_destroy() only in the Receive completion handler, simply mark reps remaining in the rb_all_reps list after the transport is drained. These will then be deleted as rpcrdma_post_recvs pulls them off the rep free list. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Chuck Lever authored
This reduces the hardware and memory footprint of an unconnected transport. At some point in the future, transport reconnect will allow resolving the destination IP address through a different device. The current change enables reps for the new connection to be allocated on whichever NUMA node the new device affines to after a reconnect. Note that this does not destroy _all_ the transport's reps... there will be a few that are still part of a running RPC completion. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Chuck Lever authored
Currently the underlying RDMA device is chosen at transport set-up time. But it will soon be at connect time instead. The maximum size of a transport header is based on device capabilities. Thus transport header buffers have to be allocated _after_ the underlying device has been chosen (via address and route resolution); ie, in the connect worker. Thus, move the allocation of transport header buffers to the connect worker, after the point at which the underlying RDMA device has been chosen. This also means the RDMA device is available to do a DMA mapping of these buffers at connect time, instead of in the hot I/O path. Make that optimization as well. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Chuck Lever authored
Refactor: Perform the "is supported" check in rpcrdma_ep_create() instead of in rpcrdma_ia_open(). frwr_open() is where most of the logic to query device attributes is already located. The current code displays a redundant error message when the device does not support FRWR. As an additional clean-up, this patch removes the extra message. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Chuck Lever authored
To support device hotplug and migrating a connection between devices of different capabilities, we have to guarantee that all in-kernel devices can support the same max NFS payload size (1 megabyte). This means that possibly one or two in-tree devices are no longer supported for NFS/RDMA because they cannot support 1MB rsize/wsize. The only one I confirmed was cxgb3, but it has already been removed from the kernel. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Chuck Lever authored
Clean up: there is no need to keep two copies of the same value. Also, in subsequent patches, rpcrdma_ep_create() will be called in the connect worker rather than at set-up time. Minor fix: Initialize the transport's sendctx to the value based on the capabilities of the underlying device, not the maximum setting. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Chuck Lever authored
The size of the sendctx queue depends on the value stored in ia->ri_max_send_sges. This value is determined by querying the underlying device. Eventually, rpcrdma_ia_open() and rpcrdma_ep_create() will be called in the connect worker rather than at transport set-up time. The underlying device will not have been chosen device set-up time. The sendctx queue will thus have to be created after the underlying device has been chosen via address and route resolution; in other words, in the connect worker. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Chuck Lever authored
Clean-up. The max_send_sge value also happens to be stored in ep->rep_attr. Let's keep just a single copy. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Colin Ian King authored
Currently the allocation of buf is not being null checked and a null pointer dereference can occur when the memory allocation fails. Fix this by adding a check and returning -ENOMEM. Addresses-Coverity: ("Dereference null return") Fixes: 6d972518b821 ("NFS: Add fs_context support.") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Geert Uytterhoeven authored
If CONFIG_SWAP=n, it does not make much sense to offer the user the option to enable support for swapping over NFS, as that will still fail at run time: # swapon /swap swapon: /swap: swapon failed: Function not implemented Fix this by adding a dependency on CONFIG_SWAP. Fixes: a564b8f0 ("nfs: enable swap on NFS") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Julia Lawall authored
The empty_iov structure is only copied into another structure, so make it const. The opportunity for this change was found using Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Murphy Zhou authored
swapon over NFS does not go through generic_swapfile_activate code path when setting up extents. This makes holes in NFS swapfiles possible which is not expected for swapon. Signed-off-by: Murphy Zhou <jencce.kernel@gmail.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Chuck Lever authored
The xprtrdma connect logic can return -EPROTO if the underlying device or network path does not support RDMA. This can happen after a device removal/insertion. - When SOFTCONN is set, EPROTO is a permanent error. - When SOFTCONN is not set, EPROTO is treated as a temporary error. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Chuck Lever authored
This seems to be a somewhat common issue with Kerberos NFSv4.0 set-ups. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Chuck Lever authored
Try to capture the reason for the writeback path tagging an error on a page. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Chuck Lever authored
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
zhengbin authored
In nfs3_proc_lookup, if nfs_alloc_fattr fails, will only print "NFS call lookup". This may be confusing, move dprintk after nfs_alloc_fattr. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
zhengbin authored
Fixes coccicheck warning: fs/nfs/nfs4state.c:1138:2-3: Unneeded semicolon fs/nfs/nfs4proc.c:6862:2-3: Unneeded semicolon fs/nfs/nfs4proc.c:8629:2-3: Unneeded semicolon Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Arnd Bergmann authored
On 32-bit architectures, xdr_encode_nfstime4() needlessly truncates timestamps to a 32-bit value in the range between year 1902 and 2038. Change it to use 'struct timespec64' to allow the entire range of values supported by the server. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Arnd Bergmann authored
For NFSv2 and NFSv3, timestamps are stored using 32-bit entities and overflow in y2038. For historic reasons we truncate the 64-bit timestamps by converting from a timespec64 to a timespec first. Remove this unnecessary conversion step and do the truncation in the final functions that take a timestamp. This is transparent to users, but avoids one of the last uses of 'timespec' and lets us remove it later. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Arnd Bergmann authored
nfs currently behaves differently on 32-bit and 64-bit kernels regarding the on-disk format of nfs_fscache_inode_auxdata. That format should really be the same on any kernel, and we should avoid the 'timespec' type in order to remove that from the kernel later on. Using plain 'timespec64' would not be good here, since that includes implied padding and would possibly leak kernel stack data to the on-disk format on 32-bit architectures. struct __kernel_timespec would work as a replacement, but open-coding the two struct members in nfs_fscache_inode_auxdata makes it more obvious what's going on here, and keeps the current format for 64-bit architectures. Cc: David Howells <dhowells@redhat.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Arnd Bergmann authored
Push down the use of timespec64 into NFS nfs_fattr, to avoid needless conversions, and get closer to having 64-bit time_t support on 32-bit NFSv4 and removing some old interfaces from the kernel. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-