- 28 Oct, 2013 25 commits
-
-
Chuck Lever authored
Currently the Linux NFS client ignores the operation status code for the RELEASE_LOCKOWNER operation. Like NFSv3's UMNT operation, RELEASE_LOCKOWNER is a courtesy to help servers manage their resources, and the outcome is not consequential for the client. During a migration, a server may report NFS4ERR_LEASE_MOVED, in which case the client really should retry, since typically LEASE_MOVED has nothing to do with the current operation, but does prevent it from going forward. Also, it's important for a client to respond as soon as possible to a moved lease condition, since the client's lease could expire on the destination without further action by the client. NFS4ERR_DELAY is not included in the list of valid status codes for RELEASE_LOCKOWNER in RFC 3530bis. However, rfc3530-migration-update does permit migration-capable servers to return DELAY to clients, but only in the context of an ongoing migration. In this case the server has frozen lock state in preparation for migration, and a client retry would help the destination server purge unneeded state once migration recovery is complete. Interestly, NFS4ERR_MOVED is not valid for RELEASE_LOCKOWNER, even though lock owners can be migrated with Transparent State Migration. Note that RFC 3530bis section 9.5 includes RELEASE_LOCKOWNER in the list of operations that renew a client's lease on the server if they succeed. Now that our client pays attention to the operation's status code, we can note that renewal appropriately. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Chuck Lever authored
Trigger lease-moved recovery when a request returns NFS4ERR_LEASE_MOVED. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Chuck Lever authored
A migration on the FSID in play for the current NFS operation is reported via the error status code NFS4ERR_MOVED. "Lease moved" means that a migration has occurred on some other FSID than the one for the current operation. It's a signal that the client should take action immediately to handle a migration that it may not have noticed otherwise. This is so that the client's lease does not expire unnoticed on the destination server. In NFSv4.0, a moved lease is reported with the NFS4ERR_LEASE_MOVED error status code. To recover from NFS4ERR_LEASE_MOVED, check each FSID for that server to see if it is still present. Invoke nfs4_try_migration() if the FSID is no longer present on the server. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Chuck Lever authored
Introduce a mechanism for probing a server to determine if an FSID is present or absent. The on-the-wire compound is different between minor version 0 and 1. Minor version 0 appends a RENEW operation to identify which client ID is probing. Minor version 1 has a SEQUENCE operation in the compound which effectively carries the same information. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Chuck Lever authored
When a server returns NFS4ERR_MOVED during a delegation recall, trigger the new migration recovery logic in the state manager. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Chuck Lever authored
When a server returns NFS4ERR_MOVED, trigger the new migration recovery logic in the state manager. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Chuck Lever authored
I'm going to use this exit label also for migration recovery failures. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Chuck Lever authored
Clean up. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Chuck Lever authored
Migration recovery and state recovery must be serialized, so handle both in the state manager thread. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Chuck Lever authored
NFS_SB() returns the pointer to an nfs_server struct, given a pointer to a super_block. But we have no way to go back the other way. Add a super_block backpointer field so that, given an nfs_server struct, it is easy to get to the filesystem's root dentry. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Chuck Lever authored
The nfs4_proc_fs_locations() function is invoked during referral processing to perform a GETATTR(fs_locations) on an object's parent directory in order to discover the target of the referral. It performs a LOOKUP in the compound, so the client needs to know the parent's file handle a priori. Unfortunately this function is not adequate for handling migration recovery. We need to probe fs_locations information on an FSID, but there's no parent directory available for many operations that can return NFS4ERR_MOVED. Another subtlety: recovering from NFS4ERR_LEASE_MOVED is a process of walking over a list of known FSIDs that reside on the server, and probing whether they have migrated. Once the server has detected that the client has probed all migrated file systems, it stops returning NFS4ERR_LEASE_MOVED. A minor version zero server needs to know what client ID is requesting fs_locations information so it can clear the flag that forces it to continue returning NFS4ERR_LEASE_MOVED. This flag is set per client ID and per FSID. However, the client ID is not an argument of either the PUTFH or GETATTR operations. Later minor versions have client ID information embedded in the compound's SEQUENCE operation. Therefore, by convention, minor version zero clients send a RENEW operation in the same compound as the GETATTR(fs_locations), since RENEW's one argument is a clientid4. This allows a minor version zero server to identify correctly the client that is probing for a migration. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Chuck Lever authored
Allow code in nfsv4.ko to use _nfs_display_fhandle(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Chuck Lever authored
The differences between minor version 0 and minor version 1 migration will be abstracted by the addition of a set of migration recovery ops. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Chuck Lever authored
Introduce functions that can walk through an array of returned fs_locations information and connect a transport to one of the destination servers listed therein. Note that NFS minor version 1 introduces "fs_locations_info" which extends the locations array sorting criteria available to clients. This is not supported yet. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Chuck Lever authored
New function nfs4_update_server() moves an nfs_server to a different nfs_client. This is done as part of migration recovery. Though it may be appealing to think of them as the same thing, migration recovery is not the same as following a referral. For a referral, the client has not descended into the file system yet: it has no nfs_server, no super block, no inodes or open state. It is enough to simply instantiate the nfs_server and super block, and perform a referral mount. For a migration, however, we have all of those things already, and they have to be moved to a different nfs_client. No local namespace changes are needed here. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
Add an RPC client API to redirect an rpc_clnt's transport from a source server to a destination server during a migration event. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> [ cel: forward ported to 3.12 ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Chuck Lever authored
The rpc_client_register() helper was added in commit e73f4cc0, "SUNRPC: split client creation routine into setup and registration," Mon Jun 24 11:52:52 2013. In a subsequent patch, I'd like to invoke rpc_client_register() from a context where a struct rpc_create_args is not available. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Weston Andros Adamson authored
Cached opens have already been handled by _nfs4_opendata_reclaim_to_nfs4_state and can safely skip being reprocessed, but must still call update_open_stateid to make sure that all active fmodes are recovered. Signed-off-by: Weston Andros Adamson <dros@netapp.com> Cc: stable@vger.kernel.org # 3.7.x: f494a607: NFSv4: fix NULL dereference Cc: stable@vger.kernel.org # 3.7.x: a43ec98b: NFSv4: don't fail on missin Cc: stable@vger.kernel.org # 3.7.x Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
Currently, if the call to nfs_refresh_inode fails, then we end up leaking a reference count, due to the call to nfs4_get_open_state. While we're at it, replace nfs4_get_open_state with a simple call to atomic_inc(); there is no need to do a full lookup of the struct nfs_state since it is passed as an argument in the struct nfs4_opendata, and is already assigned to the variable 'state'. Cc: stable@vger.kernel.org # 3.7.x: a43ec98b: NFSv4: don't fail on missing Cc: stable@vger.kernel.org # 3.7.x Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Weston Andros Adamson authored
This is an unneeded check that could cause the client to fail to recover opens. Signed-off-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Weston Andros Adamson authored
_nfs4_opendata_reclaim_to_nfs4_state doesn't expect to see a cached open CLAIM_PREVIOUS, but this can happen. An example is when there are RDWR openers and RDONLY openers on a delegation stateid. The recovery path will first try an open CLAIM_PREVIOUS for the RDWR openers, this marks the delegation as not needing RECLAIM anymore, so the open CLAIM_PREVIOUS for the RDONLY openers will not actually send an rpc. The NULL dereference is due to _nfs4_opendata_reclaim_to_nfs4_state returning PTR_ERR(rpc_status) when !rpc_done. When the open is cached, rpc_done == 0 and rpc_status == 0, thus _nfs4_opendata_reclaim_to_nfs4_state returns NULL - this is unexpected by callers of nfs4_opendata_to_nfs4_state(). This can be reproduced easily by opening the same file two times on an NFSv4.0 mount with delegations enabled, once as RDWR and once as RDONLY then sleeping for a long time. While the files are held open, kick off state recovery and this NULL dereference will be hit every time. An example OOPS: [ 65.003602] BUG: unable to handle kernel NULL pointer dereference at 00000000 00000030 [ 65.005312] IP: [<ffffffffa037d6ee>] __nfs4_close+0x1e/0x160 [nfsv4] [ 65.006820] PGD 7b0ea067 PUD 791ff067 PMD 0 [ 65.008075] Oops: 0000 [#1] SMP [ 65.008802] Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache snd_ens1371 gameport nfsd snd_rawmidi snd_ac97_codec ac97_bus btusb snd_seq snd _seq_device snd_pcm ppdev bluetooth auth_rpcgss coretemp snd_page_alloc crc32_pc lmul crc32c_intel ghash_clmulni_intel microcode rfkill nfs_acl vmw_balloon serio _raw snd_timer lockd parport_pc e1000 snd soundcore parport i2c_piix4 shpchp vmw _vmci sunrpc ata_generic mperf pata_acpi mptspi vmwgfx ttm scsi_transport_spi dr m mptscsih mptbase i2c_core [ 65.018684] CPU: 0 PID: 473 Comm: 192.168.10.85-m Not tainted 3.11.2-201.fc19 .x86_64 #1 [ 65.020113] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/31/2013 [ 65.022012] task: ffff88003707e320 ti: ffff88007b906000 task.ti: ffff88007b906000 [ 65.023414] RIP: 0010:[<ffffffffa037d6ee>] [<ffffffffa037d6ee>] __nfs4_close+0x1e/0x160 [nfsv4] [ 65.025079] RSP: 0018:ffff88007b907d10 EFLAGS: 00010246 [ 65.026042] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 65.027321] RDX: 0000000000000050 RSI: 0000000000000001 RDI: 0000000000000000 [ 65.028691] RBP: ffff88007b907d38 R08: 0000000000016f60 R09: 0000000000000000 [ 65.029990] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001 [ 65.031295] R13: 0000000000000050 R14: 0000000000000000 R15: 0000000000000001 [ 65.032527] FS: 0000000000000000(0000) GS:ffff88007f600000(0000) knlGS:0000000000000000 [ 65.033981] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 65.035177] CR2: 0000000000000030 CR3: 000000007b27f000 CR4: 00000000000407f0 [ 65.036568] Stack: [ 65.037011] 0000000000000000 0000000000000001 ffff88007b907d90 ffff88007a880220 [ 65.038472] ffff88007b768de8 ffff88007b907d48 ffffffffa037e4a5 ffff88007b907d80 [ 65.039935] ffffffffa036a6c8 ffff880037020e40 ffff88007a880000 ffff880037020e40 [ 65.041468] Call Trace: [ 65.042050] [<ffffffffa037e4a5>] nfs4_close_state+0x15/0x20 [nfsv4] [ 65.043209] [<ffffffffa036a6c8>] nfs4_open_recover_helper+0x148/0x1f0 [nfsv4] [ 65.044529] [<ffffffffa036a886>] nfs4_open_recover+0x116/0x150 [nfsv4] [ 65.045730] [<ffffffffa036d98d>] nfs4_open_reclaim+0xad/0x150 [nfsv4] [ 65.046905] [<ffffffffa037d979>] nfs4_do_reclaim+0x149/0x5f0 [nfsv4] [ 65.048071] [<ffffffffa037e1dc>] nfs4_run_state_manager+0x3bc/0x670 [nfsv4] [ 65.049436] [<ffffffffa037de20>] ? nfs4_do_reclaim+0x5f0/0x5f0 [nfsv4] [ 65.050686] [<ffffffffa037de20>] ? nfs4_do_reclaim+0x5f0/0x5f0 [nfsv4] [ 65.051943] [<ffffffff81088640>] kthread+0xc0/0xd0 [ 65.052831] [<ffffffff81088580>] ? insert_kthread_work+0x40/0x40 [ 65.054697] [<ffffffff8165686c>] ret_from_fork+0x7c/0xb0 [ 65.056396] [<ffffffff81088580>] ? insert_kthread_work+0x40/0x40 [ 65.058208] Code: 5c 41 5d 5d c3 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 41 57 41 89 f7 41 56 41 89 ce 41 55 41 89 d5 41 54 53 48 89 fb <4c> 8b 67 30 f0 41 ff 44 24 44 49 8d 7c 24 40 e8 0e 0a 2d e1 44 [ 65.065225] RIP [<ffffffffa037d6ee>] __nfs4_close+0x1e/0x160 [nfsv4] [ 65.067175] RSP <ffff88007b907d10> [ 65.068570] CR2: 0000000000000030 [ 65.070098] ---[ end trace 0d1fe4f5c7dd6f8b ]--- Cc: <stable@vger.kernel.org> #3.7+ Signed-off-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
The current caching model calls for the security label to be set on first lookup and/or on any subsequent label changes. There is no need to do it as part of an open reclaim. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Jeff Layton authored
nfs_parse_mount_options returns 0 on error, not -errno. Reported-by: Karel Zak <kzak@redhat.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Jeff Layton authored
Reported-by: Eric Doutreleau <edoutreleau@genoscope.cns.fr> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Andy Adamson authored
As of commit 5d422301 we no longer zero the state. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
- 01 Oct, 2013 12 commits
-
-
Trond Myklebust authored
Currently, we go directly to call_transmit which sends us to call_status on error. If we know that the connect attempt failed, we should rather just jump straight back to call_bind and call_connect. Ditto for EAGAIN, except do not delay. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
Now that we clear the rq_bytes_sent field on unlock, we don't need to set it on lock, so we just set it once when initialising the request. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
A retransmit should be when you successfully transmit an RPC call to the server a second time. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
The spec states that the client should not resend requests because the server will disconnect if it needs to drop an RPC request. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
Otherwise the tests of req->rq_bytes_sent in xprt_prepare_transmit will fail if we're dealing with a resend. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
We're using the request connect_cookie to track whether or not a request was successfully transmitted on the current transport connection or not. For that reason we should ensure that it is only set after we've successfully transmitted the request. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
For NFSv4 we want to avoid retransmitting RPC calls unless the TCP connection breaks. However we still want to detect TCP connection breakage as soon as possible. Do this by setting the keepalive option with the idle timeout and count set to the 'timeo' and 'retrans' mount options. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
In nfs4_proc_getlk(), when some error causes a retry of the call to _nfs4_proc_getlk(), we can end up with Oopses of the form BUG: unable to handle kernel NULL pointer dereference at 0000000000000134 IP: [<ffffffff8165270e>] _raw_spin_lock+0xe/0x30 <snip> Call Trace: [<ffffffff812f287d>] _atomic_dec_and_lock+0x4d/0x70 [<ffffffffa053c4f2>] nfs4_put_lock_state+0x32/0xb0 [nfsv4] [<ffffffffa053c585>] nfs4_fl_release_lock+0x15/0x20 [nfsv4] [<ffffffffa0522c06>] _nfs4_proc_getlk.isra.40+0x146/0x170 [nfsv4] [<ffffffffa052ad99>] nfs4_proc_lock+0x399/0x5a0 [nfsv4] The problem is that we don't clear the request->fl_ops after the first try and so when we retry, nfs4_set_lock_state() exits early without setting the lock stateid. Regression introduced by commit 70cc6487 (locks: make ->lock release private data before returning in GETLK case) Reported-by: Weston Andros Adamson <dros@netapp.com> Reported-by: Jorge Mora <mora@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: <stable@vger.kernel.org> #2.6.22+
-
git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds authored
Pull NFS client bugfixes from Trond Myklebust: - Stable fix for Oopses in the pNFS files layout driver - Fix a regression when doing a non-exclusive file create on NFSv4.x - NFSv4.1 security negotiation fixes when looking up the root filesystem - Fix a memory ordering issue in the pNFS files layout driver * tag 'nfs-for-3.12-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFS: Give "flavor" an initial value to fix a compile warning NFSv4.1: try SECINFO_NO_NAME flavs until one works NFSv4.1: Ensure memory ordering between nfs4_ds_connect and nfs4_fl_prepare_ds NFSv4.1: nfs4_fl_prepare_ds - fix bugs when the connect attempt fails NFSv4: Honour the 'opened' parameter in the atomic_open() filesystem method
-
- 30 Sep, 2013 3 commits
-
-
Linus Torvalds authored
Merge misc fixes from Andrew Morton. * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (22 commits) pidns: fix free_pid() to handle the first fork failure ipc,msg: prevent race with rmid in msgsnd,msgrcv ipc/sem.c: update sem_otime for all operations mm/hwpoison: fix the lack of one reference count against poisoned page mm/hwpoison: fix false report on 2nd attempt at page recovery mm/hwpoison: fix test for a transparent huge page mm/hwpoison: fix traversal of hugetlbfs pages to avoid printk flood block: change config option name for cmdline partition parsing mm/mlock.c: prevent walking off the end of a pagetable in no-pmd configuration mm: avoid reinserting isolated balloon pages into LRU lists arch/parisc/mm/fault.c: fix uninitialized variable usage include/asm-generic/vtime.h: avoid zero-length file nilfs2: fix issue with race condition of competition between segments for dirty blocks Documentation/kernel-parameters.txt: replace kernelcore with Movable mm/bounce.c: fix a regression where MS_SNAP_STABLE (stable pages snapshotting) was ignored kernel/kmod.c: check for NULL in call_usermodehelper_exec() ipc/sem.c: synchronize the proc interface ipc/sem.c: optimize sem_lock() ipc/sem.c: fix race in sem_lock() mm/compaction.c: periodically schedule when freeing pages ...
-
Oleg Nesterov authored
"case 0" in free_pid() assumes that disable_pid_allocation() should clear PIDNS_HASH_ADDING before the last pid goes away. However this doesn't happen if the first fork() fails to create the child reaper which should call disable_pid_allocation(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: "Serge E. Hallyn" <serge@hallyn.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Davidlohr Bueso authored
This fixes a race in both msgrcv() and msgsnd() between finding the msg and actually dealing with the queue, as another thread can delete shmid underneath us if we are preempted before acquiring the kern_ipc_perm.lock. Manfred illustrates this nicely: Assume a preemptible kernel that is preempted just after msq = msq_obtain_object_check(ns, msqid) in do_msgrcv(). The only lock that is held is rcu_read_lock(). Now the other thread processes IPC_RMID. When the first task is resumed, then it will happily wait for messages on a deleted queue. Fix this by checking for if the queue has been deleted after taking the lock. Signed-off-by: Davidlohr Bueso <davidlohr@hp.com> Reported-by: Manfred Spraul <manfred@colorfullife.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: <stable@vger.kernel.org> [3.11] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-