- 25 Oct, 2011 8 commits
-
-
Stanislav Kinsbursky authored
We don't need this code since rpcbind clients are creating during RPC service creation. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Stanislav Kinsbursky authored
We have to call svc_rpcb_cleanup() explicitly from nfsd_last_thread() since this function is registered as service shutdown callback and thus nobody else will done it for us. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Stanislav Kinsbursky authored
svc_unregister() call have to be removed from svc_destroy() since it will be called in sv_shutdown callback. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Stanislav Kinsbursky authored
New function ("svc_uses_rpcbind") will be used to detect, that new service will send portmapper register calls. For such services we will create rpcbind clients and remove all stale portmap registrations. Also, svc_rpcb_cleanup() will be set as sv_shutdown callback for such services in case of this field wasn't initialized earlier. This will allow to destroy rpcbind clients when no other users of them left. Note: Currently, any creating service will be detected as portmap user. Probably, this is wrong. But now it depends on program versions "vs_hidden" flag. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Stanislav Kinsbursky authored
This helpers will be used only for those services, that will send portmapper registration calls. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Stanislav Kinsbursky authored
All is simple: we just increase users counter if rpcbind clients has been created already. Otherwise we create them and set users counter to 1. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Stanislav Kinsbursky authored
v6: 1) added write memory barrier to rpcb_set_local to make sure, that rpcbind clients become valid before rpcb_users assignment 2) explicitly set rpcb_users to 1 instead of incrementing it (looks clearer from my pow). v5: fixed races with rpcb_users in rpcb_get_local() This helpers will be used for dynamical creation and destruction of rpcbind clients. Variable rpcb_users is actually a counter of lauched RPC services. If rpcbind clients has been created already, then we just increase rpcb_users. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
NeilBrown authored
The sunrpc layer keeps a cache of recently used credentials and 'unx_match' is used to find the credential which matches the current process. However unx_match allows a match when the cached credential has extra groups at the end of uc_gids list which are not in the process group list. So if a process with a list of (say) 4 group accesses a file and gains access because of the last group in the list, then another process with the same uid and gid, and a gid list being the first tree of the gids of the original process tries to access the file, it will be granted access even though it shouldn't as the wrong rpc credential will be used. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org
-
- 20 Oct, 2011 1 commit
-
-
Malahal Naineni authored
As soon as the nfs_client gets created, its cl_rpcclient is set to ERR_PTR(-EINVAL). The rpc client structure is allocated later. Check if the client is ready before using the cl_rpcclient pointer. Signed-off-by: Malahal Naineni <malahal@us.ibm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
- 19 Oct, 2011 6 commits
-
-
Trond Myklebust authored
We don't need a mempool in order to guarantee reliable NFS read performance. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
Don't rely on the PageError flag to tell us if one of the partial reads of the page failed. Instead, replace that with a dedicated flag in the struct nfs_page. Then clean out redundant uses of the PageError flag: the VM no longer checks it for reads. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
The generic file read code does that for us anyway. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
It can trivially be replaced with rpc_restart_call_prepare. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
- 18 Oct, 2011 20 commits
-
-
Trond Myklebust authored
Both LOOKUP and OPEN operations may return NFS4ERR_BADNAME if we send a an invalid name as a filename argument. As far as the application is concerned, it just has to know that the file doesn't exist, and so ENOENT would be the appropriate reply. We should only return EINVAL if the filename is being used to _create_ a new object on the remote filesystem. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
...and also remove the associated nfs_v4_clientops entry. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
It is only used internally by the RPC code. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
rpc_sockaddr2uaddr is only used by net/sunrpc/rpcb_clnt.c, where it is used in a non-blockable context in at least one case. Add non-blocking capability by adding a gfp_t argument Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
H Hartley Sweeten authored
commit ae50c0b5 "pnfs: client stats" added additional information to the output of /proc/self/mountstats. The new functions introduced are only used in this file and should be marked static. If CONFIG_NFS_V4_1 is not defined, empty stub functions are used. If CONFIG_NFS_V4 is not defined these stub functions are not used at all. Adding static for the functions results in compile warnings: fs/nfs/super.c:743: warning: 'show_sessions' defined but not used fs/nfs/super.c:756: warning: 'show_pnfs' defined but not used Fix this by adding a #ifdef CONFIG_NFS_V4 guard around the two show_ functions. Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Peng Tao authored
We should check if the sector is already initialized before trying to grab the page from page cache. Otherwise when two pages of the same block are written back by two threads each calling from writepage_locked, it can cause deadlock like bellow. [ 1080.972099] INFO: task kswapd0:25 blocked for more than 120 seconds. [ 1080.972377] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 1080.972812] kswapd0 D ffff88000c4926c0 0 25 2 0x00000000 [ 1080.972816] ffff88000df276b0 0000000000000046 ffff88000df27640 ffffffff81013ba7 [ 1080.972821] ffff88000c492310 ffff88000df27fd8 ffff88000df27fd8 00000000001d3440 [ 1080.972824] ffff88000c378000 ffff88000c492310 ffff8800175d3d40 ffff880017fc75a8 [ 1080.972828] Call Trace: [ 1080.972860] [<ffffffff81013ba7>] ? read_tsc+0x9/0x19 [ 1080.972877] [<ffffffff810e0b23>] ? lock_page+0x2b/0x2b [ 1080.972899] [<ffffffff81475a1d>] io_schedule+0x63/0x7e [ 1080.972902] [<ffffffff810e0b31>] sleep_on_page+0xe/0x12 [ 1080.972905] [<ffffffff81475fe8>] __wait_on_bit_lock+0x46/0x8f [ 1080.972916] [<ffffffff810822d7>] ? lock_release_holdtime.part.7+0x6b/0x72 [ 1080.972919] [<ffffffff810e0af6>] __lock_page+0x66/0x68 [ 1080.972928] [<ffffffff81072705>] ? autoremove_wake_function+0x3d/0x3d [ 1080.972932] [<ffffffff810e0b1f>] lock_page+0x27/0x2b [ 1080.972934] [<ffffffff810e0bcf>] find_lock_page+0x34/0x57 [ 1080.972937] [<ffffffff810e1738>] find_or_create_page+0x34/0x8a [ 1080.972947] [<ffffffffa034245b>] bl_write_pagelist+0x205/0x6da [blocklayoutdriver] [ 1080.972951] [<ffffffffa034145d>] ? bl_free_lseg+0x38/0x38 [blocklayoutdriver] [ 1080.972995] [<ffffffffa02e27b9>] ? nfs_write_rpcsetup+0x118/0x123 [nfs] [ 1080.973033] [<ffffffffa030246b>] pnfs_generic_pg_writepages+0x10b/0x1f4 [nfs] [ 1080.973089] [<ffffffffa02deaae>] nfs_pageio_doio+0x1a/0x43 [nfs] [ 1080.973098] [<ffffffffa02df035>] nfs_pageio_complete+0x16/0x2d [nfs] [ 1080.973108] [<ffffffffa02e2d8f>] nfs_writepage_locked+0xa0/0xbf [nfs] [ 1080.973119] [<ffffffffa02e36a1>] nfs_writepage+0x16/0x2b [nfs] [ 1080.973122] [<ffffffff810e8762>] ? clear_page_dirty_for_io+0x87/0x9a [ 1080.973133] [<ffffffff810efc5b>] shrink_page_list+0x39b/0x6c8 [ 1080.973139] [<ffffffff810f03bb>] shrink_inactive_list+0x22c/0x39e [ 1080.973144] [<ffffffff810822d7>] ? lock_release_holdtime.part.7+0x6b/0x72 [ 1080.973148] [<ffffffff810f0c33>] shrink_zone+0x445/0x588 [ 1080.973152] [<ffffffff810f1a11>] balance_pgdat+0x2c2/0x56b [ 1080.973170] [<ffffffff81254208>] ? __bitmap_weight+0x34/0x80 [ 1080.973175] [<ffffffff810f1f78>] kswapd+0x2be/0x2fa [ 1080.973179] [<ffffffff810726c8>] ? __init_waitqueue_head+0x4b/0x4b [ 1080.973183] [<ffffffff810f1cba>] ? balance_pgdat+0x56b/0x56b [ 1080.973187] [<ffffffff81071f69>] kthread+0xa8/0xb0 [ 1080.973200] [<ffffffff814806b4>] kernel_thread_helper+0x4/0x10 [ 1080.973205] [<ffffffff81071ec1>] ? __init_kthread_worker+0x5a/0x5a [ 1080.973210] [<ffffffff814806b0>] ? gs_change+0x13/0x13 [ 1080.973213] no locks held by kswapd0/25. Signed-off-by: Peng Tao <peng_tao@emc.com> Signed-off-by: Jim Rees <rees@umich.edu> Cc: stable@kernel.org [3.0] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Peng Tao authored
bl_add_page_to_bio returns error pointer. bio should be reset to NULL in failure cases as the out path always calls bl_submit_bio. Signed-off-by: Peng Tao <peng_tao@emc.com> Signed-off-by: Jim Rees <rees@umich.edu> Cc: stable@kernel.org [3.0] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Peng Tao authored
For pnfs pagelist read failure, we need to pg_recoalesce and resend IO to mds. Signed-off-by: Peng Tao <peng_tao@emc.com> Signed-off-by: Jim Rees <rees@umich.edu> Cc: stable@kernel.org [3.0] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Peng Tao authored
For pnfs pagelist write failure, we need to pg_recoalesce and resend IO to mds. Signed-off-by: Peng Tao <peng_tao@emc.com> Signed-off-by: Jim Rees <rees@umich.edu> Cc: stable@kernel.org [3.0] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Peng Tao authored
file layout and block layout both use it to set mark layout io failure bit. So make it generic. Signed-off-by: Peng Tao <peng_tao@emc.com> Signed-off-by: Jim Rees <rees@umich.edu> Cc: stable@kernel.org [3.0] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Peng Tao authored
Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Peng Tao <peng_tao@emc.com> Signed-off-by: Jim Rees <rees@umich.edu> Cc: stable@kernel.org [3.0] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Peng Tao authored
The same function is used by idmap, gss and blocklayout code. Make it generic. Signed-off-by: Peng Tao <peng_tao@emc.com> Signed-off-by: Jim Rees <rees@umich.edu> Cc: stable@kernel.org [3.0] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Jim Rees authored
Make the status field explicitly 32 bits. "...it's unlikely that the kernel and userspace would differ on the size of an int here, but it might be a good idea to go ahead and make that explicitly 32 bits in case we end up dealing with more exotic arches at some point in the future." Suggested-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Jim Rees <rees@umich.edu> Signed-off-by: Benny Halevy <bhalevy@tonian.com> Cc: stable@kernel.org [3.0] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Jim Rees authored
Always return PTR_ERR, not NULL, from nfs4_blk_get_deviceinfo and nfs4_blk_decode_device. Check for IS_ERR, not NULL, in bl_set_layoutdriver when calling nfs4_blk_get_deviceinfo. Signed-off-by: Jim Rees <rees@umich.edu> Signed-off-by: Benny Halevy <bhalevy@tonian.com> Cc: stable@kernel.org [3.0] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Jeff Layton authored
nfs_find_and_lock_request will take a reference to the nfs_page and will then put it if the req is already locked. It's possible though that the reference will be the last one. That put then can kick off a whole series of reference puts: nfs_page nfs_open_context dentry inode If the inode ends up being deleted, then the VFS will call truncate_inode_pages. That function will try to take the page lock, but it was already locked when migrate_page was called. The code deadlocks. Fix this by simply refusing the migration request if PagePrivate is already set, indicating that the page is already associated with an active read or write request. We've had a customer test a backported version of this patch and the preliminary results seem good. Cc: stable@kernel.org Cc: Andrea Arcangeli <aarcange@redhat.com> Reported-by: Harshula Jayasuriya <harshula@redhat.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Mi Jinlong authored
The result from ipv6_addr_scope() always not be a single SCOPE, so we can't use equal to compare the result with IPV6_ADDR_SCOPE_LINKLOCAL at nfs_sockaddr_match_ipaddr6. This patch fixs the problem, and lets checking address before scope_id. Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Jeff Layton authored
commit 420e3646 allowed the kernel to reduce the number of unnecessary commit calls by skipping the commit when there are a large number of outstanding pages. However, the current test in nfs_commit_unstable_pages does not handle the edge condition properly. When ncommit == 0, then that means that the kernel doesn't need to do anything more for the inode. The current test though in the WB_SYNC_NONE case will return true, and the inode will end up being marked dirty. Once that happens the inode will never be clean until there's a WB_SYNC_ALL flush. Fix this by immediately returning from nfs_commit_unstable_pages when ncommit == 0. Mike noticed this problem initially in RHEL5 (2.6.18-based kernel) which has a backported version of 420e3646. The inode cache there was growing very large. The inode cache was unable to be shrunk since the inodes were all marked dirty. Calling sync() would essentially "fix" the problem -- the WB_SYNC_ALL flush would result in the inodes all being marked clean. What I'm not clear on is how big a problem this is in mainline kernels as the writeback code there is very different. Either way, it seems incorrect to re-mark the inode dirty in this case. Reported-by: Mike McLean <mikem@redhat.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Cc: stable@kernel.org [2.6.34+] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
This reverts commit b80c3cb6. The reverted commit was rendered obsolete by a VFS fix: commit 5547e8aa (writeback: Update dirty flags in two steps). We now no longer need to worry about writeback_single_inode() missing our marking the inode for COMMIT in 'do_writepages()' call. Reverting this patch, fixes a performance regression in which the inode would continuously get queued to the dirty list, causing the writeback code to unnecessarily try to send a COMMIT. Signed-off-by: Trond Myklebust <Trond.Myklebust> Tested-by: Simon Kirby <sim@hostway.ca> Cc: stable@kernel.org [2.6.35+]
-
Linus Torvalds authored
-
- 17 Oct, 2011 1 commit
-
-
Linus Torvalds authored
The size is always valid, but variable-length arrays generate worse code for no good reason (unless the function happens to be inlined and the compiler sees the length for the simple constant it is). Also, there seems to be some code generation problem on POWER, where Henrik Bakken reports that register r28 can get corrupted under some subtle circumstances (interrupt happening at the wrong time?). That all indicates some seriously broken compiler issues, but since variable length arrays are bad regardless, there's little point in trying to chase it down. "Just don't do that, then". Reported-by: Henrik Grindal Bakken <henribak@cisco.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 16 Oct, 2011 1 commit
-
-
http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-armLinus Torvalds authored
* 'fixes' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm: ARM: 7128/1: vic: Don't write to the read-only register VIC_IRQ_STATUS ARM: 7122/1: localtimer: add header linux/errno.h explicitly ARM: 7117/1: perf: fix HW_CACHE_* events on Cortex-A9 ARM: 7113/1: mm: Align bank start to MAX_ORDER_NR_PAGES
-
- 15 Oct, 2011 3 commits
-
-
Zoltan Devai authored
This is unneeded and causes an abort on the SPMP8000 platform. Acked-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Zoltan Devai <zoss@devai.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
Shawn Guo authored
Per the text in Documentation/SubmitChecklist as below, we should explicitly have header linux/errno.h in localtimer.h for ENXIO reference. 1: If you use a facility then #include the file that defines/declares that facility. Don't depend on other header files pulling in ones that you use. Otherwise, we may run into some compiling error like the following one, if any file includes localtimer.h without CONFIG_LOCAL_TIMERS defined. arch/arm/include/asm/localtimer.h: In function ‘local_timer_setup’: arch/arm/include/asm/localtimer.h:53:10: error: ‘ENXIO’ undeclared (first use in this function) Signed-off-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
Will Deacon authored
Using COHERENT_LINE_{MISS,HIT} for cache misses and references respectively is completely wrong. Instead, use the L1D events which are a better and more useful approximation despite ignoring instruction traffic. Reported-by: Alasdair Grant <alasdair.grant@arm.com> Reported-by: Matt Horsnell <matt.horsnell@arm.com> Reported-by: Michael Williams <michael.williams@arm.com> Cc: stable@kernel.org Cc: Jean Pihet <j-pihet@ti.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-