- 28 Dec, 2015 22 commits
-
-
Trond Myklebust authored
Currently, we will only record the layoutstats correctly if the RPC call successfully obtains a slot. If we exit before that happens, then we may find ourselves starting the busy timer through the call in ff_layout_(read|write)_prepare_layoutstats, but never stopping it. The same thing happens if we're doing DA-DS. The fix is to ensure that we catch these cases in the rpc_release() callback. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Trond Myklebust authored
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Trond Myklebust authored
When we replay a failed read, write or commit to the dataserver, we need to ensure that we call ff_layout_read_prepare_v3(), ff_layout_write_prepare_v3 or ff_layout_commit_prepare_v3() so that we reset the statistics. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Trond Myklebust authored
In pNFS/flexfiles, we want to return the layout without necessarily marking it as having completely failed. We therefore move the call to pnfs_layout_io_set_failed() out of pnfs_error_mark_layout_for_return(), and then ensura that pNFS/files layout calls it separately. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Trond Myklebust authored
Fix a bug in which flexfiles clients are falling back to I/O through the MDS even when the FF_FLAGS_NO_IO_THRU_MDS flag is set. The flexfiles client will always report errors through the LAYOUTRETURN and/or LAYOUTERROR mechanisms, so it should normally be safe for it to retry the LAYOUTGET until it fails or succeeds. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Peng Tao authored
If client ever restarts IO due to some errors, we'll endup mis-counting IO stats if we do the counting in .rpc_done callback. Move it to .rpc_count_stats callback that is only called when releasing RPC. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Peng Tao authored
We just need to delay and retry in these cases. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Peng Tao authored
Instead of mapping it to EIO that is a fatal error and fails application. We'll go inband after getting NFS4ERR_LAYOUTUNAVAILABLE. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Peng Tao authored
Instead of dropping pages when write fails, only do it when we get fatal failure in launder_page write back. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Peng Tao authored
When we fail to queue a read page to IO descriptor, we need to clean it up otherwise it is hanging around preventing nfs module from being removed. When we fail to queue a write page to IO descriptor, we need to clean it up and also save the failure status to open context. Then at file close, we can try to write pages back again and drop the page if it fails to writeback in .launder_page, which will be done in the next patch. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Peng Tao authored
In case we fail during setting things up for read/write IO, set pg_error in IO descriptor and do the cleanup in nfs_pageio_add_request, where we clean up all pages that are still hanging around on the IO descriptor. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Peng Tao authored
If we fail to set up things before sending anything over wire, we need to clean up the reqs that are still attached to the IO descriptor. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Peng Tao authored
For ERESTARTSYS/EIO/EROFS/ENOSPC/E2BIG in layoutget, we should just bail out instead of hiding the error and retrying inband IO. Change all the call sites to pop the error all the way up. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Trond Myklebust authored
Some servers want to be able to control the frequency with which clients report layoutstats, for instance, in order to monitor QoS for a particular file or set of file. In order to support this, the flexfiles layout allows the server to pass this info as a hint in the layout payload. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Trond Myklebust authored
Instead of displaying a layout segment pointer in these tracepoints, let's use the layout stateid, now that Olga gave us a set of tools for displaying them. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Trond Myklebust authored
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Jeff Layton authored
pnfs_update_layout is really the "nexus" of layout handling. If it returns NULL then we end up going through the MDS. This patch adds some tracepoints to that function that allow us to determine the cause when we end up going through the MDS unexpectedly. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Olga Kornievskaia authored
Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Olga Kornievskaia authored
Operations to which stateid information is added: close, delegreturn, open, read, setattr, layoutget, layoutcommit, test_stateid, write, lock, locku, lockt Format is "stateid=<seqid>:<crc32 hash stateid.other>", also "openstateid=", "layoutstateid=", and "lockstateid=" for open_file, layoutget, set_lock tracepoints. New function is added to internal.h, nfs_stateid_hash(), to compute the hash trace_nfs4_setattr() is moved from nfs4_do_setattr() to _nfs4_do_setattr() to get access to stateid. trace_nfs4_setattr and trace_nfs4_delegreturn are changed from INODE_EVENT to new event type, INODE_STATEID_EVENT which is same as INODE_EVENT but adds stateid information for locking tracepoints, moved trace_nfs4_set_lock() into _nfs4_do_setlk() to get access to stateid information, and removed trace_nfs4_lock_reclaim(), trace_nfs4_lock_expired() as they call into _nfs4_do_setlk() and both were previously same LOCK_EVENT type. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Linus Torvalds authored
-
git://git.linux-mips.org/pub/scm/ralf/upstream-linusLinus Torvalds authored
Pull MIPS fixes from Ralf Baechle: - Fix bitrot in __get_user_unaligned() - EVA userspace accessor bug fixes. - Fix for build issues with certain toolchains. - Fix build error for VDSO with particular toolchain versions. - Fix build error due to a variable that should have been removed by an earlier patch * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: MIPS: Fix bitrot in __get_user_unaligned() MIPS: Fix build error due to unused variables. MIPS: VDSO: Fix build error MIPS: CPS: drop .set mips64r2 directives MIPS: uaccess: Take EVA into account in [__]clear_user MIPS: uaccess: Take EVA into account in __copy_from_user() MIPS: uaccess: Fix strlen_user with EVA
-
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-socLinus Torvalds authored
Pull ARM SoC fixes from Olof Johansson: "A smallish set of fixes that we've been sitting on for a while now, flushing the queue here so they go in. Summary: A handful of fixes for OMAP, i.MX, Allwinner and Tegra: - A clock rate and a PHY setup fix for i.MX6Q/DL - A couple of fixes for the reduced serial bus (sunxi-rsb) on Allwinner - UART wakeirq fix for an OMAP4 board, timer config fixes for AM43XX. - Suspend fix for Tegra124 Chromebooks - Fix for missing implicit include that's different between ARM/ARM64" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: ARM: tegra: Fix suspend hang on Tegra124 Chromebooks bus: sunxi-rsb: Fix peripheral IC mapping runtime address bus: sunxi-rsb: Fix primary PMIC mapping hardware address ARM: dts: Fix UART wakeirq for omap4 duovero parlor ARM: OMAP2+: AM43xx: select ARM TWD timer ARM: OMAP2+: am43xx: enable GENERIC_CLOCKEVENTS_BROADCAST fsl-ifc: add missing include on ARM64 ARM: dts: imx6: Fix Ethernet PHY mode on Ventana boards ARM: dts: imx: Fix the assigned-clock mismatch issue on imx6q/dl bus: sunxi-rsb: unlock on error in sunxi_rsb_read() ARM: dts: sunxi: sun6i-a31s-primo81.dts: add touchscreen axis swapping property
-
- 27 Dec, 2015 5 commits
-
-
Al Viro authored
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pmLinus Torvalds authored
Pull power management and ACPI fixes from Rafael Wysocki: "These fix an ACPI processor driver regression introduced during the 4.3 cycle and a mistake in the recently added SCPI support in the arm_big_little cpufreq driver. Specifics: - Fix a thermal management issue introduced by an ACPI processor driver change made during the 4.3 development cycle that failed to return 0 from a function on success which triggered an error cleanup path every time it had been called that deleted useful data structures created previously (Srinivas Pandruvada). - Fix a variable data type issue in the arm_big_little cpufreq driver's SCPI support code added recently that prevents error handling in there from working correctly (Dan Carpenter)" * tag 'pm+acpi-4.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: scpi-cpufreq: signedness bug in scpi_get_dvfs_info() ACPI / processor: Fix thermal cooling device regression
-
git://neil.brown.name/mdLinus Torvalds authored
Pull md bugfix from Neil Brown: "One more md fix for 4.4-rc Fix a regression which causes reshape to not start properly sometimes" * tag 'md/4.4-rc6-fix' of git://neil.brown.name/md: md: remove check for MD_RECOVERY_NEEDED in action_store.
-
git://git.infradead.org/linux-ubifsLinus Torvalds authored
Pull UBI bug fixes from Richard Weinberger: "This contains four bug fixes for UBI" * tag 'upstream-4.4-rc7' of git://git.infradead.org/linux-ubifs: mtd: ubi: don't leak e if schedule_erase() fails mtd: ubi: fixup error correction in do_sync_erase() UBI: fix use of "VID" vs. "EC" in header self-check UBI: fix return error code
-
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-traceLinus Torvalds authored
Pull ftrace/recordmcount fix from Steven Rostedt: "Russell King was reporting lots of warnings when he compiled his kernel with ftrace enabled. With some investigation it was discovered that it was his compile setup. He was using ccache with hard links, which allowed recordmcount to process the same .o twice. When this happens, recordmcount will detect that it was already done and give a warning about it. Russell fixed this by having recordmcount detect that the object file has more than one hard link, and if it does, it unlinks the object file after it maps it and processes then. This appears to fix the issue. As you did not like the fact that recordmcount modified the file in place and thought that it should do the modifications in memory and then write it out to disk and move it over the old file to prevent other more subtle issues like the one above, a second patch is added on top of Russell's to do just that. Luckily the original code had write and lseek wrappers that I was able to modify to not do inplace writes, but simply keep track of the changes made in memory. When a write is made, a "update" flag is set, and at the end of processing, if the update is set, then it writes the file with changes out to a new file, and then renames it over the original one. The file descriptor is still passed to the write and lseek wrappers because removing that would cause the change to be more intrusive. That can be removed in a follow up cleanup patch that can wait till the next merge window" * tag 'trace-v4.4-rc4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: ftrace/scripts: Have recordmcount copy the object file scripts: recordmcount: break hardlinks
-
- 26 Dec, 2015 2 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arcLinus Torvalds authored
Pull ARC fixes from Vineet Gupta: "Sorry for this late pull request, but these are all important fixes for code introduced/updated in this release which we will otherwise end up back porting. - Unwinder rework (A revert followed by better fix) - Build errors: MMUv2, modules with -Os - highmem section mismatch build splat" * tag 'arc-4.4-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: ARC: dw2 unwind: Catch Dwarf SNAFUs early ARC: dw2 unwind: Don't bail for CIE.version != 1 Revert "ARC: dw2 unwind: Ignore CIE version !=1 gracefully instead of bailing" ARC: Fix linking errors with CONFIG_MODULE + CONFIG_CC_OPTIMIZE_FOR_SIZE ARC: mm: fix building for MMU v2 ARC: mm: HIGHMEM: Fix section mismatch splat
-
Rafael J. Wysocki authored
* acpi-processor: ACPI / processor: Fix thermal cooling device regression * pm-cpufreq: cpufreq: scpi-cpufreq: signedness bug in scpi_get_dvfs_info()
-
- 25 Dec, 2015 2 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linuxLinus Torvalds authored
Pull parisc system call restart fix from Helge Deller: "The architectural design of parisc always uses two instructions to call kernel syscalls (delayed branch feature). This means that the instruction following the branch (located in the delay slot of the branch instruction) is executed before control passes to the branch destination. Depending on which assembler instruction and how it is used in usersapce in the delay slot, this sometimes made restarted syscalls like futex() and poll() failing with -ENOSYS" * 'parisc-4.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Fix syscall restarts
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparcLinus Torvalds authored
Pull sparc fixes from David Miller: 1) Finally make perf stack backtraces stable on sparc, several problems (mostly due to the context in which the user copies from the stack are done) contributed to this. From Rob Gardner. 2) Export ADI capability if the cpu supports it. 3) Hook up userfaultfd system call. 4) When faults happen during user copies we really have to clean up and restore the FPU state fully. Also from Rob Gardner * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: tty/serial: Skip 'NULL' char after console break when sysrq enabled sparc64: fix FP corruption in user copy functions sparc64: Perf should save/restore fault info sparc64: Ensure perf can access user stacks sparc64: Don't set %pil in rtrap_nmi too early sparc64: Add ADI capability to cpu capabilities tty: serial: constify sunhv_ops structs sparc: Hook up userfaultfd system call
-
- 24 Dec, 2015 8 commits
-
-
Vijay Kumar authored
When sysrq is triggered from console, serial driver for SUN hypervisor console receives a console break and enables the sysrq. It expects a valid sysrq char following with break. Meanwhile if driver receives 'NULL' ASCII char then it disables sysrq and sysrq handler will never be invoked. This fix skips calling uart sysrq handler when 'NULL' is received while sysrq is enabled. Signed-off-by: Vijay Kumar <vijay.ac.kumar@oracle.com> Acked-by: Karl Volz <karl.volz@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Rob Gardner authored
Short story: Exception handlers used by some copy_to_user() and copy_from_user() functions do not diligently clean up floating point register usage, and this can result in a user process seeing invalid values in floating point registers. This sometimes makes the process fail. Long story: Several cpu-specific (NG4, NG2, U1, U3) memcpy functions use floating point registers and VIS alignaddr/faligndata to accelerate data copying when source and dest addresses don't align well. Linux uses a lazy scheme for saving floating point registers; It is not done upon entering the kernel since it's a very expensive operation. Rather, it is done only when needed. If the kernel ends up not using FP regs during the course of some trap or system call, then it can return to user space without saving or restoring them. The various memcpy functions begin their FP code with VISEntry (or a variation thereof), which saves the FP regs. They conclude their FP code with VISExit (or a variation) which essentially marks the FP regs "clean", ie, they contain no unsaved values. fprs.FPRS_FEF is turned off so that a lazy restore will be triggered when/if the user process accesses floating point regs again. The bug is that the user copy variants of memcpy, copy_from_user() and copy_to_user(), employ an exception handling mechanism to detect faults when accessing user space addresses, and when this handler is invoked, an immediate return from the function is forced, and VISExit is not executed, thus leaving the fprs register in an indeterminate state, but often with fprs.FPRS_FEF set and one or more dirty bits. This results in a return to user space with invalid values in the FP regs, and since fprs.FPRS_FEF is on, no lazy restore occurs. This bug affects copy_to_user() and copy_from_user() for NG4, NG2, U3, and U1. All are fixed by using a new exception handler for those loads and stores that are done during the time between VISEnter and VISExit. n.b. In NG4memcpy, the problematic code can be triggered by a copy size greater than 128 bytes and an unaligned source address. This bug is known to be the cause of random user process memory corruptions while perf is running with the callgraph option (ie, perf record -g). This occurs because perf uses copy_from_user() to read user stacks, and may fault when it follows a stack frame pointer off to an invalid page. Validation checks on the stack address just obscure the underlying problem. Signed-off-by: Rob Gardner <rob.gardner@oracle.com> Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Rob Gardner authored
There have been several reports of random processes being killed with a bus error or segfault during userspace stack walking in perf. One of the root causes of this problem is an asynchronous modification to thread_info fault_address and fault_code, which stems from a perf counter interrupt arriving during kernel processing of a "benign" fault, such as a TSB miss. Since perf_callchain_user() invokes copy_from_user() to read user stacks, a fault is not only possible, but probable. Validity checks on the stack address merely cover up the problem and reduce its frequency. The solution here is to save and restore fault_address and fault_code in perf_callchain_user() so that the benign fault handler is not disturbed by a perf interrupt. Signed-off-by: Rob Gardner <rob.gardner@oracle.com> Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Rob Gardner authored
When an interrupt (such as a perf counter interrupt) is delivered while executing in user space, the trap entry code puts ASI_AIUS in %asi so that copy_from_user() and copy_to_user() will access the correct memory. But if a perf counter interrupt is delivered while the cpu is already executing in kernel space, then the trap entry code will put ASI_P in %asi, and this will prevent copy_from_user() from reading any useful stack data in either of the perf_callchain_user_X functions, and thus no user callgraph data will be collected for this sample period. An additional problem is that a fault is guaranteed to occur, and though it will be silently covered up, it wastes time and could perturb state. In perf_callchain_user(), we ensure that %asi contains ASI_AIUS because we know for a fact that the subsequent calls to copy_from_user() are intended to read the user's stack. [ Use get_fs()/set_fs() -DaveM ] Signed-off-by: Rob Gardner <rob.gardner@oracle.com> Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Rob Gardner authored
Commit 28a1f533 delays setting %pil to avoid potential hardirq stack overflow in the common rtrap_irq path. Setting %pil also needs to be delayed in the rtrap_nmi path for the same reason. Signed-off-by: Rob Gardner <rob.gardner@oracle.com> Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Khalid Aziz authored
Add ADI (Application Data Integrity) capability to cpu capabilities list. ADI capability allows virtual addresses to be encoded with a tag in bits 63-60. This tag serves as an access control key for the regions of virtual address with ADI enabled and a key set on them. Hypervisor encodes this capability as "adp" in "hwcap-list" property in machine description. Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Aya Mahfouz authored
Constifies sunhv_ops structures in tty's serial driver since they are not modified after their initialization. Detected and found using Coccinelle. Suggested-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Aya Mahfouz <mahfouz.saif.elyazal@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Dan Carpenter authored
The "domain" variable needs to be signed for the error handling to work. Fixes: 8def3103 (cpufreq: arm_big_little: add SCPI interface driver) Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 23 Dec, 2015 1 commit
-
-
Mike Kravetz authored
After hooking up system call, userfaultfd selftest was successful for both 32 and 64 bit version of test. Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-