- 09 Feb, 2010 31 commits
-
-
David Miller authored
commit 94673e96 upstream. Here are the sparc bits to remove TIF_ABI_PENDING now that set_personality() is called at the appropriate place in exec. Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Linus Torvalds authored
commit 221af7f8 upstream. 'flush_old_exec()' is the point of no return when doing an execve(), and it is pretty badly misnamed. It doesn't just flush the old executable environment, it also starts up the new one. Which is very inconvenient for things like setting up the new personality, because we want the new personality to affect the starting of the new environment, but at the same time we do _not_ want the new personality to take effect if flushing the old one fails. As a result, the x86-64 '32-bit' personality is actually done using this insane "I'm going to change the ABI, but I haven't done it yet" bit (TIF_ABI_PENDING), with SET_PERSONALITY() not actually setting the personality, but just the "pending" bit, so that "flush_thread()" can do the actual personality magic. This patch in no way changes any of that insanity, but it does split the 'flush_old_exec()' function up into a preparatory part that can fail (still called flush_old_exec()), and a new part that will actually set up the new exec environment (setup_new_exec()). All callers are changed to trivially comply with the new world order. Signed-off-by:
H. Peter Anvin <hpa@zytor.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Mike Frysinger authored
commit 04e4f2b1 upstream. The current code will load the stack size and protection markings, but then only use the markings in the MMU code path. The NOMMU code path always passes PROT_EXEC to the mmap() call. While this doesn't matter to most people whilst the code is running, it will cause a pointless icache flush when starting every FDPIC application. Typically this icache flush will be of a region on the order of 128KB in size, or may be the entire icache, depending on the facilities available on the CPU. In the case where the arch default behaviour seems to be desired (EXSTACK_DEFAULT), we probe VM_STACK_FLAGS for VM_EXEC to determine whether we should be setting PROT_EXEC or not. For arches that support an MPU (Memory Protection Unit - an MMU without the virtual mapping capability), setting PROT_EXEC or not will make an important difference. It should be noted that this change also affects the executability of the brk region, since ELF-FDPIC has that share with the stack. However, this is probably irrelevant as NOMMU programs aren't likely to use the brk region, preferring instead allocation via mmap(). Signed-off-by:
Mike Frysinger <vapier@gentoo.org> Signed-off-by:
David Howells <dhowells@redhat.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Hugh Dickins authored
commit a7016235 upstream. After memory pressure has forced it to dip into the reserves, 2.6.32's 5f8dcc21 "page-allocator: split per-cpu list into one-list-per-migrate-type" has been returning MIGRATE_RESERVE pages to the MIGRATE_MOVABLE free_list: in some sense depleting reserves. Fix that in the most straightforward way (which, considering the overheads of alternative approaches, is Mel's preference): the right migratetype is already in page_private(page), but free_pcppages_bulk() wasn't using it. How did this bug show up? As a 20% slowdown in my tmpfs loop kbuild swapping tests, on PowerMac G5 with SLUB allocator. Bisecting to that commit was easy, but explaining the magnitude of the slowdown not easy. The same effect appears, but much less markedly, with SLAB, and even less markedly on other machines (the PowerMac divides into fewer zones than x86, I think that may be a factor). We guess that lumpy reclaim of short-lived high-order pages is implicated in some way, and probably this bug has been tickling a poor decision somewhere in page reclaim. But instrumentation hasn't told me much, I've run out of time and imagination to determine exactly what's going on, and shouldn't hold up the fix any longer: it's valid, and might even fix other misbehaviours. Signed-off-by:
Hugh Dickins <hugh.dickins@tiscali.co.uk> Acked-by:
Mel Gorman <mel@csn.ul.ie> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Al Viro authored
commit 12e9a456 upstream. deactivate_locked_super() will be done by caller of fill_super, doing it there as well is b0rken. Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Al Viro authored
commit 217686e9 upstream. Error handling in that sucker got broken back in 2003. If function returns 0 on failure, it's not nice to add return -EINVAL into it. Adding return 1 on other failure exits is also not a good thing (and yes, original success exits with 1 and some of failure exits with 0 are still there; so's the original logics in callers). Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Al Viro authored
commit 29333920 upstream. A couple of fields in affs_sb_info is used in follow_link() and symlink() for handling AFFS "absolute" symlinks. Need locking against affs_remount() updates. Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Al Viro authored
commit 7e32b7bb upstream. Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Al Viro authored
commit 083c73c2 upstream. if 9P ->get_sb() fails late (at root inode or root dentry allocation), we'll hit its ->kill_sb() with NULL ->s_root Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Al Viro authored
commit 5998649f upstream. double iput(), leaks... Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Al Viro authored
commit afc70ed0 upstream. Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Zhenyu Wang authored
commit c566ec49 upstream. Make sure hangcheck timer won't beat us unexpectedly on Ironlake. Signed-off-by:
Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by:
Eric Anholt <eric@anholt.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Jesse Brandeburg authored
commit 9926146b upstream. When testing the "e1000: enhance frame fragment detection" (and e1000e) patches we found some bugs with reducing the MTU size. The 1024 byte descriptor used with the 1000 mtu test also (re) introduced the (originally) reported bug, and causes us to need the e1000_clean_tx_irq "enhance frame fragment detection" fix. So what has occured here is that 2.6.32 is only vulnerable for mtu < 1500 due to the jumbo specific routines in both e1000 and e1000e. So, 2.6.32 needs the 2kB buffer len fix for those smaller MTUs, but is not vulnerable to the original issue reported. It has been pointed out that this vulnerability needs to be patched in older kernels that don't have the e1000 jumbo routine. Without the jumbo routines, we need the "enhance frame fragment detection" fix the e1000, old e1000e is only vulnerable for < 1500 mtu, and needs a similar fix. We split the patches up to provide easy backport paths. There is only a slight bit of extra code when this fix and the original "enhance frame fragment detection" fixes are applied, so please apply both, even though it is a bit of overkill. Signed-off-by:
Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by:
Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Jesse Brandeburg authored
commit b94b5028 upstream. Originally patched by Neil Horman <nhorman@tuxdriver.com> e1000e could with a jumbo frame enabled interface, and packet split disabled, receive a packet that would overflow a single rx buffer. While in practice very hard to craft a packet that could abuse this, it is possible. this is related to CVE-2009-4538 Signed-off-by:
Jesse Brandeburg <jesse.brandeburg@intel.com> CC: Neil Horman <nhorman@tuxdriver.com> Signed-off-by:
Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Jesse Brandeburg authored
commit 40a14dea upstream. Originally From: Neil Horman <nhorman@tuxdriver.com> Modified by: Jesse Brandeburg <jesse.brandeburg@intel.com> Hey all- A security discussion was recently given: http://events.ccc.de/congress/2009/Fahrplan//events/3596.en.html And a patch that I submitted awhile back was brought up. Apparently some of their testing revealed that they were able to force a buffer fragment in e1000 in which the trailing fragment was greater than 4 bytes. As a result the fragment check I introduced failed to detect the fragement and a partial invalid frame was passed up into the network stack. I've written this patch to correct it. I'm in the process of testing it now, but it makes good logical sense to me. Effectively it maintains a per-adapter state variable which detects a non-EOP frame, and discards it and subsequent non-EOP frames leading up to _and_ _including_ the next positive-EOP frame (as it is by definition the last fragment). This should prevent any and all partial frames from entering the network stack from e1000. Signed-off-by:
Jesse Brandeburg <jesse.brandeburg@intel.com> Acked-by:
Neil Horman <nhorman@tuxdriver.com> Signed-off-by:
Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Mika Westerberg authored
commit c5ce5b46 upstream. Do not use an unchecked variable UBI_IOCMKVOL ioctl. Signed-off-by:
Mika Westerberg <ext-mika.1.westerberg@nokia.com> Signed-off-by:
Artem Bityutskiy <Artem.Bityutskiy@nokia.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Zhao Yakui authored
commit 6a4e2b75 upstream. If the BIOS pokes the system-wide OSC bits to see if Linux supports evaluating _OST after a _PPC change notification, answer yes. Also, fix an oversight where we neglected to set the OSC bit advertising processor aggregator device support when acpi-pad is compiled as a module. Signed-off-by:
Zhao Yakui <yakui.zhao@intel.com> Signed-off-by:
Len Brown <len.brown@intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Shaohua Li authored
commit 9dc130fc upstream. Executing _OSC returns a buffer, which has an acpi object in it. Don't directly returns the buffer, instead, we return the acpi object's buffer. This fixes a regression since caller of acpi_run_osc expects an acpi object's buffer returned. Tested-by:
Yinghai Lu <yinghai@kernel.org> Signed-off-by:
Shaohua Li <shaohua.li@intel.com> Signed-off-by:
Len Brown <len.brown@intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Shaohua Li authored
commit 3563ff96 upstream. Signed-off-by:
Shaohua Li <shaohua.li@intel.com> Signed-off-by:
Len Brown <len.brown@intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Shaohua Li authored
commit 70023de8 upstream. v2->v1: .improve debug info as suggedted by Bjorn,Kenji .API is using uuid string as suggested by Alexey Add an API to execute _OSC. A lot of devices can have this method, so add a generic API. Signed-off-by:
Shaohua Li <shaohua.li@intel.com> Signed-off-by:
Len Brown <len.brown@intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Stefan Haberland authored
commit 294001a8 upstream. Fix possible NULL pointer in DASD messages and correct discipline checking. Signed-off-by:
Stefan Haberland <stefan.haberland@de.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Felix Beck authored
commit 19b123eb upstream. In a case where the number of the input data is bigger than the modulus of the key, the coprocessor adapters will report an 8/72 error. This case is not caught yet, thus the adapter will be taken offline. To prevent this, we return an -EINVAL instead. Signed-off-by:
Felix Beck <felix.beck@de.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Tejun Heo authored
commit 534ead70 upstream. libata currently doesn't retry if a command fails with AC_ERR_INVALID assuming that retrying won't get it any further even if retried. However, a failure may be classified as invalid through hardware glitch (incorrect reading of the error register or firmware bug) and there isn't whole lot to gain by not retrying as actually invalid commands will be failed immediately. Also, commands serving FS IOs are extremely unlikely to be invalid. Retry FS IOs even if it's marked invalid. Transient and incorrect invalid failure was seen while debugging firmware related issue on Samsung n130 on bko#14314. http://bugzilla.kernel.org/show_bug.cgi?id=14314Signed-off-by:
Tejun Heo <tj@kernel.org> Reported-by:
Johannes Stezenbach <js@sig21.net> Signed-off-by:
Jeff Garzik <jgarzik@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
H. Peter Anvin authored
commit b1600918 upstream. CONFIG_X86_CPU_DEBUG, which provides some parsed versions of the x86 CPU configuration via debugfs, has caused boot failures on real hardware. The value of this feature has been marginal at best, as all this information is already available to userspace via generic interfaces. Causes crashes that have not been fixed + minimal utility -> remove. See the referenced LKML thread for more information. Reported-by:
Ozan Çağlayan <ozan@pardus.org.tr> Signed-off-by:
H. Peter Anvin <hpa@zytor.com> LKML-Reference: <alpine.LFD.2.00.1001221755320.13231@localhost.localdomain> Cc: Jaswinder Singh Rajput <jaswinder@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Yinghai Lu <yinghai@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
David Rientjes authored
commit 3a5fc0e4 upstream. nodes_possible_map does not currently include nodes that have SRAT entries that are all ACPI_SRAT_MEM_HOT_PLUGGABLE since the bit is cleared in nodes_parsed if it does not have an online address range. Unequivocally setting the bit in nodes_parsed is insufficient since existing code, such as acpi_get_nodes(), assumes all nodes in the map have online address ranges. In fact, all code using nodes_parsed assumes such nodes represent an address range of online memory. nodes_possible_map is created by unioning nodes_parsed and cpu_nodes_parsed; the former represents nodes with online memory and the latter represents memoryless nodes. We now set the bit for hotpluggable nodes in cpu_nodes_parsed so that it also gets set in nodes_possible_map. [ hpa: Haicheng Li points out that this makes the naming of the variable cpu_nodes_parsed somewhat counterintuitive. However, leave it as is in the interest of keeping the pure bug fix patch small. ] Signed-off-by:
David Rientjes <rientjes@google.com> Tested-by:
Haicheng Li <haicheng.li@linux.intel.com> LKML-Reference: <alpine.DEB.2.00.1001201152040.30528@chino.kir.corp.google.com> Signed-off-by:
H. Peter Anvin <hpa@zytor.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Martin Schwidefsky authored
commit 21ec7f6d upstream. If irq flags tracing is enabled the TRACE_IRQS_ON macros expands to a function call which clobbers registers %r0-%r5. The macro is used in the code path for single stepped system calls. The argument registers %r2-%r6 need to be restored from the stack before the system call function is called. Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Stefan Richter authored
commit 7a481436 upstream. Unsurprisingly, Texas Instruments TSB43AB23 exhibits the same behaviour as TSB43AB22/A in dual buffer IR DMA mode: If descriptors are located at physical addresses above the 31 bit address range (2 GB), the controller will overwrite random memory. With luck, this merely prevents video reception. With only a little less luck, the machine crashes. We use the same workaround here as with TSB43AB22/A: Switch off the dual buffer capability flag and use packet-per-buffer IR DMA instead. Another possible workaround would be to limit the coherent DMA mask to 31 bits. In Linux 2.6.33, this change serves effectively only as documentation since dual buffer mode is not used for any controller anymore. But somebody might want to re-enable it in the future to make use of features of dual buffer DMA that are not available in packet-per-buffer mode. In Linux 2.6.32 and older, this update is vital for anyone with this controller, more than 2 GB RAM, a 64 bit kernel, and FireWire video or audio applications. We have at least four reports: http://bugzilla.kernel.org/show_bug.cgi?id=13808 http://marc.info/?l=linux1394-user&m=126154279004083 https://bugzilla.redhat.com/show_bug.cgi?id=552142 http://marc.info/?l=linux1394-user&m=126432246128386 Reported-by: Paul Johnson Reported-by: Ronneil Camara Reported-by: G Zornetzer Reported-by: Mark Thompson Signed-off-by:
Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Chris Wilson authored
commit 4bdadb97 upstream. Having missed the ENOMEM return via i915_gem_fault(), there are probably other paths that I also missed. By not enabling NORETRY by default these paths can run the shrinker and take memory from the system (but not from our own inactive lists because our shrinker can not run whilst we hold the struct mutex) and this may allow the system to survive a little longer whilst our drivers consume all available memory. References: OOM killer unexpectedly called with kernel 2.6.32 http://bugzilla.kernel.org/show_bug.cgi?id=14933 v2: Pass gfp into page mapping. v3: Use new read_cache_page_gfp() instead of open-coding. Signed-off-by:
Chris Wilson <chris@chris-wilson.co.uk> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Eric Anholt <eric@anholt.net> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Linus Torvalds authored
commit 0531b2aa upstream. It's a simplified 'read_cache_page()' which takes a page allocation flag, so that different paths can control how aggressive the memory allocations are that populate a address space. In particular, the intel GPU object mapping code wants to be able to do a certain amount of own internal memory management by automatically shrinking the address space when memory starts getting tight. This allows it to dynamically use different memory allocation policies on a per-allocation basis, rather than depend on the (static) address space gfp policy. The actual new function is a one-liner, but re-organizing the helper functions to the point where you can do this with a single line of code is what most of the patch is all about. Tested-by:
Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Anatolij Gustschin authored
commit f1053a7c upstream. Since commit 9d2e9d66 mptsas driver fails to allocate memory for the MPT chain buffers for second LSI adapter on PPC440SPe Katmai platform: ... ioc1: LSISAS1068E B3: Capabilities={Initiator} mptbase: ioc1: ERROR - Unable to allocate Reply, Request, Chain Buffers! mptbase: ioc1: ERROR - didn't initialize properly! (-3) mptsas: probe of 0002:31:00.0 failed with error -3 This commit increased MPT_FC_CAN_QUEUE value but initChainBuffers() doesn't differentiate between SAS and FC causing increased allocation for SAS case, too. Later pci_alloc_consistent() fails to allocate increased chain buffer pool size for SAS case. Provide a fix by looking at the bus type and using appropriate MPT_SAS_CAN_QUEUE value while calculation of the number of chain buffers. Signed-off-by:
Anatolij Gustschin <agust@denx.de> Acked-by:
Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by:
James Bottomley <James.Bottomley@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Boaz Harrosh authored
commit 63c43b0e upstream. Because of the terrible structuring of scsi-bidi-commands it breaks some of the life time rules of a scsi-command. It is now not allowed to free up the block-request before cleanup and partial deallocation of the scsi-command. (Which is not so for none bidi commands) The right fix to this problem would be to make bidi command a first citizen by allocating a scsi_sdb pointer at scsi command just like cmd->prot_sdb. The bidi sdb should be allocated/deallocated as part of the get/put_command (Again like the prot_sdb) and the current decoupling of scsi_cmnd and blk-request should be kept. For now make sure scsi_release_buffers() is called before the call to blk_end_request_all() which might cause the suicide of the block requests. At best the leak of bidi buffers, at worse a crash, as there is a race between the existence of the bidi_request and the free of the associated bidi_sdb. The reason this was never hit before is because only OSD has the potential of doing asynchronous bidi commands. (So does bsg but it is never used) And OSD clients just happen to do all their bidi commands synchronously, up until recently. Signed-off-by:
Boaz Harrosh <bharrosh@panasas.com> Signed-off-by:
James Bottomley <James.Bottomley@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
- 28 Jan, 2010 9 commits
-
-
Greg Kroah-Hartman authored
-
Russ Anderson authored
commit da482474 upstream. Pass the number of minors when unregistering MSR and CPUID drivers. Reported-by:
Dean Nelson <dnelson@redhat.com> Signed-off-by:
Dean Nelson <dnelson@redhat.com> LKML-Reference: <20100127023722.GA22305@sgi.com> Signed-off-by:
Russ Anderson <rja@sgi.com> Signed-off-by:
H. Peter Anvin <hpa@zytor.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Greg Kroah-Hartman authored
commit b04da8bf upstream. Commit 70362511 exposed that f_modown() should call write_lock_irqsave instead of just write_lock_irq so that because a caller could have a spinlock held and it would not be good to renable interrupts. Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Tavis Ormandy <taviso@google.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Wey-Yi Guy authored
commit 1152dcc2 upstream Similar to 6000 and 1000 series, RTS/CTS is the recommended protection mechanism for 5000 series in HT mode based on the HW design. Using RTS/CTS will better protect the inner exchange from interference, especially in highly-congested environment, it also prevent uCode encounter TX FIFO underrun and other HT mode related performance issues. Signed-off-by:
Wey-Yi Guy <wey-yi.w.guy@intel.com> Signed-off-by:
Reinette Chatre <reinette.chatre@intel.com> Signed-off-by:
John W. Linville <linville@tuxdriver.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Len Brown authored
upstream in 2.6.33-rc: 5d76b6f6 Refreshed here for 2.6.32.y, applies w/ offset back to 2.6.29.y. Linux has always ignored ACPI BIOS C2 with exit latency > 100 usec, and the ACPI spec is clear that is correct FADT-supplied C2. However, the ACPI spec explicitly states that _CST-supplied C-states have no latency limits. So move the 100usec C2 test out of the code shared by FADT and _CST code-paths, and into the FADT-specific path. This bug has not been visible until Nehalem, which advertises a CPU-C2 worst case exit latency on servers of 205usec. That (incorrect) figure is being used by BIOS writers on mobile Nehalem systems for the AC configuration. Thus, Linux ignores C2 leaving just C1, which is saves less power, and also impacts performance by preventing the use of turbo mode. http://bugzilla.kernel.org/show_bug.cgi?id=15064Tested-by:
Alex Chiang <achiang@hp.com> Signed-off-by:
Len Brown <len.brown@intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Pallipadi, Venkatesh authored
commit 6c56ccec upstream. Commit 83ce4009 did the following change If the TSC is constant and non-stop, also set it reliable. But, there seems to be few systems that will end up with TSC warp across sockets, depending on how the cpus come out of reset. Skipping TSC sync test on such systems may result in time inconsistency later. So, reenable TSC sync test even on constant and non-stop TSC systems. Set, sched_clock_stable to 1 by default and reset it in mark_tsc_unstable, if TSC sync fails. This change still gives perf benefit mentioned in 83ce4009 for systems where TSC is reliable. Signed-off-by:
Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Acked-by:
Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <20091217202702.GA18015@linux-os.sc.intel.com> Signed-off-by:
H. Peter Anvin <hpa@zytor.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
David J. Wilder authored
commit 0cd4d0fd upstream. IPoIB can miss a change in destination GID under some conditions. The problem is caused when ipoib_neigh->dgid contains a stale address. The fix is to set ipoib_neigh->dgid to zero in ipoib_neigh_alloc(). This can happen when a system using bonding on its IPoIB interfaces has switched its active interface from interface A to B and back to A. The system that fails over will not correctly processes the 2nd address change, as described below. When an address has changed neighbor->ha is updated with the new address. Each neighbor has an associated ipoib_neigh. ipoib_neigh->dgid also holds a copy of the remote node's hardware address. When an address changes neighbor->ha is updated by the network layer (arp code) with the new address. IPoIB detects this change in ipoib_start_xmit() by comparing neighbor->ha with ipoib_neigh->dgid. The bug is that ipoib_neigh->dgid may already contain the new address (A) thus the change from B to A is missed by ipoib. Here is the sequence of events: ipoib_neigh->dgid = A and neighbor->ha = A The address is switched to B (the first switch) neighbor->ha = B The change is seen in ipoib_start_xmit() -- neighbor->ha != ipoib_neigh->dgid so ipoib_neigh is released, and a new one is allocated. The allocator may return the same chunk of memory that was just released, therefore ipoib_neigh->dgid still contains A at this point. ipoib_neigh->dgid should be updated in neigh_add_path(), but if the following conditions are true dgid is not updated: 1) __path_find() returns a path 2) path->ah is NULL The remote system now switches from address B to A, neighbor->ha is updated to A. Now we have again : ipoib_neigh->dgid = A and neighbor->ha = A Since the addresses are the same ipoib won't process the change in address. Fix this by zeroing out the dgid field when allocating a new struct ipoib_neigh. Signed-off-by:
David Wilder <dwilder@us.ibm.com> Signed-off-by:
Roland Dreier <rolandd@cisco.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Marcelo Tosatti authored
commit e50212bb upstream. Otherwise kvm might attempt to dereference a NULL pointer. Signed-off-by:
Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by:
Avi Kivity <avi@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Jiri Slaby authored
commit 0c6ddceb upstream. Stanse found 2 lock imbalances in kvm_request_irq_source_id and kvm_free_irq_source_id. They omit to unlock kvm->irq_lock on fail paths. Fix that by adding unlock labels at the end of the functions and jump there from the fail paths. Signed-off-by:
Jiri Slaby <jirislaby@gmail.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by:
Avi Kivity <avi@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-