- 27 Sep, 2010 40 commits
-
-
Mel Gorman authored
commit 72853e29 upstream. When allocating a page, the system uses NR_FREE_PAGES counters to determine if watermarks would remain intact after the allocation was made. This check is made without interrupts disabled or the zone lock held and so is race-prone by nature. Unfortunately, when pages are being freed in batch, the counters are updated before the pages are added on the list. During this window, the counters are misleading as the pages do not exist yet. When under significant pressure on systems with large numbers of CPUs, it's possible for processes to make progress even though they should have been stalled. This is particularly problematic if a number of the processes are using GFP_ATOMIC as the min watermark can be accidentally breached and in extreme cases, the system can livelock. This patch updates the counters after the pages have been added to the list. This makes the allocator more cautious with respect to preserving the watermarks and mitigates livelock possibilities. [akpm@linux-foundation.org: avoid modifying incoming args] Signed-off-by:
Mel Gorman <mel@csn.ul.ie> Reviewed-by:
Rik van Riel <riel@redhat.com> Reviewed-by:
Minchan Kim <minchan.kim@gmail.com> Reviewed-by:
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by:
Christoph Lameter <cl@linux.com> Reviewed-by:
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by:
Johannes Weiner <hannes@cmpxchg.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Christoph Lameter authored
mm: page allocator: calculate a better estimate of NR_FREE_PAGES when memory is low and kswapd is awake commit aa454840 upstream. Ordinarily watermark checks are based on the vmstat NR_FREE_PAGES as it is cheaper than scanning a number of lists. To avoid synchronization overhead, counter deltas are maintained on a per-cpu basis and drained both periodically and when the delta is above a threshold. On large CPU systems, the difference between the estimated and real value of NR_FREE_PAGES can be very high. If NR_FREE_PAGES is much higher than number of real free page in buddy, the VM can allocate pages below min watermark, at worst reducing the real number of pages to zero. Even if the OOM killer kills some victim for freeing memory, it may not free memory if the exit path requires a new page resulting in livelock. This patch introduces a zone_page_state_snapshot() function (courtesy of Christoph) that takes a slightly more accurate view of an arbitrary vmstat counter. It is used to read NR_FREE_PAGES while kswapd is awake to avoid the watermark being accidentally broken. The estimate is not perfect and may result in cache line bounces but is expected to be lighter than the IPI calls necessary to continually drain the per-cpu counters while kswapd is awake. Signed-off-by:
Christoph Lameter <cl@linux.com> Signed-off-by:
Mel Gorman <mel@csn.ul.ie> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Mel Gorman authored
commit 9ee493ce upstream. When under significant memory pressure, a process enters direct reclaim and immediately afterwards tries to allocate a page. If it fails and no further progress is made, it's possible the system will go OOM. However, on systems with large amounts of memory, it's possible that a significant number of pages are on per-cpu lists and inaccessible to the calling process. This leads to a process entering direct reclaim more often than it should increasing the pressure on the system and compounding the problem. This patch notes that if direct reclaim is making progress but allocations are still failing that the system is already under heavy pressure. In this case, it drains the per-cpu lists and tries the allocation a second time before continuing. Signed-off-by:
Mel Gorman <mel@csn.ul.ie> Reviewed-by:
Minchan Kim <minchan.kim@gmail.com> Reviewed-by:
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by:
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reviewed-by:
Christoph Lameter <cl@linux.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Divy Le Ray authored
commit a6f018e3 upstream. queue restart tasklets need to be stopped after napi handlers are stopped since the latter can restart them. So stop them after stopping napi. Signed-off-by:
Divy Le Ray <divy@chelsio.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Brandon Philips <bphilips@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Nicolas Ferre authored
commit 8d2602e0 upstream. Reported-by:
Dan Liang <dan.liang@atmel.com> Signed-off-by:
Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Michael Chan authored
commit f048fa9c upstream. The regression is caused by: commit 4327ba43 bnx2: Fix netpoll crash. If ->open() and ->close() are called multiple times, the same napi structs will be added to dev->napi_list multiple times, corrupting the dev->napi_list. This causes free_netdev() to hang during rmmod. We fix this by calling netif_napi_del() during ->close(). Also, bnx2_init_napi() must not be in the __devinit section since it is called by ->open(). Signed-off-by:
Michael Chan <mchan@broadcom.com> Signed-off-by:
Benjamin Li <benli@broadcom.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Benjamin Li authored
commit 4327ba43 upstream. The bnx2 driver calls netif_napi_add() for all the NAPI structs during ->probe() time but not all of them will be used if we're not in MSI-X mode. This creates a problem for netpoll since it will poll all the NAPI structs in the dev_list whether or not they are scheduled, resulting in a crash when we access structure fields not initialized for that vector. We fix it by moving the netif_napi_add() call to ->open() after the number of IRQ vectors has been determined. Signed-off-by:
Benjamin Li <benli@broadcom.com> Signed-off-by:
Michael Chan <mchan@broadcom.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Zhang Rui authored
commit 81074e90 upstream. Fix a win7 compability issue on Asus K50IJ. Here is the _BCM method of this laptop: Method (_BCM, 1, NotSerialized) { If (LGreaterEqual (OSFG, OSVT)) { If (LNotEqual (OSFG, OSW7)) { Store (One, BCMD) Store (GCBL (Arg0), Local0) Subtract (0x0F, Local0, LBTN) ^^^SBRG.EC0.STBR () ... } Else { DBGR (0x0B, Zero, Zero, Arg0) Store (Arg0, LBTN) ^^^SBRG.EC0.STBR () ... } } } LBTN is used to store the index of the brightness level in the _BCL. GCBL is a method that convert the percentage value to the index value. If _OSI(Windows 2009) is not disabled, LBTN is stored a percentage value which is surely beyond the end of _BCL package. http://bugzilla.kernel.org/show_bug.cgi?id=14753Signed-off-by:
Zhang Rui <rui.zhang@intel.com> Signed-off-by:
Len Brown <len.brown@intel.com> Cc: maximilian attems <max@stro.at> Cc: Paolo Ornati <ornati@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Dan Rosenberg authored
commit b4aaa78f upstream. The VIAFB_GET_INFO device ioctl allows unprivileged users to read 246 bytes of uninitialized stack memory, because the "reserved" member of the viafb_ioctl_info struct declared on the stack is not altered or zeroed before being copied back to the user. This patch takes care of it. Signed-off-by:
Dan Rosenberg <dan.j.rosenberg@gmail.com> Signed-off-by:
Florian Tobias Schandinat <FlorianSchandinat@gmx.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Dan Rosenberg authored
commit a122eb2f upstream. The XFS_IOC_FSGETXATTR ioctl allows unprivileged users to read 12 bytes of uninitialized stack memory, because the fsxattr struct declared on the stack in xfs_ioc_fsgetxattr() does not alter (or zero) the 12-byte fsx_pad member before copying it back to the user. This patch takes care of it. Signed-off-by:
Dan Rosenberg <dan.j.rosenberg@gmail.com> Reviewed-by:
Eric Sandeen <sandeen@redhat.com> Signed-off-by:
Alex Elder <aelder@sgi.com> Cc: dann frazier <dannf@debian.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
David Howells authored
commit 3d96406c upstream. Fix a bug in keyctl_session_to_parent() whereby it tries to check the ownership of the parent process's session keyring whether or not the parent has a session keyring [CVE-2010-2960]. This results in the following oops: BUG: unable to handle kernel NULL pointer dereference at 00000000000000a0 IP: [<ffffffff811ae4dd>] keyctl_session_to_parent+0x251/0x443 ... Call Trace: [<ffffffff811ae2f3>] ? keyctl_session_to_parent+0x67/0x443 [<ffffffff8109d286>] ? __do_fault+0x24b/0x3d0 [<ffffffff811af98c>] sys_keyctl+0xb4/0xb8 [<ffffffff81001eab>] system_call_fastpath+0x16/0x1b if the parent process has no session keyring. If the system is using pam_keyinit then it mostly protected against this as all processes derived from a login will have inherited the session keyring created by pam_keyinit during the log in procedure. To test this, pam_keyinit calls need to be commented out in /etc/pam.d/. Reported-by:
Tavis Ormandy <taviso@cmpxchg8b.com> Signed-off-by:
David Howells <dhowells@redhat.com> Acked-by:
Tavis Ormandy <taviso@cmpxchg8b.com> Cc: dann frazier <dannf@debian.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
David Howells authored
commit 9d1ac65a upstream. There's an protected access to the parent process's credentials in the middle of keyctl_session_to_parent(). This results in the following RCU warning: =================================================== [ INFO: suspicious rcu_dereference_check() usage. ] --------------------------------------------------- security/keys/keyctl.c:1291 invoked rcu_dereference_check() without protection! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 1 lock held by keyctl-session-/2137: #0: (tasklist_lock){.+.+..}, at: [<ffffffff811ae2ec>] keyctl_session_to_parent+0x60/0x236 stack backtrace: Pid: 2137, comm: keyctl-session- Not tainted 2.6.36-rc2-cachefs+ #1 Call Trace: [<ffffffff8105606a>] lockdep_rcu_dereference+0xaa/0xb3 [<ffffffff811ae379>] keyctl_session_to_parent+0xed/0x236 [<ffffffff811af77e>] sys_keyctl+0xb4/0xb6 [<ffffffff81001eab>] system_call_fastpath+0x16/0x1b The code should take the RCU read lock to make sure the parents credentials don't go away, even though it's holding a spinlock and has IRQ disabled. Signed-off-by:
David Howells <dhowells@redhat.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Cc: dann frazier <dannf@debian.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Petr Tesarik authored
commit 2d2b6901 upstream. Tony's fix (f574c843) has a small bug, it incorrectly uses "r3" as a scratch register in the first of the two unlock paths ... it is also inefficient. Optimize the fast path again. Signed-off-by:
Petr Tesarik <ptesarik@suse.cz> Signed-off-by:
Tony Luck <tony.luck@intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Tony Luck authored
commit f574c843 upstream. When ia64 converted to using ticket locks, an inline implementation of trylock/unlock in fsys.S was missed. This was not noticed because in most circumstances it simply resulted in using the slow path because the siglock was apparently not available (under old spinlock rules). Problems occur when the ticket spinlock has value 0x0 (when first initialised, or when it wraps around). At this point the fsys.S code acquires the lock (changing the 0x0 to 0x1. If another process attempts to get the lock at this point, it will change the value from 0x1 to 0x2 (using new ticket lock rules). Then the fsys.S code will free the lock using old spinlock rules by writing 0x0 to it. From here a variety of bad things can happen. Signed-off-by:
Tony Luck <tony.luck@intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Dmitry Monakhov authored
commit 84a8dce2 upstream. A few functions were still modifying i_flags in a racy manner. Signed-off-by:
Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Ryan Kuester authored
commit 2a1b7e57 upstream. I may have an explanation for the LSI 1068 HBA hangs provoked by ATA pass-through commands, in particular by smartctl. First, my version of the symptoms. On an LSI SAS1068E B3 HBA running 01.29.00.00 firmware, with SATA disks, and with smartd running, I'm seeing occasional task, bus, and host resets, some of which lead to hard faults of the HBA requiring a reboot. Abusively looping the smartctl command, # while true; do smartctl -a /dev/sdb > /dev/null; done dramatically increases the frequency of these failures to nearly one per minute. A high IO load through the HBA while looping smartctl seems to improve the chance of a full scsi host reset or a non-recoverable hang. I reduced what smartctl was doing down to a simple test case which causes the hang with a single IO when pointed at the sd interface. See the code at the bottom of this e-mail. It uses an SG_IO ioctl to issue a single pass-through ATA identify device command. If the buffer userspace gives for the read data has certain alignments, the task is issued to the HBA but the HBA fails to respond. If run against the sg interface, neither the test code nor smartctl causes a hang. sd and sg handle the SG_IO ioctl slightly differently. Unless you specifically set a flag to do direct IO, sg passes a buffer of its own, which is page-aligned, to the block layer and later copies the result into the userspace buffer regardless of its alignment. sd, on the other hand, always does direct IO unless the userspace buffer fails an alignment test at block/blk-map.c line 57, in which case a page-aligned buffer is created and used for the transfer. The alignment test currently checks for word-alignment, the default setup by scsi_lib.c; therefore, userspace buffers of almost any alignment are given directly to the HBA as DMA targets. The LSI 1068 hardware doesn't seem to like at least a couple of the alignments which cross a page boundary (see the test code below). Curiously, many page-boundary-crossing alignments do work just fine. So, either the hardware has an bug handling certain alignments or the hardware has a stricter alignment requirement than the driver is advertising. If stricter alignment is required, then in no case should misaligned buffers from userspace be allowed through without being bounced or at least causing an error to be returned. It seems the mptsas driver could use blk_queue_dma_alignment() to advertise a stricter alignment requirement. If it does, sd does the right thing and bounces misaligned buffers (see block/blk-map.c line 57). The following patch to 2.6.34-rc5 makes my symptoms go away. I'm sure this is the wrong place for this code, but it gets my idea across. Acked-by:
Kashyap Desai <Kashyap.Desai@lsi.com> Signed-off-by:
James Bottomley <James.Bottomley@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Eric Paris authored
commit 611da04f upstream. Since the .31 or so notify rewrite inotify has not sent events about inodes which are unmounted. This patch restores those events. Signed-off-by:
Eric Paris <eparis@redhat.com> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Jeff Moyer authored
commit 75e1c70f upstream. Tavis Ormandy pointed out that do_io_submit does not do proper bounds checking on the passed-in iocb array: if (unlikely(nr < 0)) return -EINVAL; if (unlikely(!access_ok(VERIFY_READ, iocbpp, (nr*sizeof(iocbpp))))) return -EFAULT; ^^^^^^^^^^^^^^^^^^ The attached patch checks for overflow, and if it is detected, the number of iocbs submitted is scaled down to a number that will fit in the long. This is an ok thing to do, as sys_io_submit is documented as returning the number of iocbs submitted, so callers should handle a return value of less than the 'nr' argument passed in. Reported-by:
Tavis Ormandy <taviso@cmpxchg8b.com> Signed-off-by:
Jeff Moyer <jmoyer@redhat.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Tejun Heo authored
commit 46b30ea9 upstream. pcpu_first/last_unit_cpu are used to track which cpu has the first and last units assigned. This in turn is used to determine the span of a chunk for man/unmap cache flushes and whether an address belongs to the first chunk or not in per_cpu_ptr_to_phys(). When the number of possible CPUs isn't power of two, a chunk may contain unassigned units towards the end of a chunk. The logic to determine pcpu_last_unit_cpu was incorrect when there was an unused unit at the end of a chunk. It failed to ignore the unused unit and assigned the unused marker NR_CPUS to pcpu_last_unit_cpu. This was discovered through kdump failure which was caused by malfunctioning per_cpu_ptr_to_phys() on a kvm setup with 50 possible CPUs by CAI Qian. Signed-off-by:
Tejun Heo <tj@kernel.org> Reported-by:
CAI Qian <caiqian@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Dan Rosenberg authored
commit fd02db9d upstream. The FBIOGET_VBLANK device ioctl allows unprivileged users to read 16 bytes of uninitialized stack memory, because the "reserved" member of the fb_vblank struct declared on the stack is not altered or zeroed before being copied back to the user. This patch takes care of it. Signed-off-by:
Dan Rosenberg <dan.j.rosenberg@gmail.com> Cc: Thomas Winischhofer <thomas@winischhofer.net> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Andrew Morton authored
commit df08cdc7 upstream. drivers/pci/intel-iommu.c: In function `__iommu_calculate_agaw': drivers/pci/intel-iommu.c:437: sorry, unimplemented: inlining failed in call to 'width_to_agaw': function body not available drivers/pci/intel-iommu.c:445: sorry, unimplemented: called from here Move the offending function (and its siblings) to top-of-file, remove the forward declaration. Addresses https://bugzilla.kernel.org/show_bug.cgi?id=17441Reported-by:
Martin Mokrejs <mmokrejs@ribosome.natur.cuni.cz> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Jan Kara authored
commit 371d217e upstream. These devices don't do any writeback but their device inodes still can get dirty so mark bdi appropriately so that bdi code does the right thing and files inodes to lists of bdi carrying the device inodes. Signed-off-by:
Jan Kara <jack@suse.cz> Signed-off-by:
Jens Axboe <jaxboe@fusionio.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Patrick Simmons authored
commit c33f543d upstream. This patch adds CPU type detection for the Intel Celeron 540, which is part of the Core 2 family according to Wikipedia; the family and ID pair is absent from the Volume 3B table referenced in the source code comments. I have tested this patch on an Intel Celeron 540 machine reporting itself as Family 6 Model 22, and OProfile runs on the machine without issue. Spec: http://download.intel.com/design/mobile/SPECUPDT/317667.pdfSigned-off-by:
Patrick Simmons <linuxrocks123@netscape.net> Acked-by:
Andi Kleen <ak@linux.intel.com> Acked-by:
Arnd Bergmann <arnd@arndb.de> Signed-off-by:
Robert Richter <robert.richter@amd.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Stanislaw Gruszka authored
commit e75e863d upstream. We have 32-bit variable overflow possibility when multiply in task_times() and thread_group_times() functions. When the overflow happens then the scaled utime value becomes erroneously small and the scaled stime becomes i erroneously big. Reported here: https://bugzilla.redhat.com/show_bug.cgi?id=633037 https://bugzilla.kernel.org/show_bug.cgi?id=16559Reported-by:
Michael Chapman <redhat-bugzilla@very.puzzling.org> Reported-by:
Ciriaco Garcia de Celis <sysman@etherpilot.com> Signed-off-by:
Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by:
Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> LKML-Reference: <20100914143513.GB8415@redhat.com> Signed-off-by:
Ingo Molnar <mingo@elte.hu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Paul E. McKenney authored
commit 950eaaca upstream. [ 23.584719] [ 23.584720] =================================================== [ 23.585059] [ INFO: suspicious rcu_dereference_check() usage. ] [ 23.585176] --------------------------------------------------- [ 23.585176] kernel/pid.c:419 invoked rcu_dereference_check() without protection! [ 23.585176] [ 23.585176] other info that might help us debug this: [ 23.585176] [ 23.585176] [ 23.585176] rcu_scheduler_active = 1, debug_locks = 1 [ 23.585176] 1 lock held by rc.sysinit/728: [ 23.585176] #0: (tasklist_lock){.+.+..}, at: [<ffffffff8104771f>] sys_setpgid+0x5f/0x193 [ 23.585176] [ 23.585176] stack backtrace: [ 23.585176] Pid: 728, comm: rc.sysinit Not tainted 2.6.36-rc2 #2 [ 23.585176] Call Trace: [ 23.585176] [<ffffffff8105b436>] lockdep_rcu_dereference+0x99/0xa2 [ 23.585176] [<ffffffff8104c324>] find_task_by_pid_ns+0x50/0x6a [ 23.585176] [<ffffffff8104c35b>] find_task_by_vpid+0x1d/0x1f [ 23.585176] [<ffffffff81047727>] sys_setpgid+0x67/0x193 [ 23.585176] [<ffffffff810029eb>] system_call_fastpath+0x16/0x1b [ 24.959669] type=1400 audit(1282938522.956:4): avc: denied { module_request } for pid=766 comm="hwclock" kmod="char-major-10-135" scontext=system_u:system_r:hwclock_t:s0 tcontext=system_u:system_r:kernel_t:s0 tclas It turns out that the setpgid() system call fails to enter an RCU read-side critical section before doing a PID-to-task_struct translation. This commit therefore does rcu_read_lock() before the translation, and also does rcu_read_unlock() after the last use of the returned pointer. Reported-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by:
David Howells <dhowells@redhat.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Dan Carpenter authored
commit 339db11b upstream. The members of struct llc_sock are unsigned so if we pass a negative value for "opt" it can cause a sign bug. Also it can cause an integer overflow when we multiply "opt * HZ". Signed-off-by:
Dan Carpenter <error27@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Dan Carpenter authored
commit dd173abf upstream. "param->u.wpa_associate.wpa_ie_len" comes from the user. We should check it so that the copy_from_user() doesn't overflow the buffer. Also further down in the function, we assume that if "param->u.wpa_associate.wpa_ie_len" is set then "abyWPAIE[0]" is initialized. To make that work, I changed the test here to say that if "wpa_ie_len" is set then "wpa_ie" has to be a valid pointer or we return -EINVAL. Oddly, we only use the first element of the abyWPAIE[] array. So I suspect there may be some other issues in this function. Signed-off-by:
Dan Carpenter <error27@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Andy Gospodarek authored
commit ab12811c upstream. It was recently brought to my attention that 802.3ad mode bonds would no longer form when using some network hardware after a driver update. After snooping around I realized that the particular hardware was using page-based skbs and found that skb->data did not contain a valid LACPDU as it was not stored there. That explained the inability to form an 802.3ad-based bond. For balance-alb mode bonds this was also an issue as ARPs would not be properly processed. This patch fixes the issue in my tests and should be applied to 2.6.36 and as far back as anyone cares to add it to stable. Thanks to Alexander Duyck <alexander.h.duyck@intel.com> and Jesse Brandeburg <jesse.brandeburg@intel.com> for the suggestions on this one. Signed-off-by:
Andy Gospodarek <andy@greyhouse.net> CC: Alexander Duyck <alexander.h.duyck@intel.com> CC: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by:
Jay Vosburgh <fubar@us.ibm.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Dan Rosenberg authored
commit 44467187 upstream. Fixed formatting (tabs and line breaks). The EQL_GETMASTRCFG device ioctl allows unprivileged users to read 16 bytes of uninitialized stack memory, because the "master_name" member of the master_config_t struct declared on the stack in eql_g_master_cfg() is not altered or zeroed before being copied back to the user. This patch takes care of it. Signed-off-by:
Dan Rosenberg <dan.j.rosenberg@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Dan Rosenberg authored
commit 49c37c03 upstream. Fixed formatting (tabs and line breaks). The CHELSIO_GET_QSET_NUM device ioctl allows unprivileged users to read 4 bytes of uninitialized stack memory, because the "addr" member of the ch_reg struct declared on the stack in cxgb_extension_ioctl() is not altered or zeroed before being copied back to the user. This patch takes care of it. Signed-off-by:
Dan Rosenberg <dan.j.rosenberg@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Dan Rosenberg authored
commit 7011e660 upstream. Fixed formatting (tabs and line breaks). The TIOCGICOUNT device ioctl allows unprivileged users to read uninitialized stack memory, because the "reserved" member of the serial_icounter_struct struct declared on the stack in hso_get_count() is not altered or zeroed before being copied back to the user. This patch takes care of it. Signed-off-by:
Dan Rosenberg <dan.j.rosenberg@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
David S. Miller authored
[ Upstream commit 25edd694 ] This is based upon a report by Meelis Roos showing that it's possible that we'll try to fetch a property that is 32K in size with some devices. With the current fixed 3K buffer we use for moving data in and out of the firmware during PROM calls, that simply won't work. In fact, it will scramble random kernel data during bootup. The reasoning behind the temporary buffer is entirely historical. It used to be the case that we had problems referencing dynamic kernel memory (including the stack) early in the boot process before we explicitly told the firwmare to switch us over to the kernel trap table. So what we did was always give the firmware buffers that were locked into the main kernel image. But we no longer have problems like that, so get rid of all of this indirect bounce buffering. Besides fixing Meelis's bug, this also makes the kernel data about 3K smaller. It was also discovered during these conversions that the implementation of prom_retain() was completely wrong, so that was fixed here as well. Currently that interface is not in use. Reported-by:
Meelis Roos <mroos@linux.ee> Tested-by:
Meelis Roos <mroos@linux.ee> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Timo Teräs authored
[ Upstream commit 81a95f04 ] Realtek confirmed that a 20us delay is needed after mdio_read and mdio_write operations. Reduce the delay in mdio_write, and add it to mdio_read too. Also add a comment that the 20us is from hw specs. Signed-off-by:
Timo Teräs <timo.teras@iki.fi> Acked-by:
Francois Romieu <romieu@fr.zoreil.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Timo Teräs authored
[ Upstream commit 024a07ba ] Some configurations need delay between the "write completed" indication and new write to work reliably. Realtek driver seems to use longer delay when polling the "write complete" bit, so it waits long enough between writes with high probability (but could probably break too). This patch adds a new udelay to make sure we wait unconditionally some time after the write complete indication. This caused a regression with XID 18000000 boards when the board specific phy configuration writing many mdio registers was added in commit 2e955856 (r8169: phy init for the 8169scd). Some of the configration mdio writes would almost always fail, and depending on failure might leave the PHY in non-working state. Signed-off-by:
Timo Teräs <timo.teras@iki.fi> Acked-off-by:
Francois Romieu <romieu@fr.zoreil.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Tetsuo Handa authored
[ Upstream commit a9117426d0fcc05a194f728159a2d43df43c7add ] We assumed that unix_autobind() never fails if kzalloc() succeeded. But unix_autobind() allows only 1048576 names. If /proc/sys/fs/file-max is larger than 1048576 (e.g. systems with more than 10GB of RAM), a local user can consume all names using fork()/socket()/bind(). If all names are in use, those who call bind() with addr_len == sizeof(short) or connect()/sendmsg() with setsockopt(SO_PASSCRED) will continue while (1) yield(); loop at unix_autobind() till a name becomes available. This patch adds a loop counter in order to give up after 1048576 attempts. Calling yield() for once per 256 attempts may not be sufficient when many names are already in use, for __unix_find_socket_byname() can take long time under such circumstance. Therefore, this patch also adds cond_resched() call. Note that currently a local user can consume 2GB of kernel memory if the user is allowed to create and autobind 1048576 UNIX domain sockets. We should consider adding some restriction for autobind operation. Signed-off-by:
Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Alexey Kuznetsov authored
[ Upstream commit 01f83d69 ] If peer uses tiny MSS (say, 75 bytes) and similarly tiny advertised window, the SWS logic will packetize to half the MSS unnecessarily. This causes problems with some embedded devices. However for large MSS devices we do want to half-MSS packetize otherwise we never get enough packets into the pipe for things like fast retransmit and recovery to work. Be careful also to handle the case where MSS > window, otherwise we'll never send until the probe timer. Reported-by:
ツ Leandro Melo de Sales <leandroal@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Eric Dumazet authored
[ Upstream commit f037590f ] struct rds_rdma_notify contains a 32 bits hole on 64bit arches, make sure it is zeroed before copying it to user. Signed-off-by:
Eric Dumazet <eric.dumazet@gmail.com> CC: Andy Grover <andy.grover@oracle.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Steven J. Magnani authored
[ Upstream commit baff42ab ] tcp_read_sock() can have a eat skbs without immediately advancing copied_seq. This can cause a panic in tcp_collapse() if it is called as a result of the recv_actor dropping the socket lock. A userspace program that splices data from a socket to either another socket or to a file can trigger this bug. Signed-off-by:
Steven J. Magnani <steve@digidescorp.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
David S. Miller authored
[ Upstream commit 4ce6b9e1621c187a32a47a17bf6be93b1dc4a3df ] In a similar vain to commit 17762060 ("bridge: Clear IPCB before possible entry into IP stack") Any time we call into the IP stack we have to make sure the state there is as expected by the ipv4 code. With help from Eric Dumazet and Herbert Xu. Reported-by:
Brandan Das <brandan.das@stratus.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Herbert Xu authored
[ Upstream commit 17762060 ] The bridge protocol lives dangerously by having incestuous relations with the IP stack. In this instance an abomination has been created where a bogus IPCB area from a bridged packet leads to a crash in the IP stack because it's interpreted as IP options. This patch papers over the problem by clearing the IPCB area in that particular spot. To fix this properly we'd also need to parse any IP options if present but I'm way too lazy for that. Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-