- 21 Dec, 2011 40 commits
-
-
Ian Campbell authored
commit d3db7281 upstream. d312ae87 "xen: use maximum reservation to limit amount of usable RAM" clamped the total amount of RAM to the current maximum reservation. This is correct for dom0 but is not correct for guest domains. In order to boot a guest "pre-ballooned" (e.g. with memory=1G but maxmem=2G) in order to allow for future memory expansion the guest must derive max_pfn from the e820 provided by the toolstack and not the current maximum reservation (which can reflect only the current maximum, not the guest lifetime max). The existing algorithm already behaves this correctly if we do not artificially limit the maximum number of pages for the guest case. For a guest booted with maxmem=512, memory=128 this results in: [ 0.000000] BIOS-provided physical RAM map: [ 0.000000] Xen: 0000000000000000 - 00000000000a0000 (usable) [ 0.000000] Xen: 00000000000a0000 - 0000000000100000 (reserved) -[ 0.000000] Xen: 0000000000100000 - 0000000008100000 (usable) -[ 0.000000] Xen: 0000000008100000 - 0000000020800000 (unusable) +[ 0.000000] Xen: 0000000000100000 - 0000000020800000 (usable) ... [ 0.000000] NX (Execute Disable) protection: active [ 0.000000] DMI not present or invalid. [ 0.000000] e820 update range: 0000000000000000 - 0000000000010000 (usable) ==> (reserved) [ 0.000000] e820 remove range: 00000000000a0000 - 0000000000100000 (usable) -[ 0.000000] last_pfn = 0x8100 max_arch_pfn = 0x1000000 +[ 0.000000] last_pfn = 0x20800 max_arch_pfn = 0x1000000 [ 0.000000] initial memory mapped : 0 - 027ff000 [ 0.000000] Base memory trampoline at [c009f000] 9f000 size 4096 -[ 0.000000] init_memory_mapping: 0000000000000000-0000000008100000 -[ 0.000000] 0000000000 - 0008100000 page 4k -[ 0.000000] kernel direct mapping tables up to 8100000 @ 27bb000-27ff000 +[ 0.000000] init_memory_mapping: 0000000000000000-0000000020800000 +[ 0.000000] 0000000000 - 0020800000 page 4k +[ 0.000000] kernel direct mapping tables up to 20800000 @ 26f8000-27ff000 [ 0.000000] xen: setting RW the range 27e8000 - 27ff000 [ 0.000000] 0MB HIGHMEM available. -[ 0.000000] 129MB LOWMEM available. -[ 0.000000] mapped low ram: 0 - 08100000 -[ 0.000000] low ram: 0 - 08100000 +[ 0.000000] 520MB LOWMEM available. +[ 0.000000] mapped low ram: 0 - 20800000 +[ 0.000000] low ram: 0 - 20800000 With this change "xl mem-set <domain> 512M" will successfully increase the guest RAM (by reducing the balloon). There is no change for dom0. Reported-and-Tested-by: George Shuklin <george.shuklin@gmail.com> Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Alex Deucher authored
commit cf2aff6e upstream. Supposedly both NUTMEG and TRAVIS should use the same panel mode, but switching the panel mode for TRAVIS gets things working. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=41569Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Alex Deucher authored
commit 1d33e1fc upstream. Return the encoder id rather than a boolean. This is needed for differentiate between multiple DP bridge chips. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Alex Deucher authored
commit b4f15f80 upstream. The logic was messy and hard to follow. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Phillip Lougher authored
commit 434a964d upstream. Clement Lecigne reports a filesystem which causes a kernel oops in hfs_find_init() trying to dereference sb->ext_tree which is NULL. This proves to be because the filesystem has a corrupted MDB extent record, where the extents file does not fit into the first three extents in the file record (the first blocks). In hfs_get_block() when looking up the blocks for the extent file (HFS_EXT_CNID), it fails the first blocks special case, and falls through to the extent code (which ultimately calls hfs_find_init()) which is in the process of being initialised. Hfs avoids this scenario by always having the extents b-tree fitting into the first blocks (the extents B-tree can't have overflow extents). The fix is to check at mount time that the B-tree fits into first blocks, i.e. fail if HFS_I(inode)->alloc_blocks >= HFS_I(inode)->first_blocks Note, the existing commit 47f365eb ("hfs: fix oops on mount with corrupted btree extent records") becomes subsumed into this as a special case, but only for the extents B-tree (HFS_EXT_CNID), it is perfectly acceptable for the catalog B-Tree file to grow beyond three extents, with the remaining extent descriptors in the extents overfow. This fixes CVE-2011-2203 Reported-by: Clement LECIGNE <clement.lecigne@netasq.com> Signed-off-by: Phillip Lougher <plougher@redhat.com> Cc: Jeff Mahoney <jeffm@suse.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Moritz Mühlenhoff <jmm@inutil.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Eryu Guan authored
commit 8762202d upstream. I hit a J_ASSERT(blocknr != 0) failure in cleanup_journal_tail() when mounting a fsfuzzed ext3 image. It turns out that the corrupted ext3 image has s_first = 0 in journal superblock, and the 0 is passed to journal->j_head in journal_reset(), then to blocknr in cleanup_journal_tail(), in the end the J_ASSERT failed. So validate s_first after reading journal superblock from disk in journal_get_superblock() to ensure s_first is valid. The following script could reproduce it: fstype=ext3 blocksize=1024 img=$fstype.img offset=0 found=0 magic="c0 3b 39 98" dd if=/dev/zero of=$img bs=1M count=8 mkfs -t $fstype -b $blocksize -F $img filesize=`stat -c %s $img` while [ $offset -lt $filesize ] do if od -j $offset -N 4 -t x1 $img | grep -i "$magic";then echo "Found journal: $offset" found=1 break fi offset=`echo "$offset+$blocksize" | bc` done if [ $found -ne 1 ];then echo "Magic \"$magic\" not found" exit 1 fi dd if=/dev/zero of=$img seek=$(($offset+23)) conv=notrunc bs=1 count=1 mkdir -p ./mnt mount -o loop $img ./mnt Cc: Jan Kara <jack@suse.cz> Signed-off-by: Eryu Guan <guaneryu@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Moritz Mühlenhoff <jmm@inutil.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Mark Langsdorf authored
commit 2ded6e6a upstream. When HPET is operating in RTC mode, the TN_ENABLE bit on timer1 controls whether the HPET or the RTC delivers interrupts to irq8. When the system goes into suspend, the RTC driver sends a signal to the HPET driver so that the HPET releases control of irq8, allowing the RTC to wake the system from suspend. The switchover is accomplished by a write to the HPET configuration registers which currently only occurs while servicing the HPET interrupt. On some systems, I have seen the system suspend before an HPET interrupt occurs, preventing the write to the HPET configuration register and leaving the HPET in control of the irq8. As the HPET is not active during suspend, it does not generate a wake signal and RTC alarms do not work. This patch forces the HPET driver to immediately transfer control of the irq8 channel to the RTC instead of waiting until the next interrupt event. Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com> Link: http://lkml.kernel.org/r/20111118153306.GB16319@alberich.amd.comTested-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Konrad Rzeszutek Wilk authored
commit e5fd47bf upstream. The idea behind commit d91ee586 ("cpuidle: replace xen access to x86 pm_idle and default_idle") was to have one call - disable_cpuidle() which would make pm_idle not be molested by other code. It disallows cpuidle_idle_call to be set to pm_idle (which is excellent). But in the select_idle_routine() and idle_setup(), the pm_idle can still be set to either: amd_e400_idle, mwait_idle or default_idle. This depends on some CPU flags (MWAIT) and in AMD case on the type of CPU. In case of mwait_idle we can hit some instances where the hypervisor (Amazon EC2 specifically) sets the MWAIT and we get: Brought up 2 CPUs invalid opcode: 0000 [#1] SMP Pid: 0, comm: swapper Not tainted 3.1.0-0.rc6.git0.3.fc16.x86_64 #1 RIP: e030:[<ffffffff81015d1d>] [<ffffffff81015d1d>] mwait_idle+0x6f/0xb4 ... Call Trace: [<ffffffff8100e2ed>] cpu_idle+0xae/0xe8 [<ffffffff8149ee78>] cpu_bringup_and_idle+0xe/0x10 RIP [<ffffffff81015d1d>] mwait_idle+0x6f/0xb4 RSP <ffff8801d28ddf10> In the case of amd_e400_idle we don't get so spectacular crashes, but we do end up making an MSR which is trapped in the hypervisor, and then follow it up with a yield hypercall. Meaning we end up going to hypervisor twice instead of just once. The previous behavior before v3.0 was that pm_idle was set to default_idle regardless of select_idle_routine/idle_setup. We want to do that, but only for one specific case: Xen. This patch does that. Fixes RH BZ #739499 and Ubuntu #881076 Reported-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Sascha Hauer authored
commit e58f516f upstream. When we can't configure the dma channel we want to fall back to PIO. We do this by setting host->do_dma to zero. This does not work as do_dma is used to see whether dma can be used for the current transfer. Instead, we have to set host->dma to NULL. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Chris Ball <cjb@laptop.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Arnaud Patard authored
commit 9811ccdf upstream. arm_dma_zone_size is used by arm_bootmem_free() which is called by paging_init(). Thus it needs to be set before calling it. Signed-off-by: Arnaud Patard <arnaud.patard@rtp-net.org> Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Axel Lin authored
commit 0b57d760 upstream. wait_for_completion_interruptible_timeout() may return negative value. In this case, checking if (t > 0) will return true if t is unsigned. Signed-off-by: Axel Lin <axel.lin@gmail.com> Acked-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Linus Torvalds authored
commit 13c07b02 upstream. Exactly like roundup_pow_of_two(1), the rounddown version was buggy for the case of a compile-time constant '1' argument. Probably because it originated from the same code, sharing history with the roundup version from before the bugfix (for that one, see commit 1a06a52e: "Fix roundup_pow_of_two(1)"). However, unlike the roundup version, the fix for rounddown is to just remove the broken special case entirely. It's simply not needed - the generic code 1UL << ilog2(n) does the right thing for the constant '1' argment too. The only reason roundup needed that special case was because rounding up does so by subtracting one from the argument (and then adding one to the result) causing the obvious problems with "ilog2(0)". But rounddown doesn't do any of that, since ilog2() naturally truncates (ie "rounds down") to the right rounded down value. And without the ilog2(0) case, there's no reason for the special case that had the wrong value. tl;dr: rounddown_pow_of_two(1) should be 1, not 0. Acked-by: Dmitry Torokhov <dtor@vmware.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Jeff Layton authored
commit 7023676f upstream. Prior to commit eaf35b1e, cifs_save_resume_key had some NULL pointer checks at the top. It turns out that at least one of those NULL pointer checks is needed after all. When the LastNameOffset in a FIND reply appears to be beyond the end of the buffer, CIFSFindFirst and CIFSFindNext will set srch_inf.last_entry to NULL. Since eaf35b1e, the code will now oops in this situation. Fix this by having the callers check for a NULL last entry pointer before calling cifs_save_resume_key. No change is needed for the call site in cifs_readdir as it's not reachable with a NULL current_entry pointer. This should fix: https://bugzilla.redhat.com/show_bug.cgi?id=750247 Cc: Christoph Hellwig <hch@infradead.org> Reported-by: Adam G. Metzler <adamgmetzler@gmail.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Tejun Heo authored
commit a855b84c upstream. Percpu allocator recorded the cpus which map to the first and last units in pcpu_first/last_unit_cpu respectively and used them to determine the address range of a chunk - e.g. it assumed that the first unit has the lowest address in a chunk while the last unit has the highest address. This simply isn't true. Groups in a chunk can have arbitrary positive or negative offsets from the previous one and there is no guarantee that the first unit occupies the lowest offset while the last one the highest. Fix it by actually comparing unit offsets to determine cpus occupying the lowest and highest offsets. Also, rename pcu_first/last_unit_cpu to pcpu_low/high_unit_cpu to avoid confusion. The chunk address range is used to flush cache on vmalloc area map/unmap and decide whether a given address is in the first chunk by per_cpu_ptr_to_phys() and the bug was discovered by invalid per_cpu_ptr_to_phys() translation for crash_note. Kudos to Dave Young for tracking down the problem. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: WANG Cong <xiyou.wangcong@gmail.com> Reported-by: Dave Young <dyoung@redhat.com> Tested-by: Dave Young <dyoung@redhat.com> LKML-Reference: <4EC21F67.10905@redhat.com> Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Sebastian Andrzej Siewior authored
commit 9649fa1b upstream. This patch changes fileio to use for_each_sg() when walking se_task->task_sg memory passed into from loopback LLD struct scsi_cmnd scatterlist memory. This addresses an issue where FILEIO backends with loopback where hitting the following OOPs with mkfs.ext2: |kernel BUG at include/linux/scatterlist.h:97! |invalid opcode: 0000 [#1] PREEMPT SMP |Modules linked in: sd_mod tcm_loop target_core_stgt scsi_tgt target_core_pscsi target_core_file target_core_iblock target_core_mod configfs scsi_mod | |Pid: 671, comm: LIO_fileio Not tainted 3.1.0-rc10+ #139 Bochs Bochs |EIP: 0060:[<e0afd746>] EFLAGS: 00010202 CPU: 0 |EIP is at fd_do_task+0x396/0x420 [target_core_file] | [<e0aa7884>] __transport_execute_tasks+0xd4/0x190 [target_core_mod] | [<e0aa797c>] transport_execute_tasks+0x3c/0xf0 [target_core_mod] |EIP: [<e0afd746>] fd_do_task+0x396/0x420 [target_core_file] SS:ESP 0068:dea47e90 Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Nicholas Bellinger authored
commit 7ae0b103 upstream. This patch sets the missing ISCSI_FLAG_CMD_FINAL bit in iscsit_send_task_mgt_rsp() for a struct iscsi_tm_rsp PDU. This usage is hardcoded for all TM response PDUs in RFC-3720 section 10.6. Reported-by: whucecil <whucecil1999@gmail.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Roland Dreier authored
commit 1289a057 upstream. The LSB of the page length is at offset 3, not 2. Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Roland Dreier authored
commit 9b5cd7f3 upstream. SBC-3 says: A TRANSFER LENGTH field set to zero specifies that 256 logical blocks shall be written. Any other value specifies the number of logical blocks that shall be written. The old code was always just returning the value in the TRANSFER LENGTH byte. Fix this to return 256 if the byte is 0. Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Nicholas Bellinger authored
commit 7e46cf02 upstream. This patch fixes iscsi-target handling of underflow where residual data is causing an OOPs by using the incorrect iscsi_cmd_t->data_length initially assigned in iscsit_allocate_se_cmd(). It resets iscsi_cmd_t->data_length from se_cmd_t->data_length after transport_generic_allocate_tasks() has been invoked in iscsit_handle_scsi_cmd() RX context, and converts iscsi_cmd->residual_count usage to access iscsi_cmd->se_cmd.residual_count to get the proper residual count set by target-core. Reported-by: <lists@internyc.net> Cc: Christoph Hellwig <hch@lst.de> Cc: Andy Grover <agrover@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Nicholas Bellinger authored
commit fef58a60 upstream. This patch changes transport_generic_map_mem_to_cmd() to reject SCSI data overflow and to send exception status with CHECK_CONDITION + TCM_INVALID_CDB_FIELD for fabrics that are passing a pre-populated struct scatterlist (eg: tcm_loop and iscsi-target) being mapped into se_cmd->t_data_sg and se_cmd->t_data_nents. This addresses an OOPs where transport_allocate_data_tasks() would walk the incorrect post OVERFLOW cmd->data_length value beyond the end of the passed scatterlist. Cc: Christoph Hellwig <hch@lst.de> Cc: Andy Grover <agrover@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Tetsuo Handa authored
commit 1418a3e5 upstream. Current tomoyo_realpath_from_path() implementation returns strange pathname when calculating pathname of a file which belongs to lazy unmounted tree. Use local pathname rather than strange absolute pathname in that case. Also, this patch fixes a regression by commit 02125a82 "fix apparmor dereferencing potentially freed dentry, sanitize __d_path() API". Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Al Viro authored
commit 02125a82 upstream. __d_path() API is asking for trouble and in case of apparmor d_namespace_path() getting just that. The root cause is that when __d_path() misses the root it had been told to look for, it stores the location of the most remote ancestor in *root. Without grabbing references. Sure, at the moment of call it had been pinned down by what we have in *path. And if we raced with umount -l, we could have very well stopped at vfsmount/dentry that got freed as soon as prepend_path() dropped vfsmount_lock. It is safe to compare these pointers with pre-existing (and known to be still alive) vfsmount and dentry, as long as all we are asking is "is it the same address?". Dereferencing is not safe and apparmor ended up stepping into that. d_namespace_path() really wants to examine the place where we stopped, even if it's not connected to our namespace. As the result, it looked at ->d_sb->s_magic of a dentry that might've been already freed by that point. All other callers had been careful enough to avoid that, but it's really a bad interface - it invites that kind of trouble. The fix is fairly straightforward, even though it's bigger than I'd like: * prepend_path() root argument becomes const. * __d_path() is never called with NULL/NULL root. It was a kludge to start with. Instead, we have an explicit function - d_absolute_root(). Same as __d_path(), except that it doesn't get root passed and stops where it stops. apparmor and tomoyo are using it. * __d_path() returns NULL on path outside of root. The main caller is show_mountinfo() and that's precisely what we pass root for - to skip those outside chroot jail. Those who don't want that can (and do) use d_path(). * __d_path() root argument becomes const. Everyone agrees, I hope. * apparmor does *NOT* try to use __d_path() or any of its variants when it sees that path->mnt is an internal vfsmount. In that case it's definitely not mounted anywhere and dentry_path() is exactly what we want there. Handling of sysctl()-triggered weirdness is moved to that place. * if apparmor is asked to do pathname relative to chroot jail and __d_path() tells it we it's not in that jail, the sucker just calls d_absolute_path() instead. That's the other remaining caller of __d_path(), BTW. * seq_path_root() does _NOT_ return -ENAMETOOLONG (it's stupid anyway - the normal seq_file logics will take care of growing the buffer and redoing the call of ->show() just fine). However, if it gets path not reachable from root, it returns SEQ_SKIP. The only caller adjusted (i.e. stopped ignoring the return value as it used to do). Reviewed-by: John Johansen <john.johansen@canonical.com> ACKed-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Mel Gorman authored
commit 1368edf0 upstream. Commit f5252e00 ("mm: avoid null pointer access in vm_struct via /proc/vmallocinfo") adds newly allocated vm_structs to the vmlist after it is fully initialised. Unfortunately, it did not check that __vmalloc_area_node() successfully populated the area. In the event of allocation failure, the vmalloc area is freed but the pointer to freed memory is inserted into the vmlist leading to a a crash later in get_vmalloc_info(). This patch adds a check for ____vmalloc_area_node() failure within __vmalloc_node_range. It does not use "goto fail" as in the previous error path as a warning was already displayed by __vmalloc_area_node() before it called vfree in its failure path. Credit goes to Luciano Chavez for doing all the real work of identifying exactly where the problem was. Signed-off-by: Mel Gorman <mgorman@suse.de> Reported-by: Luciano Chavez <lnx1138@linux.vnet.ibm.com> Tested-by: Luciano Chavez <lnx1138@linux.vnet.ibm.com> Reviewed-by: Rik van Riel <riel@redhat.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Michal Hocko authored
commit d0215638 upstream. setup_zone_migrate_reserve() expects that zone->start_pfn starts at pageblock_nr_pages aligned pfn otherwise we could access beyond an existing memblock resulting in the following panic if CONFIG_HOLES_IN_ZONE is not configured and we do not check pfn_valid: IP: [<c02d331d>] setup_zone_migrate_reserve+0xcd/0x180 *pdpt = 0000000000000000 *pde = f000ff53f000ff53 Oops: 0000 [#1] SMP Pid: 1, comm: swapper Not tainted 3.0.7-0.7-pae #1 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform EIP: 0060:[<c02d331d>] EFLAGS: 00010006 CPU: 0 EIP is at setup_zone_migrate_reserve+0xcd/0x180 EAX: 000c0000 EBX: f5801fc0 ECX: 000c0000 EDX: 00000000 ESI: 000c01fe EDI: 000c01fe EBP: 00140000 ESP: f2475f58 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 Process swapper (pid: 1, ti=f2474000 task=f2472cd0 task.ti=f2474000) Call Trace: [<c02d389c>] __setup_per_zone_wmarks+0xec/0x160 [<c02d3a1f>] setup_per_zone_wmarks+0xf/0x20 [<c08a771c>] init_per_zone_wmark_min+0x27/0x86 [<c020111b>] do_one_initcall+0x2b/0x160 [<c086639d>] kernel_init+0xbe/0x157 [<c05cae26>] kernel_thread_helper+0x6/0xd Code: a5 39 f5 89 f7 0f 46 fd 39 cf 76 40 8b 03 f6 c4 08 74 32 eb 91 90 89 c8 c1 e8 0e 0f be 80 80 2f 86 c0 8b 14 85 60 2f 86 c0 89 c8 <2b> 82 b4 12 00 00 c1 e0 05 03 82 ac 12 00 00 8b 00 f6 c4 08 0f EIP: [<c02d331d>] setup_zone_migrate_reserve+0xcd/0x180 SS:ESP 0068:f2475f58 CR2: 00000000000012b4 We crashed in pageblock_is_reserved() when accessing pfn 0xc0000 because highstart_pfn = 0x36ffe. The issue was introduced in 3.0-rc1 by 6d3163ce ("mm: check if any page in a pageblock is reserved before marking it MIGRATE_RESERVE"). Make sure that start_pfn is always aligned to pageblock_nr_pages to ensure that pfn_valid s always called at the start of each pageblock. Architectures with holes in pageblocks will be correctly handled by pfn_valid_within in pageblock_is_reserved. Signed-off-by: Michal Hocko <mhocko@suse.cz> Signed-off-by: Mel Gorman <mgorman@suse.de> Tested-by: Dang Bo <bdang@vmware.com> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Arve Hjnnevg <arve@android.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Thomas Gleixner authored
commit d68fb11c upstream. The clock_getres() function must return the resolution in the timespec argument and return 0 for success. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: John Stultz <john.stultz@linaro.org> Cc: Richard Cochran <richard.cochran@omicron.at> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Yong Zhang authored
commit a33caeb1 upstream. Since commit f59de899 ("lockdep: Clear whole lockdep_map on initialization"), lockdep_init_map() will clear all the struct. But it will break lock_set_class()/lock_set_subclass(). A typical race condition is like below: CPU A CPU B lock_set_subclass(lockA); lock_set_class(lockA); lockdep_init_map(lockA); /* lockA->name is cleared */ memset(lockA); __lock_acquire(lockA); /* lockA->class_cache[] is cleared */ register_lock_class(lockA); look_up_lock_class(lockA); WARN_ON_ONCE(class->name != lock->name); lock->name = name; So restore to what we have done before commit f59de899 but annotate ->lock with kmemcheck_mark_initialized() to suppress the kmemcheck warning reported in commit f59de899. Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Reported-by: Borislav Petkov <bp@alien8.de> Suggested-by: Vegard Nossum <vegard.nossum@gmail.com> Signed-off-by: Yong Zhang <yong.zhang0@gmail.com> Cc: Tejun Heo <tj@kernel.org> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20111109080451.GB8124@zhySigned-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Youquan Song authored
commit 58a84aa9 upstream. Commit 70b50f94 ("mm: thp: tail page refcounting fix") keeps all page_tail->_count zero at all times. But the current kernel does not set page_tail->_count to zero if a 1GB page is utilized. So when an IOMMU 1GB page is used by KVM, it wil result in a kernel oops because a tail page's _count does not equal zero. kernel BUG at include/linux/mm.h:386! invalid opcode: 0000 [#1] SMP Call Trace: gup_pud_range+0xb8/0x19d get_user_pages_fast+0xcb/0x192 ? trace_hardirqs_off+0xd/0xf hva_to_pfn+0x119/0x2f2 gfn_to_pfn_memslot+0x2c/0x2e kvm_iommu_map_pages+0xfd/0x1c1 kvm_iommu_map_memslots+0x7c/0xbd kvm_iommu_map_guest+0xaa/0xbf kvm_vm_ioctl_assigned_device+0x2ef/0xa47 kvm_vm_ioctl+0x36c/0x3a2 do_vfs_ioctl+0x49e/0x4e4 sys_ioctl+0x5a/0x7c system_call_fastpath+0x16/0x1b RIP gup_huge_pud+0xf2/0x159 Signed-off-by: Youquan Song <youquan.song@intel.com> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Youquan Song authored
commit b6999b19 upstream. With the 3.2-rc kernel, IOMMU 2M pages in KVM works. But when I tried to use IOMMU 1GB pages in KVM, I encountered an oops and the 1GB page failed to be used. The root cause is that 1GB page allocation calls gup_huge_pud() while 2M page calls gup_huge_pmd. If compound pages are used and the page is a tail page, gup_huge_pmd() increases _mapcount to record tail page are mapped while gup_huge_pud does not do that. So when the mapped page is relesed, it will result in kernel oops because the page is not marked mapped. This patch add tail process for compound page in 1GB huge page which keeps the same process as 2M page. Reproduce like: 1. Add grub boot option: hugepagesz=1G hugepages=8 2. mount -t hugetlbfs -o pagesize=1G hugetlbfs /dev/hugepages 3. qemu-kvm -m 2048 -hda os-kvm.img -cpu kvm64 -smp 4 -mem-path /dev/hugepages -net none -device pci-assign,host=07:00.1 kernel BUG at mm/swap.c:114! invalid opcode: 0000 [#1] SMP Call Trace: put_page+0x15/0x37 kvm_release_pfn_clean+0x31/0x36 kvm_iommu_put_pages+0x94/0xb1 kvm_iommu_unmap_memslots+0x80/0xb6 kvm_assign_device+0xba/0x117 kvm_vm_ioctl_assigned_device+0x301/0xa47 kvm_vm_ioctl+0x36c/0x3a2 do_vfs_ioctl+0x49e/0x4e4 sys_ioctl+0x5a/0x7c system_call_fastpath+0x16/0x1b RIP put_compound_page+0xd4/0x168 Signed-off-by: Youquan Song <youquan.song@intel.com> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Claudio Scordino authored
commit b53fc7c2 upstream. Fix the error message "directives may not be used inside a macro argument" which appears when the kernel is compiled for the cris architecture. Signed-off-by: Claudio Scordino <claudio@evidence.eu.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Jonghwan Choi authored
commit 2dbcd05f upstream. If an error occurs after the clock is enabled, the enable/disable state can become unbalanced. Signed-off-by: Jonghwan Choi <jhbird.choi@samsung.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Acked-by: Kukjin Kim <kgene.kim@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Mark Brown authored
commit cefcc03f upstream. Allow userspace applications to do more parameter setting by providing a more complete stub DMA driver specifying a wildcard set of formats and channels and essentially random values for the DMA parameters. This is required for useful runtime operation of the dummy DMA driver until we are able to figure out how to power up links and do hw_params() from DAPM. Sending to stable as without this the dummy driver is not terribly useful. Reported-by: Kyung-Kwee Ryu <Kyung-Kwee.Ryu@wolfsonmicro.com> Tested-by: Kyung-Kwee Ryu <Kyung-Kwee.Ryu@wolfsonmicro.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Hans Verkuil authored
commit 83713fc9 upstream. The function setup_vpif_input_channel_mode() used the VSCLKDIS register instead of VIDCLKCTL. This meant that when in HD mode videoport channel 0 used a different clock from channel 1. Clearly a copy-and-paste error. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Acked-by: Manjunath Hadli <manjunath.hadli@ti.com> Signed-off-by: Sekhar Nori <nsekhar@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Anders Darander authored
commit 0e934e22 upstream. Since 193ab2a6, various AT91 boards don't register at91_udc anymore due to depending on a now non-existing symbol. Fix the symbol name. Signed-off-by: Anders Darander <anders@chargestorm.se> Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Jean-Christophe PLAGNIOL-VILLARD authored
commit 1808958d upstream. The conid is supposed to be t0/t1/t2_clk. Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Wolfram Sang authored
commit 11357be9 upstream. Adding the machine_is_* line was forgotten when converting mach-stmp378x to mach-mxs. Signed-off-by: Wolfram Sang <w.sang@pengutronix.de> Signed-off-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Manjunathappa, Prakash authored
commit f1b21c52 upstream. On OMAP-L138 platform, EDMA event queue 0 should be used for audio transfers so that they are not starved by video data moving on event queue 1. Commit 48519f0a (ASoC: davinci: let platform data define edma queue numbers) had a side-effect of changing this behavior by making the driver actually honor the platform data passed. Fix this now by passing event queue 0 as the queue to be used for audio transfers. Signed-off-by: Manjunathappa, Prakash <prakash.pm@ti.com> Signed-off-by: Sekhar Nori <nsekhar@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Thomas Gleixner authored
commit c9c024b3 upstream. The expiry function compares the timer against current time and does not expire the timer when the expiry time is >= now. That's wrong. If the timer is set for now, then it must expire. Make the condition expiry > now for breaking out the loop. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Takashi Iwai authored
commit f1a73746 upstream. Some HP laptops with IDT 92HD75 codecs may use a GPIO > 4 for the mute LED, but currently the driver doesn't check this properly, and confuses the mute LED behavior. This ended up with the silent output on some HP laptops due to having another GPIO used as external amp control. This patch fixes the problem by checking the max GPIO count and comparing with the given value from DMI entry instead of magic fixed value 4 and 8, and adding a new field to indicate the VREF mute-LED behavior. Reported-and-tested-by: Vitaliy Kulikov <Vitaliy.Kulikov@idt.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Takashi Iwai authored
commit cce4aa37 upstream. When no imux is available (e.g. a single capture source), alc_auto_init_input_src() may trigger an Oops due to the access to -1. Add a proper zero-check to avoid it. Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
David Dillow authored
commit fc084e0b upstream. There are some AC97 codec and board combinations that have been observed to take a very long time to respond after the cold reset has completed. In one case, more than 350 ms was required. To allow users to have sound on those platforms, we'll wait up to 500ms for the codec to become ready. As a board may have multiple codecs, with some faster than others to reset, we add a module parameter to inform the driver which codecs should be present. Reported-by: KotCzarny <tjosko@yahoo.com> Signed-off-by: David Dillow <dave@thedillows.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-