- 20 Jul, 2010 4 commits
-
-
Manfred Spraul authored
The last change to improve the scalability moved the actual wake-up out of the section that is protected by spin_lock(sma->sem_perm.lock). This means that IN_WAKEUP can be in queue.status even when the spinlock is acquired by the current task. Thus the same loop that is performed when queue.status is read without the spinlock acquired must be performed when the spinlock is acquired. Thanks to kamezawa.hiroyu@jp.fujitsu.com for noticing lack of the memory barrier. Addresses https://bugzilla.kernel.org/show_bug.cgi?id=16255 [akpm@linux-foundation.org: clean up kerneldoc, checkpatch warning and whitespace] Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Reported-by: Luca Tettamanti <kronos.it@gmail.com> Tested-by: Luca Tettamanti <kronos.it@gmail.com> Reported-by: Christoph Lameter <cl@linux-foundation.org> Cc: Maciej Rutecki <maciej.rutecki@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.infradead.org/users/cbou/battery-2.6.35Linus Torvalds authored
* git://git.infradead.org/users/cbou/battery-2.6.35: ds2782_battery: Fix ds2782_get_capacity return value
-
git://git.kernel.org/pub/scm/linux/kernel/git/dgc/xfsdevLinus Torvalds authored
* 'shrinker' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/xfsdev: xfs: track AGs with reclaimable inodes in per-ag radix tree xfs: convert inode shrinker to per-filesystem contexts mm: add context argument to shrinker callback
-
git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstableLinus Torvalds authored
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: Btrfs: fix checks in BTRFS_IOC_CLONE_RANGE Btrfs: fix CLONE ioctl destination file size expansion to block boundary Btrfs: fix split_leaf double split corner case
-
- 19 Jul, 2010 15 commits
-
-
Dave Chinner authored
https://bugzilla.kernel.org/show_bug.cgi?id=16348 When the filesystem grows to a large number of allocation groups, the summing of recalimable inodes gets expensive. In many cases, most AGs won't have any reclaimable inodes and so we are wasting CPU time aggregating over these AGs. This is particularly important for the inode shrinker that gets called frequently under memory pressure. To avoid the overhead, track AGs with reclaimable inodes in the per-ag radix tree so that we can find all the AGs with reclaimable inodes via a simple gang tag lookup. This involves setting the tag when the first reclaimable inode is tracked in the AG, and removing the tag when the last reclaimable inode is removed from the tree. Then the summation process becomes a loop walking the radix tree summing AGs with the reclaim tag set. This significantly reduces the overhead of scanning - a 6400 AG filesystea now only uses about 25% of a cpu in kswapd while slab reclaim progresses instead of being permanently stuck at 100% CPU and making little progress. Clean filesystems filesystems will see no overhead and the overhead only increases linearly with the number of dirty AGs. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
-
Dave Chinner authored
Now the shrinker passes us a context, wire up a shrinker context per filesystem. This allows us to remove the global mount list and the locking problems that introduced. It also means that a shrinker call does not need to traverse clean filesystems before finding a filesystem with reclaimable inodes. This significantly reduces scanning overhead when lots of filesystems are present. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
-
Dan Rosenberg authored
1. The BTRFS_IOC_CLONE and BTRFS_IOC_CLONE_RANGE ioctls should check whether the donor file is append-only before writing to it. 2. The BTRFS_IOC_CLONE_RANGE ioctl appears to have an integer overflow that allows a user to specify an out-of-bounds range to copy from the source file (if off + len wraps around). I haven't been able to successfully exploit this, but I'd imagine that a clever attacker could use this to read things he shouldn't. Even if it's not exploitable, it couldn't hurt to be safe. Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com> cc: stable@kernel.org Signed-off-by: Chris Mason <chris.mason@oracle.com>
-
Linus Torvalds authored
Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, pci, mrst: Add extra sanity check in walking the PCI extended cap chain x86: Fix x2apic preenabled system with kexec x86: Force HPET readback_cmp for all ATI chipsets
-
git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-2.6-cmLinus Torvalds authored
* 'kmemleak' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-2.6-cm: kmemleak: Add support for NO_BOOTMEM configurations kmemleak: Annotate false positive in init_section_page_cgroup()
-
git://git390.marist.edu/pub/scm/linux-2.6Linus Torvalds authored
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6: [S390] cio: fix potential overflow in chpid descriptor [S390] add missing device put [S390] dasd: use correct label location for diag fba disks
-
Sreedhara DS authored
- fix reversing of command/sub arguments - fix a crash if the i2c interface is called before the device is found Signed-off-by: Sreedhara DS <sreedhara.ds@intel.com> Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Sage Weil authored
The CLONE and CLONE_RANGE ioctls round up the range of extents being cloned to the block size when the range to clone extends to the end of file (this is always the case with CLONE). It was then using that offset when extending the destination file's i_size. Fix this by not setting i_size beyond the originally requested ending offset. This bug was introduced by a22285a6 (2.6.35-rc1). Signed-off-by: Sage Weil <sage@newdream.net> Signed-off-by: Chris Mason <chris.mason@oracle.com>
-
Chris Mason authored
split_leaf was not properly balancing leaves when it was forced to split a leaf twice. This commit adds an extra push left and right before forcing the double split in hopes of getting the slot where we want to insert at either the start or end of the leaf. If the extra pushes do work, then we are able to avoid splitting twice and we keep the tree properly balanced. Signed-off-by: Chris Mason <chris.mason@oracle.com>
-
Catalin Marinas authored
With commits 08677214 and 59be5a8e, alloc_bootmem()/free_bootmem() and friends use the early_res functions for memory management when NO_BOOTMEM is enabled. This patch adds the kmemleak calls in the corresponding code paths for bootmem allocations. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Yinghai Lu <yinghai@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: stable@kernel.org
-
Catalin Marinas authored
The pointer to the page_cgroup table allocated in init_section_page_cgroup() is stored in section->page_cgroup as (base - pfn). Since this value does not point to the beginning or inside the allocated memory block, kmemleak reports a false positive. This was reported in bugzilla.kernel.org as #16297. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Reported-by: Adrien Dessemond <adrien.dessemond@gmail.com> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Andrew Morton <akpm@linux-foundation.org>
-
Sebastian Ott authored
The length filed in the chsc response block (if valid) has a value of n*(sizeof(chp_desc))+8 (for the response block header). When we memcopied from the response block to the actual descriptor we copied 8 bytes too much. The bug was not revealed since the descriptor is embedded in struct channel_path. Since we only write one descriptor at a time ignore the length value and use sizeof(*desc). Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Stefan Haberland authored
The dasd_alias_show function does not return a device reference in case the device is an alias. Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Peter Oberparleiter authored
Partition boundary calculation fails for DASD FBA disks under the following conditions: - disk is formatted with CMS FORMAT with a blocksize of more than 512 bytes - all of the disk is reserved to a single CMS file using CMS RESERVE - the disk is accessed using the DIAG mode of the DASD driver Under these circumstances, the partition detection code tries to read the CMS label block containing partition-relevant information from logical block offset 1, while it is in fact located at physical block offset 1. Fix this problem by using the correct CMS label block location depending on the device type as determined by the DASD SENSE ID information. Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Dave Chinner authored
The current shrinker implementation requires the registered callback to have global state to work from. This makes it difficult to shrink caches that are not global (e.g. per-filesystem caches). Pass the shrinker structure to the callback so that users can embed the shrinker structure in the context the shrinker needs to operate on and get back to it in the callback via container_of(). Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
-
- 18 Jul, 2010 5 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-rolandLinus Torvalds authored
* 'x86/kprobes' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-roland: x86: kprobes: fix swapped segment registers in kretprobe
-
Roland McGrath authored
In commit f007ea26, the order of the %es and %ds segment registers got accidentally swapped, so synthesized 'struct pt_regs' frames have the two values inverted. It's almost sure that these values never matter, and that they also never differ. But wrong is wrong. Signed-off-by: Roland McGrath <roland@redhat.com>
-
git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6Linus Torvalds authored
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: PCI: fall back to original BIOS BAR addresses
-
git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2Linus Torvalds authored
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: ocfs2: Silence gcc warning in ocfs2_write_zero_page(). jbd2/ocfs2: Fix block checksumming when a buffer is used in several transactions ocfs2/dlm: Remove BUG_ON from migration in the rare case of a down node ocfs2: Don't duplicate pages past i_size during CoW. ocfs2: tighten up strlen() checking ocfs2: Make xattr reflink work with new local alloc reservation. ocfs2: make xattr extension work with new local alloc reservation. ocfs2: Remove the redundant cpu_to_le64. ocfs2/dlm: don't access beyond bitmap size ocfs2: No need to zero pages past i_size. ocfs2: Zero the tail cluster when extending past i_size. ocfs2: When zero extending, do it by page. ocfs2: Limit default local alloc size within bitmap range. ocfs2: Move orphan scan work to ocfs2_wq. fs/ocfs2/dlm: Add missing spin_unlock
-
Linus Torvalds authored
The hibernate issues that got fixed in commit 985b823b ("drm/i915: fix hibernation since i915 self-reclaim fixes") turn out to have been incomplete. Vefa Bicakci tested lots of hibernate cycles, and without the __GFP_RECLAIMABLE flag the system eventually fails to resume. With the flag added, Vefa can apparently hibernate forever (or until he gets bored running his automated scripts, whichever comes first). The reclaimable flag was there originally, and was one of the flags that were dropped (unintentionally) by commit 4bdadb97 ("drm/i915: Selectively enable self-reclaim") that introduced all these problems, but I didn't want to just blindly add back all the flags in commit 985b823b, and it looked like __GFP_RECLAIM wasn't necessary. It clearly was. I still suspect that there is some subtle reason we're missing that causes the problems, but __GFP_RECLAIMABLE is certainly not wrong to use in this context, and is what the code historically used. And we have no idea what the causes the corruption without it. Reported-and-tested-by: M. Vefa Bicakci <bicave@superonline.com> Cc: Dave Airlie <airlied@gmail.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 16 Jul, 2010 9 commits
-
-
Jacob Pan authored
The fixed bar capability structure is searched in PCI extended configuration space. We need to make sure there is a valid capability ID to begin with otherwise, the search code may stuck in a infinite loop which results in boot hang. This patch adds additional check for cap ID 0, which is also invalid, and indicates end of chain. End of chain is supposed to have all fields zero, but that doesn't seem to always be the case in the field. Suggested-by: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> LKML-Reference: <1279306706-27087-1-git-send-email-jacob.jun.pan@linux.intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
-
Yinghai Lu authored
Found one x2apic system kexec loop test failed when CONFIG_NMI_WATCHDOG=y (old) or CONFIG_LOCKUP_DETECTOR=y (current tip) first kernel can kexec second kernel, but second kernel can not kexec third one. it can be duplicated on another system with BIOS preenabled x2apic. First kernel can not kexec second kernel. It turns out, when kernel boot with pre-enabled x2apic, it will not execute disable_local_APIC on shutdown path. when init_apic_mappings() is called in setup_arch, it will skip setting of apic_phys when x2apic_mode is set. ( x2apic_mode is much early check_x2apic()) Then later, disable_local_APIC() will bail out early because !apic_phys. So check !x2apic_mode in x2apic_mode in disable_local_APIC with !apic_phys. another solution could be updating init_apic_mappings() to set apic_phys even for preenabled x2apic system. Actually even for x2apic system, that lapic address is mapped already in early stage. BTW: is there any x2apic preenabled system with apicid of boot cpu > 255? Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <4C3EB22B.3000701@kernel.org> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: stable@kernel.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
-
Joel Becker authored
ocfs2_write_zero_page() has a loop that won't ever be skipped, but gcc doesn't know that. Set ret=0 just to make gcc happy. Signed-off-by: Joel Becker <joel.becker@oracle.com>
-
Bjorn Helgaas authored
If we fail to assign resources to a PCI BAR, this patch makes us try the original address from BIOS rather than leaving it disabled. Linux tries to make sure all PCI device BARs are inside the upstream PCI host bridge or P2P bridge apertures, reassigning BARs if necessary. Windows does similar reassignment. Before this patch, if we could not move a BAR into an aperture, we left the resource unassigned, i.e., at address zero. Windows leaves such BARs at the original BIOS addresses, and this patch makes Linux do the same. This is a bit ugly because we disable the resource long before we try to reassign it, so we have to keep track of the BIOS BAR address somewhere. For lack of a better place, I put it in the struct pci_dev. I think it would be cleaner to attempt the assignment immediately when the claim fails, so we could easily remember the original address. But we currently claim motherboard resources in the middle, after attempting to claim PCI resources and before assigning new PCI resources, and changing that is a fairly big job. Addresses https://bugzilla.kernel.org/show_bug.cgi?id=16263Reported-by: Andrew <nitr0@seti.kr.ua> Tested-by: Andrew <nitr0@seti.kr.ua> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
-
Linus Torvalds authored
Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: tracing: Add alignment to syscall metadata declarations perf: Sync callchains with period based hits perf: Resurrect flat callchains perf: Version String fix, for fallback if not from git perf: Version String fix, using kernel version
-
git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixesLinus Torvalds authored
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes: GFS2: rename causes kernel Oops GFS2: BUG in gfs2_adjust_quota GFS2: Fix kernel NULL pointer dereference by dlm_astd GFS2: recovery stuck on transaction lock GFS2: O_TRUNC not working on stuffed files across cluster
-
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/inputLinus Torvalds authored
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: w90p910_ts - fix call to setup_timer() Input: synaptics - fix wrong dimensions check Input: i8042 - mark stubs in i8042.h "static inline"
-
Wan ZongShun authored
No need to take address, w90p910_ts is already a pointer. Signed-off-by: Wan ZongShun <mcuos.com@gmail.com> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
-
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds authored
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: skcipher - avoid NULL dereference
-
- 15 Jul, 2010 7 commits
-
-
Jan Kara authored
OCFS2 uses t_commit trigger to compute and store checksum of the just committed blocks. When a buffer has b_frozen_data, checksum is computed for it instead of b_data but this can result in an old checksum being written to the filesystem in the following scenario: 1) transaction1 is opened 2) handle1 is opened 3) journal_access(handle1, bh) - This sets jh->b_transaction to transaction1 4) modify(bh) 5) journal_dirty(handle1, bh) 6) handle1 is closed 7) start committing transaction1, opening transaction2 8) handle2 is opened 9) journal_access(handle2, bh) - This copies off b_frozen_data to make it safe for transaction1 to commit. jh->b_next_transaction is set to transaction2. 10) jbd2_journal_write_metadata() checksums b_frozen_data 11) the journal correctly writes b_frozen_data to the disk journal 12) handle2 is closed - There was no dirty call for the bh on handle2, so it is never queued for any more journal operation 13) Checkpointing finally happens, and it just spools the bh via normal buffer writeback. This will write b_data, which was never triggered on and thus contains a wrong (old) checksum. This patch fixes the problem by calling the trigger at the moment data is frozen for journal commit - i.e., either when b_frozen_data is created by do_get_write_access or just before we write a buffer to the log if b_frozen_data does not exist. We also rename the trigger to t_frozen as that better describes when it is called. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
-
Wengang Wang authored
For migration, we are waiting for DLM_LOCK_RES_MIGRATING flag to be set before sending DLM_MIG_LOCKRES_MSG message to the target. We are using dlm_migration_can_proceed() for that purpose. However, if the node is down, dlm_migration_can_proceed() will also return "go ahead". In this rare case, the DLM_LOCK_RES_MIGRATING flag might not be set yet. Remove the BUG_ON() that trips over this condition. Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
-
Tao Ma authored
During CoW, the pages after i_size don't contain valid data, so there's no need to read and duplicate them. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
-
Thomas Gleixner authored
commit 30a564be (x86, hpet: Restrict read back to affected ATI chipset) restricted the workaround for the HPET bug to SMX00 chipsets. This was reasonable as those were the only ones against which we ever got a bug report. Stephan Wolf reported now that this patch breaks his IXP400 based machine. Though it's confirmed to work on other IXP400 based systems. To error out on the safe side, we force the HPET readback workaround for all ATI SMbus class chipsets. Reported-by: Stephan Wolf <stephan@letzte-bankreihe.de> LKML-Reference: <alpine.LFD.2.00.1007142134140.3321@localhost.localdomain> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Stephan Wolf <stephan@letzte-bankreihe.de> Acked-by: Borislav Petkov <borislav.petkov@amd.com>
-
Bob Peterson authored
This patch fixes a kernel Oops in the GFS2 rename code. The problem was in the way the gfs2 directory code was trying to re-use sentinel directory entries. In the failing case, gfs2's rename function was renaming a file to another name that had the same non-trivial length. The file being renamed happened to be the first directory entry on the leaf block. First, the rename code (gfs2_rename in ops_inode.c) found the original directory entry and decided it could do its job by simply replacing the directory entry with another. Therefore it determined correctly that no block allocations were needed. Next, the rename code deleted the old directory entry prior to replacing it with the new name. Therefore, the soon-to-be replaced directory entry was temporarily made into a directory entry "sentinel" or a place holder at the start of a leaf block. Lastly, it went to re-add the replacement directory entry in that leaf block. However, when gfs2_dirent_find_space was looking for space in the leaf block, it used the wrong value for the sentinel. That threw off its calculations so later it decides it can't really re-use the sentinel and therefore must allocate a new leaf block. But because it previously decided to re-use the directory entry, it didn't waste the time to grab a new block allocation for the inode. Therefore, the inode's i_alloc pointer was still NULL and it crashes trying to reference it. In the case of sentinel directory entries, the entire dirent is reused, not just the "free space" portion of it, and therefore the function gfs2_dirent_find_space should use the value 0 rather than GFS2_DIRENT_SIZE(0) for the actual dirent size. Fixing this calculation enables the reproducer programs to work properly. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
Abhijith Das authored
HighMem pages on i686 do not get mapped to the buffer_heads and this was causing a NULL pointer dereference when we were trying to memset page buffers to zero. We now use zero_user() that kmaps the page and directly manipulates page data. This patch also fixes a boundary condition that was incorrect. Signed-off-by: Abhi Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
Bob Peterson authored
This patch fixes a problem in an error path when looking up dinodes. There are two sister-functions, gfs2_inode_lookup and gfs2_process_unlinked_inode. Both functions acquire and hold the i_iopen glock for the dinode being looked up. The last thing they try to do is hold the i_gl glock for the dinode. If that glock fails for some reason, the error path was incorrectly calling gfs2_glock_put for the i_iopen glock twice. This resulted in the glock being prematurely freed. The "minimum hold time" usually kept the glock in memory, but the lock interface to dlm (aka lock_dlm) freed its memory for the glock. In some circumstances, it would cause dlm's dlm_astd daemon to try to call the bast function for the freed lock_dlm memory, which resulted in a NULL pointer dereference. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-