- 19 Oct, 2004 40 commits
-
-
William Lee Irwin III authored
I've been informed that /proc/profile livelocks some systems in the timer interrupt, usually at boot. The following patch attempts to amortize the atomic operations done on the profile buffer to address this stability concern. This patch has nothing to do with performance; kernels using periodic timer interrupts are under realtime constraints to complete whatever work they perform within timer interrupts before the next timer interrupt arrives lest they livelock, performing no work whatsoever apart from servicing timer interrupts. The latency of the cacheline bounce for prof_buffer contributes to the time spent in the timer interrupt, hence it must be amortized when remote access latencies or deviations from fair exclusive cacheline acquisition may cause cacheline bounces to take longer than the interval between timer ticks. What this patch does is to create a pair of per-cpu open-addressed hashtables indexed by profile buffer slot holding values representing the number of pending profile buffer hits for the profile buffer slot. When this hashtable overflows, one iterates over the hashtable accounting each of the pairs of profile buffer slots and hit counts to the global profile buffer. Zero is a legitimate profile buffer slot, so zero hit counts represent unused hashtable entries. The hashtable is furthermore protected from flush IPI's by interrupt disablement. In order to flush the pending profile hits for read_profile(), this patch flips betweeen the pairs of per-cpu profile buffer by signalling all cpus to flip via IPI at the time of read_profile(), followed by doing all the work to flush the profile hits from the older per-cpu buffers in the context of the caller of read_profile(), with exclusion provided by a semaphore ensuring that only one caller of profile_flip_buffers() may execute at a time, and using interrupt disablement to prevent buffer flip IPI's from altering the hashtables or flip state while an update is in progress. The flip state is per-cpu so that remote cpus need only disable interrupts locally for synchronization, which is both simple and busywait-free for remote cpus. The flip states all change in tandem when some cpu requests the hashtables be flipped, and the requester waits for the completion of smp_call_function() for notification that all cpus have finished flipping between their hashtables. The IPI handler merely toggles the flip state (which is an array index) between 0 and 1. This is expected to be a much stronger amortization than merely reducing the frequency of profile buffer access by a factor of the size of the hashtable because numerous hits may be held for each of its entries. This reduces what was before the patch a number of atomic increments equal to what after the patch becomes the sum of the hits held for each entry in the hashtable, to a number of atomic_add()'s equal to the number of entries in the per_cpu hashtable. This is nondeterministic, but as the profile hits tend to be concentrated in a very small number of profile buffer slots during any given timing interval, is likely to represent a very large number of atomic increments. This amortization of atomic increments does not depend on the hash function, only the sharp peakedness of the distribution of profile buffer hits. This algorithm has two advantages over full-size per-cpu profile buffers. The first is that the space footprint is much smaller. Per-cpu profile buffers would increase the space requirements by a factor of num_online_cpus(), where this algorithm only requires one page per cpu. The second is that reading the profile state is much faster, because the state that must be traversed is exactly the above space consumers, and the relative reduction in size concomitantly reduces the time required for a read operation. I also took the liberty of adding some commentary to the comments at the beginning of the file reflecting the major work done on profile.c in recent months and describing what the file implements. The reporters of this issue have verified that this resolves their timer interrupt livelock on 512x Altixen. In my own testing on 4x logical x86-64, this patch saw a rate of about 18 flushes per minute under load, or about one flush every 3 seconds, for about 38.4 atomic accesses to the profile buffer per second per cpu in one of the algorithm's worst cases, about 3.84% of the number of atomic profile buffer accesses per second per cpu as a normal kernel would commit. This represents a twenty-six-fold increase in the scalability on SMP systems with 4KB PAGE_SIZE, i.e. with a 4KB PAGE_SIZE, the number of atomic profile buffer accesses per second per cpu is reduced by a factor of 26, thereby increasing the number of cpus a system must have before it would experience a timer interrupt livelock by a factor of 26, with the proviso that cacheline bounces must take the same amount of time to service. This increase in the scalability of the kernel is expected to be much larger for ia64, which has a large PAGE_SIZE, because the distribution of profile buffer hits is so sharply peaked that doubling the hashtable size will much more than double the amortization factor. In fact, only 19 flushes were observed on a 64x Altix over an approximately 10 minute AIM7 run, and 1 flush on a 512x Altix over the course of an entire AIM7 run, for truly vast effective amortization factors. A prior version of this patch, which did not include the node-local hashtable allocation and bounded collision chains has been successfully tested on 64x and 512x ia64 vs 2.6.9-rc2, 8x ia64 vs. 2.6.9-rc2-mm1, 4x x86-64 vs. 2.6.9-rc2-mm1, and 6x sparc64 vs. 2.6.9-rc2-mm1. This patch minus the hashtable initialization fix has been successfully tested on 2x ppc64, 2x alpha, 8x ia64, 6x sparc64, and 4x x86-64, all vs. 2.6.9-rc2-mm1. This precise version of the patch has been successfully tested on 8x ia32 against 2.6.9-rc2-mm1 and 6x sparc64 vs. both 2.6.9-rc2-mm1 and 2.6.9-rc2-mm2. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
James Morris authored
The patch below allows all types of filesystems to specify the fscreate mount option (which is used to specify the security context of the filesystem itself). This was previously only available for filesystems with full xattr security labeling, but is also potentially required for filesystems with e.g. psuedo xattr labeling such as devpts and tmpfs. An example of use is to specify at mount time the fs security context of a tmpfs filesystem, overriding the default specified in policy for that filesystem. This patch has been in the Fedora kernel for some weeks with no problems. Signed-off-by: James Morris <jmorris@redhat.com> Signed-off-by: Stephen Smalley <sds@epoch.ncsc.mil> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Andreas Gruenbacher authored
* ext[23]_xattr_list(): - Before inserting an xattr block into the cache, make sure that the block is not corrupted. The check got moved after inserting into the cache in the xattr consolidation patches, so corrupted blocks could become visible to cache users. - Take a variable out of the loop that calls the ->list handlers. * A few cosmetic changes. Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
James Morris authored
This patch adds xattr support to tmpfs, and a security xattr handler. The purpose of this is to allow udev to be mounted on tmpfs, as used currently by Fedora. Original patch from: Luke Kenneth Casson Leighton <lkcl@lkcl.net>. Signed-off-by: James Morris <jmorris@redhat.com> Signed-off-by: Stephen Smalley <sds@epoch.ncsc.mil> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
James Morris authored
This patch updates the devpts xattr handler code to the generic xattr API, also adds a GPL notice, author and copyright details. Signed-off-by: James Morris <jmorris@redhat.com> Signed-off-by: Stephen Smalley <sds@epoch.ncsc.mil> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
James Morris authored
This patch converts ext2 xattr and acl code to the new generic xattr API. Signed-off-by: James Morris <jmorris@redhat.com> Signed-off-by: Stephen Smalley <sds@epoch.ncsc.mil> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
James Morris authored
This patch converts the ext3 xattr and acl code to the generic xattr API. Signed-off-by: James Morris <jmorris@redhat.com> Signed-off-by: Stephen Smalley <sds@epoch.ncsc.mil> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
James Morris authored
This patch replaces the dentry parameter with an inode in the LSM inode_{set|get|list}security hooks, in keeping with the ext2/ext3 code. dentries are not needed here. Signed-off-by: James Morris <jmorris@redhat.com> Signed-off-by: Stephen Smalley <sds@epoch.ncsc.mil> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
James Morris authored
This patch consolidates common xattr handling logic into the core fs code, with modifications suggested by Christoph Hellwig (hang off superblock, remove locking, use generic code as methods), for use by ext2, ext3 and devpts, as well as upcoming tmpfs xattr code. Signed-off-by: James Morris <jmorris@redhat.com> Signed-off-by: Stephen Smalley <sds@epoch.ncsc.mil> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Carlos Eduardo Medaglia Dyonisio authored
This patch fixes troubles when compiling some applications that include <linux/byteorder/little_endian.h>, like xmms. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Ulrich Drepper authored
The last change to alloc_layer in lib/idr.c unnecessarily complicates the code and depending on the definition of spin_unlock will cause worse code to be generated than necessary. The following patch should improve the situation. Signed-off-by: Ulrich Drepper <drepper@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Haroldo Gamal authored
This patch fixes "Samba Bugzilla Bug 999". The last version (2.6.8.1) of smbfs kernel module do not honor uid, gid, file_mode and dir_mode supplied by user during mount. This bug is also logged as "Kernel Bug Tracker Bug 3330". To fully work, some modifications are needed to samba smbmount.c and smbmnt.c files. Those patches are available at Samba and Kernel Bug Tracker pages. After those patches, if the user do not supply any of the parameters above, the uid, gid, file_mode and dir_mode on the server will be used by the client. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Nick Piggin authored
Hugh and I both thought this would be generally useful. Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Nick Piggin authored
This taint didn't appear to be reported. Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Andi Kleen authored
This patch adds machine check tainting. When a handled machine check occurs the oops gets a new 'M' flag. This is useful to ignore machines with hardware problems in oops reports. On i386 a thermal failure also sets this flag. Done for x86-64 and i386 so far. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Dipankar Sarma authored
Finally some in-tree documentation for RCU-based dcache look-up. Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Dipankar Sarma authored
Tested using dcachebench and hevy rename test. http://lse.sourceforge.net/locking/dcache/rename_test/ While going over dcache code, I realized that d_bucket which was introduced to prevent hash chain traversals from going into an infinite loop earlier, is no longer necessary. Originally, when RCU based lock-free lookup was first introduced, dcache hash chains used list_head. Hash chain traversal was terminated when dentry->next reaches the list_head in the hash bucket. However, if renames happen during a lock-free lookup, a dentry may move to different bucket and subsequent hash chain traversal from there onwards may not see the list_head in the original bucket at all. In fact, this would result in the list_head in the bucket interpreted as a list_head in dentry and bad things will happen after that. Once hlist based hash chains were introduced in dcache, the termination condition changed and lock-free traversal would be safe with NULL pointer based termination of hlists. This means that d_bucket check is no longer required. There still exist some theoritical livelocks like a dentry getting continuously moving and lock-free look-up never terminating. But that isn't really any worse that what we have. In return for these changes, we reduce the dentry size by the size of a pointer. That should make akpm and mpm happy. Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Dipankar Sarma authored
__d_lookup() has leftover stuff from earlier code to protect it against rename. The smp_rmb() there was needed for the sequence counter logic. Original dcache_rcu had : + move_count = dentry->d_move_count; + smp_rmb(); + if (dentry->d_name.hash != hash) continue; if (dentry->d_parent != parent) continue; This was to make sure that comparisons didn't happen before before the sequence counter was snapshotted. This logic is now gone and memory barrier is not needed. Removing this should also improve performance. The other change is the leftover smp_read_barrier_depends(), later converted to rcu_dereference(). Originally, the name comparison was not protected against d_move() and there could have been a mismatch of allocation size of the name string and dentry->d_name.len. This was avoided by making the qstr update in dentry atomic using a d_qstr pointer. Now, we do ->d_compare() or memcmp() with the d_lock held and it is safe against d_move(). So, there is no need to rcu_dereference() anything. In fact, the current code is meaningless. Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Martin Schwidefsky authored
This patch moves some definitions among time.h, times.h, timex.h and jiffies.h. The purpose is to sort all jiffies related functions to jiffies.h, to get rid of the cyclic dependency between time.h & timex.h and to move all #include lines to the start of the header files. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Martin Schwidefsky authored
The CLOCK_TICK_FACTOR and FINETUNE defines from <asm/timex.h> are not used anywhere. Kill them. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Martin Schwidefsky authored
For non-smp kernels the call to update_process_times is done in the do_timer function. It is more consistent with smp kernels to move this call to the architecture file which calls do_timer. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Christoph Hellwig authored
security.h gets pulled in in lots of places, so use forward declarations for struct ctl_table instead of pulling sysctl in everywhere. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Christoph Hellwig authored
These had been officially deprecated since Rusty's module rewrite, but never got the __deprecated marker. The only remaining users are drm and mtd, so we'll get some warnings for common builds. But maybe that's the only way to get the drm people to fix the mess :) Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Christoph Hellwig authored
They've been marked deprecated since 2.5.x and there's no more users. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Andreas Gruenbacher authored
When building external modules, MODVERDIR is relative to the external module instead of in the kernel source tree. Use the MODVERDIR environment variable instead of the hard-coded path in modpost. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Stelian Pop authored
A simple ringbuffer implementation for various character drivers. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
William Lee Irwin III authored
Andi Kleen requested that the number of pagetable pages in use by a process be reported in /proc/$PID/status; this patch implements that. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
William Lee Irwin III authored
Relatively minor add-on (not necessarily tied to it or required to be taken or a fix for any bug). Since cond_resched() is using PREEMPT_ACTIVE now, it may be useful to update the open-coded instance of cond_resched() to use the generic call. Also, it should probably be __sched so the caller shows up in wchan. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
William Lee Irwin III authored
Not all binfmts page align ->end_code and ->start_code, so the task_mmu statistics calculations need to perform this alignment themselves. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Natalie Protasevich authored
In arch/i386/kernel/acpi/boot.c, platform GSI does not propagate back from mp_register_gsi() to a calling routine which results in IRQ to be set for wrong GSI. This causes most of the PCI slots on the first PCI module to fail. This patch fixes the problem by returning new GSI back to acpi_register_gsi(). Signed-off-by: Natalie Protasevich <Natalie.Protasevich@unisys.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Ian Kent authored
Having recently repaired autofs' ability to recognise updates to maps dynamically I found I needed to reintroduce the directory inode lookup method (I broke the update recognition several versions ago, oops). This patch does this and applies cleanly against 2.6.9-rc1-mm4. As far as I can tell from testing it doesn't introduce any backward incompatibilities. Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Zwane Mwaikambo authored
I had to use the following patch to allow multiple arguments to be passed down to the asm stub for alternative_input whilst writing alternatives for mwait code, it seems like a simple enough fix. Signed-off-by: Zwane Mwaikambo <zwane@linuxpower.ca> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
William Lee Irwin III authored
The pid_max sysctl doesn't enforce PID_MAX_LIMIT or sane lower bounds. RESERVED_PIDS + 1 is the minimum pid_max that won't break alloc_pidmap(), and PID_MAX_LIMIT may not be aligned to 8*PAGE_SIZE boundaries for unusual values of PAGE_SIZE, so this also rounds up PID_MAX_LIMIT to it. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
William Lee Irwin III authored
/proc/ breaks when PID_MAX_LIMIT is elevated on 32-bit, so this patch lowers it there. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
William Lee Irwin III authored
I was informed that the vendor component of the copyright can't be clobbered without more care, so this patch retains the older vendor, updating it only to reflect the appropriate time period. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
William Lee Irwin III authored
Rewrite alloc_pidmap() to clarify control flow by eliminating all usage of goto, honor pid_max and first available pid after last_pid semantics, make only a single pass over the used portion of the pid bitmap, and update copyrights to reflect ongoing maintenance by Ingo and myself. Signed-off-by: William Irwin <wli@holomorphy.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Suresh B. Siddha authored
Sync x86_64 noexec behaviour with i386. And remove all the confusing noexec related boot parameters. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Petr Vandrovec authored
For several months I'm receiving complaints from matroxfb users that v4lctl suddenly stops working for them on kernel upgrade. Problem is that VIDIOC_S_CTRL was renumbered, but all distros still use old VIDIOC_S_CTRL value (f.e. even xawtv-3.94 in Debian unstable still uses old VIDIOC_S_CTRL definition). So let's add this old VIDIOC_S_CTRL value (now named VIDIOC_S_CTRL_OLD) to matroxfb's v4l handling. Signed-off-by: Petr Vandrovec <vandrove@vc.cvut.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Antonino Daplas authored
Trivial fb_get_options fix for - cyber200fb - bw2fb Signed-off-by: Antonino Daplas <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Geert Uytterhoeven authored
fm2fb: Trivial fix for the breakage introduced by the addition of fb_get_options(). Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-