- 25 May, 2003 40 commits
-
-
Andrew Morton authored
  - Add an explanation for clearing the focus bit on P4 (zwane)
  - __d_path kerneldoc fix (John Levon)
  - generic-hdlc documentation fix (Krzysztof Halasa <khc@pm.waw.pl>)
  - cmdline_read_proc cleanup (Oleg Drokin)
  - remove a couple of unused vars from drivers/ide/pci/hpt366.c
  - sound/core/sgbuf.c needs mm.h at least on alpha, for mem_map and other page stuff. (Ivan Kokshaysky <ink@jurassic.park.msu.ru>)
  - Don't use "u32 long" in cs46xx.c (Kevin Puetz <puetzk@puetzk.org>)
  - fs/nfs/nfs4xdr.c warning fix: all the `goto out;' statements are commented away, so comment away the label too.
  - net/ipv6/af_inet6.c: remove unused var
  - drivers/media/video/bttv-cards.c: jiffies are unsigned long
  - drivers/media/video/saa7134/saa7134-cards.c: unused var
  - Fix Documentation/Changes comment wrt sparc compiler version
  - drivers/pnp/quirks.c needs slab.h for kfree(). (Daniele Bellucci <bellucda@tiscali.it>)
-
Andrew Morton authored
From: David Gibson <david@gibson.dropbear.id.au> Renames check_valid_hugepage_range() to is_hugepage_only_range(), which makes more sense.
-
Andrew Morton authored
From: Manfred Spraul <manfred@colorfullife.com> de_thread is called by exec to kill all threads in the thread group except the threads required for exec. The waiting is implemented by waiting for a wakeup from __exit_signal: if the reference count is less than or equal to 2, then the waiter is woken up. If exec is called by a non-leader thread, then two threads are required for exec. But if a thread group leader calls exec, then only one thread is required for exec. Thus the hardcoded "2" leads to a superfluous wakeup. The patch fixes that by adding a "notify_count" field to the signal structure.
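The superfluous wakeup can be illustrated with a small Python model. This is only a sketch, not kernel code: `count_wakeups` and `threads_required_for_exec` are hypothetical names, and the model assumes that every exiting thread triggers one threshold check, which is a simplification of what the message describes.

```python
def threads_required_for_exec(caller_is_leader):
    """Threads that must survive de_thread, as stated in the commit message."""
    # A group leader calling exec needs only itself; a non-leader needs
    # itself plus the group leader.
    return 1 if caller_is_leader else 2


def count_wakeups(initial_threads, required, threshold):
    """Count how often the exec waiter is woken while threads exit.

    Hypothetical model: threads exit one at a time until only `required`
    remain, and each exit wakes the waiter if the remaining count is
    less than or equal to `threshold`.
    """
    wakeups = 0
    remaining = initial_threads
    while remaining > required:
        remaining -= 1                 # one thread exits
        if remaining <= threshold:     # the check described for __exit_signal
            wakeups += 1
    return wakeups


if __name__ == "__main__":
    need = threads_required_for_exec(caller_is_leader=True)
    print("hardcoded 2: ", count_wakeups(5, need, threshold=2))
    print("notify_count:", count_wakeups(5, need, threshold=need))
```

With a per-exec threshold (the role the message gives to notify_count), the leader case wakes the waiter only once the last superfluous thread is gone.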
-
Andrew Morton authored
From: Frank Cusack <fcusack@fcusack.com> net/sunrpc/sunrpc_syms.c typo fix
-
Andrew Morton authored
From: Dave Hansen <haveblue@us.ibm.com> This patch makes vm_enough_memory() more likely to return failure when overcommit_memory==0 and !CAP_SYS_ADMIN. I'm not sure it's worth having another tunable just for this. I also reworked the documentation a bit. It should be a lot clearer to read now.
-
Andrew Morton authored
From: Stephen Smalley <sds@epoch.ncsc.mil> This patch against 2.5.69-bk adds an xattr handler for security labels to devpts and corresponding hooks to the LSM API to support conversion between xattr values and the security labels stored in the inode security field by the security module. This allows userspace to get and set the security labels on devpts nodes, e.g. so that sshd can set the security label for the pty using setxattr, just as sshd already sets the ownership using chown. SELinux uses this support to protect the pty in accordance with the user process' security label. The changes to the LSM API are general and should be re-useable by xattr handlers in other pseudo filesystems to support similar security labeling. The xattr handler for devpts includes the same generic framework as in ext[23], so handlers for other kinds of attributes can be added easily in the future.
-
Andrew Morton authored
From: Christopher Hoover <ch@murgatroid.com> Here's a patch to drop some more text/data/bss out of 2.5. This time the ``victim'' is eventpollfs (epoll).
-
Andrew Morton authored
From: Christopher Hoover <ch@murgatroid.com> Not everyone needs futex support, so it should be optional. This is needed for small platforms.
-
Andrew Morton authored
From: Stephen Smalley <sds@epoch.ncsc.mil> This patch against 2.5.69-bk adds a hook to proc_pid_make_inode to allow security modules to set the security attributes on /proc/pid inodes based on the security attributes of the associated task. This is required by SELinux in order to control access to the process state accessible via /proc/pid inodes in accordance with the task's security label. An alternative approach that was considered was to implement an xattr handler for /proc/pid inodes. That approach would still require a hook call from the xattr handler to the security module to obtain an xattr value based on the task security attributes, so it would add a further level of indirection/translation. The only benefit of implementing an xattr handler for the /proc/pid inodes would be that the /proc/pid inode security labels could then be exported to userspace. However, the /proc/pid inode security labels are only used internally by the security module for access control purposes, and userspace access to the full range of process attributes is already provided via the /proc/pid/attr interface. Consequently, a simple hook in proc_pid_make_inode seemed preferable.
-
Andrew Morton authored
From: Stephen Smalley <sds@epoch.ncsc.mil> This patch, relative to the /proc/pid/attr patch against 2.5.69, fixes the mode values of the /proc/pid/attr nodes to avoid interference by the normal Linux access checks for these nodes (and also fixes the /proc/pid/attr/prev mode to reflect its read-only nature). Otherwise, when the dumpable flag is cleared by a set[ug]id or unreadable executable, a process will lose the ability to set its own attributes via writes to /proc/pid/attr due to a DAC failure (/proc/pid inodes are assigned the root uid/gid if the task is not dumpable, and the original mode only permitted the owner to write). The security module should implement appropriate permission checking in its [gs]etprocattr hook functions. In the case of SELinux, the setprocattr hook function only allows a process to write to its own /proc/pid/attr nodes as well as imposing other policy-based restrictions, and the getprocattr hook function performs a permission check between the security labels of the current process and target process to determine whether the operation is permitted.
-
Andrew Morton authored
From: Stephen Smalley <sds@epoch.ncsc.mil> This updated patch against 2.5.69 merges the readdir and lookup routines for proc_base and proc_attr, fixes the copy_to_user call in proc_attr_read and proc_info_read, moves the new data and code within CONFIG_SECURITY, and uses ARRAY_SIZE, per the comments from Al Viro and Andrew Morton. As before, this patch implements a process attribute API for security modules via a set of nodes in a /proc/pid/attr directory. Credit for the idea of implementing this API via /proc/pid/attr nodes goes to Al Viro. Jan Harkes provided a nice cleanup of the implementation to reduce the code bloat.
-
Andrew Morton authored
All slabs which can be reclaimed via VM pressure are marked as being shrinkable, so the core slab code will keep count of their pages. Except for the one in XFS. It has strange wrapper stuff.
-
Andrew Morton authored
We have a problem at present in vm_enough_memory(): it uses smoke-n-mirrors to try to work out how much memory can be reclaimed from dcache and icache. It sometimes gets it quite wrong, especially if the slab has internal fragmentation. And it often does. So here we take a new approach. Rather than trying to work out how many pages are reclaimable by counting up the number of inodes and dentries, we change the slab allocator to keep count of how many pages are currently used by slabs which can be shrunk by the VM. The creator of the slab marks the slab as being reclaimable at kmem_cache_create()-time. Slab keeps a global counter of pages which are currently in use by thus-tagged slabs. Of course, we now slightly overestimate the amount of reclaimable memory, because not _all_ of the icache, dcache, mbcache and quota caches are reclaimable. But I think it's better to be a bit permissive rather than bogusly failing brk() calls as we do at present.
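The accounting scheme can be sketched as a toy Python model. The class `SlabModel`, its methods and the `reclaimable` argument are hypothetical illustrations; the message only says that a cache is tagged at kmem_cache_create() time and that slab keeps one global page counter for tagged caches.

```python
class SlabModel:
    """Toy model of per-cache tagging plus one global reclaimable counter."""

    def __init__(self):
        self.reclaimable_pages = 0   # global counter of pages in tagged caches
        self._reclaimable = {}       # cache name -> tagged at creation?
        self._pages = {}             # cache name -> pages currently in use

    def cache_create(self, name, reclaimable=False):
        # The tag is fixed when the cache is created.
        self._reclaimable[name] = reclaimable
        self._pages[name] = 0

    def grow(self, name, pages):
        self._pages[name] += pages
        if self._reclaimable[name]:
            self.reclaimable_pages += pages

    def shrink(self, name, pages):
        # Never free more pages than the cache actually holds.
        pages = min(pages, self._pages[name])
        self._pages[name] -= pages
        if self._reclaimable[name]:
            self.reclaimable_pages -= pages
        return pages


if __name__ == "__main__":
    slab = SlabModel()
    slab.cache_create("dentry_cache", reclaimable=True)
    slab.cache_create("task_struct")             # not shrinkable by the VM
    slab.grow("dentry_cache", 40)
    slab.grow("task_struct", 10)
    print("reclaimable pages:", slab.reclaimable_pages)
```

The counter tracks whole pages, so it stays meaningful under internal fragmentation, at the cost of the slight overestimate the message mentions.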
-
Andrew Morton authored
From: Neil Brown <neilb@cse.unsw.edu.au> When an NFS request arrives, it contains a filehandle which needs to be converted to a dentry. Many filesystems use find_exported_dentry in fs/exportfs/expfs.c. A key part of this, on filesystems where a 32bit inode number uniquely locates a file, is export_iget, which calls iget(sb, inum). iget will either:
  1/ find the inode in the inode cache and return it, or
  2/ create a new inode and call ->read_inode to load it from the storage device.
export_iget then verifies the inode is really a good inode (->read_inode didn't detect any problems) and the right inode (based on the generation number from the file handle). For this to work reliably, it is important that whenever an inode is *not* in the cache, the on-device version is up-to-date. Otherwise, when read_inode loads the inode it will get bad data. For a file that has not been deleted, this condition always holds: a dirty inode is always flushed to disc before the inode is unhashed. However, for a file that is being deleted this condition doesn't (didn't) hold. When iput -> iput_final -> generic_drop_inode -> generic_delete_inode is called we would unhash the inode before calling into the filesystem through ->delete_inode. So there is a small window between when generic_delete_inode unhashes the inode, and when ->delete_inode writes something to disc, where a call to ->read_inode (for export_iget) might discover what it thinks is a valid inode, but is really one that is in the process of being destroyed. It is this window that I want to close by moving the unhashing to the end of generic_delete_inode.
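The window can be made concrete with a small Python model of the two orderings. It is a sketch under stated assumptions: the dictionaries standing in for the inode cache and the disc, the state names, and the single lookup placed in the middle of the delete are all hypothetical. How the real kernel treats a lookup that hits an inode still in the cache but being freed is not described in the message, so the model only reports what such a lookup would observe.

```python
VALID, FREEING, DELETED = "valid", "freeing", "deleted"


def lookup(cache, disk, inum):
    """Stand-in for export_iget: use the cache if possible, else read disc."""
    if inum in cache:
        return ("cache", cache[inum])
    return ("disk", disk[inum])        # models ->read_inode


def race_during_delete(unhash_first, inum=7):
    """Run a delete with one lookup landing between its two steps."""
    cache = {inum: VALID}
    disk = {inum: VALID}
    cache[inum] = FREEING              # deletion has started
    if unhash_first:
        # Old order: unhash, then ->delete_inode updates the disc.
        del cache[inum]
        seen = lookup(cache, disk, inum)
        disk[inum] = DELETED
    else:
        # New order: ->delete_inode first, unhash at the very end.
        disk[inum] = DELETED
        seen = lookup(cache, disk, inum)
        del cache[inum]
    return seen


if __name__ == "__main__":
    print("unhash first:", race_during_delete(unhash_first=True))
    print("unhash last: ", race_during_delete(unhash_first=False))
```

In the old order the lookup falls through to a disc copy that still looks valid; in the new order it always finds the cached inode, whose state shows that it is going away.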
-
Andrew Morton authored
From: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> There are a couple of places in the readdir code where it forgets to set the returned error code to -EFAULT, leaving it at the default -EINVAL. Fix that up, and rename getdents_callback64.count to "result", which makes more sense.
-
Andrew Morton authored
From zwane: We shut down the MAC part of the card and have interrupts disabled, an interrupt gets queued, we reenable interrupts after shutting down the device, service the interrupt, check status and get 0xff from the powered-down device. No idea what he's talking about here, but apparently the irq return handling isn't working out. Just return IRQ_HANDLED all the time.
-
Andrew Morton authored
From: Oleg Drokin <green@namesys.com> This is a forward port of 2.4's inode attributes support for reiserfs. The original implementation for 2.4 was done by Nikita Danilov. In order to enable this support, one must use the "attrs" mount option, e.g.:
    mount /dev/hda1 /mount/point -t reiserfs -o attrs
Also, either the filesystem must have been created with a recent mkreiserfs or it must have been modified by a recent version of reiserfsck with its "--clean-attributes" option. If that is not done, attributes support will not be enabled and a kernel message will be printed. This is necessary because old kernels left random garbage in the place where these attributes now live. These attributes are totally compatible with ext2's ones. You can manipulate them with chattr/lsattr etc. Additionally the chattr 'd' option may be used to disable tail packing on a specific file or a directory tree. (The 'd' option normally means "don't dump". reiserfs has overloaded it).
-
Andrew Morton authored
From: Zwane Mwaikambo <zwane@linuxpower.ca> kapmd does a conditional check in order to decide whether to set the task's cpu affinity mask. This can change during runtime, therefore we unconditionally set it. There is an early exit in set_cpus_allowed if the current processor is in the allowed mask anyway.
-
Andrew Morton authored
__unhash_process acquires the dcache_lock while holding the tasklist_lock for writing. This can deadlock. Additionally, fs/proc/base.c incorrectly assumed that p->pid would be set to 0 during release_task. The patch fixes that by adding a new spinlock to the task structure and fixing all references to (!p->pid). The alternative to the new spinlock would be to hold dcache_lock around __unhash_process.
  - fs/proc/base.c assumed that p->pid is reset to 0 during exit. This is no longer the case. I now look at the count of the pid structure for PIDTYPE_PID.
  - de_thread now tested - as broken as it was before: open handles to /proc/<pid> are either stale or invalid after an exec of an NPTL process, if the exec was called from a secondary thread.
  - a few lock_kernels removed - that part of /proc doesn't need it.
  - additional instances of 'if(current->pid)' replaced with pid_alive.
-
Andrew Morton authored
From: William Lee Irwin III <wli@holomorphy.com> mpc_apicid is a u8, and MAX_APICS can be 256.
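A short Python check shows why a u8 cannot represent a limit of 256. It uses `ctypes.c_uint8` as a stand-in for the kernel's u8; the helper names are hypothetical, and how the kernel code actually used mpc_apicid against MAX_APICS is not stated in the message.

```python
import ctypes

MAX_APICS = 256   # the message says MAX_APICS *can* be 256


def as_u8(value):
    """Truncate an integer the way storing it in an 8-bit unsigned does."""
    return ctypes.c_uint8(value).value


def u8_can_reach(limit):
    """Can any value stored in a u8 be greater than or equal to `limit`?"""
    return any(as_u8(v) >= limit for v in range(limit + 1))


if __name__ == "__main__":
    print("255 stored in a u8:", as_u8(255))
    print("256 stored in a u8:", as_u8(256))
    print("u8 can reach MAX_APICS:", u8_can_reach(MAX_APICS))
```

Any comparison of a u8 field against 256 therefore has a fixed outcome, and a u8 loop counter bounded by 256 would never terminate.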
-
Andrew Morton authored
fs/compat.c: In function `compat_sys_ioctl':
fs/compat.c:324: warning: implicit declaration of function `siocdevprivate_ioctl'
-
Andrew Morton authored
Don't assume the size of dev_t: on ppc64 it is unsigned long and this generates a printk warning.
-
Andrew Morton authored
arch/ppc64/kernel/htab.c:105: warning: implicit declaration of function `pSeries_lpar_hpte_insert'
arch/ppc64/kernel/htab.c:109: warning: implicit declaration of function `pSeries_hpte_insert'
-
Andrew Morton authored
Fix a printk warning
-
Andrew Morton authored
two printk warnings
-
Andrew Morton authored
warning: assignment makes pointer from integer without a cast
-
Andrew Morton authored
It needs sched.h for `current'.
-
Andrew Morton authored
From: David Gibson <david@gibson.dropbear.id.au> This removes a bunch of unused variables in prom_init(), squashing the associated warnings.
-
Andrew Morton authored
From: David Gibson <david@gibson.dropbear.id.au> xics.c uses ppc64_boot_msg() without a prototype. This fixes it by including <asm/machdep.h>.
-
Andrew Morton authored
do_signal32() is used before it is defined; this prototype squashes the warning.
-
Andrew Morton authored
From: David Gibson <david@gibson.dropbear.id.au> Squash implicit declaration warning in ppc64 align.c
-
Andrew Morton authored
From: David Gibson <david@gibson.dropbear.id.au> addnote in arch/ppc64/boot (a userspace tool, not kernel code) uses exit() without including stdlib.h.
-
Andrew Morton authored
PPC64 irq return fix
-
Andrew Morton authored
Fix some warnings in the ppc64 build. Also declare a couple of AIO functions in aio.h rather than aio.c. They are needed for 32-bit emulation support.
-
Andrew Morton authored
From: Anton Blanchard <anton@samba.org> PPC64 32/64-bit emulation for AIO.
-
Linus Torvalds authored
Very early initialization (core_initcall) needs to have the cdev initialization done. So make it part of the pre-initcall sequence, the same way the bdev caches were done.
-
Linus Torvalds authored
-
Ingo Molnar authored
This addresses a futex-related SMP scalability problem of glibc. A number of regressions have been reported to the NPTL mailing list when going to many CPUs, for applications that use condition variables and the pthread_cond_broadcast() API call. Using this functionality, testcode shows a slowdown from 0.12 seconds runtime to over 237 seconds (!) runtime, on 4-CPU systems.

pthread condition variables use two futex-backed mutex-alike locks: an internal one for the glibc CV state itself, and a user-supplied mutex which the API guarantees to take in certain codepaths. (Unfortunately the user-supplied mutex cannot be used to protect the CV state, so we've got to deal with two locks.)

The cause of the slowdown is a 'swarm effect': if lots of threads are blocked on a condition variable, and pthread_cond_broadcast() is done, then glibc first does a FUTEX_WAKE on the cv-internal mutex, then does a mutex_down() on the user-supplied mutex. I.e. a swarm of threads is created which all race to serialize on the user-supplied mutex. The more threads are used, the more likely it becomes that the scheduler will balance them over to other CPUs, where they just schedule, try to lock the mutex, and go to sleep. This 'swarm effect' is purely technical, a side-effect of glibc's use of futexes, and the imperfect coupling of the two locks.

The solution to this problem is to not wake up the swarm of threads, but 'requeue' them from the CV-internal mutex to the user-supplied mutex. The attached patch adds the FUTEX_REQUEUE feature. FUTEX_REQUEUE requeues N threads from futex address A to futex address B. This way glibc can wake up a single thread (which will take the user-mutex), and can requeue the rest, with a single system-call.

Ulrich Drepper has implemented FUTEX_REQUEUE support in glibc, and a number of people have tested it over the past couple of weeks.
Here are the measurements done by Saurabh Desai:

System: 4xPIII 700MHz

 ./cond-perf -r 100 -n 200:          1p        2p        4p
 Default NPTL:                     0.120s    0.211s  237.407s
 requeue NPTL:                     0.124s    0.156s    0.040s

 ./cond-perf -r 1000 -n 100:
 Default NPTL:                     0.276s    0.412s    0.530s
 requeue NPTL:                     0.349s    0.503s    0.550s

 ./pp -v -n 128 -i 1000 -S 32768:
 Default NPTL: 128 games in        1.111s    1.270s   16.894s
 requeue NPTL: 128 games in        1.111s    1.959s    2.426s

 ./pp -v -n 1024 -i 10 -S 32768:
 Default NPTL: 1024 games in       0.181s    0.394s  incompleted 2m+
 requeue NPTL: 1024 games in       0.166s    0.254s    0.341s

The speedup with increasing number of threads is quite significant; in the 128 threads case, it's more than 8 times. In the cond-perf test, on 4 CPUs it's almost infinitely faster than the 'swarm of threads' catastrophe triggered by the old code.
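The requeue idea from this commit can be modelled in a few lines of Python. This is a userspace toy, not the kernel implementation: `FutexModel`, the string addresses and the `nr_wake`/`nr_requeue` parameter names are assumptions made for the sketch, and real waiters are sleeping threads rather than strings in a queue.

```python
from collections import defaultdict, deque


class FutexModel:
    """Toy wait queues keyed by futex address."""

    def __init__(self):
        self.waiters = defaultdict(deque)

    def wait(self, addr, thread):
        self.waiters[addr].append(thread)

    def wake(self, addr, nr_wake):
        """Wake up to nr_wake waiters; they become runnable and contend."""
        woken = []
        while self.waiters[addr] and len(woken) < nr_wake:
            woken.append(self.waiters[addr].popleft())
        return woken

    def requeue(self, src, dst, nr_wake, nr_requeue):
        """Wake a few waiters on src and move the rest to dst, unwoken."""
        woken = self.wake(src, nr_wake)
        moved = 0
        while self.waiters[src] and moved < nr_requeue:
            self.waiters[dst].append(self.waiters[src].popleft())
            moved += 1
        return woken, moved


if __name__ == "__main__":
    futex = FutexModel()
    for i in range(5):
        futex.wait("cv", f"t{i}")
    woken, moved = futex.requeue("cv", "mutex", nr_wake=1, nr_requeue=100)
    print("woken:", woken, "requeued:", moved)
```

Compared with waking every waiter on the condition variable, only one thread becomes runnable; the others stay asleep on the mutex queue until the mutex is released to them.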
-
Alexander Viro authored
New fields in struct inode: i_cdev and i_cindex. When we do open() on a character device we cache the result of the cdev lookup in the inode and put the inode on a cyclic list anchored in the cdev. If we already have that done, we don't bother with any lookups. When the inode disappears it's removed from the list. When the cdev gets unregistered we remove all cached references to it (and remove such inodes from the list). The cdev is held until the final fput() now.
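The caching behaviour can be sketched as a small Python model. All class and method names here (`Cdev`, `Inode`, `CharDevTable`, `forget`) are hypothetical, a plain list stands in for the cyclic list anchored in the cdev, and reference counting until the final fput() is left out.

```python
class Cdev:
    def __init__(self, name):
        self.name = name
        self.inodes = []            # inodes currently caching this cdev


class Inode:
    def __init__(self, dev):
        self.dev = dev              # device number, a plain int in this model
        self.i_cdev = None          # cached lookup result


class CharDevTable:
    def __init__(self):
        self._table = {}
        self.lookups = 0            # counts slow-path lookups

    def register(self, dev, cdev):
        self._table[dev] = cdev

    def open(self, inode):
        if inode.i_cdev is None:    # slow path: first open of this inode
            self.lookups += 1
            cdev = self._table.get(inode.dev)
            if cdev is None:
                raise LookupError("no such character device")
            inode.i_cdev = cdev
            cdev.inodes.append(inode)
        return inode.i_cdev         # fast path: cached result

    def forget(self, inode):
        """The inode disappears: take it off its cdev's list."""
        if inode.i_cdev is not None:
            inode.i_cdev.inodes.remove(inode)
            inode.i_cdev = None

    def unregister(self, dev):
        """The cdev goes away: drop every cached reference to it."""
        cdev = self._table.pop(dev)
        for inode in cdev.inodes:
            inode.i_cdev = None
        cdev.inodes.clear()


if __name__ == "__main__":
    table = CharDevTable()
    table.register(4, Cdev("tty"))
    ino = Inode(4)
    table.open(ino)
    table.open(ino)
    print("lookups after two opens:", table.lookups)
```

The per-cdev list exists so that unregistering can find and clear every inode that still points at the cdev.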
-
Alexander Viro authored
New object: struct cdev. It contains a kobject, a pointer to file_operations and a pointer to the owner module. These guys have a search structure of the same sort as gendisks, and chrdev_open() picks file_operations from them. Intended use: embed such an animal in a driver-owned structure (e.g. tty_driver) and register it as associated with a given range of device numbers. Generic code will do the lookup for such an object and use it for the rest. The behaviour of register_chrdev() is _not_ changed - it allocates a struct cdev and registers it; any old driver will work as if nothing had changed. At this stage we only use it during chrdev_open() to find file_operations. Later it will be cached in inode->i_cdev (and the index in the range in inode->i_cindex) so that ->open() can get whatever objects it wants directly without any special-cased lookups, etc.
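Registering an object for a range of device numbers and looking it up again can be sketched as follows. This is a hypothetical Python model: `CdevMap`, its linear search and the use of plain integers as device numbers are assumptions, and the real search structure is only described in the message as being of the same sort as the one used for gendisks.

```python
class CdevMap:
    """Toy range map: (first device number, count) -> registered object."""

    def __init__(self):
        self._ranges = []   # list of (first, count, cdev)

    def register(self, first, count, cdev):
        # Refuse ranges that overlap an existing registration.
        for f, c, _ in self._ranges:
            if first < f + c and f < first + count:
                raise ValueError("device number range already in use")
        self._ranges.append((first, count, cdev))

    def lookup(self, dev):
        """Return (cdev, index within its range), or None if unregistered."""
        for first, count, cdev in self._ranges:
            if first <= dev < first + count:
                return cdev, dev - first
        return None


if __name__ == "__main__":
    cdevs = CdevMap()
    cdevs.register(first=1024, count=64, cdev="tty_driver")
    print(cdevs.lookup(1030))
    print(cdevs.lookup(2000))
```

The index returned next to the object corresponds to what the message plans to cache in inode->i_cindex, so a driver's ->open() can tell which device in its range was opened.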
-