- 08 Feb, 2008 40 commits
-
-
Harvey Harrison authored
[akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Harvey Harrison authored
[akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Harvey Harrison authored
[akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Nick Piggin authored
Support direct_access XIP method with brd. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Nick Piggin authored
This is a rewrite of the ramdisk block device driver. The old one is really difficult because it effectively implements a block device which serves data out of its own buffer cache. It relies on the dirty bit being set, to pin its backing store in cache, however there are non trivial paths which can clear the dirty bit (eg. try_to_free_buffers()), which had recently lead to data corruption. And in general it is completely wrong for a block device driver to do this. The new one is more like a regular block device driver. It has no idea about vm/vfs stuff. It's backing store is similar to the buffer cache (a simple radix-tree of pages), but it doesn't know anything about page cache (the pages in the radix tree are not pagecache pages). There is one slight downside -- direct block device access and filesystem metadata access goes through an extra copy and gets stored in RAM twice. However, this downside is only slight, because the real buffercache of the device is now reclaimable (because we're not playing crazy games with it), so under memory intensive situations, footprint should effectively be the same -- maybe even a slight advantage to the new driver because it can also reclaim buffer heads. The fact that it now goes through all the regular vm/fs paths makes it much more useful for testing, too. text data bss dec hex filename 2837 849 384 4070 fe6 drivers/block/rd.o 3528 371 12 3911 f47 drivers/block/brd.o Text is larger, but data and bss are smaller, making total size smaller. A few other nice things about it: - Similar structure and layout to the new loop device handlinag. - Dynamic ramdisk creation. - Runtime flexible buffer head size (because it is no longer part of the ramdisk code). - Boot / load time flexible ramdisk size, which could easily be extended to a per-ramdisk runtime changeable size (eg. with an ioctl). - Can use highmem for the backing store. [akpm@linux-foundation.org: fix build] [byron.bbradley@gmail.com: make rd_size non-static] Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Byron Bradley <byron.bbradley@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
David Howells authored
Add platform MTD support for the ASB2303 board. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
David Howells authored
Add architecture support for the MN10300/AM33 CPUs produced by MEI to the kernel. This patch also adds board support for the ASB2303 with the ASB2308 daughter board, and the ASB2305. The only processor supported is the MN103E010, which is an AM33v2 core plus on-chip devices. [akpm@linux-foundation.org: nuke cvs control strings] Signed-off-by: Masakazu Urade <urade.masakazu@jp.panasonic.com> Signed-off-by: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
David Howells authored
Allocate serial port UART type IDs for the MN10300 on-chip serial ports. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Alan Cox <alan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
David Howells authored
net2280 can't have a function called show_registers() because this can produce a namespace clash with an arch function of the same name. All this driver's functions and variables should really be prefixed with "net2280_" to avoid such a problem in future. Signed-off-by: David Howells <dhowells@redhat.com> Cc: Greg KH <greg@kroah.com> Cc: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
David Howells authored
Remove now unnecessary inclusions of {asm,linux}/a.out.h. [akpm@linux-foundation.org: fix alpha build] Signed-off-by: David Howells <dhowells@redhat.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
David Howells authored
Suppress A.OUT library support if CONFIG_ARCH_SUPPORTS_AOUT is not set. Not all architectures support the A.OUT binfmt, so the ELF binfmt should not be permitted to go looking for A.OUT libraries to load in such a case. Not only that, but under such conditions A.OUT core dumps are not produced either. To make this work, this patch also does the following: (1) Makes the existence of the contents of linux/a.out.h contingent on CONFIG_ARCH_SUPPORTS_AOUT. (2) Renames dump_thread() to aout_dump_thread() as it's only called by A.OUT core dumping code. (3) Moves aout_dump_thread() into asm/a.out-core.h and makes it inline. This is then included only where needed. This means that this bit of arch code will be stored in the appropriate A.OUT binfmt module rather than the core kernel. (4) Drops A.OUT support for Blackfin (according to Mike Frysinger it's not needed) and FRV. This patch depends on the previous patch to move STACK_TOP[_MAX] out of asm/a.out.h and into asm/processor.h as they're required whether or not A.OUT format is available. [jdike@addtoit.com: uml: re-remove accidentally restored code] Signed-off-by: David Howells <dhowells@redhat.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Jeff Dike <jdike@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
David Howells authored
Mark arches that support A.OUT format by including the following in their master Kconfig files: config ARCH_SUPPORTS_AOUT def_bool y This should also be set if the arch provides compatibility A.OUT support for an older arch, for instance x86_64 for i386 or sparc64 for sparc. I've guessed at which arches don't, based on comments in the code, however I'm sure that some of the ones I've marked as 'yes' actually should be 'no'. Signed-off-by: David Howells <dhowells@redhat.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
David Howells authored
Move STACK_TOP[_MAX] out of asm/a.out.h and into asm/processor.h as they're required whether or not A.OUT format is available. Signed-off-by: David Howells <dhowells@redhat.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Li Zefan authored
Fix typo in comments. BTW: I have to fix coding style in arch/ia64/kernel/time.c also, otherwise checkpatch.pl will be complaining. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Li Zefan authored
Function timekeeping_is_continuous() no longer checks flag CLOCK_IS_CONTINUOUS, and it checks CLOCK_SOURCE_VALID_FOR_HRES now. So rename the function accordingly. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Li Zefan authored
list_for_each_safe() suffices here. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Li Zefan authored
Flag CLOCK_SOURCE_WATCHDOG is cleared twice. Note clocksource_change_rating() won't do anyting with the cs flag. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Pavel Emelyanov authored
There's only one caller left - the kill_pgrp one - so merge these two functions and forget the kill_pgrp_info one. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Reviewed-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Pavel Emelyanov authored
This is the first step (of two) in removing the kill_pgrp_info. All the users of this function are in kernel/signal.c, but all they need is to call __kill_pgrp_info() with the tasklist_lock read-locked. Fortunately, one of its users is the kill_something_info(), which already needs this lock in one of its branches, so clean these branches up and call the __kill_pgrp_info() directly. Based on Oleg's view of how this function should look. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Pavel Emelyanov authored
When sending the pid namespaces patches I wrongly converted the tsk->tgid into task_pid_vnr(tsk) in mqueue-s (the git id of this patch is b488893a). The proper behavior is to get the task_tgid_vnr(tsk). This seem to be the only mistake of that kind. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Pavel Emelyanov authored
Some time ago the xxx_vnr() calls (e.g. pid_vnr or find_task_by_vpid) were _all_ converted to operate on the current pid namespace. After this each call like xxx_nr_ns(foo, current->nsproxy->pid_ns) is nothing but a xxx_vnr(foo) one. Switch all the xxx_nr_ns() callers to use the xxx_vnr() calls where appropriate. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Reviewed-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Oleg Nesterov authored
signal_struct->tsk points to the ->group_leader and thus we have the nasty code in de_thread() which has to change it and restart ->real_timer if the leader is changed. Use "struct pid *leader_pid" instead. This also allows us to kill now unneeded send_group_sig_info(). Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Davide Libenzi <davidel@xmailserver.org> Cc: Pavel Emelyanov <xemul@openvz.org> Acked-by: Roland McGrath <roland@redhat.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Oleg Nesterov authored
There is a window when de_thread() switches the leader and drops tasklist_lock. In that window do_each_pid_task(PIDTYPE_PID) finds both new and old leaders. The problem is pretty much theoretical and probably can be ignored. Currently the only users of do_each_pid_task(PIDTYPE_PID) are send_sigio/send_sigurg, so they can send the signal to the same process twice. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Davide Libenzi <davidel@xmailserver.org> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Oleg Nesterov authored
kill_pid_info()->pid_task() could be the old leader of the execing process. In that case it is possible that the leader will be released before we take siglock. This means that kill_pid_info() (and thus sys_kill()) can return a false -ESRCH. Change the code to retry when lock_task_sighand() fails. The endless loop is not possible, __exit_signal() both clears ->sighand and does detach_pid(). Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Davide Libenzi <davidel@xmailserver.org> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Eric W. Biederman authored
Use task_pgrp_vnr not task_pgrp_nr so we return the process id the processes pid namespace and not in the initial pid namespace. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Eric W. Biederman authored
[m.kozlowski@tuxland.pl: fix unbalanced parenthesis in irix_BSDsetpgrp()] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Oleg Nesterov authored
With the new semantics of find_vpid() we don't need to play with ->nsproxy explicitely, _vxx() do the right things. Also s/tasklist/rcu/. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Eric W. Biederman authored
pid_vnr returns the user space pid with respect to the pid namespace the struct pid was allocated in. What we want before we return a pid to user space is the user space pid with respect to the pid namespace of current. pid_vnr is a very nice optimization but because it isn't quite what we want it is easy to use pid_vnr at times when we aren't certain the struct pid was allocated in our pid namespace. Currently this describes at least tiocgpgrp and tiocgsid in ttyio.c the parent process reported in the core dumps and the parent process in get_signal_to_deliver. So unless the performance impact is huge having an interface that does what we want instead of always what we want should be much more reliable and much less error prone. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Eric W. Biederman authored
This modifies do_wait and eligible child to take a pair of enum pid_type and struct pid *pid to precisely specify what set of processes are eligible to be waited for, instead of the raw pid_t value from sys_wait4. This fixes a bug in sys_waitid where you could not wait for children in just process group 1. This fixes a pid namespace crossing case in eligible_child. Allowing us to wait for a processes in our current process group even if our current process group == 0. This allows the no child with this pid case to be optimized. This allows us to optimize the pid membership test in eligible child to be optimized. This even closes a theoretical pid wraparound race where in a threaded parent if two threads are waiting for the same child and one thread picks up the child and the pid numbers wrap around and generate another child with that same pid before the other thread is scheduled (teribly insanely unlikely) we could end up waiting on the second child with the same pid# and not discover that the specific child we were waiting for has exited. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Oleg Nesterov authored
The previous bugfix was not optimal, we shouldn't care about group stop when we are the only thread or the group stop is in progress. In that case nothing special is needed, just set PF_EXITING and return. Also, take the related "TIF_SIGPENDING re-targeting" code from exit_notify(). So, from the performance POV the only difference is that we don't trust !signal_pending() until we take ->siglock. But this in fact fixes another ___pure___ theoretical minor race. __group_complete_signal() finds the task without PF_EXITING and chooses it as the target for signal_wake_up(). But nothing prevents this task from exiting in between without noticing the pending signal and thus unpredictably delaying the actual delivery. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Davide Libenzi <davidel@xmailserver.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Oleg Nesterov authored
Eric's "fix clone(CLONE_NEWPID)" eliminated the last reason for this hack. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Oleg Nesterov authored
do_signal_stop() counts all sub-thread and sets ->group_stop_count accordingly. Every thread should decrement ->group_stop_count and stop, the last one should notify the parent. However a sub-thread can exit before it notices the signal_pending(), or it may be somewhere in do_exit() already. In that case the group stop never finishes properly. Note: this is a minimal fix, we can add some optimizations later. Say we can return quickly if thread_group_empty(). Also, we can move some signal related code from exit_notify() to exit_signals(). Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Oleg Nesterov authored
As Eric pointed out, there is no problem with init starting with sid == pgid == 0, and this was historical linux behavior changed in 2.6.18. Remove kernel_init()->__set_special_pids(), this is unneeded and complicates the rules for sys_setsid(). This change and the previous change in daemonize() mean that /sbin/init does not need the special "session != 1" hack in sys_setsid() any longer. We can't remove this check yet, we should cleanup copy_process(CLONE_NEWPID) first, so update the comment only. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Oleg Nesterov authored
Daemonized kernel threads run in the init's session. This doesn't match the behaviour of kthread_create()'ed threads, and this is one of the 2 reasons why we need a special hack in sys_setsid(). Now that set_special_pids() was changed to use struct pid, not pid_t, we can use init_struct_pid and set 0,0 special pids. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Oleg Nesterov authored
Change set_special_pids() to work with struct pid, not pid_t from global name space. This again speedups and imho cleanups the code, also a preparation for the next patch. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Oleg Nesterov authored
sys_setsid() still deals with pid_t's from the global namespace. This means that the "session > 1" check can't help for sub-namespace init, setsid() can't succeed because copy_process(CLONE_NEWPID) populates PIDTYPE_PGID/SID links. Remove the usage of task_struct->pid and convert the code to use "struct pid". This also simplifies and speedups the code, saves one find_pid(). Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Oleg Nesterov authored
sys_setpgid() does unneeded conversions from pid_t to "struct pid" and vice versa. Use "struct pid" more consistently. Saves one find_vpid() and eliminates the explicit usage of ->nsproxy->pid_ns. Imho, cleanups the code. Also use the same_thread_group() helper. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Oleg Nesterov authored
The first "p->exit_state != EXIT_ZOMBIE" check doesn't make too much sense. The exit_state was EXIT_ZOMBIE when the function was called, and another thread can change it to EXIT_DEAD right after the check. The second condition is not possible, detached non-traced threads were already filtered out by eligible_child(), we didn't drop tasklist since then. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Oleg Nesterov authored
Surprise, the other two wait_task_*() functions also abuse the task_pid_nr_ns() function, and may cause read-after-free or report nr == 0 in wait_task_continued(). wait_task_zombie() doesn't have this problem, but it is still better to cache pid_t rather than call task_pid_nr_ns() three times on the saved pid_namespace. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Oleg Nesterov authored
Imho, the current usage of security_task_wait() is not logical. Suppose we have the single child p, and security_task_wait(p) return -EANY. In that case waitpid(-1) returns this error. Why? Isn't it better to return ECHLD? We don't really have reapable children. Now suppose that child was stolen by gdb. In that case we find this child on ->ptrace_children and set flag = 1, but we don't check that the child was denied. So, do_wait(..., WNOHANG) returns 0, this doesn't match the behaviour above. Without WNOHANG do_wait() blocks only to return the error later, when the child will be untraced. Inho, really strange. I think eligible_child() should return the error only if the child's pid was requested explicitly, otherwise we should silently ignore the tasks which were nacked by security_task_wait(). Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Eric Paris <eparis@redhat.com> Cc: James Morris <jmorris@namei.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-