- 31 Aug, 2004 25 commits
-
-
Nick Piggin authored
Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Nick Piggin authored
Use hlists for the PID hashes. This halves the memory footprint of these hashes. No benchmarks, but I think this is a worthy improvement because the hashes are something that would be likely to have significant portions loaded into the cache of every CPU on some workloads. This comes at the "expense" of 1. reintroducing the memory prefetch into the hash traversal loop; 2. adding new pids to the head of the list instead of the tail. I suspect that if this was a big problem then the hash isn't sized well or could benefit from moving hot entries to the head. Also, account for all the pid hashes when reporting hash memory usage. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Nick Piggin authored
A 4GB, 4-way Opteron would create the smallest size table (16 entries) because pidhash_init is called before mem_init which is where x86-64 sets up max_pfn. nr_kernel_pages is setup by paging_init, called from setup_arch, which is also where i386 sets up max_pfn. So export nr_kernel_pages, nr_all_pages. Use nr_kernel_pages when sizing the PID hash. This fixes the problem. This also makes the pid hash dependant on the size of ZONE_NORMAL instead of total size of memory. Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Roland McGrath authored
This patch changes the rusage bookkeeping and the semantics of the getrusage and times calls in a couple of ways. The first change is in the c* fields counting dead child processes. POSIX requires that children that have died be counted in these fields when they are reaped by a wait* call, and that if they are never reaped (e.g. because of ignoring SIGCHLD, or exitting yourself first) then they are never counted. These were counted in release_task for all threads. I've changed it so they are counted in wait_task_zombie, i.e. exactly when being reaped. POSIX also specifies for RUSAGE_CHILDREN that the report include the reaped child processes of the calling process, i.e. whole thread group in Linux, not just ones forked by the calling thread. POSIX specifies tms_c[us]time fields in the times call the same way. I've moved the c* fields that contain this information into signal_struct, where the single set of counters accumulates data from any thread in the group that calls wait*. Finally, POSIX specifies getrusage and times as returning cumulative totals for the whole process (aka thread group), not just the calling thread. I've added fields in signal_struct to accumulate the stats of detached threads as they die. The process stats are the sums of these records plus the stats of remaining each live/zombie thread. The times and getrusage calls, and the internal uses for filling in wait4 results and siginfo_t, now iterate over the threads in the thread group and sum up their stats along with the stats recorded for threads already dead and gone. I added a new value RUSAGE_GROUP (-3) for the getrusage system call rather than changing the behavior of the old RUSAGE_SELF (0). POSIX specifies RUSAGE_SELF to mean all threads, so the glibc getrusage call will just translate it to RUSAGE_GROUP for new kernels. I did this thinking that someone somewhere might want the old behavior with an old glibc and a new kernel (it is only different if they are using CLONE_THREAD anyway). However, I've changed the times system call to conform to POSIX as well and did not provide any backward compatibility there. In that case there is nothing easy like a parameter value to use, it would have to be a new system call number. That seems pretty pointless. Given that, I wonder if it is worth bothering to preserve the compatible RUSAGE_SELF behavior by introducing RUSAGE_GROUP instead of just changing RUSAGE_SELF's meaning. Comments? I've done some basic testing on x86 and x86-64, and all the numbers come out right after these fixes. (I have a test program that shows a few Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Roland McGrath authored
This patch adds a new system call `waitid'. This is a new POSIX call that subsumes the rest of the wait* family and can do some things the older calls cannot. A minor addition is the ability to select what kinds of status to check for with a mask of independent bits, so you can wait for just stops and not terminations, for example. A more significant improvement is the WNOWAIT flag, which allows for polling child status without reaping. This interface fills in a siginfo_t with the same details that a SIGCHLD for the status change has; some of that info (e.g. si_uid) is not available via wait4 or other calls. I've added a new system call that has the parameter conventions of the POSIX function because that seems like the cleanest thing. This patch includes the actual system call table additions for i386 and x86-64; other architectures will need to assign the system call number, and 64-bit ones may need to implement 32-bit compat support for it as I did for x86-64. The new features could instead be provided by some new kludge inventions in the wait4 system call interface (that's what BSD did). If kludges are preferable to adding a system call, I can work up something different. I added a struct rusage field si_rusage to siginfo_t in the SIGCHLD case (this does not affect the size or layout of the struct). This is not part of the POSIX interface, but it makes it so that `waitid' subsumes all the functionality of `wait4'. Future kernel ABIs (new arch's or whatnot) can have only the `waitid' system call and the rest of the wait* family including wait3 and wait4 can be implemented in user space using waitid. There is nothing in user space as yet that would make use of the new field. Most of the new functionality is implemented purely in the waitid system call itself. POSIX also provides for the WCONTINUED flag to report when a child process had been stopped by job control and then resumed with SIGCONT. Corresponding to this, a SIGCHLD is now generated when a child resumes (unless SA_NOCLDSTOP is set), with the value CLD_CONTINUED in siginfo_t.si_code. To implement this, some additional bookkeeping is required in the signal code handling job control stops. The motivation for this work is to make it possible to implement the POSIX semantics of the `waitid' function in glibc completely and correctly. If changing either the system call interface used to accomplish that, or any details of the kernel implementation work, would improve the chances of getting this incorporated, I am more than happy to work through any issues. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Anton Blanchard authored
I tested how long it took to do a dd from /dev/random on ppc64 before and after this patch, while doing a ping flood from another machine. before: # /usr/bin/time dd if=/dev/random of=/dev/zero count=1k 0+51 records in Command terminated by signal 2 0.00user 0.00system 19:18.46elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k I gave up after 19 minutes. after: # /usr/bin/time dd if=/dev/random of=/dev/zero count=1k 0+1024 records in 0.00user 0.00system 0:33.38elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k Just over 33 seconds. Better. From: Arnd Bergmann <arnd@arndb.de> I noticed that only i386 and x86-64 are currently using a high resolution timer source when adding randomness. Since many architectures have a working get_cycles() implementation, it seems rather straightforward to use that. Has this been discussed before, or can anyone comment on the implementation below? This patch attempts to take into account the size of cycles_t, which is either 32 or 64 bits wide but independent of the architecture's word size. The behavior should be nearly identical to the old one on i386, x86-64 and all architectures without a time stamp counter, while finding more entropy on the other architectures. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Andi Kleen authored
Apply this handy patch and boot with numa=fake=4 (or how many nodes you want, 8 max right now). There is a minor issue with the hash function, which can make the last node be bigger than the others. Is probably fixable if it should be a problem. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Matt Mackall authored
A patch to replace tmpfs/shmem with ramfs for systems without swap, incorporating the suggestions from Andi and Hugh. It uses ramfs instead. Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Alan Cox authored
This adds VLAN support to the 3c59x/90x series hardware. Stefan de Konink ported this code from the 2.4 VLAN patches and tested it extensively. I cleaned up the ifdefs and fixed a problem with bracketing that made older cards fail. -- Developer's Certificate of Origin 1.0 By making a contribution to this project, I certify that: (a) The contribution was created in whole or in part by me and I have the right to submit it under the open source license indicated in the file; or (b) The contribution is based upon previous work that, to the best of my knowledge, is covered under an appropriate open source license and I have the right under that license to submit that work with modifications, whether created in whole or in part by me, under the same open source license (unless I am permitted to submit under a different license), as indicated in the file; or (c) The contribution was provided directly to me by some other person who certified (a), (b) or (c) and I have not modified it. I, Stefan de Konink, certify that: The contribution is based upon previous work that, again is based on GPL code and I have the right under that license to submit that work with modifications, whether created in whole or in part by me, under the same open source license. I, Alan Cox, certify likewise. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Andrew Morton authored
Fix scheduling latency issues with large truncates. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Oleg Nesterov authored
Hugetlbfs silently coerce private mappings of hugetlb files into shared ones. So private writable mapping has MAP_SHARED semantics. I think, such mappings should be disallowed. First, such behavior allows open hugetlbfs file O_RDONLY, and overwrite it via mmap(PROT_READ|PROT_WRITE, MAP_PRIVATE), so it is security bug. Second, private writable mmap() should fail just because kernel does not support this. I belisve, it is ok to allow private readonly hugetlb mappings, sys_mprotect() does not work with hugetlb vmas. There is another problem. Hugetlb mapping is always prefaulted, pages allocated at mmap() time. So even readonly mapping allows to enlarge the size of the hugetlbfs file, and steal huge pages without appropriative permissions. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Oleg Nesterov authored
Hugetlbfs mmap with MAP_PRIVATE becomes MAP_SHARED silently, but vma->vm_flags have no VM_SHARED bit. Reading from /dev/zero into hugetlb area will do: read_zero() read_zero_pagealigned() if (vma->vm_flags & VM_SHARED) break; // fallback to clear_user() zap_page_range(); zeromap_page_range(); It will hit BUG_ON() in unmap_hugepage_range() if region is not huge page aligned, or silently convert it into the private anonymous mapping. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Dave Hansen authored
This is a helper function that a few architectures already have. This just copies the i386 implementation to ppc64. Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Paul Mackerras authored
With earlier setup of cpu_possible_map the number of irqstacks shrinks from NR_CPUS to the number of possible cpus. Signed-off-by: Nathan Lynch <nathanl@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Paul Mackerras authored
Move the initialization of the per-cpu paca->hw_cpu_id out of the Open Firmware client boot code and into a common location which is executed later. Signed-off-by: Nathan Lynch <nathanl@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Paul Mackerras authored
Move all cpu map initializations to one place (except for the online map -- cpus mark themselves online as they come up). This sets up cpu_possible_map early enough that we can use num_possible_cpus for allocating irqstacks instead of NR_CPUS. Hopefully this should also help set the stage for kexec. Signed-off-by: Nathan Lynch <nathanl@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Paul Mackerras authored
David Engebretsen has moved on to other things and is no longer maintaining ppc64. This patch adds an entry in CREDITS to note his contribution in leading the team that did the PPC64 port originally and updates various PPC-related MAINTAINERS entries. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Takashi Iwai authored
Currently add_interrupt_randomness() is called at each interrupt when one of the handlers has SA_SAMPLE_RANDOM flag, regardless whether the interrupt is processed by that handler or not. This results in the higher latency and perfomance loss. The patch fixes this behavior to avoid the unnecessary call by checking the return value from each handler. Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Prasanna S. Panchamukhi authored
A special kprobe type which can be placed on function entry points, and employs a simple mirroring principle to allow seamless access to the arguments of a function being probed. The probe handler routine should have the same prototype as the function being probed. Currently implemented for x86. The way it works is that when the probe is hit, the breakpoint handler simply irets to the probe handler's eip while retaining register and stack state corresponding to the function entry. After it is done, the probe handler calls jprobe_return() which traps again to restore processor state and switch back to the probed function. Linus noted correctly at KS that we need to be careful as gcc assumes that the callee owns arguments. We save and restore enough stack bytes to cover argument space. Sample Usage: static int jip_queue_xmit(struct sk_buff *skb, int ipfragok) { ... whatever ... jprobe_return(); return 0; } struct jprobe jp = { {.addr = (kprobe_opcode_t *) ip_queue_xmit}, .entry = (kprobe_opcode_t *) jip_queue_xmit }; register_jprobe(&jp); Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Prasanna S. Panchamukhi authored
This patch helps developers to trap at almost any kernel code address, specifying a handler routine to be invoked when the breakpoint is hit. Useful for analysing the Linux kernel by collecting debugging information non-disruptively. Employs single-stepping out-of-line to avoid probe misses on SMP and may be especially useful in aiding debugging elusive races and problems on live systems. More elaborate dynamic tracing tools such as DProbes can be built over the kprobes interface. Helps developers to trap at almost any kernel code address, specifying a handler routine to be invoked when the breakpoint is hit. Useful for analysing the Linux kernel by collecting debugging information non-disruptively. Employs single-stepping out-of-line to avoid probe misses on SMP and may be especially useful in aiding debugging elusive races and problems on live systems. More elaborate dynamic tracing tools such as DProbes can be built over the kprobes interface. Sample usage: To place a probe on __blockdev_direct_IO: static int probe_handler(struct kprobe *p, struct pt_regs *) { ... whatever ... } struct kprobe kp = { .addr = __blockdev_direct_IO, .pre_handler = probe_handler }; register_kprobe(&kp); Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Prasanna S. Panchamukhi authored
This patch provides notifiers for i386 architecture exceptions. This patch has been ported from x86_64 architecture as suggested by Andi Kleen. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Philippe Elie authored
Contributions from Zarakin <zarakin@hotpop.com> Intel removed two msrs: MSR_P4_IQ_ESCR_0|1 (0x3ba/0x3bb), P4 model >= 3. See Intel documentation Vol. 3 System Programming Guide Appendix B. nmi_watchdog=2 oopsed at boot time and oprofile at driver load. Avoid touching them when model >= 3. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Jason Davis authored
This update only applies to Unisys' ES7000 server machines. The patch adds a OEM id check to verify the current machine running is actually a Unisys type box before executing the Unisys OEM parser routine. It also increases the MAX_MP_BUSSES definition from 32 to 256. On the ES7000s, bus ID numbering can range from 0 to 255. Without the patch, the system panics if booted with acpi=off. This patch has been tested and verified on an authentic ES7000 machine. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Bjorn Helgaas authored
Make assign_irq_vector() non-__init always (it's called from io_apic_set_pci_routing(), which is used in the pci_enable_device() path). Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Anton Blanchard authored
I had a problem when compiling a 2.6 kernel with gcc 3.5 CVS. The prototype for prio_tree_remove in mm/prio_tree.c is inside another function. gcc 3.5 gets upset and removes the function completely. Apparently this isnt valid C, so lets fix it up. Details can be found here: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17205Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
- 30 Aug, 2004 3 commits
-
-
Dave Jones authored
Nehemiah wasn't having the CX8 bit enabled before. In fixing it up, I rewrote the code to be a little clearer. Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Richard Henderson authored
From: Ivan Kokshaysky <ink@jurassic.park.msu.ru> - big functions moved out of line; - handle dev == NULL case, which apparently means the isa device. As before, this is needed for Jensen. This also makes the isa sound drivers working again with recent kernels.
-
bk://bk.arm.linux.org.uk/linux-2.6-rmkLinus Torvalds authored
into ppc970.osdl.org:/home/torvalds/v2.6/linux
-
- 31 Aug, 2004 3 commits
-
-
Ben Dooks authored
Patch from Ben Dooks Register definitions, and support for modifying the Drive Strength Control registers on the S3C2440 Signed-off-by: Ben Dooks
-
Ben Dooks authored
Patch from Ben Dooks Ensure we check for IRQ pending for the gettimeoffset() function
-
Ben Dooks authored
Patch from Ben Dooks Add the definition of timer 0's dead-zone capability
-
- 30 Aug, 2004 9 commits
-
-
David S. Miller authored
into kernel.bkbits.net:/home/davem/tg3-2.6
-
David S. Miller authored
-
David S. Miller authored
Need to clear one bit at a time, so if we are clearing both 625_CORE_CLOCK and ALTCLOCK we first clear the latter then the final write will clear the former. Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
Otherwise we do not handle properly the case where the switch/hub does not support auto- negotiation. This is what was breaking 5704 hw fiber autoneg. Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Signed-off-by: David S. Miller <davem@redhat.com>
-
bk://gkernel.bkbits.net/libata-2.6Linus Torvalds authored
into ppc970.osdl.org:/home/torvalds/v2.6/linux
-
Bartlomiej Zolnierkiewicz authored
[patch] libata: ata_piix.c PIO fix "[libata] transfer mode cleanup" introduced bug in ata_piix.c: previously PIO number (not mode) was passed to piix_set_piomode(). Fortunately this function is only used for (disabled) PATA support. I bet that this is the reason why MWDMA didn't work for PATA. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@elka.pw.edu.pl>
-
Andrew Morton authored
The ioctl32 conversion registration stubs are in ioctl32.h now. Signed-off-by: Andrew Morton <akpm@osdl.org>
-