- 21 Dec, 2002 16 commits
-
-
Andrew Morton authored
The `low latency page reclaim' design works by preventing page allocators from blocking on request queues (and by preventing them from blocking against writeback of individual pages, but that is immaterial here). This causes a problem in some situations. pdflush (or a write(2) caller) could be saturating the queue with highmem pages. This prevents anyone from writing back ZONE_NORMAL pages. We end up doing enormous amounts of scanning.

A test case is to mmap(MAP_SHARED) almost all of a 4G machine's memory, then kill the mmapping applications. The machine instantly goes from 0% of memory dirty to 95% or more. pdflush kicks in and starts writing the least-recently-dirtied pages, which are all highmem. The queue is congested so nobody will write back ZONE_NORMAL pages. kswapd chews 50% of the CPU scanning past dirty ZONE_NORMAL pages and page reclaim efficiency (pages_reclaimed/pages_scanned) falls to 2%.

So this patch changes the policy for kswapd. kswapd may use all of a request queue, and is prepared to block on request queues.

What will now happen in the above scenario is:

1: The page allocator scans some pages, fails to reclaim enough memory and takes a nap in blk_congestion_wait().

2: kswapd() will scan the ZONE_NORMAL LRU and will start writing back pages. (These pages will be rotated to the tail of the inactive list at IO-completion interrupt time.) This writeback will saturate the queue with ZONE_NORMAL pages. Conveniently, pdflush will avoid the congested queues. So we end up writing the correct pages.

In this test, kswapd CPU utilisation falls from 50% to 2%, page reclaim efficiency rises from 2% to 40% and things are generally a lot happier.

The downside is that kswapd may now do a lot less page reclaim, increasing page allocation latency, causing more direct reclaim, increasing lock contention in the VM, etc. But I have not been able to demonstrate that in testing.

The other problem is that there is only one kswapd, and there are lots of disks. That is a generic problem - without being able to co-opt user processes we don't have enough threads to keep lots of disks saturated. One fix for this would be to add an additional "really congested" threshold in the request queues, so kswapd can still perform nonblocking writeout. This gives kswapd priority over pdflush while allowing kswapd to feed many disk queues. I doubt if this will be called for.
-
Andrew Morton authored
We keep getting in a mess with the current->flags setting and unsetting. Remove current->flags:PF_NOWARN and create __GFP_NOWARN instead.
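
To illustrate why a per-call flag is easier to get right than a per-task flag that must be set and then cleared around the call, here is a minimal userspace sketch. Everything in it (`SKETCH_GFP_NOWARN`, `SKETCH_MAX_ALLOC`, `sketch_alloc()`, the warning counter) is hypothetical and is not the kernel's allocator; it only shows the shape of passing the "don't warn" request with the allocation itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical flag bit and size limit, chosen only for this sketch. */
#define SKETCH_GFP_NOWARN 0x1u
#define SKETCH_MAX_ALLOC  4096u

/*
 * Allocate `size` bytes. Requests above the limit fail; a failure bumps
 * *warnings unless the caller passed SKETCH_GFP_NOWARN for this call.
 * There is no task-wide state to save and restore around the call.
 */
static void *sketch_alloc(size_t size, unsigned int gfp_flags, int *warnings)
{
    void *p = (size <= SKETCH_MAX_ALLOC) ? malloc(size) : NULL;

    if (p == NULL && !(gfp_flags & SKETCH_GFP_NOWARN))
        (*warnings)++;
    return p;
}
```

Because the flag travels with the request, a caller cannot forget to clear it afterwards, which is presumably the kind of mess the message refers to.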
-
Andrew Morton authored
- A C99 initialiser in drivers/char/mem.c
- Remove unneeded deref in madvise_willneed()
-
Andrew Morton authored
Add a generic_file_readonly_mmap() for !CONFIG_MMU.
-
Andrew Morton authored
slab poisons objects with 0x5a both when they are constructed and when they are freed. So it is not possible to tell whether a deref of 0x5a5a5a5a was a use-before-initialisation bug or a use-after-free bug. The patch changes it so that 1) A deref of 0x5a5a5a5a means use-of-uninitialised-memory 2) A deref of 0x6b6b6b6b means use-of-freed-memory.
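
The two-pattern scheme can be sketched in userspace as follows. Only the byte values 0x5a and 0x6b come from the message; the macro and function names are hypothetical, and this is not the slab allocator's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte values from the commit message; the names are made up for this sketch. */
#define POISON_UNINIT 0x5a  /* written when an object is constructed */
#define POISON_FREED  0x6b  /* written when an object is freed */

/* Fill a freshly constructed object with the "uninitialised" pattern. */
static void poison_on_alloc(void *obj, size_t size)
{
    memset(obj, POISON_UNINIT, size);
}

/* Fill a freed object with the "freed" pattern. */
static void poison_on_free(void *obj, size_t size)
{
    memset(obj, POISON_FREED, size);
}

enum poison_kind { NOT_POISON, USE_UNINITIALISED, USE_AFTER_FREE };

/* Classify a 32-bit value that showed up as a bad pointer or field. */
static enum poison_kind classify_word(uint32_t word)
{
    if (word == 0x5a5a5a5aU)
        return USE_UNINITIALISED;
    if (word == 0x6b6b6b6bU)
        return USE_AFTER_FREE;
    return NOT_POISON;
}
```

With distinct patterns, the faulting value alone says which class of bug to look for.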
-
Andrew Morton authored
move_vma() calls do_munmap() and then uses the memory at *new_vma. But when starting X11 it just happens that the memory which do_munmap() unmapped had the same start address as the range at *new_vma. So new_vma is freed by do_munmap(). This was never noticed before because (vm_flags & VM_LOCKED) evaluates false when vm_flags is 0x5a5a5a5a. But I just changed that to 0x6b6b6b6b and boom - we call make_pages_present() with start == end == 0x6b6b6b6b and it goes BUG. So I think the right fix here is for move_vma() to not inspect the values of any vmas after it has called do_munmap(). The patch does that, for `new_vma'. The local variable `vma' is also being used after the call to do_munmap(), and this may also be a bug. Proving that this is not so, and adding a comment to explain why, is hereby added to Hugh's todo list ;)
-
Andrew Morton authored
There's a small window in which another CPU could dirty the page after we've cleaned it, and before we've moved it to mapping->dirty_pages(). The end result is a dirty page on mapping->locked_pages, which is wrong. So take mapping->page_lock before clearing the dirty bit.
-
Andrew Morton authored
Running a `mount -o remount' against ext3 deadlocks if there is heavy write activity. It's a sort of AB/BA deadlock caused by calling log_wait_commit() under lock_super(). The caller holds lock_super() and is waiting for a commit, but the commit cannot complete because lock_super() is also used in the block allocator. The way we fixed this in the past was to drop the superblock lock inside ext3. The way this patch fixes it is to arrange for lock_super() to not be held around the ->sync_fs() call. Also: sync_filesystems is on the sys_sync() path and is racy wrt unmount. Check sb->s_root after taking sb->s_umount.
-
Linus Torvalds authored
- set up kernel stack pointer for sysenter at each context switch.
- disable sysenter while in vm86 mode.
- clean up mtrr number defines and SEP feature testing
-
Linus Torvalds authored
-
Ivan Kokshaysky authored
Don't disable PCI devices before changing the BARs, as discussed recently. Disabling the PCI_COMMAND_MASTER bit is an obvious bug. Further, pdev_enable_device() is a leftover from very old (2.0, I guess) alpha PCI code. It's used in pci_assign_unassigned_resources() to enable *every* PCI device in the system. So, if we have two graphics cards on the same bus, both with legacy VGA IO... oops. Actually, only alpha relied on that due to the lack of pcibios_enable_device (which has already been fixed).
-
Manfred Spraul authored
This replaces the dynamically allocated two-level array in sys_poll with a dynamically allocated linked list. The current implementation causes at least two alloc/free calls, even if only one or two descriptors are polled. This reduces that to one alloc/free, and the .text segment is around 220 bytes shorter. The microbenchmark that polls one pipe fd is around 30% faster. [1140 cycles instead of 1604 cycles, Celeron mobile 1.13 GHz]
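
A rough userspace sketch of the linked-list layout described above, where each node and its descriptor entries come from a single allocation. The type names, `CHUNK_MAX`, and the helper functions are assumptions made for illustration; this is not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CHUNK_MAX 4  /* entries per node; illustrative value only */

/* Stand-in for a poll descriptor entry (hypothetical layout). */
struct fd_entry {
    int fd;
    short events;
    short revents;
};

/* List node with its entries stored inline: one malloc() per node. */
struct poll_chunk {
    struct poll_chunk *next;
    int len;
    struct fd_entry entries[];  /* flexible array member */
};

static void poll_chunks_free(struct poll_chunk *head)
{
    while (head != NULL) {
        struct poll_chunk *next = head->next;
        free(head);
        head = next;
    }
}

/* Build a list covering nfds entries; returns NULL for nfds <= 0 or on OOM. */
static struct poll_chunk *poll_chunks_alloc(int nfds)
{
    struct poll_chunk *head = NULL, **tail = &head;

    while (nfds > 0) {
        int n = nfds < CHUNK_MAX ? nfds : CHUNK_MAX;
        struct poll_chunk *c = malloc(sizeof *c + (size_t)n * sizeof c->entries[0]);

        if (c == NULL) {
            poll_chunks_free(head);
            return NULL;
        }
        c->next = NULL;
        c->len = n;
        *tail = c;
        tail = &c->next;
        nfds -= n;
    }
    return head;
}

/* Number of nodes, i.e. the number of allocations that were needed. */
static int poll_chunks_count(const struct poll_chunk *head)
{
    int count = 0;

    for (; head != NULL; head = head->next)
        count++;
    return count;
}
```

In this sketch, polling one or two descriptors needs a single node and therefore a single alloc/free pair, matching the saving the message describes.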
-
bk://linux-dj.bkbits.net/agpgart
Linus Torvalds authored
into home.transmeta.com:/home/torvalds/v2.5/linux
-
Dave Jones authored
into tetrachloride.(none):/mnt/stuff/kernel/2.5/agpgart
-
Dave Jones authored
-
http://lia64.bkbits.net/to-linus-2.5
Linus Torvalds authored
into home.transmeta.com:/home/torvalds/v2.5/linux
-
- 20 Dec, 2002 24 commits
-
-
Michael Milligan authored
-
David Mosberger authored
-
David Mosberger authored
-
Linus Torvalds authored
-
Linus Torvalds authored
-
bk://lsm.bkbits.net/linus-2.5
Linus Torvalds authored
into home.transmeta.com:/home/torvalds/v2.5/linux
-
Linus Torvalds authored
-
Chuck Lever authored
Description: Everywhere the NFS client uses the req_offset() function today, it adds req->wb_offset to the result. This patch simply makes "+req->wb_offset" a part of the req_offset() function. Test status: Passes all Connectathon '02 tests with v2, v3, UDP and TCP. Passes NFS torture tests on an x86 UP highmem system.
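
The refactoring can be sketched like this. The struct, its `page_index` field, and the 12-bit page shift are assumptions for illustration; the message only says that callers added req->wb_offset to the result and that the addition now lives inside req_offset().

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12  /* assumed 4 KiB pages, for this sketch only */

/* Hypothetical stand-in for an NFS page request. */
struct nfs_req_sketch {
    uint64_t page_index;     /* assumed: which page the request covers */
    unsigned int wb_offset;  /* offset of the data within that page */
};

/* Before: returns the page's file offset; every caller adds wb_offset. */
static uint64_t req_offset_old(const struct nfs_req_sketch *req)
{
    return req->page_index << SKETCH_PAGE_SHIFT;
}

/* After: the in-page offset is folded into the helper itself. */
static uint64_t req_offset_new(const struct nfs_req_sketch *req)
{
    return (req->page_index << SKETCH_PAGE_SHIFT) + req->wb_offset;
}
```

A caller that previously wrote `req_offset_old(req) + req->wb_offset` would then simply call `req_offset_new(req)`.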
-
Chuck Lever authored
Description: The default set_page_dirty address space op is too heavyweight for NFS, which doesn't use buffers.
-
Chuck Lever authored
Description: Andrew Morton suggested there are places in the NFS client that could make use of kmap_atomic instead of vanilla kmap in order to improve scalability on 8-way and higher SMP systems. Test status: Passes all Connectathon '02 tests with v2 and v3, UDP and TCP; passes NFS torture tests on a UP HIGHMEM x86 system.
-
Miles Bader authored
This moves most of the duplicated text in the various v850 platform-specific linker scripts (each of which was previously completely standalone) into cpp macros in vmlinux.lds.S, which are then used by the platform linker scripts as appropriate. This should make the scripts a lot easier to maintain. Also, a number of linker-script bugs are fixed.
-
Miles Bader authored
The old code seems completely wrong; I guess it was just left over from whichever architecture this code was copied from.
-
Miles Bader authored
These are used for the new in-kernel module loader (actually not all the relocation types are used right now, but are included for completeness). Only the EM_CYGNUS_V850 macro, which is in a global namespace, is added to <linux/elf.h>; the relocation types, which are private to the v850, are added to <asm-v850/elf.h>. [Perhaps some other archs can do a similar split, to reduce the bloat in <linux/elf.h>]
-
Miles Bader authored
-
Miles Bader authored
A few symbols are only defined when CONFIG_MMU=y, but are exported (by kernel/ksyms.c) unconditionally. This patch makes them conditional.
-
Miles Bader authored
Adds extra includes needed because sched.h doesn't include them anymore, and removes includes of sched.h where they're not really necessary.
-
William Lee Irwin III authored
Fix task->cpus_allowed bitmask truncations on 64-bit architectures. Originally by Bjorn Helgaas for 2.4.x.
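
As a hedged illustration of this class of bug: if a 64-bit CPU mask passes through a 32-bit intermediate, every CPU numbered 32 or higher silently drops out. The message does not say where the truncations occurred, so the functions below are hypothetical and only demonstrate the effect.

```c
#include <assert.h>
#include <stdint.h>

/* Buggy path (hypothetical): the mask is squeezed through a 32-bit value. */
static uint64_t mask_via_32bit(uint64_t mask)
{
    uint32_t narrow = (uint32_t)mask;  /* bits 32..63 are lost here */
    return narrow;
}

/* Fixed path (hypothetical): the mask keeps its full width throughout. */
static uint64_t mask_full_width(uint64_t mask)
{
    return mask;
}

/* Is `cpu` (0..63) set in the mask? */
static int cpu_allowed(uint64_t mask, unsigned int cpu)
{
    return (int)((mask >> cpu) & 1u);
}
```

For example, a task restricted to CPU 40 ends up with an empty mask on the truncating path, while CPUs below 32 are unaffected, which is why such bugs only show up on larger 64-bit machines.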
-
Linus Torvalds authored
into home.transmeta.com:/home/torvalds/v2.5/linux
-
Russell Cattelan authored
major changes to actually fit. SGI Modid: 2.5.x-xfs:slinx:132210a
-
Eric Sandeen authored
SGI Modid: 2.5.x-xfs:slinx:135454a
-
Nathan Scott authored
SGI Modid: 2.5.x-xfs:slinx:135453a
-
Nathan Scott authored
very first read on mount. Make some of the surrounding code dealing with buffers consistent. SGI Modid: 2.5.x-xfs:slinx:135452a
-
Christoph Hellwig authored
SGI Modid: 2.5.x-xfs:slinx:135307a
-
Christoph Hellwig authored
SGI Modid: 2.5.x-xfs:slinx:135308a
-