1. 15 Aug, 2002 34 commits
    • Oops. Fix from Paul Mackerras · 3d35153e
      James Simmons authored
    • Synced up. · cb0f6aa1
      James Simmons authored
    • [PATCH] reduce stack usage of sanitize_e820_map · 270ebb5c
      Benjamin LaHaise authored
      Currently, sanitize_e820_map uses 0x738 bytes of stack.  The patch below
      moves the arrays into __initdata, reducing stack usage to 0x34 bytes.
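      The shape of the fix can be sketched in standalone C; the array names and element types here are illustrative, not the exact 2.5.31 definitions (in the kernel the statics carry the __initdata attribute so the memory is reclaimed after boot):

```c
#include <stddef.h>

#define E820MAX 32

struct change_member {
	void *pbios;                /* illustrative fields, not the  */
	unsigned long long addr;    /* exact 2.5.31 struct layout    */
};

/* Before the fix, arrays like these were locals in sanitize_e820_map(),
 * costing ~0x738 bytes of kernel stack.  At file scope (tagged
 * __initdata in the kernel) they cost no stack at all. */
static struct change_member change_point_list[2 * E820MAX];
static struct change_member *change_point[2 * E820MAX];

/* What remains in the frame is just scalars and pointers. */
static size_t remaining_frame_cost(void)
{
	int i = 0;
	struct change_member *p = change_point_list;

	(void)i; (void)p; (void)change_point;
	return sizeof i + sizeof p;
}
```

      The same data is still per-boot-only; moving it off the stack trades a little image size for a much smaller frame.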
    • [PATCH] uninitialised local in generic_file_write · 7dd294f7
      Andrew Morton authored
      generic_file_write_nolock() is initialising the pagevec too late,
      so if we take an early `goto out' the kernel oopses.  O_DIRECT writes
      take that path.
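      A minimal sketch of the bug pattern and the fix, with hypothetical names standing in for generic_file_write_nolock()'s locals: the pagevec must be initialised before any branch that can reach the `goto out' label:

```c
#define PAGEVEC_SIZE 16

struct pagevec {
	unsigned nr;
	void *pages[PAGEVEC_SIZE];
};

static void pagevec_init(struct pagevec *pvec)
{
	pvec->nr = 0;
}

/* Sketch of the fixed control flow: initialisation happens before any
 * early exit, so `goto out' never drains an uninitialised vector.
 * The o_direct flag models the O_DIRECT early-exit path. */
static int generic_write_sketch(int o_direct)
{
	struct pagevec lru_pvec;
	int written = 0;

	pagevec_init(&lru_pvec);	/* the fix: moved above the goto */

	if (o_direct)
		goto out;		/* the path that used to oops */

	/* ... buffered write would add pages to lru_pvec here ... */
out:
	/* draining lru_pvec here is now always safe */
	return lru_pvec.nr ? -1 : written;
}
```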
    • [PATCH] PCI ID's for 2.5.31 · 75754eb4
      Martin Mares authored
      I've filtered all submissions to the ID database, merged new ID's from
      both 2.4.x and 2.5.x kernels and here is the result -- patch to 2.5.31
      pci.ids with all the new stuff. Could you please send it to Linus?
      (I would do it myself, but it seems I'll have a lot of work with the
      floods in Prague very soon.)
    • [PATCH] for i386 SETUP CODE · 9cbec887
      Keith Mannthey authored
      The following is a simple fix for an array overrun problem in
      mpparse.c.  I am working on a multiquad box which has an EISA bus
      in it for its service processor.  Its local bus number is 18,
      which is > 3 (see quad_local_to_mp_bus_id).  When NR_CPUS is close
      to the real number of CPUs, adding EISA bus #18 to the array
      stomps all over various things in memory.  The EISA bus does not
      need to be mapped anywhere in the kernel for anything.  This patch
      does not affect kernels without clustered APIC (multiquad) support.
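      The guard being described can be sketched like this (the array size and function name are illustrative, not the actual mpparse.c code):

```c
#define MAX_MP_BUSSES 4		/* illustrative: local bus ids 0..3 */

static int quad_local_to_mp_bus_id[MAX_MP_BUSSES];

/* Skip buses whose local number would index past the end of the array
 * (such as the quad's local EISA bus, number 18) instead of letting
 * the store stomp whatever happens to sit after it in memory. */
static int record_quad_bus(int local_bus, int mp_bus)
{
	if (local_bus < 0 || local_bus >= MAX_MP_BUSSES)
		return -1;	/* ignore: this bus needs no mapping */
	quad_local_to_mp_bus_id[local_bus] = mp_bus;
	return 0;
}
```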
    • [PATCH] Clean up the RPC socket slot allocation code [2/2] · fb9100d0
      Trond Myklebust authored
      Patch by Chuck Lever. Remove the timeout logic from call_reserve.
      This improves the overall RPC call ordering, and ensures that soft
      tasks don't time out and give up before they have attempted to send
      their message down the socket.
    • [PATCH] Clean up the RPC socket slot allocation code [1/2] · 7a72fa16
      Trond Myklebust authored
      Another patch by Chuck Lever. Fixes up some nasty logic in
      call_reserveresult().
    • [PATCH] cleanup RPC accounting · be6dd3ef
      Trond Myklebust authored
      The following patch is by Chuck Lever, and fixes an accounting
      error in the 'rpc' field in /proc/net/rpc/nfs.
    • [PATCH] Fix typo in the RPC reconnect code... · 0e6a8740
      Trond Myklebust authored
      The following patch fixes a typo that appears in both kernel
      2.4.19 and kernel 2.5.31.
    • [PATCH] 2.5.31 reverse spin_lock_irq for i2c-elektor.c · 2e2fa887
      Albert Cranford authored
      Please reverse the deadlocking change to i2c-elektor.c.
    • [PATCH] deferred and batched addition of pages to the LRU · 44260240
      Andrew Morton authored
      The remaining source of page-at-a-time activity against
      pagemap_lru_lock is the anonymous pagefault path, which cannot be
      changed to operate against multiple pages at a time.
      
      But what we can do is to batch up just its adding of pages to the LRU,
      via buffering and deferral.
      
      This patch is based on work from Bill Irwin.
      
      The patch changes lru_cache_add to put the pages into a per-CPU
      pagevec.  They are added to the LRU 16-at-a-time.
      
      And in the page reclaim code, purge the local CPU's buffer before
      starting.  This is mainly to decrease the chances of pages staying off
      the LRU for very long periods: if the machine is under memory pressure,
      CPUs will spill their pages onto the LRU promptly.
      
      A consequence of this change is that we can have up to 15*num_cpus
      pages which are not on the LRU.  Which could have a slight effect on VM
      accuracy, but I find that doubtful.  If the system is under memory
      pressure the pages will be added to the LRU promptly, and these pages
      are the most-recently-touched ones - the VM isn't very interested in
      them anyway.
      
      This optimisation could be made SMP-specific, but I felt it best to
      turn it on for UP as well for consistency and better testing coverage.
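      The buffering scheme can be modelled in plain C; the counters stand in for the real LRU list and pagemap_lru_lock, and the per-CPU aspect is reduced to a single static buffer for the sketch:

```c
#define PAGEVEC_SIZE 16

struct page { int id; };

static struct page *cpu_buf[PAGEVEC_SIZE];	/* models one CPU's pagevec */
static unsigned cpu_buf_nr;
static unsigned lru_len;	/* models the global LRU list            */
static unsigned lock_trips;	/* counts pagemap_lru_lock acquisitions  */

/* Drain: take the lock once and splice every buffered page onto the LRU. */
static void lru_add_drain(void)
{
	if (!cpu_buf_nr)
		return;
	lock_trips++;			/* one lock round-trip per batch */
	lru_len += cpu_buf_nr;
	cpu_buf_nr = 0;
}

/* Batched lru_cache_add(): buffer the page, flush when 16 are queued. */
static void lru_cache_add(struct page *page)
{
	cpu_buf[cpu_buf_nr++] = page;
	if (cpu_buf_nr == PAGEVEC_SIZE)
		lru_add_drain();
}
```

      Adding 32 pages this way takes the lock twice instead of 32 times, which is the whole point of the patch; the reclaim path calls the drain first so buffered pages cannot hide from it for long.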
    • [PATCH] pagemap_lru_lock wrapup · eed29d66
      Andrew Morton authored
      Some fallout from the pagemap_lru_lock changes:
      
      - lru_cache_del() is no longer used.  Kill it.
      
      - page_cache_release() almost never actually frees pages.  So inline
        page_cache_release() and move its rarely-called slow path into (the
        misnamed) mm/swap.c
      
      - update the locking comment in filemap.c.  pagemap_lru_lock used to
        be one of the outermost locks in the VM locking hierarchy.  Now, we
        never take any other locks while holding pagemap_lru_lock.  So it
        doesn't have any relationship with anything.
      
      - put_page() now removes pages from the LRU on the final put.  The
        lock is interrupt safe.
    • [PATCH] make pagemap_lru_lock irq-safe · aaba9265
      Andrew Morton authored
      It is expensive for a CPU to take an interrupt while holding the page
      LRU lock, because other CPUs will pile up on the lock while the
      interrupt runs.
      
      Disabling interrupts while holding the lock reduces contention by an
      additional 30% on 4-way.  This is when the only source of interrupts is
      disk completion.  The improvement will be higher with more CPUs and it
      will be higher if there is networking happening.
      
      The maximum hold time of this lock is 17 microseconds on a 500 MHz PIII,
      which is well inside the kernel's maximum interrupt latency (which was
      100 usecs when I last looked, a year ago).
      
      This optimisation is not needed on uniprocessor, but the patch disables
      IRQs while holding pagemap_lru_lock anyway, so it becomes an irq-safe
      spinlock, and pages can be moved from the LRU in interrupt context.
      
      pagemap_lru_lock has been renamed to _pagemap_lru_lock to pick up any
      missed uses, and to reliably break any out-of-tree patches which may be
      using the old semantics.
    • [PATCH] batched removal of pages from the LRU · 008f707c
      Andrew Morton authored
      Convert all the bulk callers of lru_cache_del() to use the batched
      pagevec_lru_del() function.
      
      Change truncate_complete_page() to not delete the page from the LRU.
      Do it in page_cache_release() instead.  (This reintroduces the problem
      with final-release-from-interrupt.  That gets fixed further on).
      
      This patch changes the truncate locking somewhat.  The removal from the
      LRU now happens _after_ the page has been removed from the
      address_space and has been unlocked.  So there is now a window where
      the shrink_cache code can discover the to-be-freed page via the LRU
      list.  But that's OK - the page is clean, its buffers (if any) are
      clean.  It's not attached to any mapping.
    • [PATCH] batched addition of pages to the LRU · 9eb76ee2
      Andrew Morton authored
      The patch goes through the various places which were calling
      lru_cache_add() against bulk pages and batches them up.
      
      Also.  This whole patch series improves the behaviour of the system
      under heavy writeback load.  There is a reduction in page allocation
      failures, some reduction in loss of interactivity due to page
      allocators getting stuck on writeback from the VM.  (This is still bad
      though).
      
      I think it's due to the change here in mpage_writepages().  That
      function was originally unconditionally refiling written-back pages to
      the head of the inactive list.  The theory being that they should be
      moved out of the way of page allocators, who would end up waiting on
      them.
      
      It appears that this simply had the effect of pushing dirty, unwritten
      data closer to the tail of the inactive list, making things worse.
      
      So instead, if the caller is (typically) balance_dirty_pages() then
      leave the pages where they are on the LRU.
      
      If the caller is PF_MEMALLOC then the pages *have* to be refiled.  This
      is because VM writeback is clustered along mapping->dirty_pages, and
      it's almost certain that the pages which are being written are near the
      tail of the LRU.  If they were left there, page allocators would block
      on them too soon.  It would effectively become a synchronous write.
    • [PATCH] batched movement of lru pages in writeback · 823e0df8
      Andrew Morton authored
      Makes mpage_writepages() move pages around on the LRU sixteen-at-a-time
      rather than one-at-a-time.
    • [PATCH] multithread page reclaim · 3aa1dc77
      Andrew Morton authored
      This patch multithreads the main page reclaim function, shrink_cache().
      
      This function used to run under pagemap_lru_lock.  Instead, we grab
      that lock, put 32 pages from the LRU into a private list, drop the
      pagemap_lru_lock and then proceed to attempt to free those pages.
      
      Any pages which were successfully reclaimed are batch-freed.  Pages
      which were not reclaimed are re-added to the LRU.
      
      This patch reduces pagemap_lru_lock contention on the 4-way by a factor
      of thirty.
      
      The shrink_cache() code has been simplified somewhat.
      
      refill_inactive() was being called too often - often just to process
      two or three pages.  Fiddled with that so it processes pages at the
      same rate, but works on 32 pages at a time.
      
      Added a couple of mark_page_accessed() calls into mm/memory.c from 2.4.
      They seem appropriate.
      
      Change the shrink_caches() logic so that it will still trickle through
      the active list (via refill_inactive) even if the inactive list is much
      larger than the active list.
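      The detach-then-reclaim pattern can be modelled with a toy LRU; the flag variable stands in for pagemap_lru_lock, and each entry simply records whether the page is reclaimable:

```c
#define SWAP_CLUSTER_MAX 32

static int lru[1024];		/* toy LRU: 1 = reclaimable, 0 = busy */
static int lru_len;
static int lock_held;		/* models pagemap_lru_lock */

/* Detach up to 32 pages under the lock, then reclaim with the lock
 * dropped so other CPUs can work on the LRU meanwhile.  Pages that
 * could not be reclaimed go back onto the LRU under the lock. */
static int shrink_cache_sketch(void)
{
	int batch[SWAP_CLUSTER_MAX];
	int n = 0, freed = 0, i;

	lock_held = 1;				/* spin_lock(&pagemap_lru_lock) */
	while (lru_len > 0 && n < SWAP_CLUSTER_MAX)
		batch[n++] = lru[--lru_len];	/* move onto a private list */
	lock_held = 0;				/* spin_unlock(...) */

	for (i = 0; i < n; i++) {
		if (batch[i]) {
			freed++;		/* batch-freed, lock not held */
		} else {
			lock_held = 1;		/* re-add failures to the LRU */
			lru[lru_len++] = batch[i];
			lock_held = 0;
		}
	}
	return freed;
}
```

      The expensive per-page work (trying to free each page) now runs entirely outside the lock, which is where the factor-of-thirty contention reduction comes from.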
    • [PATCH] pagevec infrastructure · 6a952840
      Andrew Morton authored
      This is the first patch in a series of eight which address
      pagemap_lru_lock contention, and which simplify the VM locking
      hierarchy.
      
      Most testing has been done with all eight patches applied, so it would
      be best not to cherrypick, please.
      
      The workload which was optimised was: 4x500MHz PIII CPUs, mem=512m, six
      disks, six filesystems, six processes each flat-out writing a large
      file onto one of the disks.  ie: heavy page replacement load.
      
      The frequency with which pagemap_lru_lock is taken is reduced by 90%.
      
      Lockmeter claims that pagemap_lru_lock contention on the 4-way has been
      reduced by 98%.  Total amount of system time lost to lock spinning went
      from 2.5% to 0.85%.
      
      Anton ran a similar test on 8-way PPC, the reduction in system time was
      around 25%, and the reduction in time spent playing with
      pagemap_lru_lock was 80%.
      
      	http://samba.org/~anton/linux/2.5.30/standard/
      versus
      	http://samba.org/~anton/linux/2.5.30/akpm/
      
      Throughput changes on uniprocessor are modest: a 1% speedup with this
      workload due to shortened code paths and improved cache locality.
      
      The patches do two main things:
      
      1: In almost all places where the kernel was doing something with
         lots of pages one-at-a-time, convert the code to do the same thing
         sixteen-pages-at-a-time.  Take the lock once rather than sixteen
         times.  Take the lock for the minimum possible time.
      
      2: Multithread the pagecache reclaim function: don't hold
         pagemap_lru_lock while reclaiming pagecache pages.  That function
         was massively expensive.
      
      One fallout from this work is that we never take any other locks while
      holding pagemap_lru_lock.  So this lock conceptually disappears from
      the VM locking hierarchy.
      
      
      So.  This is all basically a code tweak to improve kernel scalability.
      It does it by optimising the existing design, rather than by redesign.
      There is little conceptual change to how the VM works.
      
      This is as far as I can tweak it.  It seems that the results are now
      acceptable on SMP.  But things are still bad on NUMA.  It is expected
      that the per-zone LRU and per-zone LRU lock patches will fix NUMA as
      well, but that has yet to be tested.
      
      
      This first patch introduces `struct pagevec', which is the basic unit
      of batched work.  It is simply:
      
      struct pagevec {
      	unsigned nr;
      	struct page *pages[16];
      };
      
      pagevecs are used in the following patches to get the VM away from
      page-at-a-time operations.
      
      This patch includes all the pagevec library functions which are used in
      later patches.
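      The helper shapes used by the later patches can be sketched around that struct; the names follow the patch series, but the bodies here are illustrative rather than the exact kernel code.  A natural convention is for the add to return the space remaining, so the caller drains exactly when the vector fills:

```c
#define PAGEVEC_SIZE 16

struct page { int id; };

struct pagevec {
	unsigned nr;
	struct page *pages[PAGEVEC_SIZE];
};

static void pagevec_init(struct pagevec *pvec)
{
	pvec->nr = 0;
}

static unsigned pagevec_count(struct pagevec *pvec)
{
	return pvec->nr;
}

/* Returns the space left after the add; callers can write
 * `if (!pagevec_add(pvec, page)) drain(pvec);` */
static unsigned pagevec_add(struct pagevec *pvec, struct page *page)
{
	pvec->pages[pvec->nr++] = page;
	return PAGEVEC_SIZE - pvec->nr;
}
```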
    • [PATCH] lockd shouldn't call posix_unblock_lock here · ecc9d325
      Matthew Wilcox authored
      nlmsvc_notify_blocked() is only called via the fl_notify() pointer which
      is only called immediately after we already did a locks_delete_block(),
      so calling posix_unblock_lock() here is always a NOP.
    • [PATCH] Modular x86 MTRR driver. · 6a85ced0
      Dave Jones authored
      This patch from Pat Mochel cleans up the hell that was mtrr.c
      into something a lot more modular and easy to understand, by
      doing the implementation-per-file as has been done to various
      other things by Pat and myself over the last months.
      
      It's functionally identical from a kernel internal point of view,
      and a userspace point of view, and is basically just a very large
      code clean up.
    • [PATCH] stale thread detach debugging removal · 3b307fd5
      Ingo Molnar authored
      one of the debugging tests triggered a false-positive BUG() when a
      detached thread was straced.
    • [PATCH] thread release infrastructure · d2b7244f
      Ingo Molnar authored
      it is much cleaner to pass in the address of the user-space VM lock -
      this will also enable arbitrary implementations of the stack-unlock, as
      the fifth clone() parameter.
    • [PATCH] init_tasks is not defined anywhere. · 86ae817e
      Rusty Russell authored
      It's referenced by mips and mips64 (both far out of date), but never
      actually defined anywhere.
    • Merge http://linuxusb.bkbits.net/linus-2.5 · edf3d92b
      Linus Torvalds authored
      into home.transmeta.com:/home/torvalds/v2.5/linux
    • [PATCH] es1371 synchronize_irq · 17454310
      Petr Vandrovec authored
      Update ES1371 to new synchronize_irq() API.
    • [PATCH] broken cfb* support in the 2.5.31-bk · 9299c003
      Petr Vandrovec authored
      line_length, type and visual moved from the display struct to the
      fb_info fix structure during the last fbdev updates.  Unfortunately
      the generic code was not updated at the same time, so every fbdev
      driver is now broken.
    • [PATCH] Unicode characters 0x80-0x9F are valid ISO* characters · 26036678
      Petr Vandrovec authored
      Characters 0x80-0x9F from ISO encodings are U+0080-U+009F, so map
      them both ways. Otherwise you cannot use chars 0x80-0x9F in filenames
      on filesystems using NLS.
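      The two-way mapping can be illustrated with ISO 8859-1, where every byte maps to the Unicode code point of the same value, including the C1 range 0x80-0x9F.  The table names mimic the kernel's NLS convention; this is a sketch, not the actual fs/nls code:

```c
static unsigned short charset2uni[256];	/* byte -> Unicode             */
static unsigned char  uni2charset[256];	/* Unicode (low page) -> byte  */

/* For ISO 8859-1 the mapping is the identity in both directions; the
 * fix is that 0x80-0x9F must be present in both tables, not dropped. */
static void build_iso8859_1_tables(void)
{
	int i;

	for (i = 0; i < 256; i++) {
		charset2uni[i] = (unsigned short)i;
		uni2charset[i] = (unsigned char)i;
	}
}
```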
    • Merge http://linux-scsi.bkbits.net/scsi-for-linus-2.5 · f9969cbe
      Linus Torvalds authored
      into home.transmeta.com:/home/torvalds/v2.5/linux
    • Merge bk://ldm.bkbits.net/linux-2.5 · ad2d842b
      Linus Torvalds authored
      into home.transmeta.com:/home/torvalds/v2.5/linux
    • [PATCH] Trivial: remove sti from aic7xxx_old · 0352f6f5
      Matthew Wilcox authored
      We don't need to reenable interrupts before calling panic.
    • [PATCH] umem per-disk gendisks · 49ae70c0
      Alexander Viro authored
    • [PATCH] dasd per-disk gendisks · 664aa7b2
      Alexander Viro authored
    • [PATCH] acsi per-disk gendisks · bedbeab4
      Alexander Viro authored
  2. 14 Aug, 2002 6 commits