1. 20 Apr, 2003 40 commits
    • Merge home.transmeta.com:/home/torvalds/v2.5/akpm · 8f421acb
      Linus Torvalds authored
      into home.transmeta.com:/home/torvalds/v2.5/linux
    • Linus Torvalds's avatar
      2d0ed106
    • [PATCH] fbdev build fix · b009a1c6
      Andrew Morton authored
      - fb_prepare_logo() is calling the undefined find_logo().  I think it wants
        fb_find_logo().
      
      - fb_prepare_logo is not __init, therefore fb_find_logo() cannot be __init.
    • [PATCH] Aggregated disk statistics · 34221611
      Andrew Morton authored
      From: Rick Lindsley <ricklind@us.ibm.com>
      
      To access all the system's disk statistics we currently need to access one
      sysfs file per disk.  This clearly will not be acceptable with thousands of
      disks.
      
      The patch aggregates the system-wide statistics in real time and exposes them
      via /proc/diskstats
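As a rough illustration of why a single aggregated file is convenient, the sketch below parses /proc/diskstats-style text in one pass. The field names follow the layout documented for later kernels, which is an assumption here; the commit message does not spell out the columns, and `SAMPLE` is made-up data.

```python
from collections import namedtuple

# Field names follow the layout documented for /proc/diskstats in later
# kernels; applying them to this patch's output is an assumption.
DiskStat = namedtuple(
    "DiskStat",
    "major minor name reads reads_merged sectors_read ms_reading "
    "writes writes_merged sectors_written ms_writing "
    "in_flight ms_io weighted_ms_io",
)

def parse_diskstats(text):
    """Parse /proc/diskstats-style text into a dict keyed by device name."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip blank or short lines
        major, minor, name = int(fields[0]), int(fields[1]), fields[2]
        counters = [int(x) for x in fields[3:14]]
        stats[name] = DiskStat(major, minor, name, *counters)
    return stats

# Made-up sample input, one line per disk.
SAMPLE = """\
   8       0 sda 100 5 2048 30 50 2 1024 20 0 40 50
   8      16 sdb 10 0 80 3 0 0 0 0 0 3 3
"""
```

On a live system the same function could be fed `open("/proc/diskstats").read()`, so one read covers every disk instead of one sysfs file per disk.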
    • [PATCH] Fix nfsctl for larger dev_t · 36ba76bb
      Andrew Morton authored
      From: Andries.Brouwer@cwi.nl
      
      The old NFS control interface passes dev_t's in from userspace.  This patch
      keeps it working when the size of dev_t changes.
      
      This is a deprecated interface - new nfs-utils uses an ascii representation
      in exportfs.
      
      Acked by Neil.
    • [PATCH] smbfs: larger dev_t preparation · f0d10803
      Andrew Morton authored
      Discard fewer bits of the device number received with smb.
      This does not depend on anything else.
      
      Andries
    • [PATCH] prepare device mapper for larger dev_t · cc43a08a
      Andrew Morton authored
      From: Joe Thornber <thornber@sistina.com>
      
      The only other thing that will need changing in dm to cope with 64bit
      dev_t concerns the bitset I'm using to keep track of allocated minor
      numbers.  A trivial patch like this would work for now:
    • [PATCH] don't shrink slab for highmem allocations · 5a08774a
      Andrew Morton authored
      From: William Lee Irwin III <wli@holomorphy.com>
      
      If one's goal is to free highmem pages, shrink_slab() is an ineffective
      method of recovering them, as slab pages are all ZONE_NORMAL or ZONE_DMA.
      Hence, this "FIXME: do not do for zone highmem".  Presumably this is a
      question of policy, as highmem allocations may be satisfied by reaping slab
      pages and handing them back; but the FIXME says what we should do.
    • [PATCH] Extend map_vm_area()/get_vm_area() · 2096040f
      Andrew Morton authored
      From: Christoph Hellwig <hch@infradead.org> and David M-T
      
      The ia64 port can use vmap(), but needs to be able to specify the protection
      flags and the resulting vma's vm_flags.
      
      The patch adds the two extra args to vmap(), updates the two callers and
      fixes some comment spellos.
    • [PATCH] fix CONFIG_NOMMU mismerges · 4a6b60f2
      Andrew Morton authored
      From: Christoph Hellwig <hch@lst.de>
      
      We already have better stubs in nommu.c; the additional inlines in mm.h only
      cause compile failures.
    • [PATCH] Allocate hd_structs dynamically · 5fb58500
      Andrew Morton authored
      From: Badari Pulavarty <pbadari@us.ibm.com>
      
      Here is the patch to allocate hd_struct dynamically as we find
      partitions.
      
      There are 3 things I didn't like in the patch.
      
      1) The patch allocates 15 pointers instead of 15 hd_structs (in the case
         of scsi).  I was really hoping to get rid of "15" and make it really
         dynamic (in case we ever want to support more than 15 partitions
         per disk etc.).  I thought about making it a linked list, but
         blk_partition_remap() needs to get to the hd_struct for a given partition
         every time we do IO.  So a linked list would be bad; we really need direct
         access to partition info.

      2) I had to add "partno" to hd_struct, since part_dev_read() used to
         calculate the partition number from the address before.
      
      3) kmalloc() failure in add_partition() will be silently ignored.
      
      It saves 2048 bytes per disk.
    • [PATCH] shm_get_stat-handle-hugetlb-pages.patch · 88bdd4c3
      Andrew Morton authored
      From: William Lee Irwin III <wli@holomorphy.com>
      
      shm_get_stat() didn't know about hugetlbpage-backed shm.
    • [PATCH] DAC960: add call to blk_queue_bounce_limit · d7b557d1
      Andrew Morton authored
      From: Dave Olien <dmo@osdl.org>
      
      The following patch adds a call to blk_queue_bounce_limit to the DAC960
      driver.  Otherwise, it uses bounce buffering more than it needs to.
    • [PATCH] oom-kill: preferentially kill swapoff · cca095e0
      Andrew Morton authored
      From: Hugh Dickins <hugh@veritas.com>
      
      The current behaviour is that once swapoff has filled memory, other tasks get
      OOMkilled one by one until swapoff completes, or more likely hangs.  It is
      better that swapoff be the first choice for OOMkill.
      
      The patch changes the oom-killer so that it will kill off any
      currently-running swapoff instance before killing any other task.
      
      (Bit kludgy, couldn't think of a better way)
    • [PATCH] Permit interruption of swapoff · 6bf11a46
      Andrew Morton authored
      From: Hugh Dickins <hugh@veritas.com>
      
      Sometimes you start a swapoff and, seeing how long it takes, wish you had
      not: allow a signal to interrupt and stop swapoff.
    • [PATCH] Disallow swapoff if there is insufficient memory · 464f4e78
      Andrew Morton authored
      From: Hugh Dickins <hugh@veritas.com>
      
      First of three small "stop swapoff" patches based on 2.5.67-mm3:
      
      stop swapoff 1/3 vm_enough_memory?
      
      Before embarking upon swapoff, check vm_enough_memory.  Mainly
      for consistency in the overcommit_memory 2 (strict accounting) case:
      fail with -ENOMEM if it wouldn't let the amount removed be committed.
      
      Will always succeed in the overcommit_memory 1 case, as it should in
      root-shoot-foot mode.  In the overcommit_memory 0 case, well, I don't
      care much either way, so opted for the simplest code: no special case.
      Which means it could now fail at the start; but that's unlikely (case 0
      is over-generous) and only when it would have got stuck later on anyway.
    • [PATCH] use __GFP_REPEAT in pmd_alloc_one() · 36f6aa1b
      Andrew Morton authored
      Convert all pmd_alloc_one() implementations to use __GFP_REPEAT
    • [PATCH] use __GFP_REPEAT in pte_alloc_one() · 68b5a30f
      Andrew Morton authored
      Remove all the open-coded retry loops in various architectures, use
      __GFP_REPEAT.
      
      It could be that at some time in the future we change __GFP_REPEAT to give up
      after ten seconds or so, so all the checks for failed allocations are
      retained.
    • [PATCH] make alloc_buffer_head take gfp_flags · 8db50e8b
      Andrew Morton authored
      - alloc_buffer_head() should take the allocation mode as an arg, and not
        assume.
      
      - Use __GFP_NOFAIL in JBD's call to alloc_buffer_head().
      
      - Remove all the retry code from jbd_kmalloc() - do it via page allocator
        controls.
    • [PATCH] implement __GFP_REPEAT, __GFP_NOFAIL, __GFP_NORETRY · 75908778
      Andrew Morton authored
      This is a cleanup patch.
      
      There are quite a lot of places in the kernel which will infinitely retry a
      memory allocation.
      
      Generally, they get it wrong.  Some do yield(), the semantics of which have
      changed over time.  Some do schedule(), which can lock up if the caller is
      SCHED_FIFO/RR.  Some do schedule_timeout(), etc.
      
      And often it is unnecessary, because the page allocator will do the retry
      internally anyway.  But we cannot rely on that - this behaviour may change
      (-aa and -rmap kernels do not do this, for instance).
      
      So it is good to formalise and to centralise this operation.  If an
      allocation specifies __GFP_REPEAT then the page allocator must infinitely
      retry the allocation.
      
      The semantics of __GFP_REPEAT are "try harder".  The allocation _may_ fail
      (the 2.4 -aa and -rmap VM's do not retry infinitely by default).
      
      The semantics of __GFP_NOFAIL are "cannot fail".  It is a no-op in this VM,
      but needs to be honoured (or fix up the callers) if the VM is changed to not
      retry infinitely by default.
      
      The semantics of __GFP_NORETRY are "try once, don't loop".  This isn't used
      at present (although perhaps it should be, in swapoff).  It is mainly for
      completeness.
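The three retry policies described in this message can be modelled in user space roughly as follows. This is only a sketch: the flag values, the `repeat_limit` bound, the single-attempt default and the `failing_first` stub are all assumptions made for illustration, and the third flag is named as in the commit title.

```python
# Hypothetical flag values; the kernel's real bit assignments are not
# given in the commit message.
GFP_REPEAT = 0x1   # "try harder": retry, but the allocation may still fail
GFP_NOFAIL = 0x2   # "cannot fail": retry until the allocation succeeds
GFP_NORETRY = 0x4  # "try once, don't loop"

def alloc_with_policy(try_alloc, flags, repeat_limit=10):
    """Call try_alloc() under the retry policy selected by flags.

    try_alloc returns an object on success or None on failure.
    repeat_limit is an assumed bound for GFP_REPEAT, standing in for a
    possible future "give up after a while" behaviour.
    """
    if flags & GFP_NOFAIL:
        while True:  # the caller cannot handle failure, so keep trying
            obj = try_alloc()
            if obj is not None:
                return obj
    if flags & GFP_REPEAT and not flags & GFP_NORETRY:
        attempts = repeat_limit
    else:
        attempts = 1  # GFP_NORETRY, or no flag (assumed default)
    for _ in range(attempts):
        obj = try_alloc()
        if obj is not None:
            return obj
    return None  # callers must still check for failure

def failing_first(n):
    """Return an allocator stub that fails n times, then succeeds."""
    state = {"calls": 0}
    def try_alloc():
        state["calls"] += 1
        return None if state["calls"] <= n else "page"
    return try_alloc
```

Centralising the loop in one place is the point of the patch: callers state the policy they need instead of open-coding yield() or schedule() retries.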
    • [PATCH] shmdt() speedup · efbb77b2
      Andrew Morton authored
      From: William Lee Irwin III <wli@holomorphy.com>
      
      Micro-optimize sys_shmdt(). There are methods of exploiting knowledge
      of the vma's being searched to restrict the search space. These are:
      
      (1) shm mappings always start their lives at file offset 0, so only
      	vma's above shmaddr need be considered. find_vma() can be used
      	to seek to the proper position in mm->mmap in O(lg(n)) time.
      
      (2) The search is for a vma which could be a fragment of a broken-up
      	shm mapping, which would have been created starting at shmaddr
      	with vm_pgoff 0 and then continued no further into userspace
      	than shmaddr + size. So after having found an initial vma, find
      	the size of the shm segment it maps to calculate an upper bound
      	to the virtual address space that needs to be searched.
      
      (3) mremap() would have caused the original checks to miss vma's mapping
      	the shm segment if shmaddr were the original address at which
      	the shm segments were attached. This does no better and no worse
      	than the original code in that situation.
      
      (4) If the chain of references in vma->vm_file->f_dentry->d_inode->i_size
      	is not guaranteed by refcounting and/or the shm code then this is
      	oopsable; AFAICT an inode is always allocated.
    • [PATCH] AIO mmap fix · bb455250
      Andrew Morton authored
      From: Badari Pulavarty <pbadari@us.ibm.com>
      
      Here is a small bug fix for AIO. get_user_pages() takes the number
      of pages to map as its argument (not the number of bytes).
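The bytes-versus-pages distinction comes down to a small conversion. The helper below is a generic sketch of that arithmetic, not the AIO code itself; `pages_spanned` is a hypothetical name and the 4096-byte page size is an assumption.

```python
PAGE_SIZE = 4096  # assumed page size for the example

def pages_spanned(addr, nbytes):
    """Number of pages touched by the byte range [addr, addr + nbytes)."""
    if nbytes <= 0:
        return 0
    first = addr // PAGE_SIZE
    last = (addr + nbytes - 1) // PAGE_SIZE
    return last - first + 1
```

Passing a byte count where a page count is expected overstates the request by a factor of the page size, which is the kind of mistake the fix addresses.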
    • [PATCH] quotactl(): sync all quotas · d637ceb0
      Andrew Morton authored
      From: Jan Kara <jack@suse.cz>
      
        I'm resending a patch which implements quotactl(2) call for syncing
      all devices. In particular, it allows the caller not to specify the device
      for syncing and in that case quotas on all the devices are written.
      The patch is rather trivial (mostly moving the code).
    • [PATCH] ATI Mach64 build fix · 061fa91f
      Andrew Morton authored
      From: Geert Uytterhoeven <geert@linux-m68k.org>
      
      Atyfb: Add missing parts of reversal of Mobility changes, allowing ATI Mach64
      GX support to compile again.
    • [PATCH] hugetlb math overflow fix · 03b83710
      Andrew Morton authored
      From: William Lee Irwin III <wli@holomorphy.com>
      
      And this one fixes an overflow when there is more than 4GB of hugetlb.
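The message does not say which expression overflowed, so the following is only a generic illustration of how a 32-bit byte count wraps at 4GB. The 4 MB huge page size and both function names are assumptions.

```python
HPAGE_SHIFT = 22     # assumed 4 MB huge pages
MASK32 = 0xFFFFFFFF  # models a 32-bit unsigned long

def hugetlb_bytes_32(nr_hugepages):
    """Byte count as a 32-bit unsigned computation would produce it."""
    return (nr_hugepages << HPAGE_SHIFT) & MASK32

def hugetlb_bytes_64(nr_hugepages):
    """Byte count computed in a type wide enough to hold it."""
    return nr_hugepages << HPAGE_SHIFT
```

With 1024 huge pages the 32-bit result wraps to zero, while the wider computation gives the full 4GB.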
    • [PATCH] follow_hugetlb_page fix · a3efc1fa
      Andrew Morton authored
      From: William Lee Irwin III <wli@holomorphy.com>
      
      follow_hugetlb_page() drops out of the loop prematurely and fails to take the
      appropriate refcounts if its starting address was not hugepage-aligned.
      
      It looked a bit unclean too, so I rewrote it.  This fixes a bug, and more
      importantly, makes the thing readable by something other than a compiler
      (e.g.  programmers).
    • [PATCH] Clean up various buffer-head dependencies · cda55f33
      Andrew Morton authored
      From: William Lee Irwin III <wli@holomorphy.com>
      
      Remove page_has_buffers() from various functions, document the dependencies
      on buffer_head.h from other files besides filemap.c, and s/this file/core VM/
      in filemap.c
    • [PATCH] Move __set_page_dirty_buffers to fs/buffer.c · 5549174d
      Andrew Morton authored
      From: William Lee Irwin III <wli@holomorphy.com>
      
      Move __set_page_dirty_buffers() to fs/buffer.c, as per the FIXME.
    • [PATCH] Turn on NUMA rebalancing · 26fbf90f
      Andrew Morton authored
      From: "Martin J. Bligh" <mbligh@aracnet.com>
      
      I'd forgotten that I'd set this to only fire every 20s in the past, because
      it would rebalance too aggressively.  That seems to be fixed now, so we should
      turn it back on.
    • [PATCH] Make PCI scanning order the same as 2.4 · 609b0188
      Andrew Morton authored
      From: Chuck Ebbert <76306.1226@compuserve.com>
      
      2.4 builds its global PCI device list in breadth-first order.
      
      2.5 is doing the scan that way but defers the construction of the global list
      until later and then does it depth-first.  This causes devices to be found in
      different order by drivers.  The below fixed that problem for me.
      
      Russell King has acked this change.
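The difference between the two orderings shows up as soon as a bridge sits between two other devices on a bus. The sketch below uses a made-up topology (`BUSES`) and hypothetical helper names; it models list construction only, not the kernel's PCI code.

```python
from collections import deque

# Hypothetical topology: each bus lists its devices in slot order; a device
# given as (name, child_bus) is a bridge leading to child_bus.
BUSES = {
    0: ["00:01.0", ("00:02.0", 1), "00:03.0"],
    1: ["01:00.0", "01:01.0"],
}

def _split(dev):
    """Return (name, child_bus); child_bus is None for a plain device."""
    return dev if isinstance(dev, tuple) else (dev, None)

def breadth_first(buses, root=0):
    """List every device on a bus before descending behind its bridges."""
    order, queue = [], deque([root])
    while queue:
        bus = queue.popleft()
        for dev in buses[bus]:
            name, child = _split(dev)
            order.append(name)
            if child is not None:
                queue.append(child)
    return order

def depth_first(buses, bus=0):
    """Descend behind each bridge as soon as it is encountered."""
    order = []
    for dev in buses[bus]:
        name, child = _split(dev)
        order.append(name)
        if child is not None:
            order.extend(depth_first(buses, child))
    return order
```

In this example device 00:03.0 comes third breadth-first but last depth-first, so a driver that binds to "the first matching device" can pick a different one depending on how the global list was built.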
    • [PATCH] keyboard.c Fix SAK in raw mode · 5da505b1
      Andrew Morton authored
      From: Chris Heath <chris@heathens.co.nz>
      
      Trivial fix to get the SAK key working in raw and medium raw modes.  Patch is
      against kernel 2.5.67.
    • [PATCH] Minor fix for driver/serial/core.c · 72689e67
      Andrew Morton authored
      From: Jean Tourrilhes <jt@bougret.hpl.hp.com>
      
      	The following command will do nothing at all on 2.5.X :
      		setserial /dev/ttyS0 uart none
    • [PATCH] detect_lost_tick locking fixes · d9a4b6c5
      Andrew Morton authored
      From: john stultz <johnstul@us.ibm.com>
      
      This patch fixes a race in the timer_interrupt code caused by
      detect_lost_tick().  Since we're doing lost-tick compensation outside
      timer->mark_offset, time can pass between time-source reads which can cause
      gettimeofday inconsistencies.
      
      Additionally detect_lost_tick() was broken for the PIT case, since the whole
      point of detect_lost_tick() is to interpolate between two time sources to
      find inconsistencies.  Additionally this could cause xtime_lock seq_lock
      reader starvation which has been causing machine hangs for SMP boxes that use
      the PIT as a time source.
      
      This patch fixes the described race by removing detect_lost_tick() and
      instead implementing the lost tick detection code inside mark_offset().
      
      Some of the divs and mods being added here might concern folks, but by not
      calling timer->get_offset() in detect_lost_tick() we eliminate much of the
      same math.  I did some simple cycle counting and the new code comes out on
      average equivalent or faster.
    • [PATCH] get_offset_pit and do_timer_overflow vs IRQ locking · e2ac56f6
      Andrew Morton authored
      From: john stultz <johnstul@us.ibm.com>, Alexander Atanasov <alex@ssi.bg>
      
      We want to make sure we update jiffies_p and count_p atomically.  So I'm
      inserting the spin_unlock_irqrestore() after we update count_p, rather than
      just before.
    • [PATCH] Fix jiffies_to_time[spec | val] and converse to use · 0ebcfd99
      Andrew Morton authored
      From: george anzinger <george@mvista.com>
      
      In the current system (2.5.67) time_spec to jiffies, time_val to
      jiffies and the converse (jiffies to time_val and jiffies to
      time_spec) all use 1/HZ as the measure of a jiffie.  Because of the
      inability of the PIT to actually generate an accurate 1/HZ interrupt,
      the wall clock is updated with a more accurate value (999848
      nanoseconds per jiffie for HZ = 1000).  This causes a 1/HZ
      interpretation of jiffies based timing to run faster than the wall
      clock, thus causing sleeps and timers to expire short of the requested
      time.  Try, for example:
      
      time sleep 60
      
      This patch changes the conversion routines to use the same value as
      the wall clock update code to do the conversions.
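The shortfall for the `time sleep 60` example can be worked out directly from the 999848 ns figure quoted above. The helper names are hypothetical, and rounding the corrected conversion up (so a timer never expires early) is an assumption rather than something the message states.

```python
NSEC_PER_SEC = 1_000_000_000
HZ = 1000
TICK_NSEC = 999_848  # actual jiffy length for HZ = 1000, from the message

def naive_jiffies(seconds):
    """Conversion that treats one jiffy as exactly 1/HZ seconds."""
    return seconds * HZ

def corrected_jiffies(seconds):
    """Conversion using the real tick length, rounded up (assumed)."""
    nsec = seconds * NSEC_PER_SEC
    return (nsec + TICK_NSEC - 1) // TICK_NSEC

def elapsed_nsec(jiffies):
    """Wall-clock time that actually passes in the given number of jiffies."""
    return jiffies * TICK_NSEC
```

A 60 second sleep converted naively becomes 60000 jiffies, which last 59.99088 s of wall-clock time, about 9.12 ms short. Using the real tick length gives 60010 jiffies, which cover the full 60 s.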
      
      The actual math is almost all done at compile time.  The run time
      conversions require little if any more execution time.
      
      This patch must be applied after the patch I posted earlier today
      which fixed the CLOCK_MONOTONIC resolution issue.
    • [PATCH] Fix POSIX timers to give CLOCK_MONOTONIC full · 2f98681f
      Andrew Morton authored
      The POSIX CLOCK_MONOTONIC currently has only 1/HZ resolution.  Further, it is
      tied to jiffies (i.e.  is a restatement of jiffies) rather than "xtime" or the
      gettimeofday() clock.

      This patch changes CLOCK_MONOTONIC to be a restatement of gettimeofday() plus
      an offset to remove any clock setting activity from CLOCK_MONOTONIC.  An
      offset is kept that represents the difference between CLOCK_MONOTONIC and
      gettimeofday().  This offset is updated whenever the gettimeofday() clock is
      set, to back the clock setting change out of CLOCK_MONOTONIC (which, by the
      standard, cannot be set).
      
      With this change CLOCK_REALTIME (a direct restatement of gettimeofday()),
      CLOCK_MONOTONIC and gettimeofday() will all tick at the same time and with
      the same rate.  And all will be affected by NTP adjustments (save those which
      actually set the time).
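The offset bookkeeping described above can be modelled in a few lines. This is a toy sketch with hypothetical names (`Clocks`, `mono_offset_ns`); it ignores NTP rate adjustments and shows only how a clock step is backed out of the monotonic reading.

```python
class Clocks:
    """Toy model: CLOCK_MONOTONIC = gettimeofday() + offset."""

    def __init__(self, realtime_ns=0):
        self.realtime_ns = realtime_ns  # the gettimeofday() clock
        self.mono_offset_ns = 0         # CLOCK_MONOTONIC minus realtime

    def tick(self, ns):
        """Time passes: both clocks advance together."""
        self.realtime_ns += ns

    def settimeofday(self, new_realtime_ns):
        """Step the wall clock and back the step out of the offset."""
        delta = new_realtime_ns - self.realtime_ns
        self.realtime_ns = new_realtime_ns
        self.mono_offset_ns -= delta

    def clock_monotonic(self):
        return self.realtime_ns + self.mono_offset_ns
```

Because every step applied to the wall clock is subtracted from the offset, the monotonic reading is unchanged by `settimeofday()` yet still advances at the same rate as the wall clock.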
    • [PATCH] Fix and clean up DCACHE_REFERENCED usage · 0e3efbd1
      Andrew Morton authored
      From: Maneesh Soni <maneesh@in.ibm.com>
      
      This patch changes the way the DCACHE_REFERENCED flag is used. It
      got messed up in dcache_rcu iterations. I hope this will be ok now.

      The flag was meant to be an advisory flag, used by prune_dcache() so as
      not to free dentries which have recently entered the d_lru list. On the
      first pass in prune_dcache() the dentries marked DCACHE_REFERENCED are
      left with the flag reset, and they are freed in the next pass.

      So, now we mark the dentry as DCACHE_REFERENCED when it is first
      entering the d_lru list in dput() and reset the flag in prune_dcache().
      If the flag remains reset in the next call to prune_dcache(), the
      dentry is then freed.

      Also, I don't think any file system has to use this flag, as it is taken
      care of by the dcache layer. The patch removes such code from a few
      file systems. Moreover, these filesystems were doing the wrong thing anyway,
      as they were changing the flag outside dcache_lock.
      
      Changes:
      o dput() marks dentry DCACHE_REFERENCED when it is added to the dentry_unused
        list
      o no need to set the flag in dget, dget_locked, d_lookup as these
        increment the ref count anyway.
      o check the ref count in prune_dcache and use DCACHE_REFERENCED flag just for
        two stage aging.
      o remove code for setting DCACHE_REFERENCED from reiserfs, fat, xfs and
        exportfs.
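The two-stage aging described in this message can be modelled with a list and one flag per entry. The sketch below is a simplification under stated assumptions: `Dentry`, `dput_to_unused` and `prune` are hypothetical names, and the ref count check and locking are left out.

```python
class Dentry:
    def __init__(self, name):
        self.name = name
        self.referenced = False  # models DCACHE_REFERENCED

def dput_to_unused(unused, dentry):
    """Model dput(): a newly unused dentry joins the list marked referenced."""
    dentry.referenced = True
    unused.insert(0, dentry)  # head of the list = most recently added

def prune(unused, count):
    """Model one prune_dcache() call scanning up to `count` tail entries."""
    freed = []
    for _ in range(min(count, len(unused))):
        dentry = unused.pop()  # take the oldest entry from the tail
        if dentry.referenced:
            dentry.referenced = False  # first pass: clear flag, keep dentry
            unused.insert(0, dentry)
        else:
            freed.append(dentry.name)  # second pass: flag still clear, free
    return freed
```

A dentry therefore survives the first prune after it becomes unused and is only freed if it is still on the list, with the flag clear, when a later prune reaches it.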
    • [PATCH] dentry_stat accounting fix · de8e3749
      Andrew Morton authored
      From: Maneesh Soni <maneesh@in.ibm.com>
      
      This patch corrects the dentry_stat.nr_unused calculation.
      
      In select_parent() and shrink_dcache_anon() we were not doing any adjustments
      to the nr_unused count after manipulating the dentry_unused list.  Now the
      nr_unused count is decremented if the dentry is on dentry_unused list and is
      removed from there.
      
      Further in the same routines, we have to adjust the nr_unused count again if
      the dentry is moved to the end of d_lru list for pruning.
    • [PATCH] dmfe: don't free skb with local interrupts disabled · 70d67000
      Andrew Morton authored
      dev_kfree_skb() can end up calling local_bh_enable() which goes BUG if local
      interrupts are disabled.  Apparently it can deadlock.
      
      So move the skb freeing outside the lock in the dmfe driver.  It will
      decrease the lock hold time as well.
    • [PATCH] Fix nec98 partition parser link error · cb970405
      Andrew Morton authored
      Fix this:
      
      fs/partitions/nec98.c:169: undefined reference to `parse_bsd'