1. 07 Sep, 2002 9 commits
    • [PATCH] (9/25) update_partition() · 897c924e
      Alexander Viro authored
      	new helper - update_partition(disk, partition_number); does the
      right thing wrt devfs and driverfs (un)registration of partition entries.
      BLKPG ioctls fixed - now they call that beast rather than calling only
      devfs side.  New helper - rescan_partitions(disk, bdev); does all work
      with wiping/rereading/etc. and fs/block_dev.c now uses it instead of
      check_partition().  The latter became static.
      897c924e
    • [PATCH] (8/25) Removing bogus arrays - ->de_arr[] · 06f55b09
      Alexander Viro authored
      	similar to ->flags and ->driverfs_dev_arr, ->de_arr[] got replaced
      with its (single) element + flag.
      06f55b09
    • [PATCH] (7/25) Removing bogus arrays - ->part[].number · db09b5fc
      Alexander Viro authored
      	Each hd_struct used to have int number; in it.  It's used _only_
      in disk->part[0] - disk->part[n].number is never assigned/checked for any
      positive n.  Moved from hd_struct to gendisk (disk->part[0].number to
      disk->number).
      db09b5fc
    • [PATCH] (6/25) Removing bogus arrays - ->driverfs_dev_arr[] · c5f45a70
      Alexander Viro authored
      	disk->driverfs_dev_arr is either NULL or consists of exactly one
      element.  Same change as above (struct device ** -> struct device *); old
      "is the pointer to array itself NULL or not?" replaced with a flag (in
      disk->flags).
      c5f45a70
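The pattern behind this change (shared with the ->flags[] and ->de_arr[] patches in the same series) can be sketched outside the kernel. This is only an illustration: `struct disk_old`, `struct disk_new`, `DISK_FLAG_DRIVERFS` and the lookup helpers are hypothetical stand-ins, not the kernel's actual definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct device. */
struct device { int id; };

/* Old shape: a pointer to an "array" that is either NULL or holds
 * exactly one element. */
struct disk_old {
    struct device **driverfs_dev_arr;
};

/* New shape: the single element itself, plus a flag bit that replaces
 * the old "is the array pointer NULL?" test. Flag name is hypothetical. */
#define DISK_FLAG_DRIVERFS 0x1

struct disk_new {
    struct device *driverfs_dev;
    int flags;
};

/* Old lookup: two dereferences and a NULL check on the array pointer. */
struct device *old_lookup(const struct disk_old *d)
{
    return d->driverfs_dev_arr ? d->driverfs_dev_arr[0] : NULL;
}

/* New lookup: one flag test, one dereference, nothing to allocate. */
struct device *new_lookup(const struct disk_new *d)
{
    return (d->flags & DISK_FLAG_DRIVERFS) ? d->driverfs_dev : NULL;
}
```

Both lookups return the same device for equivalent state; the new shape simply drops the one-element allocation and its failure path.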
    • [PATCH] (5/25) Removing bogus arrays - ->flags[] · ab3bfaa2
      Alexander Viro authored
      	Seeing that now disk->flags[] always consists of one element, we
      replace char *flags with int flags, remove the junk from places that used
      to allocate these "arrays" and do obvious updates of the code
      (s/->flags[0]/->flags/).
      ab3bfaa2
    • [PATCH] (4/25) Unexporting driverfs_remove_partitions() · 097b3217
      Alexander Viro authored
      	call of driverfs_remove_partitions() pulled into del_gendisk();
      function isn't exported anymore.  Both it and driverfs_create_partitions()
      cleaned up.
      097b3217
    • [PATCH] (3/25) Removing useless minor arguments · 36bd834b
      Alexander Viro authored
      	driverfs_remove_partitions(), devfs_register_partitions(),
      driverfs_create_partitions(), devfs_register_partition() and devfs_register_disc()
      have lost their 'minor' argument - it's always disk->first_minor these days.
      disk_name() takes partition number instead of minor now.  Callers of
      wipe_partitions() in fs/block_dev.c expanded.  Remaining caller passes
      gendisk instead of kdev_t now.
      36bd834b
    • [PATCH] (2/25) Removing ->nr_real · 4e493886
      Alexander Viro authored
      	Since ->nr_real is always 1 now, we can remove that field completely.
      Removed the last remnants of switch in disk_name() (it could be killed
      a long time ago, I just forgot to remove the last two cases when md and i2o
      got converted).  Collapsed several instances of
      disk->part[minor - disk->first_minor] - in cases when we know that we deal
      with disk->part[0].
      4e493886
    • [PATCH] (1/25) Unexporting helper functions · b3152267
      Alexander Viro authored
      	wipe_partitions() and driverfs_register_partitions(..., 1) (i.e.
      unregistering them) pulled into del_gendisk() and removed from callers.
      grok_partitions() merged with register_disk().  devfs_register_partitions(),
      grok_partitions() and wipe_partitions() not exported anymore.
      b3152267
  2. 31 Aug, 2002 9 commits
  3. 30 Aug, 2002 22 commits
    • Greg Kroah-Hartman's avatar
      3c9bd375
    • The SCSI layer should _not_ try to decide about non-existent · ba26eacc
      Linus Torvalds authored
      partitions. The higher layers do a better job of it.
      ba26eacc
    • Neil Brown's avatar
    • [PATCH] PATCH - kNFSd - More small fixes for TCP nfsd · 03d7a386
      Neil Brown authored
      sk_inuse should be bigger than "char" as we can
      have more than 255 server threads.  Due to the way the count
      is used, this is unlikely to actually cause a problem, but it
      should nonetheless be fixed.
      
      Also, two printk generate more noise than we would like,
      so turn them into dprintk (debugging printk).
      03d7a386
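The counter-width problem is easy to reproduce in isolation. The sketch below is not the nfsd code: the function names are made up, and it uses `unsigned char` (an assumption; the commit only says "char") so that the wraparound is well defined in C.

```c
#include <assert.h>

/* Count n users with an 8-bit counter. Anything past 255 wraps around,
 * so the stored value no longer matches the real number of users. */
unsigned int count_with_u8(unsigned int n)
{
    unsigned char inuse = 0;
    unsigned int i;

    for (i = 0; i < n; i++)
        inuse++;
    return inuse;
}

/* Same loop with an int-sized counter: no wraparound at these sizes. */
unsigned int count_with_int(unsigned int n)
{
    int inuse = 0;
    unsigned int i;

    for (i = 0; i < n; i++)
        inuse++;
    return (unsigned int)inuse;
}
```

With 300 simulated threads the 8-bit counter ends up at 300 mod 256 = 44, while the wider counter reports 300.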
    • [PATCH] sock_writeable not appropriate for TCP sockets, for 2.5.32 · d2279c44
      Chuck Lever authored
      sock_writeable determines whether there is space in a socket's output
      buffer.  socket write_space callbacks use it to determine whether to wake
      up those that are waiting for more output buffer space.
      
      however, sock_writeable is not appropriate for TCP sockets.  because the
      RPC layer's write_space callback uses it for TCP sockets, the RPC layer
      hammers on sock_sendmsg with dozens of write requests that are only a few
      hundred bytes long when it is trying to send a large write RPC request.
      this patch adds logic to the RPC layer's write_space callback that
      properly handles TCP sockets.
      
      patch reviewed by Trond and Alexey.
      d2279c44
    • [PATCH] prevent oops in xprt_lock_write, against 2.5.32 · 1758bdf3
      Chuck Lever authored
      when several RPC requests want to reconnect a TCP transport socket at
      once, xprt_lock_write serializes the tasks to prevent multiple socket
      connects.  however, TCP connects are always done by a RPC child task that
      has no request slot.  xprt_lock_write can oops if there is no request slot
      allocated to the invoking RPC task.  reviewed and accepted by Trond.
      
      the xprt_lock_write changes are not yet in 2.4, so this patch does not
      apply to 2.4.
      1758bdf3
    • [PATCH] TLS boot-initialization bugfix on SMP, 2.5.32-BK · 44a05b3e
      Ingo Molnar authored
      This fixes a bad TLS initialization bug found by Andi Kleen.  x86/SMP
      only worked due to luck.
      44a05b3e
    • [PATCH] scheduler fixes, 2.5.32-BK · 2c638ab0
      Ingo Molnar authored
      This adds two scheduler related fixes:
      
       - changes the migration code to use struct completion. Andrew pointed out
         that there might be a small window where the up() touches the
         semaphore while the waiting task goes on and frees its stack. And
         completion is more suited for this kind of stuff anyway.
      
       - removes two unneeded exports, pointed out by Andrew.
      2c638ab0
    • [PATCH] clone-cleanup 2.5.32-BK · 1f9d6582
      Ingo Molnar authored
      This moves CLONE_SETTID and CLONE_CLEARTID handling into kernel/fork.c,
      where it belongs.  [the CLONE_SETTLS is x86-specific and thus remains in
      the per-arch process.c] This makes support for these two new flags much
      easier: architectures only have to pass in the user_tid pointer.
      1f9d6582
    • [PATCH] include/asm-i386/msr.h · a27b8fe9
      Dominik Brodowski authored
      It would be helpful if these msr.h #defines could get in.
      a27b8fe9
    • [PATCH] efi.h move · af05fc03
      David Mosberger authored
      It makes no sense to keep efi.h as an ia64-specific header (there really
      are x86 machines coming out with optional EFI BIOS support).
      af05fc03
    • Merge http://lia64.bkbits.net/to-linus-2.5 · 9caf366e
      Linus Torvalds authored
      into home.transmeta.com:/home/torvalds/v2.5/linux
      9caf366e
    • [PATCH] ext3 __FUNCTION__ pasting fix · 7c0700ff
      Andrew Morton authored
      Fix a __FUNCTION__ paste in revoke.c
      7c0700ff
    • [PATCH] O_DIRECT for ext3 · a3b71057
      Andrew Morton authored
      O_DIRECT support for ext3.
      
      It works OK in all journalling modes.
      
      Updates to the file metadata and inode are journalled as usual.
      
      If the system crashes during an appending O_DIRECT write then journal
      recovery will truncate the written-to file back to the length which it
      had on entry to that write.
      
      If the system crashes during a file overwrite to existing blocks then
      the file contents will be an unknown mixture of old and new.
      
      If the system crashes during a file overwrite which instantiates new
      blocks in the middle of the file then there is a possibility of
      uninitialised disk blocks being present in the file post-recovery.
      a3b71057
    • [PATCH] fix an ext3 deadlock · bebff73c
      Andrew Morton authored
      mpage_writepages() does a lock_page() on pages to be written back, even
      when it is being used for page reclaim writeback.
      
      This is normally OK, because the page is unlocked quickly - pages are
      unlocked during writeback and nobody should be performing __GFP_FS
      allocations inside lock_page().
      
      But it has introduced a ranking problem in ext3:
      
      generic_file_write
      ->lock_page
        ->ext3_prepare_write
          ->journal_start	(waits for a commit)
      
      versus
      
      ext3_create()
      ->journal_start()
        ->ext3_new_inode(GFP_KERNEL)
          ->page reclaim
            ->mpage_writepages
              ->lock_page	(locks up, transaction is held open)
      
      Maybe sometime, I'll have to turn mpage_writepages' lock_page into a
      trylock if the caller is PF_MEMALLOC.  But for now, let's make ext3's
      inside-transaction allocations use GFP_NOFS.  There is only one of them.
      bebff73c
    • [PATCH] writeback correctness and efficiency changes · ec12ac49
      Andrew Morton authored
      This is a performance and correctness fix against the writeback paths.
      
      The writeback code has competing requirements.  Sometimes it is used
      for "memory cleansing": kupdate, bdflush, writer throttling, page
      allocator writeback, etc.  And sometimes this same code is used for
      data integrity purposes: fsync, msync, fdatasync, sync, umount, various
      other kernel-internal uses.
      
      The problem is: how to handle a dirty buffer or page which is currently
      under writeback.
      
      For memory cleansing, we just want to skip that buffer/page and go onto
      the next one.  But for sync, we must wait on the old writeback and then
      start new writeback.
      
      mpage_writepages() is currently correct for cleansing, but incorrect for
      sync.  block_write_full_page() is currently correct for sync, but
      inefficient for cleansing.
      
      The fix is fairly simple.
      
      - In mpage_writepages(), don't skip the page if it's a sync
      operation.
      
      - In block_write_full_page(), skip the buffer if it is not a sync
      operation.  And return -EAGAIN to tell the caller that the writeout
      didn't work out.  The caller must then set the page dirty again and
      move it onto mapping->dirty_pages.
      
      This is an extension of the writepage API: writepage can now return
      -EAGAIN.  There are only three callers, and they have been updated.
      
      fail_writepage() and ext3_writepage() were actually doing this by
      hand.  They have been changed to return -EAGAIN.  NTFS will want to
      be able to return -EAGAIN from its writepage as well.
      
      - A sticky question is: how to tell the writeout code which mode it
      is operating in?  Cleansing or sync?
      
      It's such a tiny code change that I didn't have the heart to go and
      propagate a `mode' argument down every instance of writepages() and
      writepage() in the kernel.  So I passed it in via current->flags.
      
      Incidentally, the occurrence of a locked-and-dirty buffer in
      block_write_full_page() is fairly rare: normally the collision avoidance
      happens at the address_space level, via PageWriteback.  But some
      mappings (blockdevs, ext3 files, etc) have their dirty buffers written
      out via submit_bh().  It is these buffers which can stall
      block_write_full_page().
      
      This wart will be pretty intrusive to fix.  ext3 needs to become fully
      page-based (ugh.  It's a block-based journalling filesystem, and pages
      are unnatural).  blockdev mappings are still written out by buffers
      because that's how filesystems use them.  Putting _all_ metadata
      (indirects, inodes, superblocks, etc) into standalone address_spaces
      would fix that up.
      
      - filemap_fdatawrite() sets PF_SYNC.  So filemap_fdatawrite() is the
      kernel function which will start writeback against a mapping for
      "data integrity" purposes, whereas the unexported, internal-only
      do_writepages() is the writeback function which is used for memory
      cleansing.  This difference is the reason why I didn't consolidate
      those functions ages ago...
      
      - Lots of code paths had a bogus extra call to filemap_fdatawait(),
      which I previously added in a moment of weak-headedness.  They have
      all been removed.
      ec12ac49
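The two-mode behaviour described in this commit can be modelled in a few lines of userspace C. This is a sketch, not the kernel code: `struct buf`, `enum wb_mode`, `write_buf()` and `writeout()` are hypothetical names, and the mode is passed as an explicit argument here, whereas the commit passes it via current->flags. It follows the stated requirement that cleansing skips a busy buffer while sync waits for it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical mode selector; the commit uses a flag in current->flags. */
enum wb_mode { WB_CLEANSE, WB_SYNC };

/* Hypothetical model of a buffer that may already be under writeback. */
struct buf {
    bool dirty;
    bool locked;   /* true while an earlier writeout is still in flight */
};

/* Stand-in for blocking until the earlier writeout completes. */
void wait_on_buf(struct buf *b)
{
    b->locked = false;
}

/* Model of the writepage step: cleansing skips a busy buffer and
 * reports -EAGAIN; sync waits for the old I/O and then writes again. */
int write_buf(struct buf *b, enum wb_mode mode)
{
    if (b->locked) {
        if (mode == WB_CLEANSE)
            return -EAGAIN;
        wait_on_buf(b);
    }
    b->dirty = false;   /* pretend new I/O was started */
    return 0;
}

/* Model of the caller: on -EAGAIN the data must be marked dirty again
 * so that a later pass picks it up. */
int writeout(struct buf *b, enum wb_mode mode)
{
    int ret = write_buf(b, mode);

    if (ret == -EAGAIN)
        b->dirty = true;
    return ret;
}
```

A cleansing pass over a locked buffer returns -EAGAIN and leaves it dirty; a later sync pass on the same buffer waits and completes with 0.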
    • [PATCH] batched freeing of anon pages · 8fd3d458
      Andrew Morton authored
      A reworked version of the batched page freeing and lock amortisation
      for VMA teardown.
      
      It walks the existing 507-page list in the mmu_gather_t in 16-page
      chunks, drops their refcounts in 16-page chunks, and de-LRUs and
      frees any resulting zero-count pages in up-to-16 page chunks.
      8fd3d458
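The chunked walk can be illustrated with a small userspace model. Everything here is an assumption made for illustration: `struct page` is a two-field stand-in, `release_pages_batched()` is a made-up name, and "freeing" just sets a marker. In the kernel the point of batching is to amortise lock acquisition over each chunk, which this sketch only indicates in comments.

```c
#include <assert.h>
#include <stddef.h>

#define BATCH 16   /* chunk size taken from the commit message */

/* Hypothetical stand-in for the kernel's struct page. */
struct page {
    int refcount;
    int freed;
};

/* Walk the list in BATCH-sized chunks. For each chunk, first drop every
 * refcount, collecting pages that hit zero, then free the collected
 * pages together. Returns the number of pages freed and stores the
 * number of chunks processed in *batches. */
size_t release_pages_batched(struct page **pages, size_t nr, size_t *batches)
{
    size_t freed = 0;
    size_t start;

    *batches = 0;
    for (start = 0; start < nr; start += BATCH) {
        size_t end = (nr - start > BATCH) ? start + BATCH : nr;
        struct page *zero[BATCH];
        size_t nzero = 0;
        size_t i;

        /* Pass 1: drop the refcounts for this chunk. */
        for (i = start; i < end; i++)
            if (--pages[i]->refcount == 0)
                zero[nzero++] = pages[i];

        /* Pass 2: free the zero-count pages in one go (the kernel would
         * take the relevant lock once here, not once per page). */
        for (i = 0; i < nzero; i++)
            zero[i]->freed = 1;

        freed += nzero;
        (*batches)++;
    }
    return freed;
}
```

For a 507-entry list this makes 32 passes: 31 full chunks of 16 pages and one final chunk of 11.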
    • [PATCH] put_page() consolidation · 2b341443
      Andrew Morton authored
      Clean up put_page() and page_cache_release().  It's pretty simple now:
      
      #define page_cache_get(page)           get_page(page)
      #define page_cache_release(page)       put_page(page)
      2b341443
    • [PATCH] remove pagevec_lru_del() · e035a047
      Andrew Morton authored
      it was only being used in invalidate_inode_pages(), and from there,
      pagevec_release() does the same thing.
      e035a047
    • [PATCH] debug check in put_page_testzero() · c99b0372
      Andrew Morton authored
      As suggested by Daniel - it's a bug to run put_page_testzero
      against a zero-ref page.
      c99b0372
    • [PATCH] MAINTAINERS patch · cdf2f98b
      Ingo Molnar authored
      Please apply this patch (Robert ACK-ed it). While there is a preemptible
      kernel entry already, I think listing this at the scheduler entry is
      justified: preemption has a number of scheduler interactions.
      cdf2f98b
    • [PATCH] ldt-fix-2.5.32-A3 · 89d637a8
      Ingo Molnar authored
      this is an updated version of the LDT fixes. It fixes the following kinds
      of problems:
      
       - fix a possible gcc optimization that opens a race, which can lead to
         loading a corrupt LDT descriptor upon context switch. [this fix got
         simplified over previous versions.]
      
       - remove an unconditional OOM printk, and there's no need to set ->size
         in the OOM path.
      
       - fix preemption bugs: load_LDT()/clear_LDT() were not preemption-safe
         when used outside of spinlocks.
      
      the context-switch race is the following. 'LDT modification' is the
      following operation: the seg->ldt pointer is modified, then seg->size is
      modified. In theory gcc is free to reschedule the two modifications, and
      first modify ->size, then ->ldt. Thus if this modification is not
      synchronized with context-switches, another thread might see a temporary
      state of the new ->size [which was increased], but still the old pointer.
      Ie.:
      
      	CPU0				CPU1
      
      	pc->size = newsize;
      					load_LDT(); // (oldptr, newsize)
      	pc->ldt = newptr;
      
      the corrupt LDT is loaded until the SMP cross-call is sent, leaving the
      window open for many usecs.
      
      the fix is to put a wmb() after ->ldt modifications. [this is also in
      preparation of not-write-ordered SMP x86 designs.]
      89d637a8
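The ordering requirement (publish the new pointer before the larger size) can be sketched in portable C11. This is a userspace analogue, not the kernel patch: `struct ldt_ctx`, `grow_ldt()` and `snapshot_ldt()` are hypothetical names, and a release fence stands in for the kernel's wmb(). The acquire fence on the reader side is an addition required by the C11 memory model; the commit message itself only mentions the write side.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical model of the per-context LDT state: a table pointer and
 * the number of entries it holds. */
struct ldt_ctx {
    _Atomic(void *) ldt;
    atomic_int size;
};

/* Writer: store the new pointer first, then the new (larger) size. The
 * release fence keeps the compiler and CPU from making the size visible
 * before the pointer. */
void grow_ldt(struct ldt_ctx *pc, void *newptr, int newsize)
{
    atomic_store_explicit(&pc->ldt, newptr, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);   /* analogue of wmb() */
    atomic_store_explicit(&pc->size, newsize, memory_order_relaxed);
}

/* Reader: load the size first, then the pointer. If the new size is
 * observed, the acquire fence guarantees the new pointer is too, so the
 * (oldptr, newsize) combination cannot be seen. */
void snapshot_ldt(struct ldt_ctx *pc, void **ptr, int *size)
{
    *size = atomic_load_explicit(&pc->size, memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire);
    *ptr = atomic_load_explicit(&pc->ldt, memory_order_relaxed);
}
```

A reader may still observe the new pointer with the old size, which is harmless when the table only grows, since the old size is within bounds of the new table.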