1. 15 Feb, 2003 22 commits
    • [PATCH] direct-io: allow reading of the part-filled EOF block · 3f31e635
      Andrew Morton authored
      direct-io will currently return EINVAL when the application tries to read the
      final bit of the file at EOF (assuming the file's length is not a multiple
      of the filesystem blocksize).
      
      The 2.4 kernel will return 0 (it won't read it at all).
      
      This patch changes the 2.5 kernel to allow that block to be read.
    • [PATCH] direct-io return value fix · fe1659c1
      Andrew Morton authored
      If at the end of direct_io_worker, dio->result is non-zero then we
      unconditionally copy that into the return value, potentially ignoring any I/O
      errors which were accumulated into local variable `ret'.
      
      Only do the assignment if `ret' is zero.
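
      A minimal sketch of the rule described above. The struct `dio_sketch` and
      the helper `final_result()` are hypothetical names used only for
      illustration; they are not the kernel's actual code.

```c
#include <errno.h>

/* Hypothetical stand-in for the kernel's struct dio; only the field
 * discussed in the commit message is modelled. */
struct dio_sketch {
	long result;	/* value accumulated in dio->result */
};

/*
 * ret holds any I/O error accumulated during the transfer (0 if none).
 * Only fall back to dio->result when no error was recorded, so that an
 * earlier error is never overwritten.
 */
static long final_result(const struct dio_sketch *dio, long ret)
{
	if (ret == 0)
		ret = dio->result;
	return ret;
}
```

      With this shape, a recorded error such as -EIO survives even when
      dio->result is non-zero.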
    • [PATCH] cciss, fix array bounds overrun · b3d9f279
      Andrew Morton authored
      Patch from steve cameron <steve.cameron@hp.com>
      
      Fix overrun if you have more than 16 attached tape drives + tape changers.
      Thanks to Mike Anderson for pointing this out.
    • [PATCH] cciss driver update · 3aee2f2f
      Andrew Morton authored
      Patch from: steve cameron <steve.cameron@hp.com>
      
      Steve sent out a nice series of 11 broken-out patches.  I have lumped them
      all together.
      
      - Makes the cciss driver compile in 2.5.60  (from tony@cantech.net.au)
      
      - From randy.dunlap@verizon.net, fix memory leaks in cciss driver
      
      - Allow cciss driver attached disks other than the first to be accessed.
      
      - Zero out cylinders when zeroing out other disk info in cciss driver.
      
      - Remove unused variable from cciss_scsi.c
      
      - This patch makes scsi commands to tape drives have no timeouts.
        Previously the timeout was 1000 seconds, too short, and nothing good
        happens when the timeout expires.  Better to have no timeout.  e.g.  mt -f
        /dev/st0 erase may take about 2 hours 30 min on AIT 100.
      
      - Remove unneeded cciss_scsi init code from cciss driver.
      
      - Remove udelay in command polling routine
      - extend timeout to 20 seconds (need for certain multiport storage box)
      - Remove unneeded init time code in cciss_scsi.c (thus allowing removal
        of udelay in command polling code.)
      
      - Factor out duplicated read capacity code into common routine in cciss
        driver.
      
      - factor duplicated geometry inquiry code into common routine in cciss
        driver.
    • [PATCH] blk_congestion_wait tuning and lockup fix · ecc3f712
      Andrew Morton authored
      blk_congestion_wait() will currently not wait if there are no write requests
      in flight, which is a potential problem if all the dirty data is against NFS
      filesystems.
      
      For write(2) traffic against NFS, things work nicely, because writers
      throttle in nfs_wait_on_requests().  But for MAP_SHARED dirtyings we need to
      avoid spinning in balance_dirty_pages().  So allow callers to fall through to
      the explicit sleep in that case.
      
      This will also fix a weird lockup which the reiser4 developers report.  In
      that case they have managed to have _all_ inodes against a superblock in
      locked state, yet there are no write requests in flight.  Taking a nap in
      blk_congestion_wait() in this case will yield the CPU to the threads which
      are trying to write out pages.
      
      Also tune up the sleep durations in various callers - 250 milliseconds seems
      rather long.
    • [PATCH] xattr: trusted extended attributes · c2e7eeb0
      Andrew Morton authored
      Patch from Andreas Gruenbacher <agruen@suse.de>
      
      This patch adds trusted extended attributes.  Trusted extended attributes are
      visible and accessible only to processes that have the CAP_SYS_ADMIN
      capability.  Attributes in this class are used to implement mechanisms in
      user space (i.e., outside the kernel) which keep information in extended
      attributes to which ordinary processes have no access.  HSM is an example.
    • [PATCH] xattr: allow kernel code to override EA permissions · f83b53c5
      Andrew Morton authored
      Patch from Andreas Gruenbacher <agruen@suse.de>
      
      This adds the XATTR_KERNEL_CONTEXT extended attributes flag.  Kernel code may
      use this flag to override extended attribute permission restrictions that
      would otherwise be imposed on the calling process.
    • [PATCH] xattr: infrastructure for permission overrides · 36bc191b
      Andrew Morton authored
      Patch from Andreas Gruenbacher <agruen@suse.de>
      
      This adds flags parameters to the getxattr, listxattr, and removexattr inode
      operations.  This is in preparation for the next patch, which allows
      in-kernel code (i.e., modules) to override extended attribute permission
      restrictions (which in turn is used by HSM implementations and the like).
    • [PATCH] xattr: listxattr fix · af8e38c7
      Andrew Morton authored
      Patch from Andreas Gruenbacher <agruen@suse.de>
      
      This patch fixes a bug in the ext2 and ext3 listxattr operation: Even if
      an attribute is hidden from the user, the terminating NULL character was
      included in the listxattr result. After the patch this doesn't happen
      anymore.
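
      The intended behaviour can be sketched as follows. The struct
      `xattr_sketch` and the function `list_visible()` are hypothetical and do
      not reflect the ext2/ext3 on-disk format or the real listxattr code; they
      only show that a hidden attribute should contribute neither its name nor
      its terminator to the result.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical attribute entry; not the ext2/ext3 on-disk format. */
struct xattr_sketch {
	const char *name;
	int hidden;	/* non-zero: the caller may not see this attribute */
};

/*
 * Copy the visible names into buf as a sequence of NUL-terminated strings.
 * Returns the number of bytes used, or -1 if buf is too small.
 * A hidden attribute contributes nothing, not even its terminator.
 */
static long list_visible(const struct xattr_sketch *attrs, size_t n,
			 char *buf, size_t size)
{
	size_t used = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		size_t len;

		if (attrs[i].hidden)
			continue;	/* skip the name and its NUL together */
		len = strlen(attrs[i].name) + 1;
		if (len > size - used)
			return -1;
		memcpy(buf + used, attrs[i].name, len);
		used += len;
	}
	return (long)used;
}
```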
    • [PATCH] error checking in ext3 xattr code · 358bae5b
      Andrew Morton authored
      from Andreas Gruenbacher
    • [PATCH] dcache_rcu · d8a55dda
      Andrew Morton authored
      Patch from Maneesh Soni <maneesh@in.ibm.com>, Dipankar Sarma
      <dipankar@in.ibm.com> and probably others.
      
      
      This patch provides dcache_lock free d_lookup() using RCU. Al pointed out
      races between d_move and lockfree d_lookup() while a concurrent rename is
      going on. We tested this with a test doing a million renames
      each in 50 threads on 50 different ramfs filesystems, while simultaneously
      running millions of "ls". The tests were done on a 4-way SMP box.
      
      1. Lookup going to a different bucket as the current dentry is
         moved to a different bucket due to rename. This is solved by
         having a list_head pointer in the dentry structure which points
         to the bucket head it belongs to. The bucket pointer is updated when
         the dentry is added to the hash chain. If lookup finds that the current
         dentry belongs to a different bucket, the cached lookup is
         failed and a real lookup is done. This condition occurred
         about 100 times during the heavy_rename test.
      
      2. Lookup has got the dentry it is looking for and is comparing
         various keys, and meanwhile a rename operation moves the dentry.
         This is solved by using a per dentry counter (d_move_count) which
         is updated at the end of d_move. Lookup takes a snapshot of the
         d_move_count before comparing the keys and once the comparison
         succeeds, it takes the per dentry lock to check the d_move_count
         again. If d_move_count differs, then the dentry has been moved (or
         renamed) and the lookup is failed.
      
      3. There can be a theoretical race when a dentry keeps coming back
         to its original bucket due to double moves. Because of this, lookup may
         consider that it has never moved and can end up in an infinite loop.
         This is solved by using a loop_counter which is compared with an
         approximate maximum number of dentries per bucket. This never got
         hit during the heavy_rename test.
      
      4. There is one more change regarding the loop termination condition
         in d_lookup: now the next hash pointer is compared with the current
         dentry's bucket pointer (is_bucket()).
      
      5. memcmp() in d_lookup() can go out of bounds if the name pointer and
         length fields are not consistent. For this we used a pointer to qstr to
         keep the length and name pointer in one structure.
      
      We also tried solving these by using a rwlock but it could not compete
      with the lockless solution.
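
      The snapshot-and-recheck scheme from item 2 can be sketched in user space
      as follows. Only d_move_count is named in the commit message; the struct
      `dentry_sketch`, the functions `move_sketch()` and `match_sketch()`, and
      the test hook are hypothetical. A pthread mutex stands in for the per
      dentry lock, and the memory barriers a real lock-free reader needs are
      omitted.

```c
#include <pthread.h>
#include <string.h>

/* Hypothetical, much-reduced dentry; only d_move_count comes from the
 * commit message, the other fields are illustrative. */
struct dentry_sketch {
	pthread_mutex_t lock;		/* stands in for the per dentry lock */
	unsigned long d_move_count;	/* bumped at the end of every move */
	const char *name;
};

/* Rename: change the key, then advance the counter, under the lock. */
static void move_sketch(struct dentry_sketch *d, const char *new_name)
{
	pthread_mutex_lock(&d->lock);
	d->name = new_name;
	d->d_move_count++;
	pthread_mutex_unlock(&d->lock);
}

/* Test helper: simulates a rename racing with the lookup. */
static void rename_to_b(struct dentry_sketch *d)
{
	move_sketch(d, "b");
}

/*
 * Returns 1 if d matches name and was not moved while we compared,
 * 0 otherwise (the caller would then fall back to a real lookup).
 * The optional hook runs between the compare and the recheck.
 */
static int match_sketch(struct dentry_sketch *d, const char *name,
			void (*hook)(struct dentry_sketch *))
{
	unsigned long snap = d->d_move_count;	/* snapshot before compare */
	int ok;

	if (strcmp(d->name, name) != 0)
		return 0;
	if (hook)
		hook(d);
	pthread_mutex_lock(&d->lock);
	ok = (d->d_move_count == snap);		/* recheck under the lock */
	pthread_mutex_unlock(&d->lock);
	return ok;
}
```

      A lookup that races with a rename sees a changed counter and fails, even
      though the key comparison itself succeeded.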
    • [PATCH] dcache_rcu: revert fast_walk code · 7ac75979
      Andrew Morton authored
      Patch from Maneesh Soni <maneesh@in.ibm.com>
      
      Revert the fast-walk dcache code in preparation for dcache_rcu.
    • [PATCH] crc32 improvements · 10507a61
      Andrew Morton authored
      Patch from Joakim Tjernlund <joakim.tjernlund@lumentis.se>
      
      I did the optimizations in the crc32 patch Brian Murphy submitted a while ago.
      Now I have cleaned it up a little and made some more optimizations.
      
      gcc is quite bad at loop optimizations (at least for PPC), so I have
      rewritten the loops to make gcc generate better code.  Even recent gccs
      (3.2.x) produce better code.
      
      Also reduced the unrolling since it did not make a noticeable difference.
    • [PATCH] fix ext3 BUG due to race with truncate · 81eb6906
      Andrew Morton authored
      When ext3_writepage races with truncate, block_write_full_page() will see
      that the page is outside i_size and will bale out with -EIO.  But
      ext3_writepage() will ignore this and will proceed to add the buffers to the
      transaction.
      
      Later, kjournald tries to write them out and goes BUG() because those buffers
      are not mapped to disk.
      
      The fix is to not attach the buffers to the transaction in ext3_writepage()
      if block_write_full_page() failed.
      
      So far so good, but that page now has dirty, unmapped buffers (the buffers
      were attached in a dirty state by ext3_writepage()).  So teach
      block_write_full_page() to clean the buffers against the page if it is wholly
      outside i_size.
      
      (A simpler fix to all of this might be to just bale out of ext3_writepage()
      if the page is outside i_size.  But that is racy against
      block_write_full_page()'s subsequent execution of the same comparison.)
    • [PATCH] separate checks from generic_file_aio_write · 7ac1de5d
      Andrew Morton authored
      Patch from: Oleg Drokin <green@namesys.com>
      
      It moves all the arg checking code from the start of generic_file_aio_write()
      into a standalone function, so other filesystems can avoid having to
      cut-n-paste them.
      
      The new function is exported to modules, and also inlined in filemap.c so
      that the current filesystems are unaffected.  If someone is using ext2 and
      reiserfs at the same time, they lose a bit of icache.
    • [PATCH] move fault_in_pages_readable/writeable to header · 3172a7c4
      Andrew Morton authored
      Patch from Oleg Drokin <green@namesys.com>
      
      Move these already-inline functions to a header file so that filesystems can
      reuse them.  For the reiserfs_file_write patch.
    • [PATCH] flush_tlb_all is not preempt safe · 3e124416
      Andrew Morton authored
      Patch from: Zwane Mwaikambo <zwane@holomorphy.com>
      
      smp_call_function() must not be called with a lock held, and inside
      smp_call_function() we lock and unlock call_lock, which creates a
      preemption point.  We therefore cannot guarantee that we are still on the
      same processor when we reach do_flush_tlb_all_local().
      
      void flush_tlb_all(void)
      {
      	smp_call_function (flush_tlb_all_ipi,0,1,1);
      
      	do_flush_tlb_all_local();
      }
      
      ...
      
      smp_call_function()
      {
      	spin_lock(call_lock);
      	...
      	spin_unlock(call_lock);
      	<preemption point>
      }
      
      ...
      
      do_flush_tlb_all_local() - possibly not executing on same processor
      anymore.
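
      One way to close such a window is to keep preemption disabled across both
      calls. The commit message does not state the exact fix, so the following
      is only a user-space sketch under that assumption: preempt_disable() and
      preempt_enable() are modelled by a counter, and the `_stub` and `_sketch`
      functions are hypothetical stand-ins.

```c
/* User-space stand-ins so the sketch compiles outside the kernel. */
static int preempt_count;
static int flushed_with_preempt_off;

static void preempt_disable(void) { preempt_count++; }
static void preempt_enable(void)  { preempt_count--; }

static void smp_call_function_stub(void)
{
	/* The real function would send an IPI to the other CPUs. */
}

static void do_flush_tlb_all_local_stub(void)
{
	/* Record whether preemption was off when the local flush ran. */
	flushed_with_preempt_off = (preempt_count > 0);
}

static void flush_tlb_all_sketch(void)
{
	preempt_disable();		/* stay on this CPU from here on */
	smp_call_function_stub();
	do_flush_tlb_all_local_stub();	/* runs on the CPU that sent the IPI */
	preempt_enable();
}
```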
    • [PATCH] JFS build fix with gcc-2.95.3 · ea609f54
      Andrew Morton authored
      I'm getting a build error:
      
      	fs/jfs/super.c: In function `jfs_fill_super':
      	fs/jfs/super.c:335: parse error before `)'
      
      and it doesn't happen with gcc-3.2.x.
      
      Taking out the file-n-line fixes it up.  This patch was acked by shaggy.
    • [PATCH] Fix mandatory locking · 9c57693e
      Matthew Wilcox authored
      Robbie Williamson found some bugs in the mandatory locking implementation.
      This patch fixes all the problems he found:
      
       - Fix null pointer dereference caused by sys_truncate() passing a null filp.
       - Honour the O_NONBLOCK flag when calling ftruncate()
       - Local variable `fl' wasn't being initialised correctly in
         locks_mandatory_area()
       - Don't return -ENOLCK from __posix_lock_file() when FL_ACCESS is set.
    • [PATCH] missing lock in get_request_wait() · ffd23335
      Jens Axboe authored
      We must grab lock before checking rl->count.
    • [PATCH] fix request-to-request front merging · 409499e9
      Jens Axboe authored
      bio-to-request front merging works, but request-to-request has been
      broken due to a bit too much copy'n pasting.
    • [PATCH] deadline ioscheduler bug fixes · 7e843937
      Jens Axboe authored
      Cleaner fix for the ioscheduler:
      
      - Problem with alias request, the new request gets lost.
      - Must always clear merge hash in move_to_dispatch()
  2. 14 Feb, 2003 18 commits