1. 01 Mar, 2010 8 commits
    • xfs: fix inode pincount check in fsync · 024910cb
      Christoph Hellwig authored
      We need to hold the ilock to check the inode pincount safely.  While
      we're at it, also remove the check for ip->i_itemp->ili_last_lsn, since
      a pinned inode always has it set.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <david@fromorbit.com>
      Signed-off-by: Alex Elder <aelder@sgi.com>
    • xfs: Non-blocking inode locking in IO completion · 77d7a0c2
      Dave Chinner authored
      The introduction of barriers to loop devices has created a new IO
      completion order dependency that XFS does not handle. The loop
      device implements barriers using fsync and so turns a log IO in the
      XFS filesystem on the loop device into a data IO in the backing
      filesystem. That is, the completion of log IOs in the loop
      filesystem is now dependent on completion of data IO in the backing
      filesystem.
      
      This can cause deadlocks when a flush daemon issues a log force with
      an inode locked, because IO completion on that inode is
      blocked by the inode lock. This in turn prevents further data IO
      completion from occurring on all XFS filesystems on that CPU (due to
      the shared nature of the completion queues). This then prevents the
      log IO from completing because the log is waiting for data IO
      completion as well.
      
      The fix for this new completion order dependency issue is to make
      the IO completion inode locking non-blocking. If the inode lock
      can't be grabbed, simply requeue the IO completion back to the work
      queue so that it can be processed later. This prevents the
      completion queue from being blocked and allows data IO completion on
      other inodes to proceed, hence avoiding completion order dependent
      deadlocks.
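
The trylock-and-requeue idea described above can be sketched in plain userspace C. This is only an illustration under stated assumptions: `demo_inode`, `demo_ioend`, `demo_queue` and all helper functions are hypothetical names, a C11 `atomic_flag` stands in for the inode lock, and a linked list stands in for the kernel work queue. It is not the XFS code from this commit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode: a try-lockable flag plus a size. */
struct demo_inode {
    atomic_flag locked;
    long size;
};

/* Hypothetical IO completion record waiting to update its inode. */
struct demo_ioend {
    struct demo_inode *inode;
    long new_size;
    struct demo_ioend *next;
};

/* Simple FIFO standing in for the completion work queue. */
struct demo_queue {
    struct demo_ioend *head;
    struct demo_ioend *tail;
};

void demo_enqueue(struct demo_queue *q, struct demo_ioend *io)
{
    assert(io != NULL);
    io->next = NULL;
    if (q->tail)
        q->tail->next = io;
    else
        q->head = io;
    q->tail = io;
}

struct demo_ioend *demo_dequeue(struct demo_queue *q)
{
    struct demo_ioend *io = q->head;

    if (io) {
        q->head = io->next;
        if (!q->head)
            q->tail = NULL;
        io->next = NULL;
    }
    return io;
}

/* Returns nonzero if the lock was taken, zero if it was already held. */
int demo_trylock(struct demo_inode *ip)
{
    return !atomic_flag_test_and_set(&ip->locked);
}

void demo_unlock(struct demo_inode *ip)
{
    atomic_flag_clear(&ip->locked);
}

/*
 * Process one queued completion without ever blocking on the inode lock.
 * Returns 1 if a completion was finished, 0 if it was requeued because
 * the inode was locked, and -1 if the queue was empty.
 */
int demo_complete_one(struct demo_queue *q)
{
    struct demo_ioend *io = demo_dequeue(q);

    if (!io)
        return -1;
    if (!demo_trylock(io->inode)) {
        demo_enqueue(q, io);    /* lock busy: push to the back, retry later */
        return 0;
    }
    if (io->new_size > io->inode->size)
        io->inode->size = io->new_size;
    demo_unlock(io->inode);
    return 1;
}
```

Because a busy inode only sends its own completion to the back of the queue, completions for other inodes queued behind it can still make progress.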
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
    • xfs: implement optimized fdatasync · 66d834ea
      Christoph Hellwig authored
      Allow us to track the difference between timestamp and size updates
      by using mark_inode_dirty from the I/O completion code, and checking
      the VFS inode flags in xfs_file_fsync.
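
As a rough illustration of the distinction this message describes, the sketch below keeps two separate dirty bits and consults them differently for fsync and fdatasync. The flag names `DEMO_DIRTY_SYNC` and `DEMO_DIRTY_DATASYNC` and both functions are hypothetical; they are loosely modelled on the idea of VFS inode state flags and are not the code from this commit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical dirty bits kept in a per-inode state word. */
enum {
    DEMO_DIRTY_SYNC     = 1 << 0,  /* only timestamps changed */
    DEMO_DIRTY_DATASYNC = 1 << 1,  /* size changed: needed to read data back */
};

/* Called from a (hypothetical) IO completion path after a write. */
unsigned demo_mark_dirty(unsigned state, bool size_changed)
{
    /* Only the two known bits may be set in the incoming state. */
    assert((state & ~(unsigned)(DEMO_DIRTY_SYNC | DEMO_DIRTY_DATASYNC)) == 0);
    return state | (size_changed ? DEMO_DIRTY_DATASYNC : DEMO_DIRTY_SYNC);
}

/* Decide whether inode metadata must be forced out for this sync call. */
bool demo_need_inode_flush(unsigned state, bool datasync)
{
    if (datasync)
        return (state & DEMO_DIRTY_DATASYNC) != 0;
    return (state & (DEMO_DIRTY_SYNC | DEMO_DIRTY_DATASYNC)) != 0;
}
```

With this split, a write that only touches timestamps leaves fdatasync with no metadata to flush, while fsync still flushes it.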
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <david@fromorbit.com>
      Signed-off-by: Alex Elder <aelder@sgi.com>
    • xfs: remove wrapper for the fsync file operation · fd3200be
      Christoph Hellwig authored
      Currently the fsync file operation is divided into a low-level
      routine doing all the work and one that implements the Linux file
      operation and does minimal argument wrapping.  This is a leftover
      from the days of the vnode operations layer and can be removed to
      simplify the code a bit, as well as preparing for the implementation
      of an optimized fdatasync which needs to look at the Linux inode
      state.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
    • xfs: remove wrappers for read/write file operations · 00258e36
      Christoph Hellwig authored
      Currently the aio_read, aio_write, splice_read and splice_write file
      operations are divided into a low-level routine doing all the work
      and one that implements the Linux file operations and does minimal
      argument wrapping.  This is a leftover from the days of the vnode
      operations layer and can be removed to simplify the code a lot.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <david@fromorbit.com>
      Signed-off-by: Alex Elder <aelder@sgi.com>
    • xfs: merge xfs_lrw.c into xfs_file.c · dda35b8f
      Christoph Hellwig authored
      Currently the code to implement the file operations is split over
      two small files.  Merge the content of xfs_lrw.c into xfs_file.c to
      have it in one place.  Note that I haven't yet done the various
      cleanups that are possible after this; they will follow in the next
      patch.  Also the function xfs_dev_is_read_only which was in
      xfs_lrw.c before really doesn't fit in here at all and was moved to
      xfs_mount.c.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <david@fromorbit.com>
      Signed-off-by: Alex Elder <aelder@sgi.com>
    • xfs: fix dquota trace format · b262e5df
      Christoph Hellwig authored
      The be32_to_cpu in the TP_printk output breaks automatic parsing of
      the trace format by the trace-cmd tools, so we have to move it into
      the TP_assign block.  While we're at it also fix the format for the
      quota limits to be more regular and easier to parse.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <david@fromorbit.com>
      Signed-off-by: Alex Elder <aelder@sgi.com>
    • xfs: increase readdir buffer size · a9cc799e
      Eric Sandeen authored
      While doing some testing of readdir perf a while back,
      I noticed that the buffer size we're using internally is
      smaller than what glibc gives us by default.  Upping this
      size helped a bit, and seems safe.
      
      glibc's __alloc_dir() does:
      
        const size_t default_allocation = (4 * BUFSIZ < sizeof (struct dirent64)
                                           ? sizeof (struct dirent64) : 4 * BUFSIZ);
        const size_t small_allocation = (BUFSIZ < sizeof (struct dirent64)
                                         ? sizeof (struct dirent64) : BUFSIZ);
        size_t allocation = default_allocation;
      #ifdef _STATBUF_ST_BLKSIZE
        if (statp != NULL && default_allocation < statp->st_blksize)
          allocation = statp->st_blksize;
      #endif
      
      and
      
      #define _G_BUFSIZ 8192
      #define _IO_BUFSIZ _G_BUFSIZ
      # define BUFSIZ _IO_BUFSIZ
      
      so the default buffer is 4 * 8192 = 32768
      (except in the unlikely case of blocks > 32k....)
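
The sizing rule quoted above can be checked with a small standalone function. This is a sketch only: `demo_dir_allocation` is a hypothetical helper that takes the buffer size, the directory entry size and the block size as plain parameters instead of using the glibc macros, and a block size of 0 stands for the case where no stat information is available.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical re-statement of the quoted sizing rule:
 * start from max(4 * bufsiz, dirent_size), then grow to st_blksize
 * if the filesystem reports a larger preferred block size.
 */
size_t demo_dir_allocation(size_t bufsiz, size_t dirent_size,
                           size_t st_blksize)
{
    assert(bufsiz > 0 && dirent_size > 0);

    size_t allocation = (4 * bufsiz < dirent_size) ? dirent_size
                                                   : 4 * bufsiz;

    /* st_blksize == 0 models "no stat buffer supplied". */
    if (st_blksize != 0 && allocation < st_blksize)
        allocation = st_blksize;
    return allocation;
}
```

With `bufsiz` set to 8192 and a 4096-byte block size, the function returns 32768 for any entry size up to that value, matching the arithmetic in the message. A block size above 32768 is the only case that yields a larger buffer.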
      Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
  2. 26 Feb, 2010 1 commit
  3. 24 Feb, 2010 13 commits
  4. 23 Feb, 2010 18 commits