1. 27 Apr, 2013 6 commits
    • Dave Chinner's avatar
      xfs: shortform directory offsets change for dir3 format · 6b2647a1
      Dave Chinner authored
      Because the header size for the CRC enabled directory blocks is
      larger, the offset of the first entry into a directory block is
      different to the dir2 format. The shortform directory stores the
      dirent's offset so that it doesn't change when moving from shortform
      to block form and back again, and hence it needs to take into
      account the different header sizes to maintain the correct offsets.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarBen Myers <bpm@sgi.com>
      Signed-off-by: default avatarBen Myers <bpm@sgi.com>
      6b2647a1
    • Dave Chinner's avatar
      xfs: add CRC checking to dir2 leaf blocks · 24df33b4
      Dave Chinner authored
      This addition follows the same pattern as the dir2 block CRCs.
      Seeing as both LEAF1 and LEAFN types need to changed at the same
      time, this is a pretty large amount of change. leaf block headers
      need to be abstracted away from the on-disk structures (struct
      xfs_dir3_icleaf_hdr), as do the base leaf entry locations.
      
      This header abstract allows the in-core header and leaf entry
      location to be passed around instead of the leaf block itself. This
      saves a lot of converting individual variables from on-disk format
      to host format where they are used, so there's a good chance that
      the compiler will be able to produce much more optimal code as it's
      not having to byteswap variables all over the place.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarBen Myers <bpm@sgi.com>
      Signed-off-by: default avatarBen Myers <bpm@sgi.com>
      24df33b4
    • Dave Chinner's avatar
      xfs: add CRC checking to dir2 data blocks · 33363fee
      Dave Chinner authored
      This addition follows the same pattern as the dir2 block CRCs.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarBen Myers <bpm@sgi.com>
      Signed-off-by: default avatarBen Myers <bpm@sgi.com>
      33363fee
    • Dave Chinner's avatar
      xfs: add CRC checking to dir2 free blocks · cbc8adf8
      Dave Chinner authored
      This addition follows the same pattern as the dir2 block CRCs, but
      with a few differences. The main difference is that the free block
      header is different between the v2 and v3 formats, so an "in-core"
      free block header has been added and _todisk/_from_disk functions
      used to abstract the differences in structure format from the code.
      This is similar to the on-disk superblock versus the in-core
      superblock setup. The in-core strucutre is populated when the buffer
      is read from disk, all the in memory checks and modifications are
      done on the in-core version of the structure which is written back
      to the buffer before the buffer is logged.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarBen Myers <bpm@sgi.com>
      Signed-off-by: default avatarBen Myers <bpm@sgi.com>
      cbc8adf8
    • Dave Chinner's avatar
      xfs: add CRC checks to block format directory blocks · f5f3d9b0
      Dave Chinner authored
      Now that directory buffers are made from a single struct xfs_buf, we
      can add CRC calculation and checking callbacks. While there, add all
      the fields to the on disk structures for future functionality such
      as d_type support, uuids, block numbers, owner inode, etc.
      
      To distinguish between the different on disk formats, change the
      magic numbers for the new format directory blocks.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarBen Myers <bpm@sgi.com>
      Signed-off-by: default avatarBen Myers <bpm@sgi.com>
      f5f3d9b0
    • Dave Chinner's avatar
      xfs: add CRC checks to remote symlinks · f948dd76
      Dave Chinner authored
      Add a header to the remote symlink block, containing location and
      owner information, as well as CRCs and LSN fields. This requires
      verifiers to be added to the remote symlink buffers for CRC enabled
      filesystems.
      
      This also fixes a bug reading multiple block symlinks, where the second
      block overwrites the first block when copying out the link name.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarBen Myers <bpm@sgi.com>
      Signed-off-by: default avatarBen Myers <bpm@sgi.com>
      f948dd76
  2. 21 Apr, 2013 8 commits
  3. 16 Apr, 2013 2 commits
  4. 05 Apr, 2013 1 commit
    • Dave Chinner's avatar
      xfs: don't free EFIs before the EFDs are committed · 666d644c
      Dave Chinner authored
      Filesystems are occasionally being shut down with this error:
      
      xfs_trans_ail_delete_bulk: attempting to delete a log item that is
      not in the AIL.
      
      It was diagnosed to be related to the EFI/EFD commit order when the
      EFI and EFD are in different checkpoints and the EFD is committed
      before the EFI here:
      
      http://oss.sgi.com/archives/xfs/2013-01/msg00082.html
      
      The real problem is that a single bit cannot fully describe the
      states that the EFI/EFD processing can be in. These completion
      states are:
      
      EFI			EFI in AIL	EFD		Result
      committed/unpinned	Yes		committed	OK
      committed/pinned	No		committed	Shutdown
      uncommitted		No		committed	Shutdown
      
      
      Note that the "result" field is what should happen, not what does
      happen. The current logic is broken and handles the first two cases
      correctly by luck.  That is, the code will free the EFI if the
      XFS_EFI_COMMITTED bit is *not* set, rather than if it is set. The
      inverted logic "works" because if both EFI and EFD are committed,
      then the first __xfs_efi_release() call clears the XFS_EFI_COMMITTED
      bit, and the second frees the EFI item. Hence as long as
      xfs_efi_item_committed() has been called, everything appears to be
      fine.
      
      It is the third case where the logic fails - where
      xfs_efd_item_committed() is called before xfs_efi_item_committed(),
      and that results in the EFI being freed before it has been
      committed. That is the bug that triggered the shutdown, and hence
      keeping track of whether the EFI has been committed or not is
      insufficient to correctly order the EFI/EFD operations w.r.t. the
      AIL.
      
      What we really want is this: the EFI is always placed into the
      AIL before the last reference goes away. The only way to guarantee
      that is that the EFI is not freed until after it has been unpinned
      *and* the EFD has been committed. That is, restructure the logic so
      that the only case that can occur is the first case.
      
      This can be done easily by replacing the XFS_EFI_COMMITTED with an
      EFI reference count. The EFI is initialised with it's own count, and
      that is not released until it is unpinned. However, there is a
      complication to this method - the high level EFI/EFD code in
      xfs_bmap_finish() does not hold direct references to the EFI
      structure, and runs a transaction commit between the EFI and EFD
      processing. Hence the EFI can be freed even before the EFD is
      created using such a method.
      
      Further, log recovery uses the AIL for tracking EFI/EFDs that need
      to be recovered, but it uses the AIL *differently* to the EFI
      transaction commit. Hence log recovery never pins or unpins EFIs, so
      we can't drop the EFI reference count indirectly to free the EFI.
      
      However, this doesn't prevent us from using a reference count here.
      There is a 1:1 relationship between EFIs and EFDs, so when we
      initialise the EFI we can take a reference count for the EFD as
      well. This solves the xfs_bmap_finish() issue - the EFI will never
      be freed until the EFD is processed. In terms of log recovery,
      during the committing of the EFD we can look for the
      XFS_EFI_RECOVERED bit being set and drop the EFI reference as well,
      thereby ensuring everything works correctly there as well.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarMark Tinguely <tinguely@sgi.com>
      Signed-off-by: default avatarBen Myers <bpm@sgi.com>
      666d644c
  5. 03 Apr, 2013 1 commit
  6. 22 Mar, 2013 7 commits
  7. 14 Mar, 2013 3 commits
  8. 07 Mar, 2013 6 commits
    • Dave Chinner's avatar
      xfs: rearrange some code in xfs_bmap for better locality · 9e5987a7
      Dave Chinner authored
      xfs_bmap.c is a big file, and some of the related code is spread all
      throughout the file requiring function prototypes for static
      function and jumping all through the file to follow a single call
      path. Rearrange the code so that:
      
      	a) related functionality is grouped together; and
      	b) functions are grouped in call dependency order
      
      While the diffstat is large, there are no code changes in the patch;
      it is just moving the functionality around and removing the function
      prototypes at the top of the file. The resulting layout of the code
      is as follows (top of file to bottom):
      
      	- miscellaneous helper functions
      	- extent tree block counting routines
      	- debug/sanity checking code
      	- bmap free list manipulation functions
      	- inode fork format manipulation functions
      	- internal/external extent tree seach functions
      	- extent tree manipulation functions used during allocation
      	- functions used during extent read/allocate/removal
      	  operations (i.e. xfs_bmapi_write, xfs_bmapi_read,
      	  xfs_bunmapi and xfs_getbmap)
      
      This means that following logic paths through the bmapi code is much
      simpler - most of the code relevant to a specific operation is now
      clustered together rather than spread all over the file....
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarMark Tinguely <tinguely@sgi.com>
      Signed-off-by: default avatarBen Myers <bpm@sgi.com>
      9e5987a7
    • Akinobu Mita's avatar
      xfs: rename random32() to prandom_u32() · ecb3403d
      Akinobu Mita authored
      Use more preferable function name which implies using a pseudo-random
      number generator.
      Signed-off-by: default avatarAkinobu Mita <akinobu.mita@gmail.com>
      Acked-by: <bpm@sgi.com>
      Cc: Ben Myers <bpm@sgi.com>
      Cc: Alex Elder <elder@kernel.org>
      Cc: xfs@oss.sgi.com
      Signed-off-by: default avatarBen Myers <bpm@sgi.com>
      ecb3403d
    • Dave Chinner's avatar
      xfs: don't verify buffers after IO errors · d5929de8
      Dave Chinner authored
      When we read a buffer, we might get an error from the underlying
      block device and not the real data. Hence if we get an IO error, we
      shouldn't run the verifier but instead just pass the IO error
      straight through.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarMark Tinguely <tinguely@sgi.com>
      Signed-off-by: default avatarBen Myers <bpm@sgi.com>
      d5929de8
    • Mark Tinguely's avatar
      xfs: fix xfs_iomap_eof_prealloc_initial_size type · e8108ced
      Mark Tinguely authored
      Fix the return type of xfs_iomap_eof_prealloc_initial_size() to
      xfs_fsblock_t to reflect the fact that the return value may be an
      unsigned 64 bits if XFS_BIG_BLKNOS is defined.
      Signed-off-by: default avatarMark Tinguely <tinguely@sgi.com>
      Reviewed-by: default avatarDave Chinner <dchinner@redhat.com>
      Signed-off-by: default avatarBen Myers <bpm@sgi.com>
      e8108ced
    • Brian Foster's avatar
      xfs: increase prealloc size to double that of the previous extent · e114b5fc
      Brian Foster authored
      The updated speculative preallocation algorithm for handling sparse
      files can becomes less effective in situations with a high number of
      concurrent, sequential writers. The number of writers and amount of
      available RAM affect the writeback bandwidth slicing algorithm,
      which in turn affects the block allocation pattern of XFS. For
      example, running 32 sequential writers on a system with 32GB RAM,
      preallocs become fixed at a value of around 128MB (instead of
      steadily increasing to the 8GB maximum as sequential writes
      proceed).
      
      Update the speculative prealloc heuristic to base the size of the
      next prealloc on double the size of the preceding extent. This
      preserves the original aggressive speculative preallocation
      behavior and continues to accomodate sparse files at a slight cost
      of increasing the size of preallocated data regions following holes
      of sparse files.
      Signed-off-by: default avatarBrian Foster <bfoster@redhat.com>
      Reviewed-by: default avatarDave Chinner <dchinner@redhat.com>
      Signed-off-by: default avatarBen Myers <bpm@sgi.com>
      e114b5fc
    • Brian Foster's avatar
      xfs: fix potential infinite loop in xfs_iomap_prealloc_size() · e78c420b
      Brian Foster authored
      If freesp == 0, we could end up in an infinite loop while squashing
      the preallocation. Break the loop when we've killed the prealloc
      entirely.
      Signed-off-by: default avatarBrian Foster <bfoster@redhat.com>
      Reviewed-by: default avatarDave Chinner <dchinner@redhat.com>
      Signed-off-by: default avatarBen Myers <bpm@sgi.com>
      e78c420b
  9. 03 Mar, 2013 6 commits
    • Linus Torvalds's avatar
      Linux 3.9-rc1 · 6dbe51c2
      Linus Torvalds authored
      6dbe51c2
    • Linus Torvalds's avatar
      Merge tag 'disintegrate-fbdev-20121220' of git://git.infradead.org/users/dhowells/linux-headers · ea882c2e
      Linus Torvalds authored
      Pull fbdev UAPI disintegration from David Howells:
       "You'll be glad to here that the end is nigh for the UAPI patches.
        Only the fbdev/framebuffer piece remains now that the SCSI stuff has
        gone in.
      
        Here are the UAPI disintegration bits for the fbdev drivers.  It
        appears that Florian hasn't had time to deal with my patch, but back
        in December he did say he didn't mind if I pushed it forward."
      
      Yay.  No more uapi movement.  And hopefully no more big header file
      cleanups coming up either, it just tends to be very painful.
      
      * tag 'disintegrate-fbdev-20121220' of git://git.infradead.org/users/dhowells/linux-headers:
        UAPI: (Scripted) Disintegrate include/video
      ea882c2e
    • Linus Torvalds's avatar
      Merge tag 'stable/for-linus-3.9-rc1-tag' of... · 8e8b180a
      Linus Torvalds authored
      Merge tag 'stable/for-linus-3.9-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
      
      Pull Xen bug-fixes from Konrad Rzeszutek Wilk:
       - Update the Xen ACPI memory and CPU hotplug locking mechanism.
       - Fix PAT issues wherein various applications would not start
       - Fix handling of multiple MSI as AHCI now does it.
       - Fix ARM compile failures.
      
      * tag 'stable/for-linus-3.9-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
        xenbus: fix compile failure on ARM with Xen enabled
        xen/pci: We don't do multiple MSI's.
        xen/pat: Disable PAT using pat_enabled value.
        xen/acpi: xen cpu hotplug minor updates
        xen/acpi: xen memory hotplug minor updates
      8e8b180a
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 56a79b7b
      Linus Torvalds authored
      Pull  more VFS bits from Al Viro:
       "Unfortunately, it looks like xattr series will have to wait until the
        next cycle ;-/
      
        This pile contains 9p cleanups and fixes (races in v9fs_fid_add()
        etc), fixup for nommu breakage in shmem.c, several cleanups and a bit
        more file_inode() work"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        constify path_get/path_put and fs_struct.c stuff
        fix nommu breakage in shmem.c
        cache the value of file_inode() in struct file
        9p: if v9fs_fid_lookup() gets to asking server, it'd better have hashed dentry
        9p: make sure ->lookup() adds fid to the right dentry
        9p: untangle ->lookup() a bit
        9p: double iput() in ->lookup() if d_materialise_unique() fails
        9p: v9fs_fid_add() can't fail now
        v9fs: get rid of v9fs_dentry
        9p: turn fid->dlist into hlist
        9p: don't bother with private lock in ->d_fsdata; dentry->d_lock will do just fine
        more file_inode() open-coded instances
        selinux: opened file can't have NULL or negative ->f_path.dentry
      
      (In the meantime, the hlist traversal macros have changed, so this
      required a semantic conflict fixup for the newly hlistified fid->dlist)
      56a79b7b
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs · 1c82315a
      Linus Torvalds authored
      Pull btrfs fixup from Chris Mason:
       "Geert and James both sent this one in, sorry guys"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
        btrfs/raid56: Add missing #include <linux/vmalloc.h>
      1c82315a
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 530ede14
      Linus Torvalds authored
      Pull second set of s390 patches from Martin Schwidefsky:
       "The main part of this merge are Heikos uaccess patches.  Together with
        commit 09884964 ("mm: do not grow the stack vma just because of an
        overrun on preceding vma") the user string access is hopefully fixed
        for good.
      
        In addition some bug fixes and two cleanup patches."
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/module: fix compile warning
        qdio: remove unused parameters
        s390/uaccess: fix kernel ds access for page table walk
        s390/uaccess: fix strncpy_from_user string length check
        input: disable i8042 PC Keyboard controller for s390
        s390/dis: Fix invalid array size
        s390/uaccess: remove pointless access_ok() checks
        s390/uaccess: fix strncpy_from_user/strnlen_user zero maxlen case
        s390/uaccess: shorten strncpy_from_user/strnlen_user
        s390/dasd: fix unresponsive device after all channel paths were lost
        s390/mm: ignore change bit for vmemmap
        s390/page table dumper: add support for change-recording override bit
      530ede14