1. 10 Sep, 2010 14 commits
    • ocfs2: Cache system inodes of other slots. · b4d693fc
      Tao Ma authored
      During orphan scan, if we are slot 0, and we are replaying
      orphan_dir:0001, the general process is that for every file
      in this dir:
      1. we will iget orphan_dir:0001, since there is no inode for it.
         we will have to create an inode and read it from the disk.
      2. do the normal work, such as delete_inode and remove it from
         the dir if it is allowed.
      3. call iput orphan_dir:0001 when we are done. In this case,
         since we have no dcache for this inode, i_count will
         reach 0, and VFS will have to call clear_inode and in
         ocfs2_clear_inode we will checkpoint the inode which will let
         ocfs2_cmt and journald begin to work.
      4. We loop back to 1 for the next file.
      
      So for every deleted file, we have to read the orphan dir from
      the disk and checkpoint the journal. This is very time consuming
      and causes a lot of journal checkpoint I/O.
      A better solution is to hold another reference to these inodes
      in ocfs2_super. Then, if there is no race among nodes (which
      would make dlmglue checkpoint the inode), clear_inode won't be
      called in step 3, and in step 1 we only need to read the inode
      the first time. This is a big win for us.
      
      So this patch will try to cache system inodes of other slots so
      that we will have one more reference for these inodes and avoid
      the extra inode read and journal checkpoint.
      Signed-off-by: Tao Ma <tao.ma@oracle.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
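
The effect of the extra reference can be illustrated with a toy userspace model. This is only a sketch under stated assumptions: `toy_inode`, `toy_fs`, `toy_iget`, `toy_iput` and `toy_orphan_scan` are hypothetical names, not kernel APIs, and the counters merely stand in for the disk read and the journal checkpoint described in the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for an in-memory inode; not the kernel's struct inode. */
struct toy_inode {
    int  i_count;     /* reference count */
    bool in_memory;   /* false means the next lookup must read from disk */
};

/* Toy filesystem state with counters for the two expensive operations. */
struct toy_fs {
    struct toy_inode orphan_dir;
    bool cache_enabled;   /* hold one extra long-lived reference? */
    int  disk_reads;
    int  checkpoints;
};

/* Step 1: look up the orphan dir, reading it from disk if needed. */
struct toy_inode *toy_iget(struct toy_fs *fs)
{
    struct toy_inode *inode = &fs->orphan_dir;

    if (!inode->in_memory) {
        fs->disk_reads++;
        inode->in_memory = true;
        if (fs->cache_enabled)
            inode->i_count++;   /* extra reference kept by the cache */
    }
    inode->i_count++;
    return inode;
}

/* Step 3: drop a reference; dropping the last one forces a checkpoint. */
void toy_iput(struct toy_fs *fs, struct toy_inode *inode)
{
    assert(inode->i_count > 0);
    if (--inode->i_count == 0) {
        fs->checkpoints++;
        inode->in_memory = false;
    }
}

/* Steps 1 to 4, repeated once per orphaned file. */
void toy_orphan_scan(struct toy_fs *fs, int nr_files)
{
    for (int i = 0; i < nr_files; i++) {
        struct toy_inode *dir = toy_iget(fs);
        /* step 2 (the actual delete work) is omitted in this sketch */
        toy_iput(fs, dir);
    }
}
```

In this model, scanning five files without the cache costs five disk reads and five checkpoints, while the cached variant costs one read and no checkpoint, because the reference count never drops to zero.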
    • libfs: Fix shift bug in generic_check_addressable() · a33f13ef
      Joel Becker authored
      generic_check_addressable() erroneously shifts pages down by a block
      factor when it should be shifting up.  To prevent overflow, we shift
      blocks down to pages.
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
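
The shift direction can be sketched in userspace C. This is not the kernel function: `sketch_check_addressable`, `sketch_max_value` and `SKETCH_PAGE_SHIFT` are hypothetical names, 4 KiB pages are assumed, and the widths of the sector and page-index types are passed in as parameters so that both 32-bit and 64-bit configurations can be tried.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12u   /* assumption: 4 KiB pages */

/* Largest value representable in an unsigned type of the given width. */
static uint64_t sketch_max_value(unsigned bits)
{
    return bits >= 64 ? UINT64_MAX : (((uint64_t)1 << bits) - 1);
}

/*
 * Return 0 if a volume of num_blocks blocks is addressable, -EINVAL for
 * an unsupported block size, or -EFBIG if the volume is too large.
 */
int sketch_check_addressable(unsigned blocksize_bits, uint64_t num_blocks,
                             unsigned sector_bits, unsigned pgoff_bits)
{
    uint64_t last_block, last_page;

    assert(sector_bits <= 64 && pgoff_bits <= 64);
    if (num_blocks == 0)
        return 0;
    if (blocksize_bits < 9 || blocksize_bits > SKETCH_PAGE_SHIFT)
        return -EINVAL;

    last_block = num_blocks - 1;
    /*
     * Shift the block number DOWN to a page number. Shifting the page
     * limit up to blocks instead could overflow 64 bits.
     */
    last_page = last_block >> (SKETCH_PAGE_SHIFT - blocksize_bits);

    if (last_block > (sketch_max_value(sector_bits) >> (blocksize_bits - 9)) ||
        last_page > sketch_max_value(pgoff_bits))
        return -EFBIG;
    return 0;
}
```

With 4 KiB blocks and a 32-bit page index, this sketch accepts 2^32 blocks (16 TiB) and rejects 2^32 + 1. With 1 KiB blocks the limit moves to 2^34 blocks, which is the same 16 TiB, because four blocks share one page.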
    • OCFS2: Allow huge (> 16 TiB) volumes to mount · 3bdb8efd
      Patrick J. LoPresti authored
      The OCFS2 developers have already done all of the hard work to allow
      volumes larger than 16 TiB.  But there is still a "sanity check" in
      fs/ocfs2/super.c that prevents the mounting of such volumes, even when
      the cluster size and journal options would allow it.
      
      This patch replaces that sanity check with a more sophisticated one to
      mount a huge volume provided that (a) it is addressable by the raw
      word/address size of the system (borrowing a test from ext4); (b) the
      volume is using JBD2; and (c) the JBD2_FEATURE_INCOMPAT_64BIT flag is
      set on the journal.
      
      I factored out the sanity check into its own function.  I also moved it
      from ocfs2_initialize_super() down to ocfs2_check_volume(); any earlier,
      and the journal will not have been initialized yet.
      
      This patch is one of a pair, and it depends on the other ("JBD2: Allow
      feature checks before journal recovery").
      
      I have tested this patch on small volumes, huge volumes, and huge
      volumes without 64-bit block support in the journal.  All of them appear
      to work or to fail gracefully, as appropriate.
      Signed-off-by: Patrick LoPresti <lopresti@gmail.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
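
The three conditions (a) to (c) amount to a simple gate, sketched below. All names are hypothetical: `sketch_volume` is not an OCFS2 structure, and `SKETCH_INCOMPAT_64BIT` is only a stand-in for the journal's 64-bit feature flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_INCOMPAT_64BIT 0x2u  /* hypothetical stand-in for the journal flag */

/* Hypothetical summary of what the mount path knows about a huge volume. */
struct sketch_volume {
    bool     addressable;       /* (a) fits the system's word/address size */
    bool     uses_jbd2;         /* (b) journal is JBD2 */
    uint32_t journal_incompat;  /* (c) incompat feature bits of the journal */
};

/* A huge volume may be mounted only if all three conditions hold. */
bool sketch_can_mount_huge(const struct sketch_volume *vol)
{
    assert(vol != NULL);
    if (!vol->addressable)
        return false;
    if (!vol->uses_jbd2)
        return false;
    return (vol->journal_incompat & SKETCH_INCOMPAT_64BIT) != 0;
}
```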
    • JBD2: Allow feature checks before journal recovery · 1113e1b5
      Patrick J. LoPresti authored
      Before we start accessing a huge (> 16 TiB) OCFS2 volume, we need to
      confirm that its journal supports 64-bit offsets.  In particular, we
      need to check the journal's feature bits before recovering the journal.
      
      This is not possible with JBD2 at present, because the journal
      superblock (where the feature bits reside) is not loaded from disk until
      the journal is recovered.
      
      This patch loads the journal superblock in
      jbd2_journal_check_used_features() if it has not already been loaded,
      allowing us to check the feature bits before journal recovery.
      Signed-off-by: Patrick LoPresti <lopresti@gmail.com>
      Cc: linux-ext4@vger.kernel.org
      Acked-by: "Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
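
The load-on-demand idea can be sketched as follows. This is a simplified userspace model, not the JBD2 code: `sketch_journal`, `sketch_load_superblock` and `sketch_check_used_features` are hypothetical names, only the incompat feature word is modelled, and the load cannot fail here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical journal handle; on_disk_incompat stands in for the disk. */
struct sketch_journal {
    bool     sb_loaded;         /* has the superblock been read yet? */
    uint32_t on_disk_incompat;  /* feature bits as stored on disk */
    uint32_t incompat;          /* in-memory copy, valid once sb_loaded */
    int      loads;             /* how many times the superblock was read */
};

static void sketch_load_superblock(struct sketch_journal *j)
{
    j->incompat = j->on_disk_incompat;
    j->sb_loaded = true;
    j->loads++;
}

/* True if every requested incompat bit is set; loads the superblock first. */
bool sketch_check_used_features(struct sketch_journal *j, uint32_t incompat)
{
    assert(j != NULL);
    if (!j->sb_loaded)
        sketch_load_superblock(j);
    return (j->incompat & incompat) == incompat;
}
```

Because the check loads the superblock itself when needed, a caller can query the feature bits before recovery, and repeated checks do not read the superblock again.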
    • ext3/ext4: Factor out disk addressability check · 30ca22c7
      Patrick J. LoPresti authored
      As part of adding support for OCFS2 to mount huge volumes, we need to
      check that the sector_t and page cache of the system are capable of
      addressing the entire volume.
      
      An identical check already appears in ext3 and ext4.  This patch moves
      the addressability check into its own function in fs/libfs.c and
      modifies ext3 and ext4 to invoke it.
      
      [Edited to -EINVAL instead of BUG_ON() for bad blocksize_bits -- Joel]
      Signed-off-by: Patrick LoPresti <lopresti@gmail.com>
      Cc: linux-ext4@vger.kernel.org
      Acked-by: Andreas Dilger <adilger@dilger.ca>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
    • Joel Becker's avatar
    • Tao Ma's avatar
      17ae5211
    • ocfs2: Remove unused old_id in ocfs2_commit_cache. · f9c57ada
      Tao Ma authored
      Signed-off-by: Tao Ma <tao.ma@oracle.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
    • ocfs2: Remove ocfs2_sync_inode() · 4c38881f
      Jan Kara authored
      ocfs2_sync_inode() is used only from ocfs2_sync_file(). But all data has
      already been written before ocfs2_sync_file() is called, and ocfs2 doesn't
      use the inode's private_list for tracking metadata buffers, so
      sync_mapping_buffers() is superfluous as well.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Acked-by: Mark Fasheh <mfasheh@suse.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
    • Reorganize data elements to reduce struct sizes · 83fd9c7f
      Goldwyn Rodrigues authored
      Thanks for the comments. I have incorporated them all.
      
      CONFIG_OCFS2_FS_STATS is enabled and CONFIG_DEBUG_LOCK_ALLOC is disabled.
      Struct sizes in bytes now look like this (old - new = saved):
      ocfs2_write_ctxt: 2144 - 2136 = 8
      ocfs2_inode_info: 1960 - 1848 = 112
      ocfs2_journal: 168 - 160 = 8
      ocfs2_lock_res: 336 - 304 = 32
      ocfs2_refcount_tree: 512 - 472 = 40
      Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.de>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
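
Savings like these usually come from alignment padding. The sketch below shows the general technique on two hypothetical structs; they are not the OCFS2 structures from the patch, and the exact sizes depend on the ABI (typically 24 and 16 bytes on x86-64).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: each small member is followed by alignment padding. */
struct sketch_padded {
    uint8_t  flag_a;
    uint64_t counter;
    uint8_t  flag_b;
};

/* Same members, widest first, so the small ones share one padded tail. */
struct sketch_reordered {
    uint64_t counter;
    uint8_t  flag_a;
    uint8_t  flag_b;
};

/* Bytes saved by the reordering on the ABI this is compiled for. */
size_t sketch_bytes_saved(void)
{
    /* Guard the unsigned subtraction against wrapping. */
    assert(sizeof(struct sketch_padded) >= sizeof(struct sketch_reordered));
    return sizeof(struct sketch_padded) - sizeof(struct sketch_reordered);
}
```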
    • ocfs2: Remove obscure error handling in direct_write. · 95fa859a
      Tao Ma authored
      ocfs2 doesn't allow any direct write past i_size (see
      ocfs2_prepare_inode_for_write), so we don't need the bogus
      simple_setsize.
      Signed-off-by: Tao Ma <tao.ma@oracle.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
    • ocfs2: Add some trace log for orphan scan. · 3c3f20c9
      Tao Ma authored
      The orphan scan worker currently has no trace log, so it is
      very hard to tell whether it has finished or is blocked.
      Add two mlog trace messages so that we can tell whether
      the current orphan scan worker is blocked or not.
      This helped when I analyzed an orphan scan bug.
      Signed-off-by: Tao Ma <tao.ma@oracle.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
    • Ocfs2: Add new OCFS2_IOC_INFO ioctl for ocfs2 v8. · ddee5cdb
      Tristan Ye authored
      We need this ioctl to give non-privileged end users a way to
      gather filesystem information.
      
      The new ioctl is invoked as OCFS2_IOC_INFO. Userspace passes the kernel
      a structure containing an array of request pointers and a request
      count, for example:
      
      * From userspace:
      
      struct ocfs2_info_blocksize oib = {
              .ib_req = {
                      .ir_magic = OCFS2_INFO_MAGIC,
                      .ir_code = OCFS2_INFO_BLOCKSIZE,
                      ...
              }
              ...
      };
      
      struct ocfs2_info_clustersize oic = {
              ...
      };
      
      uint64_t reqs[2] = {(unsigned long)&oib,
                          (unsigned long)&oic};
      
      struct ocfs2_info info = {
              .oi_requests = (unsigned long)reqs,
              .oi_count = 2,
      };
      
      ret = ioctl(fd, OCFS2_IOC_INFO, &info);
      
      * In kernel:
      
      Get the request pointers from *info*, then handle each request one by one.
      
      The idea is to keep each separate request small enough to guarantee
      better backward and forward compatibility, since a small request is
      less likely to break if the on-disk filesystem format changes.
      
      Currently, the following 7 requests are supported, per the requirements of
      the userspace tool o2info, and I believe the list will grow over time :-)
      
              OCFS2_INFO_CLUSTERSIZE
              OCFS2_INFO_BLOCKSIZE
              OCFS2_INFO_MAXSLOTS
              OCFS2_INFO_LABEL
              OCFS2_INFO_UUID
              OCFS2_INFO_FS_FEATURES
              OCFS2_INFO_JOURNAL_SIZE
      
      This ioctl is only specific to OCFS2.
      Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
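
The kernel-side handling described above (walk the request pointers, handle each request in turn) can be sketched in plain userspace C. Everything here is an assumption made for illustration: the `sketch_*` types, the magic value, the request codes and the "filled" flag are hypothetical, every request shares one simplified layout, and a real kernel handler would copy the data from and to user space instead of dereferencing the pointers directly.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_INFO_MAGIC 0x1234ABCDu  /* hypothetical magic value */
#define SKETCH_FL_FILLED  0x1u         /* hypothetical "request was answered" flag */

enum sketch_code { SKETCH_BLOCKSIZE = 1, SKETCH_CLUSTERSIZE = 2 };

/* Simplified request: a small header plus a single result field. */
struct sketch_req {
    uint32_t magic;
    uint32_t code;
    uint32_t flags;
    uint32_t value;
};

/* Simplified top-level argument: address of a pointer array, and its length. */
struct sketch_info {
    uint64_t requests;
    uint32_t count;
};

/* Handle each request one by one; unknown codes are left unfilled. */
int sketch_handle_info(const struct sketch_info *info,
                       uint32_t blocksize, uint32_t clustersize)
{
    const uint64_t *reqs;

    assert(info != NULL);
    reqs = (const uint64_t *)(uintptr_t)info->requests;

    for (uint32_t i = 0; i < info->count; i++) {
        struct sketch_req *req = (struct sketch_req *)(uintptr_t)reqs[i];

        if (req->magic != SKETCH_INFO_MAGIC)
            return -EINVAL;

        switch (req->code) {
        case SKETCH_BLOCKSIZE:
            req->value = blocksize;
            req->flags |= SKETCH_FL_FILLED;
            break;
        case SKETCH_CLUSTERSIZE:
            req->value = clustersize;
            req->flags |= SKETCH_FL_FILLED;
            break;
        default:
            /* Unknown request: skip it so the caller sees it unfilled. */
            break;
        }
    }
    return 0;
}
```

Keeping each request small and self-describing means that a caller can send a code this handler does not know and still get answers for the requests it does know, which is the compatibility property the commit message aims for.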
    • Merge master.kernel.org:/home/rmk/linux-2.6-arm · 152831be
      Linus Torvalds authored
      * master.kernel.org:/home/rmk/linux-2.6-arm: (30 commits)
        ARM: Update mach-types
        ARM: Partially revert "Auto calculate ZRELADDR and provide option for exceptions"
        ARM: Ensure PTE modifications via dma_alloc_coherent are visible
        ARM: 6359/1: ep93xx: move clock initialization earlier
        Revert "[ARM] pxa: remove now unnecessary dma_needs_bounce()"
        ARM: 6352/1: perf: fix event validation
        ARM: 6344/1: Mark CPU_32v6K as depended on CPU_V7
        ARM: 6343/1: wire up fanotify and prlimit64 syscalls on ARM
        ARM: 6330/1: perf: reword comments relating to perf_event_do_pending
        ARM: pxa168fb: fix section mismatch
        ARM: pxa: Make id const in pwm_probe()
        ARM: pxa: fix CI_HSYNC and CI_VSYNC MFP defines for pxa300
        ARM: pxa: remove __init from cpufreq_driver->init()
        ARM: imx: set cache line size to 64 bytes for i.MX5
        mx5/clock: fix clear bit fields issue in _clk_ccgr_disable function
        mxc/tzic: add base address when accessing TZIC registers
        ARM: mach-shmobile: ap4evb: fix write protect for SDHI1
        ARM: mach-shmobile: ap4evb: modify FSI2 ID
        ARM: mach-shmobile: do not enable the PLLC2 clock on init
        ARM: mach-shmobile: Clock framework comment fix
        ...
  2. 09 Sep, 2010 7 commits
  3. 08 Sep, 2010 19 commits