1. 26 Jul, 2010 14 commits
    • md/bitmap: optimise scanning of empty bitmaps. · ef425673
      NeilBrown authored
      A bitmap is stored as one page per 2048 bits.
      If none of the bits are set, the page is not allocated.
      
      When bitmap_get_counter finds that a page isn't allocated,
      it just reports that one bit's worth of space isn't flagged,
      rather than reporting that 2048 bits worth of space are
      unflagged.
      This can cause searches for flagged bits (e.g. bitmap_close_sync)
      to do more work than is really necessary.
      
      So change bitmap_get_counter (when creating) to report a number of
      blocks that more accurately reports the range of the device for which
      no counter currently exists.
      Signed-off-by: NeilBrown <neilb@suse.de>
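The effect described in ef425673 can be illustrated with a small user-space sketch. This is not the kernel code: `struct sketch_bitmap`, `sketch_blocks_covered` and `sketch_scan_steps` are hypothetical names, and the only figure taken from the commit message is the 2048 entries per page. The sketch assumes a missing page can answer for its whole range, so a scanner skips that range in one step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_COUNTERS_PER_PAGE 2048u /* page granularity quoted in the commit message */

/* Hypothetical stand-in for the md bitmap; not the kernel's own structure. */
struct sketch_bitmap {
    unsigned chunk_shift;  /* log2 of the sectors covered by one counter */
    size_t npages;         /* number of page slots */
    uint16_t **pages;      /* pages[i] == NULL: nothing flagged in that page */
};

/*
 * Number of sectors, starting at `offset`, that are covered by the same
 * counter, or by the same missing page.  An unallocated page answers for
 * its whole range instead of for a single counter.
 */
uint64_t sketch_blocks_covered(const struct sketch_bitmap *bm, uint64_t offset)
{
    uint64_t chunk = offset >> bm->chunk_shift;
    size_t page = (size_t)(chunk / SKETCH_COUNTERS_PER_PAGE);
    uint64_t span;

    assert(page < bm->npages);
    if (bm->pages[page] == NULL)
        span = (uint64_t)SKETCH_COUNTERS_PER_PAGE << bm->chunk_shift;
    else
        span = (uint64_t)1 << bm->chunk_shift;

    /* span is a power of two, so this is the distance to its next boundary */
    return span - (offset & (span - 1));
}

/* Count how many lookups a linear scan over `total` sectors needs. */
unsigned sketch_scan_steps(const struct sketch_bitmap *bm, uint64_t total)
{
    unsigned steps = 0;
    uint64_t s = 0;

    while (s < total) {
        s += sketch_blocks_covered(bm, s);
        steps++;
    }
    return steps;
}
```

With 8-sector chunks and two empty pages, a scan of 32768 sectors needs 2 lookups in this sketch, where stepping one counter at a time would need 4096.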
    • md/bitmap: clean up plugging calls. · b63d7c2e
      NeilBrown authored
      1/ use md_unplug in bitmap.c as we will soon be using bitmaps under
        arrays with no queue attached.
      
      2/ Don't bother plugging the queue when we set a bit in the bitmap.
         The reason for this was to encourage as many bits as possible to
         get set before we unplug and write stuff out.
         However every personality already plugs the queue after
         bitmap_startwrite either directly (raid1/raid10) or by setting
         STRIPE_BIT_DELAY which causes the queue to be plugged later
         (raid5).
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md/bitmap: reduce dependence on sysfs. · 5ff5afff
      NeilBrown authored
      For dm-raid45 we will want to use bitmaps in dm-targets which don't
      have entries in sysfs, so cope with the mddev not living in sysfs.
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md/bitmap: white space clean up and similar. · ac2f40be
      NeilBrown authored
      Fixed some whitespace problems.
      Fixed some checkpatch.pl complaints.
      Replaced kmalloc ... memset(0) with kzalloc.
      Fixed an unlikely memory leak on an error path.
      Reformatted a number of 'if/else' sets, sometimes
      replacing goto with an else clause.
      Removed some old comments and commented-out code.
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md/raid5: export raid5 unplugging interface. · 9f7c2220
      NeilBrown authored
      Also remove remaining accesses to ->queue and ->gendisk when ->queue
      is NULL (as it is in a DM target).
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md/plug: optionally use plugger to unplug an array during resync/recovery. · 252ac522
      NeilBrown authored
      If an array doesn't have a 'queue' then md_do_sync cannot
      unplug it.
      In that case it will have a 'plugger', so make that available
      to the mddev, and use it to unplug the array if needed.
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md/raid5: add simple plugging infrastructure. · 2ac87401
      NeilBrown authored
      md/raid5 uses the plugging infrastructure provided by the block layer
      and 'struct request_queue'.  However when we plug raid5 under dm there
      is no request queue so we cannot use that.
      
      So create a similar infrastructure that is much lighter weight and use
      it for raid5.
      Signed-off-by: NeilBrown <neilb@suse.de>
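As a rough illustration of what a plug that does not depend on 'struct request_queue' could look like, here is a minimal user-space sketch. All names (`sketch_plug`, `sketch_plug_set`, `sketch_plug_release`, `sketch_array`) are hypothetical and are not the interface added by 2ac87401. Any timer-driven or deferred unplugging is left out, and the assumption is simply that the plug holds a flag and a callback supplied by its owner.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lightweight plug; the names are illustrative only. */
struct sketch_plug {
    bool plugged;
    void (*unplug_fn)(struct sketch_plug *plug);
};

void sketch_plug_init(struct sketch_plug *plug,
                      void (*unplug_fn)(struct sketch_plug *plug))
{
    assert(plug != NULL && unplug_fn != NULL);
    plug->plugged = false;
    plug->unplug_fn = unplug_fn;
}

/* Hold back work until someone unplugs. */
void sketch_plug_set(struct sketch_plug *plug)
{
    plug->plugged = true;
}

/* Release held work; does nothing if the plug was not set. */
void sketch_plug_release(struct sketch_plug *plug)
{
    if (!plug->plugged)
        return;
    plug->plugged = false;
    plug->unplug_fn(plug);
}

/* Example owner: an array that batches requests while plugged. */
struct sketch_array {
    struct sketch_plug plug;
    int queued;     /* requests waiting behind the plug */
    int submitted;  /* requests handed on to the member devices */
};

/* Unplug callback: flush everything that was queued behind the plug. */
void sketch_array_unplug(struct sketch_plug *plug)
{
    /* Recover the owning array from the embedded plug. */
    struct sketch_array *a = (struct sketch_array *)
        ((char *)plug - offsetof(struct sketch_array, plug));

    a->submitted += a->queued;
    a->queued = 0;
}

/* Queue one request and make sure the plug is set. */
void sketch_array_queue(struct sketch_array *a)
{
    sketch_plug_set(&a->plug);
    a->queued++;
}
```

Because the plug is embedded in its owner and only carries a callback, code that has no request queue to unplug can still call `sketch_plug_release` on it.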
    • md/raid5: export is_congested test · 11d8a6e3
      NeilBrown authored
      The dm module will need this for dm-raid45.
      
      Also only access ->queue->backing_dev_info->congested_fn
      if ->queue actually exists.  It won't in a dm target.
      Signed-off-by: NeilBrown <neilb@suse.de>
    • raid5: Don't set read-ahead when there is no queue · 4a5add49
      NeilBrown authored
      dm-raid456 does not provide a 'queue' for raid5 to use,
      so we must make raid5 stop depending on the queue.
      
      First: read_ahead
      dm handles read-ahead adjustment fully in userspace, so
      simply don't do any readahead adjustments if there is
      no queue.
      
      Also re-arrange code slightly so all the accesses to ->queue are
      together.
      
      Finally, move the blk_queue_merge_bvec function into the 'if' as
      the ->split_io setting in dm-raid456 has the same effect.
      Signed-off-by: NeilBrown <neilb@suse.de>
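The read-ahead point in 4a5add49 comes down to guarding every use of the queue. The following user-space sketch shows that pattern only. `sketch_mddev`, `sketch_queue` and `sketch_tune_readahead` are hypothetical, and the target of two stripes' worth of pages is an assumption made for illustration, not a value taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's request_queue or mddev. */
struct sketch_queue {
    unsigned long ra_pages;  /* current read-ahead window, in pages */
};

struct sketch_mddev {
    struct sketch_queue *queue;  /* NULL when running under a dm target */
};

/*
 * Raise the read-ahead window to at least two stripes (assumed target).
 * Returns 1 if the window was changed, 0 if it was left alone.
 */
int sketch_tune_readahead(struct sketch_mddev *mddev, unsigned long stripe_pages)
{
    assert(mddev != NULL);

    if (mddev->queue == NULL)
        return 0;  /* no queue: read-ahead is someone else's job */

    if (mddev->queue->ra_pages < 2 * stripe_pages) {
        mddev->queue->ra_pages = 2 * stripe_pages;
        return 1;
    }
    return 0;
}
```

Keeping all the queue accesses behind one NULL check is what lets the same setup path serve both a normal md array and one without a queue.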
    • md: add support for raising dm events. · 768a418d
      NeilBrown authored
      dm uses scheduled work to raise events to user-space.
      So allow md devices to have work_structs and schedule them on an error.
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md: export various start/stop interfaces · 390ee602
      NeilBrown authored
      Export entry points for starting and stopping md arrays.
      This will be used by a module to make md/raid5 work under
      dm.
      Also stop calling md_stop_writes from md_stop, as that won't
      work well with dm - it will want to call the two separately.
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md: split out md_rdev_init · e8bb9a83
      NeilBrown authored
      This functionality will be needed separately in a subsequent patch, so
      split it into its own exported function.
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md: be more careful setting MD_CHANGE_CLEAN · 676e42d8
      NeilBrown authored
      When MD_CHANGE_CLEAN is set we might block in md_write_start.
      So we should only set it when fairly sure that something will clear
      it.
      
      There are two places where it is set so as to encourage a metadata
      update to record the progress of resync/recovery.  This should only
      be done if the internal metadata update mechanisms are in use, which
      can be tested by inspecting '->persistent'.
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md/raid5: ensure we create a unique name for kmem_cache when mddev has no gendisk · f4be6b43
      NeilBrown authored
      We will shortly allow md devices with no gendisk (they are attached to
      a dm-target instead).  That will cause mdname() to return 'mdX'.
      There is one place where mdname really needs to be unique: when
      creating the name for a slab cache.
      So in that case, if there is no gendisk, use the address of the mddev
      formatted in HEX to provide a unique name.
      Signed-off-by: NeilBrown <neilb@suse.de>
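The naming rule from f4be6b43 can be sketched in user space as follows. `sketch_mddev` and `sketch_cache_name` are hypothetical, and the "raid<level>-" prefix is an assumption chosen for illustration. The point shown is only the fallback: when there is no disk name to rely on, format the object's own address in hex, which is unique among live objects.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for an md device. */
struct sketch_mddev {
    const char *disk_name;  /* NULL when the array has no gendisk */
};

/*
 * Build a cache name into buf.  Uses the disk name when there is one,
 * otherwise the address of the mddev in hex.  Returns what snprintf returns.
 */
int sketch_cache_name(char *buf, size_t len, int level,
                      const struct sketch_mddev *mddev)
{
    assert(buf != NULL && len > 0 && mddev != NULL);

    if (mddev->disk_name != NULL)
        return snprintf(buf, len, "raid%d-%s", level, mddev->disk_name);

    return snprintf(buf, len, "raid%d-%" PRIxPTR, level, (uintptr_t)mddev);
}
```

Two arrays without a gendisk would otherwise both produce the same generic name, while their addresses necessarily differ for as long as both exist.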
  2. 21 Jul, 2010 2 commits
  3. 19 Jul, 2010 9 commits
  4. 18 Jul, 2010 5 commits
  5. 16 Jul, 2010 9 commits
  6. 15 Jul, 2010 1 commit
    • jbd2/ocfs2: Fix block checksumming when a buffer is used in several transactions · 13ceef09
      Jan Kara authored
      OCFS2 uses the t_commit trigger to compute and store the checksum of
      just-committed blocks. When a buffer has b_frozen_data, the checksum is
      computed for it instead of b_data, but this can result in an old checksum
      being written to the filesystem in the following scenario:
      
      1) transaction1 is opened
      2) handle1 is opened
      3) journal_access(handle1, bh)
          - This sets jh->b_transaction to transaction1
      4) modify(bh)
      5) journal_dirty(handle1, bh)
      6) handle1 is closed
      7) start committing transaction1, opening transaction2
      8) handle2 is opened
      9) journal_access(handle2, bh)
          - This copies off b_frozen_data to make it safe for transaction1 to commit.
            jh->b_next_transaction is set to transaction2.
      10) jbd2_journal_write_metadata() checksums b_frozen_data
      11) the journal correctly writes b_frozen_data to the disk journal
      12) handle2 is closed
          - There was no dirty call for the bh on handle2, so it is never queued for
            any more journal operation
      13) Checkpointing finally happens, and it just spools the bh via normal buffer
          writeback.  This will write b_data, which was never triggered on and thus
          contains a wrong (old) checksum.
      
      This patch fixes the problem by calling the trigger at the moment data is
      frozen for journal commit - i.e., either when b_frozen_data is created by
      do_get_write_access or just before we write a buffer to the log if
      b_frozen_data does not exist. We also rename the trigger to t_frozen as
      that better describes when it is called.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
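The scenario in 13ceef09 can be replayed with a toy user-space model. Nothing here is jbd2 code: `sketch_buf`, `sketch_trigger` and the one-byte checksum are hypothetical, and the sketch assumes that the fixed behaviour runs the trigger on the live data just before the frozen copy is taken. The flag `fire_at_freeze` selects between the old timing (trigger at commit, on whatever is written to the journal) and the new timing (trigger when the data is frozen).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BLOCK 16 /* 15 payload bytes followed by 1 checksum byte */

struct sketch_buf {
    uint8_t data[SKETCH_BLOCK];   /* stands in for b_data */
    uint8_t frozen[SKETCH_BLOCK]; /* stands in for b_frozen_data */
    int has_frozen;
};

/* Payload byte sum, the toy checksum used by this sketch. */
static uint8_t sketch_sum(const uint8_t *block)
{
    uint8_t sum = 0;
    for (int i = 0; i < SKETCH_BLOCK - 1; i++)
        sum = (uint8_t)(sum + block[i]);
    return sum;
}

/* Toy trigger: store the checksum in the last byte of the block. */
void sketch_trigger(uint8_t *block)
{
    block[SKETCH_BLOCK - 1] = sketch_sum(block);
}

int sketch_checksum_ok(const uint8_t *block)
{
    return block[SKETCH_BLOCK - 1] == sketch_sum(block);
}

/* Step 9: copy the live data aside for the committing transaction. */
void sketch_freeze(struct sketch_buf *b, int fire_at_freeze)
{
    if (fire_at_freeze)
        sketch_trigger(b->data);  /* assumed new timing: before the copy */
    memcpy(b->frozen, b->data, SKETCH_BLOCK);
    b->has_frozen = 1;
}

/* Steps 10-11: pick the block that goes to the journal. */
const uint8_t *sketch_commit(struct sketch_buf *b, int fire_at_freeze)
{
    uint8_t *out = b->has_frozen ? b->frozen : b->data;

    /* Old timing always fires here; new timing only if nothing was frozen. */
    if (!fire_at_freeze || !b->has_frozen)
        sketch_trigger(out);
    return out;
}

/* Returns 1 if the block written at checkpoint time has a valid checksum. */
int sketch_run_scenario(int fire_at_freeze)
{
    struct sketch_buf b;
    const uint8_t *journal;

    memset(&b, 0, sizeof b);
    sketch_trigger(b.data);             /* block starts out consistent */
    b.data[0] = 42;                     /* step 4: modify(bh) */
    sketch_freeze(&b, fire_at_freeze);  /* step 9 */
    journal = sketch_commit(&b, fire_at_freeze);
    assert(sketch_checksum_ok(journal)); /* the journal copy is fine either way */
    return sketch_checksum_ok(b.data);   /* step 13: checkpoint writes b_data */
}
```

In this model `sketch_run_scenario(0)` leaves the live data with a stale checksum, while `sketch_run_scenario(1)` leaves both the journal copy and the live data consistent.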