1. 14 Feb, 2012 1 commit
    • md/raid10: fix handling of error on last working device in array. · fae8cc5e
      NeilBrown authored
      If we get a read error on the last working device in a RAID10 which
      contains the target block, then we don't fail the device (which is
      good) but we don't abort retries, which is wrong.
      We end up in an infinite loop retrying the read on the one device.
      
      This patch fixes the problem in two places:
      1/ in raid10_end_read_request we don't even ask for a retry if this
         was the last usable device.  This is efficient but a little racy
         and will sometimes retry when it should not.
      
      2/ in handle_read_error we are careful to exclude any device from
         retry which we tried to mark as faulty (that might have failed if
         it was the last device).  This is race-free but less efficient.
      Signed-off-by: NeilBrown <neilb@suse.de>
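
      The second fix above can be illustrated with a small user-space model.
      This is only a sketch of the idea, not the kernel code: `struct copy`,
      `try_mark_faulty`, `read_block` and `MAX_COPIES` are hypothetical names
      invented here, and the real logic lives in handle_read_error. The point
      it shows is that a device is excluded from retry even when the attempt
      to mark it faulty is refused, so the loop must terminate.

      ```c
      #include <stdbool.h>
      #include <stddef.h>

      #define MAX_COPIES 8  /* hypothetical upper bound for this sketch */

      /* Hypothetical model of one device holding a copy of the target block. */
      struct copy {
          bool faulty;      /* device has been failed out of the array */
          bool read_fails;  /* simulated: every read from this copy errors */
      };

      /* Count the copies that have not been marked faulty. */
      static size_t working_copies(const struct copy *c, size_t n)
      {
          size_t count = 0;
          for (size_t i = 0; i < n; i++)
              if (!c[i].faulty)
                  count++;
          return count;
      }

      /* Try to fail a device; refuse if it is the last working one. */
      static bool try_mark_faulty(struct copy *c, size_t n, size_t idx)
      {
          if (working_copies(c, n) <= 1)
              return false;
          c[idx].faulty = true;
          return true;
      }

      /*
       * Read the block, retrying on other copies after an error.
       * Returns the index of the copy that served the read, or -1 if the
       * read was aborted. *attempts receives the number of reads issued.
       */
      static int read_block(struct copy *c, size_t n, unsigned *attempts)
      {
          bool excluded[MAX_COPIES] = { false };

          *attempts = 0;
          if (n > MAX_COPIES)
              return -1;

          for (;;) {
              int pick = -1;
              for (size_t i = 0; i < n; i++) {
                  if (!c[i].faulty && !excluded[i]) {
                      pick = (int)i;
                      break;
                  }
              }
              if (pick < 0)
                  return -1;  /* nothing left to retry on: abort the read */

              (*attempts)++;
              if (!c[pick].read_fails)
                  return pick;

              /* The mark may be refused for the last device; exclude anyway. */
              (void)try_mark_faulty(c, n, (size_t)pick);
              excluded[pick] = true;
          }
      }
      ```

      With two copies that both return read errors, this model issues exactly
      two reads and then aborts, leaving the second device unfailed. Without
      the `excluded` bookkeeping the second device would be picked forever.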
  2. 13 Feb, 2012 1 commit
  3. 07 Feb, 2012 1 commit
    • md: two small fixes to handling interrupt resync. · db91ff55
      NeilBrown authored
      1/ If a resync is aborted we should record how far we got
       (recovery_cp) as the last request that we know has completed
       (->curr_resync_completed) rather than the last request that was
       submitted (->curr_resync).
      
      2/ When a resync aborts we still want to update the metadata with
       any changes, so set MD_CHANGE_DEVS even if we 'skip'.
      Signed-off-by: NeilBrown <neilb@suse.de>
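
      The two fixes can be pictured with a simplified user-space model. This
      is a sketch under assumptions, not the md code: `struct resync_state`,
      `abort_resync` and `SKETCH_CHANGE_DEVS` (a stand-in for MD_CHANGE_DEVS)
      are hypothetical, and treating 'skip' as "leave the checkpoint alone"
      is an assumption made for illustration.

      ```c
      #include <stdbool.h>
      #include <stdint.h>

      #define SKETCH_CHANGE_DEVS (1u << 0)  /* stand-in for MD_CHANGE_DEVS */

      /* Hypothetical, reduced view of the resync bookkeeping. */
      struct resync_state {
          uint64_t curr_resync;            /* last request submitted */
          uint64_t curr_resync_completed;  /* last request known complete */
          uint64_t recovery_cp;            /* checkpoint kept in metadata */
          unsigned flags;                  /* pending metadata updates */
      };

      /* Handle an aborted resync. */
      static void abort_resync(struct resync_state *s, bool skip)
      {
          if (!skip) {
              /*
               * Fix 1: checkpoint at what is known complete, not at what
               * was merely submitted and may still be in flight.
               */
              s->recovery_cp = s->curr_resync_completed;
          }
          /* Fix 2: request a metadata update even on the skip path. */
          s->flags |= SKETCH_CHANGE_DEVS;
      }
      ```

      Checkpointing at curr_resync instead would let a later resync start
      past requests that never reached the disks.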
  4. 30 Jan, 2012 1 commit
    • Prevent DM RAID from loading bitmap twice. · 34f8ac6d
      Jonathan Brassow authored
      The life cycle of a device-mapper target is:
      1) create
      2) resume
      3) suspend
      *) possibly repeat from 2
      4) destroy
      
      The dm-raid target unconditionally calls MD's bitmap_load function upon
      every resume.  If steps 2 & 3 above are repeated, bitmap_load is called
      multiple times.  It is written to be called only once; on a repeat call it
      allocates new memory for the bitmap (without freeing the old) and increments
      the number of pages it thinks it has without zeroing the count first.  This
      ultimately leads to access beyond allocated memory and to leaked memory.
      
      Simply avoiding the bitmap_load call upon resume is not sufficient.  If the
      target was suspended while the initial recovery was only partially complete,
      it needs to be restarted when the target is resumed.  This is why
      'md_wakeup_thread' is called before issuing the 'mddev_resume'.
      Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
      Signed-off-by: NeilBrown <neilb@suse.de>
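
      The life cycle described above suggests a load-once guard. The following
      user-space model is one possible arrangement, not the dm-raid source:
      `struct target`, its counters and the helper functions are hypothetical
      stand-ins for the target state, bitmap_load and md_wakeup_thread, and
      waking the thread only on later resumes is an assumption.

      ```c
      #include <stdbool.h>

      /* Hypothetical model of a target instance. */
      struct target {
          bool bitmap_loaded;       /* set once the bitmap has been loaded */
          bool suspended;
          unsigned bitmap_loads;    /* counts simulated bitmap_load calls */
          unsigned thread_wakeups;  /* counts simulated md_wakeup_thread calls */
      };

      static void load_bitmap(struct target *t) { t->bitmap_loads++; }
      static void wake_thread(struct target *t) { t->thread_wakeups++; }

      /* Step 2: resume. The bitmap is loaded on the first resume only. */
      static void target_resume(struct target *t)
      {
          if (!t->bitmap_loaded) {
              load_bitmap(t);
              t->bitmap_loaded = true;
          } else {
              /* Later resumes: wake the thread so recovery can restart. */
              wake_thread(t);
          }
          t->suspended = false;
      }

      /* Step 3: suspend. */
      static void target_suspend(struct target *t)
      {
          t->suspended = true;
      }
      ```

      Repeating the resume/suspend cycle any number of times leaves
      `bitmap_loads` at 1, while each later resume still wakes the thread so a
      partially complete recovery can continue.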
  5. 10 Jan, 2012 2 commits
  6. 22 Dec, 2011 34 commits