1. 31 Aug, 2015 31 commits
    • md/raid5: ensure device failure recorded before write request returns. · c3cce6cd
      NeilBrown authored
      When a write to one of the devices of a RAID5/6 fails, the failure is
      recorded in the metadata of the other devices so that after a restart
      the data on the failed drive won't be trusted even if that drive seems
      to be working again (maybe a cable was unplugged).
      
      Similarly when we record a bad-block in response to a write failure,
      we must not let the write complete until the bad-block update is safe.
      
      Currently there is no interlock between the write request completing
      and the metadata update.  So it is possible that the write will
      complete, the app will confirm success in some way, and then the
      machine will crash before the metadata update completes.
      
      This is an extremely small hole for a race to fit in, but it is
      theoretically possible and so should be closed.
      
      So:
       - set MD_CHANGE_PENDING when requesting a metadata update for a
         failed device, so we can know with certainty when it completes
       - queue requests that completed when MD_CHANGE_PENDING is set to
         only be processed after the metadata update completes
       - call raid_end_bio_io() on bios in that queue when the time comes.
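      [ed: a minimal sketch of the interlock described above; the names
      (conf->return_bi, MD_CHANGE_PENDING, raid_end_bio_io()) follow the
      md code, but the function shape here is illustrative, not the
      exact patch:]

        /* Complete bios only once the device failure is on disk. */
        static void return_io(struct r5conf *conf, struct bio_list *bl)
        {
            struct mddev *mddev = conf->mddev;
            struct bio *bi;

            if (test_bit(MD_CHANGE_PENDING, &mddev->flags)) {
                /* metadata update still in flight: park the bios */
                spin_lock_irq(&conf->device_lock);
                bio_list_merge(&conf->return_bi, bl);
                bio_list_init(bl);
                spin_unlock_irq(&conf->device_lock);
                md_wakeup_thread(mddev->thread);
                return;
            }
            while ((bi = bio_list_pop(bl)))
                raid_end_bio_io(bi);    /* failure safely recorded */
        }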
      Signed-off-by: NeilBrown <neilb@suse.com>
    • md/raid5: use bio_list for the list of bios to return. · 34a6f80e
      NeilBrown authored
      This will make it easier to splice two lists together, which will
      be needed in a future patch.
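      [ed: for reference, the bio_list API being moved to; bio_list_merge()
      is what makes the later splice cheap.  handle_completed_bio() is a
      hypothetical completion hook, not an md function:]

        /* Splice list b onto the tail of list a, then drain a. */
        static void splice_and_drain(struct bio_list *a, struct bio_list *b)
        {
            struct bio *bi;

            bio_list_merge(a, b);       /* splice in O(1) */
            bio_list_init(b);           /* b is now empty */
            while ((bi = bio_list_pop(a)))
                handle_completed_bio(bi);   /* hypothetical */
        }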
      Signed-off-by: NeilBrown <neilb@suse.com>
    • md/raid10: ensure device failure recorded before write request returns. · 95af587e
      NeilBrown authored
      When a write to one of the legs of a RAID10 fails, the failure is
      recorded in the metadata of the other legs so that after a restart
      the data on the failed drive won't be trusted even if that drive seems
      to be working again (maybe a cable was unplugged).
      
      Currently there is no interlock between the write request completing
      and the metadata update.  So it is possible that the write will
      complete, the app will confirm success in some way, and then the
      machine will crash before the metadata update completes.
      
      This is an extremely small hole for a race to fit in, but it is
      theoretically possible and so should be closed.
      
      So:
       - set MD_CHANGE_PENDING when requesting a metadata update for a
         failed device, so we can know with certainty when it completes
       - queue requests that experienced an error on a new queue which
         is only processed after the metadata update completes
       - call raid_end_bio_io() on bios in that queue when the time comes.
      Signed-off-by: NeilBrown <neilb@suse.com>
    • md/raid1: ensure device failure recorded before write request returns. · 55ce74d4
      NeilBrown authored
      When a write to one of the legs of a RAID1 fails, the failure is
      recorded in the metadata of the other leg(s) so that after a restart
      the data on the failed drive won't be trusted even if that drive
      seems to be working again (maybe a cable was unplugged).
      
      Similarly when we record a bad-block in response to a write failure,
      we must not let the write complete until the bad-block update is safe.
      
      Currently there is no interlock between the write request completing
      and the metadata update.  So it is possible that the write will
      complete, the app will confirm success in some way, and then the
      machine will crash before the metadata update completes.
      
      This is an extremely small hole for a race to fit in, but it is
      theoretically possible and so should be closed.
      
      So:
       - set MD_CHANGE_PENDING when requesting a metadata update for a
         failed device, so we can know with certainty when it completes
       - queue requests that experienced an error on a new queue which
         is only processed after the metadata update completes
       - call raid_end_bio_io() on bios in that queue when the time comes.
      Signed-off-by: NeilBrown <neilb@suse.com>
    • md-cluster: remove inappropriate try_module_get from join() · 18b9f679
      NeilBrown authored
      md_setup_cluster already calls try_module_get(), so this
      try_module_get isn't needed.
      Also, there is no matching module_put (except in the error path),
      so this leaves an unbalanced module count.
      Signed-off-by: NeilBrown <neilb@suse.com>
    • md: extend spinlock protection in register_md_cluster_operations · 6022e75b
      NeilBrown authored
      This code looks racy.
      
      The only possible race is if two modules try to register at the same
      time and that won't happen.  But make the code look safe anyway.
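      [ed: roughly the post-patch shape - the NULL check now sits inside
      the same pers_lock section as the assignment:]

        int register_md_cluster_operations(struct md_cluster_operations *ops,
                                           struct module *module)
        {
            int ret = 0;

            spin_lock(&pers_lock);
            if (md_cluster_ops != NULL)
                ret = -EALREADY;        /* checked under the lock */
            else {
                md_cluster_ops = ops;
                md_cluster_mod = module;
            }
            spin_unlock(&pers_lock);
            return ret;
        }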
      Signed-off-by: NeilBrown <neilb@suse.com>
    • md-cluster: Read the disk bitmap sb and check if it needs recovery · abb9b22a
      Guoqing Jiang authored
      In gather_all_resync_info, we need to read the disk bitmap sb and
      check if it needs recovery.
      Reviewed-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
      Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
      Signed-off-by: NeilBrown <neilb@suse.com>
    • md-cluster: only call complete(&cinfo->completion) when a node joins the cluster · eece075c
      Guoqing Jiang authored
      Introduce the MD_CLUSTER_BEGIN_JOIN_CLUSTER flag to make sure
      complete(&cinfo->completion) is only invoked when a node joins
      the cluster.  Otherwise a node failure could also call the
      complete, and it doesn't make sense to do it then.
      Reviewed-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
      Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
      Signed-off-by: NeilBrown <neilb@suse.com>
    • md-cluster: add missed lockres_free · 6e6d9f2c
      Guoqing Jiang authored
      We also need to free the lock resource before the 'goto out'.
      Reviewed-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
      Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
      Signed-off-by: NeilBrown <neilb@suse.com>
    • md-cluster: remove the unused sb_lock · b2b9bfff
      Guoqing Jiang authored
      The sb_lock is not used anywhere, so let's remove it.
      Reviewed-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
      Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
      Signed-off-by: NeilBrown <neilb@suse.com>
    • md-cluster: init suspend_list and suspend_lock early in join · 9e3072e3
      Guoqing Jiang authored
      If a node has just joined the cluster and receives a message from
      other nodes before suspend_list is initialized, the kernel will
      crash due to a NULL pointer dereference, so move the initializations
      earlier to fix the bug.
      
      md-cluster: Joined cluster 3578507b-e0cb-6d4f-6322-696cd7b1b10c slot 3
      BUG: unable to handle kernel NULL pointer dereference at           (null)
      ... ... ...
      Call Trace:
      [<ffffffffa0444924>] process_recvd_msg+0x2e4/0x330 [md_cluster]
      [<ffffffffa0444a06>] recv_daemon+0x96/0x170 [md_cluster]
      [<ffffffffa045189d>] md_thread+0x11d/0x170 [md_mod]
      [<ffffffff810768c4>] kthread+0xb4/0xc0
      [<ffffffff8151927c>] ret_from_fork+0x7c/0xb0
      ... ... ...
      RIP  [<ffffffffa0443581>] __remove_suspend_info+0x11/0xa0 [md_cluster]
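      [ed: the fix in outline - do the initialisations before anything
      that can deliver a message (simplified from join() in md-cluster.c):]

        cinfo = kzalloc(sizeof(*cinfo), GFP_KERNEL);
        if (!cinfo)
            return -ENOMEM;
        INIT_LIST_HEAD(&cinfo->suspend_list);   /* moved earlier */
        spin_lock_init(&cinfo->suspend_lock);   /* moved earlier */
        /* only now join the lockspace and start the recv daemon */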
      Reviewed-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
      Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
      Signed-off-by: NeilBrown <neilb@suse.com>
    • md-cluster: add an error check for failure to get the dlm lock · b5ef5678
      Guoqing Jiang authored
      In a complicated cluster environment it is possible that getting
      or converting the dlm lock fails; the related error information
      is added to make debugging such issues easier.

      For lockres_free, if the lock is blocked by a lock request or
      conversion request, then dlm_unlock just puts it back on the grant
      queue, so we need to ensure the lock is finally freed.
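      [ed: a sketch of the strengthened lockres_free();
      DLM_LKF_FORCEUNLOCK plus the wait for the ast is the point,
      details are simplified:]

        static void lockres_free(struct dlm_lock_resource *res)
        {
            int ret;

            if (!res)
                return;
            /* cancel any pending request or conversion as well */
            ret = dlm_unlock(res->ls, res->lksb.sb_lkid,
                             DLM_LKF_FORCEUNLOCK, &res->lksb, res);
            if (unlikely(ret != 0))
                pr_err("failed to unlock %s: %d\n", res->name, ret);
            else
                wait_for_completion(&res->completion);
            kfree(res->name);
            kfree(res);
        }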
      Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
      Signed-off-by: NeilBrown <neilb@suse.com>
    • md-cluster: init completion within lockres_init · b83d51c0
      Guoqing Jiang authored
      We should init the completion within lockres_init; otherwise the
      completion could be initialized more than once during its life
      cycle.
      Reviewed-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
      Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
      Signed-off-by: NeilBrown <neilb@suse.com>
    • md-cluster: fix deadlock issue on message lock · 66099bb0
      Guoqing Jiang authored
      There is a problem with the previous communication mechanism: we hit
      the deadlock scenario below on a cluster with 3 nodes.
      
      Sender                     Receiver                  Receiver

      token(EX)
      message(EX)
      writes message
      downconverts message(CR)
      requests ack(EX)
                                 gets message(CR)          gets message(CR)
                                 reads message             reads message
                                 requests EX on message    requests EX on message
      
      To fix this problem, we do the following changes:
      
      1. the sender downconverts MESSAGE to CW rather than CR.
      2. the receiver requests a PR lock, not an EX lock, on MESSAGE.
      
      And in case we fail to down-convert EX to CW on MESSAGE, it is
      better to unlock MESSAGE rather than keep holding the lock.
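      [ed: the new lock modes, sketched with md-cluster's dlm_lock_sync()
      wrapper (error handling omitted):]

        /* sender: down-convert MESSAGE to CW (was CR); if this fails,
         * the patch unlocks MESSAGE rather than keep holding EX */
        error = dlm_lock_sync(cinfo->message_lockres, DLM_LOCK_CW);

        /* receiver: take PR (was EX) on MESSAGE */
        error = dlm_lock_sync(cinfo->message_lockres, DLM_LOCK_PR);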
      Reviewed-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
      Signed-off-by: Lidong Zhong <ldzhong@suse.com>
      Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
      Signed-off-by: NeilBrown <neilb@suse.com>
    • md-cluster: transfer the resync ownership to another node · dc737d7c
      Guoqing Jiang authored
      When node A stops an array while the array is doing a resync, we need
      to let another node B take over the resync task.
      
      To achieve this, node A sends an explicit BITMAP_NEEDS_SYNC message
      to the cluster, and node B, on receiving that message, invokes
      __recover_slot to do the resync.
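      [ed: roughly what the hand-off looks like; the cluster_msg field
      names here are assumptions modelled on md-cluster.c:]

        struct cluster_msg cmsg;

        cmsg.type = cpu_to_le32(BITMAP_NEEDS_SYNC);
        cmsg.slot = cpu_to_le32(cinfo->slot_number); /* assumed field */
        cmsg.low  = cpu_to_le64(mddev->recovery_cp); /* resync position */
        err = sendmsg(cinfo, &cmsg);    /* broadcast to other nodes */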
      Reviewed-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
      Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
      Signed-off-by: NeilBrown <neilb@suse.com>
    • md-cluster: split recover_slot for future code reuse · 05cd0e51
      Guoqing Jiang authored
      Make recover_slot a wrapper around __recover_slot, since the logic
      of __recover_slot can be reused when other nodes need to take over
      the resync job.
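      [ed: the shape of the split, simplified - __recover_slot() holds
      the reusable logic, recover_slot() keeps the DLM callback
      signature:]

        static void __recover_slot(struct mddev *mddev, int slot)
        {
            struct md_cluster_info *cinfo = mddev->cluster_info;

            set_bit(slot, &cinfo->recovery_map);
            /* ... create/wake the recovery thread for that slot ... */
        }

        static void recover_slot(void *arg, struct dlm_slot *slot)
        {
            struct mddev *mddev = arg;

            __recover_slot(mddev, slot->slot - 1); /* dlm slots are 1-based */
        }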
      Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
      Signed-off-by: NeilBrown <neilb@suse.com>
    • md-cluster: use %pU to print UUIDs · b89f704a
      Guoqing Jiang authored
      Reviewed-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
      Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
      Signed-off-by: NeilBrown <neilb@suse.com>
    • md: set up safemode_timer before it is used · 25b2edfa
      Sasha Levin authored
      We used to set up the safemode_timer timer in md_run.  If md_run
      failed before the timer was set up, we'd end up trying to modify a
      timer that doesn't have a callback function when safe_delay_store
      is accessed, which would trigger a BUG.
      
      neilb: delete init_timer() call as setup_timer() does that.
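      [ed: the fix in one line - initialise the timer where the mddev is
      set up, so it always has a callback before safe_delay_store() can
      mod_timer() it:]

        setup_timer(&mddev->safemode_timer, md_safemode_timeout,
                    (unsigned long)mddev);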
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: NeilBrown <neilb@suse.com>
    • md/raid5: handle possible race as reshape completes. · 6cbd8148
      NeilBrown authored
      It is possible (though unlikely) for a reshape to be
      interrupted between the time that end_reshape is called
      and the time when raid5_finish_reshape is called.
      
      This can leave conf->reshape_progress set to MaxSector, but
      mddev->reshape_position not updated.
      
      This combination confuses reshape_request() when ->reshape_backwards
      is set.
      As conf->reshape_progress is so high, it seems the reshape hasn't
      really begun.  But assuming MaxSector is a valid address only
      leads to sorrow.
      
      So ensure reshape_position and reshape_progress both agree,
      and add an extra check in reshape_request() just in case they don't.
      Signed-off-by: NeilBrown <neilb@suse.com>
    • md: ensure sync_completed has correct value as recovery finishes. · 5ed1df2e
      NeilBrown authored
      There can be a small window between the moment that recovery
      actually writes the last block and the time when various sysfs
      and /proc/mdstat attributes report that it has finished.
      During this time, 'sync_completed' can have the wrong value.
      This can confuse monitoring software.
      
      So:
       - don't set curr_resync_completed beyond the end of the devices,
       - set it correctly when resync/recovery has completed.
      Signed-off-by: NeilBrown <neilb@suse.com>
    • md: be careful when testing resync_max against curr_resync_completed. · c5e19d90
      NeilBrown authored
      While it generally shouldn't happen, it is not impossible for
      curr_resync_completed to exceed resync_max.
      This can particularly happen when reshaping RAID5 - the current
      status isn't copied to curr_resync_completed promptly, so when it
      is, it can exceed resync_max.
      This happens when the reshape is 'frozen', resync_max is set low,
      and reshape is re-enabled.
      
      Taking a difference between two unsigned numbers is always dangerous
      anyway, so add a test to behave correctly if
         curr_resync_completed > resync_max
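      [ed: why the test matters - sector_t is unsigned, so the bare
      difference wraps around; the guarded version looks like:]

        sector_t gap;

        if (mddev->curr_resync_completed > mddev->resync_max)
            gap = 0;    /* would otherwise wrap to a huge value */
        else
            gap = mddev->resync_max - mddev->curr_resync_completed;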
      Signed-off-by: NeilBrown <neilb@suse.com>
    • md: set MD_RECOVERY_RECOVER when starting a degraded array. · a4a3d26d
      NeilBrown authored
      This ensures that 'sync_action' will show 'recover' as soon as the
      array is started.  If there is no spare, the status will change to
      'idle' once that is detected.
      
      Clear MD_RECOVERY_RECOVER for a read-only array to ensure this change
      happens.
      
      This allows scripts which monitor status not to get confused -
      particularly my test scripts.
      Signed-off-by: NeilBrown <neilb@suse.com>
    • md/raid5: remove incorrect "min_t()" when calculating writepos. · c74c0d76
      NeilBrown authored
      This code is calculating:
        writepos, which is the furthest along address (device-space) that we
           *will* be writing to
        readpos, which is the earliest address that we *could* possibly read
           from, and
        safepos, which is the earliest address in the 'old' section that we
           might read from after a crash when the reshape position is
           recovered from metadata.
      
        The first is a precise calculation, so clipping at zero doesn't
        make sense.  As the reshape position is now guaranteed to always be
        a multiple of reshape_sectors and as we already BUG_ON when
        reshape_progress is zero, there is no point in this min_t() call.
      
        The readpos and safepos are worst-case estimates - the actual value
        depends on the precise geometry.  That worst case could be negative,
        which is only a problem because we are storing the value in an
        unsigned variable.
        So leave the min_t() for those.
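      [ed: the calculation after the change, condensed from
      reshape_request(); writepos loses its clamp while readpos and
      safepos keep theirs:]

        writepos = conf->reshape_progress;
        sector_div(writepos, new_data_disks);
        readpos = conf->reshape_progress;
        sector_div(readpos, data_disks);
        safepos = conf->reshape_safe;
        sector_div(safepos, data_disks);
        if (mddev->reshape_backwards) {
            BUG_ON(writepos < reshape_sectors);
            writepos -= reshape_sectors;    /* exact - no min_t() */
            readpos += reshape_sectors;
            safepos += reshape_sectors;
        } else {
            writepos += reshape_sectors;
            readpos -= min_t(sector_t, reshape_sectors, readpos);
            safepos -= min_t(sector_t, reshape_sectors, safepos);
        }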
      Signed-off-by: NeilBrown <neilb@suse.com>
    • md/raid5: strengthen check on reshape_position at run. · 05256d98
      NeilBrown authored
      When reshaping, we work in units of the largest chunk size.
      If changing from a larger to a smaller chunk size, that means we
      reshape more than one stripe at a time.  So the required alignment
      of reshape_position needs to take into account both the old
      and new chunk size.
      
      This means that both 'here_new' and 'here_old' are calculated with
      respect to the same (maximum) chunk size, so testing if they are the
      same when delta_disks is zero becomes pointless.
      Signed-off-by: NeilBrown <neilb@suse.com>
    • md/raid5: switch to use conf->chunk_sectors in place of mddev->chunk_sectors where possible · 3cb5edf4
      NeilBrown authored
      The chunk_sectors and new_chunk_sectors fields of mddev can be changed
      any time (via sysfs) that the reconfig mutex can be taken.  So raid5
      keeps internal copies in 'conf' which are stable except for a short
      locked moment when reshape stops/starts.
      
      So any access that does not hold reconfig_mutex should use the 'conf'
      values, not the 'mddev' values.
      Several don't.
      
      This could result in corruption if new values were written at awkward
      times.
      
      Also use min() or max() rather than open-coding.
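      [ed: typical of the conversions - a boundary check that must hold
      for both the old and new geometry takes the safe bound from the
      stable copies in conf:]

        int safe_chunk = min(conf->chunk_sectors, conf->prev_chunk_sectors);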
      Signed-off-by: NeilBrown <neilb@suse.com>
    • md/raid5: always set conf->prev_chunk_sectors and ->prev_algo · 5cac6bcb
      NeilBrown authored
      These aren't really needed when no reshape is happening,
      but it is safer to have them always set to a meaningful value.
      The next patch will use ->prev_chunk_sectors without checking
      if a reshape is happening (because that makes the code simpler),
      and this patch makes that safe.
      Signed-off-by: NeilBrown <neilb@suse.com>
    • md/raid10: fix a few typos in comments · 02ec5026
      NeilBrown authored
      Signed-off-by: NeilBrown <neilb@suse.com>
    • md/raid5: consider updating reshape_position at start of reshape. · 92140480
      NeilBrown authored
      md/raid5 only updates ->reshape_position (which is stored in
      metadata and is authoritative) occasionally, but particularly
      when getting close to ->resync_max, as it must be correct when
      ->resync_max is reached.
      
      When mdadm tries to stop an array which is reshaping it will:
       - freeze the reshape,
       - set resync_max to where the reshape has reached.
       - unfreeze the reshape.
      When this happens, the reshape is aborted and then restarted.
      
      The restart doesn't check that resync_max is close, and so doesn't
      update ->reshape_position like it should.
      This results in the reshape stopping, but ->reshape_position being
      incorrect.
      
      So on that first call to reshape_request, make sure ->reshape_position
      is updated if needed.
      Signed-off-by: NeilBrown <neilb@suse.com>
    • md: close some races between setting and checking sync_action. · 985ca973
      NeilBrown authored
      When checking sync_action in a script, we want to be sure it is
      as accurate as possible.
      As resync/reshape etc don't always start immediately (a separate
      thread is scheduled to do it), it is best if 'action_show'
      checks whether MD_RECOVERY_NEEDED is set (which it does) and in
      that case reports what is likely to start soon (which it only
      sometimes does).
      
      So:
       - report 'reshape' if reshape_position suggests one might start.
       - set MD_RECOVERY_RECOVER in raid1_reshape(), because that is very
         likely to happen next.
      Signed-off-by: NeilBrown <neilb@suse.com>
    • md: Keep /proc/mdstat reporting recovery until fully DONE. · f7851be7
      NeilBrown authored
      Currently when a recovery completes, mdstat shows that it has finished
      before the new device is marked as a full member.  Because of this it
      can appear to a script that the recovery finished but the array isn't
      in sync.
      
      So while MD_RECOVERY_DONE is still set, keep mdstat reporting "recovery".
      Once md_reap_sync_thread() completes, the spare will be active and then
      MD_RECOVERY_DONE will be cleared.
      
      To ensure this is race-free, set MD_RECOVERY_DONE before clearing
      curr_resync.
      Signed-off-by: NeilBrown <neilb@suse.com>
    • md/raid6: delta syndrome for ARM NEON · 0e833e69
      Ard Biesheuvel authored
      This implements XOR syndrome calculation using NEON intrinsics.
      As before, the module can be built for ARM and arm64 from the
      same source.
      
      Relative performance on a Cortex-A57 based system:
      
        raid6: int64x1  gen()   905 MB/s
        raid6: int64x1  xor()   881 MB/s
        raid6: int64x2  gen()  1343 MB/s
        raid6: int64x2  xor()  1286 MB/s
        raid6: int64x4  gen()  1896 MB/s
        raid6: int64x4  xor()  1321 MB/s
        raid6: int64x8  gen()  1773 MB/s
        raid6: int64x8  xor()  1165 MB/s
        raid6: neonx1   gen()  1834 MB/s
        raid6: neonx1   xor()  1278 MB/s
        raid6: neonx2   gen()  2528 MB/s
        raid6: neonx2   xor()  1942 MB/s
        raid6: neonx4   gen()  2888 MB/s
        raid6: neonx4   xor()  2334 MB/s
        raid6: neonx8   gen()  2957 MB/s
        raid6: neonx8   xor()  2232 MB/s
        raid6: using algorithm neonx8 gen() 2957 MB/s
        raid6: .... xor() 2232 MB/s, rmw enabled
      
      Cc: Markus Stockhausen <stockhausen@collogia.de>
      Cc: Neil Brown <neilb@suse.de>
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NeilBrown <neilb@suse.com>
  2. 03 Aug, 2015 7 commits
    • md/raid0: update queue parameter in a safer location. · 199dc6ed
      NeilBrown authored
      When a RAID5 array (for example) is reshaped to RAID0, the updating
      of queue parameters (e.g. max number of sectors per bio) is
      done in the wrong place.
      It should be part of ->run, but it is actually part of ->takeover.
      This means it happens before level_store() calls:
      
      	blk_set_stacking_limits(&mddev->queue->limits);
      
      and so is ineffective.  This can lead to errors from underlying
      devices.
      
      So move all the relevant settings out of create_stripe_zones()
      and into raid0_run().
      
      As this can lead to a BUG_ON, it is suitable for any -stable
      kernel which supports reshape to RAID0.  So 2.6.35 or later.
      As the bug has been present for five years there is no urgency,
      so no need to rush into -stable.
      
      Fixes: 9af204cf ("md: Add support for Raid5->Raid0 and Raid10->Raid0 takeover")
      Cc: stable@vger.kernel.org (v2.6.35+ - please delay until after -final release).
      Reported-by: Yi Zhang <yizhan@redhat.com>
      Signed-off-by: NeilBrown <neilb@suse.com>
    • md: simplify get_bitmap_file now that "file" is zeroed. · 25eafe1a
      Benjamin Randazzo authored
      There is no point assigning '\0' to file->pathname[0] as
      file is now zeroed out, so remove that branch and
      simplify the code.
      
      [Original patch combined this with the change to use
       kzalloc.  I split the two so that the change to kzalloc
       is easier to backport. - neilb]
      Signed-off-by: Benjamin Randazzo <benjamin@randazzo.fr>
      Signed-off-by: NeilBrown <neilb@suse.com>
    • md/raid5: don't let shrink_slab shrink too far. · 49895bcc
      NeilBrown authored
      I have a report of drop_one_stripe() called from
      raid5_cache_scan() apparently finding ->max_nr_stripes == 0.
      
      This should not be allowed.
      
      So add a test to keep max_nr_stripes above min_nr_stripes.
      
      Also use a 'mask' rather than a 'mod' in drop_one_stripe
      to ensure 'hash' is valid even if max_nr_stripes does reach zero.
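      [ed: the post-patch shape, simplified; STRIPE_HASH_LOCKS_MASK is
      NR_STRIPE_HASH_LOCKS - 1, a power-of-two mask:]

        static unsigned long raid5_cache_scan(struct shrinker *shrink,
                                              struct shrink_control *sc)
        {
            struct r5conf *conf = container_of(shrink, struct r5conf,
                                               shrinker);
            unsigned long ret = 0;

            while (ret < sc->nr_to_scan &&
                   conf->max_nr_stripes > conf->min_nr_stripes) {
                if (drop_one_stripe(conf) == 0)
                    return SHRINK_STOP;
                ret++;
            }
            return ret;
        }

        /* and in drop_one_stripe(): a mask, not a mod */
        int hash = (conf->max_nr_stripes - 1) & STRIPE_HASH_LOCKS_MASK;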
      
      
      Fixes: edbe83ab ("md/raid5: allow the stripe_cache to grow and shrink.")
      Cc: stable@vger.kernel.org (4.1 - please release with 2d5b569b)
      Reported-by: Tomas Papan <tomas.papan@gmail.com>
      Signed-off-by: NeilBrown <neilb@suse.com>
    • md: use kzalloc() when bitmap is disabled · b6878d9e
      Benjamin Randazzo authored
      In drivers/md/md.c get_bitmap_file() uses kmalloc() for creating a
      mdu_bitmap_file_t called "file".
      
      5769         file = kmalloc(sizeof(*file), GFP_NOIO);
      5770         if (!file)
      5771                 return -ENOMEM;
      
      This structure is copied to user space at the end of the function.
      
      5786         if (err == 0 &&
      5787             copy_to_user(arg, file, sizeof(*file)))
      5788                 err = -EFAULT
      
      But if the bitmap is disabled, only the first byte of "file" is
      initialized with zero, so it's possible to read some bytes (up to
      4095) of kernel memory from user space.  This is an information leak.
      
      5775         /* bitmap disabled, zero the first byte and copy out */
      5776         if (!mddev->bitmap_info.file)
      5777                 file->pathname[0] = '\0';
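      [ed: the fix is simply to zero the whole allocation:]

        file = kzalloc(sizeof(*file), GFP_NOIO);
        if (!file)
            return -ENOMEM;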
      Signed-off-by: Benjamin Randazzo <benjamin@randazzo.fr>
      Signed-off-by: NeilBrown <neilb@suse.com>
    • md/raid1: extend spinlock to protect raid1_end_read_request against inconsistencies · 423f04d6
      NeilBrown authored
      raid1_end_read_request() assumes that the In_sync bits are consistent
      with the ->degraded count.
      raid1_spare_active() updates the In_sync bit before the ->degraded
      count and so exposes an inconsistency, as does error().
      So extend the spinlock in raid1_spare_active() and error() to hide
      those inconsistencies.
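      [ed: the idea, simplified from inside raid1_spare_active() - flip
      In_sync and adjust ->degraded in one locked section so readers
      never see them disagree (the real function also handles
      replacement devices):]

        spin_lock_irqsave(&conf->device_lock, flags);   /* extended */
        for (i = 0; i < conf->raid_disks; i++) {
            struct md_rdev *rdev = conf->mirrors[i].rdev;

            if (rdev && !test_bit(Faulty, &rdev->flags) &&
                !test_and_set_bit(In_sync, &rdev->flags))
                count++;
        }
        mddev->degraded -= count;
        spin_unlock_irqrestore(&conf->device_lock, flags);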
      
      This should probably be part of
        Commit: 34cab6f4 ("md/raid1: fix test for 'was read error from
        last working device'.")
      as it addresses the same issue.  It fixes the same bug and should go
      to -stable for the same reasons.
      
      Fixes: 76073054 ("md/raid1: clean up read_balance.")
      Cc: stable@vger.kernel.org (v3.0+)
      Signed-off-by: NeilBrown <neilb@suse.com>
    • Linux 4.2-rc5 · 74d33293
      Linus Torvalds authored
    • Merge tag 'powerpc-4.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · d08c3181
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
       - TCE table memory calculation fix from Alexey
       - Build fix for ans-lcd from Luis
       - Unbalanced IRQ warning fix from Alistair
      
      * tag 'powerpc-4.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/eeh-powernv: Fix unbalanced IRQ warning
        macintosh/ans-lcd: fix build failure after module_init/exit relocation
        powerpc/powernv/ioda2: Fix calculation for memory allocated for TCE table
  3. 02 Aug, 2015 2 commits
    • i915: temporary fix for DP MST docking station NULL pointer dereference · 27667f47
      Linus Torvalds authored
      Ted Ts'o reports that his Lenovo T540p ThinkPad crashes at boot if
      attached to the docking station.  This is a regression that he was able
      to bisect to commit 8c7b5ccb ("drm/i915: Use atomic helpers for
      computing changed flags").
      
      The reason seems to be the new call to drm_atomic_helper_check_modeset()
      added to intel_modeset_compute_config(), which in turn calls
      update_connector_routing(), and somehow ends up picking a NULL crtc for
      the connector state, causing the subsequent drm_crtc_index() to OOPS.
      
      Daniel Vetter says that the fundamental issue seems to be confusion in
      the encoder selection, and this isn't the right fix, but while he chases
      down the proper fix, this at least avoids the NULL pointer dereference
      and makes Ted's docking station work again.
      Reported-bisected-and-tested-by: Theodore Ts'o <tytso@mit.edu>
      Cc: Daniel Vetter <daniel.vetter@intel.com>
      Cc: Jani Nikula <jani.nikula@linux.intel.com>
      Cc: Dave Airlie <airlied@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · d4edea40
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "A set of three fixes for the ipr driver and one fairly major one for
        memory leaks in the mq path of SCSI"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: fix memory leak with scsi-mq
        ipr: Fix invalid array indexing for HRRQ
        ipr: Fix incorrect trace indexing
        ipr: Fix locking for unit attention handling