1. 29 Oct, 2016 3 commits
    • NeilBrown's avatar
      md: be careful not lot leak internal curr_resync value into metadata. -- (all) · 1217e1d1
      NeilBrown authored
      mddev->curr_resync usually records where the current resync is up to,
      but during the starting phase it has some "magic" values.
      
       1 - means that the array is trying to start a resync, but has yielded
           to another array which shares physical devices, and also needs to
           start a resync
       2 - means the array is trying to start resync, but has found another
           array which shares physical devices and has already started resync.
      
       3 - means that resync has commensed, but it is possible that nothing
           has actually been resynced yet.
      
      It is important that this value not be visible to user-space and
      particularly that it doesn't get written to the metadata, as the
      resync or recovery checkpoint.  In part, this is because it may be
      slightly higher than the correct value, though this is very rare.
      In part, because it is not a multiple of 4K, and some devices only
      support 4K aligned accesses.
      
      There are two places where this value is propagates into either
      ->curr_resync_completed or ->recovery_cp or ->recovery_offset.
      These currently avoid the propagation of values 1 and 3, but will
      allow 3 to leak through.
      
      Change them to only propagate the value if it is > 3.
      
      As this can cause an array to fail, the patch is suitable for -stable.
      
      Cc: stable@vger.kernel.org (v3.7+)
      Reported-by: default avatarViswesh <viswesh.vichu@gmail.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.com>
      Signed-off-by: default avatarShaohua Li <shli@fb.com>
      1217e1d1
    • Tomasz Majchrzak's avatar
      raid1: handle read error also in readonly mode · 7449f699
      Tomasz Majchrzak authored
      If write is the first operation on a disk and it happens not to be
      aligned to page size, block layer sends read request first. If read
      operation fails, the disk is set as failed as no attempt to fix the
      error is made because array is in auto-readonly mode. Similarily, the
      disk is set as failed for read-only array.
      
      Take the same approach as in raid10. Don't fail the disk if array is in
      readonly or auto-readonly mode. Try to redirect the request first and if
      unsuccessful, return a read error.
      Signed-off-by: default avatarTomasz Majchrzak <tomasz.majchrzak@intel.com>
      Signed-off-by: default avatarShaohua Li <shli@fb.com>
      7449f699
    • Shaohua Li's avatar
      raid5-cache: correct condition for empty metadata write · 9a8b27fa
      Shaohua Li authored
      As long as we recover one metadata block, we should write the empty metadata
      write. The original code could make recovery corrupted if only one meta is
      valid.
      Reported-by: default avatarZhengyuan Liu <liuzhengyuan@kylinos.cn>
      Signed-off-by: default avatarShaohua Li <shli@fb.com>
      9a8b27fa
  2. 24 Oct, 2016 6 commits
    • Tomasz Majchrzak's avatar
      md: report 'write_pending' state when array in sync · 16f88949
      Tomasz Majchrzak authored
      If there is a bad block on a disk and there is a recovery performed from
      this disk, the same bad block is reported for a new disk. It involves
      setting MD_CHANGE_PENDING flag in rdev_set_badblocks. For external
      metadata this flag is not being cleared as array state is reported as
      'clean'. The read request to bad block in RAID5 array gets stuck as it
      is waiting for a flag to be cleared - as per commit c3cce6cd
      ("md/raid5: ensure device failure recorded before write request
      returns.").
      
      The meaning of MD_CHANGE_PENDING and MD_CHANGE_CLEAN flags has been
      clarified in commit 070dc6dd ("md: resolve confusion of
      MD_CHANGE_CLEAN"), however MD_CHANGE_PENDING flag has been used in
      personality error handlers since and it doesn't fully comply with
      initial purpose. It was supposed to notify that write request is about
      to start, however now it is also used to request metadata update.
      Initially (in md_allow_write, md_write_start) MD_CHANGE_PENDING flag has
      been set and in_sync has been set to 0 at the same time. Error handlers
      just set the flag without modifying in_sync value. Sysfs array state is
      a single value so now it reports 'clean' when MD_CHANGE_PENDING flag is
      set and in_sync is set to 1. Userspace has no idea it is expected to
      take some action.
      
      Swap the order that array state is checked so 'write_pending' is
      reported ahead of 'clean' ('write_pending' is a misleading name but it
      is too late to rename it now).
      Signed-off-by: default avatarTomasz Majchrzak <tomasz.majchrzak@intel.com>
      Signed-off-by: default avatarShaohua Li <shli@fb.com>
      16f88949
    • Zhengyuan Liu's avatar
      md/raid5: write an empty meta-block when creating log super-block · 56056c2e
      Zhengyuan Liu authored
      If superblock points to an invalid meta block, r5l_load_log will set
      create_super with true and create an new superblock, this runtime path
      would always happen if we do no writing I/O to this array since it was
      created. Writing an empty meta block could avoid this unnecessary
      action at the first time we created log superblock.
      
      Another reason is for the corretness of log recovery. Currently we have
      bellow code to guarantee log revocery to be correct.
      
              if (ctx.seq > log->last_cp_seq + 1) {
                      int ret;
      
                      ret = r5l_log_write_empty_meta_block(log, ctx.pos, ctx.seq + 10);
                      if (ret)
                              return ret;
                      log->seq = ctx.seq + 11;
                      log->log_start = r5l_ring_add(log, ctx.pos, BLOCK_SECTORS);
                      r5l_write_super(log, ctx.pos);
              } else {
                      log->log_start = ctx.pos;
                      log->seq = ctx.seq;
              }
      
      If we just created a array with a journal device, log->log_start and
      log->last_checkpoint should all be 0, then we write three meta block
      which are valid except mid one and supposed crash happened. The ctx.seq
      would equal to log->last_cp_seq + 1 and log->log_start would be set to
      position of mid invalid meta block after we did a recovery, this will
      lead to problems which could be avoided with this patch.
      Signed-off-by: default avatarZhengyuan Liu <liuzhengyuan@kylinos.cn>
      Signed-off-by: default avatarShaohua Li <shli@fb.com>
      56056c2e
    • Zhengyuan Liu's avatar
      md/raid5: initialize next_checkpoint field before use · 28cd88e2
      Zhengyuan Liu authored
      No initial operation was done to this field when we
      load/recovery the log, it got assignment only when IO
      to raid disk was finished. So r5l_quiesce may use wrong
      next_checkpoint to reclaim log space, that would make
      reclaimable space calculation confused.
      Signed-off-by: default avatarZhengyuan Liu <liuzhengyuan@kylinos.cn>
      Signed-off-by: default avatarShaohua Li <shli@fb.com>
      28cd88e2
    • Shaohua Li's avatar
      RAID10: ignore discard error · 579ed34f
      Shaohua Li authored
      This is the counterpart of raid10 fix. If a write error occurs, raid10
      will try to rewrite the bio in small chunk size. If the rewrite fails,
      raid10 will record the error in bad block. narrow_write_error will
      always use WRITE for the bio, but actually it could be a discard. Since
      discard bio hasn't payload, write the bio will cause different issues.
      But discard error isn't fatal, we can safely ignore it. This is what
      this patch does.
      
      This issue should exist since discard is added, but only exposed with
      recent arbitrary bio size feature.
      
      Cc: Sitsofe Wheeler <sitsofe@gmail.com>
      Cc: stable@vger.kernel.org (v3.6)
      Signed-off-by: default avatarShaohua Li <shli@fb.com>
      579ed34f
    • Shaohua Li's avatar
      RAID1: ignore discard error · e3f948cd
      Shaohua Li authored
      If a write error occurs, raid1 will try to rewrite the bio in small
      chunk size. If the rewrite fails, raid1 will record the error in bad
      block. narrow_write_error will always use WRITE for the bio, but
      actually it could be a discard. Since discard bio hasn't payload, write
      the bio will cause different issues. But discard error isn't fatal, we
      can safely ignore it. This is what this patch does.
      
      This issue should exist since discard is added, but only exposed with
      recent arbitrary bio size feature.
      Reported-and-tested-by: default avatarSitsofe Wheeler <sitsofe@gmail.com>
      Cc: stable@vger.kernel.org (v3.6)
      Signed-off-by: default avatarShaohua Li <shli@fb.com>
      e3f948cd
    • Linus Torvalds's avatar
      Linux 4.9-rc2 · 07d9a380
      Linus Torvalds authored
      07d9a380
  3. 23 Oct, 2016 5 commits
    • Linus Torvalds's avatar
      Merge tag 'upstream-4.9-rc2' of git://git.infradead.org/linux-ubifs · 5ff93abc
      Linus Torvalds authored
      Pull UBI[FS] fixes from Richard Weinberger:
       "This contains fixes for issues in both UBI and UBIFS:
      
         - Fallout from the merge window, refactoring UBI code introduced some
           issues.
      
         - Fixes for an UBIFS readdir bug which can cause getdents() to busy
           loop for ever and a bug in the UBIFS xattr code"
      
      * tag 'upstream-4.9-rc2' of git://git.infradead.org/linux-ubifs:
        ubifs: Abort readdir upon error
        UBI: Fix crash in try_recover_peb()
        ubi: fix swapped arguments to call to ubi_alloc_aeb
        ubifs: Fix xattr_names length in exit paths
        ubifs: Rename ubifs_rename2
      5ff93abc
    • Linus Torvalds's avatar
      Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · c761923c
      Linus Torvalds authored
      Pull ext4 fixes from Ted Ts'o:
       "A few bug fixes and add some missing KERN_CONT annotations"
      
      * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        ext4: add missing KERN_CONT to a few more debugging uses
        fscrypto: lock inode while setting encryption policy
        ext4: correct endianness conversion in __xattr_check_inode()
        fscrypto: make XTS tweak initialization endian-independent
        ext4: do not advertise encryption support when disabled
        jbd2: fix incorrect unlock on j_list_lock
        ext4: super.c: Update logging style using KERN_CONT
      c761923c
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending · a55da8a0
      Linus Torvalds authored
      Pull SCSI target fixes from Nicholas Bellinger:
       "Here are the outstanding target-pending fixes for v4.9-rc2.
      
        This includes:
      
         - Fix v4.1.y+ reference leak regression with concurrent TMR
           ABORT_TASK + session shutdown. (Vaibhav Tandon)
      
         - Enable tcm_fc w/ SCF_USE_CPUID to avoid host exchange timeouts
           (Hannes)
      
         - target/user error sense handling fixes. (Andy + MNC + HCH)
      
         - Fix iscsi-target NOP_OUT error path iscsi_cmd descriptor leak
           (Varun)
      
         - Two EXTENDED_COPY SCSI status fixes for ESX VAAI (Dinesh Israni +
           Nixon Vincent)
      
         - Revert a v4.8 residual overflow change, that breaks sg_inq with
           small allocation lengths.
      
        There are a number of folks stress testing the v4.1.y regression fix
        in their environments, and more folks doing iser-target I/O stress
        testing atop recent v4.x.y code.
      
        There is also one v4.2.y+ RCU conversion regression related to
        explicit NodeACL configfs changes, that is still being tracked down"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
        target/tcm_fc: use CPU affinity for responses
        target/tcm_fc: Update debugging statements to match libfc usage
        target/tcm_fc: return detailed error in ft_sess_create()
        target/tcm_fc: print command pointer in debug message
        target: fix potential race window in target_sess_cmd_list_waiting()
        Revert "target: Fix residual overflow handling in target_complete_cmd_with_length"
        target: Don't override EXTENDED_COPY xcopy_pt_cmd SCSI status code
        target: Make EXTENDED_COPY 0xe4 failure return COPY TARGET DEVICE NOT REACHABLE
        target: Re-add missing SCF_ACK_KREF assignment in v4.1.y
        iscsi-target: fix iscsi cmd leak
        iscsi-target: fix spelling mistake "Unsolicitied" -> "Unsolicited"
        target/user: Fix comments to not refer to data ring
        target/user: Return an error if cmd data size is too large
        target/user: Use sense_reason_t in tcmu_queue_cmd_ring
      a55da8a0
    • Linus Torvalds's avatar
      Merge tag 'hwmon-for-linus-v4.9-rc2' of... · e6995f22
      Linus Torvalds authored
      Merge tag 'hwmon-for-linus-v4.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
      
      Pull hwmon fixes from Guenter Roeck:
       "Couple of hwmon fixes:
      
        Fix a potential ERR_PTR dereference in max31790 driver, and handle
        temperature readings below 0 in adm9240 driver"
      
      * tag 'hwmon-for-linus-v4.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
        hwmon: (max31790) potential ERR_PTR dereference
        hwmon: (adm9240) handle temperature readings below 0
      e6995f22
    • Linus Torvalds's avatar
      Merge tag 'for-linus-4.9-2' of git://git.code.sf.net/p/openipmi/linux-ipmi · 5766e9d2
      Linus Torvalds authored
      Pull IPMI updates from Corey Minyard:
       "A small bug fix and a new driver for acting as an IPMI device.
      
        I was on vacation during the merge window (a long vacation) but this
        is a bug fix that should go in and a new driver that shouldn't hurt
        anything.
      
        This has been in linux-next for a month or so"
      
      * tag 'for-linus-4.9-2' of git://git.code.sf.net/p/openipmi/linux-ipmi:
        ipmi: fix crash on reading version from proc after unregisted bmc
        ipmi/bt-bmc: remove redundant return value check of platform_get_resource()
        ipmi/bt-bmc: add a dependency on ARCH_ASPEED
        ipmi: Fix ioremap error handling in bt-bmc
        ipmi: add an Aspeed BT IPMI BMC driver
      5766e9d2
  4. 22 Oct, 2016 9 commits
    • Linus Torvalds's avatar
      Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 0c2b6dc4
      Linus Torvalds authored
      Pull timer updates from Thomas Gleixner:
       "This updates contains:
      
         - A revert which addresses a boot failure on ARM Sun5i platforms
      
         - A new clocksource driver, which has been delayed beyond rc1 due to
           an interrupt driver issue which was unearthed by this driver. The
           debugging of that issue and the discussion about the proper
           solution made this driver miss the merge window. There is no point
           in delaying it for a full cycle as it completes the basic mainline
           support for the new JCore platform and does not create any risk
           outside of that platform"
      
      * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        Revert "clocksource/drivers/timer_sun5i: Replace code by clocksource_mmio_init"
        clocksource: Add J-Core timer/clocksource driver
        of: Add J-Core timer bindings
      0c2b6dc4
    • Linus Torvalds's avatar
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3e9679a3
      Linus Torvalds authored
      Pull x86 fixes from Ingo Molnar:
       "Three fixes, a hw-enablement and a cross-arch fix/enablement change:
      
         - SGI/UV fix for older platforms
      
         - x32 signal handling fix
      
         - older x86 platform bootup APIC fix
      
         - AVX512-4VNNIW (Neural Network Instructions) and AVX512-4FMAPS
           (Multiply Accumulation Single precision instructions) enablement.
      
         - move thread_info back into x86 specific code, to make life easier
           for other architectures trying to make use of
           CONFIG_THREAD_INFO_IN_TASK_STRUCT=y"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/boot/smp: Don't try to poke disabled/non-existent APIC
        sched/core, x86: Make struct thread_info arch specific again
        x86/signal: Remove bogus user_64bit_mode() check from sigaction_compat_abi()
        x86/platform/UV: Fix support for EFI_OLD_MEMMAP after BIOS callback updates
        x86/cpufeature: Add AVX512_4VNNIW and AVX512_4FMAPS features
        x86/vmware: Skip timer_irq_works() check on VMware
      3e9679a3
    • Linus Torvalds's avatar
      Merge branch 'mm-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 86c5bf71
      Linus Torvalds authored
      Pull vmap stack fixes from Ingo Molnar:
       "This is fallout from CONFIG_HAVE_ARCH_VMAP_STACK=y on x86: stack
        accesses that used to be just somewhat questionable are now totally
        buggy.
      
        These changes try to do it without breaking the ABI: the fields are
        left there, they are just reporting zero, or reporting narrower
        information (the maps file change)"
      
      * 'mm-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        mm: Change vm_is_stack_for_task() to vm_is_stack_for_current()
        fs/proc: Stop trying to report thread stacks
        fs/proc: Stop reporting eip and esp in /proc/PID/stat
        mm/numa: Remove duplicated include from mprotect.c
      86c5bf71
    • Linus Torvalds's avatar
      Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · bfb7bfef
      Linus Torvalds authored
      Pull irq fixes from Ingo Molnar:
       "Mostly irqchip driver fixes, plus a symbol export"
      
      * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        kernel/irq: Export irq_set_parent()
        irqchip/gic: Add missing \n to CPU IF adjustment message
        irqchip/jcore: Don't show Kconfig menu item for driver
        irqchip/eznps: Drop pointless static qualifier in nps400_of_init()
        irqchip/gic-v3-its: Fix entry size mask for GITS_BASER
        irqchip/gic-v3-its: Fix 64bit GIC{R,ITS}_TYPER accesses
      bfb7bfef
    • Linus Torvalds's avatar
      Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 90e01058
      Linus Torvalds authored
      Pull EFI fixes from Ingo Molnar:
       "Add Ard Biesheuvel as EFI co-maintainer, plus fix an ARM build bug
        with older toolchains"
      
      * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        efi/arm: Fix absolute relocation detection for older toolchains
        MAINTAINERS: Add myself as EFI maintainer
      90e01058
    • Ville Syrjälä's avatar
      x86/boot/smp: Don't try to poke disabled/non-existent APIC · ff856051
      Ville Syrjälä authored
      Apparently trying to poke a disabled or non-existent APIC
      leads to a box that doesn't even boot. Let's not do that.
      
      No real clue if this is the right fix, but at least my
      P3 machine boots again.
      Signed-off-by: default avatarVille Syrjälä <ville.syrjala@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: dyoung@redhat.com
      Cc: kexec@lists.infradead.org
      Cc: stable@vger.kernel.org
      Fixes: 2a51fe08 ("arch/x86: Handle non enumerated CPU after physical hotplug")
      Link: http://lkml.kernel.org/r/1477102684-5092-1-git-send-email-ville.syrjala@linux.intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      ff856051
    • Linus Torvalds's avatar
      Merge tag 'powerpc-4.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · dcd4693c
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
       "Fixes marked for stable:
         - Prevent unlikely crash in copro_calculate_slb() (Frederic Barrat)
         - cxl: Prevent adapter reset if an active context exists (Vaibhav Jain)
      
        Fixes for code merged this cycle:
         - Fix boot on systems with uncompressed kernel image (Heiner Kallweit)
         - Drop dump_numa_memory_topology() (Michael Ellerman)
         - Fix numa topology console print (Aneesh Kumar K.V)
         - Ignore the pkey system calls for now (Stephen Rothwell)"
      
      * tag 'powerpc-4.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc: Ignore the pkey system calls for now
        powerpc: Fix numa topology console print
        powerpc/mm: Drop dump_numa_memory_topology()
        cxl: Prevent adapter reset if an active context exists
        powerpc/boot: Fix boot on systems with uncompressed kernel image
        powerpc/mm: Prevent unlikely crash in copro_calculate_slb()
      dcd4693c
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · a23b27ae
      Linus Torvalds authored
      Pull KVM fixes from Radim Krčmář:
       "ARM:
         - avoid livelock when walking guest page tables
         - fix HYP mode static keys without CC_HAVE_ASM_GOTO
      
        MIPS:
         - fix a build error without TRACEPOINTS_ENABLED
      
        s390:
         - reject a malformed userspace configuration
      
        x86:
         - suppress a warning without CONFIG_CPU_FREQ
         - initialize whole irq_eoi array"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        arm/arm64: KVM: Map the BSS at HYP
        arm64: KVM: Take S1 walks into account when determining S2 write faults
        KVM: s390: reject invalid modes for runtime instrumentation
        kvm: x86: memset whole irq_eoi
        kvm/x86: Fix unused variable warning in kvm_timer_init()
        KVM: MIPS: Add missing uaccess.h include
      a23b27ae
    • Linus Torvalds's avatar
      Merge tag 'nfs-for-4.9-2' of git://git.linux-nfs.org/projects/anna/linux-nfs · 02593ac6
      Linus Torvalds authored
      Pull NFS client bugfixes from Anna Schumaker:
       "Just two bugfixes this time:
      
        Stable bugfix:
         - Fix last_write_offset incorrectly set to page boundary
      
        Other bugfix:
         - Fix missing-braces warning"
      
      * tag 'nfs-for-4.9-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
        nfs4: fix missing-braces warning
        pnfs/blocklayout: fix last_write_offset incorrectly set to page boundary
      02593ac6
  5. 21 Oct, 2016 17 commits