1. 15 Jan, 2015 40 commits
    • Richard Guy Briggs's avatar
      audit: restore AUDIT_LOGINUID unset ABI · e147a5d3
      Richard Guy Briggs authored
      commit 041d7b98 upstream.
      
      A regression was caused by commit 780a7654:
      	 audit: Make testing for a valid loginuid explicit.
      (which in turn attempted to fix a regression caused by e1760bd5)
      
      When audit_krule_to_data() fills in the rules to get a listing, there was a
      missing clause to convert back from AUDIT_LOGINUID_SET to AUDIT_LOGINUID.
      
      This broke userspace by not returning the same information that was sent and
      expected.
      
      The rule:
      	auditctl -a exit,never -F auid=-1
      gives:
      	auditctl -l
      		LIST_RULES: exit,never f24=0 syscall=all
      when it should give:
      		LIST_RULES: exit,never auid=-1 (0xffffffff) syscall=all
      
      Tag it so that it is reported the same way it was set.  Create a new
      private flags audit_krule field (pflags) to store it that won't interact with
      the public one from the API.
      Signed-off-by: default avatarRichard Guy Briggs <rgb@redhat.com>
      Signed-off-by: default avatarPaul Moore <pmoore@redhat.com>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      e147a5d3
    • Lorenzo Pieralisi's avatar
      arm64: kernel: fix __cpu_suspend mm switch on warm-boot · b3ce059c
      Lorenzo Pieralisi authored
      commit f43c2718 upstream.
      
      On arm64 the TTBR0_EL1 register is set to either the reserved TTBR0
      page tables on boot or to the active_mm mappings belonging to user space
      processes, it must never be set to swapper_pg_dir page tables mappings.
      
      When a CPU is booted its active_mm is set to init_mm even though its
      TTBR0_EL1 points at the reserved TTBR0 page mappings. This implies
      that when __cpu_suspend is triggered the active_mm can point at
      init_mm even if the current TTBR0_EL1 register contains the reserved
      TTBR0_EL1 mappings.
      
      Therefore, the mm save and restore executed in __cpu_suspend might
      turn out to be erroneous in that, if the current->active_mm corresponds
      to init_mm, on resume from low power it ends up restoring in the
      TTBR0_EL1 the init_mm mappings that are global and can cause speculation
      of TLB entries which end up being propagated to user space.
      
      This patch fixes the issue by checking the active_mm pointer before
      restoring the TTBR0 mappings. If the current active_mm == &init_mm,
      the code sets the TTBR0_EL1 to the reserved TTBR0 mapping instead of
      switching back to the active_mm, which is the expected behaviour
      corresponding to the TTBR0_EL1 settings when __cpu_suspend was entered.
      
      Fixes: 95322526 ("arm64: kernel: cpu_{suspend/resume} implementation")
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      b3ce059c
    • Laura Abbott's avatar
      arm64: Move cpu_resume into the text section · 109b9186
      Laura Abbott authored
      commit c3684fbb upstream.
      
      The function cpu_resume currently lives in the .data section.
      There's no reason for it to be there since we can use relative
      instructions without a problem. Move a few cpu_resume data
      structures out of the assembly file so the .data annotation
      can be dropped completely and cpu_resume ends up in the read
      only text section.
      Reviewed-by: default avatarKees Cook <keescook@chromium.org>
      Reviewed-by: default avatarMark Rutland <mark.rutland@arm.com>
      Reviewed-by: default avatarLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Tested-by: default avatarMark Rutland <mark.rutland@arm.com>
      Tested-by: default avatarLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Tested-by: default avatarKees Cook <keescook@chromium.org>
      Acked-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: default avatarLaura Abbott <lauraa@codeaurora.org>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      [ luis: 3.16-stable prereq for
        f43c2718 "arm64: kernel: fix __cpu_suspend mm switch on warm-boot" ]
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      109b9186
    • Lorenzo Pieralisi's avatar
      arm64: kernel: refactor the CPU suspend API for retention states · 0fb957f3
      Lorenzo Pieralisi authored
      commit 714f5992 upstream.
      
      CPU suspend is the standard kernel interface to be used to enter
      low-power states on ARM64 systems. Current cpu_suspend implementation
      by default assumes that all low power states are losing the CPU context,
      so the CPU registers must be saved and cleaned to DRAM upon state
      entry. Furthermore, the current cpu_suspend() implementation assumes
      that if the CPU suspend back-end method returns when called, this has
      to be considered an error regardless of the return code (which can be
      successful) since the CPU was not expected to return from a code path that
      is different from cpu_resume code path - eg returning from the reset vector.
      
      All in all this means that the current API does not cope well with low-power
      states that preserve the CPU context when entered (ie retention states),
      since first of all the context is saved for nothing on state entry for
      those states and a successful state entry can return as a normal function
      return, which is considered an error by the current CPU suspend
      implementation.
      
      This patch refactors the cpu_suspend() API so that it can be split in
      two separate functionalities. The arm64 cpu_suspend API just provides
      a wrapper around CPU suspend operation hook. A new function is
      introduced (for architecture code use only) for states that require
      context saving upon entry:
      
      __cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
      
      __cpu_suspend() saves the context on function entry and calls the
      so called suspend finisher (ie fn) to complete the suspend operation.
      The finisher is not expected to return, unless it fails in which case
      the error is propagated back to the __cpu_suspend caller.
      
      The API refactoring results in the following pseudo code call sequence for a
      suspending CPU, when triggered from a kernel subsystem:
      
      /*
       * int cpu_suspend(unsigned long idx)
       * @idx: idle state index
       */
      {
      -> cpu_suspend(idx)
      	|---> CPU operations suspend hook called, if present
      		|--> if (retention_state)
      			|--> direct suspend back-end call (eg PSCI suspend)
      		     else
      			|--> __cpu_suspend(idx, &back_end_finisher);
      }
      
      By refactoring the cpu_suspend API this way, the CPU operations back-end
      has a chance to detect whether idle states require state saving or not
      and can call the required suspend operations accordingly either through
      simple function call or indirectly through __cpu_suspend() which carries out
      state saving and suspend finisher dispatching to complete idle state entry.
      Reviewed-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Reviewed-by: default avatarHanjun Guo <hanjun.guo@linaro.org>
      Signed-off-by: default avatarLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      [ luis: 3.16-stable prereq for
        f43c2718 "arm64: kernel: fix __cpu_suspend mm switch on warm-boot" ]
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      0fb957f3
    • Lorenzo Pieralisi's avatar
      arm64: kernel: add missing __init section marker to cpu_suspend_init · 6b4a586a
      Lorenzo Pieralisi authored
      commit 18ab7db6 upstream.
      
      Suspend init function must be marked as __init, since it is not needed
      after the kernel has booted. This patch moves the cpu_suspend_init()
      function to the __init section.
      Signed-off-by: default avatarLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      [ luis: 3.16-stable prereq for
        f43c2718 "arm64: kernel: fix __cpu_suspend mm switch on warm-boot" ]
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      6b4a586a
    • Richard Guy Briggs's avatar
      audit: use supplied gfp_mask from audit_buffer in kauditd_send_multicast_skb · 5755059f
      Richard Guy Briggs authored
      commit 54dc77d9 upstream.
      
      Eric Paris explains: Since kauditd_send_multicast_skb() gets called in
      audit_log_end(), which can come from any context (aka even a sleeping context)
      GFP_KERNEL can't be used.  Since the audit_buffer knows what context it should
      use, pass that down and use that.
      
      See: https://lkml.org/lkml/2014/12/16/542
      
      BUG: sleeping function called from invalid context at mm/slab.c:2849
      in_atomic(): 1, irqs_disabled(): 0, pid: 885, name: sulogin
      2 locks held by sulogin/885:
        #0:  (&sig->cred_guard_mutex){+.+.+.}, at: [<ffffffff91152e30>] prepare_bprm_creds+0x28/0x8b
        #1:  (tty_files_lock){+.+.+.}, at: [<ffffffff9123e787>] selinux_bprm_committing_creds+0x55/0x22b
      CPU: 1 PID: 885 Comm: sulogin Not tainted 3.18.0-next-20141216 #30
      Hardware name: Dell Inc. Latitude E6530/07Y85M, BIOS A15 06/20/2014
        ffff880223744f10 ffff88022410f9b8 ffffffff916ba529 0000000000000375
        ffff880223744f10 ffff88022410f9e8 ffffffff91063185 0000000000000006
        0000000000000000 0000000000000000 0000000000000000 ffff88022410fa38
      Call Trace:
        [<ffffffff916ba529>] dump_stack+0x50/0xa8
        [<ffffffff91063185>] ___might_sleep+0x1b6/0x1be
        [<ffffffff910632a6>] __might_sleep+0x119/0x128
        [<ffffffff91140720>] cache_alloc_debugcheck_before.isra.45+0x1d/0x1f
        [<ffffffff91141d81>] kmem_cache_alloc+0x43/0x1c9
        [<ffffffff914e148d>] __alloc_skb+0x42/0x1a3
        [<ffffffff914e2b62>] skb_copy+0x3e/0xa3
        [<ffffffff910c263e>] audit_log_end+0x83/0x100
        [<ffffffff9123b8d3>] ? avc_audit_pre_callback+0x103/0x103
        [<ffffffff91252a73>] common_lsm_audit+0x441/0x450
        [<ffffffff9123c163>] slow_avc_audit+0x63/0x67
        [<ffffffff9123c42c>] avc_has_perm+0xca/0xe3
        [<ffffffff9123dc2d>] inode_has_perm+0x5a/0x65
        [<ffffffff9123e7ca>] selinux_bprm_committing_creds+0x98/0x22b
        [<ffffffff91239e64>] security_bprm_committing_creds+0xe/0x10
        [<ffffffff911515e6>] install_exec_creds+0xe/0x79
        [<ffffffff911974cf>] load_elf_binary+0xe36/0x10d7
        [<ffffffff9115198e>] search_binary_handler+0x81/0x18c
        [<ffffffff91153376>] do_execveat_common.isra.31+0x4e3/0x7b7
        [<ffffffff91153669>] do_execve+0x1f/0x21
        [<ffffffff91153967>] SyS_execve+0x25/0x29
        [<ffffffff916c61a9>] stub_execve+0x69/0xa0
      Reported-by: default avatarValdis Kletnieks <Valdis.Kletnieks@vt.edu>
      Signed-off-by: default avatarRichard Guy Briggs <rgb@redhat.com>
      Tested-by: default avatarValdis Kletnieks <Valdis.Kletnieks@vt.edu>
      Signed-off-by: default avatarPaul Moore <pmoore@redhat.com>
      [ luis: backported to 3.16: adjusted context ]
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      5755059f
    • Paul Moore's avatar
      audit: don't attempt to lookup PIDs when changing PID filtering audit rules · 1d668182
      Paul Moore authored
      commit 3640dcfa upstream.
      
      Commit f1dc4867 ("audit: anchor all pid references in the initial pid
      namespace") introduced a find_vpid() call when adding/removing audit
      rules with PID/PPID filters; unfortunately this is problematic as
      find_vpid() only works if there is a task with the associated PID
      alive on the system.  The following commands demonstrate a simple
      reproducer.
      
      	# auditctl -D
      	# auditctl -l
      	# autrace /bin/true
      	# auditctl -l
      
      This patch resolves the problem by simply using the PID provided by
      the user without any additional validation, e.g. no calls to check to
      see if the task/PID exists.
      
      Cc: Richard Guy Briggs <rgb@redhat.com>
      Signed-off-by: default avatarPaul Moore <pmoore@redhat.com>
      Acked-by: default avatarEric Paris <eparis@redhat.com>
      Reviewed-by: default avatarRichard Guy Briggs <rgb@redhat.com>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      1d668182
    • Thomas Gleixner's avatar
      tick/powerclamp: Remove tick_nohz_idle abuse · 8254b3a6
      Thomas Gleixner authored
      commit a5fd9733 upstream.
      
      commit 4dbd2771 "tick: export nohz tick idle symbols for module
      use" was merged via the thermal tree without an explicit ack from the
      relevant maintainers.
      
      The exports are abused by the intel powerclamp driver which implements
      a fake idle state from a sched FIFO task. This causes all kinds of
      wreckage in the NOHZ core code which rightfully assumes that
      tick_nohz_idle_enter/exit() are only called from the idle task itself.
      
      Recent changes in the NOHZ core lead to a failure of the powerclamp
      driver and now people try to hack completely broken and backwards
      workarounds into the NOHZ core code. This is completely unacceptable
      and just papers over the real problem. There are way more subtle
      issues lurking around the corner.
      
      The real solution is to fix the powerclamp driver by rewriting it with
      a sane concept, but that's beyond the scope of this.
      
      So the only solution for now is to remove the calls into the core NOHZ
      code from the powerclamp trainwreck along with the exports.
      
      Fixes: d6d71ee4 "PM: Introduce Intel PowerClamp Driver"
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Frederic Weisbecker <frederic@kernel.org>
      Cc: Pan Jacob jun <jacob.jun.pan@intel.com>
      Cc: LKP <lkp@01.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Zhang Rui <rui.zhang@intel.com>
      Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1412181110110.17382@nanosSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      8254b3a6
    • Jiri Jaburek's avatar
      ALSA: usb-audio: extend KEF X300A FU 10 tweak to Arcam rPAC · eb2f2690
      Jiri Jaburek authored
      commit d70a1b98 upstream.
      
      The Arcam rPAC seems to have the same problem - whenever anything
      (alsamixer, udevd, 3.9+ kernel from 60af3d03, ..) attempts to
      access mixer / control interface of the card, the firmware "locks up"
      the entire device, resulting in
        SNDRV_PCM_IOCTL_HW_PARAMS failed (-5): Input/output error
      from alsa-lib.
      
      Other operating systems can somehow read the mixer (there seems to be
      playback volume/mute), but any manipulation is ignored by the device
      (which has hardware volume controls).
      Signed-off-by: default avatarJiri Jaburek <jjaburek@redhat.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      eb2f2690
    • Andy Lutomirski's avatar
      x86/tls: Don't validate lm in set_thread_area() after all · 5862abdd
      Andy Lutomirski authored
      commit 3fb2f423 upstream.
      
      It turns out that there's a lurking ABI issue.  GCC, when
      compiling this in a 32-bit program:
      
      struct user_desc desc = {
      	.entry_number    = idx,
      	.base_addr       = base,
      	.limit           = 0xfffff,
      	.seg_32bit       = 1,
      	.contents        = 0, /* Data, grow-up */
      	.read_exec_only  = 0,
      	.limit_in_pages  = 1,
      	.seg_not_present = 0,
      	.useable         = 0,
      };
      
      will leave .lm uninitialized.  This means that anything in the
      kernel that reads user_desc.lm for 32-bit tasks is unreliable.
      
      Revert the .lm check in set_thread_area().  The value never did
      anything in the first place.
      
      Fixes: 0e58af4e ("x86/tls: Disallow unusual TLS segments")
      Signed-off-by: default avatarAndy Lutomirski <luto@amacapital.net>
      Acked-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/d7875b60e28c512f6a6fc0baf5714d58e7eaadbb.1418856405.git.luto@amacapital.netSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      5862abdd
    • Thomas Petazzoni's avatar
      i2c: mv64xxx: rework offload support to fix several problems · c0f60cba
      Thomas Petazzoni authored
      commit 00d8689b upstream.
      
      Originally, the I2C controller supported by the i2c-mv64xxx driver
      requires a lot of software support: an interrupt is generated at each
      step of an I2C transaction (after the start bit, after sending the
      address, etc.) and the driver is in charge of re-programming the I2C
      controller to do the next step of the I2C transaction. This explains
      the fairly complex state machine that the driver has.
      
      On Marvell Armada XP and later processors (Armada 375, 38x, etc.), the
      I2C controller was extended with a part called the "I2C Bridge", which
      allows to offload the I2C transaction completely to the
      hardware. Initial support for this mechanism was added in commit
      930ab3d4 ("i2c: mv64xxx: Add I2C Transaction Generator support").
      
      However, the implementation done in this commit has two related
      issues, which this commit fixes by completely changing how the offload
      implementation is done:
      
       * SMBus read transfers, where there is one write to select the
         register immediately followed in the same transaction by one read,
         were making the processor hang. This was easier visible on the
         Marvell Armada XP WRT1900AC platform using a driver for an I2C LED
         controller, or on other Armada XP platforms by using a simple
         'i2cget' command to read an I2C EEPROM.
      
       * The implementation was based on the fact that the offload engine
         was re-programmed to transfer each message of an I2C xfer: this
         meant that each message sent with the offload engine was starting
         with a normal I2C start sequence. However, the I2C subsystem
         assumes that all messages belonging to the same xfer will use the
         so-called "repeated start" so that the entire I2C xfer is seen as
         one transfer by the I2C devices and cannot be interrupt by other
         I2C masters on the same bus.
      
      In fact, the "I2C Bridge" allows to offload three types of xfer:
      
       - xfer of one write message
       - xfer of one read message
       - xfer of one write message followed by one read message
      
      For all other situations, we have to fallback to not using the "I2C
      Bridge" in order to get proper I2C semantics.
      
      Therefore, this commit reworks the offload implementation to put it
      not at the message level, but at the xfer level: in the
      mv64xxx_i2c_xfer() function, we decide if the transaction can be
      offloaded (in which case it is handled by the
      mv64xxx_i2c_offload_xfer() function), or otherwise it is handled by
      the slow path (implemented in the existing mv64xxx_i2c_execute_msg()).
      
      This allows to simplify the state machine, which no longer needs to
      have any state related to the offload implementation: the offload
      implementation is now completely separated from the slow path (with
      the exception of the interrupt handler, of course).
      
      In summary:
      
       - mv64xxx_i2c_can_offload() will analyze an I2C xfer and decided of
         the "I2C Bridge" can be used to offload it or not.
      
       - mv64xxx_i2c_offload_xfer() will actually program the "I2C Bridge"
         to offload one xfer (of either one or two messages), and block
         using mv64xxx_i2c_wait_for_completion() until the xfer completes.
      
       - The interrupt handler mv64xxx_i2c_intr() is modified to push the
         offload related code to a separate function,
         mv64xxx_i2c_intr_offload(). It will take care of reading the
         received data if needed.
      
      This commit was tested on:
      
       - Armada XP OpenBlocks AX3-4 (EEPROM on I2C and RTC on I2C)
       - Armada XP WRT1900AC (LED controller on I2C)
       - Armada XP GP (EEPROM on I2C)
      
      Fixes: 930ab3d4 ("i2c: mv64xxx: Add I2C Transaction Generator support")
      Signed-off-by: default avatarThomas Petazzoni <thomas.petazzoni@free-electrons.com>
      [wsa: fixed checkpatch warnings]
      Signed-off-by: default avatarWolfram Sang <wsa@the-dreams.de>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      c0f60cba
    • Thomas Petazzoni's avatar
    • zhendong chen's avatar
      dm: fix missed error code if .end_io isn't implemented by target_type · 09e20944
      zhendong chen authored
      commit 5164bece upstream.
      
      In bio-based DM's clone_endio(), when target_type doesn't implement
      .end_io (e.g. linear) r will be always be initialized 0.  So if a
      WRITE SAME bio fails WRITE SAME will not be disabled as intended.
      
      Fix this by initializing r to error, rather than 0, in clone_endio().
      Signed-off-by: default avatarAlex Chen <alex.chen@huawei.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Fixes: 7eee4ae2 ("dm: disable WRITE SAME if it fails")
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      09e20944
    • Joe Thornber's avatar
      dm thin: fix missing out-of-data-space to write mode transition if blocks are released · e6051f0a
      Joe Thornber authored
      commit 2c43fd26 upstream.
      
      Discard bios and thin device deletion have the potential to release data
      blocks.  If the thin-pool is in out-of-data-space mode, and blocks were
      released, transition the thin-pool back to full write mode.
      
      The correct time to do this is just after the thin-pool metadata commit.
      It cannot be done before the commit because the space maps will not
      allow immediate reuse of the data blocks in case there's a rollback
      following power failure.
      Signed-off-by: default avatarJoe Thornber <ejt@redhat.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      e6051f0a
    • Joe Thornber's avatar
      dm thin: fix inability to discard blocks when in out-of-data-space mode · 3ac19d4f
      Joe Thornber authored
      commit 45ec9bd0 upstream.
      
      When the pool was in PM_OUT_OF_SPACE mode its process_prepared_discard
      function pointer was incorrectly being set to
      process_prepared_discard_passdown rather than process_prepared_discard.
      
      This incorrect function pointer meant the discard was being passed down,
      but not effecting the mapping.  As such any discard that was issued, in
      an attempt to reclaim blocks, would not successfully free data space.
      Reported-by: default avatarEric Sandeen <sandeen@redhat.com>
      Signed-off-by: default avatarJoe Thornber <ejt@redhat.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      3ac19d4f
    • Kailang Yang's avatar
      ALSA: hda/realtek - Add new Dell desktop for ALC3234 headset mode · 3a3090ca
      Kailang Yang authored
      commit 8b72415d upstream.
      
      New Dell desktop needs to support headset mode for ALC3234.
      Signed-off-by: default avatarKailang Yang <kailang@realtek.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      [ luis: backported to 3.16: adjusted context ]
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      3a3090ca
    • Chris Wilson's avatar
      drm/i915: Force the CS stall for invalidate flushes · cf0b2ec0
      Chris Wilson authored
      commit add284a3 upstream.
      
      In order to act as a full command barrier by itself, we need to tell the
      pipecontrol to actually stall the command streamer while the flush runs.
      We require the full command barrier before operations like
      MI_SET_CONTEXT, which currently rely on a prior invalidate flush.
      
      References: https://bugs.freedesktop.org/show_bug.cgi?id=83677
      Cc: Simon Farnsworth <simon@farnz.org.uk>
      Cc: Daniel Vetter <daniel@ffwll.ch>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: default avatarJani Nikula <jani.nikula@intel.com>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      cf0b2ec0
    • Chris Wilson's avatar
      drm/i915: Invalidate media caches on gen7 · faa6cf14
      Chris Wilson authored
      commit 148b83d0 upstream.
      
      In the gen7 pipe control there is an extra bit to flush the media
      caches, so let's set it during cache invalidation flushes.
      
      v2: Rename to MEDIA_STATE_CLEAR to be more inline with spec.
      
      Cc: Simon Farnsworth <simon@farnz.org.uk>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: default avatarJani Nikula <jani.nikula@intel.com>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      faa6cf14
    • Nicholas Bellinger's avatar
      iscsi-target: Fail connection on short sendmsg writes · ba957dfe
      Nicholas Bellinger authored
      commit 6bf6ca75 upstream.
      
      This patch changes iscsit_do_tx_data() to fail on short writes
      when kernel_sendmsg() returns a value different than requested
      transfer length, returning -EPIPE and thus causing a connection
      reset to occur.
      
      This avoids a potential bug in the original code where a short
      write would result in kernel_sendmsg() being called again with
      the original iovec base + length.
      
      In practice this has not been an issue because iscsit_do_tx_data()
      is only used for transferring 48 byte headers + 4 byte digests,
      along with seldom used control payloads from NOPIN + TEXT_RSP +
      REJECT with less than 32k of data.
      
      So following Al's audit of iovec consumers, go ahead and fail
      the connection on short writes for now, and remove the bogus
      logic ahead of his proper upstream fix.
      Reported-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Cc: David S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      ba957dfe
    • Long Li's avatar
      storvsc: ring buffer failures may result in I/O freeze · 63a60036
      Long Li authored
      commit e86fb5e8 upstream.
      
      When ring buffer returns an error indicating retry, storvsc may not
      return a proper error code to SCSI when bounce buffer is not used.
      This has introduced I/O freeze on RAID running atop storvsc devices.
      This patch fixes it by always returning a proper error code.
      Signed-off-by: default avatarLong Li <longli@microsoft.com>
      Reviewed-by: default avatarK. Y. Srinivasan <kys@microsoft.com>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      63a60036
    • Martin K. Petersen's avatar
      scsi: blacklist RSOC for Microsoft iSCSI target devices · e144f664
      Martin K. Petersen authored
      commit 198a956a upstream.
      
      The Microsoft iSCSI target does not support REPORT SUPPORTED OPERATION
      CODES. Blacklist these devices so we don't attempt to send the command.
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Tested-by: default avatarMike Christie <michaelc@cs.wisc.edu>
      Reported-by: jazz@deti74.ru
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      e144f664
    • Paul Mackerras's avatar
      powerpc/powernv: Switch off MMU before entering nap/sleep/rvwinkle mode · e7c25b93
      Paul Mackerras authored
      commit 8117ac6a upstream.
      
      Currently, when going idle, we set the flag indicating that we are in
      nap mode (paca->kvm_hstate.hwthread_state) and then execute the nap
      (or sleep or rvwinkle) instruction, all with the MMU on.  This is bad
      for two reasons: (a) the architecture specifies that those instructions
      must be executed with the MMU off, and in fact with only the SF, HV, ME
      and possibly RI bits set, and (b) this introduces a race, because as
      soon as we set the flag, another thread can switch the MMU to a guest
      context.  If the race is lost, this thread will typically start looping
      on relocation-on ISIs at 0xc...4400.
      
      This fixes it by setting the MSR as required by the architecture before
      setting the flag or executing the nap/sleep/rvwinkle instruction.
      
      [ shreyas@linux.vnet.ibm.com: Edited to handle LE ]
      Signed-off-by: default avatarPaul Mackerras <paulus@samba.org>
      Signed-off-by: default avatarShreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: linuxppc-dev@lists.ozlabs.org
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      e7c25b93
    • Thomas Gleixner's avatar
      genirq: Prevent proc race against freeing of irq descriptors · 5e95fa4f
      Thomas Gleixner authored
      commit c291ee62 upstream.
      
      Since the rework of the sparse interrupt code to actually free the
      unused interrupt descriptors there exists a race between the /proc
      interfaces to the irq subsystem and the code which frees the interrupt
      descriptor.
      
      CPU0				CPU1
      				show_interrupts()
      				  desc = irq_to_desc(X);
      free_desc(desc)
        remove_from_radix_tree();
        kfree(desc);
      				  raw_spinlock_irq(&desc->lock);
      
      /proc/interrupts is the only interface which can actively corrupt
      kernel memory via the lock access. /proc/stat can only read from freed
      memory. Extremly hard to trigger, but possible.
      
      The interfaces in /proc/irq/N/ are not affected by this because the
      removal of the proc file is serialized in procfs against concurrent
      readers/writers. The removal happens before the descriptor is freed.
      
      For architectures which have CONFIG_SPARSE_IRQ=n this is a non issue
      as the descriptor is never freed. It's merely cleared out with the irq
      descriptor lock held. So any concurrent proc access will either see
      the old correct value or the cleared out ones.
      
      Protect the lookup and access to the irq descriptor in
      show_interrupts() with the sparse_irq_lock.
      
      Provide kstat_irqs_usr() which is protecting the lookup and access
      with sparse_irq_lock and switch /proc/stat to use it.
      
      Document the existing kstat_irqs interfaces so it's clear that the
      caller needs to take care about protection. The users of these
      interfaces are either not affected due to SPARSE_IRQ=n or already
      protected against removal.
      
      Fixes: 1f5a5b87 "genirq: Implement a sane sparse_irq allocator"
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      5e95fa4f
    • Sagi Grimberg's avatar
      iscsi,iser-target: Expose supported protection ops according to t10_pi · 0f990013
      Sagi Grimberg authored
      commit 23a548ee upstream.
      
      iSER will report supported protection operations based on
      the tpg attribute t10_pi settings and HCA PI offload capabilities.
      If the HCA does not support PI offload or tpg attribute t10_pi is
      not set, we fall to SW PI mode.
      
      In order to do that, we move iscsit_get_sup_prot_ops after connection
      tpg assignment.
      Signed-off-by: default avatarSagi Grimberg <sagig@mellanox.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      0f990013
    • Sagi Grimberg's avatar
      iser-target: Fix NULL dereference in SW mode DIF · c52c1599
      Sagi Grimberg authored
      commit 302cc7c3 upstream.
      
      Fallback to software mode DIF if HCA does not support
      PI (without crashing obviously). It is still possible to
      run with backend protection and an unprotected frontend,
      so looking at the command prot_op is not enough. Check
      device PI capability on a per-IO basis (isert_prot_cmd
      inline static) to determine if we need to handle protection
      information.
      
      Trace:
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
      IP: [<ffffffffa037f8b1>] isert_reg_sig_mr+0x351/0x3b0 [ib_isert]
      Call Trace:
       [<ffffffff812b003a>] ? swiotlb_map_sg_attrs+0x7a/0x130
       [<ffffffffa038184d>] isert_reg_rdma+0x2fd/0x370 [ib_isert]
       [<ffffffff8108f2ec>] ? idle_balance+0x6c/0x2c0
       [<ffffffffa0382b68>] isert_put_datain+0x68/0x210 [ib_isert]
       [<ffffffffa02acf5b>] lio_queue_data_in+0x2b/0x30 [iscsi_target_mod]
       [<ffffffffa02306eb>] target_complete_ok_work+0x21b/0x310 [target_core_mod]
       [<ffffffff8106ece2>] process_one_work+0x182/0x3b0
       [<ffffffff8106fda0>] worker_thread+0x120/0x3c0
       [<ffffffff8106fc80>] ? maybe_create_worker+0x190/0x190
       [<ffffffff8107594e>] kthread+0xce/0xf0
       [<ffffffff81075880>] ? kthread_freezable_should_stop+0x70/0x70
       [<ffffffff8159a22c>] ret_from_fork+0x7c/0xb0
       [<ffffffff81075880>] ? kthread_freezable_should_stop+0x70/0x70
      Reported-by: default avatarSlava Shwartsman <valyushash@gmail.com>
      Signed-off-by: default avatarSagi Grimberg <sagig@mellanox.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      c52c1599
    • Sagi Grimberg's avatar
      iser-target: Allocate PI contexts dynamically · 2a9a256d
      Sagi Grimberg authored
      commit 570db170 upstream.
      
      This patch converts to allocate PI contexts dynamically in order
      avoid a potentially bogus np->tpg_np and associated NULL pointer
      dereference in isert_connect_request() during iser-target endpoint
      shutdown with multiple network portals.
      
      Also, there is really no need to allocate these at connection
      establishment since it is not guaranteed that all the IOs on
      that connection will be to a PI formatted device.
      
      We can do it in a lazy fashion so the initial burst will have a
      transient slow down, but very fast all IOs will allocate a PI
      context.
      
      Squashed:
      
      iser-target: Centralize PI context handling code
      Signed-off-by: default avatarSagi Grimberg <sagig@mellanox.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      2a9a256d
    • Sagi Grimberg's avatar
      iser-target: Fix implicit termination of connections · 63dff7ee
      Sagi Grimberg authored
      commit b02efbfc upstream.
      
      In situations such as bond failover, The new session establishment
      implicitly invokes the termination of the old connection.
      
      So, we don't want to wait for the old connection wait_conn to completely
      terminate before we accept the new connection and post a login response.
      
      The solution is to deffer the comp_wait completion and the conn_put to
      a work so wait_conn will effectively be non-blocking (flush errors are
      assumed to come very fast).
      
      We allocate isert_release_wq with WQ_UNBOUND and WQ_UNBOUND_MAX_ACTIVE
      to spread the concurrency of release works.
      Reported-by: default avatarSlava Shwartsman <valyushash@gmail.com>
      Signed-off-by: default avatarSagi Grimberg <sagig@mellanox.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      63dff7ee
    • Sagi Grimberg's avatar
      iser-target: Handle ADDR_CHANGE event for listener cm_id · 868d84f1
      Sagi Grimberg authored
      commit ca6c1d82 upstream.
      
      The np listener cm_id will also get ADDR_CHANGE event
      upcall (in case it is bound to a specific IP). Handle
      it correctly by creating a new cm_id and implicitly
      destroy the old one.
      
      Since this is the second event a listener np cm_id may
      encounter, we move the np cm_id event handling to a
      routine.
      
      Squashed:
      
      iser-target: Move cma_id setup to a function
      Reported-by: default avatarSlava Shwartsman <valyushash@gmail.com>
      Signed-off-by: default avatarSagi Grimberg <sagig@mellanox.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      868d84f1
    • Sagi Grimberg's avatar
      iser-target: Fix connected_handler + teardown flow race · d1bfea1a
      Sagi Grimberg authored
      commit 19e2090f upstream.
      
      Take isert_conn pointer from cm_id->qp->qp_context. This
      will allow us to know that the cm_id context is always
      the network portal. This will make the cm_id event check
      (connection or network portal) more reliable.
      
      In order to avoid a NULL dereference in cma_id->qp->qp_context
      we destroy the qp after we destroy the cm_id (and make the
      dereference safe). session stablishment/teardown sequences
      can happen in parallel, we should take into account that
      connected_handler might race with connection teardown flow.
      
      Also, protect isert_conn->conn_device->active_qps decrement
      within the error patch during QP creation failure and the
      normal teardown path in isert_connect_release().
      
      Squashed:
      
      iser-target: Decrement completion context active_qps in error flow
      Signed-off-by: default avatarSagi Grimberg <sagig@mellanox.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      [ luis: backported to 3.16: adjusted context ]
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      d1bfea1a
    • Sagi Grimberg's avatar
      iser-target: Parallelize CM connection establishment · 611bb15b
      Sagi Grimberg authored
      commit 2371e5da upstream.
      
      There is no point in accepting a new CM request only
      when we are completely done with the last iscsi login.
      Instead we accept immediately, this will also cause the
      CM connection to reach connected state and the initiator
      is allowed to send the first login. We mark that we got
      the initial login and let iscsi layer pick it up when it
      gets there.
      
      This reduces the parallel login sequence by a factor of
      more then 4 (and more for multi-login) and also prevents
      the initiator (who does all logins in parallel) from
      giving up on login timeout expiration.
      
      In order to support multiple login requests sequence (CHAP)
      we call isert_rx_login_req from isert_rx_completion insead
      of letting isert_get_login_rx call it.
      
      Squashed:
      
      iser-target: Use kref_get_unless_zero in connected_handler
      iser-target: Acquire conn_mutex when changing connection state
      iser-target: Reject connect request in failure path
      Signed-off-by: default avatarSagi Grimberg <sagig@mellanox.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      611bb15b
    • Sagi Grimberg's avatar
      iser-target: Fix flush + disconnect completion handling · 170b761c
      Sagi Grimberg authored
      commit 128e9cc8 upstream.
      
      ISER_CONN_UP state is not sufficient to know if
      we should wait for completion of flush errors and
      disconnected_handler event.
      
      Instead, split it to 2 states:
      - ISER_CONN_UP: Got to CM connected phase, This state
      indicates that we need to wait for a CM disconnect
      event before going to teardown.
      
      - ISER_CONN_FULL_FEATURE: Got to full feature phase
      after we posted login response, This state indicates
      that we posted recv buffers and we need to wait for
      flush completions before going to teardown.
      
      Also avoid deffering disconnected handler to a work,
      and handle it within disconnected handler.
      More work here is needed to handle DEVICE_REMOVAL event
      correctly (cleanup all resources).
      
      Squashed:
      
      iser-target: Don't deffer disconnected handler to a work
      Signed-off-by: default avatarSagi Grimberg <sagig@mellanox.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      170b761c
    • Sagi Grimberg's avatar
      iscsi,iser-target: Initiate termination only once · 96b50563
      Sagi Grimberg authored
      commit 954f2372 upstream.
      
      Since commit 0fc4ea70 ("Target/iser: Don't put isert_conn inside
      disconnected handler") we put the conn kref in isert_wait_conn, so we
      need .wait_conn to be invoked also in the error path.
      
      Introduce call to isert_conn_terminate (called under lock)
      which transitions the connection state to TERMINATING and calls
      rdma_disconnect. If the state is already teminating, just bail
      out back (temination started).
      
      Also, make sure to destroy the connection when getting a connect
      error event if didn't get to connected (state UP). Same for the
      handling of REJECTED and UNREACHABLE cma events.
      
      Squashed:
      
      iscsi-target: Add call to wait_conn in establishment error flow
      Reported-by: default avatarSlava Shwartsman <valyushash@gmail.com>
      Signed-off-by: default avatarSagi Grimberg <sagig@mellanox.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      [ luis: backported to 3.16: adjusted context ]
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      96b50563
    • Imre Deak's avatar
      drm/i915: vlv: fix IRQ masking when uninstalling interrupts · 1eadc8e9
      Imre Deak authored
      commit c352d1ba upstream.
      
      irq_mask should include all IRQ bits that we want to mask, but atm we
      set it incorrectly to the inverse of this. If the mask is used
      subsequently to enable/disable some IRQ bits, we may unintentionally
      unmask unrelated IRQs. I can't see any way that this can lead to a real
      problem in the current -nightly code, since the first place the mask
      will be used next (after a suspend/resume cycle) is in
      valleyview_irq_postinstall(), but the mask is reset there to its proper
      value.
      
      This causes a problem in the upstream kernel though, where - due to another
      issue - the mask is used in the above way to disable only the display IRQs.
      This other issue is fixed by:
      
      commit 950eabaf
      Author: Imre Deak <imre.deak@intel.com>
      Date:   Mon Sep 8 15:21:09 2014 +0300
      
          drm/i915: vlv: fix display IRQ enable/disable
      
      Interestingly, even with the above two bugs, we shouldn't in theory have
      any real problems (arguably a famous last sentence:). That's because
      even if we unmask something unintentionally via the VLV_IMR/VLV_IER
      register the master IRQ masking bit in VLV_MASTER_IER is still set and
      should prevent all i915 interrupts. According to my testing on an ASUS
      T100 with DSI output this isn't the case at least with the
      MIPIA_INTERRUPT. Leaving this one unmasked in IMR/IER, while having
      VLV_MASTER_IER set to 0 may lead to a lockup during system suspend as
      shown in the bugzilla ticket below. This fix should get rid of the
      problem reported there in upstream and older kernels.
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85920Signed-off-by: default avatarImre Deak <imre.deak@intel.com>
      Reviewed-by: default avatarVille Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: default avatarJani Nikula <jani.nikula@intel.com>
      [ luis: backported to 3.16: adjusted context ]
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      1eadc8e9
    • Jiri Olsa's avatar
      perf: Fix events installation during moving group · 4b64cc90
      Jiri Olsa authored
      commit 9fc81d87 upstream.
      
      We allow PMU driver to change the cpu on which the event
      should be installed to. This happened in patch:
      
        e2d37cd2 ("perf: Allow the PMU driver to choose the CPU on which to install events")
      
      This patch also forces all the group members to follow
      the currently opened events cpu if the group happened
      to be moved.
      
      This and the change of event->cpu in perf_install_in_context()
      function introduced in:
      
        0cda4c02 ("perf: Introduce perf_pmu_migrate_context()")
      
      forces group members to change their event->cpu,
      if the currently-opened-event's PMU changed the cpu
      and there is a group move.
      
      Above behaviour causes problem for breakpoint events,
      which uses event->cpu to touch cpu specific data for
      breakpoints accounting. By changing event->cpu, some
      breakpoints slots were wrongly accounted for given
      cpu.
      
      Vinces's perf fuzzer hit this issue and caused following
      WARN on my setup:
      
         WARNING: CPU: 0 PID: 20214 at arch/x86/kernel/hw_breakpoint.c:119 arch_install_hw_breakpoint+0x142/0x150()
         Can't find any breakpoint slot
         [...]
      
      This patch changes the group moving code to keep the event's
      original cpu.
      Reported-by: default avatarVince Weaver <vince@deater.net>
      Signed-off-by: default avatarJiri Olsa <jolsa@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Vince Weaver <vince@deater.net>
      Cc: Yan, Zheng <zheng.z.yan@intel.com>
      Link: http://lkml.kernel.org/r/1418243031-20367-3-git-send-email-jolsa@kernel.orgSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      4b64cc90
    • Jiri Olsa's avatar
      perf/x86/intel/uncore: Make sure only uncore events are collected · bf4f2373
      Jiri Olsa authored
      commit af91568e upstream.
      
      The uncore_collect_events functions assumes that event group
      might contain only uncore events which is wrong, because it
      might contain any type of events.
      
      This bug leads to uncore framework touching 'not' uncore events,
      which could end up all sorts of bugs.
      
      One was triggered by Vince's perf fuzzer, when the uncore code
      touched breakpoint event private event space as if it was uncore
      event and caused BUG:
      
         BUG: unable to handle kernel paging request at ffffffff82822068
         IP: [<ffffffff81020338>] uncore_assign_events+0x188/0x250
         ...
      
      The code in uncore_assign_events() function was looking for
      event->hw.idx data while the event was initialized as a
      breakpoint with different members in event->hw union.
      
      This patch forces uncore_collect_events() to collect only uncore
      events.
      Reported-by: default avatarVince Weaver <vince@deater.net>
      Signed-off-by: default avatarJiri Olsa <jolsa@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Yan, Zheng <zheng.z.yan@intel.com>
      Link: http://lkml.kernel.org/r/1418243031-20367-2-git-send-email-jolsa@kernel.orgSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      bf4f2373
    • Filipe Manana's avatar
      Btrfs: fix fs corruption on transaction abort if device supports discard · a20425e3
      Filipe Manana authored
      commit 678886bd upstream.
      
      When we abort a transaction we iterate over all the ranges marked as dirty
      in fs_info->freed_extents[0] and fs_info->freed_extents[1], clear them
      from those trees, add them back (unpin) to the free space caches and, if
      the fs was mounted with "-o discard", perform a discard on those regions.
      Also, after adding the regions to the free space caches, a fitrim ioctl call
      can see those ranges in a block group's free space cache and perform a discard
      on the ranges, so the same issue can happen without "-o discard" as well.
      
      This causes corruption, affecting one or multiple btree nodes (in the worst
      case leaving the fs unmountable) because some of those ranges (the ones in
      the fs_info->pinned_extents tree) correspond to btree nodes/leafs that are
      referred by the last committed super block - breaking the rule that anything
      that was committed by a transaction is untouched until the next transaction
      commits successfully.
      
      I ran into this while running in a loop (for several hours) the fstest that
      I recently submitted:
      
        [PATCH] fstests: add btrfs test to stress chunk allocation/removal and fstrim
      
      The corruption always happened when a transaction aborted and then fsck complained
      like this:
      
         _check_btrfs_filesystem: filesystem on /dev/sdc is inconsistent
         *** fsck.btrfs output ***
         Check tree block failed, want=94945280, have=0
         Check tree block failed, want=94945280, have=0
         Check tree block failed, want=94945280, have=0
         Check tree block failed, want=94945280, have=0
         Check tree block failed, want=94945280, have=0
         read block failed check_tree_block
         Couldn't open file system
      
      In this case 94945280 corresponded to the root of a tree.
      Using frace what I observed was the following sequence of steps happened:
      
         1) transaction N started, fs_info->pinned_extents pointed to
            fs_info->freed_extents[0];
      
         2) node/eb 94945280 is created;
      
         3) eb is persisted to disk;
      
         4) transaction N commit starts, fs_info->pinned_extents now points to
            fs_info->freed_extents[1], and transaction N completes;
      
         5) transaction N + 1 starts;
      
         6) eb is COWed, and btrfs_free_tree_block() called for this eb;
      
         7) eb range (94945280 to 94945280 + 16Kb) is added to
            fs_info->pinned_extents (fs_info->freed_extents[1]);
      
         8) Something goes wrong in transaction N + 1, like hitting ENOSPC
            for example, and the transaction is aborted, turning the fs into
            readonly mode. The stack trace I got for example:
      
            [112065.253935]  [<ffffffff8140c7b6>] dump_stack+0x4d/0x66
            [112065.254271]  [<ffffffff81042984>] warn_slowpath_common+0x7f/0x98
            [112065.254567]  [<ffffffffa0325990>] ? __btrfs_abort_transaction+0x50/0x10b [btrfs]
            [112065.261674]  [<ffffffff810429e5>] warn_slowpath_fmt+0x48/0x50
            [112065.261922]  [<ffffffffa032949e>] ? btrfs_free_path+0x26/0x29 [btrfs]
            [112065.262211]  [<ffffffffa0325990>] __btrfs_abort_transaction+0x50/0x10b [btrfs]
            [112065.262545]  [<ffffffffa036b1d6>] btrfs_remove_chunk+0x537/0x58b [btrfs]
            [112065.262771]  [<ffffffffa033840f>] btrfs_delete_unused_bgs+0x1de/0x21b [btrfs]
            [112065.263105]  [<ffffffffa0343106>] cleaner_kthread+0x100/0x12f [btrfs]
            (...)
            [112065.264493] ---[ end trace dd7903a975a31a08 ]---
            [112065.264673] BTRFS: error (device sdc) in btrfs_remove_chunk:2625: errno=-28 No space left
            [112065.264997] BTRFS info (device sdc): forced readonly
      
         9) The clear kthread sees that the BTRFS_FS_STATE_ERROR bit is set in
            fs_info->fs_state and calls btrfs_cleanup_transaction(), which in
            turn calls btrfs_destroy_pinned_extent();
      
         10) Then btrfs_destroy_pinned_extent() iterates over all the ranges
             marked as dirty in fs_info->freed_extents[], and for each one
             it calls discard, if the fs was mounted with "-o discard", and
             adds the range to the free space cache of the respective block
             group;
      
         11) btrfs_trim_block_group(), invoked from the fitrim ioctl code path,
             sees the free space entries and performs a discard;
      
         12) After an umount and mount (or fsck), our eb's location on disk was full
             of zeroes, and it should have been untouched, because it was marked as
             dirty in the fs_info->pinned_extents tree, and therefore used by the
             trees that the last committed superblock points to.
      
      Fix this by not performing a discard and not adding the ranges to the free space
      caches - it's useless from this point since the fs is now in readonly mode and
      we won't write free space caches to disk anymore (otherwise we would leak space)
      nor any new superblock. By not adding the ranges to the free space caches, it
      prevents other code paths from allocating that space and write to it as well,
      therefore being safer and simpler.
      
      This isn't a new problem, as it's been present since 2011 (git commit
      acce952b).
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      a20425e3
    • Peter Rosin's avatar
      ASoC: pcm512x: Trigger auto-increment of register addresses on i2c · c28dedb1
      Peter Rosin authored
      commit 681a1956 upstream.
      
      When the codec is connected using i2c, it will only auto-increment
      register addresses if msb (0x80) of the register address byte is set.
      
      [Fixes cache sync if multiple adjacent registers are updated -- broonie]
      Signed-off-by: default avatarPeter Rosin <peda@axentia.se>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      c28dedb1
    • Sreekanth Reddy's avatar
      Revert "[SCSI] mpt3sas: Remove phys on topology change" · 8e2d7c21
      Sreekanth Reddy authored
      commit 2311ce4d upstream.
      
      This reverts commit 963ba22b
      ("mpt3sas: Remove phys on topology change")
      
      Reverting the previous mpt3sas drives patch changes,
      since we will observe below issue
      
      Issue:
      Drives connected Enclosure/Expander will unregister with
      SCSI Transport Layer, if any one remove and add expander
      cable with in DMD (Device Missing Delay) time period or
      even any one power-off and power-on the Enclosure with in
      the DMD period.
      Signed-off-by: default avatarSreekanth Reddy <Sreekanth.Reddy@avagotech.com>
      Reviewed-by: default avatarTomas Henzl <thenzl@redhat.com>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      8e2d7c21
    • Sreekanth Reddy's avatar
      Revert "[SCSI] mpt2sas: Remove phys on topology change." · f3263457
      Sreekanth Reddy authored
      commit 81a89c2d upstream.
      
      This reverts commit 3520f9c7
      ("mpt2sas: Remove phys on topology change")
      
      Reverting the previous mpt2sas drives patch changes,
      since we will observe below issue
      
      Issue:
      Drives connected Enclosure/Expander will unregister with
      SCSI Transport Layer, if any one remove and add expander
      cable with in DMD (Device Missing Delay) time period or
      even any one power-off and power-on the Enclosure with in
      the DMD period.
      Signed-off-by: default avatarSreekanth Reddy <Sreekanth.Reddy@avagotech.com>
      Reviewed-by: default avatarTomas Henzl <thenzl@redhat.com>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      f3263457
    • Krzysztof Kozlowski's avatar
      clk: samsung: Fix double add of syscore ops after driver rebind · 495548a3
      Krzysztof Kozlowski authored
      commit c31844ff upstream.
      
      During driver unbind the syscore ops were not unregistered which lead to
      double add on syscore list:
      
      $ echo "3810000.audss-clock-controller" > /sys/bus/platform/drivers/exynos-audss-clk/unbind
      $ echo "3810000.audss-clock-controller" > /sys/bus/platform/drivers/exynos-audss-clk/bind
      [ 1463.044061] ------------[ cut here ]------------
      [ 1463.047255] WARNING: CPU: 0 PID: 1 at lib/list_debug.c:36 __list_add+0x8c/0xc0()
      [ 1463.054613] list_add double add: new=c06e52c0, prev=c06e52c0, next=c06d5f84.
      [ 1463.061625] Modules linked in:
      [ 1463.064623] CPU: 0 PID: 1 Comm: bash Tainted: G        W      3.18.0-rc5-next-20141121-00005-ga8fab06eab42-dirty #1022
      [ 1463.075338] [<c0014e2c>] (unwind_backtrace) from [<c0011d80>] (show_stack+0x10/0x14)
      [ 1463.083046] [<c0011d80>] (show_stack) from [<c048bb70>] (dump_stack+0x70/0xbc)
      [ 1463.090236] [<c048bb70>] (dump_stack) from [<c00233d4>] (warn_slowpath_common+0x74/0xb0)
      [ 1463.098295] [<c00233d4>] (warn_slowpath_common) from [<c00234a4>] (warn_slowpath_fmt+0x30/0x40)
      [ 1463.106962] [<c00234a4>] (warn_slowpath_fmt) from [<c020fe80>] (__list_add+0x8c/0xc0)
      [ 1463.114760] [<c020fe80>] (__list_add) from [<c0282094>] (register_syscore_ops+0x30/0x3c)
      [ 1463.122819] [<c0282094>] (register_syscore_ops) from [<c0392f20>] (exynos_audss_clk_probe+0x36c/0x460)
      [ 1463.132091] [<c0392f20>] (exynos_audss_clk_probe) from [<c0283084>] (platform_drv_probe+0x48/0xa4)
      [ 1463.141013] [<c0283084>] (platform_drv_probe) from [<c0281a14>] (driver_probe_device+0x13c/0x37c)
      [ 1463.149852] [<c0281a14>] (driver_probe_device) from [<c0280560>] (bind_store+0x90/0xe0)
      [ 1463.157822] [<c0280560>] (bind_store) from [<c027fd10>] (drv_attr_store+0x20/0x2c)
      [ 1463.165363] [<c027fd10>] (drv_attr_store) from [<c0143898>] (sysfs_kf_write+0x4c/0x50)
      [ 1463.173252] [<c0143898>] (sysfs_kf_write) from [<c0142c80>] (kernfs_fop_write+0xbc/0x198)
      [ 1463.181395] [<c0142c80>] (kernfs_fop_write) from [<c00e2be0>] (vfs_write+0xa0/0x1a8)
      [ 1463.189104] [<c00e2be0>] (vfs_write) from [<c00e2f00>] (SyS_write+0x40/0x8c)
      [ 1463.196122] [<c00e2f00>] (SyS_write) from [<c000f2a0>] (ret_fast_syscall+0x0/0x48)
      [ 1463.203655] ---[ end trace 08f6710c9bc8d8f3 ]---
      [ 1463.208244] exynos-audss-clk 3810000.audss-clock-controller: setup completed
      Signed-off-by: default avatarKrzysztof Kozlowski <k.kozlowski@samsung.com>
      Fixes: 1241ef94 ("clk: samsung: register audio subsystem clocks using common clock framework")
      Signed-off-by: default avatarSylwester Nawrocki <s.nawrocki@samsung.com>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      495548a3