1. 03 May, 2017 2 commits
  2. 02 May, 2017 5 commits
  3. 01 May, 2017 2 commits
  4. 28 Apr, 2017 19 commits
  5. 27 Apr, 2017 8 commits
  6. 26 Apr, 2017 3 commits
    • powerpc/powernv: Fix oops on P9 DD1 in cause_ipi() · 45b21cfe
      Michael Ellerman authored
      Recently we merged the native xive support for Power9, and then separately some
      reworks for doorbell IPI support. In isolation both series were OK, but the
      merged result had a bug in one case.
      
      On P9 DD1 we use pnv_p9_dd1_cause_ipi() which tries to use doorbells, and then
      falls back to the interrupt controller. However the fallback is implemented by
      calling icp_ops->cause_ipi. But now that xive support is merged we might be
      using xive, in which case icp_ops is not initialised because it is a
      xics-specific structure. This leads to an oops such as:
      
        Unable to handle kernel paging request for data at address 0x00000028
        Oops: Kernel access of bad area, sig: 11 [#1]
        NIP pnv_p9_dd1_cause_ipi+0x74/0xe0
        LR smp_muxed_ipi_message_pass+0x54/0x70
      
      To fix it, rather than using icp_ops which might be NULL, have both xics and
      xive set smp_ops->cause_ipi, and then in the powernv code we save that as
      ic_cause_ipi before overriding smp_ops->cause_ipi. For paranoia add a WARN_ON()
      to check if somehow smp_ops->cause_ipi is NULL.
      
      Fixes: b866cc21 ("powerpc: Change the doorbell IPI calling convention")
      Tested-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
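
The save-then-override pattern described in this commit can be illustrated with a small userspace model. This is only a sketch, not the kernel code: `smp_ops_model`, `xive_cause_ipi_model`, `doorbell_ok` and `install_dd1_override` are hypothetical names, and the rule that decides when a doorbell works is invented for the demonstration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace model; these are not the kernel's definitions. */
typedef void (*cause_ipi_fn)(int cpu);

struct smp_ops_model {
    cause_ipi_fn cause_ipi;
};

/* Records the CPU that the interrupt-controller path was asked to signal. */
static int last_ic_cpu = -1;

/* Saved interrupt-controller handler (xics or xive), used as the fallback. */
static cause_ipi_fn ic_cause_ipi;

/* Stand-in for whichever interrupt controller registered itself. */
static void xive_cause_ipi_model(int cpu)
{
    last_ic_cpu = cpu;
}

/* Assumption for the demo: only even-numbered CPUs are reachable by doorbell. */
static int doorbell_ok(int cpu)
{
    return cpu % 2 == 0;
}

/* Try the doorbell first, then fall back to the saved controller handler. */
static void dd1_cause_ipi_model(int cpu)
{
    if (doorbell_ok(cpu))
        return;
    ic_cause_ipi(cpu);
}

/*
 * Save the controller's handler before overriding it. Returns -1 and
 * complains (a loose analogue of WARN_ON) if no handler was registered.
 */
static int install_dd1_override(struct smp_ops_model *ops)
{
    assert(ops != NULL);
    if (ops->cause_ipi == NULL) {
        fprintf(stderr, "warning: cause_ipi is NULL, not overriding\n");
        return -1;
    }
    ic_cause_ipi = ops->cause_ipi;
    ops->cause_ipi = dd1_cause_ipi_model;
    return 0;
}
```

Because the fallback is reached through the saved pointer rather than through a controller-specific structure, the model behaves the same whichever controller registered the handler.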
    • powerpc/powernv: Fix missing attr initialisation in opal_export_attrs() · 83c49190
      Michael Ellerman authored
      In opal_export_attrs() we dynamically allocate some bin_attributes. They're
      allocated with kmalloc() and although we initialise most of the fields, we don't
      initialise write() or mmap(), and in particular we don't initialise the lockdep
      related fields in the embedded struct attribute.
      
      This leads to a lockdep warning at boot:
      
        BUG: key c0000000f11906d8 not in .data!
        WARNING: CPU: 0 PID: 1 at ../kernel/locking/lockdep.c:3136 lockdep_init_map+0x28c/0x2a0
        ...
        Call Trace:
          lockdep_init_map+0x288/0x2a0 (unreliable)
          __kernfs_create_file+0x8c/0x170
          sysfs_add_file_mode_ns+0xc8/0x240
          __machine_initcall_powernv_opal_init+0x60c/0x684
          do_one_initcall+0x60/0x1c0
          kernel_init_freeable+0x2f4/0x3d4
          kernel_init+0x24/0x160
          ret_from_kernel_thread+0x5c/0xb0
      
      Fix it by kzalloc'ing the attr, which fixes the uninitialised write() and
      mmap(), and calling sysfs_bin_attr_init() on it to initialise the lockdep
      fields.
      
      Fixes: 11fe909d ("powerpc/powernv: Add OPAL exports attributes to sysfs")
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
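
The effect of zero-initialising the allocation can be shown with a userspace analogue, where `calloc()` plays the role of kzalloc(). This is a sketch under stated assumptions: `bin_attr_model`, `read_stub` and `alloc_attr` are hypothetical, and the lockdep side of the fix (sysfs_bin_attr_init()) has no userspace equivalent and is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a binary sysfs attribute. */
struct bin_attr_model {
    const char *name;
    size_t size;
    long (*read)(char *buf, size_t count);
    long (*write)(const char *buf, size_t count);
    int (*mmap)(void *vma);
};

/* Placeholder read handler; it only reports the requested length. */
static long read_stub(char *buf, size_t count)
{
    (void)buf;
    return (long)count;
}

/*
 * Allocate zeroed memory, then set only the fields we care about.
 * Fields we never touch (write, mmap) are left as all-zero bytes, which
 * is a NULL pointer on the platforms the kernel supports, rather than
 * whatever malloc() happened to return.
 */
static struct bin_attr_model *alloc_attr(const char *name, size_t size)
{
    struct bin_attr_model *attr;

    assert(name != NULL);
    attr = calloc(1, sizeof(*attr));
    if (attr == NULL)
        return NULL;

    attr->name = name;
    attr->size = size;
    attr->read = read_stub;
    return attr;
}
```

With a plain `malloc()` the untouched function pointers would hold indeterminate values, so a caller checking `attr->write != NULL` before calling it could jump to a garbage address.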
    • powerpc/mm: Fix possible out-of-bounds shift in arch_mmap_rnd() · b409946b
      Michael Ellerman authored
      The recent patch to add runtime configuration of the ASLR limits added a bug in
      arch_mmap_rnd() where we may shift an integer (32-bits) by up to 33 bits,
      leading to undefined behaviour.
      
      In practice it exhibits as every process segfaulting instantly, presumably
      because the rnd value hasn't been restricted by the modulus at all. We didn't
      notice because it only happens under certain kernel configurations and if the
      number of bits is actually set to a large value.
      
      Fix it by switching to unsigned long.
      
      Fixes: 9fea59bd ("powerpc/mm: Add support for runtime configuration of ASLR limits")
      Reported-by: Balbir Singh <bsingharora@gmail.com>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
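
The shift problem is easy to reproduce outside the kernel: in C, shifting a 32-bit `int` by 32 or more bits is undefined behaviour, so the constant being shifted must already have a wide enough type. The sketch below is not the kernel's arch_mmap_rnd(); `rnd_mask` is a hypothetical helper, and it uses `uint64_t` as a portable stand-in for the kernel's 64-bit unsigned long.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Restrict a random value to its low `bits` bits using a modulus.
 *
 * Writing the modulus as (1 << bits) would shift a 32-bit int, which is
 * undefined once bits >= 32. Starting from a 64-bit constant keeps the
 * shift well defined for any bits < 64.
 */
static uint64_t rnd_mask(uint64_t rnd, unsigned int bits)
{
    assert(bits < 64);
    return rnd % (UINT64_C(1) << bits);
}
```

For example, `rnd_mask(UINT64_MAX, 33)` yields a value with exactly the low 33 bits set, which the 32-bit form of the expression cannot represent.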
  7. 24 Apr, 2017 1 commit
    • powerpc/mm: Ensure IRQs are off in switch_mm() · 9765ad13
      David Gibson authored
      powerpc expects IRQs to already be (soft) disabled when switch_mm() is
      called, as made clear in the commit message of 9c1e1052 ("powerpc: Allow
      perf_counters to access user memory at interrupt time").
      
      Aside from any race conditions that might exist between switch_mm() and an IRQ,
      there is also an unconditional hard_irq_disable() in switch_slb(). If that isn't
      followed at some point by an IRQ enable then interrupts will remain disabled
      until we return to userspace.
      
      It is true that when switch_mm() is called from the scheduler IRQs are off, but
      not when it's called by use_mm(). Looking closer we see that last year in commit
      f98db601 ("sched/core: Add switch_mm_irqs_off() and use it in the scheduler")
      this was made more explicit by the addition of switch_mm_irqs_off() which is now
      called by the scheduler, vs switch_mm() which is used by use_mm().
      
      Arguably it is a bug in use_mm() to call switch_mm() in a different context than
      it expects, but fixing that will take time.
      
      This was discovered recently when vhost started throwing warnings such as:
      
        BUG: sleeping function called from invalid context at kernel/mutex.c:578
        in_atomic(): 0, irqs_disabled(): 1, pid: 10768, name: vhost-10760
        no locks held by vhost-10760/10768.
        irq event stamp: 10
        hardirqs last  enabled at (9):  _raw_spin_unlock_irq+0x40/0x80
        hardirqs last disabled at (10): switch_slb+0x2e4/0x490
        softirqs last  enabled at (0):  copy_process+0x5e8/0x1260
        softirqs last disabled at (0):  (null)
        Call Trace:
          show_stack+0x88/0x390 (unreliable)
          dump_stack+0x30/0x44
          __might_sleep+0x1c4/0x2d0
          mutex_lock_nested+0x74/0x5c0
          cgroup_attach_task_all+0x5c/0x180
          vhost_attach_cgroups_work+0x58/0x80 [vhost]
          vhost_worker+0x24c/0x3d0 [vhost]
          kthread+0xec/0x100
          ret_from_kernel_thread+0x5c/0xd4
      
      Prior to commit 04b96e55 ("vhost: lockless enqueuing") (Aug 2016) the
      vhost_worker() would do a spin_unlock_irq() not long after calling use_mm(),
      which had the effect of reenabling IRQs. Since that commit removed the locking
      in vhost_worker() the body of the vhost_worker() loop now runs with interrupts
      off causing the warnings.
      
      This patch addresses the problem by making the powerpc code mirror the x86 code,
      i.e. we disable interrupts in switch_mm(), and optimise the scheduler case by
      defining switch_mm_irqs_off().
      
      Cc: stable@vger.kernel.org # v4.7+
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      [mpe: Flesh out/rewrite change log, add stable]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
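
The split between the two entry points can be sketched in userspace with a flag standing in for the CPU's interrupt state. Everything here is a hypothetical model (`irq_save_model`, `irq_restore_model` and the `_model` functions are invented names), not the powerpc or x86 implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Models the CPU interrupt-enable state; true means IRQs are enabled. */
static bool irqs_enabled = true;

/* Counts how many address-space switches were performed. */
static int switches;

/* Disable "interrupts" and return the previous state. */
static bool irq_save_model(void)
{
    bool was_enabled = irqs_enabled;

    irqs_enabled = false;
    return was_enabled;
}

/* Restore the state returned by irq_save_model(). */
static void irq_restore_model(bool was_enabled)
{
    irqs_enabled = was_enabled;
}

/* Fast path for callers, such as the scheduler, that already run with IRQs off. */
static void switch_mm_irqs_off_model(void)
{
    assert(!irqs_enabled);
    switches++;
}

/* General entry point: safe in any context, restores the caller's IRQ state. */
static void switch_mm_model(void)
{
    bool flags = irq_save_model();

    switch_mm_irqs_off_model();
    irq_restore_model(flags);
}
```

A caller that enters `switch_mm_model()` with interrupts enabled gets them back enabled on return, which is the property the vhost worker was missing, while the scheduler-style caller skips the save and restore entirely.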