1. 28 Apr, 2016 1 commit
    • Keith Busch's avatar
      x86/apic: Handle zero vector gracefully in clear_vector_irq() · 1bdb8970
      Keith Busch authored
      If x86_vector_alloc_irq() fails x86_vector_free_irqs() is invoked to cleanup
      the already allocated vectors. This subsequently calls clear_vector_irq().
      
      The failed irq has no vector assigned, which triggers the BUG_ON(!vector) in
      clear_vector_irq().
      
      We cannot suppress the call to x86_vector_free_irqs() for the failed
      interrupt, because the other data related to this irq must be cleaned up as
      well. So calling clear_vector_irq() with vector == 0 is legitimate.
      
      Remove the BUG_ON and return if vector is zero,
      
      [ tglx: Massaged changelog ]
      
      Fixes: b5dc8e6c "x86/irq: Use hierarchical irqdomain to manage CPU interrupt vectors"
      Signed-off-by: default avatarKeith Busch <keith.busch@intel.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      1bdb8970
  2. 26 Apr, 2016 1 commit
  3. 23 Apr, 2016 1 commit
    • Ross Lagerwall's avatar
      xen/qspinlock: Don't kick CPU if IRQ is not initialized · 707e59ba
      Ross Lagerwall authored
      The following commit:
      
        1fb3a8b2 ("xen/spinlock: Fix locking path engaging too soon under PVHVM.")
      
      ... moved the initalization of the kicker interrupt until after
      native_cpu_up() is called.
      
      However, when using qspinlocks, a CPU may try to kick another CPU that is
      spinning (because it has not yet initialized its kicker interrupt), resulting
      in the following crash during boot:
      
        kernel BUG at /build/linux-Ay7j_C/linux-4.4.0/drivers/xen/events/events_base.c:1210!
        invalid opcode: 0000 [#1] SMP
        ...
        RIP: 0010:[<ffffffff814c97c9>]  [<ffffffff814c97c9>] xen_send_IPI_one+0x59/0x60
        ...
        Call Trace:
         [<ffffffff8102be9e>] xen_qlock_kick+0xe/0x10
         [<ffffffff810cabc2>] __pv_queued_spin_unlock+0xb2/0xf0
         [<ffffffff810ca6d1>] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
         [<ffffffff81052936>] ? check_tsc_warp+0x76/0x150
         [<ffffffff81052aa6>] check_tsc_sync_source+0x96/0x160
         [<ffffffff81051e28>] native_cpu_up+0x3d8/0x9f0
         [<ffffffff8102b315>] xen_hvm_cpu_up+0x35/0x80
         [<ffffffff8108198c>] _cpu_up+0x13c/0x180
         [<ffffffff81081a4a>] cpu_up+0x7a/0xa0
         [<ffffffff81f80dfc>] smp_init+0x7f/0x81
         [<ffffffff81f5a121>] kernel_init_freeable+0xef/0x212
         [<ffffffff81817f30>] ? rest_init+0x80/0x80
         [<ffffffff81817f3e>] kernel_init+0xe/0xe0
         [<ffffffff8182488f>] ret_from_fork+0x3f/0x70
         [<ffffffff81817f30>] ? rest_init+0x80/0x80
      
      To fix this, only send the kick if the target CPU's interrupt has been
      initialized. This check isn't racy, because the target is waiting for
      the spinlock, so it won't have initialized the interrupt in the
      meantime.
      Signed-off-by: default avatarRoss Lagerwall <ross.lagerwall@citrix.com>
      Reviewed-by: default avatarBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Cc: xen-devel@lists.xenproject.org
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      707e59ba
  4. 22 Apr, 2016 4 commits
    • Tony Luck's avatar
      x86 EDAC, sb_edac.c: Take account of channel hashing when needed · ea5dfb5f
      Tony Luck authored
      Haswell and Broadwell can be configured to hash the channel
      interleave function using bits [27:12] of the physical address.
      
      On those processor models we must check to see if hashing is
      enabled (bit21 of the HASWELL_HASYSDEFEATURE2 register) and
      act accordingly.
      
      Based on a patch by patrickg <patrickg@supermicro.com>
      Tested-by: default avatarPatrick Geary <patrickg@supermicro.com>
      Signed-off-by: default avatarTony Luck <tony.luck@intel.com>
      Acked-by: default avatarMauro Carvalho Chehab <mchehab@osg.samsung.com>
      Cc: Aristeu Rozanski <arozansk@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-edac@vger.kernel.org
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      ea5dfb5f
    • Tony Luck's avatar
      x86 EDAC, sb_edac.c: Repair damage introduced when "fixing" channel address · ff15e95c
      Tony Luck authored
      In commit:
      
        eb1af3b7 ("Fix computation of channel address")
      
      I switched the "sck_way" variable from holding the log2 value read
      from the h/w to instead be the actual number. Unfortunately it
      is needed in log2 form when used to shift the address.
      Tested-by: default avatarPatrick Geary <patrickg@supermicro.com>
      Signed-off-by: default avatarTony Luck <tony.luck@intel.com>
      Acked-by: default avatarMauro Carvalho Chehab <mchehab@osg.samsung.com>
      Cc: Aristeu Rozanski <arozansk@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-edac@vger.kernel.org
      Cc: stable@vger.kernel.org
      Fixes: eb1af3b7 ("Fix computation of channel address")
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      ff15e95c
    • Jan Beulich's avatar
      x86/mm/xen: Suppress hugetlbfs in PV guests · 103f6112
      Jan Beulich authored
      Huge pages are not normally available to PV guests. Not suppressing
      hugetlbfs use results in an endless loop of page faults when user mode
      code tries to access a hugetlbfs mapped area (since the hypervisor
      denies such PTEs to be created, but error indications can't be
      propagated out of xen_set_pte_at(), just like for various of its
      siblings), and - once killed in an oops like this:
      
        kernel BUG at .../fs/hugetlbfs/inode.c:428!
        invalid opcode: 0000 [#1] SMP
        ...
        RIP: e030:[<ffffffff811c333b>]  [<ffffffff811c333b>] remove_inode_hugepages+0x25b/0x320
        ...
        Call Trace:
         [<ffffffff811c3415>] hugetlbfs_evict_inode+0x15/0x40
         [<ffffffff81167b3d>] evict+0xbd/0x1b0
         [<ffffffff8116514a>] __dentry_kill+0x19a/0x1f0
         [<ffffffff81165b0e>] dput+0x1fe/0x220
         [<ffffffff81150535>] __fput+0x155/0x200
         [<ffffffff81079fc0>] task_work_run+0x60/0xa0
         [<ffffffff81063510>] do_exit+0x160/0x400
         [<ffffffff810637eb>] do_group_exit+0x3b/0xa0
         [<ffffffff8106e8bd>] get_signal+0x1ed/0x470
         [<ffffffff8100f854>] do_signal+0x14/0x110
         [<ffffffff810030e9>] prepare_exit_to_usermode+0xe9/0xf0
         [<ffffffff814178a5>] retint_user+0x8/0x13
      
      This is CVE-2016-3961 / XSA-174.
      Reported-by: default avatarVitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: default avatarJan Beulich <jbeulich@suse.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Juergen Gross <JGross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: stable@vger.kernel.org
      Cc: xen-devel <xen-devel@lists.xenproject.org>
      Link: http://lkml.kernel.org/r/57188ED802000078000E431C@prv-mh.provo.novell.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      103f6112
    • Juergen Gross's avatar
      x86/doc: Correct limits in Documentation/x86/x86_64/mm.txt · 78b0634d
      Juergen Gross authored
      Correct the size of the module mapping space and the maximum available
      physical memory size of current processors.
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: corbet@lwn.net
      Cc: linux-doc@vger.kernel.org
      Link: http://lkml.kernel.org/r/1461310504-15977-1-git-send-email-jgross@suse.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      78b0634d
  5. 16 Apr, 2016 1 commit
    • Vitaly Kuznetsov's avatar
      x86/hyperv: Avoid reporting bogus NMI status for Gen2 instances · 1e2ae9ec
      Vitaly Kuznetsov authored
      Generation2 instances don't support reporting the NMI status on port 0x61,
      read from there returns 'ff' and we end up reporting nonsensical PCI
      error (as there is no PCI bus in these instances) on all NMIs:
      
          NMI: PCI system error (SERR) for reason ff on CPU 0.
          Dazed and confused, but trying to continue
      
      Fix the issue by overriding x86_platform.get_nmi_reason. Use 'booted on
      EFI' flag to detect Gen2 instances.
      Signed-off-by: default avatarVitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Cathy Avery <cavery@redhat.com>
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: K. Y. Srinivasan <kys@microsoft.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: devel@linuxdriverproject.org
      Link: http://lkml.kernel.org/r/1460728232-31433-1-git-send-email-vkuznets@redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      1e2ae9ec
  6. 15 Apr, 2016 17 commits
  7. 14 Apr, 2016 13 commits
    • Mike Snitzer's avatar
      dm cache metadata: fix READ_LOCK macros and cleanup WRITE_LOCK macros · 9567366f
      Mike Snitzer authored
      The READ_LOCK macro was incorrectly returning -EINVAL if
      dm_bm_is_read_only() was true -- it will always be true once the cache
      metadata transitions to read-only by dm_cache_metadata_set_read_only().
      
      Wrap READ_LOCK and WRITE_LOCK multi-statement macros in do {} while(0).
      Also, all accesses of the 'cmd' argument passed to these related macros
      are now encapsulated in parenthesis.
      
      A follow-up patch can be developed to eliminate the use of macros in
      favor of pure C code.  Avoiding that now given that this needs to apply
      to stable@.
      Reported-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Fixes: d14fcf3d ("dm cache: make sure every metadata function checks fail_io")
      Cc: stable@vger.kernel.org
      9567366f
    • Keith Busch's avatar
      NVMe: Always use MSI/MSI-x interrupts · a5229050
      Keith Busch authored
      Multiple users have reported device initialization failure due the driver
      not receiving legacy PCI interrupts. This is not unique to any particular
      controller, but has been observed on multiple platforms.
      
      There have been no issues reported or observed when with message signaled
      interrupts, so this patch attempts to use MSI-x during initialization,
      falling back to MSI. If that fails, legacy would become the default.
      
      The setup_io_queues error handling had to change as a result: the admin
      queue's msix_entry used to be initialized to the legacy IRQ. The case
      where nr_io_queues is 0 would fail request_irq when setting up the admin
      queue's interrupt since re-enabling MSI-x fails with 0 vectors, leaving
      the admin queue's msix_entry invalid. Instead, return success immediately.
      Reported-by: default avatarTim Muhlemmer <muhlemmer@gmail.com>
      Reported-by: default avatarJon Derrick <jonathan.derrick@intel.com>
      Signed-off-by: default avatarKeith Busch <keith.busch@intel.com>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      a5229050
    • Linus Torvalds's avatar
      /proc/iomem: only expose physical resource addresses to privileged users · 51d7b120
      Linus Torvalds authored
      In commit c4004b02 ("x86: remove the kernel code/data/bss resources
      from /proc/iomem") I was hoping to remove the phyiscal kernel address
      data from /proc/iomem entirely, but that had to be reverted because some
      system programs actually use it.
      
      This limits all the detailed resource information to properly
      credentialed users instead.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      51d7b120
    • Linus Torvalds's avatar
      pci-sysfs: use proper file capability helper function · ab0fa82b
      Linus Torvalds authored
      The PCI config access checked the file capabilities correctly, but used
      the itnernal security capability check rather than the helper function
      that is actually meant for that.
      
      The security_capable() has unusual return values and is not meant to be
      used elsewhere (the only other use is in the capability checking
      functions that we actually intend people to use, and this odd PCI usage
      really stood out when looking around the capability code.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      ab0fa82b
    • Linus Torvalds's avatar
      Make file credentials available to the seqfile interfaces · 34dbbcdb
      Linus Torvalds authored
      A lot of seqfile users seem to be using things like %pK that uses the
      credentials of the current process, but that is actually completely
      wrong for filesystem interfaces.
      
      The unix semantics for permission checking files is to check permissions
      at _open_ time, not at read or write time, and that is not just a small
      detail: passing off stdin/stdout/stderr to a suid application and making
      the actual IO happen in privileged context is a classic exploit
      technique.
      
      So if we want to be able to look at permissions at read time, we need to
      use the file open credentials, not the current ones.  Normal file
      accesses can just use "f_cred" (or any of the helper functions that do
      that, like file_ns_capable()), but the seqfile interfaces do not have
      any such options.
      
      It turns out that seq_file _does_ save away the user_ns information of
      the file, though.  Since user_ns is just part of the full credential
      information, replace that special case with saving off the cred pointer
      instead, and suddenly seq_file has all the permission information it
      needs.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      34dbbcdb
    • Linus Torvalds's avatar
      Revert "x86: remove the kernel code/data/bss resources from /proc/iomem" · 4046d6e8
      Linus Torvalds authored
      This reverts commit c4004b02.
      
      Sadly, my hope that nobody would actually use the special kernel entries
      in /proc/iomem were dashed by kexec.  Which reads /proc/iomem explicitly
      to find the kernel base address.  Nasty.
      
      Anyway, that means we can't do the sane and simple thing and just remove
      the entries, and we'll instead have to mask them out based on permissions.
      Reported-by: default avatarZhengyu Zhang <zhezhang@redhat.com>
      Reported-by: default avatarDave Young <dyoung@redhat.com>
      Reported-by: default avatarFreeman Zhang <freeman.zhang1992@gmail.com>
      Reported-by: default avatarEmrah Demir <ed@abdsec.com>
      Reported-by: default avatarBaoquan He <bhe@redhat.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      4046d6e8
    • Helge Deller's avatar
      parisc: Fix ftrace function tracer · 366dd4ea
      Helge Deller authored
      Fix the FTRACE function tracer for 32- and 64-bit kernel.
      The former code was horribly broken.
      
      Reimplement most coding in assembly and utilize optimizations, e.g. put
      mcount() and ftrace_stub() into one L1 cacheline.
      Signed-off-by: default avatarHelge Deller <deller@gmx.de>
      366dd4ea
    • Toshi Kani's avatar
      pmem: fix BUG() error in pmem.h:48 on X86_32 · cba2e47a
      Toshi Kani authored
      After 'commit fc0c2028 ("x86, pmem: use memcpy_mcsafe()
      for memcpy_from_pmem()")', probing a PMEM device hits the BUG()
      error below on X86_32 kernel.
      
       kernel BUG at include/linux/pmem.h:48!
      
      memcpy_from_pmem() calls arch_memcpy_from_pmem(), which is
      unimplemented since CONFIG_ARCH_HAS_PMEM_API is undefined on
      X86_32.
      
      Fix the BUG() error by adding default_memcpy_from_pmem().
      Acked-by: default avatarDan Williams <dan.j.williams@intel.com>
      Signed-off-by: default avatarToshi Kani <toshi.kani@hpe.com>
      Signed-off-by: default avatarRoss Zwisler <ross.zwisler@linux.intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      cba2e47a
    • Stefan Agner's avatar
      pwm: fsl-ftm: Use flat regmap cache · ad06fdee
      Stefan Agner authored
      Use flat regmap cache to avoid lockdep warning at probe:
      
      [    0.697285] WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:2755 lockdep_trace_alloc+0x15c/0x160()
      [    0.697449] DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags))
      
      The RB-tree regmap cache needs to allocate new space on first writes.
      However, allocations in an atomic context (e.g. when a spinlock is held)
      are not allowed. The function regmap_write calls map->lock, which
      acquires a spinlock in the fast_io case. Since the pwm-fsl-ftm driver
      uses MMIO, the regmap bus of type regmap_mmio is being used which has
      fast_io set to true.
      
      The MMIO space of the pwm-fsl-ftm driver is reasonable condense, hence
      using the much faster flat regmap cache is anyway the better choice.
      Signed-off-by: default avatarStefan Agner <stefan@agner.ch>
      Cc: Mark Brown <broonie@kernel.org>
      Signed-off-by: default avatarThierry Reding <thierry.reding@gmail.com>
      ad06fdee
    • Jon Hunter's avatar
      mmc: tegra: Disable UHS-I modes for Tegra124 · 70ad7f7e
      Jon Hunter authored
      Tegra124 has been randomly hanging during system suspend when entering
      the Tegra LP1 low power state. The hang is caused by the Tegra SDHCI
      driver and linked to the UHS-I tuning sequence. Disabling the UHS-I
      modes for Tegra124 prevents any hangs from occurring when entering
      system suspend.
      
      Unfortunately, the tuning sequence described in the public Tegra
      documentation is incomplete and on inspection of the current tuning
      sequence that has been implemented is also incomplete and may cause
      problems. In the short-term it is safer to disable UHS-I modes for now
      and fix later because it would be too large of a change to simply patch
      now. Therefore, disable UHS-I modes for Tegra124.
      Signed-off-by: default avatarJon Hunter <jonathanh@nvidia.com>
      Acked-by: default avatarThierry Reding <treding@nvidia.com>
      Signed-off-by: default avatarUlf Hansson <ulf.hansson@linaro.org>
      70ad7f7e
    • Ulf Hansson's avatar
      mmc: block: Use the mmc host device index as the mmcblk device index · 9aaf3437
      Ulf Hansson authored
      Commit 520bd7a8 ("mmc: core: Optimize boot time by detecting cards
      simultaneously") causes regressions for some platforms.
      
      These platforms relies on fixed mmcblk device indexes, instead of
      deploying the defacto standard with UUID/PARTUUID. In other words their
      rootfs needs to be available at hardcoded paths, like /dev/mmcblk0p2.
      
      Such guarantees have never been made by the kernel, but clearly the above
      commit changes the behaviour. More precisely, because of that the order
      changes of how cards becomes detected, so do their corresponding mmcblk
      device indexes.
      
      As the above commit significantly improves boot time for some platforms
      (magnitude of seconds), let's avoid reverting this change but instead
      restore the behaviour of how mmcblk device indexes becomes picked.
      
      By using the same index for the mmcblk device as for the corresponding mmc
      host device, the probe order of mmc host devices decides the index we get
      for the mmcblk device.
      
      For those platforms that suffers from a regression, one could expect that
      this updated behaviour should be sufficient to meet their expectations of
      "fixed" mmcblk device indexes.
      
      Another side effect from this change, is that the same index is used for
      the mmc host device, the mmcblk device and the mmc block queue. That
      should clarify their relationship.
      Reported-by: default avatarPeter Hurley <peter@hurleysoftware.com>
      Reported-by: default avatarLaszlo Fiat <laszlo.fiat@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Fixes: 520bd7a8 ("mmc: core: Optimize boot time by detecting cards
      simultaneously")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarUlf Hansson <ulf.hansson@linaro.org>
      9aaf3437
    • Dave Airlie's avatar
      Merge branch 'exynos-drm-fixes' of... · ff3e84e8
      Dave Airlie authored
      Merge branch 'exynos-drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-fixes
      
      fix some exynos regressions.
      
      * 'exynos-drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos:
        drm/exynos: Use VIDEO_SAMSUNG_S5P_G2D=n as G2D Kconfig dependency
        drm/exynos: fix a warning message
        drm/exynos: mic: fix an error code
        drm/exynos: fimd: fix broken dp_clock control
        drm/exynos: build fbdev code conditionally
        drm/exynos: fix adjusted_mode pointer in exynos_plane_mode_set
        drm/exynos: fix error handling in exynos_drm_subdrv_open
      ff3e84e8
    • Dave Airlie's avatar
      Merge branch 'drm-fixes-4.6' of git://people.freedesktop.org/~agd5f/linux into drm-fixes · 25451c19
      Dave Airlie authored
      some misc radeon fixes.
      
      * 'drm-fixes-4.6' of git://people.freedesktop.org/~agd5f/linux:
        drm/amd/amdgpu: fix irq domain remove for tonga ih
        drm/radeon: use helper for mst connector dpms.
        drm/radeon/mst: port some MST setup code from DAL.
        drm/amdgpu: add invisible pin size statistic
      25451c19
  8. 13 Apr, 2016 2 commits