1. 06 Feb, 2019 9 commits
  2. 31 Jan, 2019 31 commits
    • Greg Kroah-Hartman's avatar
      Linux 4.14.97 · e1e364bf
      Greg Kroah-Hartman authored
      e1e364bf
    • Anand Jain's avatar
      btrfs: dev-replace: go back to suspended state if target device is missing · f4f4ac41
      Anand Jain authored
      commit 0d228ece upstream.
      
      At the time of forced unmount we place the running replace to
      BTRFS_IOCTL_DEV_REPLACE_STATE_SUSPENDED state, so when the system comes
      back and expect the target device is missing.
      
      Then let the replace state continue to be in
      BTRFS_IOCTL_DEV_REPLACE_STATE_SUSPENDED state instead of
      BTRFS_IOCTL_DEV_REPLACE_STATE_STARTED as there isn't any matching scrub
      running as part of replace.
      
      Fixes: e93c89c1 ("Btrfs: add new sources for device replace code")
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: default avatarAnand Jain <anand.jain@oracle.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarSudip Mukherjee <sudipm.mukherjee@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f4f4ac41
    • Jeff Mahoney's avatar
      btrfs: fix error handling in btrfs_dev_replace_start · b682b805
      Jeff Mahoney authored
      commit 5c061471 upstream.
      
      When we fail to start a transaction in btrfs_dev_replace_start, we leave
      dev_replace->replace_start set to STARTED but clear ->srcdev and
      ->tgtdev.  Later, that can result in an Oops in
      btrfs_dev_replace_progress when having state set to STARTED or SUSPENDED
      implies that ->srcdev is valid.
      
      Also fix error handling when the state is already STARTED or SUSPENDED
      while starting.  That, too, will clear ->srcdev and ->tgtdev even though
      it doesn't own them.  This should be an impossible case to hit since we
      should be protected by the BTRFS_FS_EXCL_OP bit being set.  Let's add an
      ASSERT there while we're at it.
      
      Fixes: e93c89c1 (Btrfs: add new sources for device replace code)
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: default avatarJeff Mahoney <jeffm@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarSudip Mukherjee <sudipm.mukherjee@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b682b805
    • Pan Bian's avatar
      f2fs: read page index before freeing · d33655db
      Pan Bian authored
      commit 0ea295dd upstream.
      
      The function truncate_node frees the page with f2fs_put_page. However,
      the page index is read after that. So, the patch reads the index before
      freeing the page.
      
      Fixes: bf39c00a ("f2fs: drop obsolete node page when it is truncated")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarPan Bian <bianpan2016@163.com>
      Reviewed-by: default avatarChao Yu <yuchao0@huawei.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: default avatarSudip Mukherjee <sudipm.mukherjee@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d33655db
    • Juergen Gross's avatar
      xen: Fix x86 sched_clock() interface for xen · 49c5a805
      Juergen Gross authored
      commit 867cefb4 upstream.
      
      Commit f94c8d11 ("sched/clock, x86/tsc: Rework the x86 'unstable'
      sched_clock() interface") broke Xen guest time handling across
      migration:
      
      [  187.249951] Freezing user space processes ... (elapsed 0.001 seconds) done.
      [  187.251137] OOM killer disabled.
      [  187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
      [  187.252299] suspending xenstore...
      [  187.266987] xen:grant_table: Grant tables using version 1 layout
      [18446743811.706476] OOM killer enabled.
      [18446743811.706478] Restarting tasks ... done.
      [18446743811.720505] Setting capacity to 16777216
      
      Fix that by setting xen_sched_clock_offset at resume time to ensure a
      monotonic clock value.
      
      [boris: replaced pr_info() with pr_info_once() in xen_callback_vector()
       to avoid printing with incorrect timestamp during resume (as we
       haven't re-adjusted the clock yet)]
      
      Fixes: f94c8d11 ("sched/clock, x86/tsc: Rework the x86 'unstable' sched_clock() interface")
      Cc: <stable@vger.kernel.org> # 4.11
      Reported-by: default avatarHans van Kranenburg <hans.van.kranenburg@mendix.com>
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      Tested-by: default avatarHans van Kranenburg <hans.van.kranenburg@mendix.com>
      Signed-off-by: default avatarBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      49c5a805
    • Pavel Tatashin's avatar
      x86/xen/time: Output xen sched_clock time from 0 · ac0d94ff
      Pavel Tatashin authored
      commit 38669ba2 upstream.
      
      It is expected for sched_clock() to output data from 0, when system boots.
      
      Add an offset xen_sched_clock_offset (similarly how it is done in other
      hypervisors i.e. kvm_sched_clock_offset) to count sched_clock() from 0,
      when time is first initialized.
      Signed-off-by: default avatarPavel Tatashin <pasha.tatashin@oracle.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: steven.sistare@oracle.com
      Cc: daniel.m.jordan@oracle.com
      Cc: linux@armlinux.org.uk
      Cc: schwidefsky@de.ibm.com
      Cc: heiko.carstens@de.ibm.com
      Cc: john.stultz@linaro.org
      Cc: sboyd@codeaurora.org
      Cc: hpa@zytor.com
      Cc: douly.fnst@cn.fujitsu.com
      Cc: peterz@infradead.org
      Cc: prarit@redhat.com
      Cc: feng.tang@intel.com
      Cc: pmladek@suse.com
      Cc: gnomes@lxorguk.ukuu.org.uk
      Cc: linux-s390@vger.kernel.org
      Cc: boris.ostrovsky@oracle.com
      Cc: jgross@suse.com
      Cc: pbonzini@redhat.com
      Link: https://lkml.kernel.org/r/20180719205545.16512-14-pasha.tatashin@oracle.comSigned-off-by: default avatarJuergen Gross <jgross@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ac0d94ff
    • Joao Martins's avatar
      x86/xen/time: setup vcpu 0 time info page · 9fb9a543
      Joao Martins authored
      commit 2229f70b upstream.
      
      In order to support pvclock vdso on xen we need to setup the time
      info page for vcpu 0 and register the page with Xen using the
      VCPUOP_register_vcpu_time_memory_area hypercall. This hypercall
      will also forcefully update the pvti which will set some of the
      necessary flags for vdso. Afterwards we check if it supports the
      PVCLOCK_TSC_STABLE_BIT flag which is mandatory for having
      vdso/vsyscall support. And if so, it will set the cpu 0 pvti that
      will be later on used when mapping the vdso image.
      
      The xen headers are also updated to include the new hypercall for
      registering the secondary vcpu_time_info struct.
      Signed-off-by: default avatarJoao Martins <joao.m.martins@oracle.com>
      Reviewed-by: default avatarJuergen Gross <jgross@suse.com>
      Reviewed-by: default avatarBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: default avatarBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9fb9a543
    • Joao Martins's avatar
      x86/xen/time: set pvclock flags on xen_time_init() · 4f3498a0
      Joao Martins authored
      commit b8888080 upstream.
      
      Specifically check for PVCLOCK_TSC_STABLE_BIT and if this bit is set,
      then set it too on pvclock flags. This allows Xen clocksource to use it
      and thus speeding up xen_clocksource_read() callers (i.e. sched_clock())
      Signed-off-by: default avatarJoao Martins <joao.m.martins@oracle.com>
      Reviewed-by: default avatarBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: default avatarBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4f3498a0
    • Joao Martins's avatar
      x86/pvclock: add setter for pvclock_pvti_cpu0_va · 9a12f78a
      Joao Martins authored
      commit 9f08890a upstream.
      
      Right now there is only a pvclock_pvti_cpu0_va() which is defined
      on kvmclock since:
      
      commit dac16fba
      ("x86/vdso: Get pvclock data from the vvar VMA instead of the fixmap")
      
      The only user of this interface so far is kvm. This commit adds a
      setter function for the pvti page and moves pvclock_pvti_cpu0_va
      to pvclock, which is a more generic place to have it; and would
      allow other PV clocksources to use it, such as Xen.
      
      While moving pvclock_pvti_cpu0_va into pvclock, rename also this
      function to pvclock_get_pvti_cpu0_va (including its call sites)
      to be symmetric with the setter (pvclock_set_pvti_cpu0_va).
      Signed-off-by: default avatarJoao Martins <joao.m.martins@oracle.com>
      Acked-by: default avatarAndy Lutomirski <luto@kernel.org>
      Acked-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Acked-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9a12f78a
    • Joao Martins's avatar
      ptp_kvm: probe for kvm guest availability · 45c40a96
      Joao Martins authored
      commit 001f60e1 upstream.
      
      In the event of moving pvclock_pvti_cpu0_va() definition to common
      pvclock code, this function would return a value on non KVM guests.
      Later on this would fail with a GPF on ptp_kvm_init when running on a
      Xen guest. Therefore, ptp_kvm_init() should check whether it is running
      in a KVM guest.
      Signed-off-by: default avatarJoao Martins <joao.m.martins@oracle.com>
      Acked-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: default avatarBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      45c40a96
    • Mathias Nyman's avatar
      xhci: Fix leaking USB3 shared_hcd at xhci removal · 219dcadc
      Mathias Nyman authored
      commit f0680904 upstream.
      
      Ensure that the shared_hcd pointer is valid when calling usb_put_hcd()
      
      The shared_hcd is removed and freed in xhci by first calling
      usb_remove_hcd(xhci->shared_hcd), and later
      usb_put_hcd(xhci->shared_hcd)
      
      Afer commit fe190ed0 ("xhci: Do not halt the host until both HCD have
      disconnected their devices.") the shared_hcd was never properly put as
      xhci->shared_hcd was set to NULL before usb_put_hcd(xhci->shared_hcd) was
      called.
      
      shared_hcd (USB3) is removed before primary hcd (USB2).
      While removing the primary hcd we might need to handle xhci interrupts
      to cleanly remove last USB2 devices, therefore we need to set
      xhci->shared_hcd to NULL before removing the primary hcd to let xhci
      interrupt handler know shared_hcd is no longer available.
      
      xhci-plat.c, xhci-histb.c and xhci-mtk first create both their hcd's before
      adding them. so to keep the correct reverse removal order use a temporary
      shared_hcd variable for them.
      For more details see commit 4ac53087 ("usb: xhci: plat: Create both
      HCDs before adding them")
      
      Fixes: fe190ed0 ("xhci: Do not halt the host until both HCD have disconnected their devices.")
      Cc: Joel Stanley <joel@jms.id.au>
      Cc: Chunfeng Yun <chunfeng.yun@mediatek.com>
      Cc: Thierry Reding <treding@nvidia.com>
      Cc: Jianguo Sun <sunjianguo1@huawei.com>
      Cc: <stable@vger.kernel.org>
      Reported-by: default avatarJack Pham <jackp@codeaurora.org>
      Tested-by: default avatarJack Pham <jackp@codeaurora.org>
      Tested-by: default avatarPeter Chen <peter.chen@nxp.com>
      Signed-off-by: default avatarMathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: default avatarSudip Mukherjee <sudipm.mukherjee@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      219dcadc
    • Jack Pham's avatar
      usb: dwc3: gadget: Clear req->needs_extra_trb flag on cleanup · a0d74de1
      Jack Pham authored
      commit bd674224 upstream.
      
      OUT endpoint requests may somtimes have this flag set when
      preparing to be submitted to HW indicating that there is an
      additional TRB chained to the request for alignment purposes.
      If that request is removed before the controller can execute the
      transfer (e.g. ep_dequeue/ep_disable), the request will not go
      through the dwc3_gadget_ep_cleanup_completed_request() handler
      and will not have its needs_extra_trb flag cleared when
      dwc3_gadget_giveback() is called.  This same request could be
      later requeued for a new transfer that does not require an
      extra TRB and if it is successfully completed, the cleanup
      and TRB reclamation will incorrectly process the additional TRB
      which belongs to the next request, and incorrectly advances the
      TRB dequeue pointer, thereby messing up calculation of the next
      requeust's actual/remaining count when it completes.
      
      The right thing to do here is to ensure that the flag is cleared
      before it is given back to the function driver.  A good place
      to do that is in dwc3_gadget_del_and_unmap_request().
      
      Fixes: c6267a51 ("usb: dwc3: gadget: align transfers to wMaxPacketSize")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarJack Pham <jackp@codeaurora.org>
      Signed-off-by: default avatarFelipe Balbi <felipe.balbi@linux.intel.com>
      [jackp: backport to <= 4.20: replaced 'needs_extra_trb' with 'unaligned'
              and 'zero' members in patch and reworded commit text]
      Signed-off-by: default avatarJack Pham <jackp@codeaurora.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a0d74de1
    • Raju Rangoju's avatar
      nvmet-rdma: fix null dereference under heavy load · 74329ba3
      Raju Rangoju authored
      commit 5cbab630 upstream.
      
      Under heavy load if we don't have any pre-allocated rsps left, we
      dynamically allocate a rsp, but we are not actually allocating memory
      for nvme_completion (rsp->req.rsp). In such a case, accessing pointer
      fields (req->rsp->status) in nvmet_req_init() will result in crash.
      
      To fix this, allocate the memory for nvme_completion by calling
      nvmet_rdma_alloc_rsp()
      
      Fixes: 8407879c("nvmet-rdma:fix possible bogus dereference under heavy load")
      
      Cc: <stable@vger.kernel.org>
      Reviewed-by: default avatarMax Gurtovoy <maxg@mellanox.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarRaju Rangoju <rajur@chelsio.com>
      Signed-off-by: default avatarSagi Grimberg <sagi@grimberg.me>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      74329ba3
    • Israel Rukshin's avatar
      nvmet-rdma: Add unlikely for response allocated check · 5e318594
      Israel Rukshin authored
      commit ad1f8249 upstream.
      Signed-off-by: default avatarIsrael Rukshin <israelr@mellanox.com>
      Reviewed-by: default avatarSagi Grimberg <sagi@grimberg.me>
      Reviewed-by: default avatarMax Gurtovoy <maxg@mellanox.com>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Cc: Raju  Rangoju <rajur@chelsio.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5e318594
    • David Hildenbrand's avatar
      s390/smp: Fix calling smp_call_ipl_cpu() from ipl CPU · 531ebf89
      David Hildenbrand authored
      commit 60f1bf29 upstream.
      
      When calling smp_call_ipl_cpu() from the IPL CPU, we will try to read
      from pcpu_devices->lowcore. However, due to prefixing, that will result
      in reading from absolute address 0 on that CPU. We have to go via the
      actual lowcore instead.
      
      This means that right now, we will read lc->nodat_stack == 0 and
      therfore work on a very wrong stack.
      
      This BUG essentially broke rebooting under QEMU TCG (which will report
      a low address protection exception). And checking under KVM, it is
      also broken under KVM. With 1 VCPU it can be easily triggered.
      
      :/# echo 1 > /proc/sys/kernel/sysrq
      :/# echo b > /proc/sysrq-trigger
      [   28.476745] sysrq: SysRq : Resetting
      [   28.476793] Kernel stack overflow.
      [   28.476817] CPU: 0 PID: 424 Comm: sh Not tainted 5.0.0-rc1+ #13
      [   28.476820] Hardware name: IBM 2964 NE1 716 (KVM/Linux)
      [   28.476826] Krnl PSW : 0400c00180000000 0000000000115c0c (pcpu_delegate+0x12c/0x140)
      [   28.476861]            R:0 T:1 IO:0 EX:0 Key:0 M:0 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
      [   28.476863] Krnl GPRS: ffffffffffffffff 0000000000000000 000000000010dff8 0000000000000000
      [   28.476864]            0000000000000000 0000000000000000 0000000000ab7090 000003e0006efbf0
      [   28.476864]            000000000010dff8 0000000000000000 0000000000000000 0000000000000000
      [   28.476865]            000000007fffc000 0000000000730408 000003e0006efc58 0000000000000000
      [   28.476887] Krnl Code: 0000000000115bfe: 4170f000            la      %r7,0(%r15)
      [   28.476887]            0000000000115c02: 41f0a000            la      %r15,0(%r10)
      [   28.476887]           #0000000000115c06: e370f0980024        stg     %r7,152(%r15)
      [   28.476887]           >0000000000115c0c: c0e5fffff86e        brasl   %r14,114ce8
      [   28.476887]            0000000000115c12: 41f07000            la      %r15,0(%r7)
      [   28.476887]            0000000000115c16: a7f4ffa8            brc     15,115b66
      [   28.476887]            0000000000115c1a: 0707                bcr     0,%r7
      [   28.476887]            0000000000115c1c: 0707                bcr     0,%r7
      [   28.476901] Call Trace:
      [   28.476902] Last Breaking-Event-Address:
      [   28.476920]  [<0000000000a01c4a>] arch_call_rest_init+0x22/0x80
      [   28.476927] Kernel panic - not syncing: Corrupt kernel stack, can't continue.
      [   28.476930] CPU: 0 PID: 424 Comm: sh Not tainted 5.0.0-rc1+ #13
      [   28.476932] Hardware name: IBM 2964 NE1 716 (KVM/Linux)
      [   28.476932] Call Trace:
      
      Fixes: 2f859d0d ("s390/smp: reduce size of struct pcpu")
      Cc: stable@vger.kernel.org # 4.0+
      Reported-by: default avatarCornelia Huck <cohuck@redhat.com>
      Signed-off-by: default avatarDavid Hildenbrand <david@redhat.com>
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      531ebf89
    • Sean Christopherson's avatar
      KVM: x86: Fix a 4.14 backport regression related to userspace/guest FPU · 9485d5d2
      Sean Christopherson authored
      Upstream commit:
      
          f775b13e ("x86,kvm: move qemu/guest FPU switching out to vcpu_run")
      
      introduced a bug, which was later fixed by upstream commit:
      
          5663d8f9 ("kvm: x86: fix WARN due to uninitialized guest FPU state")
      
      For reasons unknown, both commits were initially passed-over for
      inclusion in the 4.14 stable branch despite being tagged for stable.
      Eventually, someone noticed that the fixup, commit 5663d8f9, was
      missing from stable[1], and so it was queued up for 4.14 and included in
      release v4.14.79.
      
      Even later, the original buggy patch, commit f775b13e, was also
      applied to the 4.14 stable branch.  Through an unlucky coincidence, the
      incorrect ordering did not generate a conflict between the two patches,
      and led to v4.14.94 and later releases containing a spurious call to
      kvm_load_guest_fpu() in kvm_arch_vcpu_ioctl_run().  As a result, KVM may
      reload stale guest FPU state, e.g. after accepting in INIT event.  This
      can manifest as crashes during boot, segfaults, failed checksums and so
      on and so forth.
      
      Remove the unwanted kvm_{load,put}_guest_fpu() calls, i.e. make
      kvm_arch_vcpu_ioctl_run() look like commit 5663d8f9 was backported
      after commit f775b13e.
      
      [1] https://www.spinics.net/lists/stable/msg263931.html
      
      Fixes: 4124a4cf ("x86,kvm: move qemu/guest FPU switching out to vcpu_run")
      Cc: stable@vger.kernel.org
      Cc: Sasha Levin <sashal@kernel.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Reported-by: Roman Mamedov
      Reported-by: default avatarThomas Lindroth <thomas.lindroth@gmail.com>
      Signed-off-by: default avatarSean Christopherson <sean.j.christopherson@intel.com>
      Acked-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9485d5d2
    • Jose Abreu's avatar
      net: stmmac: Use correct values in TQS/RQS fields · 9f2512df
      Jose Abreu authored
      commit 52a76235 upstream.
      
      Currently we are using all the available fifo size in RQS and
      TQS fields. This will not work correctly in multi-queues IP's
      because total fifo size must be splitted to the enabled queues.
      
      Correct this by computing the available fifo size per queue and
      setting the right value in TQS and RQS fields.
      Signed-off-by: default avatarJose Abreu <joabreu@synopsys.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Joao Pinto <jpinto@synopsys.com>
      Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
      Cc: Alexandre Torgue <alexandre.torgue@st.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Cc: Niklas Cassel <niklas.cassel@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9f2512df
    • Sasha Levin's avatar
      Revert "seccomp: add a selftest for get_metadata" · c3ca9064
      Sasha Levin authored
      This reverts commit e65cd9a2.
      
      Tommi T. Rrantala notes:
      
      	PTRACE_SECCOMP_GET_METADATA was only added in 4.16
      	(26500475)
      
      	And it's also breaking seccomp_bpf.c compilation for me:
      
      	seccomp_bpf.c: In function ‘get_metadata’:
      	seccomp_bpf.c:2878:26: error: storage size of ‘md’ isn’t known
      	  struct seccomp_metadata md;
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      c3ca9064
    • Milian Wolff's avatar
      perf unwind: Take pgoff into account when reporting elf to libdwfl · dbf80659
      Milian Wolff authored
      [ Upstream commit 1fe627da ]
      
      libdwfl parses an ELF file itself and creates mappings for the
      individual sections. perf on the other hand sees raw mmap events which
      represent individual sections. When we encounter an address pointing
      into a mapping with pgoff != 0, we must take that into account and
      report the file at the non-offset base address.
      
      This fixes unwinding with libdwfl in some cases. E.g. for a file like:
      
      ```
      
      using namespace std;
      
      mutex g_mutex;
      
      double worker()
      {
          lock_guard<mutex> guard(g_mutex);
          uniform_real_distribution<double> uniform(-1E5, 1E5);
          default_random_engine engine;
          double s = 0;
          for (int i = 0; i < 1000; ++i) {
              s += norm(complex<double>(uniform(engine), uniform(engine)));
          }
          cout << s << endl;
          return s;
      }
      
      int main()
      {
          vector<std::future<double>> results;
          for (int i = 0; i < 10000; ++i) {
              results.push_back(async(launch::async, worker));
          }
          return 0;
      }
      ```
      
      Compile it with `g++ -g -O2 -lpthread cpp-locking.cpp  -o cpp-locking`,
      then record it with `perf record --call-graph dwarf -e
      sched:sched_switch`.
      
      When you analyze it with `perf script` and libunwind, you should see:
      
      ```
      cpp-locking 20038 [005] 54830.236589: sched:sched_switch: prev_comm=cpp-locking prev_pid=20038 prev_prio=120 prev_state=T ==> next_comm=swapper/5 next_pid=0 next_prio=120
              ffffffffb166fec5 __sched_text_start+0x545 (/lib/modules/4.14.78-1-lts/build/vmlinux)
              ffffffffb166fec5 __sched_text_start+0x545 (/lib/modules/4.14.78-1-lts/build/vmlinux)
              ffffffffb1670208 schedule+0x28 (/lib/modules/4.14.78-1-lts/build/vmlinux)
              ffffffffb16737cc rwsem_down_read_failed+0xec (/lib/modules/4.14.78-1-lts/build/vmlinux)
              ffffffffb1665e04 call_rwsem_down_read_failed+0x14 (/lib/modules/4.14.78-1-lts/build/vmlinux)
              ffffffffb1672a03 down_read+0x13 (/lib/modules/4.14.78-1-lts/build/vmlinux)
              ffffffffb106bd85 __do_page_fault+0x445 (/lib/modules/4.14.78-1-lts/build/vmlinux)
              ffffffffb18015f5 page_fault+0x45 (/lib/modules/4.14.78-1-lts/build/vmlinux)
                  7f38e4252591 new_heap+0x101 (/usr/lib/libc-2.28.so)
                  7f38e4252d0b arena_get2.part.4+0x2fb (/usr/lib/libc-2.28.so)
                  7f38e4255b1c tcache_init.part.6+0xec (/usr/lib/libc-2.28.so)
                  7f38e42569e5 __GI___libc_malloc+0x115 (inlined)
                  7f38e4241790 __GI__IO_file_doallocate+0x90 (inlined)
                  7f38e424fbbf __GI__IO_doallocbuf+0x4f (inlined)
                  7f38e424ee47 __GI__IO_file_overflow+0x197 (inlined)
                  7f38e424df36 _IO_new_file_xsputn+0x116 (inlined)
                  7f38e4242bfb __GI__IO_fwrite+0xdb (inlined)
                  7f38e463fa6d std::basic_streambuf<char, std::char_traits<char> >::sputn(char const*, long)+0x1cd (inlined)
                  7f38e463fa6d std::ostreambuf_iterator<char, std::char_traits<char> >::_M_put(char const*, long)+0x1cd (inlined)
                  7f38e463fa6d std::ostreambuf_iterator<char, std::char_traits<char> > std::__write<char>(std::ostreambuf_iterator<char, std::char_traits<char> >, char const*, int)+0x1cd (inlined)
                  7f38e463fa6d std::ostreambuf_iterator<char, std::char_traits<char> > std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_float<double>(std::ostreambuf_iterator<c>
                  7f38e464bd70 std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::put(std::ostreambuf_iterator<char, std::char_traits<char> >, std::ios_base&, char, double) const+0x90 (inl>
                  7f38e464bd70 std::ostream& std::ostream::_M_insert<double>(double)+0x90 (/usr/lib/libstdc++.so.6.0.25)
                  563b9cb502f7 std::ostream::operator<<(double)+0xb7 (inlined)
                  563b9cb502f7 worker()+0xb7 (/ssd/milian/projects/kdab/rnd/hotspot/build/tests/test-clients/cpp-locking/cpp-locking)
                  563b9cb506fb double std::__invoke_impl<double, double (*)()>(std::__invoke_other, double (*&&)())+0x2b (inlined)
                  563b9cb506fb std::__invoke_result<double (*)()>::type std::__invoke<double (*)()>(double (*&&)())+0x2b (inlined)
                  563b9cb506fb decltype (__invoke((_S_declval<0ul>)())) std::thread::_Invoker<std::tuple<double (*)()> >::_M_invoke<0ul>(std::_Index_tuple<0ul>)+0x2b (inlined)
                  563b9cb506fb std::thread::_Invoker<std::tuple<double (*)()> >::operator()()+0x2b (inlined)
                  563b9cb506fb std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<double>, std::__future_base::_Result_base::_Deleter>, std::thread::_Invoker<std::tuple<double (*)()> >, dou>
                  563b9cb506fb std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_>
                  563b9cb507e8 std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>::operator()() const+0x28 (inlined)
                  563b9cb507e8 std::__future_base::_State_baseV2::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*)+0x28 (/ssd/milian/>
                  7f38e46d24fe __pthread_once_slow+0xbe (/usr/lib/libpthread-2.28.so)
                  563b9cb51149 __gthread_once+0xe9 (inlined)
                  563b9cb51149 void std::call_once<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*)>
                  563b9cb51149 std::__future_base::_State_baseV2::_M_set_result(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>, bool)+0xe9 (inlined)
                  563b9cb51149 std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<double (*)()> >, double>::_Async_state_impl(std::thread::_Invoker<std::tuple<double (*)()> >&&)::{lambda()#1}::op>
                  563b9cb51149 void std::__invoke_impl<void, std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<double (*)()> >, double>::_Async_state_impl(std::thread::_Invoker<std::tuple<double>
                  563b9cb51149 std::__invoke_result<std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<double (*)()> >, double>::_Async_state_impl(std::thread::_Invoker<std::tuple<double (*)()> >>
                  563b9cb51149 decltype (__invoke((_S_declval<0ul>)())) std::thread::_Invoker<std::tuple<std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<double (*)()> >, double>::_Async_state_>
                  563b9cb51149 std::thread::_Invoker<std::tuple<std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<double (*)()> >, double>::_Async_state_impl(std::thread::_Invoker<std::tuple<dou>
                  563b9cb51149 std::thread::_State_impl<std::thread::_Invoker<std::tuple<std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<double (*)()> >, double>::_Async_state_impl(std::thread>
                  7f38e45f0062 execute_native_thread_routine+0x12 (/usr/lib/libstdc++.so.6.0.25)
                  7f38e46caa9c start_thread+0xfc (/usr/lib/libpthread-2.28.so)
                  7f38e42ccb22 __GI___clone+0x42 (inlined)
      ```
      
      Before this patch, using libdwfl, you would see:
      
      ```
      cpp-locking 20038 [005] 54830.236589: sched:sched_switch: prev_comm=cpp-locking prev_pid=20038 prev_prio=120 prev_state=T ==> next_comm=swapper/5 next_pid=0 next_prio=120
              ffffffffb166fec5 __sched_text_start+0x545 (/lib/modules/4.14.78-1-lts/build/vmlinux)
              ffffffffb166fec5 __sched_text_start+0x545 (/lib/modules/4.14.78-1-lts/build/vmlinux)
              ffffffffb1670208 schedule+0x28 (/lib/modules/4.14.78-1-lts/build/vmlinux)
              ffffffffb16737cc rwsem_down_read_failed+0xec (/lib/modules/4.14.78-1-lts/build/vmlinux)
              ffffffffb1665e04 call_rwsem_down_read_failed+0x14 (/lib/modules/4.14.78-1-lts/build/vmlinux)
              ffffffffb1672a03 down_read+0x13 (/lib/modules/4.14.78-1-lts/build/vmlinux)
              ffffffffb106bd85 __do_page_fault+0x445 (/lib/modules/4.14.78-1-lts/build/vmlinux)
              ffffffffb18015f5 page_fault+0x45 (/lib/modules/4.14.78-1-lts/build/vmlinux)
                  7f38e4252591 new_heap+0x101 (/usr/lib/libc-2.28.so)
              a041161e77950c5c [unknown] ([unknown])
      ```
      
      With this patch applied, we get a bit further in unwinding:
      
      ```
      cpp-locking 20038 [005] 54830.236589: sched:sched_switch: prev_comm=cpp-locking prev_pid=20038 prev_prio=120 prev_state=T ==> next_comm=swapper/5 next_pid=0 next_prio=120
              ffffffffb166fec5 __sched_text_start+0x545 (/lib/modules/4.14.78-1-lts/build/vmlinux)
              ffffffffb166fec5 __sched_text_start+0x545 (/lib/modules/4.14.78-1-lts/build/vmlinux)
              ffffffffb1670208 schedule+0x28 (/lib/modules/4.14.78-1-lts/build/vmlinux)
              ffffffffb16737cc rwsem_down_read_failed+0xec (/lib/modules/4.14.78-1-lts/build/vmlinux)
              ffffffffb1665e04 call_rwsem_down_read_failed+0x14 (/lib/modules/4.14.78-1-lts/build/vmlinux)
              ffffffffb1672a03 down_read+0x13 (/lib/modules/4.14.78-1-lts/build/vmlinux)
              ffffffffb106bd85 __do_page_fault+0x445 (/lib/modules/4.14.78-1-lts/build/vmlinux)
              ffffffffb18015f5 page_fault+0x45 (/lib/modules/4.14.78-1-lts/build/vmlinux)
                  7f38e4252591 new_heap+0x101 (/usr/lib/libc-2.28.so)
                  7f38e4252d0b arena_get2.part.4+0x2fb (/usr/lib/libc-2.28.so)
                  7f38e4255b1c tcache_init.part.6+0xec (/usr/lib/libc-2.28.so)
                  7f38e42569e5 __GI___libc_malloc+0x115 (inlined)
                  7f38e4241790 __GI__IO_file_doallocate+0x90 (inlined)
                  7f38e424fbbf __GI__IO_doallocbuf+0x4f (inlined)
                  7f38e424ee47 __GI__IO_file_overflow+0x197 (inlined)
                  7f38e424df36 _IO_new_file_xsputn+0x116 (inlined)
                  7f38e4242bfb __GI__IO_fwrite+0xdb (inlined)
                  7f38e463fa6d std::basic_streambuf<char, std::char_traits<char> >::sputn(char const*, long)+0x1cd (inlined)
                  7f38e463fa6d std::ostreambuf_iterator<char, std::char_traits<char> >::_M_put(char const*, long)+0x1cd (inlined)
                  7f38e463fa6d std::ostreambuf_iterator<char, std::char_traits<char> > std::__write<char>(std::ostreambuf_iterator<char, std::char_traits<char> >, char const*, int)+0x1cd (inlined)
                  7f38e463fa6d std::ostreambuf_iterator<char, std::char_traits<char> > std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_float<double>(std::ostreambuf_iterator<c>
                  7f38e464bd70 std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::put(std::ostreambuf_iterator<char, std::char_traits<char> >, std::ios_base&, char, double) const+0x90 (inl>
                  7f38e464bd70 std::ostream& std::ostream::_M_insert<double>(double)+0x90 (/usr/lib/libstdc++.so.6.0.25)
                  563b9cb502f7 std::ostream::operator<<(double)+0xb7 (inlined)
                  563b9cb502f7 worker()+0xb7 (/ssd/milian/projects/kdab/rnd/hotspot/build/tests/test-clients/cpp-locking/cpp-locking)
              6eab825c1ee3e4ff [unknown] ([unknown])
      ```
      
      Note that the backtrace is still stopping too early, when compared to
      the nice results obtained via libunwind. It's unclear so far what the
      reason for that is.
      
      Committer note:
      
      Further comment by Milian on the thread started on the Link: tag below:
      
       ---
      The remaining issue is due to a bug in elfutils:
      
      https://sourceware.org/ml/elfutils-devel/2018-q4/msg00089.html
      
      With both patches applied, libunwind and elfutils produce the same output for
      the above scenario.
       ---
      Signed-off-by: default avatarMilian Wolff <milian.wolff@kdab.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20181029141644.3907-1-milian.wolff@kdab.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      dbf80659
    • Martin Vuille's avatar
      perf unwind: Unwind with libdw doesn't take symfs into account · 9401f4e6
      Martin Vuille authored
      [ Upstream commit 3d20c624 ]
      
      Path passed to libdw for unwinding doesn't include symfs path
      if specified, so unwinding fails because ELF file is not found.
      
      Similar to unwinding with libunwind, pass symsrc_filename instead
      of long_name. If there is no symsrc_filename, fallback to long_name.
      Signed-off-by: default avatarMartin Vuille <jpmv27@aim.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/20180211212420.18388-1-jpmv27@aim.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      9401f4e6
    • Nicolas Pitre's avatar
      vt: invoke notifier on screen size change · 0b926c82
      Nicolas Pitre authored
      commit 0c9b1965 upstream.
      
      User space using poll() on /dev/vcs devices are not awaken when a
      screen size change occurs. Let's fix that.
      Signed-off-by: default avatarNicolas Pitre <nico@linaro.org>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0b926c82
    • Oliver Hartkopp's avatar
      can: bcm: check timer values before ktime conversion · e7126858
      Oliver Hartkopp authored
      commit 93171ba6 upstream.
      
      Kyungtae Kim detected a potential integer overflow in bcm_[rx|tx]_setup()
      when the conversion into ktime multiplies the given value with NSEC_PER_USEC
      (1000).
      
      Reference: https://marc.info/?l=linux-can&m=154732118819828&w=2
      
      Add a check for the given tv_usec, so that the value stays below one second.
      Additionally limit the tv_sec value to a reasonable value for CAN related
      use-cases of 400 days and ensure all values to be positive.
      Reported-by: default avatarKyungtae Kim <kt0755@gmail.com>
      Tested-by: default avatarOliver Hartkopp <socketcan@hartkopp.net>
      Signed-off-by: default avatarOliver Hartkopp <socketcan@hartkopp.net>
      Cc: linux-stable <stable@vger.kernel.org> # >= 2.6.26
      Tested-by: default avatarKyungtae Kim <kt0755@gmail.com>
      Acked-by: default avatarAndre Naujoks <nautsch2@gmail.com>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e7126858
    • Manfred Schlaegl's avatar
      can: dev: __can_get_echo_skb(): fix bogous check for non-existing skb by removing it · 8849347d
      Manfred Schlaegl authored
      commit 7b12c818 upstream.
      
      This patch revert commit 7da11ba5
      ("can: dev: __can_get_echo_skb(): print error message, if trying to echo non existing skb")
      
      After introduction of this change we encountered following new error
      message on various i.MX plattforms (flexcan):
      
      | flexcan 53fc8000.can can0: __can_get_echo_skb: BUG! Trying to echo non
      | existing skb: can_priv::echo_skb[0]
      
      The introduction of the message was a mistake because
      priv->echo_skb[idx] = NULL is a perfectly valid in following case: If
      CAN_RAW_LOOPBACK is disabled (setsockopt) in applications, the pkt_type
      of the tx skb's given to can_put_echo_skb is set to PACKET_LOOPBACK. In
      this case can_put_echo_skb will not set priv->echo_skb[idx]. It is
      therefore kept NULL.
      
      As additional argument for revert: The order of check and usage of idx
      was changed. idx is used to access an array element before checking it's
      boundaries.
      Signed-off-by: default avatarManfred Schlaegl <manfred.schlaegl@ginzinger.com>
      Fixes: 7da11ba5 ("can: dev: __can_get_echo_skb(): print error message, if trying to echo non existing skb")
      Cc: linux-stable <stable@vger.kernel.org>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8849347d
    • Marc Zyngier's avatar
      irqchip/gic-v3-its: Align PCI Multi-MSI allocation on their size · 711369e6
      Marc Zyngier authored
      commit 8208d170 upstream.
      
      The way we allocate events works fine in most cases, except
      when multiple PCI devices share an ITS-visible DevID, and that
      one of them is trying to use MultiMSI allocation.
      
      In that case, our allocation is not guaranteed to be zero-based
      anymore, and we have to make sure we allocate it on a boundary
      that is compatible with the PCI Multi-MSI constraints.
      
      Fix this by allocating the full region upfront instead of iterating
      over the number of MSIs. MSI-X are always allocated one by one,
      so this shouldn't change anything on that front.
      
      Fixes: b48ac83d ("irqchip: GICv3: ITS: MSI support")
      Cc: stable@vger.kernel.org
      Reported-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Tested-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      711369e6
    • Thomas Gleixner's avatar
      posix-cpu-timers: Unbreak timer rearming · c5d2a2cf
      Thomas Gleixner authored
      commit 93ad0fc0 upstream.
      
      The recent commit which prevented a division by 0 issue in the alarm timer
      code broke posix CPU timers as an unwanted side effect.
      
      The reason is that the common rearm code checks for timer->it_interval
      being 0 now. What went unnoticed is that the posix cpu timer setup does not
      initialize timer->it_interval as it stores the interval in CPU timer
      specific storage. The reason for the separate storage is historical as the
      posix CPU timers always had a 64bit nanoseconds representation internally
      while timer->it_interval is type ktime_t which used to be a modified
      timespec representation on 32bit machines.
      
      Instead of reverting the offending commit and fixing the alarmtimer issue
      in the alarmtimer code, store the interval in timer->it_interval at CPU
      timer setup time so the common code check works. This also repairs the
      existing inconistency of the posix CPU timer code which kept a single shot
      timer armed despite of the interval being 0.
      
      The separate storage can be removed in mainline, but that needs to be a
      separate commit as the current one has to be backported to stable kernels.
      
      Fixes: 0e334db6 ("posix-timers: Fix division by zero bug")
      Reported-by: default avatarH.J. Lu <hjl.tools@gmail.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20190111133500.840117406@linutronix.deSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c5d2a2cf
    • Daniel Drake's avatar
      x86/kaslr: Fix incorrect i8254 outb() parameters · a0aab0f7
      Daniel Drake authored
      commit 7e6fc2f5 upstream.
      
      The outb() function takes parameters value and port, in that order.  Fix
      the parameters used in the kalsr i8254 fallback code.
      
      Fixes: 5bfce5ef ("x86, kaslr: Provide randomness functions")
      Signed-off-by: default avatarDaniel Drake <drake@endlessm.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: bp@alien8.de
      Cc: hpa@zytor.com
      Cc: linux@endlessm.com
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20190107034024.15005-1-drake@endlessm.comSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a0aab0f7
    • Dave Hansen's avatar
      x86/selftests/pkeys: Fork() to check for state being preserved · ce20aba7
      Dave Hansen authored
      commit e1812933 upstream.
      
      There was a bug where the per-mm pkey state was not being preserved across
      fork() in the child.  fork() is performed in the pkey selftests, but all of
      the pkey activity is performed in the parent.  The child does not perform
      any actions sensitive to pkey state.
      
      To make the test more sensitive to these kinds of bugs, add a fork() where
      the parent exits, and execution continues in the child.
      
      To achieve this let the key exhaustion test not terminate at the first
      allocation failure and fork after 2*NR_PKEYS loops and continue in the
      child.
      Signed-off-by: default avatarDave Hansen <dave.hansen@linux.intel.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: bp@alien8.de
      Cc: hpa@zytor.com
      Cc: peterz@infradead.org
      Cc: mpe@ellerman.id.au
      Cc: will.deacon@arm.com
      Cc: luto@kernel.org
      Cc: jroedel@suse.de
      Cc: stable@vger.kernel.org
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Joerg Roedel <jroedel@suse.de>
      Link: https://lkml.kernel.org/r/20190102215657.585704B7@viggo.jf.intel.comSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ce20aba7
    • Dave Hansen's avatar
      x86/pkeys: Properly copy pkey state at fork() · 92296f7d
      Dave Hansen authored
      commit a31e184e upstream.
      
      Memory protection key behavior should be the same in a child as it was
      in the parent before a fork.  But, there is a bug that resets the
      state in the child at fork instead of preserving it.
      
      The creation of new mm's is a bit convoluted.  At fork(), the code
      does:
      
        1. memcpy() the parent mm to initialize child
        2. mm_init() to initalize some select stuff stuff
        3. dup_mmap() to create true copies that memcpy() did not do right
      
      For pkeys two bits of state need to be preserved across a fork:
      'execute_only_pkey' and 'pkey_allocation_map'.
      
      Those are preserved by the memcpy(), but mm_init() invokes
      init_new_context() which overwrites 'execute_only_pkey' and
      'pkey_allocation_map' with "new" values.
      
      The author of the code erroneously believed that init_new_context is *only*
      called at execve()-time.  But, alas, init_new_context() is used at execve()
      and fork().
      
      The result is that, after a fork(), the child's pkey state ends up looking
      like it does after an execve(), which is totally wrong.  pkeys that are
      already allocated can be allocated again, for instance.
      
      To fix this, add code called by dup_mmap() to copy the pkey state from
      parent to child explicitly.  Also add a comment above init_new_context() to
      make it more clear to the next poor sod what this code is used for.
      
      Fixes: e8c24d3a ("x86/pkeys: Allocation/free syscalls")
      Signed-off-by: default avatarDave Hansen <dave.hansen@linux.intel.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: bp@alien8.de
      Cc: hpa@zytor.com
      Cc: peterz@infradead.org
      Cc: mpe@ellerman.id.au
      Cc: will.deacon@arm.com
      Cc: luto@kernel.org
      Cc: jroedel@suse.de
      Cc: stable@vger.kernel.org
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Joerg Roedel <jroedel@suse.de>
      Link: https://lkml.kernel.org/r/20190102215655.7A69518C@viggo.jf.intel.comSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      92296f7d
    • Alexander Popov's avatar
      KVM: x86: Fix single-step debugging · e9728154
      Alexander Popov authored
      commit 5cc244a2 upstream.
      
      The single-step debugging of KVM guests on x86 is broken: if we run
      gdb 'stepi' command at the breakpoint when the guest interrupts are
      enabled, RIP always jumps to native_apic_mem_write(). Then other
      nasty effects follow.
      
      Long investigation showed that on Jun 7, 2017 the
      commit c8401dda ("KVM: x86: fix singlestepping over syscall")
      introduced the kvm_run.debug corruption: kvm_vcpu_do_singlestep() can
      be called without X86_EFLAGS_TF set.
      
      Let's fix it. Please consider that for -stable.
      Signed-off-by: default avatarAlexander Popov <alex.popov@linux.com>
      Cc: stable@vger.kernel.org
      Fixes: c8401dda ("KVM: x86: fix singlestepping over syscall")
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e9728154
    • Milan Broz's avatar
      dm crypt: fix parsing of extended IV arguments · 06fc0dbe
      Milan Broz authored
      commit 1856b9f7 upstream.
      
      The dm-crypt cipher specification in a mapping table is defined as:
        cipher[:keycount]-chainmode-ivmode[:ivopts]
      or (new crypt API format):
        capi:cipher_api_spec-ivmode[:ivopts]
      
      For ESSIV, the parameter includes hash specification, for example:
      aes-cbc-essiv:sha256
      
      The implementation expected that additional IV option to never include
      another dash '-' character.
      
      But, with SHA3, there are names like sha3-256; so the mapping table
      parser fails:
      
      dmsetup create test --table "0 8 crypt aes-cbc-essiv:sha3-256 9c1185a5c5e9fc54612808977ee8f5b9e 0 /dev/sdb 0"
        or (new crypt API format)
      dmsetup create test --table "0 8 crypt capi:cbc(aes)-essiv:sha3-256 9c1185a5c5e9fc54612808977ee8f5b9e 0 /dev/sdb 0"
      
        device-mapper: crypt: Ignoring unexpected additional cipher options
        device-mapper: table: 253:0: crypt: Error creating IV
        device-mapper: ioctl: error adding target to table
      
      Fix the dm-crypt constructor to ignore additional dash in IV options and
      also remove a bogus warning (that is ignored anyway).
      
      Cc: stable@vger.kernel.org # 4.12+
      Signed-off-by: default avatarMilan Broz <gmazyland@gmail.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      06fc0dbe
    • Joe Thornber's avatar
      dm thin: fix passdown_double_checking_shared_status() · 294710dd
      Joe Thornber authored
      commit d445bd9c upstream.
      
      Commit 00a0ea33 ("dm thin: do not queue freed thin mapping for next
      stage processing") changed process_prepared_discard_passdown_pt1() to
      increment all the blocks being discarded until after the passdown had
      completed to avoid them being prematurely reused.
      
      IO issued to a thin device that breaks sharing with a snapshot, followed
      by a discard issued to snapshot(s) that previously shared the block(s),
      results in passdown_double_checking_shared_status() being called to
      iterate through the blocks double checking their reference count is zero
      and issuing the passdown if so.  So a side effect of commit 00a0ea33
      is passdown_double_checking_shared_status() was broken.
      
      Fix this by checking if the block reference count is greater than 1.
      Also, rename dm_pool_block_is_used() to dm_pool_block_is_shared().
      
      Fixes: 00a0ea33 ("dm thin: do not queue freed thin mapping for next stage processing")
      Cc: stable@vger.kernel.org # 4.9+
      Reported-by: ryan.p.norwood@gmail.com
      Signed-off-by: default avatarJoe Thornber <ejt@redhat.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      294710dd