1. 29 Jun, 2017 40 commits
    • Russell King's avatar
      net: phy: fix marvell phy status reading · 958c8a07
      Russell King authored
      commit 898805e0 upstream.
      
      The Marvell driver incorrectly provides phydev->lp_advertising as the
      logical and of the link partner's advert and our advert.  This is
      incorrect - this field is supposed to store the link parter's unmodified
      advertisment.
      
      This allows ethtool to report the correct link partner auto-negotiation
      status.
      
      Fixes: be937f1f ("Marvell PHY m88e1111 driver fix")
      Signed-off-by: default avatarRussell King <rmk+kernel@armlinux.org.uk>
      Reviewed-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarAmit Pundir <amit.pundir@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      958c8a07
    • Hauke Mehrtens's avatar
      spi: double time out tolerance · dc95f4a3
      Hauke Mehrtens authored
      commit 833bfade upstream.
      
      The generic SPI code calculates how long the issued transfer would take
      and adds 100ms in addition to the timeout as tolerance. On my 500 MHz
      Lantiq Mips SoC I am getting timeouts from the SPI like this when the
      system boots up:
      
      m25p80 spi32766.4: SPI transfer timed out
      blk_update_request: I/O error, dev mtdblock3, sector 2
      SQUASHFS error: squashfs_read_data failed to read block 0x6e
      
      After increasing the tolerance for the timeout to 200ms I haven't seen
      these SPI transfer time outs any more.
      The Lantiq SPI driver in use here has an extra work queue in between,
      which gets triggered when the controller send the last word and the
      hardware FIFOs used for reading and writing are only 8 words long.
      Signed-off-by: default avatarHauke Mehrtens <hauke@hauke-m.de>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarAmit Pundir <amit.pundir@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      dc95f4a3
    • William Wu's avatar
      usb: gadget: f_fs: avoid out of bounds access on comp_desc · 87e67ff6
      William Wu authored
      commit b7f73850 upstream.
      
      Companion descriptor is only used for SuperSpeed endpoints,
      if the endpoints are HighSpeed or FullSpeed, the Companion
      descriptor will not allocated, so we can only access it if
      gadget is SuperSpeed.
      
      I can reproduce this issue on Rockchip platform rk3368 SoC
      which supports USB 2.0, and use functionfs for ADB. Kernel
      build with CONFIG_KASAN=y and CONFIG_SLUB_DEBUG=y report
      the following BUG:
      
      ==================================================================
      BUG: KASAN: slab-out-of-bounds in ffs_func_set_alt+0x224/0x3a0 at addr ffffffc0601f6509
      Read of size 1 by task swapper/0/0
      ============================================================================
      BUG kmalloc-256 (Not tainted): kasan: bad access detected
      ----------------------------------------------------------------------------
      
      Disabling lock debugging due to kernel taint
      INFO: Allocated in ffs_func_bind+0x52c/0x99c age=1275 cpu=0 pid=1
      alloc_debug_processing+0x128/0x17c
      ___slab_alloc.constprop.58+0x50c/0x610
      __slab_alloc.isra.55.constprop.57+0x24/0x34
      __kmalloc+0xe0/0x250
      ffs_func_bind+0x52c/0x99c
      usb_add_function+0xd8/0x1d4
      configfs_composite_bind+0x48c/0x570
      udc_bind_to_driver+0x6c/0x170
      usb_udc_attach_driver+0xa4/0xd0
      gadget_dev_desc_UDC_store+0xcc/0x118
      configfs_write_file+0x1a0/0x1f8
      __vfs_write+0x64/0x174
      vfs_write+0xe4/0x200
      SyS_write+0x68/0xc8
      el0_svc_naked+0x24/0x28
      INFO: Freed in inode_doinit_with_dentry+0x3f0/0x7c4 age=1275 cpu=7 pid=247
      ...
      Call trace:
      [<ffffff900808aab4>] dump_backtrace+0x0/0x230
      [<ffffff900808acf8>] show_stack+0x14/0x1c
      [<ffffff90084ad420>] dump_stack+0xa0/0xc8
      [<ffffff90082157cc>] print_trailer+0x188/0x198
      [<ffffff9008215948>] object_err+0x3c/0x4c
      [<ffffff900821b5ac>] kasan_report+0x324/0x4dc
      [<ffffff900821aa38>] __asan_load1+0x24/0x50
      [<ffffff90089eb750>] ffs_func_set_alt+0x224/0x3a0
      [<ffffff90089d3760>] composite_setup+0xdcc/0x1ac8
      [<ffffff90089d7394>] android_setup+0x124/0x1a0
      [<ffffff90089acd18>] _setup+0x54/0x74
      [<ffffff90089b6b98>] handle_ep0+0x3288/0x4390
      [<ffffff90089b9b44>] dwc_otg_pcd_handle_out_ep_intr+0x14dc/0x2ae4
      [<ffffff90089be85c>] dwc_otg_pcd_handle_intr+0x1ec/0x298
      [<ffffff90089ad680>] dwc_otg_pcd_irq+0x10/0x20
      [<ffffff9008116328>] handle_irq_event_percpu+0x124/0x3ac
      [<ffffff9008116610>] handle_irq_event+0x60/0xa0
      [<ffffff900811af30>] handle_fasteoi_irq+0x10c/0x1d4
      [<ffffff9008115568>] generic_handle_irq+0x30/0x40
      [<ffffff90081159b4>] __handle_domain_irq+0xac/0xdc
      [<ffffff9008080e9c>] gic_handle_irq+0x64/0xa4
      ...
      Memory state around the buggy address:
        ffffffc0601f6400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        ffffffc0601f6480: 00 00 00 00 00 00 00 00 00 00 06 fc fc fc fc fc
       >ffffffc0601f6500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
                             ^
        ffffffc0601f6580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
        ffffffc0601f6600: fc fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00
      ==================================================================
      Signed-off-by: default avatarWilliam Wu <william.wu@rock-chips.com>
      Signed-off-by: default avatarFelipe Balbi <felipe.balbi@linux.intel.com>
      Cc: Jerry Zhang <zhangjerry@google.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      87e67ff6
    • Daniel Vetter's avatar
      drm: Fix GETCONNECTOR regression · 5693426a
      Daniel Vetter authored
      commit e94ac351 upstream.
      
      In
      
      commit 91eefc05
      Author: Daniel Vetter <daniel.vetter@ffwll.ch>
      Date:   Wed Dec 14 00:08:10 2016 +0100
      
          drm: Tighten locking in drm_mode_getconnector
      
      I reordered the logic a bit in that IOCTL, but that broke userspace
      since it'll get the new mode list, but not the new property values.
      Fix that again.
      
      v2: Fix up the error path handling when copy_to_user for the modes
      failes (Dhinakaran).
      
      Fixes: 91eefc05 ("drm: Tighten locking in drm_mode_getconnector")
      Cc: Sean Paul <seanpaul@chromium.org>
      Cc: Daniel Vetter <daniel.vetter@intel.com>
      Cc: Jani Nikula <jani.nikula@linux.intel.com>
      Cc: David Airlie <airlied@linux.ie>
      Cc: dri-devel@lists.freedesktop.org
      Reported-by: default avatar"H.J. Lu" <hjl.tools@gmail.com>
      Tested-by: default avatar"H.J. Lu" <hjl.tools@gmail.com>
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100576
      Cc: "H.J. Lu" <hjl.tools@gmail.com>
      Cc: "Pandiyan, Dhinakaran" <dhinakaran.pandiyan@intel.com>
      Reviewed-by: default avatarSean Paul <seanpaul@chromium.org>
      Reviewed-by: default avatarDhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170620202837.1701-1-daniel.vetter@ffwll.chSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5693426a
    • David Howells's avatar
      rxrpc: Fix several cases where a padded len isn't checked in ticket decode · 575cd7d4
      David Howells authored
      commit 5f2f9765 upstream.
      
      This fixes CVE-2017-7482.
      
      When a kerberos 5 ticket is being decoded so that it can be loaded into an
      rxrpc-type key, there are several places in which the length of a
      variable-length field is checked to make sure that it's not going to
      overrun the available data - but the data is padded to the nearest
      four-byte boundary and the code doesn't check for this extra.  This could
      lead to the size-remaining variable wrapping and the data pointer going
      over the end of the buffer.
      
      Fix this by making the various variable-length data checks use the padded
      length.
      Reported-by: default avatar石磊 <shilei-c@360.cn>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Reviewed-by: default avatarMarc Dionne <marc.c.dionne@auristor.com>
      Reviewed-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      575cd7d4
    • Jarkko Nikula's avatar
      ACPI / scan: Fix enumeration for special SPI and I2C devices · 017294bd
      Jarkko Nikula authored
      commit e4330d8b upstream.
      
      Commit f406270b ("ACPI / scan: Set the visited flag for all
      enumerated devices") caused that two group of special SPI or I2C
      devices do not enumerate. SPI and I2C devices are expected to be
      enumerated by the SPI and I2C subsystems but change caused that
      acpi_bus_attach() marks those devices with acpi_device_set_enumerated().
      
      First group of devices are matched using Device Tree compatible property
      with special _HID "PRP0001". Those devices have matched scan handler,
      acpi_scan_attach_handler() retuns 1 and acpi_bus_attach() marks them
      with acpi_device_set_enumerated().
      
      Second group of devices without valid _HID such as "LNXVIDEO" have
      device->pnp.type.platform_id set to zero and change again marks them
      with acpi_device_set_enumerated().
      
      Fix this by flagging the SPI and I2C devices during struct acpi_device
      object initialization time and let the code in acpi_bus_attach() to go
      through the device_attach() and acpi_default_enumeration() path for all
      SPI and I2C devices.
      
      Fixes: f406270b (ACPI / scan: Set the visited flag for all enumerated devices)
      Signed-off-by: default avatarJarkko Nikula <jarkko.nikula@linux.intel.com>
      Acked-by: default avatarMika Westerberg <mika.westerberg@linux.intel.com>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      017294bd
    • Rafael J. Wysocki's avatar
      ACPI / scan: Apply default enumeration to devices with ACPI drivers · 3645b8f1
      Rafael J. Wysocki authored
      commit f5beabfe upstream.
      
      The current code in acpi_bus_attach() is inconsistent with respect
      to device objects with ACPI drivers bound to them, as it allows
      ACPI drivers to bind to device objects with existing "physical"
      device companions, but it doesn't allow "physical" device objects
      to be created for ACPI device objects with ACPI drivers bound to
      them.  Thus, in some cases, the outcome depends on the ordering
      of events which is confusing at best.
      
      For this reason, modify acpi_bus_attach() to call
      acpi_default_enumeration() for device objects with the
      pnp.type.platform_id flag set regardless of whether or not
      any ACPI drivers are bound to them.
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Reviewed-by: default avatarMika Westerberg <mika.westerberg@linux.intel.com>
      Reviewed-by: default avatarJoey Lee <jlee@suse.com>
      Cc: Jarkko Nikula <jarkko.nikula@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3645b8f1
    • Junshan Fang's avatar
      e79fccb1
    • Alex Deucher's avatar
      drm/amdgpu: adjust default display clock · 033b267d
      Alex Deucher authored
      commit 52b482b0 upstream.
      
      Increase the default display clock on newer asics to
      accomodate some high res modes with really high refresh
      rates.
      
      bug: https://bugs.freedesktop.org/show_bug.cgi?id=93826Acked-by: default avatarChunming Zhou <david1.zhou@amd.com>
      Acked-by: default avatarChristian König <christian.koenig@amd.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      033b267d
    • Alex Deucher's avatar
      drm/amdgpu/atom: fix ps allocation size for EnableDispPowerGating · 0e01b659
      Alex Deucher authored
      commit 05b4017b upstream.
      
      We were using the wrong structure which lead to an overflow
      on some boards.
      
      bug: https://bugs.freedesktop.org/show_bug.cgi?id=101387Acked-by: default avatarChunming Zhou <david1.zhou@amd.com>
      Acked-by: default avatarChristian König <christian.koenig@amd.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0e01b659
    • Alex Deucher's avatar
    • Alex Deucher's avatar
    • Nicholas Bellinger's avatar
      iscsi-target: Reject immediate data underflow larger than SCSI transfer length · 47a1f8d9
      Nicholas Bellinger authored
      commit abb85a9b upstream.
      
      When iscsi WRITE underflow occurs there are two different scenarios
      that can happen.
      
      Normally in practice, when an EDTL vs. SCSI CDB TRANSFER LENGTH
      underflow is detected, the iscsi immediate data payload is the
      smaller SCSI CDB TRANSFER LENGTH.
      
      That is, when a host fabric LLD is using a fixed size EDTL for
      a specific control CDB, the SCSI CDB TRANSFER LENGTH and actual
      SCSI payload ends up being smaller than EDTL.  In iscsi, this
      means the received iscsi immediate data payload matches the
      smaller SCSI CDB TRANSFER LENGTH, because there is no more
      SCSI payload to accept beyond SCSI CDB TRANSFER LENGTH.
      
      However, it's possible for a malicous host to send a WRITE
      underflow where EDTL is larger than SCSI CDB TRANSFER LENGTH,
      but incoming iscsi immediate data actually matches EDTL.
      
      In the wild, we've never had a iscsi host environment actually
      try to do this.
      
      For this special case, it's wrong to truncate part of the
      control CDB payload and continue to process the command during
      underflow when immediate data payload received was larger than
      SCSI CDB TRANSFER LENGTH, so go ahead and reject and drop the
      bogus payload as a defensive action.
      
      Note this potential bug was originally relaxed by the following
      for allowing WRITE underflow in MSFT FCP host environments:
      
         commit c72c5250
         Author: Roland Dreier <roland@purestorage.com>
         Date:   Wed Jul 22 15:08:18 2015 -0700
      
            target: allow underflow/overflow for PR OUT etc. commands
      
      Cc: Roland Dreier <roland@purestorage.com>
      Cc: Mike Christie <mchristi@redhat.com>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      47a1f8d9
    • Nicholas Bellinger's avatar
      iscsi-target: Fix delayed logout processing greater than SECONDS_FOR_LOGOUT_COMP · 600966d6
      Nicholas Bellinger authored
      commit 105fa2f4 upstream.
      
      This patch fixes a BUG() in iscsit_close_session() that could be
      triggered when iscsit_logout_post_handler() execution from within
      tx thread context was not run for more than SECONDS_FOR_LOGOUT_COMP
      (15 seconds), and the TCP connection didn't already close before
      then forcing tx thread context to automatically exit.
      
      This would manifest itself during explicit logout as:
      
      [33206.974254] 1 connection(s) still exist for iSCSI session to iqn.1993-08.org.debian:01:3f5523242179
      [33206.980184] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 2100.772 msecs
      [33209.078643] ------------[ cut here ]------------
      [33209.078646] kernel BUG at drivers/target/iscsi/iscsi_target.c:4346!
      
      Normally when explicit logout attempt fails, the tx thread context
      exits and iscsit_close_connection() from rx thread context does the
      extra cleanup once it detects conn->conn_logout_remove has not been
      cleared by the logout type specific post handlers.
      
      To address this special case, if the logout post handler in tx thread
      context detects conn->tx_thread_active has already been cleared, simply
      return and exit in order for existing iscsit_close_connection()
      logic from rx thread context do failed logout cleanup.
      Reported-by: default avatarBart Van Assche <bart.vanassche@sandisk.com>
      Tested-by: default avatarBart Van Assche <bart.vanassche@sandisk.com>
      Cc: Mike Christie <mchristi@redhat.com>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: Sagi Grimberg <sagig@mellanox.com>
      Tested-by: default avatarGary Guo <ghg@datera.io>
      Tested-by: default avatarChu Yuan Lin <cyl@datera.io>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      600966d6
    • Nicholas Bellinger's avatar
      target: Fix kref->refcount underflow in transport_cmd_finish_abort · 852a8042
      Nicholas Bellinger authored
      commit 73d4e580 upstream.
      
      This patch fixes a se_cmd->cmd_kref underflow during CMD_T_ABORTED
      when a fabric driver drops it's second reference from below the
      target_core_tmr.c based callers of transport_cmd_finish_abort().
      
      Recently with the conversion of kref to refcount_t, this bug was
      manifesting itself as:
      
      [705519.601034] refcount_t: underflow; use-after-free.
      [705519.604034] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 20116.512 msecs
      [705539.719111] ------------[ cut here ]------------
      [705539.719117] WARNING: CPU: 3 PID: 26510 at lib/refcount.c:184 refcount_sub_and_test+0x33/0x51
      
      Since the original kref atomic_t based kref_put() didn't check for
      underflow and only invoked the final callback when zero was reached,
      this bug did not manifest in practice since all se_cmd memory is
      using preallocated tags.
      
      To address this, go ahead and propigate the existing return from
      transport_put_cmd() up via transport_cmd_finish_abort(), and
      change transport_cmd_finish_abort() + core_tmr_handle_tas_abort()
      callers to only do their local target_put_sess_cmd() if necessary.
      Reported-by: default avatarBart Van Assche <bart.vanassche@sandisk.com>
      Tested-by: default avatarBart Van Assche <bart.vanassche@sandisk.com>
      Cc: Mike Christie <mchristi@redhat.com>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Himanshu Madhani <himanshu.madhani@qlogic.com>
      Cc: Sagi Grimberg <sagig@mellanox.com>
      Tested-by: default avatarGary Guo <ghg@datera.io>
      Tested-by: default avatarChu Yuan Lin <cyl@datera.io>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      852a8042
    • Will Deacon's avatar
      arm64/vdso: Fix nsec handling for CLOCK_MONOTONIC_RAW · 830146b6
      Will Deacon authored
      commit dbb236c1 upstream.
      
      Recently vDSO support for CLOCK_MONOTONIC_RAW was added in
      49eea433 ("arm64: Add support for CLOCK_MONOTONIC_RAW in
      clock_gettime() vDSO"). Noticing that the core timekeeping code
      never set tkr_raw.xtime_nsec, the vDSO implementation didn't
      bother exposing it via the data page and instead took the
      unshifted tk->raw_time.tv_nsec value which was then immediately
      shifted left in the vDSO code.
      
      Unfortunately, by accellerating the MONOTONIC_RAW clockid, it
      uncovered potential 1ns time inconsistencies caused by the
      timekeeping core not handing sub-ns resolution.
      
      Now that the core code has been fixed and is actually setting
      tkr_raw.xtime_nsec, we need to take that into account in the
      vDSO by adding it to the shifted raw_time value, in order to
      fix the user-visible inconsistency. Rather than do that at each
      use (and expand the data page in the process), instead perform
      the shift/addition operation when populating the data page and
      remove the shift from the vDSO code entirely.
      
      [jstultz: minor whitespace tweak, tried to improve commit
       message to make it more clear this fixes a regression]
      Reported-by: default avatarJohn Stultz <john.stultz@linaro.org>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarJohn Stultz <john.stultz@linaro.org>
      Tested-by: default avatarDaniel Mentz <danielmentz@google.com>
      Acked-by: default avatarKevin Brodsky <kevin.brodsky@arm.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Stephen Boyd <stephen.boyd@linaro.org>
      Cc: Miroslav Lichvar <mlichvar@redhat.com>
      Link: http://lkml.kernel.org/r/1496965462-20003-4-git-send-email-john.stultz@linaro.orgSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      830146b6
    • John Stultz's avatar
      time: Fix CLOCK_MONOTONIC_RAW sub-nanosecond accounting · 102d12f1
      John Stultz authored
      commit 3d88d56c upstream.
      
      Due to how the MONOTONIC_RAW accumulation logic was handled,
      there is the potential for a 1ns discontinuity when we do
      accumulations. This small discontinuity has for the most part
      gone un-noticed, but since ARM64 enabled CLOCK_MONOTONIC_RAW
      in their vDSO clock_gettime implementation, we've seen failures
      with the inconsistency-check test in kselftest.
      
      This patch addresses the issue by using the same sub-ns
      accumulation handling that CLOCK_MONOTONIC uses, which avoids
      the issue for in-kernel users.
      
      Since the ARM64 vDSO implementation has its own clock_gettime
      calculation logic, this patch reduces the frequency of errors,
      but failures are still seen. The ARM64 vDSO will need to be
      updated to include the sub-nanosecond xtime_nsec values in its
      calculation for this issue to be completely fixed.
      Signed-off-by: default avatarJohn Stultz <john.stultz@linaro.org>
      Tested-by: default avatarDaniel Mentz <danielmentz@google.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Kevin Brodsky <kevin.brodsky@arm.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Stephen Boyd <stephen.boyd@linaro.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Miroslav Lichvar <mlichvar@redhat.com>
      Link: http://lkml.kernel.org/r/1496965462-20003-3-git-send-email-john.stultz@linaro.orgSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      102d12f1
    • John Stultz's avatar
      time: Fix clock->read(clock) race around clocksource changes · fa79bdd4
      John Stultz authored
      commit ceea5e37 upstream.
      
      In tests, which excercise switching of clocksources, a NULL
      pointer dereference can be observed on AMR64 platforms in the
      clocksource read() function:
      
      u64 clocksource_mmio_readl_down(struct clocksource *c)
      {
      	return ~(u64)readl_relaxed(to_mmio_clksrc(c)->reg) & c->mask;
      }
      
      This is called from the core timekeeping code via:
      
      	cycle_now = tkr->read(tkr->clock);
      
      tkr->read is the cached tkr->clock->read() function pointer.
      When the clocksource is changed then tkr->clock and tkr->read
      are updated sequentially. The code above results in a sequential
      load operation of tkr->read and tkr->clock as well.
      
      If the store to tkr->clock hits between the loads of tkr->read
      and tkr->clock, then the old read() function is called with the
      new clock pointer. As a consequence the read() function
      dereferences a different data structure and the resulting 'reg'
      pointer can point anywhere including NULL.
      
      This problem was introduced when the timekeeping code was
      switched over to use struct tk_read_base. Before that, it was
      theoretically possible as well when the compiler decided to
      reload clock in the code sequence:
      
           now = tk->clock->read(tk->clock);
      
      Add a helper function which avoids the issue by reading
      tk_read_base->clock once into a local variable clk and then issue
      the read function via clk->read(clk). This guarantees that the
      read() function always gets the proper clocksource pointer handed
      in.
      
      Since there is now no use for the tkr.read pointer, this patch
      also removes it, and to address stopping the fast timekeeper
      during suspend/resume, it introduces a dummy clocksource to use
      rather then just a dummy read function.
      Signed-off-by: default avatarJohn Stultz <john.stultz@linaro.org>
      Acked-by: default avatarIngo Molnar <mingo@kernel.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Stephen Boyd <stephen.boyd@linaro.org>
      Cc: Miroslav Lichvar <mlichvar@redhat.com>
      Cc: Daniel Mentz <danielmentz@google.com>
      Link: http://lkml.kernel.org/r/1496965462-20003-2-git-send-email-john.stultz@linaro.orgSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fa79bdd4
    • Arend Van Spriel's avatar
      brcmfmac: unbind all devices upon failure in firmware callback · 0f3ae212
      Arend Van Spriel authored
      commit 7a51461f upstream.
      
      When request firmware fails, brcmf_ops_sdio_remove is being called and
      brcmf_bus freed. In such circumstancies if you do a suspend/resume cycle
      the kernel hangs on resume due a NULL pointer dereference in resume
      function. So in brcmf_sdio_firmware_callback() we need to unbind the
      driver from both sdio_func devices when firmware load failure is indicated.
      Tested-by: default avatarEnric Balletbo i Serra <enric.balletbo@collabora.com>
      Reviewed-by: default avatarHante Meuleman <hante.meuleman@broadcom.com>
      Reviewed-by: default avatarPieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com>
      Reviewed-by: default avatarFranky Lin <franky.lin@broadcom.com>
      Signed-off-by: default avatarArend van Spriel <arend.vanspriel@broadcom.com>
      Signed-off-by: default avatarKalle Valo <kvalo@codeaurora.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0f3ae212
    • Arend Van Spriel's avatar
      brcmfmac: use firmware callback upon failure to load · 2cb9b88c
      Arend Van Spriel authored
      commit 03fb0e83 upstream.
      
      When firmware loading failed the code used to unbind the device provided
      by the calling code. However, for the sdio driver two devices are bound
      and both need to be released upon failure. The callback has been extended
      with parameter to pass error code so add that in this commit upon firmware
      loading failure.
      Reviewed-by: default avatarHante Meuleman <hante.meuleman@broadcom.com>
      Reviewed-by: default avatarPieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com>
      Reviewed-by: default avatarFranky Lin <franky.lin@broadcom.com>
      Signed-off-by: default avatarArend van Spriel <arend.vanspriel@broadcom.com>
      Signed-off-by: default avatarKalle Valo <kvalo@codeaurora.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2cb9b88c
    • Arend Van Spriel's avatar
      brcmfmac: add parameter to pass error code in firmware callback · 1e5c7193
      Arend Van Spriel authored
      commit 6d0507a7 upstream.
      
      Extend the parameters in the firmware callback so it can be called
      upon success and failure. This allows the caller to properly clear
      all resources in the failure path. Right now the error code is
      always zero, ie. success.
      Reviewed-by: default avatarHante Meuleman <hante.meuleman@broadcom.com>
      Reviewed-by: default avatarPieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com>
      Reviewed-by: default avatarFranky Lin <franky.lin@broadcom.com>
      Signed-off-by: default avatarArend van Spriel <arend.vanspriel@broadcom.com>
      Signed-off-by: default avatarKalle Valo <kvalo@codeaurora.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1e5c7193
    • Daniel Drake's avatar
      Input: i8042 - add Fujitsu Lifebook AH544 to notimeout list · b6bfede2
      Daniel Drake authored
      commit 817ae460 upstream.
      
      Without this quirk, the touchpad is not responsive on this product, with
      the following message repeated in the logs:
      
       psmouse serio1: bad data from KBC - timeout
      
      Add it to the notimeout list alongside other similar Fujitsu laptops.
      Signed-off-by: default avatarDaniel Drake <drake@endlessm.com>
      Signed-off-by: default avatarDmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b6bfede2
    • Naveen N. Rao's avatar
      powerpc/64s: Handle data breakpoints in Radix mode · fc1c180b
      Naveen N. Rao authored
      commit d89ba535 upstream.
      
      On Power9, trying to use data breakpoints throws the splat shown
      below. This is because the check for a data breakpoint in DSISR is in
      do_hash_page(), which is not called when in Radix mode.
      
        Unable to handle kernel paging request for data at address 0xc000000000e19218
        Faulting instruction address: 0xc0000000001155e8
        cpu 0x0: Vector: 300 (Data Access) at [c0000000ef1e7b20]
        pc: c0000000001155e8: find_pid_ns+0x48/0xe0
        lr: c000000000116ac4: find_task_by_vpid+0x44/0x90
        sp: c0000000ef1e7da0
        msr: 9000000000009033
        dar: c000000000e19218
        dsisr: 400000
      
      Move the check to handle_page_fault() so as to catch data breakpoints
      in both Hash and Radix MMU modes.
      
      We have to change the check in do_hash_page() against 0xa410 to use
      0xa450, so as to include the value of (DSISR_DABRMATCH << 16).
      
      There are two sites that call handle_page_fault() when in Radix, both
      already pass DSISR in r4.
      
      Fixes: caca285e ("powerpc/mm/radix: Use STD_MMU_64 to properly isolate hash related code")
      Reported-by: default avatarShriya R. Kulkarni <shriykul@in.ibm.com>
      Signed-off-by: default avatarNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      [mpe: Fix the fall-through case on hash, we need to reload DSISR]
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fc1c180b
    • Naveen N. Rao's avatar
      powerpc/kprobes: Pause function_graph tracing during jprobes handling · 7a6c5053
      Naveen N. Rao authored
      commit a9f8553e upstream.
      
      This fixes a crash when function_graph and jprobes are used together.
      This is essentially commit 237d28db ("ftrace/jprobes/x86: Fix
      conflict between jprobes and function graph tracing"), but for powerpc.
      
      Jprobes breaks function_graph tracing since the jprobe hook needs to use
      jprobe_return(), which never returns back to the hook, but instead to
      the original jprobe'd function. The solution is to momentarily pause
      function_graph tracing before invoking the jprobe hook and re-enable it
      when returning back to the original jprobe'd function.
      
      Fixes: 6794c782 ("powerpc64: port of the function graph tracer")
      Signed-off-by: default avatarNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Acked-by: default avatarMasami Hiramatsu <mhiramat@kernel.org>
      Acked-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7a6c5053
    • Eric W. Biederman's avatar
      signal: Only reschedule timers on signals timers have sent · e6281e0d
      Eric W. Biederman authored
      commit 57db7e4a upstream.
      
      Thomas Gleixner  wrote:
      > The CRIU support added a 'feature' which allows a user space task to send
      > arbitrary (kernel) signals to itself. The changelog says:
      >
      >   The kernel prevents sending of siginfo with positive si_code, because
      >   these codes are reserved for kernel.  I think we can allow a task to
      >   send such a siginfo to itself.  This operation should not be dangerous.
      >
      > Quite contrary to that claim, it turns out that it is outright dangerous
      > for signals with info->si_code == SI_TIMER. The following code sequence in
      > a user space task allows to crash the kernel:
      >
      >    id = timer_create(CLOCK_XXX, ..... signo = SIGX);
      >    timer_set(id, ....);
      >    info->si_signo = SIGX;
      >    info->si_code = SI_TIMER:
      >    info->_sifields._timer._tid = id;
      >    info->_sifields._timer._sys_private = 2;
      >    rt_[tg]sigqueueinfo(..., SIGX, info);
      >    sigemptyset(&sigset);
      >    sigaddset(&sigset, SIGX);
      >    rt_sigtimedwait(sigset, info);
      >
      > For timers based on CLOCK_PROCESS_CPUTIME_ID, CLOCK_THREAD_CPUTIME_ID this
      > results in a kernel crash because sigwait() dequeues the signal and the
      > dequeue code observes:
      >
      >   info->si_code == SI_TIMER && info->_sifields._timer._sys_private != 0
      >
      > which triggers the following callchain:
      >
      >  do_schedule_next_timer() -> posix_cpu_timer_schedule() -> arm_timer()
      >
      > arm_timer() executes a list_add() on the timer, which is already armed via
      > the timer_set() syscall. That's a double list add which corrupts the posix
      > cpu timer list. As a consequence the kernel crashes on the next operation
      > touching the posix cpu timer list.
      >
      > Posix clocks which are internally implemented based on hrtimers are not
      > affected by this because hrtimer_start() can handle already armed timers
      > nicely, but it's a reliable way to trigger the WARN_ON() in
      > hrtimer_forward(), which complains about calling that function on an
      > already armed timer.
      
      This problem has existed since the posix timer code was merged into
      2.5.63. A few releases earlier in 2.5.60 ptrace gained the ability to
      inject not just a signal (which linux has supported since 1.0) but the
      full siginfo of a signal.
      
      The core problem is that the code will reschedule in response to
      signals getting dequeued not just for signals the timers sent but
      for other signals that happen to a si_code of SI_TIMER.
      
      Avoid this confusion by testing to see if the queued signal was
      preallocated as all timer signals are preallocated, and so far
      only the timer code preallocates signals.
      
      Move the check for if a timer needs to be rescheduled up into
      collect_signal where the preallocation check must be performed,
      and pass the result back to dequeue_signal where the code reschedules
      timers.   This makes it clear why the code cares about preallocated
      timers.
      Reported-by: default avatarThomas Gleixner <tglx@linutronix.de>
      History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
      Reference: 66dd34ad ("signal: allow to send any siginfo to itself")
      Reference: 1669ce53 ("Add PTRACE_GETSIGINFO and PTRACE_SETSIGINFO")
      Fixes: db8b50ba ("[PATCH] POSIX clocks & timers")
      Signed-off-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e6281e0d
    • Jason A. Donenfeld's avatar
      random: silence compiler warnings and fix race · 3a182635
      Jason A. Donenfeld authored
      commit 4a072c71 upstream.
      
      Odd versions of gcc for the sh4 architecture will actually warn about
      flags being used while uninitialized, so we set them to zero. Non crazy
      gccs will optimize that out again, so it doesn't make a difference.
      
      Next, over aggressive gccs could inline the expression that defines
      use_lock, which could then introduce a race resulting in a lock
      imbalance. By using READ_ONCE, we prevent that fate. Finally, we make
      that assignment const, so that gcc can still optimize a nice amount.
      
      Finally, we fix a potential deadlock between primary_crng.lock and
      batched_entropy_reset_lock, where they could be called in opposite
      order. Moving the call to invalidate_batched_entropy to outside the lock
      rectifies this issue.
      
      Fixes: b169c13dSigned-off-by: default avatarJason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3a182635
    • Sebastian Parschauer's avatar
      HID: Add quirk for Dell PIXART OEM mouse · 1429fdb1
      Sebastian Parschauer authored
      commit 3db28271 upstream.
      
      This mouse is also known under other IDs. It needs the quirk
      ALWAYS_POLL or will disconnect in runlevel 1 or 3.
      Signed-off-by: default avatarSebastian Parschauer <sparschauer@suse.de>
      Signed-off-by: default avatarJiri Kosina <jkosina@suse.cz>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1429fdb1
    • Raju Rangoju's avatar
      cxgb4: notify uP to route ctrlq compl to rdma rspq · abef7afb
      Raju Rangoju authored
      commit dec6b331 upstream.
      
      During the module initialisation there is a possible race
      (basically race between uld and lld) where neither the uld
      nor lld notifies the uP about where to route the ctrl queue
      completions. LLD skips notifying uP as the rdma queues were
      not created by then (will leave it to ULD to notify the uP).
      As the ULD comes up, it also skips notifying the uP as the
      flag FULL_INIT_DONE is not set yet (ULD assumes that the
      interface is not up yet).
      
      Consequently, this race between uld and lld leaves uP
      unnotified about where to send the ctrl queue completions
      to, leading to iwarp RI_RES WR failure.
      
      Here is the race:
      
      CPU 0                                   CPU1
      
      - allocates nic rx queus
      - t4_sge_alloc_ctrl_txq()
      (if rdma rsp queues exists,
      tell uP to route ctrl queue
      compl to rdma rspq)
                                      - acquires the mutex_lock
                                      - allocates rdma response queues
                                      - if FULL_INIT_DONE set,
                                        tell uP to route ctrl queue compl
                                        to rdma rspq
                                      - relinquishes mutex_lock
      - acquires the mutex_lock
      - enable_rx()
      - set FULL_INIT_DONE
      - relinquishes mutex_lock
      
      This patch fixes the above issue.
      
      Fixes: e7519f99('cxgb4: avoid enabling napi twice to the same queue')
      Signed-off-by: default avatarRaju Rangoju <rajur@chelsio.com>
      Acked-by: default avatarSteve Wise <swise@opengridcomputing.com>
      Signed-off-by: default avatarGanesh Goudar <ganeshgr@chelsio.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      abef7afb
    • Christophe Jaillet's avatar
      CIFS: Fix some return values in case of error in 'crypt_message' · 4eec15d3
      Christophe Jaillet authored
      commit 517a6e43 upstream.
      
      'rc' is known to be 0 at this point. So if 'init_sg' or 'kzalloc' fails, we
      should return -ENOMEM instead.
      
      Also remove a useless 'rc' in a debug message as it is meaningless here.
      
      Fixes: 026e93dc ("CIFS: Encrypt SMB3 requests before sending")
      Signed-off-by: default avatarChristophe JAILLET <christophe.jaillet@wanadoo.fr>
      Reviewed-by: default avatarPavel Shilovsky <pshilov@microsoft.com>
      Reviewed-by: default avatarAurelien Aptel <aaptel@suse.com>
      Signed-off-by: default avatarSteve French <smfrench@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4eec15d3
    • Pavel Shilovsky's avatar
      CIFS: Improve readdir verbosity · dcaa5a53
      Pavel Shilovsky authored
      commit dcd87838 upstream.
      
      Downgrade the loglevel for SMB2 to prevent filling the log
      with messages if e.g. readdir was interrupted. Also make SMB2
      and SMB1 codepaths do the same logging during readdir.
      Signed-off-by: default avatarPavel Shilovsky <pshilov@microsoft.com>
      Signed-off-by: default avatarSteve French <smfrench@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      dcaa5a53
    • Paul Mackerras's avatar
      KVM: PPC: Book3S HV: Save/restore host values of debug registers · 3ff8eda2
      Paul Mackerras authored
      commit 7ceaa6dc upstream.
      
      At present, HV KVM on POWER8 and POWER9 machines loses any instruction
      or data breakpoint set in the host whenever a guest is run.
      Instruction breakpoints are currently only used by xmon, but ptrace
      and the perf_event subsystem can set data breakpoints as well as xmon.
      
      To fix this, we save the host values of the debug registers (CIABR,
      DAWR and DAWRX) before entering the guest and restore them on exit.
      To provide space to save them in the stack frame, we expand the stack
      frame allocated by kvmppc_hv_entry() from 112 to 144 bytes.
      
      Fixes: b005255e ("KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs", 2014-01-08)
      Signed-off-by: default avatarPaul Mackerras <paulus@ozlabs.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3ff8eda2
    • Paul Mackerras's avatar
      KVM: PPC: Book3S HV: Restore critical SPRs to host values on guest exit · 324df574
      Paul Mackerras authored
      commit 4c3bb4cc upstream.
      
      This restores several special-purpose registers (SPRs) to sane values
      on guest exit that were missed before.
      
      TAR and VRSAVE are readable and writable by userspace, and we need to
      save and restore them to prevent the guest from potentially affecting
      userspace execution (not that TAR or VRSAVE are used by any known
      program that run uses the KVM_RUN ioctl).  We save/restore these
      in kvmppc_vcpu_run_hv() rather than on every guest entry/exit.
      
      FSCR affects userspace execution in that it can prohibit access to
      certain facilities by userspace.  We restore it to the normal value
      for the task on exit from the KVM_RUN ioctl.
      
      IAMR is normally 0, and is restored to 0 on guest exit.  However,
      with a radix host on POWER9, it is set to a value that prevents the
      kernel from executing user-accessible memory.  On POWER9, we save
      IAMR on guest entry and restore it on guest exit to the saved value
      rather than 0.  On POWER8 we continue to set it to 0 on guest exit.
      
      PSPB is normally 0.  We restore it to 0 on guest exit to prevent
      userspace taking advantage of the guest having set it non-zero
      (which would allow userspace to set its SMT priority to high).
      
      UAMOR is normally 0.  We restore it to 0 on guest exit to prevent
      the AMR from being used as a covert channel between userspace
      processes, since the AMR is not context-switched at present.
      
      Fixes: b005255e ("KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs", 2014-01-08)
      Signed-off-by: default avatarPaul Mackerras <paulus@ozlabs.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      324df574
    • Paul Mackerras's avatar
      KVM: PPC: Book3S HV: Context-switch EBB registers properly · 4ae3f358
      Paul Mackerras authored
      commit ca8efa1d upstream.
      
      This adds code to save the values of three SPRs (special-purpose
      registers) used by userspace to control event-based branches (EBBs),
      which are essentially interrupts that get delivered directly to
      userspace.  These registers are loaded up with guest values when
      entering the guest, and their values are saved when exiting the
      guest, but we were not saving the host values and restoring them
      before going back to userspace.
      
      On POWER8 this would only affect userspace programs which explicitly
      request the use of EBBs and also use the KVM_RUN ioctl, since the
      only source of EBBs on POWER8 is the PMU, and there is an explicit
      enable bit in the PMU registers (and those PMU registers do get
      properly context-switched between host and guest).  On POWER9 there
      is provision for externally-generated EBBs, and these are not subject
      to the control in the PMU registers.
      
      Since these registers only affect userspace, we can save them when
      we first come in from userspace and restore them before returning to
      userspace, rather than saving/restoring the host values on every
      guest entry/exit.  Similarly, we don't need to worry about their
      values on offline secondary threads since they execute in the context
      of the idle task, which never executes in userspace.
      
      Fixes: b005255e ("KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs", 2014-01-08)
      Signed-off-by: default avatarPaul Mackerras <paulus@ozlabs.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4ae3f358
    • Paul Mackerras's avatar
      KVM: PPC: Book3S HV: Ignore timebase offset on POWER9 DD1 · a7d29e27
      Paul Mackerras authored
      commit 3d3efb68 upstream.
      
      POWER9 DD1 has an erratum where writing to the TBU40 register, which
      is used to apply an offset to the timebase, can cause the timebase to
      lose counts.  This results in the timebase on some CPUs getting out of
      sync with other CPUs, which then results in misbehaviour of the
      timekeeping code.
      
      To work around the problem, we make KVM ignore the timebase offset for
      all guests on POWER9 DD1 machines.  This means that live migration
      cannot be supported on POWER9 DD1 machines.
      Signed-off-by: default avatarPaul Mackerras <paulus@ozlabs.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a7d29e27
    • Paul Mackerras's avatar
      KVM: PPC: Book3S HV: Preserve userspace HTM state properly · 4d9c6828
      Paul Mackerras authored
      commit 46a704f8 upstream.
      
      If userspace attempts to call the KVM_RUN ioctl when it has hardware
      transactional memory (HTM) enabled, the values that it has put in the
      HTM-related SPRs TFHAR, TFIAR and TEXASR will get overwritten by
      guest values.  To fix this, we detect this condition and save those
      SPR values in the thread struct, and disable HTM for the task.  If
      userspace goes to access those SPRs or the HTM facility in future,
      a TM-unavailable interrupt will occur and the handler will reload
      those SPRs and re-enable HTM.
      
      If userspace has started a transaction and suspended it, we would
      currently lose the transactional state in the guest entry path and
      would almost certainly get a "TM Bad Thing" interrupt, which would
      cause the host to crash.  To avoid this, we detect this case and
      return from the KVM_RUN ioctl with an EINVAL error, with the KVM
      exit reason set to KVM_EXIT_FAIL_ENTRY.
      
      Fixes: b005255e ("KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs", 2014-01-08)
      Signed-off-by: default avatarPaul Mackerras <paulus@ozlabs.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4d9c6828
    • Paul Mackerras's avatar
      KVM: PPC: Book3S HV: Cope with host using large decrementer mode · 1e0d7996
      Paul Mackerras authored
      commit 2f272463 upstream.
      
      POWER9 introduces a new mode for the decrementer register, called
      large decrementer mode, in which the decrementer counter is 56 bits
      wide rather than 32, and reads are sign-extended rather than
      zero-extended.  For the decrementer, this new mode is optional and
      controlled by a bit in the LPCR.  The hypervisor decrementer (HDEC)
      is 56 bits wide on POWER9 and has no mode control.
      
      Since KVM code reads and writes the decrementer and hypervisor
      decrementer registers in a few places, it needs to be aware of the
      need to treat the decrementer value as a 64-bit quantity, and only do
      a 32-bit sign extension when large decrementer mode is not in effect.
      Similarly, the HDEC should always be treated as a 64-bit quantity on
      POWER9.  We define a new EXTEND_HDEC macro to encapsulate the feature
      test for POWER9 and the sign extension.
      
      To enable the sign extension to be removed in large decrementer mode,
      we test the LPCR_LD bit in the host LPCR image stored in the struct
      kvm for the guest.  If is set then large decrementer mode is enabled
      and the sign extension should be skipped.
      
      This is partly based on an earlier patch by Oliver O'Halloran.
      Signed-off-by: default avatarPaul Mackerras <paulus@ozlabs.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1e0d7996
    • Heiko Carstens's avatar
      KVM: s390: gaccess: fix real-space designation asce handling for gmap shadows · 5d34bddd
      Heiko Carstens authored
      commit addb63c1 upstream.
      
      For real-space designation asces the asce origin part is only a token.
      The asce token origin must not be used to generate an effective
      address for storage references. This however is erroneously done
      within kvm_s390_shadow_tables().
      
      Furthermore within the same function the wrong parts of virtual
      addresses are used to generate a corresponding real address
      (e.g. the region second index is used as region first index).
      
      Both of the above can result in incorrect address translations. Only
      for real space designations with a token origin of zero and addresses
      below one megabyte the translation was correct.
      
      Furthermore replace a "!asce.r" statement with a "!*fake" statement to
      make it more obvious that a specific condition has nothing to do with
      the architecture, but with the fake handling of real space designations.
      
      Fixes: 3218f709 ("s390/mm: support real-space for gmap shadows")
      Cc: David Hildenbrand <david@redhat.com>
      Signed-off-by: default avatarHeiko Carstens <heiko.carstens@de.ibm.com>
      Reviewed-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: default avatarChristian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5d34bddd
    • James Cowgill's avatar
      KVM: MIPS: Fix maybe-uninitialized build failure · 87ee4cdd
      James Cowgill authored
      commit e27a9eca upstream.
      
      This commit fixes a "maybe-uninitialized" build failure in
      arch/mips/kvm/tlb.c when KVM, DYNAMIC_DEBUG and JUMP_LABEL are all
      enabled. The failure is:
      
      In file included from ./include/linux/printk.h:329:0,
                       from ./include/linux/kernel.h:13,
                       from ./include/asm-generic/bug.h:15,
                       from ./arch/mips/include/asm/bug.h:41,
                       from ./include/linux/bug.h:4,
                       from ./include/linux/thread_info.h:11,
                       from ./include/asm-generic/current.h:4,
                       from ./arch/mips/include/generated/asm/current.h:1,
                       from ./include/linux/sched.h:11,
                       from arch/mips/kvm/tlb.c:13:
      arch/mips/kvm/tlb.c: In function ‘kvm_mips_host_tlb_inv’:
      ./include/linux/dynamic_debug.h:126:3: error: ‘idx_kernel’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
         __dynamic_pr_debug(&descriptor, pr_fmt(fmt), \
         ^~~~~~~~~~~~~~~~~~
      arch/mips/kvm/tlb.c:169:16: note: ‘idx_kernel’ was declared here
        int idx_user, idx_kernel;
                      ^~~~~~~~~~
      
      There is a similar error relating to "idx_user". Both errors were
      observed with GCC 6.
      
      As far as I can tell, it is impossible for either idx_user or idx_kernel
      to be uninitialized when they are later read in the calls to kvm_debug,
      but to satisfy the compiler, add zero initializers to both variables.
      Signed-off-by: default avatarJames Cowgill <James.Cowgill@imgtec.com>
      Fixes: 57e3869c ("KVM: MIPS/TLB: Generalise host TLB invalidate to kernel ASID")
      Acked-by: default avatarJames Hogan <james.hogan@imgtec.com>
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      87ee4cdd
    • Paolo Bonzini's avatar
      KVM: x86: fix singlestepping over syscall · 3af2b32a
      Paolo Bonzini authored
      commit c8401dda upstream.
      
      TF is handled a bit differently for syscall and sysret, compared
      to the other instructions: TF is checked after the instruction completes,
      so that the OS can disable #DB at a syscall by adding TF to FMASK.
      When the sysret is executed the #DB is taken "as if" the syscall insn
      just completed.
      
      KVM emulates syscall so that it can trap 32-bit syscall on Intel processors.
      Fix the behavior, otherwise you could get #DB on a user stack which is not
      nice.  This does not affect Linux guests, as they use an IST or task gate
      for #DB.
      
      This fixes CVE-2017-7518.
      Reported-by: default avatarAndy Lutomirski <luto@kernel.org>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3af2b32a
    • Björn Töpel's avatar
      perf probe: Fix probe definition for inlined functions · 75d73538
      Björn Töpel authored
      commit 7598f8bc upstream.
      
      In commit 613f050d ("perf probe: Fix to probe on gcc generated
      functions in modules"), the offset from symbol is, incorrectly, added
      to the trace point address. This leads to incorrect probe trace points
      for inlined functions and when using relative line number on symbols.
      
      Prior this patch:
        $ perf probe -m nf_nat -D in_range
        p:probe/in_range nf_nat:in_range.isra.9+0
        $ perf probe -m i40e -D i40e_clean_rx_irq
        p:probe/i40e_clean_rx_irq i40e:i40e_napi_poll+2212
        $ perf probe -m i40e -D i40e_clean_rx_irq:16
        p:probe/i40e_clean_rx_irq i40e:i40e_lan_xmit_frame+626
      
      After:
        $ perf probe -m nf_nat -D in_range
        p:probe/in_range nf_nat:in_range.isra.9+0
        $ perf probe -m i40e -D i40e_clean_rx_irq
        p:probe/i40e_clean_rx_irq i40e:i40e_napi_poll+1106
        $ perf probe -m i40e -D i40e_clean_rx_irq:16
        p:probe/i40e_clean_rx_irq i40e:i40e_napi_poll+2665
      
      Committer testing:
      
      Using 'pfunct', a tool found in the 'dwarves' package [1], one can ask what are
      the functions that while not being explicitely marked as inline, were inlined
      by the compiler:
      
        # pfunct --cc_inlined /lib/modules/4.12.0-rc4+/kernel/drivers/net/ethernet/intel/e1000e/e1000e.ko | head
        __ew32
        e1000_regdump
        e1000e_dump_ps_pages
        e1000_desc_unused
        e1000e_systim_to_hwtstamp
        e1000e_rx_hwtstamp
        e1000e_update_rdt_wa
        e1000e_update_tdt_wa
        e1000_put_txbuf
        e1000_consume_page
      
      Then ask 'perf probe' to produce the kprobe_tracer probe definitions for two of
      them:
      
        # perf probe -m e1000e -D e1000e_rx_hwtstamp
        p:probe/e1000e_rx_hwtstamp e1000e:e1000_receive_skb+74
      
        # perf probe -m e1000e -D e1000_consume_page
        p:probe/e1000_consume_page e1000e:e1000_clean_jumbo_rx_irq+876
        p:probe/e1000_consume_page_1 e1000e:e1000_clean_jumbo_rx_irq+1506
        p:probe/e1000_consume_page_2 e1000e:e1000_clean_rx_irq_ps+1074
      
      Now lets concentrate on the 'e1000_consume_page' one, that was inlined twice in
      e1000_clean_jumbo_rx_irq(), lets see what readelf says about the DWARF tags for
      that function:
      
        $ readelf -wi /lib/modules/4.12.0-rc4+/kernel/drivers/net/ethernet/intel/e1000e/e1000e.ko
        <SNIP>
        <1><13e27b>: Abbrev Number: 121 (DW_TAG_subprogram)
          <13e27c>   DW_AT_name        : (indirect string, offset: 0xa8945): e1000_clean_jumbo_rx_irq
          <13e287>   DW_AT_low_pc      : 0x17a30
        <3><13e6ef>: Abbrev Number: 119 (DW_TAG_inlined_subroutine)
          <13e6f0>   DW_AT_abstract_origin: <0x13ed2c>
          <13e6f4>   DW_AT_low_pc      : 0x17be6
        <SNIP>
        <1><13ed2c>: Abbrev Number: 142 (DW_TAG_subprogram)
           <13ed2e>   DW_AT_name        : (indirect string, offset: 0xa54c3): e1000_consume_page
      
      So, the first time in e1000_clean_jumbo_rx_irq() where e1000_consume_page() is
      inlined is at PC 0x17be6, which subtracted from e1000_clean_jumbo_rx_irq()'s
      address, gives us the offset we should use in the probe definition:
      
        0x17be6 - 0x17a30 = 438
      
      but above we have 876, which is twice as much.
      
      Lets see the second inline expansion of e1000_consume_page() in
      e1000_clean_jumbo_rx_irq():
      
        <3><13e86e>: Abbrev Number: 119 (DW_TAG_inlined_subroutine)
          <13e86f>   DW_AT_abstract_origin: <0x13ed2c>
          <13e873>   DW_AT_low_pc      : 0x17d21
      
        0x17d21 - 0x17a30 = 753
      
      So we where adding it at twice the offset from the containing function as we
      should.
      
      And then after this patch:
      
        # perf probe -m e1000e -D e1000e_rx_hwtstamp
        p:probe/e1000e_rx_hwtstamp e1000e:e1000_receive_skb+37
      
        # perf probe -m e1000e -D e1000_consume_page
        p:probe/e1000_consume_page e1000e:e1000_clean_jumbo_rx_irq+438
        p:probe/e1000_consume_page_1 e1000e:e1000_clean_jumbo_rx_irq+753
        p:probe/e1000_consume_page_2 e1000e:e1000_clean_jumbo_rx_irq+1353
        #
      
      Which matches the two first expansions and shows that because we were
      doubling the offset it would spill over the next function:
      
        readelf -sw /lib/modules/4.12.0-rc4+/kernel/drivers/net/ethernet/intel/e1000e/e1000e.ko
         673: 0000000000017a30  1626 FUNC    LOCAL  DEFAULT    2 e1000_clean_jumbo_rx_irq
         674: 0000000000018090  2013 FUNC    LOCAL  DEFAULT    2 e1000_clean_rx_irq_ps
      
      This is the 3rd inline expansion of e1000_consume_page() in
      e1000_clean_jumbo_rx_irq():
      
         <3><13ec77>: Abbrev Number: 119 (DW_TAG_inlined_subroutine)
          <13ec78>   DW_AT_abstract_origin: <0x13ed2c>
          <13ec7c>   DW_AT_low_pc      : 0x17f79
      
        0x17f79 - 0x17a30 = 1353
      
       So:
      
         0x17a30 + 2 * 1353 = 0x184c2
      
        And:
      
         0x184c2 - 0x18090 = 1074
      
      Which explains the bogus third expansion for e1000_consume_page() to end up at:
      
         p:probe/e1000_consume_page_2 e1000e:e1000_clean_rx_irq_ps+1074
      
      All fixed now :-)
      
      [1] https://git.kernel.org/pub/scm/devel/pahole/pahole.git/Signed-off-by: default avatarBjörn Töpel <bjorn.topel@intel.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: default avatarMagnus Karlsson <magnus.karlsson@intel.com>
      Acked-by: default avatarMasami Hiramatsu <mhiramat@kernel.org>
      Fixes: 613f050d ("perf probe: Fix to probe on gcc generated functions in modules")
      Link: http://lkml.kernel.org/r/20170621164134.5701-1-bjorn.topel@gmail.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      75d73538