1. 30 Jan, 2017 1 commit
    • fscache: Fix dead object requeue · 22c6d17b
      David Howells authored
      Under some circumstances, an fscache object can become queued such that
      fscache_object_work_func() can be called once the object is in the
      OBJECT_DEAD state.  This results in the kernel oopsing when it tries to
      invoke the handler for the state (which is hard coded to 0x2).
      
      The way this comes about is something like the following:
      
       (1) The object dispatcher is processing a work state for an object.  This
           is done in workqueue context.
      
       (2) An out-of-band event comes in that isn't masked, causing the object to
           be queued, say EV_KILL.
      
       (3) The object dispatcher finishes processing the current work state on
           that object and then sees there's another event to process, so,
           without returning to the workqueue core, it processes that event too.
           It then follows the chain of events that initiates until we reach
           OBJECT_DEAD without going through a wait state (such as
           WAIT_FOR_CLEARANCE).
      
           At this point, object->events may be 0, object->event_mask will be 0
           and oob_event_mask will be 0.
      
       (4) The object dispatcher returns to the workqueue processor, and in due
           course, this sees that the object's work item is still queued and
           invokes it again.
      
       (5) The current state is a work state (OBJECT_DEAD), so the dispatcher
           jumps to it - resulting in an OOPS.
      
      When I'm seeing this, the work state in (1) appears to have been either
      LOOK_UP_OBJECT or CREATE_OBJECT (object->oob_table is
      fscache_osm_lookup_oob).
      
      The window for (2) is very small:
      
       (A) object->event_mask is cleared whilst the event dispatch process is
           underway - though there's no memory barrier to force this to the top
           of the function.
      
           The window, therefore, runs from the time the object was selected by
           the workqueue processor and made requeueable to the time the mask was
           cleared.
      
       (B) fscache_raise_event() will only queue the object if it manages to set
           the event bit and the corresponding event_mask bit was set.
      
           The enqueuement is then deferred slightly whilst we get a ref on the
           object and get the per-CPU variable for workqueue congestion.  This
           slight deferral slightly increases the probability by allowing extra
           time for the workqueue to make the item requeueable.
      
      Handle this by giving the dead state a processor function and checking
      for the dead state address rather than seeing if the processor function is
      address 0x2.  The dead state processor function can then set a flag to
      indicate that it's occurred and give a warning if it occurs more than once
      per object.
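
The approach described above can be illustrated with a minimal user-space sketch. It is not the kernel patch: every name here (struct state, struct object, dispatch(), dead_work(), kill_work()) is hypothetical, and the real fscache state machine has many more states, events and locking. The sketch only shows two ideas: the dead state gets a real work function instead of a sentinel address, and the dispatcher recognises the dead state by comparing state addresses.

```c
#include <stdbool.h>
#include <stddef.h>

struct object;

/* Hypothetical state descriptor: work != NULL marks a "work state". */
struct state {
    const char *name;
    const struct state *(*work)(struct object *obj);
};

/* Hypothetical object: only the fields needed for the sketch. */
struct object {
    const struct state *state;
    bool dead_seen;         /* set the first time the dead state runs */
    unsigned dead_repeats;  /* extra runs of the dead state (would warn) */
};

static const struct state *kill_work(struct object *obj);
static const struct state *dead_work(struct object *obj);

static const struct state state_dead = { "OBJECT_DEAD", dead_work };
static const struct state state_kill = { "KILL_OBJECT", kill_work };

/* Stand-in for the chain of teardown states that ends in the dead state. */
static const struct state *kill_work(struct object *obj)
{
    (void)obj;
    return &state_dead;
}

/* The dead state now has a callable handler instead of a bogus address. */
static const struct state *dead_work(struct object *obj)
{
    if (obj->dead_seen)
        obj->dead_repeats++;    /* a real implementation would warn here */
    obj->dead_seen = true;
    return NULL;                /* no further transition */
}

/* One invocation of the work item: run work states until we must stop. */
void dispatch(struct object *obj)
{
    for (;;) {
        const struct state *cur = obj->state;
        const struct state *next;

        if (!cur->work)
            return;             /* wait state: nothing to run */
        next = cur->work(obj);
        if (cur == &state_dead)
            return;             /* terminal state, detected by address */
        obj->state = next;
    }
}
```

In this sketch, calling dispatch() a second time on an object that has already reached state_dead models the stale requeue from step (4): the handler runs safely and only bumps dead_repeats, rather than jumping to an invalid address.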
      
      
      If this race occurs, an oops similar to the following is seen (note the RIP
      value):
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000002
      IP: [<0000000000000002>] 0x1
      PGD 0
      Oops: 0010 [#1] SMP
      Modules linked in: ...
      CPU: 17 PID: 16077 Comm: kworker/u48:9 Not tainted 3.10.0-327.18.2.el7.x86_64 #1
      Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 12/27/2015
      Workqueue: fscache_object fscache_object_work_func [fscache]
      task: ffff880302b63980 ti: ffff880717544000 task.ti: ffff880717544000
      RIP: 0010:[<0000000000000002>]  [<0000000000000002>] 0x1
      RSP: 0018:ffff880717547df8  EFLAGS: 00010202
      RAX: ffffffffa0368640 RBX: ffff880edf7a4480 RCX: dead000000200200
      RDX: 0000000000000002 RSI: 00000000ffffffff RDI: ffff880edf7a4480
      RBP: ffff880717547e18 R08: 0000000000000000 R09: dfc40a25cb3a4510
      R10: dfc40a25cb3a4510 R11: 0000000000000400 R12: 0000000000000000
      R13: ffff880edf7a4510 R14: ffff8817f6153400 R15: 0000000000000600
      FS:  0000000000000000(0000) GS:ffff88181f420000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000002 CR3: 000000000194a000 CR4: 00000000001407e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Stack:
       ffffffffa0363695 ffff880edf7a4510 ffff88093f16f900 ffff8817faa4ec00
       ffff880717547e60 ffffffff8109d5db 00000000faa4ec18 0000000000000000
       ffff8817faa4ec18 ffff88093f16f930 ffff880302b63980 ffff88093f16f900
      Call Trace:
       [<ffffffffa0363695>] ? fscache_object_work_func+0xa5/0x200 [fscache]
       [<ffffffff8109d5db>] process_one_work+0x17b/0x470
       [<ffffffff8109e4ac>] worker_thread+0x21c/0x400
       [<ffffffff8109e290>] ? rescuer_thread+0x400/0x400
       [<ffffffff810a5acf>] kthread+0xcf/0xe0
       [<ffffffff810a5a00>] ? kthread_create_on_node+0x140/0x140
       [<ffffffff816460d8>] ret_from_fork+0x58/0x90
       [<ffffffff810a5a00>] ? kthread_create_on_node+0x140/0x140
      Signed-off-by: David Howells <dhowells@redhat.com>
      Tested-by: Frank Sorenson <sorenson@redhat.com>
  2. 27 Jan, 2017 12 commits
    • fscache: Clear outstanding writes when disabling a cookie · ebe90a0a
      David Howells authored
      fscache_disable_cookie() needs to clear the outstanding writes on the
      cookie it's disabling because they cannot be completed afterwards.
      
      Without this, nfs_fscache_open_file() gets stuck because it disables the
      cookie when the file is opened for writing but can't uncache the pages till
      afterwards - otherwise there's a race between the open routine and anyone
      who already has it open R/O and is still reading from it.
      
      Looking in /proc/pid/stack of the offending process shows:
      
      [<ffffffffa0142883>] __fscache_wait_on_page_write+0x82/0x9b [fscache]
      [<ffffffffa014336e>] __fscache_uncache_all_inode_pages+0x91/0xe1 [fscache]
      [<ffffffffa01740fa>] nfs_fscache_open_file+0x59/0x9e [nfs]
      [<ffffffffa01ccf41>] nfs4_file_open+0x17f/0x1b8 [nfsv4]
      [<ffffffff8117350e>] do_dentry_open+0x16d/0x2b7
      [<ffffffff811743ac>] vfs_open+0x5c/0x65
      [<ffffffff81184185>] path_openat+0x785/0x8fb
      [<ffffffff81184343>] do_filp_open+0x48/0x9e
      [<ffffffff81174710>] do_sys_open+0x13b/0x1cb
      [<ffffffff811747b9>] SyS_open+0x19/0x1b
      [<ffffffff81001c44>] do_syscall_64+0x80/0x17a
      [<ffffffff8165c2da>] return_from_SYSCALL_64+0x0/0x7a
      [<ffffffffffffffff>] 0xffffffffffffffff
      Reported-by: Jianhong Yin <jiyin@redhat.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Jeff Layton <jlayton@redhat.com>
      Acked-by: Steve Dickson <steved@redhat.com>
    • FS-Cache: Initialise stores_lock in netfs cookie · fa5eeba8
      David Howells authored
      Initialise the stores_lock in fscache netfs cookies.  Technically, it
      shouldn't be necessary, since the netfs cookie is an index and stores no
      data, but initialising it anyway adds insignificant overhead.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@redhat.com>
      Acked-by: Steve Dickson <steved@redhat.com>
    • Merge tag 'drm-fixes-for-v4.10-rc6-part-two' of git://people.freedesktop.org/~airlied/linux · fd694aaa
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "This is the main request for rc6, since really the one earlier was the
        rc5 one :-)
      
        The main things are the nouveau-specific race fixes for the connector
        locking bug we fixed in -next and reverted here as it has quite large
        prereqs. These two fixes should solve the problem at that level and we
        can fix it properly in 4.11.
      
        Otherwise i915 has a bunch of changes, one ABI change for GVT related
        stuff, some VC4 leak fixes, one core fence fix and some AMD changes,
        oh and one ast hang avoidance fix.
      
        Hoping it calms down around now"
      
      * tag 'drm-fixes-for-v4.10-rc6-part-two' of git://people.freedesktop.org/~airlied/linux: (25 commits)
        drm/nouveau: Handle fbcon suspend/resume in seperate worker
        drm/nouveau: Don't enabling polling twice on runtime resume
        drm/ast: Fixed system hanged if disable P2A
        Revert "drm/radeon: always apply pci shutdown callbacks"
        drm/i915: reinstate call to trace_i915_vma_bind
        drm/i915: Move atomic state free from out of fence release
        drm/i915: Check for NULL atomic state in intel_crtc_disable_noatomic()
        drm/i915: Fix calculation of rotated x and y offsets for planar formats
        drm/i915: Don't init hpd polling for vlv and chv from runtime_suspend()
        drm/i915: Don't leak edid in intel_crt_detect_ddc()
        drm/i915: Release temporary load-detect state upon switching
        drm/i915: prevent crash with .disable_display parameter
        drm/i915: Avoid drm_atomic_state_put(NULL) in intel_display_resume
        MAINTAINERS: update new mail list for intel gvt driver
        drm/i915/gvt: Fix kmem_cache_create() name
        drm/i915/gvt/kvmgt: mdev ABI is available_instances, not available_instance
        drm/amdgpu: fix unload driver issue for virtual display
        drm/amdgpu: check ring being ready before using
        drm/vc4: Return -EINVAL on the overflow checks failing.
        drm/vc4: Fix an integer overflow in temporary allocation layout.
        ...
    • Merge tag 'drm-intel-fixes-2017-01-26' of... · 736a1494
      Dave Airlie authored
      Merge tag 'drm-intel-fixes-2017-01-26' of git://anongit.freedesktop.org/git/drm-intel into drm-fixes
      
      More fixes than I'd like at this stage, but I think the holidays and
      conferences have delayed finding and fixing the stuff a bit. Almost all
      of them have Fixes: tags, so it's not just random fixes, we can point
      fingers at the commits that broke stuff.
      
      There's an ABI fix to GVT from Alex, before we go on and release a kernel
      with the wrong attribute name.
      
      * tag 'drm-intel-fixes-2017-01-26' of git://anongit.freedesktop.org/git/drm-intel:
        drm/i915: reinstate call to trace_i915_vma_bind
        drm/i915: Move atomic state free from out of fence release
        drm/i915: Check for NULL atomic state in intel_crtc_disable_noatomic()
        drm/i915: Fix calculation of rotated x and y offsets for planar formats
        drm/i915: Don't init hpd polling for vlv and chv from runtime_suspend()
        drm/i915: Don't leak edid in intel_crt_detect_ddc()
        drm/i915: Release temporary load-detect state upon switching
        drm/i915: prevent crash with .disable_display parameter
        drm/i915: Avoid drm_atomic_state_put(NULL) in intel_display_resume
        MAINTAINERS: update new mail list for intel gvt driver
        drm/i915/gvt: Fix kmem_cache_create() name
        drm/i915/gvt/kvmgt: mdev ABI is available_instances, not available_instance
        drm/i915/gvt: Fix relocation of shadow bb
        drm/i915/gvt: Enable the shadow batch buffer
    • Merge tag 'acpi-4.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 2287a240
      Linus Torvalds authored
      Pull ACPI fixes from Rafael Wysocki:
       "These fix two regressions introduced recently, one by reverting the
        problematic commit and one by fixing up locking in the ACPICA core.
      
        Specifics:
      
         - Revert a recent change that added an ACPI video blacklist entry for
           HP Pavilion dv6 as it turned out to introduce backlight handling
           regressions on some systems (Hans de Goede).
      
         - Fix locking in the ACPICA core to avoid deadlocks related to table
           loading that were exposed by a recent change in that area (Lv
           Zheng)"
      
      * tag 'acpi-4.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        Revert "ACPI / video: Add force_native quirk for HP Pavilion dv6"
        ACPICA: Tables: Fix hidden logic related to acpi_tb_install_standard_table()
    • Merge tag 'pm-4.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 7d3a0fa5
      Linus Torvalds authored
      Pull power management fixes from Rafael Wysocki:
       "These fix two regressions introduced recently, one by reverting the
        problematic commit and one by fixing up the behavior in an overlooked
        case.
      
        Specifics:
      
         - Revert the recent change that caused suspend-to-idle to be used as
           the default suspend method on systems where it is indicated to be
           efficient by the ACPI tables, as that turned out to be premature
           and introduced suspend regressions on some systems with missing
           power management support in device drivers (Rafael Wysocki).
      
         - Fix up the intel_pstate driver to correctly take changes of the
           global limits via sysfs when the performance policy is used, which
           had been broken by a recent change to it (Srinivas Pandruvada)"
      
      * tag 'pm-4.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        cpufreq: intel_pstate: Fix sysfs limits enforcement for performance policy
        Revert "PM / sleep / ACPI: Use the ACPI_FADT_LOW_POWER_S0 flag"
    • drm/nouveau: Handle fbcon suspend/resume in seperate worker · 15266ae3
      Lyude Paul authored
      Resuming from RPM can happen while already holding
      dev->mode_config.mutex. This means we can't actually handle fbcon in
      any RPM resume workers, since restoring fbcon requires grabbing
      dev->mode_config.mutex again. So move the fbcon suspend/resume code into
      its own worker, and rely on that instead to avoid deadlocking.
      
      This fixes more deadlocks for runtime suspending the GPU on the ThinkPad
      W541. Reproduction recipe:
      
       - Get a machine with both optimus and a nvidia card with connectors
         attached to it
       - Wait for the nvidia GPU to suspend
       - Attempt to manually reprobe any of the connectors on the nvidia GPU
         using sysfs
       - *deadlock*
      
      [airlied: use READ_ONCE to address Hans's comment]
      Signed-off-by: Lyude <lyude@redhat.com>
      Cc: Hans de Goede <hdegoede@redhat.com>
      Cc: Kilian Singer <kilian.singer@quantumtechnology.info>
      Cc: Lukas Wunner <lukas@wunner.de>
      Cc: David Airlie <airlied@redhat.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
    • drm/nouveau: Don't enabling polling twice on runtime resume · cae9ff03
      Lyude Paul authored
      As it turns out, on cards that actually have CRTCs on them we're already
      calling drm_kms_helper_poll_enable(drm_dev) from
      nouveau_display_resume() before we call it in
      nouveau_pmops_runtime_resume(). This leads us to accidentally trying to
      enable polling twice, which results in a potential deadlock between the
      RPM locks and drm_dev->mode_config.mutex if we end up trying to enable
      polling the second time while output_poll_execute is running and holding
      the mode_config lock. As such, make sure we only enable polling in
      nouveau_pmops_runtime_resume() if we need to.
      
      This fixes hangs observed on the ThinkPad W541
      Signed-off-by: Lyude <lyude@redhat.com>
      Cc: Hans de Goede <hdegoede@redhat.com>
      Cc: Kilian Singer <kilian.singer@quantumtechnology.info>
      Cc: Lukas Wunner <lukas@wunner.de>
      Cc: David Airlie <airlied@redhat.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
    • drm/ast: Fixed system hanged if disable P2A · 6c971c09
      Y.C. Chen authored
      The original ast driver accesses some BMC configuration through the P2A
      bridge, which can be disabled on the AST2300 and later.
      The system hangs if the P2A bridge is disabled.
      This update fixes it.
      Signed-off-by: Y.C. Chen <yc_chen@aspeedtech.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
    • Merge tag 'drm-vc4-fixes-2017-01-23' of https://github.com/anholt/linux into drm-fixes · e996598b
      Dave Airlie authored
      This pull request brings in a few little error checking fixes and one
      slow memory leak fix.
      
      * tag 'drm-vc4-fixes-2017-01-23' of https://github.com/anholt/linux:
        drm/vc4: Return -EINVAL on the overflow checks failing.
        drm/vc4: Fix an integer overflow in temporary allocation layout.
        drm/vc4: fix a bounds check
        drm/vc4: Fix memory leak of the CRTC state.
    • Merge branch 'drm-fixes-4.10' of git://people.freedesktop.org/~agd5f/linux into drm-fixes · 1fb2d354
      Dave Airlie authored
      Just a few small fixes.
      
      * 'drm-fixes-4.10' of git://people.freedesktop.org/~agd5f/linux:
        Revert "drm/radeon: always apply pci shutdown callbacks"
        drm/amdgpu: fix unload driver issue for virtual display
        drm/amdgpu: check ring being ready before using
    • Merge tag 'drm-misc-fixes-2017-01-23' of git://anongit.freedesktop.org/git/drm-misc into drm-fixes · 99f300cf
      Dave Airlie authored
      Single fence fix.
      * tag 'drm-misc-fixes-2017-01-23' of git://anongit.freedesktop.org/git/drm-misc:
        drm/fence: fix memory overwrite when setting out_fence fd
  3. 26 Jan, 2017 5 commits
    • Merge branches 'acpica' and 'acpi-video' · 0389227d
      Rafael J. Wysocki authored
      * acpica:
        ACPICA: Tables: Fix hidden logic related to acpi_tb_install_standard_table()
      
      * acpi-video:
        Revert "ACPI / video: Add force_native quirk for HP Pavilion dv6"
    • Merge branches 'pm-sleep' and 'pm-cpufreq' · ff7e593c
      Rafael J. Wysocki authored
      * pm-sleep:
        Revert "PM / sleep / ACPI: Use the ACPI_FADT_LOW_POWER_S0 flag"
      
      * pm-cpufreq:
        cpufreq: intel_pstate: Fix sysfs limits enforcement for performance policy
    • sysctl: fix proc_doulongvec_ms_jiffies_minmax() · ff9f8a7c
      Eric Dumazet authored
      We perform the conversion between kernel jiffies and ms only when
      exporting a kernel value to user space.
      
      We need to do the opposite operation when a value is written by the user.
      
      This only matters when HZ != 1000.
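
The two directions of the conversion can be sketched in user-space C. This is an illustration of the arithmetic only, not the kernel code: the helper names and the hz parameter (standing in for the kernel's HZ) are assumptions, and overflow of the intermediate product is ignored for brevity.

```c
/*
 * Hypothetical helpers; hz stands in for the kernel's HZ constant.
 * Intermediate overflow is deliberately not handled in this sketch.
 */

/* Read path: value stored in jiffies -> milliseconds shown to user space. */
unsigned long jiffies_to_ms(unsigned long jiffies, unsigned long hz)
{
    return jiffies * 1000UL / hz;
}

/* Write path: milliseconds supplied by the user -> jiffies to store. */
unsigned long ms_to_jiffies(unsigned long ms, unsigned long hz)
{
    return ms * hz / 1000UL;
}
```

With hz = 250, a write of 2000 ms should store 500 jiffies. Storing the 2000 unconverted would read back as 8000 ms, which is the kind of mismatch the commit message describes. With hz = 1000 both helpers are the identity, so the bug is invisible.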
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'pinctrl-v4.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · 928d336a
      Linus Torvalds authored
      Pull pin control fixes from Linus Walleij:
       "A bunch of pin control fixes for v4.10 that didn't get sent off until
        now, sorry for the delay.
      
        It's only driver fixes:
      
         - A bunch of fixes to the Intel drivers: broxton, baytrail. Bugs
           related to register offsets, IRQ, debounce functionality.
      
         - Fix a conflict amongst UART settings on the meson.
      
         - Fix the ethernet setting on the Uniphier.
      
         - A compilation warning squelched"
      
      * tag 'pinctrl-v4.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
        pinctrl: uniphier: fix Ethernet (RMII) pin-mux setting for LD20
        pinctrl: meson: fix uart_ao_b for GXBB and GXL/GXM
        pinctrl: amd: avoid maybe-uninitalized warning
        pinctrl: baytrail: Do not add all GPIOs to IRQ domain
        pinctrl: baytrail: Rectify debounce support
        pinctrl: intel: Set pin direction properly
        pinctrl: broxton: Use correct PADCFGLOCK offset
    • Merge tag 'drm-fixes-for-v4.10-rc6-revert-one' of git://people.freedesktop.org/~airlied/linux · bed7b016
      Linus Torvalds authored
      Pull drm revert from Dave Airlie:
       "Revert one patch missing some prereqs.
      
        One of the connector fixes was missing some prereqs, we have an
        alternate driver fix that should work that I'll send tomorrow.
      
        Today is a holiday here so quickly smashing this out"
      
      Daniel Vetter explains:
       "I pushed a locking change to fix a nouveau rpm issue to -fixes that
        needed the connector_list rework. And that's only in -next, but I
        missed that. Dave has the revert in a pull, and he'll follow up with
        the hack nouveau patch for 4.10, and then we'll reapply the proper fix
        again for -next and revert the hacks. A bit of a mess, but should be
        sorted soon"
      
      * tag 'drm-fixes-for-v4.10-rc6-revert-one' of git://people.freedesktop.org/~airlied/linux:
        Revert "drm/probe-helpers: Drop locking from poll_enable"
  4. 25 Jan, 2017 22 commits