1. 27 Jul, 2018 20 commits
  2. 26 Jul, 2018 9 commits
    • PCI/AER: Work around use-after-free in pcie_do_fatal_recovery() · bd91b56c
      Thomas Tai authored
      When a fatal error is received by a non-bridge device, the device is
      removed, and pci_stop_and_remove_bus_device() deallocates the device
      structure.  The freed device structure is used by subsequent code to send
      uevents and print messages.
      
      Hold a reference on the device until we're finished using it.  This is not
      an ideal fix because pcie_do_fatal_recovery() should not use the device at
      all after removing it, but that's too big a project for right now.
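
      The pattern described above, holding a reference across the removal so
      that the structure outlives its last use, can be sketched in user-space
      C. This is only an illustration under stated assumptions, not the kernel
      patch: struct dev, dev_alloc(), dev_get(), dev_put(), dev_remove() and
      fatal_recovery() are hypothetical stand-ins for the kernel's device
      reference counting.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted device; stands in for a kernel device structure. */
struct dev {
    int refcount;
    int id;
};

/* Counts frees so a caller can observe when the structure really goes away. */
static int dev_free_count;

static struct dev *dev_alloc(int id)
{
    struct dev *d = malloc(sizeof(*d));

    if (!d)
        return NULL;
    d->refcount = 1;    /* reference owned by the "bus" */
    d->id = id;
    return d;
}

static void dev_get(struct dev *d)
{
    d->refcount++;
}

static void dev_put(struct dev *d)
{
    assert(d->refcount > 0);
    if (--d->refcount == 0) {
        dev_free_count++;
        free(d);
    }
}

/* Removal drops the bus's reference; with no other holder this frees d. */
static void dev_remove(struct dev *d)
{
    dev_put(d);
}

/* Reads d->id after removal, which is only safe because of the extra
 * reference taken before dev_remove(). */
static int fatal_recovery(struct dev *d)
{
    int id;

    dev_get(d);         /* keep d alive across the removal */
    dev_remove(d);
    id = d->id;         /* still valid: we hold the last reference */
    dev_put(d);         /* last use is done; d is freed here */
    return id;
}
```

      Without the dev_get()/dev_put() pair, the read of d->id would touch
      freed memory, which is the use-after-free the commit works around.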
      
      Fixes: 7e9084b3 ("PCI/AER: Handle ERR_FATAL with removal and re-enumeration of devices")
      Signed-off-by: Thomas Tai <thomas.tai@oracle.com>
      [bhelgaas: changelog, reduce get/put coverage]
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    • Merge tag 'usb-4.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · cd3f77d7
      Linus Torvalds authored
      Pull USB fixes from Greg KH:
       "Here are a number of USB fixes and new device ids for 4.18-rc7.
      
        The largest number are a bunch of gadget driver fixes that got delayed
        in being submitted earlier due to vacation schedules, but nothing
        really huge is present in them. There are some new device ids and some
        PHY driver fixes that were connected to some USB ones. Full details
        are in the shortlog.
      
        All have been in linux-next for a while with no reported issues"
      
      * tag 'usb-4.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (28 commits)
        usb: core: handle hub C_PORT_OVER_CURRENT condition
        usb: xhci: Fix memory leak in xhci_endpoint_reset()
        usb: typec: tcpm: Fix sink PDO starting index for PPS APDO selection
        usb: gadget: f_fs: Only return delayed status when len is 0
        usb: gadget: f_uac2: fix endianness of 'struct cntrl_*_lay3'
        usb: dwc2: Fix inefficient copy of unaligned buffers
        usb: dwc2: Fix DMA alignment to start at allocated boundary
        usb: dwc3: rockchip: Fix PHY documentation links.
        tools: usb: ffs-test: Fix build on big endian systems
        usb: gadget: aspeed: Workaround memory ordering issue
        usb: dwc3: gadget: remove redundant variable maxpacket
        usb: dwc2: avoid NULL dereferences
        usb/phy: fix PPC64 build errors in phy-fsl-usb.c
        usb: dwc2: host: do not delay retries for CONTROL IN transfers
        usb: gadget: u_audio: protect stream runtime fields with stream spinlock
        usb: gadget: u_audio: remove cached period bytes value
        usb: gadget: u_audio: remove caching of stream buffer parameters
        usb: gadget: u_audio: update hw_ptr in iso_complete after data copied
        usb: gadget: u_audio: fix pcm/card naming in g_audio_setup()
        usb: gadget: f_uac2: fix error handling in afunc_bind (again)
        ...
    • Merge tag 'staging-4.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · fd4f84fa
      Linus Torvalds authored
      Pull staging driver fixes from Greg KH:
       "Here are three small staging driver fixes for 4.18-rc7.
      
        One is a revert of an earlier patch that turned out to be incorrect,
        one is a fix for the speakup drivers, and the last a fix for the
        ks7010 driver to resolve a regression.
      
        All of these have been in linux-next for a while with no reported
        issues"
      
      * tag 'staging-4.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
        staging: speakup: fix wraparound in uaccess length check
        staging: ks7010: call 'hostif_mib_set_request_int' instead of 'hostif_mib_set_request_bool'
        Revert "staging:r8188eu: Use lib80211 to support TKIP"
    • Merge tag 'driver-core-4.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core · a5f9e5da
      Linus Torvalds authored
      Pull driver core fix from Greg KH:
       "This is a single driver core fix for 4.18-rc7. It partially reverts a
        previous commit to resolve some reported issues.
      
        It has been in linux-next for a while now with no reported issues"
      
      * tag 'driver-core-4.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
        driver core: Partially revert "driver core: correct device's shutdown order"
    • Merge tag 'acpi-4.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 9bd59183
      Linus Torvalds authored
      Pull ACPI fix from Rafael Wysocki:
       "Fix a recent ACPICA regression causing the AML parser to get confused
        and fail in some situations involving incorrect AML in an ACPI table
        (Erik Schmauss)"
      
      * tag 'acpi-4.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPICA: AML Parser: ignore dispatcher error status during table load
    • Merge tag 'pm-4.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 99015e94
      Linus Torvalds authored
      Pull power management fix from Rafael Wysocki:
       "Fix up the recently introduced cpufreq driver for Qualcomm Kryo
        processors by adding a terminating NULL entry to its table of device
        IDs (YueHaibing)"
      
      * tag 'pm-4.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        cpufreq: qcom-kryo: add NULL entry to the end of_device_id array
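
      Why a terminating entry matters can be sketched in plain C. The table
      and matcher below are hypothetical (struct dev_id, example_ids and
      match_id() are invented for illustration and are not the kernel's
      of_device_id API); they only show that a scan over such a table relies
      on the empty last entry to know where to stop.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical ID entry, loosely modelled on a compatible-string table. */
struct dev_id {
    const char *compatible;
};

/* The empty last entry is the sentinel that stops the scan. */
static const struct dev_id example_ids[] = {
    { "vendor,soc-a" },
    { "vendor,soc-b" },
    { NULL },
};

/* Walks the table until the sentinel. Without the sentinel, the loop
 * would run past the end of the array. */
static const struct dev_id *match_id(const struct dev_id *table,
                                     const char *name)
{
    for (; table->compatible; table++)
        if (strcmp(table->compatible, name) == 0)
            return table;
    return NULL;
}
```

      A lookup for a name that is not in the table ends at the sentinel and
      returns NULL instead of reading whatever follows the array in memory.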
    • kthread, tracing: Don't expose half-written comm when creating kthreads · 3e536e22
      Snild Dolkow authored
      There is a window for racing when printing directly to task->comm,
      allowing other threads to see a non-terminated string. The vsnprintf
      function fills the buffer, counts the truncated chars, then finally
      writes the \0 at the end.
      
      	creator                     other
      	vsnprintf:
      	  fill (not terminated)
      	  count the rest            trace_sched_waking(p):
      	  ...                         memcpy(comm, p->comm, TASK_COMM_LEN)
      	  write \0
      
      The consequences depend on how 'other' uses the string. In our case,
      it was copied into the tracing system's saved cmdlines, a buffer of
      adjacent TASK_COMM_LEN-byte buffers (note the 'n' where 0 should be):
      
      	crash-arm64> x/1024s savedcmd->saved_cmdlines | grep 'evenk'
      	0xffffffd5b3818640:     "irq/497-pwr_evenkworker/u16:12"
      
      ...and a strcpy out of there would cause stack corruption:
      
      	[224761.522292] Kernel panic - not syncing: stack-protector:
      	    Kernel stack is corrupted in: ffffff9bf9783c78
      
      	crash-arm64> kbt | grep 'comm\|trace_print_context'
      	#6  0xffffff9bf9783c78 in trace_print_context+0x18c(+396)
      	      comm (char [16]) =  "irq/497-pwr_even"
      
      	crash-arm64> rd 0xffffffd4d0e17d14 8
      	ffffffd4d0e17d14:  2f71726900000000 5f7277702d373934   ....irq/497-pwr_
      	ffffffd4d0e17d24:  726f776b6e657665 3a3631752f72656b   evenkworker/u16:
      	ffffffd4d0e17d34:  f9780248ff003231 cede60e0ffffff9b   12..H.x......`..
      	ffffffd4d0e17d44:  cede60c8ffffffd4 00000fffffffffd4   .....`..........
      
      The workaround in e09e2867 (use strlcpy in __trace_find_cmdline) was
      likely needed because of this same bug.
      
      Solved by vsnprintf:ing to a local buffer, then using set_task_comm().
      This way, there won't be a window where comm is not terminated.
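
      The approach can be sketched in user-space C as follows. This is a
      minimal sketch, not the kernel change: struct task, set_comm() and
      task_set_name() are hypothetical names, and set_comm() only stands in
      for set_task_comm() (the real helper also takes a lock, which is
      omitted here).

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define COMM_LEN 16     /* mirrors TASK_COMM_LEN in the text */

/* Hypothetical task with a fixed-size name that other threads may read. */
struct task {
    char comm[COMM_LEN];
};

/* Hypothetical stand-in for set_task_comm(): publishes an already
 * terminated name. At most COMM_LEN - 1 bytes are copied, so the last
 * byte of a zero-initialised comm is never overwritten with a
 * non-zero value. */
static void set_comm(struct task *t, const char *name)
{
    size_t n = strlen(name);

    if (n >= COMM_LEN)
        n = COMM_LEN - 1;
    memcpy(t->comm, name, n);
    t->comm[n] = '\0';
}

/* Format into a local buffer first, then publish the finished string. */
static void task_set_name(struct task *t, const char *fmt, ...)
{
    char name[COMM_LEN];
    va_list ap;

    va_start(ap, fmt);
    vsnprintf(name, sizeof(name), fmt, ap);  /* truncates and terminates */
    va_end(ap);

    set_comm(t, name);
}
```

      The formatting step, which is where the unterminated window existed,
      now happens entirely in a buffer that no other thread can see.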
      
      Link: http://lkml.kernel.org/r/20180726071539.188015-1-snild@sony.com
      
      Cc: stable@vger.kernel.org
      Fixes: bc0c38d1 ("ftrace: latency tracer infrastructure")
      Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Snild Dolkow <snild@sony.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • tracing: Quiet gcc warning about maybe unused link variable · 2519c1bb
      Steven Rostedt (VMware) authored
      Commit 57ea2a34 ("tracing/kprobes: Fix trace_probe flags on
      enable_trace_kprobe() failure") added an if statement whose use of the
      "link" variable depends on an earlier if statement having initialized
      it. gcc cannot see that dependency and gives the warning:
      
       "warning: 'link' may be used uninitialized in this function"
      
      It is really a false positive, but to quiet the warning, and also to make
      sure that it never actually is used uninitialized, initialize the "link"
      variable to NULL and add an if (!WARN_ON_ONCE(!link)) where the compiler
      thinks it could be used uninitialized.
      
      Cc: stable@vger.kernel.org
      Fixes: 57ea2a34 ("tracing/kprobes: Fix trace_probe flags on enable_trace_kprobe() failure")
      Reported-by: kbuild test robot <lkp@intel.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • tracing: Fix possible double free in event_enable_trigger_func() · 15cc7864
      Steven Rostedt (VMware) authored
      A double free could be triggered in event_trigger_callback(): the reg()
      function it called freed the trigger_data, and the caller then freed it
      again on its error return path. The solution there was to up the
      trigger_data ref count.
      
      Code inspection found that event_enable_trigger_func() has the same issue,
      although it is harder to hit because it requires failures that are
      themselves harder to trigger. It needs to be solved slightly differently
      because there is more to clean up when the reg() function fails.
      
      Link: http://lkml.kernel.org/r/20180725124008.7008e586@gandalf.local.home
      
      Cc: stable@vger.kernel.org
      Fixes: 7862ad18 ("tracing: Add 'enable_event' and 'disable_event' event trigger commands")
      Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
  3. 25 Jul, 2018 11 commits
    • drm/i915/glk: Add Quirk for GLK NUC HDMI port issues. · 0ca94881
      Clint Taylor authored
      On GLK NUC platforms the HDMI retiming buffer needs additional disabled
      time to correctly sync to a faster incoming signal.
      
      When measured on a scope, the high-speed lines of the HDMI clock turn off
      for ~400us during a normal resolution change. The HDMI retimer on the
      GLK NUC appears to require at least a full frame of quiet time before a
      new, faster clock can be correctly sync'd. Wait 100ms due to msleep
      inaccuracies while waiting for a completed frame. Add a quirk to the
      driver for GLK boards that use ITE66317 HDMI retimers.
      
      V2: Add more devices to the quirk list
      V3: Delay increased to 100ms, check to confirm crtc type is HDMI.
      V4: crtc type check extended to include _DDI and whitespace fixes
      V5: Fix whitespace, remove the macro for delay. Revert the crtc type
          check introduced in V4.
      
      Cc: Imre Deak <imre.deak@intel.com>
      Cc: <stable@vger.kernel.org> # v4.14+
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105887
      Signed-off-by: Clint Taylor <clinton.a.taylor@intel.com>
      Tested-by: Daniel Scheller <d.scheller.oss@gmail.com>
      Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
      Signed-off-by: Imre Deak <imre.deak@intel.com>
      Reviewed-by: Imre Deak <imre.deak@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180710200205.1478-1-radhakrishna.sripada@intel.com
      (cherry picked from commit 90c3e219)
      Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
    • Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · 6e77b267
      Linus Torvalds authored
      Pull clk fixes from Stephen Boyd:
       "One more round of updates for problems seen this -rc series. Driver
        fixes are:
      
         - Amlogic Meson audio divider fix and CPU clk critical marking
      
         - Qualcomm multimedia GDSC marked as 'always on' to keep display
           working
      
         - Aspeed fixes for critical clks, resets causing clks to stay
           disabled, and an incorrect HPLL frequency calculation
      
         - Marvell Armada 3700 cpu clks would undervolt when switching from
           low frequencies to high frequencies because the voltage didn't
           stabilize in time so now we switch to an intermediate frequency
      
        Plus we have a core framework thinko that messed up the debugfs flag
        printing logic to make it not very useful"
      
      * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        clk: aspeed: Support HPLL strapping on ast2400
        clk: mvebu: armada-37xx-periph: Fix switching CPU rate from 300Mhz to 1.2GHz
        clk: aspeed: Mark bclk (PCIe) and dclk (VGA) as critical
        clk/mmcc-msm8996: Make mmagic_bimc_gdsc ALWAYS_ON
        clk: aspeed: Treat a gate in reset as disabled
        clk: Really show symbolic clock flags in debugfs
        clk: qcom: gcc-msm8996: Disable halt check on UFS tx clock
        clk: meson: audio-divider is one based
        clk: meson-gxbb: set fclk_div2 as CLK_IS_CRITICAL
    • Merge tag 'fscache-fixes-20180725' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs · 5c61ef1b
      Linus Torvalds authored
      Pull fscache/cachefiles fixes from David Howells:
      
       - Allow cancelled operations to be queued so they can be cleaned up.
      
       - Fix a refcounting bug in the monitoring of reads on backend files
         whereby a race can occur between monitor objects being listed for
         work, the work processing being queued and the work processor running
         and destroying the monitor objects.
      
       - Fix a ref overput in object attachment, whereby a tentatively
         considered object is put in error handling without first being 'got'.
      
       - Fix a missing clear of the CACHEFILES_OBJECT_ACTIVE flag whereby an
         assertion occurs when we retry because it seems the object is now
         active.
      
       - Wait rather than BUG'ing on an object collision in the depths of
         cachefiles, as the active object should be in the process of being
         cleaned up. This also depends on the fix above.
      
      * tag 'fscache-fixes-20180725' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
        cachefiles: Wait rather than BUG'ing on "Unexpected object collision"
        cachefiles: Fix missing clear of the CACHEFILES_OBJECT_ACTIVE flag
        fscache: Fix reference overput in fscache_attach_object() error handling
        cachefiles: Fix refcounting bug in backing-file read monitoring
        fscache: Allow cancelled operations to be enqueued
    • tracing/kprobes: Fix trace_probe flags on enable_trace_kprobe() failure · 57ea2a34
      Artem Savkov authored
      If enable_trace_kprobe() fails to enable the probe in enable_k(ret)probe(),
      it returns an error but does not unset the tp flags it set previously.
      As a result the probe is considered enabled, which leads to failures such
      as being unable to remove the probe through the kprobe_events file, since
      probes_open() expects every probe to be disabled.
      
      Link: http://lkml.kernel.org/r/20180725102826.8300-1-asavkov@redhat.com
      Link: http://lkml.kernel.org/r/20180725142038.4765-1-asavkov@redhat.com
      
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: stable@vger.kernel.org
      Fixes: 41a7dd42 ("tracing/kprobes: Support ftrace_event_file base multibuffer")
      Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
      Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: Artem Savkov <asavkov@redhat.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • selftests/ftrace: Add snapshot and tracing_on test case · 82f4f3e6
      Masami Hiramatsu authored
      Add a test case that checks the relationship between snapshot and
      tracing_on. This ensures that taking a snapshot does not
      affect the current tracing on/off setting.
      
      Link: http://lkml.kernel.org/r/153149932412.11274.15289227592627901488.stgit@devbox
      
      Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
      Cc: Hiraku Toyooka <hiraku.toyooka@cybertrust.co.jp>
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: linux-kselftest@vger.kernel.org
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • ring_buffer: tracing: Inherit the tracing setting to next ring buffer · 73c8d894
      Masami Hiramatsu authored
      Maintain the tracing on/off setting of the ring_buffer when switching
      to the trace buffer snapshot.
      
      Taking a snapshot is done by swapping the backup ring buffer
      (max_tr_buffer). But since the tracing on/off setting is defined
      by the ring buffer, when swapping it, the tracing on/off setting
      can also be changed. This causes a strange result like below:
      
        /sys/kernel/debug/tracing # cat tracing_on
        1
        /sys/kernel/debug/tracing # echo 0 > tracing_on
        /sys/kernel/debug/tracing # cat tracing_on
        0
        /sys/kernel/debug/tracing # echo 1 > snapshot
        /sys/kernel/debug/tracing # cat tracing_on
        1
        /sys/kernel/debug/tracing # echo 1 > snapshot
        /sys/kernel/debug/tracing # cat tracing_on
        0
      
      We don't touch tracing_on, but snapshot changes the tracing_on
      setting each time. This is an anomaly, because the user doesn't know
      that each "ring_buffer" stores its own tracing-enable state and
      that the snapshot is done by swapping ring buffers.
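
      The effect, and the idea of carrying the setting over to the buffer
      that becomes live, can be modelled in a few lines of C. This is a toy
      model under stated assumptions: struct ring_buffer, struct tracer,
      snapshot_naive() and snapshot_inherit() are hypothetical and do not
      reflect the real ring buffer API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical ring buffer that carries its own record on/off switch. */
struct ring_buffer {
    bool record_on;
};

struct tracer {
    struct ring_buffer *live;    /* buffer currently being written */
    struct ring_buffer *backup;  /* snapshot buffer */
};

/* Naive snapshot: swapping the buffers also swaps the on/off state. */
static void snapshot_naive(struct tracer *tr)
{
    struct ring_buffer *tmp = tr->live;

    tr->live = tr->backup;
    tr->backup = tmp;
}

/* Snapshot that carries the live buffer's setting over to the buffer
 * that is about to become live. */
static void snapshot_inherit(struct tracer *tr)
{
    bool on = tr->live->record_on;

    snapshot_naive(tr);
    tr->live->record_on = on;
}
```

      With snapshot_naive(), a tracer whose live buffer is off and whose
      backup buffer is on flips to "on" after one snapshot, matching the
      shell session above. With snapshot_inherit() the setting stays off
      however many snapshots are taken.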
      
      Link: http://lkml.kernel.org/r/153149929558.11274.11730609978254724394.stgit@devbox
      
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
      Cc: Hiraku Toyooka <hiraku.toyooka@cybertrust.co.jp>
      Cc: stable@vger.kernel.org
      Fixes: debdd57f ("tracing: Make a snapshot feature available from userspace")
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      [ Updated commit log and comment in the code ]
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • tracing: Fix double free of event_trigger_data · 1863c387
      Steven Rostedt (VMware) authored
      Running the following:
      
       # cd /sys/kernel/debug/tracing
       # echo 500000 > buffer_size_kb
      [ Or some other number that takes up most of memory ]
       # echo snapshot > events/sched/sched_switch/trigger
      
      Triggers the following bug:
      
       ------------[ cut here ]------------
       kernel BUG at mm/slub.c:296!
       invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC PTI
       CPU: 6 PID: 6878 Comm: bash Not tainted 4.18.0-rc6-test+ #1066
       Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016
       RIP: 0010:kfree+0x16c/0x180
       Code: 05 41 0f b6 72 51 5b 5d 41 5c 4c 89 d7 e9 ac b3 f8 ff 48 89 d9 48 89 da 41 b8 01 00 00 00 5b 5d 41 5c 4c 89 d6 e9 f4 f3 ff ff <0f> 0b 0f 0b 48 8b 3d d9 d8 f9 00 e9 c1 fe ff ff 0f 1f 40 00 0f 1f
       RSP: 0018:ffffb654436d3d88 EFLAGS: 00010246
       RAX: ffff91a9d50f3d80 RBX: ffff91a9d50f3d80 RCX: ffff91a9d50f3d80
       RDX: 00000000000006a4 RSI: ffff91a9de5a60e0 RDI: ffff91a9d9803500
       RBP: ffffffff8d267c80 R08: 00000000000260e0 R09: ffffffff8c1a56be
       R10: fffff0d404543cc0 R11: 0000000000000389 R12: ffffffff8c1a56be
       R13: ffff91a9d9930e18 R14: ffff91a98c0c2890 R15: ffffffff8d267d00
       FS:  00007f363ea64700(0000) GS:ffff91a9de580000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 000055c1cacc8e10 CR3: 00000000d9b46003 CR4: 00000000001606e0
       Call Trace:
        event_trigger_callback+0xee/0x1d0
        event_trigger_write+0xfc/0x1a0
        __vfs_write+0x33/0x190
        ? handle_mm_fault+0x115/0x230
        ? _cond_resched+0x16/0x40
        vfs_write+0xb0/0x190
        ksys_write+0x52/0xc0
        do_syscall_64+0x5a/0x160
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
       RIP: 0033:0x7f363e16ab50
       Code: 73 01 c3 48 8b 0d 38 83 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 79 db 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 1e e3 01 00 48 89 04 24
       RSP: 002b:00007fff9a4c6378 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
       RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f363e16ab50
       RDX: 0000000000000009 RSI: 000055c1cacc8e10 RDI: 0000000000000001
       RBP: 000055c1cacc8e10 R08: 00007f363e435740 R09: 00007f363ea64700
       R10: 0000000000000073 R11: 0000000000000246 R12: 0000000000000009
       R13: 0000000000000001 R14: 00007f363e4345e0 R15: 00007f363e4303c0
       Modules linked in: ip6table_filter ip6_tables snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_seq snd_seq_device i915 snd_pcm snd_timer i2c_i801 snd soundcore i2c_algo_bit drm_kms_helper
      86_pkg_temp_thermal video kvm_intel kvm irqbypass wmi e1000e
       ---[ end trace d301afa879ddfa25 ]---
      
      The cause is that the register_snapshot_trigger() call failed to
      allocate the snapshot buffer, and then called unregister_trigger()
      which freed the data that was passed to it. On return, the function
      that called register_snapshot_trigger() sees that registration
      failed, frees the trigger_data again, and so causes a double free.
      
      By calling event_trigger_init() on the trigger_data (which only ups
      the reference counter for it), and then event_trigger_free() afterward,
      the trigger_data would not get freed by the registering trigger function
      as it would only up and lower the ref count for it. If the register
      trigger function fails, then the event_trigger_free() called after it
      will free the trigger data normally.
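
      The reference-count bracket described above can be sketched in
      user-space C. All names here (struct trigger_data, trigger_init(),
      trigger_free(), trigger_register(), trigger_setup(), registered) are
      hypothetical simplifications and not the kernel functions; the sketch
      only shows why the caller's extra reference prevents the second free.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted trigger data. */
struct trigger_data {
    int ref;
};

static int trigger_frees;                /* how often data was really freed */
static struct trigger_data *registered;  /* set on successful registration */

/* Only raises the reference count. */
static void trigger_init(struct trigger_data *d)
{
    d->ref++;
}

/* Drops one reference and frees the data when the last one goes away. */
static void trigger_free(struct trigger_data *d)
{
    assert(d->ref > 0);
    if (--d->ref == 0) {
        trigger_frees++;
        free(d);
    }
}

/* Hypothetical registration that can fail. On failure it unregisters,
 * which drops the reference the registration took. */
static int trigger_register(struct trigger_data *d, int fail)
{
    trigger_init(d);
    if (fail) {
        trigger_free(d);
        return -1;
    }
    registered = d;
    return 0;
}

static int trigger_setup(int fail)
{
    struct trigger_data *d = calloc(1, sizeof(*d));
    int ret;

    if (!d)
        return -1;

    trigger_init(d);                  /* caller's own reference */
    ret = trigger_register(d, fail);  /* may take and drop its own ref */
    trigger_free(d);                  /* drop the caller's reference */
    return ret;
}
```

      On failure the data is freed exactly once, by the caller's final
      trigger_free(). On success the registration keeps the one remaining
      reference.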
      
      Link: http://lkml.kernel.org/r/20180724191331.738eb819@gandalf.local.home
      
      Cc: stable@vger.kernel.org
      Fixes: 93e31ffb ("tracing: Add 'snapshot' event trigger command")
      Reported-by: Masami Hiramatsu <mhiramat@kernel.org>
      Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • cachefiles: Wait rather than BUG'ing on "Unexpected object collision" · c2412ac4
      Kiran Kumar Modukuri authored
      If we meet a conflicting object that is marked FSCACHE_OBJECT_IS_LIVE in
      the active object tree, we have been emitting a BUG after logging
      information about it and the new object.
      
      Instead, we should wait for the CACHEFILES_OBJECT_ACTIVE flag to be cleared
      on the old object (or return an error).  The ACTIVE flag should be cleared
      after it has been removed from the active object tree.  A timeout of 60s is
      used in the wait, so we shouldn't be able to get stuck there.
      
      Fixes: 9ae326a6 ("CacheFiles: A cache that backs onto a mounted filesystem")
      Signed-off-by: Kiran Kumar Modukuri <kiran.modukuri@gmail.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
    • cachefiles: Fix missing clear of the CACHEFILES_OBJECT_ACTIVE flag · 5ce83d4b
      Kiran Kumar Modukuri authored
      In cachefiles_mark_object_active(), the new object is marked active and
      then we try to add it to the active object tree.  If a conflicting object
      is already present, we want to wait for that to go away.  After the wait,
      we go round again and try to re-mark the object as being active - but it's
      already marked active from the first time we went through and a BUG is
      issued.
      
      Fix this by clearing the CACHEFILES_OBJECT_ACTIVE flag before we try again.
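
      The retry loop can be modelled as follows. This is a simplified sketch
      with hypothetical names (struct object, mark_object_active() and its
      conflicts parameter are invented here); the wait and the active object
      tree are reduced to a counter, and an assert() stands in for the BUG.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical object with an ACTIVE flag, as in the description above. */
struct object {
    bool active;
};

/*
 * Marks obj active and "inserts" it. `conflicts` is how many times a
 * conflicting old object is still found; each one forces a wait and a
 * retry. Returns the number of attempts that were needed.
 */
static int mark_object_active(struct object *obj, int conflicts)
{
    int attempts = 0;

    for (;;) {
        /* Stands in for the BUG that fires if the flag is already set. */
        assert(!obj->active);
        obj->active = true;
        attempts++;

        if (conflicts == 0)
            return attempts;    /* no collision: insertion succeeded */

        conflicts--;            /* the old object went away after the wait */
        obj->active = false;    /* the fix: clear the flag before retrying */
    }
}
```

      Removing the line that clears the flag makes the assert() fire on the
      second pass, which corresponds to the "Object already active" BUG.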
      
      Analysis from Kiran Kumar Modukuri:
      
      [Impact]
      Oops during heavy NFS + FSCache + Cachefiles
      
      CacheFiles: Error: Overlong wait for old active object to go away.
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000002
      
      CacheFiles: Error: Object already active
      
      kernel BUG at fs/cachefiles/namei.c:163!
      
      [Cause]
      In a heavily loaded system with big files being read and truncated, an
      fscache object for a cookie is being dropped and a new object is being
      looked up. The new object has to wait for the old object to go away
      before it can be moved to the active state.
      
      [Fix]
      Clear the flag 'CACHEFILES_OBJECT_ACTIVE' for the new object when
      retrying the object lookup.
      
      [Testcase]
      Have run ~100 hours of NFS stress tests and have not seen this bug recur.
      
      [Regression Potential]
       - Limited to fscache/cachefiles.
      
      Fixes: 9ae326a6 ("CacheFiles: A cache that backs onto a mounted filesystem")
      Signed-off-by: Kiran Kumar Modukuri <kiran.modukuri@gmail.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
    • fscache: Fix reference overput in fscache_attach_object() error handling · f29507ce
      Kiran Kumar Modukuri authored
      When a cookie is allocated that causes fscache_object structs to be
      allocated, those objects are initialised with the cookie pointer, but
      aren't blessed with a ref on that cookie unless the attachment is
      successfully completed in fscache_attach_object().
      
      If attachment fails because the parent object was dying or there was a
      collision, fscache_attach_object() returns without incrementing the cookie
      counter - but upon failure of this function, the object is released which
      then puts the cookie, whether or not a ref was taken on the cookie.
      
      Fix this by taking a ref on the cookie when it is assigned in
      fscache_object_init(), even when we're creating a root object.
      
      
      Analysis from Kiran Kumar:
      
      This bug has been seen in 4.4.0-124-generic #148-Ubuntu kernel
      
      BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1776277
      
      The fscache cookie ref count is updated incorrectly during fscache object
      allocation, resulting in the following Oops.
      
      kernel BUG at /build/linux-Y09MKI/linux-4.4.0/fs/fscache/internal.h:321!
      kernel BUG at /build/linux-Y09MKI/linux-4.4.0/fs/fscache/cookie.c:639!
      
      [Cause]
      Two threads are trying to operate on a cookie and two objects.
      
      (1) One thread tries to unmount the filesystem and in the process goes
          over a huge list of objects, marking them dead and deleting them.
          cookie->usage is also decremented in the following path:
      
            nfs_fscache_release_super_cookie
             -> __fscache_relinquish_cookie
              ->__fscache_cookie_put
              ->BUG_ON(atomic_read(&cookie->usage) <= 0);
      
      (2) A second thread tries to look up an object for reading data in the
          following path:
      
          fscache_alloc_object
          1) cachefiles_alloc_object
              -> fscache_object_init
                 -> assign cookie, but usage not bumped.
          2) fscache_attach_object -> fails in cant_attach_object because the
               cookie's backing object or the cookie->parent object is going away
          3) fscache_put_object
              -> cachefiles_put_object
                ->fscache_object_destroy
                  ->fscache_cookie_put
                     ->BUG_ON(atomic_read(&cookie->usage) <= 0);
      
      [NOTE from dhowells] It's unclear as to the circumstances in which (2) can
      take place, given that thread (1) is in nfs_kill_super(), however a
      conflicting NFS mount with slightly different parameters that creates a
      different superblock would do it.  A backtrace from Kiran seems to show
      that this is a possibility:
      
          kernel BUG at /build/linux-Y09MKI/linux-4.4.0/fs/fscache/cookie.c:639!
          ...
          RIP: __fscache_cookie_put+0x3a/0x40 [fscache]
          Call Trace:
           __fscache_relinquish_cookie+0x87/0x120 [fscache]
           nfs_fscache_release_super_cookie+0x2d/0xb0 [nfs]
           nfs_kill_super+0x29/0x40 [nfs]
           deactivate_locked_super+0x48/0x80
           deactivate_super+0x5c/0x60
           cleanup_mnt+0x3f/0x90
           __cleanup_mnt+0x12/0x20
           task_work_run+0x86/0xb0
           exit_to_usermode_loop+0xc2/0xd0
           syscall_return_slowpath+0x4e/0x60
           int_ret_from_sys_call+0x25/0x9f
      
      [Fix] Bump the cookie usage in fscache_object_init(), when the object is
      first assigned a cookie. Do this atomically, so that the cookie is
      attached and its refcount bumped only if the refcount is not zero.
      Remove the assignment in fscache_attach_object().
      
      [Testcase]
      I have run ~100 hours of NFS stress tests and not seen this bug recur.
      
      [Regression Potential]
       - Limited to fscache/cachefiles.
      
      Fixes: ccc4fc3d ("FS-Cache: Implement the cookie management part of the netfs API")
      Signed-off-by: Kiran Kumar Modukuri <kiran.modukuri@gmail.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
    • cachefiles: Fix refcounting bug in backing-file read monitoring · 934140ab
      Kiran Kumar Modukuri authored
      cachefiles_read_waiter() has the right to access a 'monitor' object by
      virtue of being called under the waitqueue lock for one of the pages in its
      purview.  However, it has no ref on that monitor object or on the
      associated operation.
      
      What it is allowed to do is to move the monitor object to the operation's
      to_do list, but once it drops the work_lock, it's actually no longer
      permitted to access that object.  However, it is trying to enqueue the
      retrieval operation for processing - but it can only do this via a pointer
      in the monitor object, something it shouldn't be doing.
      
      If it doesn't enqueue the operation, the operation may not get processed.
      If the order is flipped so that the enqueue is first, then it's possible
      for the work processor to look at the to_do list before the monitor is
      enqueued upon it.
      
      Fix this by getting a ref on the operation so that we can trust that it
      will still be there once we've added the monitor to the to_do list and
      dropped the work_lock.  The op can then be enqueued after the lock is
      dropped.
      
      The bug can manifest in one of a couple of ways.  The first manifestation
      looks like:
      
       FS-Cache:
       FS-Cache: Assertion failed
       FS-Cache: 6 == 5 is false
       ------------[ cut here ]------------
       kernel BUG at fs/fscache/operation.c:494!
       RIP: 0010:fscache_put_operation+0x1e3/0x1f0
       ...
       fscache_op_work_func+0x26/0x50
       process_one_work+0x131/0x290
       worker_thread+0x45/0x360
       kthread+0xf8/0x130
       ? create_worker+0x190/0x190
       ? kthread_cancel_work_sync+0x10/0x10
       ret_from_fork+0x1f/0x30
      
      This is due to the operation being in the DEAD state (6) rather than
      INITIALISED, COMPLETE or CANCELLED (5) because it's already passed through
      fscache_put_operation().
      
      The bug can also manifest like the following:
      
       kernel BUG at fs/fscache/operation.c:69!
       ...
          [exception RIP: fscache_enqueue_operation+246]
       ...
       #7 [ffff883fff083c10] fscache_enqueue_operation at ffffffffa0b793c6
       #8 [ffff883fff083c28] cachefiles_read_waiter at ffffffffa0b15a48
       #9 [ffff883fff083c48] __wake_up_common at ffffffff810af028
      
      I'm not entirely certain as to which is line 69 in Lei's kernel, so I'm not
      entirely clear which assertion failed.
      
      Fixes: 9ae326a6 ("CacheFiles: A cache that backs onto a mounted filesystem")
      Reported-by: Lei Xue <carmark.dlut@gmail.com>
      Reported-by: Vegard Nossum <vegard.nossum@gmail.com>
      Reported-by: Anthony DeRobertis <aderobertis@metrics.net>
      Reported-by: NeilBrown <neilb@suse.com>
      Reported-by: Daniel Axtens <dja@axtens.net>
      Reported-by: Kiran Kumar Modukuri <kiran.modukuri@gmail.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Daniel Axtens <dja@axtens.net>