1. 20 May, 2017 40 commits
    • Greg Kroah-Hartman's avatar
      Linux 4.9.29 · f5eea276
      Greg Kroah-Hartman authored
      f5eea276
    • Kees Cook's avatar
      pstore: Shut down worker when unregistering · 9ee8502b
      Kees Cook authored
      commit 6330d553 upstream.
      
      When built as a module and running with update_ms >= 0, pstore will Oops
      during module unload since the work timer is still running. This makes sure
      the worker is stopped before unloading.
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9ee8502b
    • Ankit Kumar's avatar
      pstore: Fix flags to enable dumps on powerpc · a4de9300
      Ankit Kumar authored
      commit 041939c1 upstream.
      
      After commit c950fd6f kernel registers pstore write based on flag set.
      Pstore write for powerpc is broken as flags(PSTORE_FLAGS_DMESG) is not set for
      powerpc architecture. On panic, kernel doesn't write message to
      /fs/pstore/dmesg*(Entry doesn't gets created at all).
      
      This patch enables pstore write for powerpc architecture by setting
      PSTORE_FLAGS_DMESG flag.
      
      Fixes: c950fd6f ("pstore: Split pstore fragile flags")
      Signed-off-by: default avatarAnkit Kumar <ankit@linux.vnet.ibm.com>
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a4de9300
    • Dan Williams's avatar
      libnvdimm, pfn: fix 'npfns' vs section alignment · 1a102950
      Dan Williams authored
      commit d5483fed upstream.
      
      Fix failures to create namespaces due to the vmem_altmap not advertising
      enough free space to store the memmap.
      
       WARNING: CPU: 15 PID: 8022 at arch/x86/mm/init_64.c:656 arch_add_memory+0xde/0xf0
       [..]
       Call Trace:
        dump_stack+0x63/0x83
        __warn+0xcb/0xf0
        warn_slowpath_null+0x1d/0x20
        arch_add_memory+0xde/0xf0
        devm_memremap_pages+0x244/0x440
        pmem_attach_disk+0x37e/0x490 [nd_pmem]
        nd_pmem_probe+0x7e/0xa0 [nd_pmem]
        nvdimm_bus_probe+0x71/0x120 [libnvdimm]
        driver_probe_device+0x2bb/0x460
        bind_store+0x114/0x160
        drv_attr_store+0x25/0x30
      
      In commit 658922e5 "libnvdimm, pfn: fix memmap reservation sizing"
      we arranged for the capacity to be allocated, but failed to also update
      the 'npfns' parameter. This leads to cases where there is enough
      capacity reserved to hold all the allocated sections, but
      vmemmap_populate_hugepages() still encounters -ENOMEM from
      altmap_alloc_block_buf().
      
      This fix is a stop-gap until we can teach the core memory hotplug
      implementation to permit sub-section hotplug.
      
      Fixes: 658922e5 ("libnvdimm, pfn: fix memmap reservation sizing")
      Reported-by: default avatarAnisha Allada <anisha.allada@intel.com>
      Signed-off-by: default avatarDan Williams <dan.j.williams@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1a102950
    • Toshi Kani's avatar
      libnvdimm, pmem: fix a NULL pointer BUG in nd_pmem_notify · c171b24f
      Toshi Kani authored
      commit b2518c78 upstream.
      
      The following BUG was observed when nd_pmem_notify() was called
      for a BTT device.  The use of a pmem_device pointer is not valid
      with BTT.
      
       BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
       IP: nd_pmem_notify+0x30/0xf0 [nd_pmem]
       Call Trace:
        nd_device_notify+0x40/0x50
        child_notify+0x10/0x20
        device_for_each_child+0x50/0x90
        nd_region_notify+0x20/0x30
        nd_device_notify+0x40/0x50
        nvdimm_region_notify+0x27/0x30
        acpi_nfit_scrub+0x341/0x590 [nfit]
        process_one_work+0x197/0x450
        worker_thread+0x4e/0x4a0
        kthread+0x109/0x140
      
      Fix nd_pmem_notify() by setting nd_region and badblocks pointers
      properly for BTT.
      
      Cc: Vishal Verma <vishal.l.verma@intel.com>
      Fixes: 71999466 ("libnvdimm: async notification support")
      Signed-off-by: default avatarToshi Kani <toshi.kani@hpe.com>
      Signed-off-by: default avatarDan Williams <dan.j.williams@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c171b24f
    • Dan Williams's avatar
      libnvdimm, region: fix flush hint detection crash · 5b6e7f35
      Dan Williams authored
      commit bc042fdf upstream.
      
      In the case where a dimm does not have any associated flush hints the
      ndrd->flush_wpq array may be uninitialized leading to crashes with the
      following signature:
      
       BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
       IP: region_visible+0x10f/0x160 [libnvdimm]
      
       Call Trace:
        internal_create_group+0xbe/0x2f0
        sysfs_create_groups+0x40/0x80
        device_add+0x2d8/0x650
        nd_async_device_register+0x12/0x40 [libnvdimm]
        async_run_entry_fn+0x39/0x170
        process_one_work+0x212/0x6c0
        ? process_one_work+0x197/0x6c0
        worker_thread+0x4e/0x4a0
        kthread+0x10c/0x140
        ? process_one_work+0x6c0/0x6c0
        ? kthread_create_on_node+0x60/0x60
        ret_from_fork+0x31/0x40
      Reviewed-by: default avatarJeff Moyer <jmoyer@redhat.com>
      Fixes: f284a4f2 ("libnvdimm: introduce nvdimm_flush() and nvdimm_has_flush()")
      Signed-off-by: default avatarDan Williams <dan.j.williams@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5b6e7f35
    • Joeseph Chang's avatar
      ipmi: Fix kernel panic at ipmi_ssif_thread() · 46ba11b0
      Joeseph Chang authored
      commit 6de65fcf upstream.
      
      msg_written_handler() may set ssif_info->multi_data to NULL
      when using ipmitool to write fru.
      
      Before setting ssif_info->multi_data to NULL, add new local
      pointer "data_to_send" and store correct i2c data pointer to
      it to fix NULL pointer kernel panic and incorrect ssif_info->multi_pos.
      Signed-off-by: default avatarJoeseph Chang <joechang@codeaurora.org>
      Signed-off-by: default avatarCorey Minyard <cminyard@mvista.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      46ba11b0
    • Johan Hovold's avatar
      Bluetooth: hci_intel: add missing tty-device sanity check · 6e7de39e
      Johan Hovold authored
      commit dcb9cfaa upstream.
      
      Make sure to check the tty-device pointer before looking up the sibling
      platform device to avoid dereferencing a NULL-pointer when the tty is
      one end of a Unix98 pty.
      
      Fixes: 74cdad37 ("Bluetooth: hci_intel: Add runtime PM support")
      Fixes: 1ab1f239 ("Bluetooth: hci_intel: Add support for platform driver")
      Cc: Loic Poulain <loic.poulain@intel.com>
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarMarcel Holtmann <marcel@holtmann.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6e7de39e
    • Johan Hovold's avatar
      Bluetooth: hci_bcm: add missing tty-device sanity check · f2f6d77f
      Johan Hovold authored
      commit 95065a61 upstream.
      
      Make sure to check the tty-device pointer before looking up the sibling
      platform device to avoid dereferencing a NULL-pointer when the tty is
      one end of a Unix98 pty.
      
      Fixes: 0395ffc1 ("Bluetooth: hci_bcm: Add PM for BCM devices")
      Cc: Frederic Danis <frederic.danis@linux.intel.com>
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarMarcel Holtmann <marcel@holtmann.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f2f6d77f
    • Szymon Janc's avatar
      Bluetooth: Fix user channel for 32bit userspace on 64bit kernel · 518ca844
      Szymon Janc authored
      commit ab89f0bd upstream.
      
      Running 32bit userspace on 64bit kernel results in MSG_CMSG_COMPAT being
      defined as 0x80000000. This results in sendmsg failure if used from 32bit
      userspace running on 64bit kernel. Fix this by accounting for MSG_CMSG_COMPAT
      in flags check in hci_sock_sendmsg.
      Signed-off-by: default avatarSzymon Janc <szymon.janc@codecoup.pl>
      Signed-off-by: default avatarMarko Kiiskila <marko@runtime.io>
      Signed-off-by: default avatarMarcel Holtmann <marcel@holtmann.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      518ca844
    • Wang YanQing's avatar
      tty: pty: Fix ldisc flush after userspace become aware of the data already · 89c91ea3
      Wang YanQing authored
      commit 77dae613 upstream.
      
      While using emacs, cat or others' commands in konsole with recent
      kernels, I have met many times that CTRL-C freeze konsole. After
      konsole freeze I can't type anything, then I have to open a new one,
      it is very annoying.
      
      See bug report:
      https://bugs.kde.org/show_bug.cgi?id=175283
      
      The platform in that bug report is Solaris, but now the pty in linux
      has the same problem or the same behavior as Solaris :)
      
      It has high possibility to trigger the problem follow steps below:
      Note: In my test, BigFile is a text file whose size is bigger than 1G
      1:open konsole
      1:cat BigFile
      2:CTRL-C
      
      After some digging, I find out the reason is that commit 1d1d14da
      ("pty: Fix buffer flush deadlock") changes the behavior of pty_flush_buffer.
      
      Thread A                                 Thread B
      --------                                 --------
      1:n_tty_poll return POLLIN
                                               2:CTRL-C trigger pty_flush_buffer
                                                   tty_buffer_flush
                                                     n_tty_flush_buffer
      3:attempt to check count of chars:
        ioctl(fd, TIOCINQ, &available)
        available is equal to 0
      
      4:read(fd, buffer, avaiable)
        return 0
      
      5:konsole close fd
      
      Yes, I know we could use the same patch included in the BUG report as
      a workaround for linux platform too. But I think the data in ldisc is
      belong to application of another side, we shouldn't clear it when we
      want to flush write buffer of this side in pty_flush_buffer. So I think
      it is better to disable ldisc flush in pty_flush_buffer, because its new
      hehavior bring no benefit except that it mess up the behavior between
      POLLIN, and TIOCINQ or FIONREAD.
      
      Also I find no flush_buffer function in others' tty driver has the
      same behavior as current pty_flush_buffer.
      
      Fixes: 1d1d14da ("pty: Fix buffer flush deadlock")
      Signed-off-by: default avatarWang YanQing <udknight@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      89c91ea3
    • Johan Hovold's avatar
      serial: omap: suspend device on probe errors · e38a4c3b
      Johan Hovold authored
      commit 77e6fe7f upstream.
      
      Make sure to actually suspend the device before returning after a failed
      (or deferred) probe.
      
      Note that autosuspend must be disabled before runtime pm is disabled in
      order to balance the usage count due to a negative autosuspend delay as
      well as to make the final put suspend the device synchronously.
      
      Fixes: 388bc262 ("omap-serial: Fix the error handling in the omap_serial probe")
      Cc: Shubhrajyoti D <shubhrajyoti@ti.com>
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Acked-by: default avatarTony Lindgren <tony@atomide.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e38a4c3b
    • Johan Hovold's avatar
      serial: omap: fix runtime-pm handling on unbind · f8d2751b
      Johan Hovold authored
      commit 099bd73d upstream.
      
      An unbalanced and misplaced synchronous put was used to suspend the
      device on driver unbind, something which with a likewise misplaced
      pm_runtime_disable leads to external aborts when an open port is being
      removed.
      
      Unhandled fault: external abort on non-linefetch (0x1028) at 0xfa024010
      ...
      [<c046e760>] (serial_omap_set_mctrl) from [<c046a064>] (uart_update_mctrl+0x50/0x60)
      [<c046a064>] (uart_update_mctrl) from [<c046a400>] (uart_shutdown+0xbc/0x138)
      [<c046a400>] (uart_shutdown) from [<c046bd2c>] (uart_hangup+0x94/0x190)
      [<c046bd2c>] (uart_hangup) from [<c045b760>] (__tty_hangup+0x404/0x41c)
      [<c045b760>] (__tty_hangup) from [<c045b794>] (tty_vhangup+0x1c/0x20)
      [<c045b794>] (tty_vhangup) from [<c046ccc8>] (uart_remove_one_port+0xec/0x260)
      [<c046ccc8>] (uart_remove_one_port) from [<c046ef4c>] (serial_omap_remove+0x40/0x60)
      [<c046ef4c>] (serial_omap_remove) from [<c04845e8>] (platform_drv_remove+0x34/0x4c)
      
      Fix this up by resuming the device before deregistering the port and by
      suspending and disabling runtime pm only after the port has been
      removed.
      
      Also make sure to disable autosuspend before disabling runtime pm so
      that the usage count is balanced and device actually suspended before
      returning.
      
      Note that due to a negative autosuspend delay being set in probe, the
      unbalanced put would actually suspend the device on first driver unbind,
      while rebinding and again unbinding would result in a negative
      power.usage_count.
      
      Fixes: 7e9c8e7d ("serial: omap: make sure to suspend device before remove")
      Cc: Felipe Balbi <balbi@kernel.org>
      Cc: Santosh Shilimkar <santosh.shilimkar@ti.com>
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Acked-by: default avatarTony Lindgren <tony@atomide.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f8d2751b
    • Marek Szyprowski's avatar
      serial: samsung: Use right device for DMA-mapping calls · c5689e0a
      Marek Szyprowski authored
      commit 768d64f4 upstream.
      
      Driver should provide its own struct device for all DMA-mapping calls instead
      of extracting device pointer from DMA engine channel. Although this is harmless
      from the driver operation perspective on ARM architecture, it is always good
      to use the DMA mapping API in a proper way. This patch fixes following DMA API
      debug warning:
      
      WARNING: CPU: 0 PID: 0 at lib/dma-debug.c:1241 check_sync+0x520/0x9f4
      samsung-uart 12c20000.serial: DMA-API: device driver tries to sync DMA memory it has not allocated [device address=0x000000006df0f580] [size=64 bytes]
      Modules linked in:
      CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.11.0-rc1-00137-g07ca963 #51
      Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
      [<c011aaa4>] (unwind_backtrace) from [<c01127c0>] (show_stack+0x20/0x24)
      [<c01127c0>] (show_stack) from [<c06ba5d8>] (dump_stack+0x84/0xa0)
      [<c06ba5d8>] (dump_stack) from [<c0139528>] (__warn+0x14c/0x180)
      [<c0139528>] (__warn) from [<c01395a4>] (warn_slowpath_fmt+0x48/0x50)
      [<c01395a4>] (warn_slowpath_fmt) from [<c0729058>] (check_sync+0x520/0x9f4)
      [<c0729058>] (check_sync) from [<c072967c>] (debug_dma_sync_single_for_device+0x88/0xc8)
      [<c072967c>] (debug_dma_sync_single_for_device) from [<c0803c10>] (s3c24xx_serial_start_tx_dma+0x100/0x2f8)
      [<c0803c10>] (s3c24xx_serial_start_tx_dma) from [<c0804338>] (s3c24xx_serial_tx_chars+0x198/0x33c)
      Reported-by: default avatarSeung-Woo Kim <sw0312.kim@samsung.com>
      Fixes: 62c37eed ("serial: samsung: add dma reqest/release functions")
      Signed-off-by: default avatarMarek Szyprowski <m.szyprowski@samsung.com>
      Reviewed-by: default avatarBartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Reviewed-by: default avatarKrzysztof Kozlowski <krzk@kernel.org>
      Reviewed-by: default avatarShuah Khan <shuahkh@osg.samsung.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c5689e0a
    • Eric Biggers's avatar
      fscrypt: fix context consistency check when key(s) unavailable · 64a599ac
      Eric Biggers authored
      commit 272f98f6 upstream.
      
      To mitigate some types of offline attacks, filesystem encryption is
      designed to enforce that all files in an encrypted directory tree use
      the same encryption policy (i.e. the same encryption context excluding
      the nonce).  However, the fscrypt_has_permitted_context() function which
      enforces this relies on comparing struct fscrypt_info's, which are only
      available when we have the encryption keys.  This can cause two
      incorrect behaviors:
      
      1. If we have the parent directory's key but not the child's key, or
         vice versa, then fscrypt_has_permitted_context() returned false,
         causing applications to see EPERM or ENOKEY.  This is incorrect if
         the encryption contexts are in fact consistent.  Although we'd
         normally have either both keys or neither key in that case since the
         master_key_descriptors would be the same, this is not guaranteed
         because keys can be added or removed from keyrings at any time.
      
      2. If we have neither the parent's key nor the child's key, then
         fscrypt_has_permitted_context() returned true, causing applications
         to see no error (or else an error for some other reason).  This is
         incorrect if the encryption contexts are in fact inconsistent, since
         in that case we should deny access.
      
      To fix this, retrieve and compare the fscrypt_contexts if we are unable
      to set up both fscrypt_infos.
      
      While this slightly hurts performance when accessing an encrypted
      directory tree without the key, this isn't a case we really need to be
      optimizing for; access *with* the key is much more important.
      Furthermore, the performance hit is barely noticeable given that we are
      already retrieving the fscrypt_context and doing two keyring searches in
      fscrypt_get_encryption_info().  If we ever actually wanted to optimize
      this case we might start by caching the fscrypt_contexts.
      Signed-off-by: default avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      64a599ac
    • Dan Williams's avatar
      device-dax: fix cdev leak · 8dd114ef
      Dan Williams authored
      commit ed01e50a upstream.
      
      If device_add() fails, cleanup the cdev. Otherwise, we leak a kobj_map()
      with a stale device number.
      
      As Jason points out, there is a small possibility that userspace has
      opened and mapped the device in the time between cdev_add() and the
      device_add() failure. We need a new kill_dax_dev() helper to invalidate
      any established mappings.
      
      Fixes: ba09c01d ("dax: convert to the cdev api")
      Reported-by: default avatarJason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Signed-off-by: default avatarDan Williams <dan.j.williams@intel.com>
      Signed-off-by: default avatarLogan Gunthorpe <logang@deltatee.com>
      Reviewed-by: default avatarJohannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8dd114ef
    • Jason A. Donenfeld's avatar
      padata: free correct variable · 6240377c
      Jason A. Donenfeld authored
      commit 07a77929 upstream.
      
      The author meant to free the variable that was just allocated, instead
      of the one that failed to be allocated, but made a simple typo. This
      patch rectifies that.
      Signed-off-by: default avatarJason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6240377c
    • Björn Jacke's avatar
      CIFS: add misssing SFM mapping for doublequote · 1c5d8b37
      Björn Jacke authored
      commit 85435d7a upstream.
      
      SFM is mapping doublequote to 0xF020
      
      Without this patch creating files with doublequote fails to Windows/Mac
      Signed-off-by: default avatarBjoern Jacke <bjacke@samba.org>
      Signed-off-by: default avatarSteve French <smfrench@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1c5d8b37
    • David Disseldorp's avatar
      cifs: fix CIFS_IOC_GET_MNT_INFO oops · 6f3b2eed
      David Disseldorp authored
      commit d8a6e505 upstream.
      
      An open directory may have a NULL private_data pointer prior to readdir.
      
      Fixes: 0de1f4c6 ("Add way to query server fs info for smb3")
      Signed-off-by: default avatarDavid Disseldorp <ddiss@suse.de>
      Signed-off-by: default avatarSteve French <smfrench@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6f3b2eed
    • Rabin Vincent's avatar
      CIFS: fix oplock break deadlocks · f13d96bf
      Rabin Vincent authored
      commit 3998e6b8 upstream.
      
      When the final cifsFileInfo_put() is called from cifsiod and an oplock
      break work is queued, lockdep complains loudly:
      
       =============================================
       [ INFO: possible recursive locking detected ]
       4.11.0+ #21 Not tainted
       ---------------------------------------------
       kworker/0:2/78 is trying to acquire lock:
        ("cifsiod"){++++.+}, at: flush_work+0x215/0x350
      
       but task is already holding lock:
        ("cifsiod"){++++.+}, at: process_one_work+0x255/0x8e0
      
       other info that might help us debug this:
        Possible unsafe locking scenario:
      
              CPU0
              ----
         lock("cifsiod");
         lock("cifsiod");
      
        *** DEADLOCK ***
      
        May be due to missing lock nesting notation
      
       2 locks held by kworker/0:2/78:
        #0:  ("cifsiod"){++++.+}, at: process_one_work+0x255/0x8e0
        #1:  ((&wdata->work)){+.+...}, at: process_one_work+0x255/0x8e0
      
       stack backtrace:
       CPU: 0 PID: 78 Comm: kworker/0:2 Not tainted 4.11.0+ #21
       Workqueue: cifsiod cifs_writev_complete
       Call Trace:
        dump_stack+0x85/0xc2
        __lock_acquire+0x17dd/0x2260
        ? match_held_lock+0x20/0x2b0
        ? trace_hardirqs_off_caller+0x86/0x130
        ? mark_lock+0xa6/0x920
        lock_acquire+0xcc/0x260
        ? lock_acquire+0xcc/0x260
        ? flush_work+0x215/0x350
        flush_work+0x236/0x350
        ? flush_work+0x215/0x350
        ? destroy_worker+0x170/0x170
        __cancel_work_timer+0x17d/0x210
        ? ___preempt_schedule+0x16/0x18
        cancel_work_sync+0x10/0x20
        cifsFileInfo_put+0x338/0x7f0
        cifs_writedata_release+0x2a/0x40
        ? cifs_writedata_release+0x2a/0x40
        cifs_writev_complete+0x29d/0x850
        ? preempt_count_sub+0x18/0xd0
        process_one_work+0x304/0x8e0
        worker_thread+0x9b/0x6a0
        kthread+0x1b2/0x200
        ? process_one_work+0x8e0/0x8e0
        ? kthread_create_on_node+0x40/0x40
        ret_from_fork+0x31/0x40
      
      This is a real warning.  Since the oplock is queued on the same
      workqueue this can deadlock if there is only one worker thread active
      for the workqueue (which will be the case during memory pressure when
      the rescuer thread is handling it).
      
      Furthermore, there is at least one other kind of hang possible due to
      the oplock break handling if there is only worker.  (This can be
      reproduced without introducing memory pressure by having passing 1 for
      the max_active parameter of cifsiod.) cifs_oplock_break() can wait
      indefintely in the filemap_fdatawait() while the cifs_writev_complete()
      work is blocked:
      
       sysrq: SysRq : Show Blocked State
         task                        PC stack   pid father
       kworker/0:1     D    0    16      2 0x00000000
       Workqueue: cifsiod cifs_oplock_break
       Call Trace:
        __schedule+0x562/0xf40
        ? mark_held_locks+0x4a/0xb0
        schedule+0x57/0xe0
        io_schedule+0x21/0x50
        wait_on_page_bit+0x143/0x190
        ? add_to_page_cache_lru+0x150/0x150
        __filemap_fdatawait_range+0x134/0x190
        ? do_writepages+0x51/0x70
        filemap_fdatawait_range+0x14/0x30
        filemap_fdatawait+0x3b/0x40
        cifs_oplock_break+0x651/0x710
        ? preempt_count_sub+0x18/0xd0
        process_one_work+0x304/0x8e0
        worker_thread+0x9b/0x6a0
        kthread+0x1b2/0x200
        ? process_one_work+0x8e0/0x8e0
        ? kthread_create_on_node+0x40/0x40
        ret_from_fork+0x31/0x40
       dd              D    0   683    171 0x00000000
       Call Trace:
        __schedule+0x562/0xf40
        ? mark_held_locks+0x29/0xb0
        schedule+0x57/0xe0
        io_schedule+0x21/0x50
        wait_on_page_bit+0x143/0x190
        ? add_to_page_cache_lru+0x150/0x150
        __filemap_fdatawait_range+0x134/0x190
        ? do_writepages+0x51/0x70
        filemap_fdatawait_range+0x14/0x30
        filemap_fdatawait+0x3b/0x40
        filemap_write_and_wait+0x4e/0x70
        cifs_flush+0x6a/0xb0
        filp_close+0x52/0xa0
        __close_fd+0xdc/0x150
        SyS_close+0x33/0x60
        entry_SYSCALL_64_fastpath+0x1f/0xbe
      
       Showing all locks held in the system:
       2 locks held by kworker/0:1/16:
        #0:  ("cifsiod"){.+.+.+}, at: process_one_work+0x255/0x8e0
        #1:  ((&cfile->oplock_break)){+.+.+.}, at: process_one_work+0x255/0x8e0
      
       Showing busy workqueues and worker pools:
       workqueue cifsiod: flags=0xc
         pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/1
           in-flight: 16:cifs_oplock_break
           delayed: cifs_writev_complete, cifs_echo_request
       pool 0: cpus=0 node=0 flags=0x0 nice=0 hung=0s workers=3 idle: 750 3
      
      Fix these problems by creating a a new workqueue (with a rescuer) for
      the oplock break work.
      Signed-off-by: default avatarRabin Vincent <rabinv@axis.com>
      Signed-off-by: default avatarSteve French <smfrench@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f13d96bf
    • David Disseldorp's avatar
      cifs: fix CIFS_ENUMERATE_SNAPSHOTS oops · 41134664
      David Disseldorp authored
      commit 6026685d upstream.
      
      As with 61876395, an open directory may have a NULL private_data
      pointer prior to readdir. CIFS_ENUMERATE_SNAPSHOTS must check for this
      before dereference.
      
      Fixes: 834170c8 ("Enable previous version support")
      Signed-off-by: default avatarDavid Disseldorp <ddiss@suse.de>
      Signed-off-by: default avatarSteve French <smfrench@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      41134664
    • David Disseldorp's avatar
      cifs: fix leak in FSCTL_ENUM_SNAPS response handling · 449a7443
      David Disseldorp authored
      commit 0e5c7955 upstream.
      
      The server may respond with success, and an output buffer less than
      sizeof(struct smb_snapshot_array) in length. Do not leak the output
      buffer in this case.
      
      Fixes: 834170c8 ("Enable previous version support")
      Signed-off-by: default avatarDavid Disseldorp <ddiss@suse.de>
      Signed-off-by: default avatarSteve French <smfrench@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      449a7443
    • Björn Jacke's avatar
      CIFS: fix mapping of SFM_SPACE and SFM_PERIOD · 87c0604d
      Björn Jacke authored
      commit b704e70b upstream.
      
      - trailing space maps to 0xF028
      - trailing period maps to 0xF029
      
      This fix corrects the mapping of file names which have a trailing character
      that would otherwise be illegal (period or space) but is allowed by POSIX.
      Signed-off-by: default avatarBjoern Jacke <bjacke@samba.org>
      Signed-off-by: default avatarSteve French <smfrench@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      87c0604d
    • Steve French's avatar
      SMB3: Work around mount failure when using SMB3 dialect to Macs · 8dd4e3ff
      Steve French authored
      commit 7db0a6ef upstream.
      
      Macs send the maximum buffer size in response on ioctl to validate
      negotiate security information, which causes us to fail the mount
      as the response buffer is larger than the expected response.
      
      Changed ioctl response processing to allow for padding of validate
      negotiate ioctl response and limit the maximum response size to
      maximum buffer size.
      Signed-off-by: default avatarSteve French <steve.french@primarydata.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8dd4e3ff
    • Steve French's avatar
      Set unicode flag on cifs echo request to avoid Mac error · 2ac2ad9f
      Steve French authored
      commit 26c9cb66 upstream.
      
      Mac requires the unicode flag to be set for cifs, even for the smb
      echo request (which doesn't have strings).
      
      Without this Mac rejects the periodic echo requests (when mounting
      with cifs) that we use to check if server is down
      Signed-off-by: default avatarSteve French <smfrench@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2ac2ad9f
    • Sachin Prabhu's avatar
      Fix match_prepath() · 4f5e1c48
      Sachin Prabhu authored
      commit cd8c4296 upstream.
      
      Incorrect return value for shares not using the prefix path means that
      we will never match superblocks for these shares.
      
      Fixes: commit c1d8b24d ("Compare prepaths when comparing superblocks")
      Signed-off-by: default avatarSachin Prabhu <sprabhu@redhat.com>
      Reviewed-by: default avatarPavel Shilovsky <pshilov@microsoft.com>
      Signed-off-by: default avatarSteve French <smfrench@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4f5e1c48
    • Vlastimil Babka's avatar
      mm: prevent potential recursive reclaim due to clearing PF_MEMALLOC · 4e434d4f
      Vlastimil Babka authored
      commit 62be1511 upstream.
      
      Patch series "more robust PF_MEMALLOC handling"
      
      This series aims to unify the setting and clearing of PF_MEMALLOC, which
      prevents recursive reclaim.  There are some places that clear the flag
      unconditionally from current->flags, which may result in clearing a
      pre-existing flag.  This already resulted in a bug report that Patch 1
      fixes (without the new helpers, to make backporting easier).  Patch 2
      introduces the new helpers, modelled after existing memalloc_noio_* and
      memalloc_nofs_* helpers, and converts mm core to use them.  Patches 3
      and 4 convert non-mm code.
      
      This patch (of 4):
      
      __alloc_pages_direct_compact() sets PF_MEMALLOC to prevent deadlock
      during page migration by lock_page() (see the comment in
      __unmap_and_move()).  Then it unconditionally clears the flag, which can
      clear a pre-existing PF_MEMALLOC flag and result in recursive reclaim.
      This was not a problem until commit a8161d1e ("mm, page_alloc:
      restructure direct compaction handling in slowpath"), because direct
      compation was called only after direct reclaim, which was skipped when
      PF_MEMALLOC flag was set.
      
      Even now it's only a theoretical issue, as the new callsite of
      __alloc_pages_direct_compact() is reached only for costly orders and
      when gfp_pfmemalloc_allowed() is true, which means either
      __GFP_NOMEMALLOC is in gfp_flags or in_interrupt() is true.  There is no
      such known context, but let's play it safe and make
      __alloc_pages_direct_compact() robust for cases where PF_MEMALLOC is
      already set.
      
      Fixes: a8161d1e ("mm, page_alloc: restructure direct compaction handling in slowpath")
      Link: http://lkml.kernel.org/r/20170405074700.29871-2-vbabka@suse.czSigned-off-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Reported-by: default avatarAndrey Ryabinin <aryabinin@virtuozzo.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Acked-by: default avatarHillf Danton <hillf.zj@alibaba-inc.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Boris Brezillon <boris.brezillon@free-electrons.com>
      Cc: Chris Leech <cleech@redhat.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Josef Bacik <jbacik@fb.com>
      Cc: Lee Duncan <lduncan@suse.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Richard Weinberger <richard@nod.at>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4e434d4f
    • Andrey Ryabinin's avatar
      fs/block_dev: always invalidate cleancache in invalidate_bdev() · 945d0ecd
      Andrey Ryabinin authored
      commit a5f6a6a9 upstream.
      
      invalidate_bdev() calls cleancache_invalidate_inode() iff ->nrpages != 0
      which doen't make any sense.
      
      Make sure that invalidate_bdev() always calls cleancache_invalidate_inode()
      regardless of mapping->nrpages value.
      
      Fixes: c515e1fd ("mm/fs: add hooks to support cleancache")
      Link: http://lkml.kernel.org/r/20170424164135.22350-3-aryabinin@virtuozzo.comSigned-off-by: default avatarAndrey Ryabinin <aryabinin@virtuozzo.com>
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      Acked-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Alexey Kuznetsov <kuznet@virtuozzo.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Nikolay Borisov <n.borisov.lkml@gmail.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      945d0ecd
    • Luis Henriques's avatar
      ceph: fix memory leak in __ceph_setxattr() · 091784ae
      Luis Henriques authored
      commit eeca958d upstream.
      
      The ceph_inode_xattr needs to be released when removing an xattr.  Easily
      reproducible running the 'generic/020' test from xfstests or simply by
      doing:
      
        attr -s attr0 -V 0 /mnt/test && attr -r attr0 /mnt/test
      
      While there, also fix the error path.
      
      Here's the kmemleak splat:
      
      unreferenced object 0xffff88001f86fbc0 (size 64):
        comm "attr", pid 244, jiffies 4294904246 (age 98.464s)
        hex dump (first 32 bytes):
          40 fa 86 1f 00 88 ff ff 80 32 38 1f 00 88 ff ff  @........28.....
          00 01 00 00 00 00 ad de 00 02 00 00 00 00 ad de  ................
        backtrace:
          [<ffffffff81560199>] kmemleak_alloc+0x49/0xa0
          [<ffffffff810f3e5b>] kmem_cache_alloc+0x9b/0xf0
          [<ffffffff812b157e>] __ceph_setxattr+0x17e/0x820
          [<ffffffff812b1c57>] ceph_set_xattr_handler+0x37/0x40
          [<ffffffff8111fb4b>] __vfs_removexattr+0x4b/0x60
          [<ffffffff8111fd37>] vfs_removexattr+0x77/0xd0
          [<ffffffff8111fdd1>] removexattr+0x41/0x60
          [<ffffffff8111fe65>] path_removexattr+0x75/0xa0
          [<ffffffff81120aeb>] SyS_lremovexattr+0xb/0x10
          [<ffffffff81564b20>] entry_SYSCALL_64_fastpath+0x13/0x94
          [<ffffffffffffffff>] 0xffffffffffffffff
      Signed-off-by: default avatarLuis Henriques <lhenriques@suse.com>
      Reviewed-by: default avatar"Yan, Zheng" <zyan@redhat.com>
      Signed-off-by: default avatarIlya Dryomov <idryomov@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      091784ae
    • Michal Hocko's avatar
      fs/xattr.c: zero out memory copied to userspace in getxattr · 9a6bb7b5
      Michal Hocko authored
      commit 81be3dee upstream.
      
      getxattr uses vmalloc to allocate memory if kzalloc fails.  This is
      filled by vfs_getxattr and then copied to the userspace.  vmalloc,
      however, doesn't zero out the memory so if the specific implementation
      of the xattr handler is sloppy we can theoretically expose a kernel
      memory.  There is no real sign this is really the case but let's make
      sure this will not happen and use vzalloc instead.
      
      Fixes: 779302e6 ("fs/xattr.c:getxattr(): improve handling of allocation failures")
      Link: http://lkml.kernel.org/r/20170306103327.2766-1-mhocko@kernel.orgAcked-by: default avatarKees Cook <keescook@chromium.org>
      Reported-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Signed-off-by: default avatarMichal Hocko <mhocko@suse.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9a6bb7b5
    • Martin Brandenburg's avatar
      orangefs: do not check possibly stale size on truncate · 1777e888
      Martin Brandenburg authored
      commit 53950ef5 upstream.
      
      Let the server figure this out because our size might be out of date or
      not present.
      
      The bug was that
      
      	xfs_io -f -t -c "pread -v 0 100" /mnt/foo
      	echo "Test" > /mnt/foo
      	xfs_io -f -t -c "pread -v 0 100" /mnt/foo
      
      fails because the second truncate did not happen if nothing had
      requested the size after the write in echo.  Thus i_size was zero (not
      present) and the orangefs_setattr though i_size was zero and there was
      nothing to do.
      Signed-off-by: default avatarMartin Brandenburg <martin@omnibond.com>
      Signed-off-by: default avatarMike Marshall <hubcap@omnibond.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1777e888
    • Martin Brandenburg's avatar
      orangefs: do not set getattr_time on orangefs_lookup · 63907bb7
      Martin Brandenburg authored
      commit 17930b25 upstream.
      
      Since orangefs_lookup calls orangefs_iget which calls
      orangefs_inode_getattr, getattr_time will get set.
      Signed-off-by: default avatarMartin Brandenburg <martin@omnibond.com>
      Signed-off-by: default avatarMike Marshall <hubcap@omnibond.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      63907bb7
    • Martin Brandenburg's avatar
      orangefs: clean up oversize xattr validation · 59f49610
      Martin Brandenburg authored
      commit e675c5ec upstream.
      
      Also don't check flags as this has been validated by the VFS already.
      
      Fix an off-by-one error in the max size checking.
      
      Stop logging just because userspace wants to write attributes which do
      not fit.
      
      This and the previous commit fix xfstests generic/020.
      Signed-off-by: default avatarMartin Brandenburg <martin@omnibond.com>
      Signed-off-by: default avatarMike Marshall <hubcap@omnibond.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      59f49610
    • Martin Brandenburg's avatar
      127adc18
    • Eric Biggers's avatar
      ext4: evict inline data when writing to memory map · b2764f85
      Eric Biggers authored
      commit 7b4cc978 upstream.
      
      Currently the case of writing via mmap to a file with inline data is not
      handled.  This is maybe a rare case since it requires a writable memory
      map of a very small file, but it is trivial to trigger with on
      inline_data filesystem, and it causes the
      'BUG_ON(ext4_test_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA));' in
      ext4_writepages() to be hit:
      
          mkfs.ext4 -O inline_data /dev/vdb
          mount /dev/vdb /mnt
          xfs_io -f /mnt/file \
      	-c 'pwrite 0 1' \
      	-c 'mmap -w 0 1m' \
      	-c 'mwrite 0 1' \
      	-c 'fsync'
      
      	kernel BUG at fs/ext4/inode.c:2723!
      	invalid opcode: 0000 [#1] SMP
      	CPU: 1 PID: 2532 Comm: xfs_io Not tainted 4.11.0-rc1-xfstests-00301-g071d9acf3d1f #633
      	Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-20170228_101828-anatol 04/01/2014
      	task: ffff88003d3a8040 task.stack: ffffc90000300000
      	RIP: 0010:ext4_writepages+0xc89/0xf8a
      	RSP: 0018:ffffc90000303ca0 EFLAGS: 00010283
      	RAX: 0000028410000000 RBX: ffff8800383fa3b0 RCX: ffffffff812afcdc
      	RDX: 00000a9d00000246 RSI: ffffffff81e660e0 RDI: 0000000000000246
      	RBP: ffffc90000303dc0 R08: 0000000000000002 R09: 869618e8f99b4fa5
      	R10: 00000000852287a2 R11: 00000000a03b49f4 R12: ffff88003808e698
      	R13: 0000000000000000 R14: 7fffffffffffffff R15: 7fffffffffffffff
      	FS:  00007fd3e53094c0(0000) GS:ffff88003e400000(0000) knlGS:0000000000000000
      	CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      	CR2: 00007fd3e4c51000 CR3: 000000003d554000 CR4: 00000000003406e0
      	Call Trace:
      	 ? _raw_spin_unlock+0x27/0x2a
      	 ? kvm_clock_read+0x1e/0x20
      	 do_writepages+0x23/0x2c
      	 ? do_writepages+0x23/0x2c
      	 __filemap_fdatawrite_range+0x80/0x87
      	 filemap_write_and_wait_range+0x67/0x8c
      	 ext4_sync_file+0x20e/0x472
      	 vfs_fsync_range+0x8e/0x9f
      	 ? syscall_trace_enter+0x25b/0x2d0
      	 vfs_fsync+0x1c/0x1e
      	 do_fsync+0x31/0x4a
      	 SyS_fsync+0x10/0x14
      	 do_syscall_64+0x69/0x131
      	 entry_SYSCALL64_slow_path+0x25/0x25
      
      We could try to be smart and keep the inline data in this case, or at
      least support delayed allocation when allocating the block, but these
      solutions would be more complicated and don't seem worthwhile given how
      rare this case seems to be.  So just fix the bug by calling
      ext4_convert_inline_data() when we're asked to make a page writable, so
      that any inline data gets evicted, with the block allocated immediately.
      Reported-by: default avatarNick Alcock <nick.alcock@oracle.com>
      Reviewed-by: default avatarAndreas Dilger <adilger@dilger.ca>
      Signed-off-by: default avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b2764f85
    • Adrian Hunter's avatar
      perf auxtrace: Fix no_size logic in addr_filter__resolve_kernel_syms() · 7929b50d
      Adrian Hunter authored
      commit c3a0bbc7 upstream.
      
      Address filtering with kernel symbols incorrectly resulted in the error
      "Cannot determine size of symbol" because the no_size logic was the wrong
      way around.
      Signed-off-by: default avatarAdrian Hunter <adrian.hunter@intel.com>
      Tested-by: default avatarAndi Kleen <ak@linux.intel.com>
      Link: http://lkml.kernel.org/r/1490357752-27942-1-git-send-email-adrian.hunter@intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7929b50d
    • Mike Marciniszyn's avatar
      IB/hfi1: Prevent kernel QP post send hard lockups · e3cea383
      Mike Marciniszyn authored
      commit b6eac931 upstream.
      
      The driver progress routines can call cond_resched() when
      a timeslice is exhausted and irqs are enabled.
      
      If the ULP had been holding a spin lock without disabling irqs and
      the post send directly called the progress routine, the cond_resched()
      could yield allowing another thread from the same ULP to deadlock
      on that same lock.
      
      Correct by replacing the current hfi1_do_send() calldown with a unique
      one for post send and adding an argument to hfi1_do_send() to indicate
      that the send engine is running in a thread.   If the routine is not
      running in a thread, avoid calling cond_resched().
      
      Fixes: Commit 831464ce ("IB/hfi1: Don't call cond_resched in atomic mode when sending packets")
      Reviewed-by: default avatarDennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: default avatarMike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: default avatarDennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e3cea383
    • Jack Morgenstein's avatar
      IB/mlx4: Reduce SRIOV multicast cleanup warning message to debug level · 43c54927
      Jack Morgenstein authored
      commit fb7a9174 upstream.
      
      A warning message during SRIOV multicast cleanup should have actually been
      a debug level message. The condition generating the warning does no harm
      and can fill the message log.
      
      In some cases, during testing, some tests were so intense as to swamp the
      message log with these warning messages, causing a stall in the console
      message log output task. This stall caused an NMI to be sent to all CPUs
      (so that they all dumped their stacks into the message log).
      Aside from the message flood causing an NMI, the tests all passed.
      
      Once the message flood which caused the NMI is removed (by reducing the
      warning message to debug level), the NMI no longer occurs.
      
      Sample message log (console log) output illustrating the flood and
      resultant NMI (snippets with comments and modified with ... instead
      of hex digits, to satisfy checkpatch.pl):
      
       <mlx4_ib> _mlx4_ib_mcg_port_cleanup: ... WARNING: group refcount 1!!!...
       *** About 4000 almost identical lines in less than one second ***
       <mlx4_ib> _mlx4_ib_mcg_port_cleanup: ... WARNING: group refcount 1!!!...
       INFO: rcu_sched detected stalls on CPUs/tasks: { 17} (...)
       *** { 17} above indicates that CPU 17 was the one that stalled ***
       sending NMI to all CPUs:
       ...
       NMI backtrace for cpu 17
       CPU: 17 PID: 45909 Comm: kworker/17:2
       Hardware name: HP ProLiant DL360p Gen8, BIOS P71 09/08/2013
       Workqueue: events fb_flashcursor
       task: ffff880478...... ti: ffff88064e...... task.ti: ffff88064e......
       RIP: 0010:[ffffffff81......]  [ffffffff81......] io_serial_in+0x15/0x20
       RSP: 0018:ffff88064e257cb0  EFLAGS: 00000002
       RAX: 0000000000...... RBX: ffffffff81...... RCX: 0000000000......
       RDX: 0000000000...... RSI: 0000000000...... RDI: ffffffff81......
       RBP: ffff88064e...... R08: ffffffff81...... R09: 0000000000......
       R10: 0000000000...... R11: ffff88064e...... R12: 0000000000......
       R13: 0000000000...... R14: ffffffff81...... R15: 0000000000......
       FS:  0000000000......(0000) GS:ffff8804af......(0000) knlGS:000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080......
       CR2: 00007f2a2f...... CR3: 0000000001...... CR4: 0000000000......
       DR0: 0000000000...... DR1: 0000000000...... DR2: 0000000000......
       DR3: 0000000000...... DR6: 00000000ff...... DR7: 0000000000......
       Stack:
       ffff88064e...... ffffffff81...... ffffffff81...... 0000000000......
       ffffffff81...... ffff88064e...... ffffffff81...... ffffffff81......
       ffffffff81...... ffff88064e...... ffffffff81...... 0000000000......
       Call Trace:
      [<ffffffff813d099b>] wait_for_xmitr+0x3b/0xa0
      [<ffffffff813d0b5c>] serial8250_console_putchar+0x1c/0x30
      [<ffffffff813d0b40>] ? serial8250_console_write+0x140/0x140
      [<ffffffff813cb5fa>] uart_console_write+0x3a/0x80
      [<ffffffff813d0aae>] serial8250_console_write+0xae/0x140
      [<ffffffff8107c4d1>] call_console_drivers.constprop.15+0x91/0xf0
      [<ffffffff8107d6cf>] console_unlock+0x3bf/0x400
      [<ffffffff813503cd>] fb_flashcursor+0x5d/0x140
      [<ffffffff81355c30>] ? bit_clear+0x120/0x120
      [<ffffffff8109d5fb>] process_one_work+0x17b/0x470
      [<ffffffff8109e3cb>] worker_thread+0x11b/0x400
      [<ffffffff8109e2b0>] ? rescuer_thread+0x400/0x400
      [<ffffffff810a5aef>] kthread+0xcf/0xe0
      [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
      [<ffffffff81645858>] ret_from_fork+0x58/0x90
      [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
      Code: 48 89 e5 d3 e6 48 63 f6 48 03 77 10 8b 06 5d c3 66 0f 1f 44 00 00 66 66 66 6
      
      As indicated in the stack trace above, the console output task got swamped.
      
      Fixes: b9c5d6a6 ("IB/mlx4: Add multicast group (MCG) paravirtualization for SR-IOV")
      Signed-off-by: default avatarJack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      43c54927
    • Jack Morgenstein's avatar
      IB/mlx4: Fix ib device initialization error flow · 9ae6b33d
      Jack Morgenstein authored
      commit 99e68909 upstream.
      
      In mlx4_ib_add, procedure mlx4_ib_alloc_eqs is called to allocate EQs.
      
      However, in the mlx4_ib_add error flow, procedure mlx4_ib_free_eqs is not
      called to free the allocated EQs.
      
      Fixes: e605b743 ("IB/mlx4: Increase the number of vectors (EQs) available for ULPs")
      Signed-off-by: default avatarJack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9ae6b33d
    • Shamir Rabinovitch's avatar
      IB/IPoIB: ibX: failed to create mcg debug file · d20bfe22
      Shamir Rabinovitch authored
      commit 771a5258 upstream.
      
      When udev renames the netdev devices, ipoib debugfs entries does not
      get renamed. As a result, if subsequent probe of ipoib device reuse the
      name then creating a debugfs entry for the new device would fail.
      
      Also, moved ipoib_create_debug_files and ipoib_delete_debug_files as part
      of ipoib event handling in order to avoid any race condition between these.
      
      Fixes: 1732b0ef ([IPoIB] add path record information in debugfs)
      Signed-off-by: default avatarVijay Kumar <vijay.ac.kumar@oracle.com>
      Signed-off-by: default avatarShamir Rabinovitch <shamir.rabinovitch@oracle.com>
      Reviewed-by: default avatarMark Bloch <markb@mellanox.com>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d20bfe22