1. 05 May, 2017 1 commit
    • libnvdimm, pfn: fix 'npfns' vs section alignment · d5483fed
      Dan Williams authored
      Fix failures to create namespaces due to the vmem_altmap not advertising
      enough free space to store the memmap.
      
       WARNING: CPU: 15 PID: 8022 at arch/x86/mm/init_64.c:656 arch_add_memory+0xde/0xf0
       [..]
       Call Trace:
        dump_stack+0x63/0x83
        __warn+0xcb/0xf0
        warn_slowpath_null+0x1d/0x20
        arch_add_memory+0xde/0xf0
        devm_memremap_pages+0x244/0x440
        pmem_attach_disk+0x37e/0x490 [nd_pmem]
        nd_pmem_probe+0x7e/0xa0 [nd_pmem]
        nvdimm_bus_probe+0x71/0x120 [libnvdimm]
        driver_probe_device+0x2bb/0x460
        bind_store+0x114/0x160
        drv_attr_store+0x25/0x30
      
      In commit 658922e5 "libnvdimm, pfn: fix memmap reservation sizing"
      we arranged for the capacity to be allocated, but failed to also update
      the 'npfns' parameter. This leads to cases where there is enough
      capacity reserved to hold all the allocated sections, but
      vmemmap_populate_hugepages() still encounters -ENOMEM from
      altmap_alloc_block_buf().
      
      This fix is a stop-gap until we can teach the core memory hotplug
      implementation to permit sub-section hotplug.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 658922e5 ("libnvdimm, pfn: fix memmap reservation sizing")
      Reported-by: Anisha Allada <anisha.allada@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
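
As a rough illustration of why 'npfns' has to agree with the section-aligned capacity, the userspace sketch below computes how much memmap is needed once the page-frame count is rounded up to a whole memory section. The 128 MiB section size, 4 KiB page size, and 64-byte per-page memmap entry are assumptions typical of x86_64, and the helper names are hypothetical, not the kernel's.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86_64-like constants; none of these come from the commit. */
#define SKETCH_PAGE_SIZE     4096ULL
#define SKETCH_SECTION_SIZE  (128ULL << 20)  /* 128 MiB per memory section */
#define SKETCH_STRUCT_PAGE   64ULL           /* memmap bytes per page frame */

/* Round a byte count up to the next multiple of the section size. */
uint64_t sketch_section_align_up(uint64_t bytes)
{
    return (bytes + SKETCH_SECTION_SIZE - 1) / SKETCH_SECTION_SIZE
           * SKETCH_SECTION_SIZE;
}

/* Page frames that whole-section hotplug would populate for 'bytes'. */
uint64_t sketch_npfns_section_aligned(uint64_t bytes)
{
    assert(bytes > 0);
    return sketch_section_align_up(bytes) / SKETCH_PAGE_SIZE;
}

/* Memmap bytes that must be available for that many page frames. */
uint64_t sketch_memmap_bytes(uint64_t npfns)
{
    return npfns * SKETCH_STRUCT_PAGE;
}
```

Under these assumptions a 200 MiB range needs 51200 page frames when counted exactly but 65536 once rounded up to two sections, so a reservation sized from the smaller count would fall short of what whole-section population requests.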
  2. 04 May, 2017 2 commits
  3. 01 May, 2017 3 commits
    • device-dax: fix sysfs attribute deadlock · 565851c9
      Dan Williams authored
      Usage of device_lock() for dax_region attributes is unnecessary and
      deadlock prone. It's unnecessary because the order of registration /
      un-registration guarantees that drvdata is always valid. It's deadlock
      prone because it sets up this situation:
      
       ndctl           D    0  2170   2082 0x00000000
       Call Trace:
        __schedule+0x31f/0x980
        schedule+0x3d/0x90
        schedule_preempt_disabled+0x15/0x20
        __mutex_lock+0x402/0x980
        ? __mutex_lock+0x158/0x980
        ? align_show+0x2b/0x80 [dax]
        ? kernfs_seq_start+0x2f/0x90
        mutex_lock_nested+0x1b/0x20
        align_show+0x2b/0x80 [dax]
        dev_attr_show+0x20/0x50
      
       ndctl           D    0  2186   2079 0x00000000
       Call Trace:
        __schedule+0x31f/0x980
        schedule+0x3d/0x90
        __kernfs_remove+0x1f6/0x340
        ? kernfs_remove_by_name_ns+0x45/0xa0
        ? remove_wait_queue+0x70/0x70
        kernfs_remove_by_name_ns+0x45/0xa0
        remove_files.isra.1+0x35/0x70
        sysfs_remove_group+0x44/0x90
        sysfs_remove_groups+0x2e/0x50
        dax_region_unregister+0x25/0x40 [dax]
        devm_action_release+0xf/0x20
        release_nodes+0x16d/0x2b0
        devres_release_all+0x3c/0x60
        device_release_driver_internal+0x17d/0x220
        device_release_driver+0x12/0x20
        unbind_store+0x112/0x160
      
      ndctl/2170 is trying to acquire the device_lock() to read an attribute,
      and ndctl/2186 is holding the device_lock() while trying to drain all
      active attribute readers.
      
      Thanks to Yi Zhang for the reproduction script.
      
      Fixes: d7fe1a67 ("dax: add region 'id', 'size', and 'align' attributes")
      Cc: <stable@vger.kernel.org>
      Reported-by: Yi Zhang <yizhan@redhat.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • libnvdimm: restore "libnvdimm: band aid btt vs clear poison locking" · a3e9af95
      Dan Williams authored
      This continues the 4.11 status quo of disabling error clearing in
      the BTT I/O path. Toshi found that even though we have eliminated all
      the libnvdimm sources of sleeping-while-atomic triggers, we still have
      sleeping operations that will occur in the path to send the ACPI DSM to
      the DIMM to clear the error:
      
       BUG: sleeping function called from invalid context at mm/slab.h:432
       in_atomic(): 1, irqs_disabled(): 0, pid: 13353, name: dd
       Call Trace:
        dump_stack+0x86/0xc3
        ___might_sleep+0x17d/0x250
        __might_sleep+0x4a/0x80
        __kmalloc+0x1c0/0x2e0
        acpi_os_allocate_zeroed+0x2d/0x2f
        acpi_evaluate_object+0x59/0x3b1
        acpi_evaluate_dsm+0xbd/0x10c
        acpi_nfit_ctl+0x1ef/0x7c0 [nfit]
        ? nsio_rw_bytes+0x152/0x280
        nvdimm_clear_poison+0x77/0x140
        nsio_rw_bytes+0x18f/0x280
        btt_write_pg+0x1d4/0x3d0 [nd_btt]
        btt_make_request+0x119/0x2d0 [nd_btt]
      
      A solution for tracking and handling media errors natively in the BTT is
      needed.
      
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Dave Jiang <dave.jiang@intel.com>
      Cc: Vishal Verma <vishal.l.verma@intel.com>
      Reported-by: Toshi Kani <toshi.kani@hpe.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • libnvdimm: fix nvdimm_bus_lock() vs device_lock() ordering · 452bae0a
      Dan Williams authored
      A debug patch to turn the standard device_lock() into something that
      lockdep can analyze yielded the following:
      
       ======================================================
       [ INFO: possible circular locking dependency detected ]
       4.11.0-rc4+ #106 Tainted: G           O
       -------------------------------------------------------
       lt-libndctl/1898 is trying to acquire lock:
        (&dev->nvdimm_mutex/3){+.+.+.}, at: [<ffffffffc023c948>] nd_attach_ndns+0x178/0x1b0 [libnvdimm]
      
       but task is already holding lock:
        (&nvdimm_bus->reconfig_mutex){+.+.+.}, at: [<ffffffffc022e0b1>] nvdimm_bus_lock+0x21/0x30 [libnvdimm]
      
       which lock already depends on the new lock.
      
       the existing dependency chain (in reverse order) is:
      
       -> #1 (&nvdimm_bus->reconfig_mutex){+.+.+.}:
              lock_acquire+0xf6/0x1f0
              __mutex_lock+0x88/0x980
              mutex_lock_nested+0x1b/0x20
              nvdimm_bus_lock+0x21/0x30 [libnvdimm]
              nvdimm_namespace_capacity+0x1b/0x40 [libnvdimm]
              nvdimm_namespace_common_probe+0x230/0x510 [libnvdimm]
              nd_pmem_probe+0x14/0x180 [nd_pmem]
              nvdimm_bus_probe+0xa9/0x260 [libnvdimm]
      
       -> #0 (&dev->nvdimm_mutex/3){+.+.+.}:
              __lock_acquire+0x1107/0x1280
              lock_acquire+0xf6/0x1f0
              __mutex_lock+0x88/0x980
              mutex_lock_nested+0x1b/0x20
              nd_attach_ndns+0x178/0x1b0 [libnvdimm]
              nd_namespace_store+0x308/0x3c0 [libnvdimm]
              namespace_store+0x87/0x220 [libnvdimm]
      
      In this case '&dev->nvdimm_mutex/3' mirrors '&dev->mutex'.
      
      Fix this by replacing the use of device_lock() with nvdimm_bus_lock() to protect
      nd_{attach,detach}_ndns() operations.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 8c2f7e86 ("libnvdimm: infrastructure for btt devices")
      Reported-by: Yi Zhang <yizhan@redhat.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  4. 29 Apr, 2017 1 commit
    • libnvdimm: rework region badblocks clearing · 23f49844
      Dan Williams authored
      Toshi noticed that the new support for a region-level badblocks missed
      the case where errors are cleared due to BTT I/O.
      
      An initial attempt to fix this ran into a "sleeping while atomic"
      warning due to taking the nvdimm_bus_lock() in the BTT I/O path to
      satisfy the locking requirements of __nvdimm_bus_badblocks_clear().
      However, that lock is not needed since we are not acting on any data that
      is subject to change under that lock. The badblocks instance has its own
      internal lock to handle mutations of the error list.
      
      So, in order to make it clear that we are just acting on region devices,
      rename __nvdimm_bus_badblocks_clear() to nvdimm_clear_badblocks_regions().
      Eliminate the lock and consolidate all support routines for the new
      nvdimm_account_cleared_poison() in drivers/nvdimm/bus.c. Finally, take the
      opportunity to clean up some unnecessary casts, make the calling
      convention of nvdimm_clear_badblocks_regions() clearer by replacing struct
      resource with the minimal struct clear_badblocks_context, and use the
      DEVICE_ATTR macro.
      
      Cc: Dave Jiang <dave.jiang@intel.com>
      Cc: Vishal Verma <vishal.l.verma@intel.com>
      Reported-by: Toshi Kani <toshi.kani@hpe.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  5. 28 Apr, 2017 4 commits
    • acpi, nfit: kill ACPI_NFIT_DEBUG · 7699a6a3
      Dan Williams authored
      Inevitably when one actually needs to debug a DSM issue it's on a
      distribution kernel that has CONFIG_ACPI_NFIT_DEBUG=n. The config symbol
      was only there to avoid the compile error due to the missing fallback for
      print_hex_dump_debug in the CONFIG_DYNAMIC_DEBUG=n case. That was fixed
      with commit cdf17449 "hexdump: do not print debug dumps for
      !CONFIG_DEBUG", so the config symbol can just be dropped.
      
      Cc: Joe Perches <joe@perches.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • libnvdimm: fix clear length of nvdimm_forget_poison() · 8d13c029
      Toshi Kani authored
      The ND_CMD_CLEAR_ERROR command returns 'clear_err.cleared', the length
      of the error actually cleared, which may be smaller than the requested
      'len'.
      
      Change nvdimm_clear_poison() to call nvdimm_forget_poison() with
      'clear_err.cleared' when this value is valid.
      
      Cc: <stable@vger.kernel.org>
      Fixes: e046114a ("libnvdimm: clear the internal poison_list when clearing badblocks")
      Cc: Dave Jiang <dave.jiang@intel.com>
      Cc: Vishal Verma <vishal.l.verma@intel.com>
      Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
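
The idea of forgetting only what was actually cleared can be sketched in userspace as follows. The struct and function names here are hypothetical stand-ins, and treating a non-negative status with a non-zero 'cleared' as the "valid" case is an assumption about what the message means by "when this value is valid".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the result of a clear-error command. */
struct sketch_clear_result {
    int      status;   /* 0 on success, negative on failure (assumed) */
    uint64_t cleared;  /* bytes actually cleared, may be < requested */
};

/*
 * Return how many bytes should be dropped from the tracked poison list.
 * Only the range reported as cleared is forgotten; a failed or empty
 * result forgets nothing.
 */
uint64_t sketch_forget_len(const struct sketch_clear_result *res,
                           uint64_t requested)
{
    assert(res != NULL);
    if (res->status < 0 || res->cleared == 0)
        return 0;
    /* Defensive clamp: never forget more than was asked for. */
    return res->cleared < requested ? res->cleared : requested;
}
```

With a 4096-byte request of which only 512 bytes are reported cleared, this sketch forgets 512 bytes and leaves the remainder tracked as poisoned.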
    • libnvdimm, pmem: fix a NULL pointer BUG in nd_pmem_notify · b2518c78
      Toshi Kani authored
      The following BUG was observed when nd_pmem_notify() was called
      for a BTT device.  The use of a pmem_device pointer is not valid
      with BTT.
      
       BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
       IP: nd_pmem_notify+0x30/0xf0 [nd_pmem]
       Call Trace:
        nd_device_notify+0x40/0x50
        child_notify+0x10/0x20
        device_for_each_child+0x50/0x90
        nd_region_notify+0x20/0x30
        nd_device_notify+0x40/0x50
        nvdimm_region_notify+0x27/0x30
        acpi_nfit_scrub+0x341/0x590 [nfit]
        process_one_work+0x197/0x450
        worker_thread+0x4e/0x4a0
        kthread+0x109/0x140
      
      Fix nd_pmem_notify() by setting nd_region and badblocks pointers
      properly for BTT.
      
      Cc: <stable@vger.kernel.org>
      Cc: Vishal Verma <vishal.l.verma@intel.com>
      Fixes: 71999466 ("libnvdimm: async notification support")
      Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • libnvdimm, region: sysfs trigger for nvdimm_flush() · ab630891
      Dan Williams authored
      The nvdimm_flush() mechanism helps to reduce the impact of an ADR
      (asynchronous-dimm-refresh) failure. The ADR mechanism handles flushing
      platform WPQ (write-pending-queue) buffers when power is removed. The
      nvdimm_flush() mechanism performs that same function on-demand.
      
      When a pmem namespace is associated with a block device, an
      nvdimm_flush() is triggered with every block-layer REQ_FUA, or REQ_FLUSH
      request. These requests are typically associated with filesystem
      metadata updates. However, when a namespace is in device-dax mode,
      userspace (think database metadata) needs another path to perform the
      same flushing. In other words this is not required to make data
      persistent, but in the case of metadata it allows for a smaller failure
      domain in the unlikely event of an ADR failure.
      
      The new 'deep_flush' attribute is visible when the individual DIMMs
      backing a given interleave-set are described by platform firmware. In
      ACPI terms this is "NVDIMM Region Mapping Structures" and associated
      "Flush Hint Address Structures". Reads return "1" if the region supports
      triggering WPQ flushes on all DIMMs. Reads return "0" if the flush
      operation is a platform nop, and in that case the attribute is
      read-only.
      
      Why sysfs and not an ioctl? An ioctl requires establishing a new
      ioctl function number space for device-dax. Given that this would be
      called on a device-dax fd an application could be forgiven for
      accidentally calling this on a filesystem-dax fd. Placing this interface
      in libnvdimm sysfs removes that potential for collision with a
      filesystem ioctl, and it keeps ioctls out of the generic device-dax
      implementation.
      
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
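
From userspace, driving the 'deep_flush' attribute might look like the minimal sketch below. It assumes that writing "1" to the attribute is what requests the flush, and the function name is hypothetical. The caller supplies the attribute path; something like /sys/bus/nd/devices/region0/deep_flush is an assumed example, not a path given in the commit.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Returns 1 if a flush was requested, 0 if the attribute reads "0"
 * (the flush is a platform nop), and -1 on any I/O error.
 */
int sketch_deep_flush(const char *attr_path)
{
    char state = 0;
    int fd;

    assert(attr_path != NULL);

    /* Read the first byte to learn whether flushing is supported. */
    fd = open(attr_path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (read(fd, &state, 1) != 1) {
        close(fd);
        return -1;
    }
    close(fd);

    if (state != '1')
        return 0;  /* nothing to trigger */

    /* Assumed convention: writing "1" requests the flush. */
    fd = open(attr_path, O_WRONLY);
    if (fd < 0)
        return -1;
    if (write(fd, "1", 1) != 1) {
        close(fd);
        return -1;
    }
    close(fd);
    return 1;
}
```

A device-dax application could call such a helper after updating its own metadata, which matches the motivation above of shrinking the failure domain rather than making data persistent in the first place.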
  6. 27 Apr, 2017 1 commit
  7. 24 Apr, 2017 1 commit
    • libnvdimm, region: fix flush hint detection crash · bc042fdf
      Dan Williams authored
      In the case where a dimm does not have any associated flush hints the
      ndrd->flush_wpq array may be uninitialized leading to crashes with the
      following signature:
      
       BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
       IP: region_visible+0x10f/0x160 [libnvdimm]
      
       Call Trace:
        internal_create_group+0xbe/0x2f0
        sysfs_create_groups+0x40/0x80
        device_add+0x2d8/0x650
        nd_async_device_register+0x12/0x40 [libnvdimm]
        async_run_entry_fn+0x39/0x170
        process_one_work+0x212/0x6c0
        ? process_one_work+0x197/0x6c0
        worker_thread+0x4e/0x4a0
        kthread+0x10c/0x140
        ? process_one_work+0x6c0/0x6c0
        ? kthread_create_on_node+0x60/0x60
        ret_from_fork+0x31/0x40
      
      Cc: <stable@vger.kernel.org>
      Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
      Fixes: f284a4f2 ("libnvdimm: introduce nvdimm_flush() and nvdimm_has_flush()")
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  8. 18 Apr, 2017 2 commits
  9. 17 Apr, 2017 4 commits
  10. 14 Apr, 2017 2 commits
  11. 13 Apr, 2017 9 commits
  12. 12 Apr, 2017 2 commits
    • x86, pmem: fix broken __copy_user_nocache cache-bypass assumptions · 11e63f6d
      Dan Williams authored
      Before we rework the "pmem api" to stop abusing __copy_user_nocache()
      for memcpy_to_pmem() we need to fix cases where we may strand dirty data
      in the cpu cache. The problem occurs when copy_from_iter_pmem() is used
      for arbitrary data transfers from userspace. There is no guarantee that
      these transfers, performed by dax_iomap_actor(), will have aligned
      destinations or aligned transfer lengths. Backstop the usage of
      __copy_user_nocache() with explicit cache management in these unaligned
      cases.
      
      Yes, copy_from_iter_pmem() is now too big for an inline, but addressing
      that is saved for a later patch that moves the entirety of the "pmem
      api" into the pmem driver directly.
      
      Fixes: 5de490da ("pmem: add copy_from_iter_pmem() and clear_pmem()")
      Cc: <stable@vger.kernel.org>
      Cc: <x86@kernel.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Matthew Wilcox <mawilcox@microsoft.com>
      Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
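
The "unaligned cases" can be pictured with a simplified userspace model. It assumes cache-bypassing stores only cover 8-byte-aligned, 8-byte-sized chunks, so any unaligned head or tail of a transfer may be written through the cache and need an explicit write-back. The 8-byte granularity and all names below are assumptions for illustration; this is not the check the patch itself performs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NT_ALIGN 8u  /* assumed granularity of cache-bypassing stores */

struct sketch_flush_plan {
    int flush_head;  /* leading bytes may linger in the CPU cache */
    int flush_tail;  /* trailing bytes may linger in the CPU cache */
};

/* Decide which ends of [dest, dest + len) would need explicit write-back. */
struct sketch_flush_plan sketch_plan_flush(uintptr_t dest, size_t len)
{
    struct sketch_flush_plan plan = { 0, 0 };

    assert(len > 0);
    if (dest % SKETCH_NT_ALIGN != 0)
        plan.flush_head = 1;
    if ((dest + len) % SKETCH_NT_ALIGN != 0)
        plan.flush_tail = 1;
    return plan;
}
```

In this model an aligned 64-byte copy needs no extra cache management, while a copy that starts or ends off an 8-byte boundary is flagged for a write-back at the affected end.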
    • device-dax: switch to srcu, fix rcu_read_lock() vs pte allocation · 956a4cd2
      Dan Williams authored
      The following warning triggers with a new unit test that stresses the
      device-dax interface.
      
       ===============================
       [ ERR: suspicious RCU usage.  ]
       4.11.0-rc4+ #1049 Tainted: G           O
       -------------------------------
       ./include/linux/rcupdate.h:521 Illegal context switch in RCU read-side critical section!
      
       other info that might help us debug this:
      
       rcu_scheduler_active = 2, debug_locks = 0
       2 locks held by fio/9070:
        #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff8d0739d7>] __do_page_fault+0x167/0x4f0
        #1:  (rcu_read_lock){......}, at: [<ffffffffc03fbd02>] dax_dev_huge_fault+0x32/0x620 [dax]
      
       Call Trace:
        dump_stack+0x86/0xc3
        lockdep_rcu_suspicious+0xd7/0x110
        ___might_sleep+0xac/0x250
        __might_sleep+0x4a/0x80
        __alloc_pages_nodemask+0x23a/0x360
        alloc_pages_current+0xa1/0x1f0
        pte_alloc_one+0x17/0x80
        __pte_alloc+0x1e/0x120
        __get_locked_pte+0x1bf/0x1d0
        insert_pfn.isra.70+0x3a/0x100
        ? lookup_memtype+0xa6/0xd0
        vm_insert_mixed+0x64/0x90
        dax_dev_huge_fault+0x520/0x620 [dax]
        ? dax_dev_huge_fault+0x32/0x620 [dax]
        dax_dev_fault+0x10/0x20 [dax]
        __do_fault+0x1e/0x140
        __handle_mm_fault+0x9af/0x10d0
        handle_mm_fault+0x16d/0x370
        ? handle_mm_fault+0x47/0x370
        __do_page_fault+0x28c/0x4f0
        trace_do_page_fault+0x58/0x2a0
        do_async_page_fault+0x1a/0xa0
        async_page_fault+0x28/0x30
      
      Inserting a page table entry may trigger an allocation while we are
      holding a read lock to keep the device instance alive for the duration
      of the fault. Use srcu for this keep-alive protection.
      
      Fixes: dee41079 ("/dev/dax, core: file operations and dax-mmap")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  13. 11 Apr, 2017 2 commits
    • libnvdimm: band aid btt vs clear poison locking · 4aa5615e
      Dan Williams authored
      The following warning results from holding a lane spinlock,
      preempt_disable(), or the btt map spinlock and then trying to take the
      reconfig_mutex to walk the poison list and potentially add new entries.
      
       BUG: sleeping function called from invalid context at kernel/locking/mutex.c:747
       in_atomic(): 1, irqs_disabled(): 0, pid: 17159, name: dd
       [..]
       Call Trace:
        dump_stack+0x85/0xc8
        ___might_sleep+0x184/0x250
        __might_sleep+0x4a/0x90
        __mutex_lock+0x58/0x9b0
        ? nvdimm_bus_lock+0x21/0x30 [libnvdimm]
        ? __nvdimm_bus_badblocks_clear+0x2f/0x60 [libnvdimm]
        ? acpi_nfit_forget_poison+0x79/0x80 [nfit]
        ? _raw_spin_unlock+0x27/0x40
        mutex_lock_nested+0x1b/0x20
        nvdimm_bus_lock+0x21/0x30 [libnvdimm]
        nvdimm_forget_poison+0x25/0x50 [libnvdimm]
        nvdimm_clear_poison+0x106/0x140 [libnvdimm]
        nsio_rw_bytes+0x164/0x270 [libnvdimm]
        btt_write_pg+0x1de/0x3e0 [nd_btt]
        ? blk_queue_enter+0x30/0x290
        btt_make_request+0x11a/0x310 [nd_btt]
        ? blk_queue_enter+0xb7/0x290
        ? blk_queue_enter+0x30/0x290
        generic_make_request+0x118/0x3b0
      
      As a minimal fix, disable error clearing when the BTT is enabled for the
      namespace. For the final fix a larger rework of the poison list locking
      is needed.
      
      Note that this is not a problem in the blk case since that path never
      calls nvdimm_clear_poison().
      
      Cc: <stable@vger.kernel.org>
      Fixes: 82bf1037 ("libnvdimm: check and clear poison before writing to pmem")
      Cc: Dave Jiang <dave.jiang@intel.com>
      [jeff: dynamically disable error clearing in the btt case]
      Suggested-by: Jeff Moyer <jmoyer@redhat.com>
      Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
      Reported-by: Vishal Verma <vishal.l.verma@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • libnvdimm: fix reconfig_mutex, mmap_sem, and jbd2_handle lockdep splat · 0beb2012
      Dan Williams authored
      Holding the reconfig_mutex over a potential userspace fault sets up a
      lockdep dependency chain between filesystem-DAX and the libnvdimm ioctl
      path. Move the user access outside of the lock.
      
           [ INFO: possible circular locking dependency detected ]
           4.11.0-rc3+ #13 Tainted: G        W  O
           -------------------------------------------------------
           fallocate/16656 is trying to acquire lock:
            (&nvdimm_bus->reconfig_mutex){+.+.+.}, at: [<ffffffffa00080b1>] nvdimm_bus_lock+0x21/0x30 [libnvdimm]
           but task is already holding lock:
            (jbd2_handle){++++..}, at: [<ffffffff813b4944>] start_this_handle+0x104/0x460
      
          which lock already depends on the new lock.
      
          the existing dependency chain (in reverse order) is:
      
          -> #2 (jbd2_handle){++++..}:
                  lock_acquire+0xbd/0x200
                  start_this_handle+0x16a/0x460
                  jbd2__journal_start+0xe9/0x2d0
                  __ext4_journal_start_sb+0x89/0x1c0
                  ext4_dirty_inode+0x32/0x70
                  __mark_inode_dirty+0x235/0x670
                  generic_update_time+0x87/0xd0
                  touch_atime+0xa9/0xd0
                  ext4_file_mmap+0x90/0xb0
                  mmap_region+0x370/0x5b0
                  do_mmap+0x415/0x4f0
                  vm_mmap_pgoff+0xd7/0x120
                  SyS_mmap_pgoff+0x1c5/0x290
                  SyS_mmap+0x22/0x30
                  entry_SYSCALL_64_fastpath+0x1f/0xc2
      
          -> #1 (&mm->mmap_sem){++++++}:
                  lock_acquire+0xbd/0x200
                  __might_fault+0x70/0xa0
                  __nd_ioctl+0x683/0x720 [libnvdimm]
                  nvdimm_ioctl+0x8b/0xe0 [libnvdimm]
                  do_vfs_ioctl+0xa8/0x740
                  SyS_ioctl+0x79/0x90
                  do_syscall_64+0x6c/0x200
                  return_from_SYSCALL_64+0x0/0x7a
      
          -> #0 (&nvdimm_bus->reconfig_mutex){+.+.+.}:
                  __lock_acquire+0x16b6/0x1730
                  lock_acquire+0xbd/0x200
                  __mutex_lock+0x88/0x9b0
                  mutex_lock_nested+0x1b/0x20
                  nvdimm_bus_lock+0x21/0x30 [libnvdimm]
                  nvdimm_forget_poison+0x25/0x50 [libnvdimm]
                  nvdimm_clear_poison+0x106/0x140 [libnvdimm]
                  pmem_do_bvec+0x1c2/0x2b0 [nd_pmem]
                  pmem_make_request+0xf9/0x270 [nd_pmem]
                  generic_make_request+0x118/0x3b0
                  submit_bio+0x75/0x150
      
      Cc: <stable@vger.kernel.org>
      Fixes: 62232e45 ("libnvdimm: control (ioctl) messages for nvdimm_bus and nvdimm devices")
      Cc: Dave Jiang <dave.jiang@intel.com>
      Reported-by: Vishal Verma <vishal.l.verma@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  14. 04 Apr, 2017 1 commit
    • libnvdimm: fix blk free space accounting · fe514739
      Dan Williams authored
      Commit a1f3e4d6 "libnvdimm, region: update nd_region_available_dpa()
      for multi-pmem support" reworked blk dpa (DIMM Physical Address)
      accounting to comprehend multiple pmem namespace allocations aliasing
      with a given blk-dpa range.
      
      The following call trace is a result of failing to account for allocated
      blk capacity.
      
       WARNING: CPU: 1 PID: 2433 at tools/testing/nvdimm/../../../drivers/nvdimm/names
      4 size_store+0x6f3/0x930 [libnvdimm]
       nd_region region5: allocation underrun: 0x0 of 0x1000000 bytes
       [..]
       Call Trace:
        dump_stack+0x86/0xc3
        __warn+0xcb/0xf0
        warn_slowpath_fmt+0x5f/0x80
        size_store+0x6f3/0x930 [libnvdimm]
        dev_attr_store+0x18/0x30
      
      If a given blk-dpa allocation does not alias with any pmem ranges then
      the full allocation should be accounted as busy space, not the size of
      the current pmem contribution to the region.
      
      The thinko that led to this confusion was not realizing that the struct
      resource management is already guaranteeing no collisions between pmem
      allocations and blk allocations on the same dimm. Also, we do not try to
      support blk allocations in aliased pmem holes.
      
      This patch also fixes a case where the available blk goes negative.
      
      Cc: <stable@vger.kernel.org>
      Fixes: a1f3e4d6 ("libnvdimm, region: update nd_region_available_dpa() for multi-pmem support")
      Reported-by: Dariusz Dokupil <dariusz.dokupil@intel.com>
      Reported-by: Dave Jiang <dave.jiang@intel.com>
      Reported-by: Vishal Verma <vishal.l.verma@intel.com>
      Tested-by: Dave Jiang <dave.jiang@intel.com>
      Tested-by: Vishal Verma <vishal.l.verma@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  15. 28 Mar, 2017 1 commit
  16. 26 Mar, 2017 4 commits
    • Linux 4.11-rc4 · c02ed2e7
      Linus Torvalds authored
    • Merge tag 'char-misc-4.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · 0dc82fa5
      Linus Torvalds authored
      Pull char/misc driver fixes from Greg KH:
       "A smattering of different small fixes for some random driver
        subsystems. Nothing all that major, just resolutions for reported
        issues and bugs.
      
        All have been in linux-next with no reported issues"
      
      * tag 'char-misc-4.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (21 commits)
        extcon: int3496: Set the id pin to direction-input if necessary
        extcon: int3496: Use gpiod_get instead of gpiod_get_index
        extcon: int3496: Add dependency on X86 as it's Intel specific
        extcon: int3496: Add GPIO ACPI mapping table
        extcon: int3496: Rename GPIO pins in accordance with binding
        vmw_vmci: handle the return value from pci_alloc_irq_vectors correctly
        ppdev: fix registering same device name
        parport: fix attempt to write duplicate procfiles
        auxdisplay: img-ascii-lcd: add missing sentinel entry in img_ascii_lcd_matches
        Drivers: hv: vmbus: Don't leak memory when a channel is rescinded
        Drivers: hv: vmbus: Don't leak channel ids
        Drivers: hv: util: don't forget to init host_ts.lock
        Drivers: hv: util: move waiting for release to hv_utils_transport itself
        vmbus: remove hv_event_tasklet_disable/enable
        vmbus: use rcu for per-cpu channel list
        mei: don't wait for os version message reply
        mei: fix deadlock on mei reset
        intel_th: pci: Add Gemini Lake support
        intel_th: pci: Add Denverton SOC support
        intel_th: Don't leak module refcount on failure to activate
        ...
    • Merge tag 'driver-core-4.11-rc4' of... · 9e54ef9d
      Linus Torvalds authored
      Merge tag 'driver-core-4.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
      
      Pull driver core fix from Greg KH:
       "Here is a single kernfs fix for 4.11-rc4 that resolves a reported
        issue.
      
        It has been in linux-next with no reported issues"
      
      * tag 'driver-core-4.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
        kernfs: Check KERNFS_HAS_RELEASE before calling kernfs_release_file()
    • Merge tag 'tty-4.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · f1638fc6
      Linus Torvalds authored
      Pull tty/serial driver fixes from Greg KH:
       "Here are some tty and serial driver fixes for 4.11-rc4.
      
        One of these fixes a long-standing issue in the ldisc code that was
        found by Dmitry Vyukov with his great fuzzing work. The other fixes
        resolve other reported issues, and there is one revert of a patch in
        4.11-rc1 that wasn't correct.
      
        All of these have been in linux-next for a while with no reported
        issues"
      
      * tag 'tty-4.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        tty: fix data race in tty_ldisc_ref_wait()
        tty: don't panic on OOM in tty_set_ldisc()
        Revert "tty: serial: pl011: add ttyAMA for matching pl011 console"
        tty: acpi/spcr: QDF2400 E44 checks for wrong OEM revision
        serial: 8250_dw: Fix breakage when HAVE_CLK=n
        serial: 8250_dw: Honor clk_round_rate errors in dw8250_set_termios