1. 21 May, 2016 8 commits
    • 36092ee8
    • libnvdimm, dax: fix deletion · 03dca343
      Dan Williams authored
      The ndctl unit tests discovered that the dax enabling omitted updates to
      nd_detach_and_reset().  This routine clears the device configuration
      when the namespace is detached.  Without this clearing, userspace may
      assume that the device is in the process of being configured by another
      agent in the system.
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • libnvdimm, dax: fix alignment validation · 5e24c9fd
      Dan Williams authored
      Testing the dax-device autodetect support revealed a probe failure with
      the following result:
      
          dax0.1: bad offset: 0x8200000 dax disabled
      
      The original pfn-device implementation inferred the alignment from
      ilog2(offset).  Now that the alignment is explicit, the is_power_of_2()
      check needs replacing with a real sanity check against the recorded alignment.
      Otherwise the alignment check is useless in the implicit case and only
      the minimum size of the offset matters.
      
      This self-consistency check is further validated by the probe path that
      will re-check that the offset is large enough to contain all the
      metadata required to enable the device.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
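For the alignment validation fix above (5e24c9fd), here is a minimal sketch of the difference between the two kinds of check. It is not the kernel's code: the helper names are hypothetical, and the 2MB alignment in the usage note is an assumption for illustration, since the commit does not say which alignment was recorded for dax0.1.

```c
#include <stdbool.h>
#include <stdint.h>

/* Old-style check: accepts only offsets that are themselves a power of two. */
static bool offset_is_power_of_2(uint64_t offset)
{
	return offset != 0 && (offset & (offset - 1)) == 0;
}

/*
 * Check against an explicitly recorded alignment: the alignment must be a
 * power of two and the offset must be a non-zero multiple of it.
 */
static bool offset_matches_alignment(uint64_t offset, uint64_t align)
{
	if (align == 0 || (align & (align - 1)) != 0)
		return false;
	return offset != 0 && (offset & (align - 1)) == 0;
}
```

Under the assumed 2MB alignment, the offset 0x8200000 from the log fails the power-of-two test (it has two bits set) yet is a valid multiple of the alignment, which is the case a check against the recorded value is meant to accept.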
    • libnvdimm, dax: autodetect support · c5ed9268
      Dan Williams authored
      For autodetecting a previously established dax configuration we need the
      info block to indicate block-device vs device-dax mode, and we need to
      have the default namespace probe hand-off the configuration to the
      dax_pmem driver.
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • libnvdimm: release ida resources · b354aba0
      Dan Williams authored
      ida instances allocate some internal memory for ->free_bitmap in
      addition to the base 'struct ida'.  Use ida_destroy() to release that
      memory at module_exit().
      Reported-by: Johannes Thumshirn <jthumshirn@suse.de>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • Revert "block: enable dax for raw block devices" · acc93d30
      Dan Williams authored
      This reverts commit 5a023cdb.
      
      The functionality is superseded by the new "Device DAX" facility.
      
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Jan Kara <jack@suse.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • /dev/dax, core: file operations and dax-mmap · dee41079
      Dan Williams authored
      The "Device DAX" core enables dax mappings of performance / feature
      differentiated memory.  An open mapping or file handle keeps the backing
      struct device live, but new mappings are only possible while the device
      is enabled.   Faults are handled under rcu_read_lock to synchronize
      with the enabled state of the device.
      
      Similar to the filesystem-dax case the backing memory may optionally
      have struct page entries.  However, unlike fs-dax there is no support
      for private mappings, or mappings that are not backed by media (see
      use of zero-page in fs-dax).
      
      Mappings are always guaranteed to match the alignment of the dax_region.
      If the dax_region is configured to have a 2MB alignment, all mappings
      are guaranteed to be backed by a pmd entry.  Contrast this determinism
      with the fs-dax case where pmd mappings are opportunistic.  If userspace
      attempts to force a misaligned mapping, the driver will fail the mmap
      attempt.  See dax_dev_check_vma() for other scenarios that are rejected,
      like MAP_PRIVATE mappings.
      
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
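The mmap rules described in the dax-mmap commit above (dee41079) can be modelled roughly as follows. This is a hypothetical userspace model of the stated rules only, not a copy of dax_dev_check_vma(): the function name and parameters are illustrative, and the region alignment is assumed to be a power of two.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/mman.h>

/*
 * Hypothetical model: would a mapping of [start, end) with the given
 * mmap flags be acceptable for a region with alignment 'region_align'?
 */
static bool dax_mapping_allowed(uint64_t start, uint64_t end,
				int mmap_flags, uint64_t region_align)
{
	/* Private (copy-on-write) mappings are not supported. */
	if (!(mmap_flags & MAP_SHARED))
		return false;
	/* Both ends of the mapping must sit on the region alignment. */
	if ((start & (region_align - 1)) || (end & (region_align - 1)))
		return false;
	/* An empty or inverted range is never a valid mapping. */
	return end > start;
}
```

With a 2MB region alignment, a MAP_SHARED mapping of 0x200000..0x400000 passes, while the same range with MAP_PRIVATE, or a range shifted by 4KB, is rejected.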
    • /dev/dax, pmem: direct access to persistent memory · ab68f262
      Dan Williams authored
      Device DAX is the device-centric analogue of Filesystem DAX
      (CONFIG_FS_DAX).  It allows memory ranges to be allocated and mapped
      without need of an intervening file system.  Device DAX is strict,
      precise and predictable.  Specifically this interface:
      
      1/ Guarantees fault granularity with respect to a given page size (pte,
      pmd, or pud) set at configuration time.
      
      2/ Enforces deterministic behavior by being strict about what fault
      scenarios are supported.
      
      For example, by forcing MADV_DONTFORK semantics and omitting MAP_PRIVATE
      support device-dax guarantees that a mapping always behaves/performs the
      same once established.  It is the "what you see is what you get" access
      mechanism to differentiated memory vs filesystem DAX which has
      filesystem specific implementation semantics.
      
      Persistent memory is the first target, but the mechanism is also
      targeted for exclusive allocations of performance differentiated memory
      ranges.
      
      This commit is limited to the base device driver infrastructure to
      associate a dax device with a pmem range.
      
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  2. 18 May, 2016 5 commits
  3. 09 May, 2016 3 commits
  4. 06 May, 2016 4 commits
  5. 02 May, 2016 1 commit
  6. 30 Apr, 2016 1 commit
    • libnvdimm, pfn: fix memmap reservation sizing · 658922e5
      Dan Williams authored
      When configuring a pfn-device instance to allocate the memmap array it
      needs to account for the fact that vmemmap_populate_hugepages()
      allocates struct page blocks in HPAGE_SIZE chunks.  We need to align the
      reserved area size to 2MB otherwise arch_add_memory() runs out of memory
      while establishing the memmap:
      
       WARNING: CPU: 0 PID: 496 at arch/x86/mm/init_64.c:704 arch_add_memory+0xe7/0xf0
       [..]
       Call Trace:
        [<ffffffff8148bdb3>] dump_stack+0x85/0xc2
        [<ffffffff810a749b>] __warn+0xcb/0xf0
        [<ffffffff810a75cd>] warn_slowpath_null+0x1d/0x20
        [<ffffffff8106a497>] arch_add_memory+0xe7/0xf0
        [<ffffffff811d2097>] devm_memremap_pages+0x287/0x450
        [<ffffffff811d1ffa>] ? devm_memremap_pages+0x1ea/0x450
        [<ffffffffa0000298>] __wrap_devm_memremap_pages+0x58/0x70 [nfit_test_iomap]
        [<ffffffffa0047a58>] pmem_attach_disk+0x318/0x420 [nd_pmem]
        [<ffffffffa0047bcf>] nd_pmem_probe+0x6f/0x90 [nd_pmem]
        [<ffffffffa0009469>] nvdimm_bus_probe+0x69/0x110 [libnvdimm]
       [..]
        ndbus0: nd_pmem.probe(pfn3.0) = -12
       nd_pmem: probe of pfn3.0 failed with error -12
      libndctl: ndctl_pfn_enable: pfn3.0: failed to enable
      Reported-by: Namratha Kothapalli <namratha.n.kothapalli@intel.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
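The sizing rule in the memmap reservation fix above (658922e5) comes down to rounding the memmap size up to a 2MB boundary. The sketch below shows that arithmetic only. The 4KB page size and the 64-byte struct page are assumptions chosen for illustration, and the helper names are hypothetical.

```c
#include <stdint.h>

#define SZ_4K 0x1000ULL
#define SZ_2M 0x200000ULL
/* Assumption: 64 bytes per struct page. */
#define ASSUMED_STRUCT_PAGE_SIZE 64ULL

/* Round x up to the next multiple of a, where a is a power of two. */
static uint64_t align_up(uint64_t x, uint64_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Bytes to reserve for the memmap that describes 'range_size' bytes. */
static uint64_t memmap_reserve_size(uint64_t range_size)
{
	uint64_t npfns = range_size / SZ_4K;

	return align_up(npfns * ASSUMED_STRUCT_PAGE_SIZE, SZ_2M);
}
```

Under these assumptions a range of 1GB plus 64MB needs 17MB of struct page entries, which the rounding raises to 18MB so that population in 2MB chunks cannot run past the reservation.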
  7. 29 Apr, 2016 2 commits
  8. 28 Apr, 2016 3 commits
    • nfit, libnvdimm: limited/whitelisted dimm command marshaling mechanism · 31eca76b
      Dan Williams authored
      There are currently 4 known similar but incompatible definitions of the
      command sets that can be sent to an NVDIMM through ACPI.  It is also
      clear that future platform generations (ACPI or not) will continue to
      revise and extend the DIMM command set as new devices and use cases
      arrive.
      
      It is obviously untenable to continue to proliferate divergence
      of these command definitions, and to that end a standardization process
      has begun to provide for a unified specification.  However, that leaves a
      problem about what to do with this first generation where vendors are
      already shipping divergence.
      
      The Linux kernel can support these initial diverged platforms without
      giving platform-firmware free rein to continue to diverge and compound
      kernel maintenance overhead.  The kernel implementation can encourage
      standardization in two ways:
      
      1/ Require that any function code that userspace wants to send be
         explicitly white-listed in the implementation.  For ACPI this means
         function codes marked as supported by acpi_check_dsm() may
         only be invoked if they appear in the white-list.  A function must be
         publicly documented before it is added to the white-list.
      
      2/ The above restrictions can be trivially bypassed by using the
         "vendor-specific" payload command.  However, since vendor-specific
         commands are by definition not publicly documented and have the
         potential to corrupt the kernel's view of the dimm state, we provide a
         toggle to disable vendor-specific operations.  Enabling undefined
         behavior is a policy decision that can be made by the platform owner
         and encourages firmware implementations to choose public over
         private command implementations.
      
      Based on an initial patch from Jerry Hoemann
      Cc: Jerry Hoemann <jerry.hoemann@hpe.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
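The two rules in the command marshaling commit above (31eca76b) can be sketched as a pair of bitmask checks plus a toggle. This is an illustrative model, not the driver's implementation: the function name, the mask layout (bit n stands for function n) and the vendor-specific function number are all assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: function number used for the vendor-specific payload. */
#define HYPOTHETICAL_VENDOR_FN 9u

/*
 * fw_supported: functions the firmware advertises as implemented.
 * whitelist:    functions that are publicly documented and accepted.
 * allow_vendor: platform-owner toggle for vendor-specific payloads.
 */
static bool dimm_fn_allowed(uint64_t fw_supported, uint64_t whitelist,
			    bool allow_vendor, unsigned int fn)
{
	if (fn >= 64)
		return false;
	/* The firmware must support the function at all. */
	if (!(fw_supported & (1ULL << fn)))
		return false;
	/* Vendor-specific payloads are gated by the toggle alone. */
	if (fn == HYPOTHETICAL_VENDOR_FN)
		return allow_vendor;
	/* Everything else must also appear in the white-list. */
	return (whitelist & (1ULL << fn)) != 0;
}
```

A function that the firmware advertises but that is missing from the white-list is refused, and the vendor-specific function is refused until the toggle is switched on.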
    • nfit, libnvdimm: clarify "commands" vs "_DSMs" · e3654eca
      Dan Williams authored
      Clarify the distinction between "commands", the ioctls userspace calls
      to request the kernel take some action on a given dimm device, and
      "_DSMs", the actual function numbers used in the firmware interface to
      the DIMM.  _DSMs are ACPI specific whereas commands are Linux kernel
      generic.
      
      This is in preparation for breaking the 1:1 implicit relationship
      between the kernel ioctl number space and the firmware specific function
      numbers.
      
      Cc: Jerry Hoemann <jerry.hoemann@hpe.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • libnvdimm: increase max envelope size for ioctl · 40abf9be
      Jerry Hoemann authored
      nd_ioctl() must first read in the fixed sized portion of an ioctl so
      that it can then determine the size of the variable part.
      
      Prepare for ND_CMD_CALL calls, which have a larger fixed-portion
      envelope.
      Signed-off-by: Jerry Hoemann <jerry.hoemann@hpe.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
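The two-step read described in the envelope commit above (40abf9be) follows a common pattern: copy the fixed-size part first, then use it to size the variable part. The sketch below shows that pattern in plain C with memcpy standing in for the copy from userspace. The struct layout and function name are hypothetical and do not match any real libnvdimm command.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-size envelope; real envelopes differ per command. */
struct cmd_envelope {
	uint32_t cmd;
	uint32_t var_len;	/* size of the variable payload that follows */
};

/*
 * Copy one command from 'src' (src_len bytes) into 'dst' (dst_len bytes).
 * Returns the total number of bytes copied, or 0 if the sizes do not fit.
 */
static size_t read_cmd(void *dst, size_t dst_len,
		       const void *src, size_t src_len)
{
	struct cmd_envelope env;

	/* Phase 1: read the fixed part before trusting any length field. */
	if (src_len < sizeof(env) || dst_len < sizeof(env))
		return 0;
	memcpy(&env, src, sizeof(env));

	/* Phase 2: the variable size is now known, so bound it and copy. */
	if (env.var_len > src_len - sizeof(env) ||
	    env.var_len > dst_len - sizeof(env))
		return 0;
	memcpy(dst, src, sizeof(env) + env.var_len);
	return sizeof(env) + env.var_len;
}
```

A command whose fixed part is larger than the space set aside for phase 1 cannot be sized at all, which is why a maximum envelope size has to grow when a new command with a bigger header is introduced.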
  9. 26 Apr, 2016 2 commits
    • acpi/nfit: Add sysfs "id" for NVDIMM ID · 38a879ba
      Toshi Kani authored
      ACPI 6.1, section 5.2.25.9, defines an identifier for an NVDIMM.
      
      Change the NFIT driver to add a new sysfs file "id" under the nfit
      directory.
      Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Robert Moore <robert.moore@intel.com>
      Cc: Robert Elliott <elliott@hpe.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • acpi/nfit: Update nfit driver to comply with ACPI 6.1 · 5ad9a7fd
      Toshi Kani authored
      ACPI 6.1, Table 5-133, updates NVDIMM Control Region Structure
      as follows.
       - Valid Fields, Manufacturing Location, and Manufacturing Date
         are added from reserved range.  No change in the structure size.
       - IDs (SPD values) are stored as arrays of bytes (i.e. big-endian
         format).  The spec clarifies that they need to be represented
         as arrays of bytes as well.
      
      This patch makes the following changes to support this update.
       - Change the NFIT driver to show SPD ID values in big-endian
         format.
       - Change sprintf format to use "0x" instead of "#" since "%#02x"
         does not prepend '0'.
      
      link: http://www.uefi.org/sites/default/files/resources/ACPI_6_1.pdf
      Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Robert Moore <robert.moore@intel.com>
      Cc: Robert Elliott <elliott@hpe.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
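The format-string point in the ACPI 6.1 commit above (5ad9a7fd) can be reproduced in userspace with standard snprintf. In this small sketch the helper names are hypothetical; only the two format strings come from the commit message.

```c
#include <stddef.h>
#include <stdio.h>

/* "#" adds the "0x" prefix, but the prefix counts toward the field width. */
static const char *fmt_alternate(char *buf, size_t len, unsigned int v)
{
	snprintf(buf, len, "%#02x", v);
	return buf;
}

/* A literal "0x" keeps the width of 2 for the hex digits themselves. */
static const char *fmt_explicit(char *buf, size_t len, unsigned int v)
{
	snprintf(buf, len, "0x%02x", v);
	return buf;
}
```

For the value 5, the first form yields "0x5" because the prefix already satisfies the width of 2, while the second yields "0x05".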
  10. 24 Apr, 2016 2 commits
  11. 23 Apr, 2016 9 commits