1. 03 Apr, 2020 10 commits
  2. 31 Mar, 2020 3 commits
  3. 30 Mar, 2020 2 commits
  4. 26 Mar, 2020 1 commit
  5. 17 Mar, 2020 4 commits
    • libnvdimm/region: Introduce an 'align' attribute · 2522afb8
      Dan Williams authored
      The region 'align' attribute applies an alignment constraint for
      namespace creation in a region. Whereas the 'align' attribute of a
      namespace applied alignment padding via an info block, the region
      'align' attribute applies alignment constraints to the free space
      allocation.
      
      The default for 'align' is the maximum known memremap_compat_align()
      across all archs (16MiB from PowerPC at time of writing) multiplied by
      the number of interleave ways if there is blk-aliasing. The minimum is
      PAGE_SIZE and allows for the creation of cross-arch incompatible
      namespaces, just as previous kernels allowed, but the expectation is
      cross-arch and mode-independent compatibility by default.
      
      The regression risk with this change is limited to cases that were
      dependent on the ability to create unaligned namespaces, *and* for some
      reason are unable to opt-out of aligned namespaces by writing to
      'regionX/align'. If such a scenario arises the default can be flipped
      from opt-out to opt-in of compat-aligned namespace creation, but that is
      a last resort. The kernel will otherwise continue to support existing
      defined misaligned namespaces.
      
      Unfortunately this change needs to touch several parts of the
      implementation at once:
      
      - region/available_size: expand busy extents to current align
      - region/max_available_extent: expand busy extents to current align
      - namespace/size: trim free space to current align
      
      ...to keep the free space accounting conforming to the dynamic align
      setting.
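
      The accounting rules above can be illustrated with a minimal sketch. It
      is a Python model, not the kernel implementation: the helper names are
      hypothetical, the 16MiB value is the PowerPC figure quoted above, and
      extents are modeled as half-open [start, end) byte ranges.

```python
MIB = 1 << 20
MAX_COMPAT_ALIGN = 16 * MIB  # 16MiB from PowerPC, as quoted in the text


def default_region_align(interleave_ways: int, blk_aliasing: bool) -> int:
    """Default 'align': 16MiB, times the interleave ways if there is blk-aliasing."""
    return MAX_COMPAT_ALIGN * (interleave_ways if blk_aliasing else 1)


def align_down(value: int, align: int) -> int:
    """Round value down to a multiple of align."""
    return value - (value % align)


def align_up(value: int, align: int) -> int:
    """Round value up to a multiple of align."""
    return align_down(value + align - 1, align)


def expand_busy(start: int, end: int, align: int) -> tuple:
    """Grow a busy extent outward so both edges land on align boundaries."""
    return align_down(start, align), align_up(end, align)


def trim_free(start: int, end: int, align: int) -> int:
    """Usable size of a free extent after trimming both edges inward."""
    lo, hi = align_up(start, align), align_down(end, align)
    return max(0, hi - lo)
```

      In this model a busy extent covering 3MiB..20MiB is accounted as
      0..32MiB under a 16MiB align, and a free extent of the same shape
      yields no allocatable space.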

      Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Reported-by: Jeff Moyer <jmoyer@redhat.com>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
      Link: https://lore.kernel.org/r/158041478371.3889308.14542630147672668068.stgit@dwillia2-desk3.amr.corp.intel.com
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • libnvdimm/region: Introduce NDD_LABELING · a0e37452
      Dan Williams authored
      The NDD_ALIASING flag is used to indicate where pmem capacity might
      alias with blk capacity and require labeling. It is also used to
      indicate whether the DIMM supports labeling. Separate this latter
      capability into its own flag so that the NDD_ALIASING flag is scoped to
      true aliased configurations.
      
      To my knowledge aliased configurations only exist in the ACPI spec;
      there are no known platforms that ship this support in production.
      
      This clarity allows namespace-capacity alignment constraints around
      interleave-ways to be relaxed.
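
      As a rough illustration of the split, the two capabilities can be
      modeled as independent flag bits. This is a Python sketch only: the bit
      positions and the two helper functions are hypothetical and do not
      reflect the kernel's definitions.

```python
from enum import IntFlag


class DimmFlags(IntFlag):
    # Bit positions are made up for this sketch.
    NDD_ALIASING = 1 << 0  # pmem capacity may alias with blk capacity
    NDD_LABELING = 1 << 1  # the DIMM supports labeling


def supports_labels(flags: DimmFlags) -> bool:
    """Label support no longer implies an aliased configuration."""
    return bool(flags & DimmFlags.NDD_LABELING)


def needs_interleave_alignment(flags: DimmFlags) -> bool:
    """Only true aliased configurations keep the interleave-ways constraint."""
    return bool(flags & DimmFlags.NDD_ALIASING)
```

      With separate bits, a label-capable DIMM that is not aliased no longer
      trips the stricter alignment path.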
      
      Cc: Vishal Verma <vishal.l.verma@intel.com>
      Cc: Oliver O'Halloran <oohall@gmail.com>
      Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Link: https://lore.kernel.org/r/158041477856.3889308.4212605617834097674.stgit@dwillia2-desk3.amr.corp.intel.com
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • libnvdimm/namespace: Enforce memremap_compat_align() · 6acd7d5e
      Dan Williams authored
      The pmem driver on PowerPC crashes with the following signature when
      instantiating misaligned namespaces that map their capacity via
      memremap_pages().
      
          BUG: Unable to handle kernel data access at 0xc001000406000000
          Faulting instruction address: 0xc000000000090790
          NIP [c000000000090790] arch_add_memory+0xc0/0x130
          LR [c000000000090744] arch_add_memory+0x74/0x130
          Call Trace:
           arch_add_memory+0x74/0x130 (unreliable)
           memremap_pages+0x74c/0xa30
           devm_memremap_pages+0x3c/0xa0
           pmem_attach_disk+0x188/0x770
           nvdimm_bus_probe+0xd8/0x470
      
      With the assumption that only memremap_pages() has alignment
      constraints, enforce memremap_compat_align() for
      pmem_should_map_pages(), nd_pfn, and nd_dax cases. This includes
      preventing the creation of namespaces where the base address is
      misaligned and cases where infoblock padding parameters are invalid.
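
      The base-address part of that check amounts to a simple modulus test.
      The sketch below is a simplified Python model with hypothetical names:
      it assumes the namespace size must be aligned as well, and it does not
      model the infoblock padding validation.

```python
MIB = 1 << 20


def is_aligned(value: int, align: int) -> bool:
    """True when value is a multiple of align."""
    return value % align == 0


def mapping_allowed(base: int, size: int, compat_align: int) -> bool:
    """Hypothetical gate: map pages only for an aligned base and size."""
    return is_aligned(base, compat_align) and is_aligned(size, compat_align)
```

      A namespace whose base sits 2MiB past a 16MiB boundary is rejected
      under a 16MiB constraint but accepted under a 2MiB one.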

      Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Fixes: a3619190 ("libnvdimm/pfn: stop padding pmem namespaces to section alignment")
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • libnvdimm/pfn: Prevent raw mode fallback if pfn-infoblock valid · b2ba7e91
      Dan Williams authored
      The EOPNOTSUPP return code from the pmem driver indicates that the
      namespace has a configuration that may be valid, but the current kernel
      does not support it. Expand this to all of the nd_pfn_validate() error
      conditions after the infoblock has been verified as self-consistent.
      
      This prevents exposing the namespace to I/O when the infoblock needs to
      be corrected, or the system needs to be put into a different
      configuration (like changing the page size on PowerPC).
      
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  6. 29 Feb, 2020 2 commits
  7. 21 Feb, 2020 1 commit
    • mm/memremap_pages: Introduce memremap_compat_align() · 9ffc1d19
      Dan Williams authored
      The "sub-section memory hotplug" facility allows memremap_pages() users
      like libnvdimm to compensate for hardware platforms like x86 that have a
      section size larger than their hardware memory mapping granularity. The
      compensation that sub-section support affords is being tolerant of
      physical memory resources shifting by units smaller (64MiB on x86) than
      the memory-hotplug section size (128MiB), where the platform
      physical-memory mapping granularity is limited by the number and
      capability of address-decode-registers in the memory controller.
      
      While the sub-section support allows memremap_pages() to operate on
      sub-section (2MiB) granularity, the Power architecture may still
      require 16MiB alignment on "!radix_enabled()" platforms.
      
      In order for libnvdimm to be able to detect and manage this per-arch
      limitation, introduce memremap_compat_align() as a common minimum
      alignment across all driver-facing memory-mapping interfaces, and let
      Power override it to 16MiB in the "!radix_enabled()" case.
      
      The assumption / requirement for 16MiB to be a viable
      memremap_compat_align() value is that Power does not have platforms
      where its equivalent of address-decode-registers ever hardware remaps a
      persistent memory resource on smaller than 16MiB boundaries. Note that I
      tried my best to not add a new Kconfig symbol, but header include
      entanglements defeated the #ifndef memremap_compat_align design pattern
      and the need to export it defeats the __weak design pattern for arch
      overrides.
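
      The per-arch override described above can be modeled in a few lines.
      This is a Python sketch, not the kernel code: the function signature is
      hypothetical, and treating the 2MiB sub-section size as the generic
      default is an assumption drawn from the text.

```python
MIB = 1 << 20
SUBSECTION_SIZE = 2 * MIB    # sub-section granularity quoted in the text
POWER_HASH_ALIGN = 16 * MIB  # Power "!radix_enabled()" requirement


def memremap_compat_align(arch: str, radix_enabled: bool = True) -> int:
    """Minimum alignment a driver should honor for memremap_pages()."""
    if arch == "powerpc" and not radix_enabled:
        return POWER_HASH_ALIGN
    return SUBSECTION_SIZE


# The largest value over the modeled configurations is the cross-arch
# compatible alignment.
max_compat_align = max(
    memremap_compat_align("x86"),
    memremap_compat_align("powerpc", radix_enabled=True),
    memremap_compat_align("powerpc", radix_enabled=False),
)
```

      Taking the maximum across the modeled configurations gives the 16MiB
      figure that a cross-arch compatible namespace would need to satisfy.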
      
      Based on an initial patch by Aneesh.
      
      Link: http://lore.kernel.org/r/CAPcyv4gBGNP95APYaBcsocEa50tQj9b5h__83vgngjq3ouGX_Q@mail.gmail.com
      Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Reported-by: Jeff Moyer <jmoyer@redhat.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  8. 19 Feb, 2020 1 commit
  9. 18 Feb, 2020 4 commits
    • tools/testing/nvdimm: Fix compilation failure without CONFIG_DEV_DAX_PMEM_COMPAT · c0e71d60
      Jan Kara authored
      When a kernel is configured without CONFIG_DEV_DAX_PMEM_COMPAT, the
      compilation of tools/testing/nvdimm fails with:
      
        Building modules, stage 2.
        MODPOST 11 modules
      ERROR: "dax_pmem_compat_test" [tools/testing/nvdimm/test/nfit_test.ko] undefined!
      
      Fix the problem by calling dax_pmem_compat_test() only if the kernel has
      the required functionality.

      Signed-off-by: Jan Kara <jack@suse.cz>
      Link: https://lore.kernel.org/r/20200123154720.12097-1-jack@suse.cz
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • MAINTAINERS: clarify maintenance of nvdimm testing tool · b9bd8039
      Lukas Bulwahn authored
      The git history shows that the files under ./tools/testing/nvdimm are
      being developed and maintained by the LIBNVDIMM maintainers.
      
      This was identified with a small script that finds all files only
      belonging to "THE REST" according to the current MAINTAINERS file, and I
      acted upon its output.

      Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
      Link: https://lore.kernel.org/r/20200201170933.924-1-lukas.bulwahn@gmail.com
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • libnvdimm/e820: Retrieve and populate correct 'target_node' info · 7b27a862
      Dan Williams authored
      Use the new phys_to_target_node() and numa_map_to_online_node() helpers
      to retrieve the correct id for the 'numa_node' ("local" / online
      initiator node) and 'target_node' (offline target memory node) sysfs
      attributes.
      
      Below is an example from a 4 NUMA node system where all the memory on
      node2 is pmem / reserved. It should be noted that with the arrival of
      the ACPI HMAT table and EFI Specific Purpose Memory the kernel will
      start to see more platforms with reserved / performance differentiated
      memory in its own NUMA node. Hence the broad Cc list for what is
      ostensibly a libnvdimm-local patch.
      
      === Before ===
      
      /* Notice no online memory on node2 at start */
      
      # numactl --hardware
      available: 3 nodes (0-1,3)
      node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
      node 0 size: 3958 MB
      node 0 free: 3708 MB
      node 1 cpus: 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
      node 1 size: 4027 MB
      node 1 free: 3871 MB
      node 3 cpus:
      node 3 size: 3994 MB
      node 3 free: 3971 MB
      node distances:
      node   0   1   3
        0:  10  21  21
        1:  21  10  21
        3:  21  21  10
      
      /*
       * Put the pmem namespace into devdax mode so it can be assigned to the
       * kmem driver
       */
      
      # ndctl create-namespace -e namespace0.0 -m devdax -f
      {
        "dev":"namespace0.0",
        "mode":"devdax",
        "map":"dev",
        "size":"3.94 GiB (4.23 GB)",
        "uuid":"1650af9b-9ba3-4704-acd6-10178399d9a3",
        [..]
      }
      
      /* Online Persistent Memory as System RAM */
      
      # daxctl reconfigure-device --mode=system-ram dax0.0
      libdaxctl: memblock_in_dev: dax0.0: memory0: Unable to determine phys_index: Success
      libdaxctl: memblock_in_dev: dax0.0: memory0: Unable to determine phys_index: Success
      libdaxctl: memblock_in_dev: dax0.0: memory0: Unable to determine phys_index: Success
      libdaxctl: memblock_in_dev: dax0.0: memory0: Unable to determine phys_index: Success
      [
        {
          "chardev":"dax0.0",
          "size":4225761280,
          "target_node":0,
          "mode":"system-ram"
        }
      ]
      reconfigured 1 device
      
      /* Note that the memory is onlined by default to the wrong node, node0 */
      
      # numactl --hardware
      available: 3 nodes (0-1,3)
      node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
      node 0 size: 7926 MB
      node 0 free: 7655 MB
      node 1 cpus: 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
      node 1 size: 4027 MB
      node 1 free: 3871 MB
      node 3 cpus:
      node 3 size: 3994 MB
      node 3 free: 3971 MB
      node distances:
      node   0   1   3
        0:  10  21  21
        1:  21  10  21
        3:  21  21  10
      
      === After ===
      
      /* Notice that the "phys_index" error messages are gone */
      
      # daxctl reconfigure-device --mode=system-ram dax0.0
      [
        {
          "chardev":"dax0.0",
          "size":4225761280,
          "target_node":2,
          "mode":"system-ram"
        }
      ]
      reconfigured 1 device
      
      /* Notice that node2 is now correctly populated */
      
      # numactl --hardware
      available: 4 nodes (0-3)
      node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
      node 0 size: 3958 MB
      node 0 free: 3793 MB
      node 1 cpus: 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
      node 1 size: 4027 MB
      node 1 free: 3851 MB
      node 2 cpus:
      node 2 size: 3968 MB
      node 2 free: 3968 MB
      node 3 cpus:
      node 3 size: 3994 MB
      node 3 free: 3908 MB
      node distances:
      node   0   1   2   3
        0:  10  21  21  21
        1:  21  10  21  21
        2:  21  21  10  21
        3:  21  21  21  10
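
      The 'numa_node' side of this change, mapping an offline target node to
      a nearby online node, can be sketched as a nearest-neighbor search over
      the distance table. This is a Python model with a hypothetical function
      name and hypothetical distances, not the output shown above; ties go to
      the lowest node id in this sketch, and an empty online set is not
      handled.

```python
def map_to_online_node(node: int, online: set, distance: dict) -> int:
    """Return node if it is online, else the closest online node."""
    if node in online:
        return node
    # sorted() makes the tie-break deterministic: lowest node id wins.
    return min(sorted(online), key=lambda n: distance[node][n])


# Hypothetical distances from offline node 2 to the online nodes.
example_distance = {2: {0: 21, 1: 17, 3: 28}}
example_online = {0, 1, 3}
```

      Here node 2 would report node 1 as its 'numa_node' while keeping 2 as
      its 'target_node'.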
      
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Ira Weiny <ira.weiny@intel.com>
      Cc: Vishal Verma <vishal.l.verma@intel.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Link: https://lore.kernel.org/r/158188327614.894464.13122730362187722603.stgit@dwillia2-desk3.amr.corp.intel.com
    • x86/NUMA: Provide a range-to-target_node lookup facility · 5d30f92e
      Dan Williams authored
      The DEV_DAX_KMEM facility is a generic mechanism to allow device-dax
      instances, fronting performance-differentiated-memory like pmem, to be
      added to the System RAM pool. The NUMA node for that hot-added memory is
      derived from the device-dax instance's 'target_node' attribute.
      
      Recall that the 'target_node' is the ACPI-PXM-to-node translation for
      memory when it comes online whereas the 'numa_node' attribute of the
      device represents the closest online cpu node.
      
      Presently, useful target_node information from the ACPI SRAT is discarded
      with the expectation that "Reserved" memory will never be onlined. Now
      that DEV_DAX_KMEM violates that assumption, there is a need to retain the
      translation. Move, rather than discard, numa_memblk data to a secondary
      array that memory_add_physaddr_to_target_node() may consider at a later
      point in time.
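
      The "move, rather than discard" idea can be sketched as a two-level
      range lookup. This is a Python model with hypothetical names and data
      layout, not the x86 implementation: ranges are half-open and the
      secondary list stands in for the retained "Reserved" numa_memblk
      entries.

```python
from typing import List, NamedTuple, Optional


class MemBlk(NamedTuple):
    start: int  # inclusive physical address
    end: int    # exclusive physical address
    nid: int    # NUMA node id from the firmware tables


def lookup(blocks: List[MemBlk], addr: int) -> Optional[int]:
    """Return the node id of the block containing addr, if any."""
    for blk in blocks:
        if blk.start <= addr < blk.end:
            return blk.nid
    return None


def phys_to_target_node(primary: List[MemBlk],
                        reserved: List[MemBlk],
                        addr: int) -> Optional[int]:
    """Consult the primary ranges first, then the retained reserved ranges."""
    nid = lookup(primary, addr)
    if nid is None:
        nid = lookup(reserved, addr)
    return nid


GIB = 1 << 30
primary_blks = [MemBlk(0, 4 * GIB, 0), MemBlk(4 * GIB, 8 * GIB, 1)]
reserved_blks = [MemBlk(8 * GIB, 12 * GIB, 2)]
```

      Without the secondary list, an address inside the reserved range has
      no node to report, which corresponds to the discarded-translation case
      described above.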
      
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: <x86@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Reported-by: kbuild test robot <lkp@intel.com>
      Reviewed-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lore.kernel.org/r/158188326978.894464.217282995221175417.stgit@dwillia2-desk3.amr.corp.intel.com
  10. 17 Feb, 2020 4 commits
  11. 16 Feb, 2020 8 commits
    • Linux 5.6-rc2 · 11a48a5a
      Linus Torvalds authored
    • Merge tag 'for-linus-5.6-1' of https://github.com/cminyard/linux-ipmi · ab02b61f
      Linus Torvalds authored
      Pull IPMI update from Corey Minyard:
       "Minor bug fixes for IPMI
      
        I know this is late; I've been travelling and, well, I've been
        distracted.
      
        This is just a few bug fixes and adding i2c support to the IPMB
        driver, which is something I wanted from the beginning for it"
      
      * tag 'for-linus-5.6-1' of https://github.com/cminyard/linux-ipmi:
        drivers: ipmi: fix off-by-one bounds check that leads to a out-of-bounds write
        ipmi:ssif: Handle a possible NULL pointer reference
        drivers: ipmi: Modify max length of IPMB packet
        drivers: ipmi: Support raw i2c packet in IPMB
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 44024adb
      Linus Torvalds authored
      Pull KVM fixes from Paolo Bonzini:
       "Bugfixes and improvements to selftests.
      
        On top of this, Mauro converted the KVM documentation to rst format,
        which was very welcome"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (44 commits)
        docs: virt: guest-halt-polling.txt convert to ReST
        docs: kvm: review-checklist.txt: rename to ReST
        docs: kvm: Convert timekeeping.txt to ReST format
        docs: kvm: Convert s390-diag.txt to ReST format
        docs: kvm: Convert ppc-pv.txt to ReST format
        docs: kvm: Convert nested-vmx.txt to ReST format
        docs: kvm: Convert mmu.txt to ReST format
        docs: kvm: Convert locking.txt to ReST format
        docs: kvm: Convert hypercalls.txt to ReST format
        docs: kvm: arm/psci.txt: convert to ReST
        docs: kvm: convert arm/hyp-abi.txt to ReST
        docs: kvm: Convert api.txt to ReST format
        docs: kvm: convert devices/xive.txt to ReST
        docs: kvm: convert devices/xics.txt to ReST
        docs: kvm: convert devices/vm.txt to ReST
        docs: kvm: convert devices/vfio.txt to ReST
        docs: kvm: convert devices/vcpu.txt to ReST
        docs: kvm: convert devices/s390_flic.txt to ReST
        docs: kvm: convert devices/mpic.txt to ReST
        docs: kvm: convert devices/arm-vgit.txt to ReST
        ...
    • Merge tag 'edac_urgent_for_5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras · b982df72
      Linus Torvalds authored
      Pull EDAC fixes from Borislav Petkov:
       "Two fixes for use-after-free and memory leaking in the EDAC core, by
        Robert Richter.
      
        Debug options like DEBUG_TEST_DRIVER_REMOVE, KASAN and DEBUG_KMEMLEAK
        unearthed issues with the lifespan of memory allocated by the EDAC
        memory controller descriptor due to misdesigned memory freeing, done
        partially by the EDAC core *and* the driver core, which is problematic
        to say the least.
      
        These two are minimal fixes to take care of stable - a proper rework
        is following which cleans up that mess properly"
      
      * tag 'edac_urgent_for_5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
        EDAC/sysfs: Remove csrow objects on errors
        EDAC/mc: Fix use-after-free and memleaks during device removal
    • Merge tag 'block-5.6-2020-02-16' of git://git.kernel.dk/linux-block · e29c6a13
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "Not a lot here, which is great, basically just three small bcache
        fixes from Coly, and four NVMe fixes via Keith"
      
      * tag 'block-5.6-2020-02-16' of git://git.kernel.dk/linux-block:
        nvme: fix the parameter order for nvme_get_log in nvme_get_fw_slot_info
        nvme/pci: move cqe check after device shutdown
        nvme: prevent warning triggered by nvme_stop_keep_alive
        nvme/tcp: fix bug on double requeue when send fails
        bcache: remove macro nr_to_fifo_front()
        bcache: Revert "bcache: shrink btree node cache after bch_btree_check()"
        bcache: ignore pending signals when creating gc and allocator thread
    • Merge tag 'for-5.6-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 713db356
      Linus Torvalds authored
      Pull btrfs fixes from David Sterba:
       "Two races fixed, memory leak fix, sysfs directory fixup and two new
        log messages:
      
         - two fixed race conditions: extent map merging and truncate vs
           fiemap
      
         - create the right sysfs directory with device information and move
           the individual device dirs under it
      
         - print messages when the tree-log is replayed at mount time or
           cannot be replayed on remount"
      
      * tag 'for-5.6-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: sysfs, move device id directories to UUID/devinfo
        btrfs: sysfs, add UUID/devinfo kobject
        Btrfs: fix race between shrinking truncate and fiemap
        btrfs: log message when rw remount is attempted with unclean tree-log
        btrfs: print message when tree-log replay starts
        Btrfs: fix race between using extent maps and merging them
        btrfs: ref-verify: fix memory leaks
    • Merge tag '5.6-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6 · 288b27a0
      Linus Torvalds authored
      Pull cifs fixes from Steve French:
       "Four small CIFS/SMB3 fixes. One (the EA overflow fix) for stable"
      
      * tag '5.6-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: make sure we do not overflow the max EA buffer size
        cifs: enable change notification for SMB2.1 dialect
        cifs: Fix mode output in debugging statements
        cifs: fix mount option display for sec=krb5i
    • Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · 8a8b8096
      Linus Torvalds authored
      Pull ext4 fixes from Ted Ts'o:
       "Miscellaneous ext4 bug fixes (all stable fodder)"
      
      * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        ext4: improve explanation of a mount failure caused by a misconfigured kernel
        jbd2: do not clear the BH_Mapped flag when forgetting a metadata buffer
        jbd2: move the clearing of b_modified flag to the journal_unmap_buffer()
        ext4: add cond_resched() to ext4_protect_reserved_inode
        ext4: fix checksum errors with indexed dirs
        ext4: fix support for inode sizes > 1024 bytes
        ext4: simplify checking quota limits in ext4_statfs()
        ext4: don't assume that mmp_nodename/bdevname have NUL