Commit 5ccac54f authored by Dan Williams, committed by Linus Torvalds

ACPI: HMAT: attach a device for each soft-reserved range

The hmem enabling in commit cf8741ac ("ACPI: NUMA: HMAT: Register
"soft reserved" memory as an "hmem" device") only registered ranges to the
hmem driver for each soft-reservation that also appeared in the HMAT.
While this is meant to encourage platform firmware to "do the right thing"
and publish an HMAT, the corollary is that platforms that fail to publish
an accurate HMAT will strand memory from Linux usage.  Additionally, the
"efi_fake_mem" kernel command line option enabling will strand memory by
default without an HMAT.

Arrange for "soft reserved" memory that goes unclaimed by HMAT entries to
be published as raw resource ranges for the hmem driver to consume.

Include a module parameter that stops both this fallback and the HMAT
enabling from creating hmem devices.  The module parameter requires the
hmem device enabling to have a unique name in the module namespace:
"device_hmem".
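
As a rough illustration of how the parameter could be inspected from userspace, the sketch below reads a bool module parameter through sysfs. It assumes the standard /sys/module/<module>/parameters/<name> layout applied to the "device_hmem" module and "disable" parameter named in this commit; the helper names parse_bool_param() and hmem_fallback_disabled() are hypothetical and not part of the kernel.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Assumed sysfs location: module "device_hmem", parameter "disable",
 * following the usual /sys/module/<module>/parameters/<name> layout.
 */
#define HMEM_DISABLE_PARAM "/sys/module/device_hmem/parameters/disable"

/*
 * Parse the text of a bool module parameter as exposed in sysfs.
 * Returns 1 for "Y"/"y"/"1", 0 for "N"/"n"/"0", and -1 for anything else.
 */
static int parse_bool_param(const char *text)
{
	assert(text != NULL);

	switch (text[0]) {
	case 'Y':
	case 'y':
	case '1':
		return 1;
	case 'N':
	case 'n':
	case '0':
		return 0;
	default:
		return -1;
	}
}

/*
 * Hypothetical helper: returns 1 if hmem device creation is disabled,
 * 0 if it is enabled, and -1 if the file is missing or unparseable.
 */
static int hmem_fallback_disabled(const char *path)
{
	char buf[8] = "";
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f))
		buf[0] = '\0';
	fclose(f);

	return parse_bool_param(buf);
}
```

Calling hmem_fallback_disabled(HMEM_DISABLE_PARAM) on a running system would report the current setting. Because the parameter is declared with mode 0444 it is read-only at runtime, so it would have to be set at boot, presumably as device_hmem.disable=1 on the kernel command line, following the usual convention for built-in module parameters.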

The driver depends on the architecture providing phys_to_target_node(),
which currently only x86 (via numa_meminfo()) and arm64 (via a generic
memblock implementation) do.
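
To make the role of phys_to_target_node() concrete, here is a simplified userspace model of the idea: look up a physical address in a table of firmware-described ranges and return the owning node. It is only a sketch under stated assumptions; struct mem_range and model_phys_to_target_node() are hypothetical names, and the real per-architecture implementations work on kernel-internal data structures rather than a flat array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same sentinel value the kernel uses for "no node known". */
#define NUMA_NO_NODE (-1)

/* Hypothetical stand-in for one firmware-described memory range. */
struct mem_range {
	uint64_t start;	/* first byte of the range (inclusive) */
	uint64_t end;	/* one past the last byte (exclusive) */
	int nid;	/* node the range would belong to if onlined */
};

/*
 * Model of an address-to-node lookup: return the node id of the first
 * range containing addr, or NUMA_NO_NODE if no range covers it.
 */
static int model_phys_to_target_node(const struct mem_range *tbl, size_t n,
				     uint64_t addr)
{
	assert(tbl != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (addr >= tbl[i].start && addr < tbl[i].end)
			return tbl[i].nid;
	}
	return NUMA_NO_NODE;
}
```

In this model, a soft-reserved range that firmware never described falls through to NUMA_NO_NODE, which is why the lookup needs the firmware-provided range information to still be available when the fallback runs.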

[joao.m.martins@oracle.com: require NUMA_KEEP_MEMINFO for phys_to_target_node()]
  Link: https://lkml.kernel.org/r/aaae71a7-4846-f5cc-5acf-cf05fdb1f2dc@oracle.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Brice Goglin <Brice.Goglin@inria.fr>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: David Hildenbrand <david@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: Jia He <justin.he@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Hulk Robot <hulkci@huawei.com>
Cc: Jason Yan <yanaijie@huawei.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: kernel test robot <lkp@intel.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Link: https://lkml.kernel.org/r/159643098298.4062302.17587338161136144730.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
parent a035b6bf
@@ -35,6 +35,7 @@ config DEV_DAX_PMEM
 config DEV_DAX_HMEM
 	tristate "HMEM DAX: direct access to 'specific purpose' memory"
 	depends on EFI_SOFT_RESERVE
+	select NUMA_KEEP_MEMINFO if (NUMA && X86)
 	default DEV_DAX
 	help
 	  EFI 2.8 platforms, and others, may advertise 'specific purpose'
@@ -49,6 +50,7 @@ config DEV_DAX_HMEM
 	  Say M if unsure.

 config DEV_DAX_HMEM_DEVICES
+	depends on NUMA_KEEP_MEMINFO # for phys_to_target_node()
 	depends on DEV_DAX_HMEM && DAX=y
 	def_bool y
......
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_DEV_DAX_HMEM) += dax_hmem.o
-obj-$(CONFIG_DEV_DAX_HMEM_DEVICES) += device.o
+obj-$(CONFIG_DEV_DAX_HMEM_DEVICES) += device_hmem.o
+device_hmem-y := device.o
 dax_hmem-y := hmem.o
@@ -5,6 +5,9 @@
 #include <linux/dax.h>
 #include <linux/mm.h>

+static bool nohmem;
+module_param_named(disable, nohmem, bool, 0444);
+
 void hmem_register_device(int target_nid, struct resource *r)
 {
 	/* define a clean / non-busy resource for the platform device */
@@ -17,6 +20,9 @@ void hmem_register_device(int target_nid, struct resource *r)
 	struct memregion_info info;
 	int rc, id;

+	if (nohmem)
+		return;
+
 	rc = region_intersects(res.start, resource_size(&res), IORESOURCE_MEM,
 			IORES_DESC_SOFT_RESERVED);
 	if (rc != REGION_INTERSECTS)
@@ -63,3 +69,32 @@ void hmem_register_device(int target_nid, struct resource *r)
 out_pdev:
 	memregion_free(id);
 }
+
+static __init int hmem_register_one(struct resource *res, void *data)
+{
+	/*
+	 * If the resource is not a top-level resource it was already
+	 * assigned to a device by the HMAT parsing.
+	 */
+	if (res->parent != &iomem_resource) {
+		pr_info("HMEM: skip %pr, already claimed\n", res);
+		return 0;
+	}
+
+	hmem_register_device(phys_to_target_node(res->start), res);
+
+	return 0;
+}
+
+static __init int hmem_init(void)
+{
+	walk_iomem_res_desc(IORES_DESC_SOFT_RESERVED,
+			IORESOURCE_MEM, 0, -1, NULL, hmem_register_one);
+	return 0;
+}
+
+/*
+ * As this is a fallback for address ranges unclaimed by the ACPI HMAT
+ * parsing it must be at an initcall level greater than hmat_init().
+ */
+late_initcall(hmem_init);
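
The fallback logic in hmem_register_one() and hmem_init() can be modelled outside the kernel to show the "skip what HMAT already claimed" rule in isolation. The following is a minimal userspace sketch, not kernel code: struct res, model_root, walk_soft_reserved() and register_one() are hypothetical stand-ins for struct resource, iomem_resource, walk_iomem_res_desc() and hmem_register_one(), and registration is reduced to incrementing a counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for struct resource. */
struct res {
	uint64_t start;
	uint64_t end;			/* inclusive, as in the kernel */
	int soft_reserved;		/* models IORES_DESC_SOFT_RESERVED */
	const struct res *parent;	/* enclosing resource */
};

/* Stand-in for the kernel's top-level iomem_resource. */
static const struct res model_root = { 0, UINT64_MAX, 0, NULL };

typedef int (*res_cb)(const struct res *r, void *data);

/*
 * Model of the walk: invoke cb for every soft-reserved entry and stop
 * early if the callback returns non-zero.
 */
static int walk_soft_reserved(const struct res *list, size_t n, void *data,
			      res_cb cb)
{
	assert(cb != NULL);

	for (size_t i = 0; i < n; i++) {
		int rc;

		if (!list[i].soft_reserved)
			continue;
		rc = cb(&list[i], data);
		if (rc)
			return rc;
	}
	return 0;
}

/*
 * Model of the per-range callback: a range whose parent is not the
 * root is treated as already claimed and skipped; a top-level range
 * is "registered" by bumping the counter passed in via data.
 */
static int register_one(const struct res *r, void *data)
{
	size_t *registered = data;

	if (r->parent != &model_root)
		return 0;	/* already claimed, nothing to do */

	(*registered)++;
	return 0;
}
```

Passing register_one to walk_soft_reserved() over a list that mixes claimed and unclaimed ranges registers only the top-level soft-reserved ones. In the kernel, the same effect depends on ordering: the walk has to run after the HMAT parsing has claimed its ranges, which is why the commit uses late_initcall().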