Commit 972d5e7e authored by Linus Torvalds's avatar Linus Torvalds

Merge branch 'x86-efi-kexec-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 EFI changes from Ingo Molnar:
 "This consists of two main parts:

   - New static EFI runtime services virtual mapping layout which is
     groundwork for kexec support on EFI (Borislav Petkov)

   - EFI kexec support itself (Dave Young)"

* 'x86-efi-kexec-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
  x86/efi: parse_efi_setup() build fix
  x86: ksysfs.c build fix
  x86/efi: Delete superfluous global variables
  x86: Reserve setup_data ranges late after parsing memmap cmdline
  x86: Export x86 boot_params to sysfs
  x86: Add xloadflags bit for EFI runtime support on kexec
  x86/efi: Pass necessary EFI data for kexec via setup_data
  efi: Export EFI runtime memory mapping to sysfs
  efi: Export more EFI table variables to sysfs
  x86/efi: Cleanup efi_enter_virtual_mode() function
  x86/efi: Fix off-by-one bug in EFI Boot Services reservation
  x86/efi: Add a wrapper function efi_map_region_fixed()
  x86/efi: Remove unused variables in __map_region()
  x86/efi: Check krealloc return value
  x86/efi: Runtime services virtual mapping
  x86/mm/cpa: Map in an arbitrary pgd
  x86/mm/pageattr: Add last levels of error path
  x86/mm/pageattr: Add a PUD error unwinding path
  x86/mm/pageattr: Add a PTE pagetable populating function
  x86/mm/pageattr: Add a PMD pagetable populating function
  ...
parents 5d4863e4 ef0b8b9a
What: /sys/firmware/efi/fw_vendor
Date: December 2013
Contact: Dave Young <dyoung@redhat.com>
Description: It shows the physical address of firmware vendor field in the
EFI system table.
Users: Kexec
What: /sys/firmware/efi/runtime
Date: December 2013
Contact: Dave Young <dyoung@redhat.com>
Description: It shows the physical address of runtime service table entry in
the EFI system table.
Users: Kexec
What: /sys/firmware/efi/config_table
Date: December 2013
Contact: Dave Young <dyoung@redhat.com>
Description: It shows the physical address of config table entry in the EFI
system table.
Users: Kexec
What: /sys/firmware/efi/runtime-map/
Date: December 2013
Contact: Dave Young <dyoung@redhat.com>
Description: Switching efi runtime services to virtual mode requires
that all efi memory ranges which have the runtime attribute
bit set to be mapped to virtual addresses.
The efi runtime services can only be switched to virtual
mode once without rebooting. The kexec kernel must maintain
the same physical to virtual address mappings as the first
kernel. The mappings are exported to sysfs so userspace tools
can reassemble them and pass them into the kexec kernel.
/sys/firmware/efi/runtime-map/ is the directory the kernel
exports that information in.
subdirectories are named with the number of the memory range:
/sys/firmware/efi/runtime-map/0
/sys/firmware/efi/runtime-map/1
/sys/firmware/efi/runtime-map/2
/sys/firmware/efi/runtime-map/3
...
Each subdirectory contains five files:
attribute : The attributes of the memory range.
num_pages : The size of the memory range in pages.
phys_addr : The physical address of the memory range.
type : The type of the memory range.
virt_addr : The virtual address of the memory range.
Above values are all hexadecimal numbers with the '0x' prefix.
Users: Kexec
What: /sys/kernel/boot_params
Date: December 2013
Contact: Dave Young <dyoung@redhat.com>
Description: The /sys/kernel/boot_params directory contains two
files: "data" and "version" and one subdirectory "setup_data".
It is used to export the kernel boot parameters of an x86
platform to userspace for kexec and debugging purpose.
If there's no setup_data in boot_params the subdirectory will
not be created.
"data" file is the binary representation of struct boot_params.
"version" file is the string representation of boot
protocol version.
"setup_data" subdirectory contains the setup_data data
structure in boot_params. setup_data is maintained in kernel
as a link list. In "setup_data" subdirectory there's one
subdirectory for each link list node named with the number
of the list nodes. The list node subdirectory contains two
files "type" and "data". "type" file is the string
representation of setup_data type. "data" file is the binary
representation of setup_data payload.
The whole boot_params directory structure is like below:
/sys/kernel/boot_params
|__ data
|__ setup_data
| |__ 0
| | |__ data
| | |__ type
| |__ 1
| |__ data
| |__ type
|__ version
Users: Kexec
......@@ -899,6 +899,12 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
edd= [EDD]
Format: {"off" | "on" | "skip[mbr]"}
efi= [EFI]
Format: { "old_map" }
old_map [X86-64]: switch to the old ioremap-based EFI
runtime services mapping. 32-bit still uses this one by
default.
efi_no_storage_paranoia [EFI; X86]
Using this parameter you can use more than 50% of
your efi variable storage. Use this parameter only if
......
......@@ -608,6 +608,9 @@ Protocol: 2.12+
- If 1, the kernel supports the 64-bit EFI handoff entry point
given at handover_offset + 0x200.
Bit 4 (read): XLF_EFI_KEXEC
- If 1, the kernel supports kexec EFI boot with EFI runtime support.
Field name: cmdline_size
Type: read
Offset/size: 0x238/4
......
......@@ -28,4 +28,11 @@ reference.
Current X86-64 implementations only support 40 bits of address space,
but we support up to 46 bits. This expands into MBZ space in the page tables.
->trampoline_pgd:
We map EFI runtime services in the aforementioned PGD in the virtual
range of 64Gb (arbitrarily set, can be raised if needed)
0xffffffef00000000 - 0xffffffff00000000
-Andi Kleen, Jul 2004
......@@ -391,7 +391,14 @@ xloadflags:
#else
# define XLF23 0
#endif
.word XLF0 | XLF1 | XLF23
#if defined(CONFIG_X86_64) && defined(CONFIG_EFI) && defined(CONFIG_KEXEC)
# define XLF4 XLF_EFI_KEXEC
#else
# define XLF4 0
#endif
.word XLF0 | XLF1 | XLF23 | XLF4
cmdline_size: .long COMMAND_LINE_SIZE-1 #length of the command line,
#added with boot protocol
......
#ifndef _ASM_X86_EFI_H
#define _ASM_X86_EFI_H
/*
* We map the EFI regions needed for runtime services non-contiguously,
* with preserved alignment on virtual addresses starting from -4G down
* for a total max space of 64G. This way, we provide for stable runtime
* services addresses across kernels so that a kexec'd kernel can still
* use them.
*
* This is the main reason why we're doing stable VA mappings for RT
* services.
*
* This flag is used in conjuction with a chicken bit called
* "efi=old_map" which can be used as a fallback to the old runtime
* services mapping method in case there's some b0rkage with a
* particular EFI implementation (haha, it is hard to hold up the
* sarcasm here...).
*/
#define EFI_OLD_MEMMAP EFI_ARCH_1
#ifdef CONFIG_X86_32
#define EFI_LOADER_SIGNATURE "EL32"
......@@ -69,24 +87,31 @@ extern u64 efi_call6(void *fp, u64 arg1, u64 arg2, u64 arg3,
efi_call6((f), (u64)(a1), (u64)(a2), (u64)(a3), \
(u64)(a4), (u64)(a5), (u64)(a6))
#define _efi_call_virtX(x, f, ...) \
({ \
efi_status_t __s; \
\
efi_sync_low_kernel_mappings(); \
preempt_disable(); \
__s = efi_call##x((void *)efi.systab->runtime->f, __VA_ARGS__); \
preempt_enable(); \
__s; \
})
#define efi_call_virt0(f) \
efi_call0((efi.systab->runtime->f))
#define efi_call_virt1(f, a1) \
efi_call1((efi.systab->runtime->f), (u64)(a1))
#define efi_call_virt2(f, a1, a2) \
efi_call2((efi.systab->runtime->f), (u64)(a1), (u64)(a2))
#define efi_call_virt3(f, a1, a2, a3) \
efi_call3((efi.systab->runtime->f), (u64)(a1), (u64)(a2), \
(u64)(a3))
#define efi_call_virt4(f, a1, a2, a3, a4) \
efi_call4((efi.systab->runtime->f), (u64)(a1), (u64)(a2), \
(u64)(a3), (u64)(a4))
#define efi_call_virt5(f, a1, a2, a3, a4, a5) \
efi_call5((efi.systab->runtime->f), (u64)(a1), (u64)(a2), \
(u64)(a3), (u64)(a4), (u64)(a5))
#define efi_call_virt6(f, a1, a2, a3, a4, a5, a6) \
efi_call6((efi.systab->runtime->f), (u64)(a1), (u64)(a2), \
(u64)(a3), (u64)(a4), (u64)(a5), (u64)(a6))
_efi_call_virtX(0, f)
#define efi_call_virt1(f, a1) \
_efi_call_virtX(1, f, (u64)(a1))
#define efi_call_virt2(f, a1, a2) \
_efi_call_virtX(2, f, (u64)(a1), (u64)(a2))
#define efi_call_virt3(f, a1, a2, a3) \
_efi_call_virtX(3, f, (u64)(a1), (u64)(a2), (u64)(a3))
#define efi_call_virt4(f, a1, a2, a3, a4) \
_efi_call_virtX(4, f, (u64)(a1), (u64)(a2), (u64)(a3), (u64)(a4))
#define efi_call_virt5(f, a1, a2, a3, a4, a5) \
_efi_call_virtX(5, f, (u64)(a1), (u64)(a2), (u64)(a3), (u64)(a4), (u64)(a5))
#define efi_call_virt6(f, a1, a2, a3, a4, a5, a6) \
_efi_call_virtX(6, f, (u64)(a1), (u64)(a2), (u64)(a3), (u64)(a4), (u64)(a5), (u64)(a6))
extern void __iomem *efi_ioremap(unsigned long addr, unsigned long size,
u32 type, u64 attribute);
......@@ -95,12 +120,28 @@ extern void __iomem *efi_ioremap(unsigned long addr, unsigned long size,
extern int add_efi_memmap;
extern unsigned long x86_efi_facility;
extern struct efi_scratch efi_scratch;
extern void efi_set_executable(efi_memory_desc_t *md, bool executable);
extern int efi_memblock_x86_reserve_range(void);
extern void efi_call_phys_prelog(void);
extern void efi_call_phys_epilog(void);
extern void efi_unmap_memmap(void);
extern void efi_memory_uc(u64 addr, unsigned long size);
extern void __init efi_map_region(efi_memory_desc_t *md);
extern void __init efi_map_region_fixed(efi_memory_desc_t *md);
extern void efi_sync_low_kernel_mappings(void);
extern void efi_setup_page_tables(void);
extern void __init old_map_region(efi_memory_desc_t *md);
struct efi_setup_data {
u64 fw_vendor;
u64 runtime;
u64 tables;
u64 smbios;
u64 reserved[8];
};
extern u64 efi_setup;
#ifdef CONFIG_EFI
......@@ -110,7 +151,7 @@ static inline bool efi_is_native(void)
}
extern struct console early_efi_console;
extern void parse_efi_setup(u64 phys_addr, u32 data_len);
#else
/*
* IF EFI is not configured, have the EFI calls return -ENOSYS.
......@@ -122,6 +163,7 @@ extern struct console early_efi_console;
#define efi_call4(_f, _a1, _a2, _a3, _a4) (-ENOSYS)
#define efi_call5(_f, _a1, _a2, _a3, _a4, _a5) (-ENOSYS)
#define efi_call6(_f, _a1, _a2, _a3, _a4, _a5, _a6) (-ENOSYS)
static inline void parse_efi_setup(u64 phys_addr, u32 data_len) {}
#endif /* CONFIG_EFI */
#endif /* _ASM_X86_EFI_H */
......@@ -382,7 +382,8 @@ static inline void update_page_count(int level, unsigned long pages) { }
*/
extern pte_t *lookup_address(unsigned long address, unsigned int *level);
extern phys_addr_t slow_virt_to_phys(void *__address);
extern int kernel_map_pages_in_pgd(pgd_t *pgd, u64 pfn, unsigned long address,
unsigned numpages, unsigned long page_flags);
#endif /* !__ASSEMBLY__ */
#endif /* _ASM_X86_PGTABLE_DEFS_H */
......@@ -6,6 +6,7 @@
#define SETUP_E820_EXT 1
#define SETUP_DTB 2
#define SETUP_PCI 3
#define SETUP_EFI 4
/* ram_size flags */
#define RAMDISK_IMAGE_START_MASK 0x07FF
......@@ -23,6 +24,7 @@
#define XLF_CAN_BE_LOADED_ABOVE_4G (1<<1)
#define XLF_EFI_HANDOVER_32 (1<<2)
#define XLF_EFI_HANDOVER_64 (1<<3)
#define XLF_EFI_KEXEC (1<<4)
#ifndef __ASSEMBLY__
......
......@@ -29,6 +29,7 @@ obj-$(CONFIG_X86_64) += sys_x86_64.o x8664_ksyms_64.o
obj-y += syscall_$(BITS).o
obj-$(CONFIG_X86_64) += vsyscall_64.o
obj-$(CONFIG_X86_64) += vsyscall_emu_64.o
obj-$(CONFIG_SYSFS) += ksysfs.o
obj-y += bootflag.o e820.o
obj-y += pci-dma.o quirks.o topology.o kdebugfs.o
obj-y += alternative.o i8253.o pci-nommu.o hw_breakpoint.o
......
/*
* Architecture specific sysfs attributes in /sys/kernel
*
* Copyright (C) 2007, Intel Corp.
* Huang Ying <ying.huang@intel.com>
* Copyright (C) 2013, 2013 Red Hat, Inc.
* Dave Young <dyoung@redhat.com>
*
* This file is released under the GPLv2
*/
#include <linux/kobject.h>
#include <linux/string.h>
#include <linux/sysfs.h>
#include <linux/init.h>
#include <linux/stat.h>
#include <linux/slab.h>
#include <linux/mm.h>
#include <asm/io.h>
#include <asm/setup.h>
static ssize_t version_show(struct kobject *kobj,
struct kobj_attribute *attr, char *buf)
{
return sprintf(buf, "0x%04x\n", boot_params.hdr.version);
}
static struct kobj_attribute boot_params_version_attr = __ATTR_RO(version);
static ssize_t boot_params_data_read(struct file *fp, struct kobject *kobj,
struct bin_attribute *bin_attr,
char *buf, loff_t off, size_t count)
{
memcpy(buf, (void *)&boot_params + off, count);
return count;
}
static struct bin_attribute boot_params_data_attr = {
.attr = {
.name = "data",
.mode = S_IRUGO,
},
.read = boot_params_data_read,
.size = sizeof(boot_params),
};
static struct attribute *boot_params_version_attrs[] = {
&boot_params_version_attr.attr,
NULL,
};
static struct bin_attribute *boot_params_data_attrs[] = {
&boot_params_data_attr,
NULL,
};
static struct attribute_group boot_params_attr_group = {
.attrs = boot_params_version_attrs,
.bin_attrs = boot_params_data_attrs,
};
static int kobj_to_setup_data_nr(struct kobject *kobj, int *nr)
{
const char *name;
name = kobject_name(kobj);
return kstrtoint(name, 10, nr);
}
static int get_setup_data_paddr(int nr, u64 *paddr)
{
int i = 0;
struct setup_data *data;
u64 pa_data = boot_params.hdr.setup_data;
while (pa_data) {
if (nr == i) {
*paddr = pa_data;
return 0;
}
data = ioremap_cache(pa_data, sizeof(*data));
if (!data)
return -ENOMEM;
pa_data = data->next;
iounmap(data);
i++;
}
return -EINVAL;
}
static int __init get_setup_data_size(int nr, size_t *size)
{
int i = 0;
struct setup_data *data;
u64 pa_data = boot_params.hdr.setup_data;
while (pa_data) {
data = ioremap_cache(pa_data, sizeof(*data));
if (!data)
return -ENOMEM;
if (nr == i) {
*size = data->len;
iounmap(data);
return 0;
}
pa_data = data->next;
iounmap(data);
i++;
}
return -EINVAL;
}
static ssize_t type_show(struct kobject *kobj,
struct kobj_attribute *attr, char *buf)
{
int nr, ret;
u64 paddr;
struct setup_data *data;
ret = kobj_to_setup_data_nr(kobj, &nr);
if (ret)
return ret;
ret = get_setup_data_paddr(nr, &paddr);
if (ret)
return ret;
data = ioremap_cache(paddr, sizeof(*data));
if (!data)
return -ENOMEM;
ret = sprintf(buf, "0x%x\n", data->type);
iounmap(data);
return ret;
}
static ssize_t setup_data_data_read(struct file *fp,
struct kobject *kobj,
struct bin_attribute *bin_attr,
char *buf,
loff_t off, size_t count)
{
int nr, ret = 0;
u64 paddr;
struct setup_data *data;
void *p;
ret = kobj_to_setup_data_nr(kobj, &nr);
if (ret)
return ret;
ret = get_setup_data_paddr(nr, &paddr);
if (ret)
return ret;
data = ioremap_cache(paddr, sizeof(*data));
if (!data)
return -ENOMEM;
if (off > data->len) {
ret = -EINVAL;
goto out;
}
if (count > data->len - off)
count = data->len - off;
if (!count)
goto out;
ret = count;
p = ioremap_cache(paddr + sizeof(*data), data->len);
if (!p) {
ret = -ENOMEM;
goto out;
}
memcpy(buf, p + off, count);
iounmap(p);
out:
iounmap(data);
return ret;
}
static struct kobj_attribute type_attr = __ATTR_RO(type);
static struct bin_attribute data_attr = {
.attr = {
.name = "data",
.mode = S_IRUGO,
},
.read = setup_data_data_read,
};
static struct attribute *setup_data_type_attrs[] = {
&type_attr.attr,
NULL,
};
static struct bin_attribute *setup_data_data_attrs[] = {
&data_attr,
NULL,
};
static struct attribute_group setup_data_attr_group = {
.attrs = setup_data_type_attrs,
.bin_attrs = setup_data_data_attrs,
};
static int __init create_setup_data_node(struct kobject *parent,
struct kobject **kobjp, int nr)
{
int ret = 0;
size_t size;
struct kobject *kobj;
char name[16]; /* should be enough for setup_data nodes numbers */
snprintf(name, 16, "%d", nr);
kobj = kobject_create_and_add(name, parent);
if (!kobj)
return -ENOMEM;
ret = get_setup_data_size(nr, &size);
if (ret)
goto out_kobj;
data_attr.size = size;
ret = sysfs_create_group(kobj, &setup_data_attr_group);
if (ret)
goto out_kobj;
*kobjp = kobj;
return 0;
out_kobj:
kobject_put(kobj);
return ret;
}
static void __init cleanup_setup_data_node(struct kobject *kobj)
{
sysfs_remove_group(kobj, &setup_data_attr_group);
kobject_put(kobj);
}
static int __init get_setup_data_total_num(u64 pa_data, int *nr)
{
int ret = 0;
struct setup_data *data;
*nr = 0;
while (pa_data) {
*nr += 1;
data = ioremap_cache(pa_data, sizeof(*data));
if (!data) {
ret = -ENOMEM;
goto out;
}
pa_data = data->next;
iounmap(data);
}
out:
return ret;
}
static int __init create_setup_data_nodes(struct kobject *parent)
{
struct kobject *setup_data_kobj, **kobjp;
u64 pa_data;
int i, j, nr, ret = 0;
pa_data = boot_params.hdr.setup_data;
if (!pa_data)
return 0;
setup_data_kobj = kobject_create_and_add("setup_data", parent);
if (!setup_data_kobj) {
ret = -ENOMEM;
goto out;
}
ret = get_setup_data_total_num(pa_data, &nr);
if (ret)
goto out_setup_data_kobj;
kobjp = kmalloc(sizeof(*kobjp) * nr, GFP_KERNEL);
if (!kobjp) {
ret = -ENOMEM;
goto out_setup_data_kobj;
}
for (i = 0; i < nr; i++) {
ret = create_setup_data_node(setup_data_kobj, kobjp + i, i);
if (ret)
goto out_clean_nodes;
}
kfree(kobjp);
return 0;
out_clean_nodes:
for (j = i - 1; j > 0; j--)
cleanup_setup_data_node(*(kobjp + j));
kfree(kobjp);
out_setup_data_kobj:
kobject_put(setup_data_kobj);
out:
return ret;
}
static int __init boot_params_ksysfs_init(void)
{
int ret;
struct kobject *boot_params_kobj;
boot_params_kobj = kobject_create_and_add("boot_params",
kernel_kobj);
if (!boot_params_kobj) {
ret = -ENOMEM;
goto out;
}
ret = sysfs_create_group(boot_params_kobj, &boot_params_attr_group);
if (ret)
goto out_boot_params_kobj;
ret = create_setup_data_nodes(boot_params_kobj);
if (ret)
goto out_create_group;
return 0;
out_create_group:
sysfs_remove_group(boot_params_kobj, &boot_params_attr_group);
out_boot_params_kobj:
kobject_put(boot_params_kobj);
out:
return ret;
}
arch_initcall(boot_params_ksysfs_init);
......@@ -447,6 +447,9 @@ static void __init parse_setup_data(void)
case SETUP_DTB:
add_dtb(pa_data);
break;
case SETUP_EFI:
parse_efi_setup(pa_data, data_len);
break;
default:
break;
}
......@@ -924,8 +927,6 @@ void __init setup_arch(char **cmdline_p)
iomem_resource.end = (1ULL << boot_cpu_data.x86_phys_bits) - 1;
setup_memory_map();
parse_setup_data();
/* update the e820_saved too */
e820_reserve_setup_data();
copy_edd();
......@@ -987,6 +988,8 @@ void __init setup_arch(char **cmdline_p)
early_dump_pci_devices();
#endif
/* update the e820_saved too */
e820_reserve_setup_data();
finish_e820_parsing();
if (efi_enabled(EFI_BOOT))
......
This diff is collapsed.
This diff is collapsed.
......@@ -37,9 +37,19 @@
* claim EFI runtime service handler exclusively and to duplicate a memory in
* low memory space say 0 - 3G.
*/
static unsigned long efi_rt_eflags;
void efi_sync_low_kernel_mappings(void) {}
void efi_setup_page_tables(void) {}
void __init efi_map_region(efi_memory_desc_t *md)
{
old_map_region(md);
}
void __init efi_map_region_fixed(efi_memory_desc_t *md) {}
void __init parse_efi_setup(u64 phys_addr, u32 data_len) {}
void efi_call_phys_prelog(void)
{
struct desc_ptr gdt_descr;
......
......@@ -38,10 +38,28 @@
#include <asm/efi.h>
#include <asm/cacheflush.h>
#include <asm/fixmap.h>
#include <asm/realmode.h>
static pgd_t *save_pgd __initdata;
static unsigned long efi_flags __initdata;
/*
* We allocate runtime services regions bottom-up, starting from -4G, i.e.
* 0xffff_ffff_0000_0000 and limit EFI VA mapping space to 64G.
*/
static u64 efi_va = -4 * (1UL << 30);
#define EFI_VA_END (-68 * (1UL << 30))
/*
* Scratch space used for switching the pagetable in the EFI stub
*/
struct efi_scratch {
u64 r15;
u64 prev_cr3;
pgd_t *efi_pgt;
bool use_pgd;
};
static void __init early_code_mapping_set_exec(int executable)
{
efi_memory_desc_t *md;
......@@ -65,6 +83,9 @@ void __init efi_call_phys_prelog(void)
int pgd;
int n_pgds;
if (!efi_enabled(EFI_OLD_MEMMAP))
return;
early_code_mapping_set_exec(1);
local_irq_save(efi_flags);
......@@ -86,6 +107,10 @@ void __init efi_call_phys_epilog(void)
*/
int pgd;
int n_pgds = DIV_ROUND_UP((max_pfn << PAGE_SHIFT) , PGDIR_SIZE);
if (!efi_enabled(EFI_OLD_MEMMAP))
return;
for (pgd = 0; pgd < n_pgds; pgd++)
set_pgd(pgd_offset_k(pgd * PGDIR_SIZE), save_pgd[pgd]);
kfree(save_pgd);
......@@ -94,6 +119,96 @@ void __init efi_call_phys_epilog(void)
early_code_mapping_set_exec(0);
}
/*
* Add low kernel mappings for passing arguments to EFI functions.
*/
void efi_sync_low_kernel_mappings(void)
{
unsigned num_pgds;
pgd_t *pgd = (pgd_t *)__va(real_mode_header->trampoline_pgd);
if (efi_enabled(EFI_OLD_MEMMAP))
return;
num_pgds = pgd_index(MODULES_END - 1) - pgd_index(PAGE_OFFSET);
memcpy(pgd + pgd_index(PAGE_OFFSET),
init_mm.pgd + pgd_index(PAGE_OFFSET),
sizeof(pgd_t) * num_pgds);
}
void efi_setup_page_tables(void)
{
efi_scratch.efi_pgt = (pgd_t *)(unsigned long)real_mode_header->trampoline_pgd;
if (!efi_enabled(EFI_OLD_MEMMAP))
efi_scratch.use_pgd = true;
}
static void __init __map_region(efi_memory_desc_t *md, u64 va)
{
pgd_t *pgd = (pgd_t *)__va(real_mode_header->trampoline_pgd);
unsigned long pf = 0;
if (!(md->attribute & EFI_MEMORY_WB))
pf |= _PAGE_PCD;
if (kernel_map_pages_in_pgd(pgd, md->phys_addr, va, md->num_pages, pf))
pr_warn("Error mapping PA 0x%llx -> VA 0x%llx!\n",
md->phys_addr, va);
}
void __init efi_map_region(efi_memory_desc_t *md)
{
unsigned long size = md->num_pages << PAGE_SHIFT;
u64 pa = md->phys_addr;
if (efi_enabled(EFI_OLD_MEMMAP))
return old_map_region(md);
/*
* Make sure the 1:1 mappings are present as a catch-all for b0rked
* firmware which doesn't update all internal pointers after switching
* to virtual mode and would otherwise crap on us.
*/
__map_region(md, md->phys_addr);
efi_va -= size;
/* Is PA 2M-aligned? */
if (!(pa & (PMD_SIZE - 1))) {
efi_va &= PMD_MASK;
} else {
u64 pa_offset = pa & (PMD_SIZE - 1);
u64 prev_va = efi_va;
/* get us the same offset within this 2M page */
efi_va = (efi_va & PMD_MASK) + pa_offset;
if (efi_va > prev_va)
efi_va -= PMD_SIZE;
}
if (efi_va < EFI_VA_END) {
pr_warn(FW_WARN "VA address range overflow!\n");
return;
}
/* Do the VA map */
__map_region(md, efi_va);
md->virt_addr = efi_va;
}
/*
* kexec kernel will use efi_map_region_fixed to map efi runtime memory ranges.
* md->virt_addr is the original virtual address which had been mapped in kexec
* 1st kernel.
*/
void __init efi_map_region_fixed(efi_memory_desc_t *md)
{
__map_region(md, md->virt_addr);
}
void __iomem *__init efi_ioremap(unsigned long phys_addr, unsigned long size,
u32 type, u64 attribute)
{
......@@ -113,3 +228,8 @@ void __iomem *__init efi_ioremap(unsigned long phys_addr, unsigned long size,
return (void __iomem *)__va(phys_addr);
}
void __init parse_efi_setup(u64 phys_addr, u32 data_len)
{
efi_setup = phys_addr + sizeof(struct setup_data);
}
......@@ -34,10 +34,47 @@
mov %rsi, %cr0; \
mov (%rsp), %rsp
/* stolen from gcc */
.macro FLUSH_TLB_ALL
movq %r15, efi_scratch(%rip)
movq %r14, efi_scratch+8(%rip)
movq %cr4, %r15
movq %r15, %r14
andb $0x7f, %r14b
movq %r14, %cr4
movq %r15, %cr4
movq efi_scratch+8(%rip), %r14
movq efi_scratch(%rip), %r15
.endm
.macro SWITCH_PGT
cmpb $0, efi_scratch+24(%rip)
je 1f
movq %r15, efi_scratch(%rip) # r15
# save previous CR3
movq %cr3, %r15
movq %r15, efi_scratch+8(%rip) # prev_cr3
movq efi_scratch+16(%rip), %r15 # EFI pgt
movq %r15, %cr3
1:
.endm
.macro RESTORE_PGT
cmpb $0, efi_scratch+24(%rip)
je 2f
movq efi_scratch+8(%rip), %r15
movq %r15, %cr3
movq efi_scratch(%rip), %r15
FLUSH_TLB_ALL
2:
.endm
ENTRY(efi_call0)
SAVE_XMM
subq $32, %rsp
SWITCH_PGT
call *%rdi
RESTORE_PGT
addq $32, %rsp
RESTORE_XMM
ret
......@@ -47,7 +84,9 @@ ENTRY(efi_call1)
SAVE_XMM
subq $32, %rsp
mov %rsi, %rcx
SWITCH_PGT
call *%rdi
RESTORE_PGT
addq $32, %rsp
RESTORE_XMM
ret
......@@ -57,7 +96,9 @@ ENTRY(efi_call2)
SAVE_XMM
subq $32, %rsp
mov %rsi, %rcx
SWITCH_PGT
call *%rdi
RESTORE_PGT
addq $32, %rsp
RESTORE_XMM
ret
......@@ -68,7 +109,9 @@ ENTRY(efi_call3)
subq $32, %rsp
mov %rcx, %r8
mov %rsi, %rcx
SWITCH_PGT
call *%rdi
RESTORE_PGT
addq $32, %rsp
RESTORE_XMM
ret
......@@ -80,7 +123,9 @@ ENTRY(efi_call4)
mov %r8, %r9
mov %rcx, %r8
mov %rsi, %rcx
SWITCH_PGT
call *%rdi
RESTORE_PGT
addq $32, %rsp
RESTORE_XMM
ret
......@@ -93,7 +138,9 @@ ENTRY(efi_call5)
mov %r8, %r9
mov %rcx, %r8
mov %rsi, %rcx
SWITCH_PGT
call *%rdi
RESTORE_PGT
addq $48, %rsp
RESTORE_XMM
ret
......@@ -109,8 +156,15 @@ ENTRY(efi_call6)
mov %r8, %r9
mov %rcx, %r8
mov %rsi, %rcx
SWITCH_PGT
call *%rdi
RESTORE_PGT
addq $48, %rsp
RESTORE_XMM
ret
ENDPROC(efi_call6)
.data
ENTRY(efi_scratch)
.fill 3,8,0
.byte 0
......@@ -36,6 +36,17 @@ config EFI_VARS_PSTORE_DEFAULT_DISABLE
backend for pstore by default. This setting can be overridden
using the efivars module's pstore_disable parameter.
config EFI_RUNTIME_MAP
bool "Export efi runtime maps to sysfs"
depends on X86 && EFI && KEXEC
default y
help
Export efi runtime memory maps to /sys/firmware/efi/runtime-map.
That memory map is used for example by kexec to set up efi virtual
mapping the 2nd kernel, but can also be used for debugging purposes.
See also Documentation/ABI/testing/sysfs-firmware-efi-runtime-map.
endmenu
config UEFI_CPER
......
......@@ -5,3 +5,4 @@ obj-$(CONFIG_EFI) += efi.o vars.o
obj-$(CONFIG_EFI_VARS) += efivars.o
obj-$(CONFIG_EFI_VARS_PSTORE) += efi-pstore.o
obj-$(CONFIG_UEFI_CPER) += cper.o
obj-$(CONFIG_EFI_RUNTIME_MAP) += runtime-map.o
......@@ -32,6 +32,9 @@ struct efi __read_mostly efi = {
.hcdp = EFI_INVALID_TABLE_ADDR,
.uga = EFI_INVALID_TABLE_ADDR,
.uv_systab = EFI_INVALID_TABLE_ADDR,
.fw_vendor = EFI_INVALID_TABLE_ADDR,
.runtime = EFI_INVALID_TABLE_ADDR,
.config_table = EFI_INVALID_TABLE_ADDR,
};
EXPORT_SYMBOL(efi);
......@@ -71,13 +74,49 @@ static ssize_t systab_show(struct kobject *kobj,
static struct kobj_attribute efi_attr_systab =
__ATTR(systab, 0400, systab_show, NULL);
#define EFI_FIELD(var) efi.var
#define EFI_ATTR_SHOW(name) \
static ssize_t name##_show(struct kobject *kobj, \
struct kobj_attribute *attr, char *buf) \
{ \
return sprintf(buf, "0x%lx\n", EFI_FIELD(name)); \
}
EFI_ATTR_SHOW(fw_vendor);
EFI_ATTR_SHOW(runtime);
EFI_ATTR_SHOW(config_table);
static struct kobj_attribute efi_attr_fw_vendor = __ATTR_RO(fw_vendor);
static struct kobj_attribute efi_attr_runtime = __ATTR_RO(runtime);
static struct kobj_attribute efi_attr_config_table = __ATTR_RO(config_table);
static struct attribute *efi_subsys_attrs[] = {
&efi_attr_systab.attr,
NULL, /* maybe more in the future? */
&efi_attr_fw_vendor.attr,
&efi_attr_runtime.attr,
&efi_attr_config_table.attr,
NULL,
};
static umode_t efi_attr_is_visible(struct kobject *kobj,
struct attribute *attr, int n)
{
umode_t mode = attr->mode;
if (attr == &efi_attr_fw_vendor.attr)
return (efi.fw_vendor == EFI_INVALID_TABLE_ADDR) ? 0 : mode;
else if (attr == &efi_attr_runtime.attr)
return (efi.runtime == EFI_INVALID_TABLE_ADDR) ? 0 : mode;
else if (attr == &efi_attr_config_table.attr)
return (efi.config_table == EFI_INVALID_TABLE_ADDR) ? 0 : mode;
return mode;
}
static struct attribute_group efi_subsys_attr_group = {
.attrs = efi_subsys_attrs,
.is_visible = efi_attr_is_visible,
};
static struct efivars generic_efivars;
......@@ -128,6 +167,10 @@ static int __init efisubsys_init(void)
goto err_unregister;
}
error = efi_runtime_map_init(efi_kobj);
if (error)
goto err_remove_group;
/* and the standard mountpoint for efivarfs */
efivars_kobj = kobject_create_and_add("efivars", efi_kobj);
if (!efivars_kobj) {
......
/*
* linux/drivers/efi/runtime-map.c
* Copyright (C) 2013 Red Hat, Inc., Dave Young <dyoung@redhat.com>
*
* This file is released under the GPLv2.
*/
#include <linux/string.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/types.h>
#include <linux/efi.h>
#include <linux/slab.h>
#include <asm/setup.h>
static void *efi_runtime_map;
static int nr_efi_runtime_map;
static u32 efi_memdesc_size;
struct efi_runtime_map_entry {
efi_memory_desc_t md;
struct kobject kobj; /* kobject for each entry */
};
static struct efi_runtime_map_entry **map_entries;
struct map_attribute {
struct attribute attr;
ssize_t (*show)(struct efi_runtime_map_entry *entry, char *buf);
};
static inline struct map_attribute *to_map_attr(struct attribute *attr)
{
return container_of(attr, struct map_attribute, attr);
}
static ssize_t type_show(struct efi_runtime_map_entry *entry, char *buf)
{
return snprintf(buf, PAGE_SIZE, "0x%x\n", entry->md.type);
}
#define EFI_RUNTIME_FIELD(var) entry->md.var
#define EFI_RUNTIME_U64_ATTR_SHOW(name) \
static ssize_t name##_show(struct efi_runtime_map_entry *entry, char *buf) \
{ \
return snprintf(buf, PAGE_SIZE, "0x%llx\n", EFI_RUNTIME_FIELD(name)); \
}
EFI_RUNTIME_U64_ATTR_SHOW(phys_addr);
EFI_RUNTIME_U64_ATTR_SHOW(virt_addr);
EFI_RUNTIME_U64_ATTR_SHOW(num_pages);
EFI_RUNTIME_U64_ATTR_SHOW(attribute);
static inline struct efi_runtime_map_entry *to_map_entry(struct kobject *kobj)
{
return container_of(kobj, struct efi_runtime_map_entry, kobj);
}
static ssize_t map_attr_show(struct kobject *kobj, struct attribute *attr,
char *buf)
{
struct efi_runtime_map_entry *entry = to_map_entry(kobj);
struct map_attribute *map_attr = to_map_attr(attr);
return map_attr->show(entry, buf);
}
static struct map_attribute map_type_attr = __ATTR_RO(type);
static struct map_attribute map_phys_addr_attr = __ATTR_RO(phys_addr);
static struct map_attribute map_virt_addr_attr = __ATTR_RO(virt_addr);
static struct map_attribute map_num_pages_attr = __ATTR_RO(num_pages);
static struct map_attribute map_attribute_attr = __ATTR_RO(attribute);
/*
* These are default attributes that are added for every memmap entry.
*/
static struct attribute *def_attrs[] = {
&map_type_attr.attr,
&map_phys_addr_attr.attr,
&map_virt_addr_attr.attr,
&map_num_pages_attr.attr,
&map_attribute_attr.attr,
NULL
};
static const struct sysfs_ops map_attr_ops = {
.show = map_attr_show,
};
static void map_release(struct kobject *kobj)
{
struct efi_runtime_map_entry *entry;
entry = to_map_entry(kobj);
kfree(entry);
}
static struct kobj_type __refdata map_ktype = {
.sysfs_ops = &map_attr_ops,
.default_attrs = def_attrs,
.release = map_release,
};
static struct kset *map_kset;
static struct efi_runtime_map_entry *
add_sysfs_runtime_map_entry(struct kobject *kobj, int nr)
{
int ret;
struct efi_runtime_map_entry *entry;
if (!map_kset) {
map_kset = kset_create_and_add("runtime-map", NULL, kobj);
if (!map_kset)
return ERR_PTR(-ENOMEM);
}
entry = kzalloc(sizeof(*entry), GFP_KERNEL);
if (!entry) {
kset_unregister(map_kset);
return entry;
}
memcpy(&entry->md, efi_runtime_map + nr * efi_memdesc_size,
sizeof(efi_memory_desc_t));
kobject_init(&entry->kobj, &map_ktype);
entry->kobj.kset = map_kset;
ret = kobject_add(&entry->kobj, NULL, "%d", nr);
if (ret) {
kobject_put(&entry->kobj);
kset_unregister(map_kset);
return ERR_PTR(ret);
}
return entry;
}
void efi_runtime_map_setup(void *map, int nr_entries, u32 desc_size)
{
efi_runtime_map = map;
nr_efi_runtime_map = nr_entries;
efi_memdesc_size = desc_size;
}
int __init efi_runtime_map_init(struct kobject *efi_kobj)
{
int i, j, ret = 0;
struct efi_runtime_map_entry *entry;
if (!efi_runtime_map)
return 0;
map_entries = kzalloc(nr_efi_runtime_map * sizeof(entry), GFP_KERNEL);
if (!map_entries) {
ret = -ENOMEM;
goto out;
}
for (i = 0; i < nr_efi_runtime_map; i++) {
entry = add_sysfs_runtime_map_entry(efi_kobj, i);
if (IS_ERR(entry)) {
ret = PTR_ERR(entry);
goto out_add_entry;
}
*(map_entries + i) = entry;
}
return 0;
out_add_entry:
for (j = i - 1; j > 0; j--) {
entry = *(map_entries + j);
kobject_put(&entry->kobj);
}
if (map_kset)
kset_unregister(map_kset);
out:
return ret;
}
......@@ -556,6 +556,9 @@ extern struct efi {
unsigned long hcdp; /* HCDP table */
unsigned long uga; /* UGA table */
unsigned long uv_systab; /* UV system table */
unsigned long fw_vendor; /* fw_vendor */
unsigned long runtime; /* runtime table */
unsigned long config_table; /* config tables */
efi_get_time_t *get_time;
efi_set_time_t *set_time;
efi_get_wakeup_time_t *get_wakeup_time;
......@@ -653,6 +656,7 @@ extern int __init efi_setup_pcdp_console(char *);
#define EFI_RUNTIME_SERVICES 3 /* Can we use runtime services? */
#define EFI_MEMMAP 4 /* Can we use EFI memory map? */
#define EFI_64BIT 5 /* Is the firmware 64-bit? */
#define EFI_ARCH_1 6 /* First arch-specific bit */
#ifdef CONFIG_EFI
# ifdef CONFIG_X86
......@@ -872,4 +876,17 @@ int efivars_sysfs_init(void);
#endif /* CONFIG_EFI_VARS */
#ifdef CONFIG_EFI_RUNTIME_MAP
int efi_runtime_map_init(struct kobject *);
void efi_runtime_map_setup(void *, int, u32);
#else
static inline int efi_runtime_map_init(struct kobject *kobj)
{
return 0;
}
static inline void
efi_runtime_map_setup(void *map, int nr_entries, u32 desc_size) {}
#endif
#endif /* _LINUX_EFI_H */
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment