Commit 47e4079c authored by Andi Kleen's avatar Andi Kleen Committed by Linus Torvalds

[PATCH] x86-64 merge

Lots of changes that have accumulated over the last weeks.

This makes it compile and boot again, Lots of bug fixes, including
security fixes.  Several speedups.

Only changes x86-64 specific files.

 - Use private copy of siginfo.h (for si_band)
 - Align 32bit vsyscall coredump (from Roland McGrath)
 - First steps towards 64bit vsyscall coredump (not working yet)
 - Use in kernel trampoline for signals
 - Merge APIC pm update from Pavel/Mikael
 - Security fix for ioperm (from i386)
 - Reenable vsyscall dumping for 32bit coredumps
 - Fix bugs in 32bit coredump that could lead to oopses.
 - Fix 64bit vsyscalls
 - Revert change in pci-gart.c: pci_alloc_consistent must use an
   0xffffffff mask hardcoded.
 - Fix bug in noexec= option handling
 - Export fake_node
 - Cleanups from Pavel
 - Disable 32bit vsyscall coredump again.  Still has some problems.
 - Implement new noexec= and noexec32= options to give a wide choice
   of support for non executable mappings for 32bit and 64bit processes.
   The default is now to honor PROT_EXEC, but mark stack and heap
   PROT_EXEC.
 - 32bit emulation changes from Pavel: use compat_* types.
 - (2.4) Use physical address for GART register.
 - Convert debugreg array to individual members and clean up ptrace
   access.  This saves 16 byte per task.
 - (2.4) Use new streamlined context switch code.  This avoids a
   pipeline stall and pushes the register saving to C code.
 - Save flags register in context switch
 - Clean up SMP early bootup.  Remove some unnecessary code.
 - (2.4) Process numa= option early
 - (2.4) Merge 2.4 clear_page, copy_*_user, copy_page, memcpy, memset.
   These are much faster.  clear/copy_page don't force the new page out
   of memory now which should speed up user processes.  Also full
   workaround for errata #91.
 - Some cleanup in pageattr.c code.
 - Fix warning in i387.h
 - Fix wrong PAGE_KERNEL_LARGE define.  This fixes a security hole and
   makes AGP work again.
 - Fix wrong segment exception handler to not crash.
 - Fix incorrect swapgs handling in bad iret exception handling
 - Clean up some boot printks
 - Micro optimize exception handling preamble.
 - New reboot handling.  Supports warm reboot and BIOS reboot vector
   reboot now.
 - (2.4) Use MTRRs by default in vesafb
 - Fix bug in put_dirty_page: use correct page permissions for the stack
 - Fix type of si_band in asm-generic/siginfo.h to match POSIX/glibc
   (needs checking with other architecture maintainers)
 - (2.4) Define ARCH_HAS_NMI_WATCHDOG
 - Minor cleanup in calling.h
 - IOMMU tuning: only flush the GART TLB when the IOMMU aperture area
   allocation wraps.  Also don't clear entries until needed.  This
   should increase IO performance for IOMMU devices greatly.  Still a
   bit experimental, handle with care.
 - Unmap the IOMMU aperture from kernel mapping to prevent unwanted CPU
   prefetches.
 - Make IOMMU_LEAK_TRACE depend on IOMMU_DEBUG
 - Fix minor bug in pci_alloc_consistent - always check against the dma
   mask of the device, not 0xffffffff.
 - Remove streamining mapping delayed flush in IOMMU: not needed anymore
   and didn't work correctly in 2.5 anyways.
 - Fix the bad pte warnings caused by the SMP/APIC bootup.
 - Forward port 2.4 fix: ioperm was changing the wrong io ports in some
   cases.
 - Minor cleanups
 - Some cleanups in pageattr.c (still buggy)
 - Fix some bugs in the AGP driver.
 - Forward port from 2.4: mask all reserved bits in debug register in
   ptrace.  Previously gdb could crash the kernel by passing invalid
   values.
 - Security fix: make sure FPU is in a defined state after an
   FXSAVE/FXRSTOR exception occurred.
 - Eats keys on panic (works around a buggy KVM)
 - Make user.h user includeable.
 - Disable sign compare warnings for gcc 3.3-hammer
 - Use DSO for 32bit vsyscalls and dump it in core dumps.  Add dwarf2
   information for the vsyscalls.
   Thanks to Richard Henderson for helping me with the nasty parts of
   it.  I had to do some changes over his patch and it's currently only
   lightly tested.  Handle with care.  This only affects 32bit programs
   that use a glibc 3.2 with sysenter support.
 - Security fixes for the 32bit ioctl handlers.  Also some simplications
   and speedups.
 - gcc 3.3-hammer compile fixes for inline assembly
 - Remove acpi.c file corpse.
 - Lots of warning fixes
 - Disable some Dprintks to make the bootup quieter again
 - Clean up ptrace a bit (together with warning fixes)
 - Merge with i386 (handle ACPI dynamic irq entries properly)
 - Disable change_page_attr in pci-gart for now.  Strictly that's
   incorrect, need to do more testing for the root cause of the current
   IOMMU problems.
 - Update defconfig
 - Disable first prefetch in copy_user that is likely to trigger Opteron
   Errata #91
 - More irqreturn_t fixes
 - Add pte_user and fix the vsyscall ptrace hack in generic code.
   It's still partly broken
 - Port verbose MCE handler from 2.4
parent eaf7c976
......@@ -652,6 +652,7 @@ config IOMMU_DEBUG
config IOMMU_LEAK
bool "IOMMU leak tracing"
depends on DEBUG_KERNEL
depends on IOMMU_DEBUG
help
Add a simple leak tracer to the IOMMU code. This is useful when you
are debugging a buggy device driver that leaks IOMMU mappings.
......
......@@ -46,6 +46,7 @@ CFLAGS += -pipe
CFLAGS += -fno-reorder-blocks
# should lower this a lot and see how much .text is saves
CFLAGS += -finline-limit=2000
CFLAGS += -Wno-sign-compare
#CFLAGS += -g
# don't enable this when you use kgdb:
ifneq ($(CONFIG_X86_REMOTE_DEBUG),y)
......
......@@ -4,7 +4,6 @@
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_SWAP=y
CONFIG_RWSEM_GENERIC_SPINLOCK=y
CONFIG_X86_CMPXCHG=y
CONFIG_EARLY_PRINTK=y
......@@ -18,6 +17,7 @@ CONFIG_EXPERIMENTAL=y
#
# General setup
#
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
# CONFIG_BSD_PROCESS_ACCT is not set
CONFIG_SYSCTL=y
......@@ -47,8 +47,12 @@ CONFIG_X86_CPUID=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_MTRR=y
CONFIG_HUGETLB_PAGE=y
# CONFIG_HUGETLB_PAGE is not set
CONFIG_SMP=y
# CONFIG_PREEMPT is not set
CONFIG_K8_NUMA=y
CONFIG_DISCONTIGMEM=y
CONFIG_NUMA=y
CONFIG_HAVE_DEC_LOCK=y
CONFIG_NR_CPUS=8
CONFIG_GART_IOMMU=y
......@@ -222,9 +226,8 @@ CONFIG_NET=y
#
CONFIG_PACKET=y
# CONFIG_PACKET_MMAP is not set
CONFIG_NETLINK_DEV=y
# CONFIG_NETLINK_DEV is not set
# CONFIG_NETFILTER is not set
CONFIG_FILTER=y
CONFIG_UNIX=y
# CONFIG_NET_KEY is not set
CONFIG_INET=y
......@@ -239,8 +242,9 @@ CONFIG_IP_MULTICAST=y
# CONFIG_SYN_COOKIES is not set
# CONFIG_INET_AH is not set
# CONFIG_INET_ESP is not set
# CONFIG_XFRM_USER is not set
# CONFIG_INET_IPCOMP is not set
# CONFIG_IPV6 is not set
# CONFIG_XFRM_USER is not set
#
# SCTP Configuration (EXPERIMENTAL)
......@@ -331,6 +335,11 @@ CONFIG_E1000=m
# CONFIG_R8169 is not set
# CONFIG_SK98LIN is not set
CONFIG_TIGON3=y
#
# Ethernet (10000 Mbit)
#
# CONFIG_IXGB is not set
# CONFIG_FDDI is not set
# CONFIG_HIPPI is not set
# CONFIG_PPP is not set
......@@ -405,15 +414,7 @@ CONFIG_KEYBOARD_ATKBD=y
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=y
# CONFIG_MOUSE_SERIAL is not set
CONFIG_INPUT_JOYSTICK=y
# CONFIG_JOYSTICK_IFORCE is not set
# CONFIG_JOYSTICK_WARRIOR is not set
# CONFIG_JOYSTICK_MAGELLAN is not set
# CONFIG_JOYSTICK_SPACEORB is not set
# CONFIG_JOYSTICK_SPACEBALL is not set
# CONFIG_JOYSTICK_STINGER is not set
# CONFIG_JOYSTICK_TWIDDLER is not set
# CONFIG_INPUT_JOYDUMP is not set
# CONFIG_INPUT_JOYSTICK is not set
# CONFIG_INPUT_TOUCHSCREEN is not set
# CONFIG_INPUT_MISC is not set
......@@ -452,6 +453,7 @@ CONFIG_UNIX98_PTY_COUNT=256
#
# I2C Hardware Sensors Chip support
#
# CONFIG_I2C_SENSOR is not set
#
# Mice
......@@ -468,8 +470,7 @@ CONFIG_UNIX98_PTY_COUNT=256
# Watchdog Cards
#
# CONFIG_WATCHDOG is not set
# CONFIG_INTEL_RNG is not set
# CONFIG_AMD_RNG is not set
CONFIG_HW_RANDOM=y
# CONFIG_NVRAM is not set
CONFIG_RTC=y
# CONFIG_DTLK is not set
......@@ -481,8 +482,8 @@ CONFIG_RTC=y
# Ftape, the floppy tape device driver
#
# CONFIG_FTAPE is not set
# CONFIG_AGP is not set
# CONFIG_AGP_GART is not set
CONFIG_AGP=y
CONFIG_AGP_AMD_8151=y
# CONFIG_DRM is not set
# CONFIG_MWAVE is not set
CONFIG_RAW_DRIVER=y
......@@ -497,58 +498,76 @@ CONFIG_RAW_DRIVER=y
#
# CONFIG_VIDEO_DEV is not set
#
# Digital Video Broadcasting Devices
#
# CONFIG_DVB is not set
#
# File systems
#
# CONFIG_QUOTA is not set
CONFIG_AUTOFS_FS=y
# CONFIG_AUTOFS4_FS is not set
CONFIG_EXT2_FS=y
# CONFIG_EXT2_FS_XATTR is not set
CONFIG_EXT3_FS=y
# CONFIG_EXT3_FS_XATTR is not set
CONFIG_JBD=y
# CONFIG_JBD_DEBUG is not set
CONFIG_REISERFS_FS=y
# CONFIG_REISERFS_CHECK is not set
# CONFIG_REISERFS_PROC_INFO is not set
# CONFIG_JFS_FS is not set
CONFIG_XFS_FS=m
# CONFIG_XFS_RT is not set
# CONFIG_XFS_QUOTA is not set
# CONFIG_XFS_POSIX_ACL is not set
# CONFIG_MINIX_FS is not set
# CONFIG_ROMFS_FS is not set
# CONFIG_QUOTA is not set
CONFIG_AUTOFS_FS=y
# CONFIG_AUTOFS4_FS is not set
#
# CD-ROM/DVD Filesystems
#
CONFIG_ISO9660_FS=y
# CONFIG_JOLIET is not set
# CONFIG_ZISOFS is not set
# CONFIG_UDF_FS is not set
#
# DOS/FAT/NT Filesystems
#
# CONFIG_FAT_FS is not set
# CONFIG_NTFS_FS is not set
#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
# CONFIG_DEVFS_FS is not set
CONFIG_DEVPTS_FS=y
CONFIG_TMPFS=y
CONFIG_RAMFS=y
#
# Miscellaneous filesystems
#
# CONFIG_ADFS_FS is not set
# CONFIG_AFFS_FS is not set
# CONFIG_HFS_FS is not set
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
CONFIG_EXT3_FS=y
# CONFIG_EXT3_FS_XATTR is not set
CONFIG_JBD=y
# CONFIG_JBD_DEBUG is not set
# CONFIG_FAT_FS is not set
# CONFIG_EFS_FS is not set
# CONFIG_CRAMFS is not set
CONFIG_TMPFS=y
CONFIG_RAMFS=y
CONFIG_HUGETLBFS=y
CONFIG_ISO9660_FS=y
# CONFIG_JOLIET is not set
# CONFIG_ZISOFS is not set
# CONFIG_JFS_FS is not set
# CONFIG_MINIX_FS is not set
# CONFIG_VXFS_FS is not set
# CONFIG_NTFS_FS is not set
# CONFIG_HPFS_FS is not set
CONFIG_PROC_FS=y
# CONFIG_DEVFS_FS is not set
CONFIG_DEVPTS_FS=y
# CONFIG_QNX4FS_FS is not set
# CONFIG_ROMFS_FS is not set
CONFIG_EXT2_FS=y
# CONFIG_EXT2_FS_XATTR is not set
# CONFIG_SYSV_FS is not set
# CONFIG_UDF_FS is not set
# CONFIG_UFS_FS is not set
CONFIG_XFS_FS=m
# CONFIG_XFS_RT is not set
# CONFIG_XFS_QUOTA is not set
# CONFIG_XFS_POSIX_ACL is not set
#
# Network File Systems
#
# CONFIG_CODA_FS is not set
# CONFIG_INTERMEZZO_FS is not set
CONFIG_NFS_FS=y
CONFIG_NFS_V3=y
# CONFIG_NFS_V4 is not set
......@@ -556,14 +575,16 @@ CONFIG_NFSD=y
CONFIG_NFSD_V3=y
# CONFIG_NFSD_V4 is not set
CONFIG_NFSD_TCP=y
CONFIG_SUNRPC=y
# CONFIG_SUNRPC_GSS is not set
CONFIG_LOCKD=y
CONFIG_LOCKD_V4=y
CONFIG_EXPORTFS=y
# CONFIG_CIFS is not set
CONFIG_SUNRPC=y
# CONFIG_SUNRPC_GSS is not set
# CONFIG_SMB_FS is not set
# CONFIG_CIFS is not set
# CONFIG_NCP_FS is not set
# CONFIG_CODA_FS is not set
# CONFIG_INTERMEZZO_FS is not set
# CONFIG_AFS_FS is not set
#
......@@ -594,6 +615,7 @@ CONFIG_DUMMY_CONSOLE=y
# USB support
#
# CONFIG_USB is not set
# CONFIG_USB_GADGET is not set
#
# Bluetooth support
......@@ -615,6 +637,9 @@ CONFIG_MAGIC_SYSRQ=y
# CONFIG_INIT_DEBUG is not set
CONFIG_KALLSYMS=y
# CONFIG_FRAME_POINTER is not set
CONFIG_IOMMU_DEBUG=y
CONFIG_IOMMU_LEAK=y
CONFIG_MCE_DEBUG=y
#
# Security options
......
......@@ -5,3 +5,12 @@
obj-$(CONFIG_IA32_EMULATION) := ia32entry.o sys_ia32.o ia32_ioctl.o \
ia32_signal.o tls32.o \
ia32_binfmt.o fpu32.o ptrace32.o ipc32.o syscall32.o
$(obj)/syscall32.o: $(src)/syscall32.c $(obj)/vsyscall.so
# The DSO images are built using a special linker script.
$(obj)/vsyscall.so: $(src)/vsyscall.lds $(obj)/vsyscall.o
$(CC) -m32 -nostdlib -shared -s -Wl,-soname=linux-vsyscall.so.1 \
-o $@ -Wl,-T,$^
AFLAGS_vsyscall.o = -m32
......@@ -6,11 +6,11 @@
* of ugly preprocessor tricks. Talk about very very poor man's inheritance.
*/
#include <linux/types.h>
#include <linux/compat.h>
#include <linux/config.h>
#include <linux/stddef.h>
#include <linux/rwsem.h>
#include <linux/sched.h>
#include <linux/compat.h>
#include <linux/string.h>
#include <linux/binfmts.h>
#include <linux/mm.h>
......@@ -23,12 +23,17 @@
#include <asm/i387.h>
#include <asm/uaccess.h>
#include <asm/ia32.h>
#include <asm/vsyscall32.h>
#define ELF_NAME "elf/i386"
#define AT_SYSINFO 32
#define AT_SYSINFO_EHDR 33
#define ARCH_DLINFO NEW_AUX_ENT(AT_SYSINFO, 0xffffe000)
#define ARCH_DLINFO do { \
NEW_AUX_ENT(AT_SYSINFO, (u32)(u64)VSYSCALL32_VSYSCALL); \
NEW_AUX_ENT(AT_SYSINFO_EHDR, VSYSCALL32_BASE); \
} while(0)
struct file;
struct elf_phdr;
......@@ -54,6 +59,47 @@ typedef unsigned int elf_greg_t;
#define ELF_NGREG (sizeof (struct user_regs_struct32) / sizeof(elf_greg_t))
typedef elf_greg_t elf_gregset_t[ELF_NGREG];
/*
* These macros parameterize elf_core_dump in fs/binfmt_elf.c to write out
* extra segments containing the vsyscall DSO contents. Dumping its
* contents makes post-mortem fully interpretable later without matching up
* the same kernel and hardware config to see what PC values meant.
* Dumping its extra ELF program headers includes all the other information
* a debugger needs to easily find how the vsyscall DSO was being used.
*/
#define ELF_CORE_EXTRA_PHDRS (VSYSCALL32_EHDR->e_phnum)
#define ELF_CORE_WRITE_EXTRA_PHDRS \
do { \
const struct elf32_phdr *const vsyscall_phdrs = \
(const struct elf32_phdr *) (VSYSCALL32_BASE \
+ VSYSCALL32_EHDR->e_phoff); \
int i; \
Elf32_Off ofs = 0; \
for (i = 0; i < VSYSCALL32_EHDR->e_phnum; ++i) { \
struct elf_phdr phdr = vsyscall_phdrs[i]; \
if (phdr.p_type == PT_LOAD) { \
ofs = phdr.p_offset = offset; \
offset += phdr.p_filesz; \
} \
else \
phdr.p_offset += ofs; \
phdr.p_paddr = 0; /* match other core phdrs */ \
DUMP_WRITE(&phdr, sizeof(phdr)); \
} \
} while (0)
#define ELF_CORE_WRITE_EXTRA_DATA \
do { \
const struct elf32_phdr *const vsyscall_phdrs = \
(const struct elf32_phdr *) (VSYSCALL32_BASE \
+ VSYSCALL32_EHDR->e_phoff); \
int i; \
for (i = 0; i < VSYSCALL32_EHDR->e_phnum; ++i) { \
if (vsyscall_phdrs[i].p_type == PT_LOAD) \
DUMP_WRITE((void *) (u64) vsyscall_phdrs[i].p_vaddr, \
vsyscall_phdrs[i].p_filesz); \
} \
} while (0)
struct elf_siginfo
{
int si_signo; /* signal number */
......@@ -157,7 +203,6 @@ elf_core_copy_task_fpregs(struct task_struct *tsk, elf_fpregset_t *fpu)
struct _fpstate_ia32 *fpstate = (void*)fpu;
struct pt_regs *regs = (struct pt_regs *)(tsk->thread.rsp0);
mm_segment_t oldfs = get_fs();
int ret;
if (!tsk->used_math)
return 0;
......@@ -165,12 +210,12 @@ elf_core_copy_task_fpregs(struct task_struct *tsk, elf_fpregset_t *fpu)
if (tsk == current)
unlazy_fpu(tsk);
set_fs(KERNEL_DS);
ret = save_i387_ia32(tsk, fpstate, regs, 1);
save_i387_ia32(tsk, fpstate, regs, 1);
/* Correct for i386 bug. It puts the fop into the upper 16bits of
the tag word (like FXSAVE), not into the fcs*/
fpstate->cssel |= fpstate->tag & 0xffff0000;
set_fs(oldfs);
return ret;
return 1;
}
#define ELF_CORE_COPY_XFPREGS 1
......@@ -302,8 +347,9 @@ int setup_arg_pages(struct linux_binprm *bprm)
mpnt->vm_mm = mm;
mpnt->vm_start = PAGE_MASK & (unsigned long) bprm->p;
mpnt->vm_end = IA32_STACK_TOP;
mpnt->vm_page_prot = PAGE_COPY_EXEC;
mpnt->vm_flags = VM_STACK_FLAGS;
mpnt->vm_flags = vm_stack_flags32;
mpnt->vm_page_prot = (mpnt->vm_flags & VM_EXEC) ?
PAGE_COPY_EXEC : PAGE_COPY;
mpnt->vm_ops = NULL;
mpnt->vm_pgoff = 0;
mpnt->vm_file = NULL;
......@@ -333,7 +379,7 @@ elf32_map (struct file *filep, unsigned long addr, struct elf_phdr *eppnt, int p
struct task_struct *me = current;
if (prot & PROT_READ)
prot |= PROT_EXEC;
prot |= vm_force_exec32;
down_write(&me->mm->mmap_sem);
map_addr = do_mmap(filep, ELF_PAGESTART(addr),
......
This diff is collapsed.
......@@ -33,6 +33,7 @@
#include <asm/sigcontext32.h>
#include <asm/fpu32.h>
#include <asm/proto.h>
#include <asm/vsyscall32.h>
#define ptr_to_u32(x) ((u32)(u64)(x)) /* avoid gcc warning */
......@@ -428,7 +429,7 @@ void ia32_setup_frame(int sig, struct k_sigaction *ka,
/* Return stub is in 32bit vsyscall page */
{
void *restorer = syscall32_page + 32;
void *restorer = VSYSCALL32_SIGRETURN;
if (ka->sa.sa_flags & SA_RESTORER)
restorer = ka->sa.sa_restorer;
err |= __put_user(ptr_to_u32(restorer), &frame->pretcode);
......@@ -521,7 +522,7 @@ void ia32_setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info,
{
void *restorer = syscall32_page + 32;
void *restorer = VSYSCALL32_RTSIGRETURN;
if (ka->sa.sa_flags & SA_RESTORER)
restorer = ka->sa.sa_restorer;
err |= __put_user(ptr_to_u32(restorer), &frame->pretcode);
......
......@@ -182,9 +182,9 @@ quiet_ni_syscall:
PTREGSCALL stub32_sigaltstack, sys32_sigaltstack
PTREGSCALL stub32_sigsuspend, sys32_sigsuspend
PTREGSCALL stub32_execve, sys32_execve
PTREGSCALL stub32_fork, sys32_fork
PTREGSCALL stub32_fork, sys_fork
PTREGSCALL stub32_clone, sys32_clone
PTREGSCALL stub32_vfork, sys32_vfork
PTREGSCALL stub32_vfork, sys_vfork
PTREGSCALL stub32_iopl, sys_iopl
PTREGSCALL stub32_rt_sigsuspend, sys_rt_sigsuspend
......
......@@ -78,12 +78,24 @@ static int putreg32(struct task_struct *child, unsigned regno, u32 val)
case offsetof(struct user32, u_debugreg[5]):
return -EIO;
case offsetof(struct user32, u_debugreg[0]) ...
offsetof(struct user32, u_debugreg[3]):
case offsetof(struct user32, u_debugreg[0]):
child->thread.debugreg0 = val;
break;
case offsetof(struct user32, u_debugreg[1]):
child->thread.debugreg1 = val;
break;
case offsetof(struct user32, u_debugreg[2]):
child->thread.debugreg2 = val;
break;
case offsetof(struct user32, u_debugreg[3]):
child->thread.debugreg3 = val;
break;
case offsetof(struct user32, u_debugreg[6]):
child->thread.debugreg
[(regno-offsetof(struct user32, u_debugreg[0]))/4]
= val;
child->thread.debugreg6 = val;
break;
case offsetof(struct user32, u_debugreg[7]):
......@@ -92,7 +104,7 @@ static int putreg32(struct task_struct *child, unsigned regno, u32 val)
for(i=0; i<4; i++)
if ((0x5454 >> ((val >> (16 + 4*i)) & 0xf)) & 1)
return -EIO;
child->thread.debugreg[7] = val;
child->thread.debugreg7 = val;
break;
default:
......@@ -142,8 +154,23 @@ static int getreg32(struct task_struct *child, unsigned regno, u32 *val)
R32(eflags, eflags);
R32(esp, rsp);
case offsetof(struct user32, u_debugreg[0]) ... offsetof(struct user32, u_debugreg[7]):
*val = child->thread.debugreg[(regno-offsetof(struct user32, u_debugreg[0]))/4];
case offsetof(struct user32, u_debugreg[0]):
*val = child->thread.debugreg0;
break;
case offsetof(struct user32, u_debugreg[1]):
*val = child->thread.debugreg1;
break;
case offsetof(struct user32, u_debugreg[2]):
*val = child->thread.debugreg2;
break;
case offsetof(struct user32, u_debugreg[3]):
*val = child->thread.debugreg3;
break;
case offsetof(struct user32, u_debugreg[6]):
*val = child->thread.debugreg6;
break;
case offsetof(struct user32, u_debugreg[7]):
*val = child->thread.debugreg7;
break;
default:
......
......@@ -234,7 +234,7 @@ sys32_mmap(struct mmap_arg_struct *arg)
}
if (a.prot & PROT_READ)
a.prot |= PROT_EXEC;
a.prot |= vm_force_exec32;
mm = current->mm;
down_write(&mm->mmap_sem);
......@@ -253,7 +253,7 @@ asmlinkage long
sys32_mprotect(unsigned long start, size_t len, unsigned long prot)
{
if (prot & PROT_READ)
prot |= PROT_EXEC;
prot |= vm_force_exec32;
return sys_mprotect(start,len,prot);
}
......@@ -929,7 +929,11 @@ struct sysinfo32 {
u32 totalswap;
u32 freeswap;
unsigned short procs;
char _f[22];
unsigned short pad;
u32 totalhigh;
u32 freehigh;
u32 mem_unit;
char _f[20-2*sizeof(u32)-sizeof(int)];
};
extern asmlinkage long sys_sysinfo(struct sysinfo *info);
......@@ -955,7 +959,10 @@ sys32_sysinfo(struct sysinfo32 *info)
__put_user (s.bufferram, &info->bufferram) ||
__put_user (s.totalswap, &info->totalswap) ||
__put_user (s.freeswap, &info->freeswap) ||
__put_user (s.procs, &info->procs))
__put_user (s.procs, &info->procs) ||
__put_user (s.totalhigh, &info->totalhigh) ||
__put_user (s.freehigh, &info->freehigh) ||
__put_user (s.mem_unit, &info->mem_unit))
return -EFAULT;
return 0;
}
......@@ -1419,7 +1426,7 @@ asmlinkage long sys32_mmap2(unsigned long addr, unsigned long len,
}
if (prot & PROT_READ)
prot |= PROT_EXEC;
prot |= vm_force_exec32;
down_write(&mm->mmap_sem);
error = do_mmap_pgoff(file, addr, len, prot, flags, pgoff);
......@@ -1587,40 +1594,14 @@ asmlinkage long sys32_execve(char *name, u32 argv, u32 envp, struct pt_regs regs
return ret;
}
asmlinkage long sys32_fork(struct pt_regs regs)
{
struct task_struct *p;
p = do_fork(SIGCHLD, regs.rsp, &regs, 0, NULL, NULL);
return IS_ERR(p) ? PTR_ERR(p) : p->pid;
}
asmlinkage long sys32_clone(unsigned int clone_flags, unsigned int newsp, struct pt_regs regs)
{
struct task_struct *p;
void *parent_tid = (void *)regs.rdx;
void *child_tid = (void *)regs.rdi;
if (!newsp)
newsp = regs.rsp;
p = do_fork(clone_flags & ~CLONE_IDLETASK, newsp, &regs, 0,
return do_fork(clone_flags & ~CLONE_IDLETASK, newsp, &regs, 0,
parent_tid, child_tid);
return IS_ERR(p) ? PTR_ERR(p) : p->pid;
}
/*
* This is trivial, and on the face of it looks like it
* could equally well be done in user mode.
*
* Not so, for quite unobvious reasons - register pressure.
* In user mode vfork() cannot have a stack frame, and if
* done by calling the "clone()" system call directly, you
* do not have enough call-clobbered registers to hold all
* the information you need.
*/
asmlinkage long sys32_vfork(struct pt_regs regs)
{
struct task_struct *p;
p = do_fork(CLONE_VFORK | CLONE_VM | SIGCHLD, regs.rsp, &regs, 0, NULL, NULL);
return IS_ERR(p) ? PTR_ERR(p) : p->pid;
}
/*
......
......@@ -13,33 +13,14 @@
#include <asm/tlbflush.h>
#include <asm/ia32_unistd.h>
/* 32bit SYSCALL stub mapped into user space. */
asm(" .code32\n"
"\nsyscall32:\n"
" pushl %ebp\n"
" movl %ecx,%ebp\n"
" syscall\n"
" popl %ebp\n"
" ret\n"
/* 32bit VDSO mapped into user space. */
asm(".section \".init.data\",\"aw\"\n"
"syscall32:\n"
".incbin \"arch/x86_64/ia32/vsyscall.so\"\n"
"syscall32_end:\n"
/* signal trampolines */
"sig32_rt_tramp:\n"
" movl $" __stringify(__NR_ia32_rt_sigreturn) ",%eax\n"
" syscall\n"
"sig32_rt_tramp_end:\n"
"sig32_tramp:\n"
" popl %eax\n"
" movl $" __stringify(__NR_ia32_sigreturn) ",%eax\n"
" syscall\n"
"sig32_tramp_end:\n"
" .code64\n");
".previous");
extern unsigned char syscall32[], syscall32_end[];
extern unsigned char sig32_rt_tramp[], sig32_rt_tramp_end[];
extern unsigned char sig32_tramp[], sig32_tramp_end[];
char *syscall32_page;
......@@ -76,10 +57,6 @@ static int __init init_syscall32(void)
panic("Cannot allocate syscall32 page");
SetPageReserved(virt_to_page(syscall32_page));
memcpy(syscall32_page, syscall32, syscall32_end - syscall32);
memcpy(syscall32_page + 32, sig32_rt_tramp,
sig32_rt_tramp_end - sig32_rt_tramp);
memcpy(syscall32_page + 64, sig32_tramp,
sig32_tramp_end - sig32_tramp);
return 0;
}
......
/*
* Code for the vsyscall page. This version uses the syscall instruction.
*/
#include <asm/ia32_unistd.h>
#include <asm/offset.h>
.text
.section .text.vsyscall,"ax"
.globl __kernel_vsyscall
.type __kernel_vsyscall,@function
__kernel_vsyscall:
.LSTART_vsyscall:
push %ebp
.Lpush_ebp:
movl %ecx, %ebp
syscall
popl %ebp
.Lpop_ebp:
ret
.LEND_vsyscall:
.size __kernel_vsyscall,.-.LSTART_vsyscall
.section .text.sigreturn,"ax"
.balign 32
.globl __kernel_sigreturn
.type __kernel_sigreturn,@function
__kernel_sigreturn:
.LSTART_sigreturn:
popl %eax
movl $__NR_ia32_sigreturn, %eax
syscall
.LEND_sigreturn:
.size __kernel_sigreturn,.-.LSTART_sigreturn
.section .text.rtsigreturn,"ax"
.balign 32
.globl __kernel_rt_sigreturn,"ax"
.type __kernel_rt_sigreturn,@function
__kernel_rt_sigreturn:
.LSTART_rt_sigreturn:
movl $__NR_ia32_rt_sigreturn, %eax
syscall
.LEND_rt_sigreturn:
.size __kernel_rt_sigreturn,.-.LSTART_rt_sigreturn
.section .eh_frame,"a",@progbits
.LSTARTFRAME:
.long .LENDCIE-.LSTARTCIE
.LSTARTCIE:
.long 0 /* CIE ID */
.byte 1 /* Version number */
.string "zR" /* NUL-terminated augmentation string */
.uleb128 1 /* Code alignment factor */
.sleb128 -4 /* Data alignment factor */
.byte 8 /* Return address register column */
.uleb128 1 /* Augmentation value length */
.byte 0x1b /* DW_EH_PE_pcrel|DW_EH_PE_sdata4. */
.byte 0x0c /* DW_CFA_def_cfa */
.uleb128 4
.uleb128 4
.byte 0x88 /* DW_CFA_offset, column 0x8 */
.uleb128 1
.align 4
.LENDCIE:
.long .LENDFDE1-.LSTARTFDE1 /* Length FDE */
.LSTARTFDE1:
.long .LSTARTFDE1-.LSTARTFRAME /* CIE pointer */
.long .LSTART_vsyscall-. /* PC-relative start address */
.long .LEND_vsyscall-.LSTART_vsyscall
.uleb128 0 /* Augmentation length */
/* What follows are the instructions for the table generation.
We have to record all changes of the stack pointer. */
.byte 0x40 + .Lpush_ebp-.LSTART_vsyscall /* DW_CFA_advance_loc */
.byte 0x0e /* DW_CFA_def_cfa_offset */
.uleb128 8
.byte 0x85, 0x02 /* DW_CFA_offset %ebp -8 */
.byte 0x40 + .Lpop_ebp-.Lpush_ebp /* DW_CFA_advance_loc */
.byte 0xc5 /* DW_CFA_restore %ebp */
.byte 0x0e /* DW_CFA_def_cfa_offset */
.uleb128 4
.align 4
.LENDFDE1:
.long .LENDFDE2-.LSTARTFDE2 /* Length FDE */
.LSTARTFDE2:
.long .LSTARTFDE2-.LSTARTFRAME /* CIE pointer */
/* HACK: The dwarf2 unwind routines will subtract 1 from the
return address to get an address in the middle of the
presumed call instruction. Since we didn't get here via
a call, we need to include the nop before the real start
to make up for it. */
.long .LSTART_sigreturn-1-. /* PC-relative start address */
.long .LEND_sigreturn-.LSTART_sigreturn+1
.uleb128 0 /* Augmentation length */
/* What follows are the instructions for the table generation.
We record the locations of each register saved. This is
complicated by the fact that the "CFA" is always assumed to
be the value of the stack pointer in the caller. This means
that we must define the CFA of this body of code to be the
saved value of the stack pointer in the sigcontext. Which
also means that there is no fixed relation to the other
saved registers, which means that we must use DW_CFA_expression
to compute their addresses. It also means that when we
adjust the stack with the popl, we have to do it all over again. */
#define do_cfa_expr(offset) \
.byte 0x0f; /* DW_CFA_def_cfa_expression */ \
.uleb128 1f-0f; /* length */ \
0: .byte 0x74; /* DW_OP_breg4 */ \
.sleb128 offset; /* offset */ \
.byte 0x06; /* DW_OP_deref */ \
1:
#define do_expr(regno, offset) \
.byte 0x10; /* DW_CFA_expression */ \
.uleb128 regno; /* regno */ \
.uleb128 1f-0f; /* length */ \
0: .byte 0x74; /* DW_OP_breg4 */ \
.sleb128 offset; /* offset */ \
1:
do_cfa_expr(IA32_SIGCONTEXT_esp+4)
do_expr(0, IA32_SIGCONTEXT_eax+4)
do_expr(1, IA32_SIGCONTEXT_ecx+4)
do_expr(2, IA32_SIGCONTEXT_edx+4)
do_expr(3, IA32_SIGCONTEXT_ebx+4)
do_expr(5, IA32_SIGCONTEXT_ebp+4)
do_expr(6, IA32_SIGCONTEXT_esi+4)
do_expr(7, IA32_SIGCONTEXT_edi+4)
do_expr(8, IA32_SIGCONTEXT_eip+4)
.byte 0x42 /* DW_CFA_advance_loc 2 -- nop; popl eax. */
do_cfa_expr(IA32_SIGCONTEXT_esp)
do_expr(0, IA32_SIGCONTEXT_eax)
do_expr(1, IA32_SIGCONTEXT_ecx)
do_expr(2, IA32_SIGCONTEXT_edx)
do_expr(3, IA32_SIGCONTEXT_ebx)
do_expr(5, IA32_SIGCONTEXT_ebp)
do_expr(6, IA32_SIGCONTEXT_esi)
do_expr(7, IA32_SIGCONTEXT_edi)
do_expr(8, IA32_SIGCONTEXT_eip)
.align 4
.LENDFDE2:
.long .LENDFDE3-.LSTARTFDE3 /* Length FDE */
.LSTARTFDE3:
.long .LSTARTFDE3-.LSTARTFRAME /* CIE pointer */
/* HACK: See above wrt unwind library assumptions. */
.long .LSTART_rt_sigreturn-1-. /* PC-relative start address */
.long .LEND_rt_sigreturn-.LSTART_rt_sigreturn+1
.uleb128 0 /* Augmentation */
/* What follows are the instructions for the table generation.
We record the locations of each register saved. This is
slightly less complicated than the above, since we don't
modify the stack pointer in the process. */
do_cfa_expr(IA32_RT_SIGFRAME_sigcontext-4 + IA32_SIGCONTEXT_esp)
do_expr(0, IA32_RT_SIGFRAME_sigcontext-4 + IA32_SIGCONTEXT_eax)
do_expr(1, IA32_RT_SIGFRAME_sigcontext-4 + IA32_SIGCONTEXT_ecx)
do_expr(2, IA32_RT_SIGFRAME_sigcontext-4 + IA32_SIGCONTEXT_edx)
do_expr(3, IA32_RT_SIGFRAME_sigcontext-4 + IA32_SIGCONTEXT_ebx)
do_expr(5, IA32_RT_SIGFRAME_sigcontext-4 + IA32_SIGCONTEXT_ebp)
do_expr(6, IA32_RT_SIGFRAME_sigcontext-4 + IA32_SIGCONTEXT_esi)
do_expr(7, IA32_RT_SIGFRAME_sigcontext-4 + IA32_SIGCONTEXT_edi)
do_expr(8, IA32_RT_SIGFRAME_sigcontext-4 + IA32_SIGCONTEXT_eip)
.align 4
.LENDFDE3:
/*
* Linker script for vsyscall DSO. The vsyscall page is an ELF shared
* object prelinked to its virtual address. This script controls its layout.
*/
/* This must match <asm/fixmap.h>. */
VSYSCALL_BASE = 0xffffe000;
SECTIONS
{
. = VSYSCALL_BASE + SIZEOF_HEADERS;
.hash : { *(.hash) } :text
.dynsym : { *(.dynsym) }
.dynstr : { *(.dynstr) }
.gnu.version : { *(.gnu.version) }
.gnu.version_d : { *(.gnu.version_d) }
.gnu.version_r : { *(.gnu.version_r) }
/* This linker script is used both with -r and with -shared.
For the layouts to match, we need to skip more than enough
space for the dynamic symbol table et al. If this amount
is insufficient, ld -shared will barf. Just increase it here. */
. = VSYSCALL_BASE + 0x400;
.text.vsyscall : { *(.text.vsyscall) } :text =0x90909090
/* This is an 32bit object and we cannot easily get the offsets
into the 64bit kernel. Just hardcode them here. This assumes
that all the stubs don't need more than 0x100 bytes. */
. = VSYSCALL_BASE + 0x500;
.text.sigreturn : { *(.text.sigreturn) } :text =0x90909090
. = VSYSCALL_BASE + 0x600;
.text.rtsigreturn : { *(.text.rtsigreturn) } :text =0x90909090
.eh_frame_hdr : { *(.eh_frame_hdr) } :text :eh_frame_hdr
.eh_frame : { KEEP (*(.eh_frame)) } :text
.dynamic : { *(.dynamic) } :text :dynamic
.useless : {
*(.got.plt) *(.got)
*(.data .data.* .gnu.linkonce.d.*)
*(.dynbss)
*(.bss .bss.* .gnu.linkonce.b.*)
} :text
}
/*
* We must supply the ELF program headers explicitly to get just one
* PT_LOAD segment, and set the flags explicitly to make segments read-only.
*/
PHDRS
{
text PT_LOAD FILEHDR PHDRS FLAGS(5); /* PF_R|PF_X */
dynamic PT_DYNAMIC FLAGS(4); /* PF_R */
eh_frame_hdr 0x6474e550; /* PT_GNU_EH_FRAME, but ld doesn't match the name */
}
/*
* This controls what symbols we export from the DSO.
*/
VERSION
{
LINUX_2.5 {
global:
__kernel_vsyscall;
__kernel_sigreturn;
__kernel_rt_sigreturn;
local: *;
};
}
/* The ELF entry point can be used to set the AT_SYSINFO value. */
ENTRY(__kernel_vsyscall);
......@@ -7,7 +7,7 @@ EXTRA_AFLAGS := -traditional
obj-y := process.o semaphore.o signal.o entry.o traps.o irq.o \
ptrace.o i8259.o ioport.o ldt.o setup.o time.o sys_x86_64.o \
pci-dma.o x8664_ksyms.o i387.o syscall.o vsyscall.o \
setup64.o bluesmoke.o bootflag.o e820.o reboot.o
setup64.o bluesmoke.o bootflag.o e820.o reboot.o warmreboot.o
obj-$(CONFIG_MTRR) += mtrr/
obj-$(CONFIG_ACPI) += acpi/
......
This diff is collapsed.
......@@ -94,12 +94,6 @@ wakeup_32:
movw %ax, %ss
mov $(wakeup_stack - __START_KERNEL_map), %esp
call 1f
1: popl %eax
movl $0xb8040, %ebx
call early_print
movl saved_magic - __START_KERNEL_map, %eax
cmpl $0x9abcdef0, %eax
jne bogus_32_magic
......@@ -115,11 +109,7 @@ wakeup_32:
movl %eax, %cr4
/* Setup early boot stage 4 level pagetables */
#if 1
movl $(wakeup_level4_pgt - __START_KERNEL_map), %eax
#else
movl saved_cr3 - __START_KERNEL_map, %eax
#endif
movl %eax, %cr3
/* Setup EFER (Extended Feature Enable Register) */
......@@ -223,19 +213,6 @@ wakeup_long64:
.code32
early_print:
movl $16, %edx
1:
movl %eax, %ecx
andl $0xf, %ecx
shrl $4, %eax
addw $0x0e00 + '0', %ecx
movw %ecx, %ds:(%edx, %ebx)
decl %edx
decl %edx
jnz 1b
ret
.align 64
gdta:
.word 0, 0, 0, 0 # dummy
......
......@@ -385,10 +385,10 @@ void __init setup_local_APIC (void)
value = apic_read(APIC_LVT0) & APIC_LVT_MASKED;
if (!smp_processor_id() && (pic_mode || !value)) {
value = APIC_DM_EXTINT;
printk("enabled ExtINT on CPU#%d\n", smp_processor_id());
Dprintk(KERN_INFO "enabled ExtINT on CPU#%d\n", smp_processor_id());
} else {
value = APIC_DM_EXTINT | APIC_LVT_MASKED;
printk("masked ExtINT on CPU#%d\n", smp_processor_id());
Dprintk(KERN_INFO "masked ExtINT on CPU#%d\n", smp_processor_id());
}
apic_write_around(APIC_LVT0, value);
......
......@@ -12,6 +12,7 @@
#include <asm/processor.h>
#include <asm/segment.h>
#include <asm/thread_info.h>
#include <asm/ia32.h>
#define DEFINE(sym, val) \
asm volatile("\n->" #sym " %0 " #val : : "i" (val))
......@@ -43,5 +44,21 @@ int main(void)
ENTRY(irqstackptr);
BLANK();
#undef ENTRY
#define ENTRY(entry) DEFINE(IA32_SIGCONTEXT_ ## entry, offsetof(struct sigcontext_ia32, entry))
ENTRY(eax);
ENTRY(ebx);
ENTRY(ecx);
ENTRY(edx);
ENTRY(esi);
ENTRY(edi);
ENTRY(ebp);
ENTRY(esp);
ENTRY(eip);
BLANK();
#undef ENTRY
DEFINE(IA32_RT_SIGFRAME_sigcontext,
offsetof (struct rt_sigframe32, uc.uc_mcontext));
BLANK();
return 0;
}
......@@ -129,12 +129,75 @@ static struct pci_dev *find_k8_nb(void)
int cpu = smp_processor_id();
pci_for_each_dev(dev) {
if (dev->bus->number==0 && PCI_FUNC(dev->devfn)==3 &&
PCI_SLOT(dev->devfn) == (24+cpu))
PCI_SLOT(dev->devfn) == (24U+cpu))
return dev;
}
return NULL;
}
/* When we have kallsyms we can afford kmcedecode too. */
static char *transaction[] = {
"instruction", "data", "generic", "reserved"
};
static char *cachelevel[] = {
"level 0", "level 1", "level 2", "level generic"
};
static char *memtrans[] = {
"generic error", "generic read", "generic write", "data read",
"data write", "instruction fetch", "prefetch", "snoop",
"?", "?", "?", "?", "?", "?", "?"
};
static char *partproc[] = {
"local node origin", "local node response",
"local node observed", "generic"
};
static char *timeout[] = {
"request didn't time out",
"request timed out"
};
static char *memoryio[] = {
"memory access", "res.", "i/o access", "generic"
};
static char *extendederr[] = {
"ecc error",
"crc error",
"sync error",
"mst abort",
"tgt abort",
"gart error",
"rmw error",
"wdog error",
"chipkill ecc error",
"<9>","<10>","<11>","<12>",
"<13>","<14>","<15>"
};
static char *highbits[32] = {
[31] = "previous error lost",
[30] = "error overflow",
[29] = "error uncorrected",
[28] = "error enable",
[27] = "misc error valid",
[26] = "error address valid",
[25] = "processor context corrupt",
[24] = "res24",
[23] = "res23",
/* 22-15 ecc syndrome bits */
[14] = "corrected ecc error",
[13] = "uncorrected ecc error",
[12] = "res12",
[11] = "res11",
[10] = "res10",
[9] = "res9",
[8] = "dram scrub error",
[7] = "res7",
/* 6-4 ht link number of error */
[3] = "res3",
[2] = "res2",
[1] = "err cpu0",
[0] = "err cpu1",
};
static void check_k8_nb(void)
{
struct pci_dev *nb;
......@@ -149,20 +212,52 @@ static void check_k8_nb(void)
return;
printk(KERN_ERR "Northbridge status %08x%08x\n",
statushigh,statuslow);
if (statuslow & 0x10)
printk(KERN_ERR "GART error %d\n", statuslow & 0xf);
if (statushigh & (1<<31))
printk(KERN_ERR "Lost an northbridge error\n");
if (statushigh & (1<<25))
printk(KERN_EMERG "NB status: unrecoverable\n");
unsigned short errcode = statuslow & 0xffff;
switch (errcode >> 8) {
case 0:
printk(KERN_ERR " GART TLB error %s %s\n",
transaction[(errcode >> 2) & 3],
cachelevel[errcode & 3]);
break;
case 1:
if (errcode & (1<<11)) {
printk(KERN_ERR " bus error %s %s %s %s %s\n",
partproc[(errcode >> 10) & 0x3],
timeout[(errcode >> 9) & 1],
memtrans[(errcode >> 4) & 0xf],
memoryio[(errcode >> 2) & 0x3],
cachelevel[(errcode & 0x3)]);
} else if (errcode & (1<<8)) {
printk(KERN_ERR " memory error %s %s %s\n",
memtrans[(errcode >> 4) & 0xf],
transaction[(errcode >> 2) & 0x3],
cachelevel[(errcode & 0x3)]);
} else {
printk(KERN_ERR " unknown error code %x\n", errcode);
}
break;
}
if (statushigh & ((1<<14)|(1<<13)))
printk(KERN_ERR " ECC syndrome bits %x\n",
(((statuslow >> 24) & 0xff) << 8) | ((statushigh >> 15) & 0x7f));
errcode = (statuslow >> 16) & 0xf;
printk(KERN_ERR " extended error %s\n", extendederr[(statuslow >> 16) & 0xf]);
/* should only print when it was a HyperTransport related error. */
printk(KERN_ERR " link number %x\n", (statushigh >> 4) & 3);
int i;
for (i = 0; i < 32; i++)
if (highbits[i] && (statushigh & (1<<i)))
printk(KERN_ERR " %s\n", highbits[i]);
if (statushigh & (1<<26)) {
u32 addrhigh, addrlow;
pci_read_config_dword(nb, 0x54, &addrhigh);
pci_read_config_dword(nb, 0x50, &addrlow);
printk(KERN_ERR "NB error address %08x%08x\n", addrhigh,addrlow);
printk(KERN_ERR " error address %08x%08x\n", addrhigh,addrlow);
}
if (statushigh & (1<<29))
printk(KERN_EMERG "Error uncorrected\n");
statushigh &= ~(1<<31);
pci_write_config_dword(nb, 0x4c, statushigh);
}
......
......@@ -75,7 +75,7 @@ static inline int bad_addr(unsigned long *addrp, unsigned long size)
return 0;
}
int __init e820_mapped(unsigned long start, unsigned long end, int type)
int __init e820_mapped(unsigned long start, unsigned long end, unsigned type)
{
int i;
for (i = 0; i < e820.nr_map; i++) {
......
......@@ -97,8 +97,8 @@
/*
* A newly forked process directly context switches into this.
*/
/* rdi: prev */
ENTRY(ret_from_fork)
movq %rax,%rdi /* prev task, returned by __switch_to -> arg1 */
call schedule_tail
GET_THREAD_INFO(%rcx)
bt $TIF_SYSCALL_TRACE,threadinfo_flags(%rcx)
......@@ -414,6 +414,7 @@ iret_label:
.previous
.section .fixup,"ax"
/* force a signal here? this matches i386 behaviour */
/* running with kernel gs */
bad_iret:
movq $-9999,%rdi /* better code? */
jmp do_exit
......@@ -519,21 +520,27 @@ ENTRY(spurious_interrupt)
*/
ENTRY(error_entry)
/* rdi slot contains rax, oldrax contains error code */
pushq %rsi
movq 8(%rsp),%rsi /* load rax */
pushq %rdx
pushq %rcx
pushq %rsi /* store rax */
pushq %r8
pushq %r9
pushq %r10
pushq %r11
cld
SAVE_REST
subq $14*8,%rsp
movq %rsi,13*8(%rsp)
movq 14*8(%rsp),%rsi /* load rax from rdi slot */
movq %rdx,12*8(%rsp)
movq %rcx,11*8(%rsp)
movq %rsi,10*8(%rsp) /* store rax */
movq %r8, 9*8(%rsp)
movq %r9, 8*8(%rsp)
movq %r10,7*8(%rsp)
movq %r11,6*8(%rsp)
movq %rbx,5*8(%rsp)
movq %rbp,4*8(%rsp)
movq %r12,3*8(%rsp)
movq %r13,2*8(%rsp)
movq %r14,1*8(%rsp)
movq %r15,(%rsp)
xorl %ebx,%ebx
testl $3,CS(%rsp)
je error_kernelspace
error_swapgs:
xorl %ebx,%ebx
swapgs
error_sti:
movq %rdi,RDI(%rsp)
......@@ -557,13 +564,14 @@ error_exit:
iretq
error_kernelspace:
incl %ebx
/* There are two places in the kernel that can potentially fault with
usergs. Handle them here. */
usergs. Handle them here. The exception handlers after
iret run with kernel gs again, so don't set the user space flag. */
cmpq $iret_label,RIP(%rsp)
je error_swapgs
cmpq $gs_change,RIP(%rsp)
je error_swapgs
movl $1,%ebx
jmp error_sti
/* Reload gs selector with exception handling */
......@@ -584,7 +592,9 @@ gs_change:
.quad gs_change,bad_gs
.previous
.section .fixup,"ax"
/* running with kernelgs */
bad_gs:
swapgs /* switch back to user gs */
xorl %eax,%eax
movl %eax,%gs
jmp 2b
......@@ -614,12 +624,8 @@ ENTRY(kernel_thread)
# clone now
call do_fork
movq %rax,RAX(%rsp)
xorl %edi,%edi
cmpq $-1000,%rax
jnb 1f
movl tsk_pid(%rax),%eax
1: movq %rax,RAX(%rsp)
/*
* It isn't worth to check for reschedule here,
......
......@@ -351,7 +351,7 @@ gdt32_end:
ENTRY(cpu_gdt_table)
.quad 0x0000000000000000 /* NULL descriptor */
.quad 0x0000000000000000 /* unused */
.quad 0x00af9a000000ffff ^ (1<<21) /* __KERNEL_COMPAT32_CS */
.quad 0x00af9a000000ffff /* __KERNEL_CS */
.quad 0x00cf92000000ffff /* __KERNEL_DS */
.quad 0x00cffe000000ffff /* __USER32_CS */
......
......@@ -1060,7 +1060,8 @@ static void __init setup_ioapic_ids_from_mpc (void)
phys_id_present_map |= 1 << i;
mp_ioapics[apic].mpc_apicid = i;
} else {
printk("Setting %d in the phys_id_present_map\n", mp_ioapics[apic].mpc_apicid);
printk(KERN_INFO
"Using IO-APIC %d\n", mp_ioapics[apic].mpc_apicid);
phys_id_present_map |= 1 << mp_ioapics[apic].mpc_apicid;
}
......
/*
* linux/arch/i386/kernel/ioport.c
* linux/arch/x86_64/kernel/ioport.c
*
* This contains the io-permission bitmap code - written by obz, with changes
* by Linus.
......@@ -15,34 +15,35 @@
#include <linux/smp_lock.h>
#include <linux/stddef.h>
#include <linux/slab.h>
#include <asm/io.h>
/* Set EXTENT bits starting at BASE in BITMAP to value TURN_ON. */
static void set_bitmap(unsigned long *bitmap, short base, short extent, int new_value)
{
int mask;
unsigned long *bitmap_base = bitmap + (base >> 6);
unsigned long mask;
unsigned long *bitmap_base = bitmap + (base / sizeof(unsigned long));
unsigned short low_index = base & 0x3f;
int length = low_index + extent;
if (low_index != 0) {
mask = (~0 << low_index);
mask = (~0UL << low_index);
if (length < 64)
mask &= ~(~0 << length);
mask &= ~(~0UL << length);
if (new_value)
*bitmap_base++ |= mask;
else
*bitmap_base++ &= ~mask;
length -= 32;
length -= 64;
}
mask = (new_value ? ~0 : 0);
mask = (new_value ? ~0UL : 0UL);
while (length >= 64) {
*bitmap_base++ = mask;
length -= 64;
}
if (length > 0) {
mask = ~(~0 << length);
mask = ~(~0UL << length);
if (new_value)
*bitmap_base++ |= mask;
else
......@@ -113,3 +114,10 @@ asmlinkage long sys_iopl(unsigned int level, struct pt_regs regs)
regs.eflags = (regs.eflags & 0xffffffffffffcfff) | (level << 12);
return 0;
}
void eat_key(void)
{
if (inb(0x60) & 1)
inb(0x64);
}
......@@ -795,7 +795,7 @@ static unsigned int parse_hex_value (const char *buffer,
{
unsigned char hexnum [HEX_DIGITS];
unsigned long value;
int i;
unsigned i;
if (!count)
return -EINVAL;
......
......@@ -32,13 +32,13 @@ static void flush_ldt(void *null)
}
#endif
static int alloc_ldt(mm_context_t *pc, int mincount, int reload)
static int alloc_ldt(mm_context_t *pc, unsigned mincount, int reload)
{
void *oldldt;
void *newldt;
int oldsize;
unsigned oldsize;
if (mincount <= pc->size)
if (mincount <= (unsigned)pc->size)
return 0;
oldsize = pc->size;
mincount = (mincount+511)&(~511);
......@@ -63,7 +63,7 @@ static int alloc_ldt(mm_context_t *pc, int mincount, int reload)
#ifdef CONFIG_SMP
preempt_disable();
load_LDT(pc);
if (current->mm->cpu_vm_mask != (1<<smp_processor_id()))
if (current->mm->cpu_vm_mask != (1UL<<smp_processor_id()))
smp_call_function(flush_ldt, 0, 1, 1);
preempt_enable();
#else
......@@ -116,7 +116,7 @@ int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
void destroy_context(struct mm_struct *mm)
{
if (mm->context.size) {
if (mm->context.size*LDT_ENTRY_SIZE > PAGE_SIZE)
if ((unsigned)mm->context.size*LDT_ENTRY_SIZE > PAGE_SIZE)
vfree(mm->context.ldt);
else
kfree(mm->context.ldt);
......@@ -190,7 +190,7 @@ static int write_ldt(void * ptr, unsigned long bytecount, int oldmode)
}
down(&mm->context.sem);
if (ldt_info.entry_number >= mm->context.size) {
if (ldt_info.entry_number >= (unsigned)mm->context.size) {
error = alloc_ldt(&current->mm->context, ldt_info.entry_number+1, 1);
if (error < 0)
goto out_unlock;
......
......@@ -892,11 +892,15 @@ void __init mp_parse_prt (void)
list_for_each(node, &acpi_prt.entries) {
entry = list_entry(node, struct acpi_prt_entry, node);
/* We're only interested in static (non-link) entries. */
if (entry->link.handle)
/* Need to get irq for dynamic entry */
if (entry->link.handle) {
irq = acpi_pci_link_get_irq(entry->link.handle, entry->link.index);
if (!irq)
continue;
}
else
irq = entry->link.index;
ioapic = mp_find_ioapic(irq);
if (ioapic < 0)
continue;
......
......@@ -32,7 +32,6 @@ int pci_map_sg(struct pci_dev *hwdev, struct scatterlist *sg,
int i;
BUG_ON(direction == PCI_DMA_NONE);
for (i = 0; i < nents; i++ ) {
struct scatterlist *s = &sg[i];
......
......@@ -14,15 +14,10 @@
/*
* Notebook:
agpgart_be
check if the simple reservation scheme is enough.
possible future tuning:
fast path for sg streaming mappings
more intelligent flush strategy - flush only a single NB? flush only when
gart area fills up and alloc_iommu wraps.
don't flush on allocation - need to unmap the gart area first to avoid prefetches
by the CPU
fast path for sg streaming mappings - only take the locks once.
more intelligent flush strategy - flush only the NB of the CPU directly
connected to the device?
move boundary between IOMMU and AGP in GART dynamically
*/
......@@ -67,7 +62,8 @@ static unsigned long *iommu_gart_bitmap; /* guarded by iommu_bitmap_lock */
#define GPTE_VALID 1
#define GPTE_COHERENT 2
#define GPTE_ENCODE(x) (((x) & 0xfffff000) | (((x) >> 32) << 4) | GPTE_VALID | GPTE_COHERENT)
#define GPTE_ENCODE(x) \
(((x) & 0xfffff000) | (((x) >> 32) << 4) | GPTE_VALID | GPTE_COHERENT)
#define GPTE_DECODE(x) (((x) & 0xfffff000) | (((u64)(x) & 0xff0) << 28))
#define for_all_nb(dev) \
......@@ -90,20 +86,23 @@ AGPEXTERN __u32 *agp_gatt_table;
static unsigned long next_bit; /* protected by iommu_bitmap_lock */
static unsigned long alloc_iommu(int size)
static unsigned long alloc_iommu(int size, int *flush)
{
unsigned long offset, flags;
spin_lock_irqsave(&iommu_bitmap_lock, flags);
offset = find_next_zero_string(iommu_gart_bitmap,next_bit,iommu_pages,size);
if (offset == -1)
if (offset == -1) {
*flush = 1;
offset = find_next_zero_string(iommu_gart_bitmap,0,next_bit,size);
}
if (offset != -1) {
set_bit_string(iommu_gart_bitmap, offset, size);
next_bit = offset+size;
if (next_bit >= iommu_pages)
if (next_bit >= iommu_pages) {
next_bit = 0;
*flush = 1;
}
}
spin_unlock_irqrestore(&iommu_bitmap_lock, flags);
return offset;
......@@ -114,7 +113,6 @@ static void free_iommu(unsigned long offset, int size)
unsigned long flags;
spin_lock_irqsave(&iommu_bitmap_lock, flags);
clear_bit_string(iommu_gart_bitmap, offset, size);
next_bit = offset;
spin_unlock_irqrestore(&iommu_bitmap_lock, flags);
}
......@@ -137,6 +135,7 @@ void *pci_alloc_consistent(struct pci_dev *hwdev, size_t size,
int gfp = GFP_ATOMIC;
int i;
unsigned long iommu_page;
int flush = 0;
if (hwdev == NULL || hwdev->dma_mask < 0xffffffff || no_iommu)
gfp |= GFP_DMA;
......@@ -150,9 +149,10 @@ void *pci_alloc_consistent(struct pci_dev *hwdev, size_t size,
if (memory == NULL) {
return NULL;
} else {
int high = (unsigned long)virt_to_bus(memory) + size
>= 0xffffffff;
int mmu = high;
int high = 0, mmu;
if (((unsigned long)virt_to_bus(memory) + size) > 0xffffffffUL)
high = 1;
mmu = 1;
if (force_mmu && !(gfp & GFP_DMA))
mmu = 1;
if (no_iommu) {
......@@ -168,7 +168,7 @@ void *pci_alloc_consistent(struct pci_dev *hwdev, size_t size,
size >>= PAGE_SHIFT;
iommu_page = alloc_iommu(size);
iommu_page = alloc_iommu(size, &flush);
if (iommu_page == -1)
goto error;
......@@ -183,6 +183,7 @@ void *pci_alloc_consistent(struct pci_dev *hwdev, size_t size,
iommu_gatt_base[iommu_page + i] = GPTE_ENCODE(phys_mem);
}
if (flush)
flush_gart();
*dma_handle = iommu_bus_base + (iommu_page << PAGE_SHIFT);
return memory;
......@@ -199,25 +200,24 @@ void *pci_alloc_consistent(struct pci_dev *hwdev, size_t size,
void pci_free_consistent(struct pci_dev *hwdev, size_t size,
void *vaddr, dma_addr_t bus)
{
u64 pte;
unsigned long iommu_page;
int i;
size = round_up(size, PAGE_SIZE);
if (bus < iommu_bus_base || bus > iommu_bus_base + iommu_size) {
free_pages((unsigned long)vaddr, get_order(size));
return;
}
size >>= PAGE_SHIFT;
iommu_page = (bus - iommu_bus_base) / PAGE_SIZE;
for (i = 0; i < size; i++) {
pte = iommu_gatt_base[iommu_page + i];
if (bus >= iommu_bus_base && bus <= iommu_bus_base + iommu_size) {
unsigned pages = size >> PAGE_SHIFT;
iommu_page = (bus - iommu_bus_base) >> PAGE_SHIFT;
vaddr = __va(GPTE_DECODE(iommu_gatt_base[iommu_page]));
#ifdef CONFIG_IOMMU_DEBUG
int i;
for (i = 0; i < pages; i++) {
u64 pte = iommu_gatt_base[iommu_page + i];
BUG_ON((pte & GPTE_VALID) == 0);
iommu_gatt_base[iommu_page + i] = 0;
free_page((unsigned long) __va(GPTE_DECODE(pte)));
}
flush_gart();
free_iommu(iommu_page, size);
#endif
free_iommu(iommu_page, pages);
}
free_pages((unsigned long)vaddr, get_order(size));
}
#ifdef CONFIG_IOMMU_LEAK
......@@ -257,7 +257,7 @@ static void iommu_full(struct pci_dev *dev, void *addr, size_t size, int dir)
*/
printk(KERN_ERR
"PCI-DMA: Error: ran out out IOMMU space for %p size %lu at device %s[%s]\n",
"PCI-DMA: Out of IOMMU space for %p size %lu at device %s[%s]\n",
addr,size, dev ? dev->dev.name : "?", dev ? dev->slot_name : "?");
if (size > PAGE_SIZE*EMERGENCY_PAGES) {
......@@ -287,12 +287,12 @@ static inline int need_iommu(struct pci_dev *dev, unsigned long addr, size_t siz
return mmu;
}
dma_addr_t __pci_map_single(struct pci_dev *dev, void *addr, size_t size,
int dir, int flush)
dma_addr_t pci_map_single(struct pci_dev *dev, void *addr, size_t size, int dir)
{
unsigned long iommu_page;
unsigned long phys_mem, bus;
int i, npages;
int flush = 0;
BUG_ON(dir == PCI_DMA_NONE);
......@@ -302,7 +302,7 @@ dma_addr_t __pci_map_single(struct pci_dev *dev, void *addr, size_t size,
npages = round_up(size + ((u64)addr & ~PAGE_MASK), PAGE_SIZE) >> PAGE_SHIFT;
iommu_page = alloc_iommu(npages);
iommu_page = alloc_iommu(npages, &flush);
if (iommu_page == -1) {
iommu_full(dev, addr, size, dir);
return iommu_bus_base;
......@@ -343,12 +343,14 @@ void pci_unmap_single(struct pci_dev *hwdev, dma_addr_t dma_addr,
size_t size, int direction)
{
unsigned long iommu_page;
int i, npages;
int npages;
if (dma_addr < iommu_bus_base + EMERGENCY_PAGES*PAGE_SIZE ||
dma_addr > iommu_bus_base + iommu_size)
return;
iommu_page = (dma_addr - iommu_bus_base)>>PAGE_SHIFT;
npages = round_up(size + (dma_addr & ~PAGE_MASK), PAGE_SIZE) >> PAGE_SHIFT;
#ifdef CONFIG_IOMMU_DEBUG
int i;
for (i = 0; i < npages; i++) {
iommu_gatt_base[iommu_page + i] = 0;
#ifdef CONFIG_IOMMU_LEAK
......@@ -356,11 +358,11 @@ void pci_unmap_single(struct pci_dev *hwdev, dma_addr_t dma_addr,
iommu_leak_tab[iommu_page + i] = 0;
#endif
}
flush_gart();
#endif
free_iommu(iommu_page, npages);
}
EXPORT_SYMBOL(__pci_map_single);
EXPORT_SYMBOL(pci_map_single);
EXPORT_SYMBOL(pci_unmap_single);
static __init unsigned long check_iommu_size(unsigned long aper, u64 aper_size)
......@@ -407,7 +409,7 @@ static __init unsigned read_aperture(struct pci_dev *dev, u32 *size)
* Private Northbridge GATT initialization in case we cannot use the
* AGP driver for some reason.
*/
static __init int init_k8_gatt(agp_kern_info *info)
static __init int init_k8_gatt(struct agp_kern_info *info)
{
struct pci_dev *dev;
void *gatt;
......@@ -443,7 +445,7 @@ static __init int init_k8_gatt(agp_kern_info *info)
u32 ctl;
u32 gatt_reg;
gatt_reg = ((u64)gatt) >> 12;
gatt_reg = __pa(gatt) >> 12;
gatt_reg <<= 4;
pci_write_config_dword(dev, 0x98, gatt_reg);
pci_read_config_dword(dev, 0x90, &ctl);
......@@ -465,9 +467,11 @@ static __init int init_k8_gatt(agp_kern_info *info)
return -1;
}
extern int agp_amdk8_init(void);
void __init pci_iommu_init(void)
{
agp_kern_info info;
struct agp_kern_info info;
unsigned long aper_size;
unsigned long iommu_start;
......@@ -476,7 +480,6 @@ void __init pci_iommu_init(void)
#else
/* Add other K8 AGP bridge drivers here */
no_agp = no_agp ||
(agp_init() < 0) ||
(agp_amdk8_init() < 0) ||
(agp_copy_info(&info) < 0);
#endif
......@@ -536,8 +539,17 @@ void __init pci_iommu_init(void)
iommu_gatt_base = agp_gatt_table + (iommu_start>>PAGE_SHIFT);
bad_dma_address = iommu_bus_base;
change_page_attr(virt_to_page(__va(iommu_start)), iommu_pages, PAGE_KERNEL);
global_flush_tlb();
/*
* Unmap the IOMMU part of the GART. The alias of the page is always mapped
* with cache enabled and there is no full cache coherency across the GART
* remapping. The unmapping avoids automatic prefetches from the CPU
* allocating cache lines in there. All CPU accesses are done via the
* direct mapping to the backing memory. The GART address is only used by PCI
* devices.
*/
clear_kernel_mapping((unsigned long)__va(iommu_bus_base), iommu_size);
flush_gart();
}
/* iommu=[size][,noagp][,off][,force][,noforce][,leak][,memaper[=order]]
......
......@@ -216,7 +216,12 @@ void flush_thread(void)
{
struct task_struct *tsk = current;
memset(tsk->thread.debugreg, 0, sizeof(unsigned long)*8);
tsk->thread.debugreg0 = 0;
tsk->thread.debugreg1 = 0;
tsk->thread.debugreg2 = 0;
tsk->thread.debugreg3 = 0;
tsk->thread.debugreg6 = 0;
tsk->thread.debugreg7 = 0;
memset(tsk->thread.tls_array, 0, sizeof(tsk->thread.tls_array));
/*
* Forget coprocessor state..
......@@ -285,7 +290,7 @@ int copy_thread(int nr, unsigned long clone_flags, unsigned long rsp,
childregs->rax = 0;
childregs->rsp = rsp;
if (rsp == ~0) {
if (rsp == ~0UL) {
childregs->rsp = (unsigned long)childregs;
}
p->set_child_tid = p->clear_child_tid = NULL;
......@@ -294,7 +299,7 @@ int copy_thread(int nr, unsigned long clone_flags, unsigned long rsp,
p->thread.rsp0 = (unsigned long) (childregs+1);
p->thread.userrsp = me->thread.userrsp;
p->thread.rip = (unsigned long) ret_from_fork;
set_ti_thread_flag(p->thread_info, TIF_FORK);
p->thread.fs = me->thread.fs;
p->thread.gs = me->thread.gs;
......@@ -335,8 +340,7 @@ int copy_thread(int nr, unsigned long clone_flags, unsigned long rsp,
/*
* This special macro can be used to load a debugging register
*/
#define loaddebug(thread,register) \
set_debug(thread->debugreg[register], register)
#define loaddebug(thread,r) set_debug(thread->debugreg ## r, r)
/*
* switch_to(x,y) should switch tasks from x to y.
......@@ -422,7 +426,7 @@ struct task_struct *__switch_to(struct task_struct *prev_p, struct task_struct *
/*
* Now maybe reload the debug registers
*/
if (unlikely(next->debugreg[7])) {
if (unlikely(next->debugreg7)) {
loaddebug(next, 0);
loaddebug(next, 1);
loaddebug(next, 2);
......@@ -490,19 +494,15 @@ void set_personality_64bit(void)
asmlinkage long sys_fork(struct pt_regs regs)
{
struct task_struct *p;
p = do_fork(SIGCHLD, regs.rsp, &regs, 0, NULL, NULL);
return IS_ERR(p) ? PTR_ERR(p) : p->pid;
return do_fork(SIGCHLD, regs.rsp, &regs, 0, NULL, NULL);
}
asmlinkage long sys_clone(unsigned long clone_flags, unsigned long newsp, void *parent_tid, void *child_tid, struct pt_regs regs)
{
struct task_struct *p;
if (!newsp)
newsp = regs.rsp;
p = do_fork(clone_flags & ~CLONE_IDLETASK, newsp, &regs, 0,
return do_fork(clone_flags & ~CLONE_IDLETASK, newsp, &regs, 0,
parent_tid, child_tid);
return IS_ERR(p) ? PTR_ERR(p) : p->pid;
}
/*
......@@ -517,10 +517,8 @@ asmlinkage long sys_clone(unsigned long clone_flags, unsigned long newsp, void *
*/
asmlinkage long sys_vfork(struct pt_regs regs)
{
struct task_struct *p;
p = do_fork(CLONE_VFORK | CLONE_VM | SIGCHLD, regs.rsp, &regs, 0,
return do_fork(CLONE_VFORK | CLONE_VM | SIGCHLD, regs.rsp, &regs, 0,
NULL, NULL);
return IS_ERR(p) ? PTR_ERR(p) : p->pid;
}
/*
......
......@@ -178,11 +178,11 @@ static unsigned long getreg(struct task_struct *child, unsigned long regno)
}
asmlinkage long sys_ptrace(long request, long pid, long addr, long data)
asmlinkage long sys_ptrace(long request, long pid, unsigned long addr, long data)
{
struct task_struct *child;
struct user * dummy = NULL;
long i, ret;
unsigned ui;
/* This lock_kernel fixes a subtle race with suid exec */
lock_kernel();
......@@ -240,18 +240,35 @@ asmlinkage long sys_ptrace(long request, long pid, long addr, long data)
unsigned long tmp;
ret = -EIO;
if ((addr & 7) || addr < 0 ||
if ((addr & 7) ||
addr > sizeof(struct user) - 7)
break;
tmp = 0; /* Default return condition */
if(addr < sizeof(struct user_regs_struct))
switch (addr) {
case 0 ... sizeof(struct user_regs_struct):
tmp = getreg(child, addr);
if(addr >= (long) &dummy->u_debugreg[0] &&
addr <= (long) &dummy->u_debugreg[7]){
addr -= (long) &dummy->u_debugreg[0];
addr = addr >> 3;
tmp = child->thread.debugreg[addr];
break;
case offsetof(struct user, u_debugreg[0]):
tmp = child->thread.debugreg0;
break;
case offsetof(struct user, u_debugreg[1]):
tmp = child->thread.debugreg1;
break;
case offsetof(struct user, u_debugreg[2]):
tmp = child->thread.debugreg2;
break;
case offsetof(struct user, u_debugreg[3]):
tmp = child->thread.debugreg3;
break;
case offsetof(struct user, u_debugreg[6]):
tmp = child->thread.debugreg6;
break;
case offsetof(struct user, u_debugreg[7]):
tmp = child->thread.debugreg7;
break;
default:
tmp = 0;
break;
}
ret = put_user(tmp,(unsigned long *) data);
break;
......@@ -268,47 +285,53 @@ asmlinkage long sys_ptrace(long request, long pid, long addr, long data)
case PTRACE_POKEUSR: /* write the word at location addr in the USER area */
ret = -EIO;
if ((addr & 7) || addr < 0 ||
if ((addr & 7) ||
addr > sizeof(struct user) - 7)
break;
if (addr < sizeof(struct user_regs_struct)) {
switch (addr) {
case 0 ... sizeof(struct user_regs_struct):
ret = putreg(child, addr, data);
break;
}
/* We need to be very careful here. We implicitly
want to modify a portion of the task_struct, and we
have to be selective about what portions we allow someone
to modify. */
ret = -EIO;
if(addr >= (long) &dummy->u_debugreg[0] &&
addr <= (long) &dummy->u_debugreg[7]){
if(addr == (long) &dummy->u_debugreg[4]) break;
if(addr == (long) &dummy->u_debugreg[5]) break;
if(addr < (long) &dummy->u_debugreg[4] &&
((unsigned long) data) >= TASK_SIZE-3) break;
if (addr == (long) &dummy->u_debugreg[6]) {
/* Disallows to set a breakpoint into the vsyscall */
case offsetof(struct user, u_debugreg[0]):
if (data >= TASK_SIZE-7) break;
child->thread.debugreg0 = data;
ret = 0;
break;
case offsetof(struct user, u_debugreg[1]):
if (data >= TASK_SIZE-7) break;
child->thread.debugreg1 = data;
ret = 0;
break;
case offsetof(struct user, u_debugreg[2]):
if (data >= TASK_SIZE-7) break;
child->thread.debugreg2 = data;
ret = 0;
break;
case offsetof(struct user, u_debugreg[3]):
if (data >= TASK_SIZE-7) break;
child->thread.debugreg3 = data;
ret = 0;
break;
case offsetof(struct user, u_debugreg[6]):
if (data >> 32)
goto out_tsk;
}
if(addr == (long) &dummy->u_debugreg[7]) {
break;
child->thread.debugreg6 = data;
ret = 0;
break;
case offsetof(struct user, u_debugreg[7]):
data &= ~DR_CONTROL_RESERVED;
for(i=0; i<4; i++)
if ((0x5454 >> ((data >> (16 + 4*i)) & 0xf)) & 1)
goto out_tsk;
}
addr -= (long) &dummy->u_debugreg;
addr = addr >> 3;
child->thread.debugreg[addr] = data;
break;
if (i == 4) {
child->thread.debugreg7 = data;
ret = 0;
}
break;
}
break;
case PTRACE_SYSCALL: /* continue and stop at next (return from) syscall */
case PTRACE_CONT: { /* restart after signal. */
long tmp;
......@@ -408,8 +431,8 @@ asmlinkage long sys_ptrace(long request, long pid, long addr, long data)
ret = -EIO;
break;
}
for ( i = 0; i < sizeof(struct user_regs_struct); i += sizeof(long) ) {
__put_user(getreg(child, i),(unsigned long *) data);
for (ui = 0; ui < sizeof(struct user_regs_struct); ui += sizeof(long)) {
__put_user(getreg(child, ui),(unsigned long *) data);
data += sizeof(long);
}
ret = 0;
......@@ -422,9 +445,9 @@ asmlinkage long sys_ptrace(long request, long pid, long addr, long data)
ret = -EIO;
break;
}
for ( i = 0; i < sizeof(struct user_regs_struct); i += sizeof(long) ) {
for (ui = 0; ui < sizeof(struct user_regs_struct); ui += sizeof(long)) {
__get_user(tmp, (unsigned long *) data);
putreg(child, i, tmp);
putreg(child, ui, tmp);
data += sizeof(long);
}
ret = 0;
......
......@@ -9,7 +9,9 @@
#include <asm/kdebug.h>
#include <asm/delay.h>
#include <asm/hw_irq.h>
#include <asm/system.h>
#include <asm/pgtable.h>
#include <asm/tlbflush.h>
/*
* Power off function, if any
......@@ -17,35 +19,37 @@
void (*pm_power_off)(void);
static long no_idt[3];
static int reboot_mode;
#ifdef CONFIG_SMP
int reboot_smp = 0;
static int reboot_cpu = -1;
#endif
static enum {
BOOT_BIOS = 'b',
BOOT_TRIPLE = 't',
BOOT_KBD = 'k'
} reboot_type = BOOT_KBD;
static int reboot_mode = 0;
/* reboot=b[ios] | t[riple] | k[bd] [, [w]arm | [c]old]
bios Use the CPU reboto vector for warm reset
warm Don't set the cold reboot flag
cold Set the cold reboto flag
triple Force a triple fault (init)
kbd Use the keyboard controller. cold reset (default)
*/
static int __init reboot_setup(char *str)
{
while(1) {
for (;;) {
switch (*str) {
case 'w': /* "warm" reboot (no memory testing etc) */
case 'w':
reboot_mode = 0x1234;
break;
case 'c': /* "cold" reboot (with memory testing etc) */
reboot_mode = 0x0;
case 'c':
reboot_mode = 0;
break;
#ifdef CONFIG_SMP
case 's': /* "smp" reboot by executing reset on BSP or other CPU*/
reboot_smp = 1;
if (isdigit(str[1]))
sscanf(str+1, "%d", &reboot_cpu);
else if (!strncmp(str,"smp",3))
sscanf(str+3, "%d", &reboot_cpu);
/* we will leave sorting out the final value
when we are ready to reboot, since we might not
have set up boot_cpu_id or smp_num_cpu */
case 't':
case 'b':
case 'k':
reboot_type = *str;
break;
#endif
}
if((str = strchr(str,',')) != NULL)
str++;
......@@ -57,6 +61,56 @@ static int __init reboot_setup(char *str)
__setup("reboot=", reboot_setup);
/* overwrites random kernel memory. Should not be kernel .text */
#define WARMBOOT_TRAMP 0x1000UL
static void reboot_warm(void)
{
extern unsigned char warm_reboot[], warm_reboot_end[];
printk("warm reboot\n");
local_irq_disable();
/* restore identity mapping */
init_level4_pgt[0] = __pml4(__pa(level3_ident_pgt) | 7);
__flush_tlb_all();
/* Move the trampoline to low memory */
memcpy(__va(WARMBOOT_TRAMP), warm_reboot, warm_reboot_end - warm_reboot);
/* Start it in compatibility mode. */
asm volatile( " pushq $0\n" /* ss */
" pushq $0x2000\n" /* rsp */
" pushfq\n" /* eflags */
" pushq %[cs]\n"
" pushq %[target]\n"
" iretq" ::
[cs] "i" (__KERNEL_COMPAT32_CS),
[target] "b" (WARMBOOT_TRAMP));
}
#ifdef CONFIG_SMP
static void smp_halt(void)
{
int cpuid = safe_smp_processor_id();
/* Only run this on the boot processor */
if (cpuid != boot_cpu_id) {
static int first_entry = 1;
if (first_entry) {
first_entry = 0;
smp_call_function((void *)machine_restart, NULL, 1, 0);
} else {
/* AP reentering. just halt */
for(;;)
asm volatile("hlt");
}
}
smp_send_stop();
}
#endif
static inline void kb_wait(void)
{
int i;
......@@ -68,48 +122,24 @@ static inline void kb_wait(void)
void machine_restart(char * __unused)
{
#ifdef CONFIG_SMP
int cpuid;
cpuid = GET_APIC_ID(apic_read(APIC_ID));
if (reboot_smp) {
/* check to see if reboot_cpu is valid
if its not, default to the BSP */
if ((reboot_cpu == -1) ||
(reboot_cpu > (NR_CPUS -1)) ||
!(phys_cpu_present_map & (1<<cpuid)))
reboot_cpu = boot_cpu_id;
reboot_smp = 0; /* use this as a flag to only go through this once*/
/* re-run this function on the other CPUs
it will fall though this section since we have
cleared reboot_smp, and do the reboot if it is the
correct CPU, otherwise it halts. */
if (reboot_cpu != cpuid)
smp_call_function((void *)machine_restart , NULL, 1, 0);
}
int i;
/* if reboot_cpu is still -1, then we want a tradional reboot,
and if we are not running on the reboot_cpu,, halt */
if ((reboot_cpu != -1) && (cpuid != reboot_cpu)) {
for (;;)
__asm__ __volatile__ ("hlt");
}
/*
* Stop all CPUs and turn off local APICs and the IO-APIC, so
* other OSs see a clean IRQ state.
*/
smp_send_stop();
disable_IO_APIC();
#if CONFIG_SMP
smp_halt();
#endif
disable_IO_APIC();
/* rebooting needs to touch the page at absolute addr 0 */
/* Tell the BIOS if we want cold or warm reboot */
*((unsigned short *)__va(0x472)) = reboot_mode;
for (;;) {
int i;
/* First fondle with the keyboard controller. */
/* Could also try the reset bit in the Hammer NB */
switch (reboot_type) {
case BOOT_BIOS:
reboot_warm();
case BOOT_KBD:
for (i=0; i<100; i++) {
kb_wait();
udelay(50);
......@@ -117,11 +147,13 @@ void machine_restart(char * __unused)
udelay(50);
}
/* Could do reset through the northbridge of Hammer here. */
/* That didn't work - force a triple fault.. */
case BOOT_TRIPLE:
__asm__ __volatile__("lidt %0": :"m" (no_idt));
__asm__ __volatile__("int3");
reboot_type = BOOT_KBD;
break;
}
}
}
......
......@@ -201,6 +201,11 @@ static __init void parse_cmdline_early (char ** cmdline_p)
if (!memcmp(from, "mem=", 4))
parse_memopt(from+4, &from);
#ifdef CONFIG_DISCONTIGMEM
if (!memcmp(from, "numa=", 5))
numa_setup(from+5);
#endif
#ifdef CONFIG_GART_IOMMU
if (!memcmp(from,"iommu=",6)) {
iommu_setup(from+6);
......@@ -236,8 +241,6 @@ static void __init contig_initmem_init(void)
void __init setup_arch(char **cmdline_p)
{
int i;
Dprintk("setup_arch\n");
ROOT_DEV = ORIG_ROOT_DEV;
......@@ -367,9 +370,12 @@ void __init setup_arch(char **cmdline_p)
request_resource(&iomem_resource, &vram_resource);
{
unsigned i;
/* request I/O space for devices used on all i[345]86 PCs */
for (i = 0; i < STANDARD_IO_RESOURCES; i++)
request_resource(&ioport_resource, standard_io_resources+i);
}
pci_mem_start = IOMAP_START;
......@@ -694,7 +700,7 @@ static int show_cpuinfo(struct seq_file *m, void *v)
seq_printf(m, "power management:");
{
int i;
unsigned i;
for (i = 0; i < 32; i++)
if (c->x86_power & (1 << i)) {
if (i < ARRAY_SIZE(x86_power_flags))
......
......@@ -23,6 +23,7 @@
#include <asm/percpu.h>
#include <asm/mtrr.h>
#include <asm/proto.h>
#include <asm/mman.h>
char x86_boot_params[2048] __initdata = {0,};
......@@ -40,22 +41,79 @@ struct desc_ptr idt_descr = { 256 * 16, (unsigned long) idt_table };
char boot_cpu_stack[IRQSTACKSIZE] __cacheline_aligned;
unsigned long __supported_pte_mask = ~0UL;
static int do_not_nx = 1;
static int do_not_nx __initdata = 0;
unsigned long vm_stack_flags = __VM_STACK_FLAGS;
unsigned long vm_stack_flags32 = __VM_STACK_FLAGS;
unsigned long vm_data_default_flags = __VM_DATA_DEFAULT_FLAGS;
unsigned long vm_data_default_flags32 = __VM_DATA_DEFAULT_FLAGS;
unsigned long vm_force_exec32 = PROT_EXEC;
/* noexec=on|off
Control non executable mappings for 64bit processes.
on Enable
off Disable
noforce (default) Don't enable by default for heap/stack/data,
but allow PROT_EXEC to be effective
*/
static int __init nonx_setup(char *str)
{
if (!strncmp(str,"off",3)) {
__supported_pte_mask &= ~_PAGE_NX;
do_not_nx = 1;
} else if (!strncmp(str, "on",3)) {
do_not_nx = 0;
if (!strncmp(str, "on",3)) {
__supported_pte_mask |= _PAGE_NX;
do_not_nx = 0;
vm_data_default_flags &= ~VM_EXEC;
vm_stack_flags &= ~VM_EXEC;
} else if (!strncmp(str, "noforce",7) || !strncmp(str,"off",3)) {
do_not_nx = (str[0] == 'o');
if (do_not_nx)
__supported_pte_mask &= ~_PAGE_NX;
vm_data_default_flags |= VM_EXEC;
vm_stack_flags |= VM_EXEC;
}
return 1;
}
__setup("noexec=", nonx_setup);
/* noexec32=opt{,opt}
Control the no exec default for 32bit processes. Can be also overwritten
per executable using ELF header flags (e.g. needed for the X server)
Requires noexec=on or noexec=noforce to be effective.
Valid options:
all,on Heap,stack,data is non executable.
off (default) Heap,stack,data is executable
stack Stack is non executable, heap/data is.
force Don't imply PROT_EXEC for PROT_READ
compat (default) Imply PROT_EXEC for PROT_READ
*/
static int __init nonx32_setup(char *str)
{
char *s;
while ((s = strsep(&str, ",")) != NULL) {
if (!strcmp(s, "all") || !strcmp(s,"on")) {
vm_data_default_flags32 &= ~VM_EXEC;
vm_stack_flags32 &= ~VM_EXEC;
} else if (!strcmp(s, "off")) {
vm_data_default_flags32 |= VM_EXEC;
vm_stack_flags32 |= VM_EXEC;
} else if (!strcmp(s, "stack")) {
vm_data_default_flags32 |= VM_EXEC;
vm_stack_flags32 &= ~VM_EXEC;
} else if (!strcmp(s, "force")) {
vm_force_exec32 = 0;
} else if (!strcmp(s, "compat")) {
vm_force_exec32 = PROT_EXEC;
}
}
return 1;
}
__setup("noexec32=", nonx32_setup);
#ifndef __GENERIC_PER_CPU
unsigned long __per_cpu_offset[NR_CPUS];
......
......@@ -371,7 +371,7 @@ handle_signal(unsigned long sig, siginfo_t *info, sigset_t *oldset,
regs->rax = regs->orig_rax;
regs->rip -= 2;
}
if (regs->rax == -ERESTART_RESTARTBLOCK){
if (regs->rax == (unsigned long)-ERESTART_RESTARTBLOCK){
regs->rax = __NR_restart_syscall;
regs->rip -= 2;
}
......@@ -434,8 +434,8 @@ int do_signal(struct pt_regs *regs, sigset_t *oldset)
* have been cleared if the watchpoint triggered
* inside the kernel.
*/
if (current->thread.debugreg[7])
asm volatile("movq %0,%%db7" : : "r" (current->thread.debugreg[7]));
if (current->thread.debugreg7)
asm volatile("movq %0,%%db7" : : "r" (current->thread.debugreg7));
/* Whee! Actually deliver the signal. */
handle_signal(signr, &info, oldset, regs);
......@@ -446,9 +446,10 @@ int do_signal(struct pt_regs *regs, sigset_t *oldset)
/* Did we come from a system call? */
if (regs->orig_rax >= 0) {
/* Restart the system call - no handlers present */
if (regs->rax == -ERESTARTNOHAND ||
regs->rax == -ERESTARTSYS ||
regs->rax == -ERESTARTNOINTR) {
long res = regs->rax;
if (res == -ERESTARTNOHAND ||
res == -ERESTARTSYS ||
res == -ERESTARTNOINTR) {
regs->rax = regs->orig_rax;
regs->rip -= 2;
}
......
......@@ -42,6 +42,7 @@
#include <linux/smp_lock.h>
#include <linux/irq.h>
#include <linux/bootmem.h>
#include <linux/thread_info.h>
#include <linux/delay.h>
#include <linux/mc146818rtc.h>
......@@ -123,7 +124,7 @@ static void __init synchronize_tsc_bp (void)
unsigned long long t0;
unsigned long long sum, avg;
long long delta;
unsigned long one_usec;
long one_usec;
int buggy = 0;
extern unsigned cpu_khz;
......@@ -339,7 +340,7 @@ extern int cpu_idle(void);
/*
* Activate a secondary processor.
*/
int __init start_secondary(void *unused)
void __init start_secondary(void)
{
/*
* Dont put anything before smp_callin(), SMP
......@@ -380,29 +381,7 @@ int __init start_secondary(void *unused)
set_bit(smp_processor_id(), &cpu_online_map);
wmb();
return cpu_idle();
}
/*
* Everything has been set up for the secondary
* CPUs - they just need to reload everything
* from the task structure
* This function must not return.
*/
void __init initialize_secondary(void)
{
struct task_struct *me = stack_current();
/*
* We don't actually need to load the full TSS,
* basically just the stack pointer and the eip.
*/
asm volatile(
"movq %0,%%rsp\n\t"
"jmp *%1"
:
:"r" (me->thread.rsp),"r" (me->thread.rip));
cpu_idle();
}
extern volatile unsigned long init_rsp;
......@@ -412,16 +391,16 @@ static struct task_struct * __init fork_by_hand(void)
{
struct pt_regs regs;
/*
* don't care about the rip and regs settings since
* don't care about the eip and regs settings since
* we'll never reschedule the forked task.
*/
return do_fork(CLONE_VM|CLONE_IDLETASK, 0, &regs, 0, NULL, NULL);
return copy_process(CLONE_VM|CLONE_IDLETASK, 0, &regs, 0, NULL, NULL);
}
#if APIC_DEBUG
static inline void inquire_remote_apic(int apicid)
{
int i, regs[] = { APIC_ID >> 4, APIC_LVR >> 4, APIC_SPIV >> 4 };
unsigned i, regs[] = { APIC_ID >> 4, APIC_LVR >> 4, APIC_SPIV >> 4 };
char *names[] = { "ID", "VERSION", "SPIV" };
int timeout, status;
......@@ -596,6 +575,7 @@ static void __init do_boot_cpu (int apicid)
idle = fork_by_hand();
if (IS_ERR(idle))
panic("failed fork for CPU %d", cpu);
wake_up_forked_process(idle);
/*
* We remove it from the pidhash and the runqueue
......@@ -603,22 +583,19 @@ static void __init do_boot_cpu (int apicid)
*/
init_idle(idle,cpu);
idle->thread.rip = (unsigned long)start_secondary;
// idle->thread.rsp = (unsigned long)idle->thread_info + THREAD_SIZE - 512;
unhash_process(idle);
cpu_pda[cpu].pcurrent = idle;
/* start_eip had better be page-aligned! */
start_rip = setup_trampoline();
init_rsp = (unsigned long)idle->thread_info + PAGE_SIZE + 1024;
init_rsp = idle->thread.rsp;
init_tss[cpu].rsp0 = init_rsp;
initial_code = initialize_secondary;
initial_code = start_secondary;
clear_ti_thread_flag(idle->thread_info, TIF_FORK);
printk(KERN_INFO "Booting processor %d/%d rip %lx rsp %lx rsp2 %lx\n", cpu, apicid,
start_rip, idle->thread.rsp, init_rsp);
printk(KERN_INFO "Booting processor %d/%d rip %lx rsp %lx\n", cpu, apicid,
start_rip, init_rsp);
/*
* This grunge runs the startup process for
......@@ -676,7 +653,7 @@ static void __init do_boot_cpu (int apicid)
if (test_bit(cpu, &cpu_callin_map)) {
/* number CPUs logically, starting from 1 (BSP is 0) */
Dprintk("OK.\n");
printk("KERN_INFO CPU%d: ", cpu);
printk(KERN_INFO "CPU%d: ", cpu);
print_cpu_info(&cpu_data[cpu]);
Dprintk("CPU has booted.\n");
} else {
......@@ -708,7 +685,7 @@ unsigned long cache_decay_ticks;
static void smp_tune_scheduling (void)
{
unsigned long cachesize; /* kB */
int cachesize; /* kB */
unsigned long bandwidth = 1000; /* MB/s */
/*
* Rough estimation for SMP scheduling, this is the number of
......@@ -753,7 +730,7 @@ static void smp_tune_scheduling (void)
static void __init smp_boot_cpus(unsigned int max_cpus)
{
int apicid, cpu;
unsigned apicid, cpu;
/*
* Setup boot CPU information
......
......@@ -117,5 +117,5 @@ asmlinkage long sys_uname(struct new_utsname * name)
asmlinkage long wrap_sys_shmat(int shmid, char *shmaddr, int shmflg)
{
unsigned long raddr;
return sys_shmat(shmid,shmaddr,shmflg,&raddr) ?: raddr;
return sys_shmat(shmid,shmaddr,shmflg,&raddr) ?: (long)raddr;
}
......@@ -584,12 +584,12 @@ asmlinkage void do_debug(struct pt_regs * regs, long error_code)
/* Mask out spurious debug traps due to lazy DR7 setting */
if (condition & (DR_TRAP0|DR_TRAP1|DR_TRAP2|DR_TRAP3)) {
if (!tsk->thread.debugreg[7]) {
if (!tsk->thread.debugreg7) {
goto clear_dr7;
}
}
tsk->thread.debugreg[6] = condition;
tsk->thread.debugreg6 = condition;
/* Mask out spurious TF errors due to lazy TF clearing */
if (condition & DR_STEP) {
......
/*
* Code for the vsyscall page. This version uses the syscall instruction.
*/
#include <asm/ia32_unistd.h>
#include <asm/offset.h>
.text
.globl __kernel_vsyscall
.type __kernel_vsyscall,@function
__kernel_vsyscall:
.LSTART_vsyscall:
push %ebp
.Lpush_ebp:
movl %ecx, %ebp
syscall
popl %ebp
.Lpop_ebp:
ret
.LEND_vsyscall:
.size __kernel_vsyscall,.-.LSTART_vsyscall
.balign 32
.globl __kernel_sigreturn
.type __kernel_sigreturn,@function
__kernel_sigreturn:
.LSTART_sigreturn:
popl %eax
movl $__NR_ia32_sigreturn, %eax
syscall
.LEND_sigreturn:
.size __kernel_sigreturn,.-.LSTART_sigreturn
.balign 32
.globl __kernel_rt_sigreturn
.type __kernel_rt_sigreturn,@function
__kernel_rt_sigreturn:
.LSTART_rt_sigreturn:
movl $__NR_ia32_rt_sigreturn, %eax
syscall
.LEND_rt_sigreturn:
.size __kernel_rt_sigreturn,.-.LSTART_rt_sigreturn
.section .eh_frame,"a",@progbits
.LSTARTFRAME:
.long .LENDCIE-.LSTARTCIE
.LSTARTCIE:
.long 0 /* CIE ID */
.byte 1 /* Version number */
.string "zR" /* NUL-terminated augmentation string */
.uleb128 1 /* Code alignment factor */
.sleb128 -4 /* Data alignment factor */
.byte 8 /* Return address register column */
.uleb128 1 /* Augmentation value length */
.byte 0x1b /* DW_EH_PE_pcrel|DW_EH_PE_sdata4. */
.byte 0x0c /* DW_CFA_def_cfa */
.uleb128 4
.uleb128 4
.byte 0x88 /* DW_CFA_offset, column 0x8 */
.uleb128 1
.align 4
.LENDCIE:
.long .LENDFDE1-.LSTARTFDE1 /* Length FDE */
.LSTARTFDE1:
.long .LSTARTFDE1-.LSTARTFRAME /* CIE pointer */
.long .LSTART_vsyscall-. /* PC-relative start address */
.long .LEND_vsyscall-.LSTART_vsyscall
.uleb128 0 /* Augmentation length */
/* What follows are the instructions for the table generation.
We have to record all changes of the stack pointer. */
.byte 0x40 + .Lpush_ebp-.LSTART_vsyscall /* DW_CFA_advance_loc */
.byte 0x0e /* DW_CFA_def_cfa_offset */
.uleb128 8
.byte 0x85, 0x02 /* DW_CFA_offset %ebp -8 */
.byte 0x40 + .Lpop_ebp-.Lpush_ebp /* DW_CFA_advance_loc */
.byte 0xc5 /* DW_CFA_restore %ebp */
.byte 0x0e /* DW_CFA_def_cfa_offset */
.uleb128 4
.align 4
.LENDFDE1:
.long .LENDFDE2-.LSTARTFDE2 /* Length FDE */
.LSTARTFDE2:
.long .LSTARTFDE2-.LSTARTFRAME /* CIE pointer */
/* HACK: The dwarf2 unwind routines will subtract 1 from the
return address to get an address in the middle of the
presumed call instruction. Since we didn't get here via
a call, we need to include the nop before the real start
to make up for it. */
.long .LSTART_sigreturn-1-. /* PC-relative start address */
.long .LEND_sigreturn-.LSTART_sigreturn+1
.uleb128 0 /* Augmentation length */
/* What follows are the instructions for the table generation.
We record the locations of each register saved. This is
complicated by the fact that the "CFA" is always assumed to
be the value of the stack pointer in the caller. This means
that we must define the CFA of this body of code to be the
saved value of the stack pointer in the sigcontext. Which
also means that there is no fixed relation to the other
saved registers, which means that we must use DW_CFA_expression
to compute their addresses. It also means that when we
adjust the stack with the popl, we have to do it all over again. */
#define do_cfa_expr(offset) \
.byte 0x0f; /* DW_CFA_def_cfa_expression */ \
.uleb128 1f-0f; /* length */ \
0: .byte 0x74; /* DW_OP_breg4 */ \
.sleb128 offset; /* offset */ \
.byte 0x06; /* DW_OP_deref */ \
1:
#define do_expr(regno, offset) \
.byte 0x10; /* DW_CFA_expression */ \
.uleb128 regno; /* regno */ \
.uleb128 1f-0f; /* length */ \
0: .byte 0x74; /* DW_OP_breg4 */ \
.sleb128 offset; /* offset */ \
1:
do_cfa_expr(IA32_SIGCONTEXT_esp+4)
do_expr(0, IA32_SIGCONTEXT_eax+4)
do_expr(1, IA32_SIGCONTEXT_ecx+4)
do_expr(2, IA32_SIGCONTEXT_edx+4)
do_expr(3, IA32_SIGCONTEXT_ebx+4)
do_expr(5, IA32_SIGCONTEXT_ebp+4)
do_expr(6, IA32_SIGCONTEXT_esi+4)
do_expr(7, IA32_SIGCONTEXT_edi+4)
do_expr(8, IA32_SIGCONTEXT_eip+4)
.byte 0x42 /* DW_CFA_advance_loc 2 -- nop; popl eax. */
do_cfa_expr(IA32_SIGCONTEXT_esp)
do_expr(0, IA32_SIGCONTEXT_eax)
do_expr(1, IA32_SIGCONTEXT_ecx)
do_expr(2, IA32_SIGCONTEXT_edx)
do_expr(3, IA32_SIGCONTEXT_ebx)
do_expr(5, IA32_SIGCONTEXT_ebp)
do_expr(6, IA32_SIGCONTEXT_esi)
do_expr(7, IA32_SIGCONTEXT_edi)
do_expr(8, IA32_SIGCONTEXT_eip)
.align 4
.LENDFDE2:
.long .LENDFDE3-.LSTARTFDE3 /* Length FDE */
.LSTARTFDE3:
.long .LSTARTFDE3-.LSTARTFRAME /* CIE pointer */
/* HACK: See above wrt unwind library assumptions. */
.long .LSTART_rt_sigreturn-1-. /* PC-relative start address */
.long .LEND_rt_sigreturn-.LSTART_rt_sigreturn+1
.uleb128 0 /* Augmentation */
/* What follows are the instructions for the table generation.
We record the locations of each register saved. This is
slightly less complicated than the above, since we don't
modify the stack pointer in the process. */
do_cfa_expr(IA32_RT_SIGFRAME_sigcontext-4 + IA32_SIGCONTEXT_esp)
do_expr(0, IA32_RT_SIGFRAME_sigcontext-4 + IA32_SIGCONTEXT_eax)
do_expr(1, IA32_RT_SIGFRAME_sigcontext-4 + IA32_SIGCONTEXT_ecx)
do_expr(2, IA32_RT_SIGFRAME_sigcontext-4 + IA32_SIGCONTEXT_edx)
do_expr(3, IA32_RT_SIGFRAME_sigcontext-4 + IA32_SIGCONTEXT_ebx)
do_expr(5, IA32_RT_SIGFRAME_sigcontext-4 + IA32_SIGCONTEXT_ebp)
do_expr(6, IA32_RT_SIGFRAME_sigcontext-4 + IA32_SIGCONTEXT_esi)
do_expr(7, IA32_RT_SIGFRAME_sigcontext-4 + IA32_SIGCONTEXT_edi)
do_expr(8, IA32_RT_SIGFRAME_sigcontext-4 + IA32_SIGCONTEXT_eip)
.align 4
.LENDFDE3:
/*
* Switch back to real mode and call the BIOS reboot vector.
* This is a trampoline copied around in process.c
* Written 2003 by Andi Kleen, SuSE Labs.
*/
#include <asm/msr.h>
#define R(x) x-warm_reboot(%ebx)
#define R64(x) x-warm_reboot(%rbx)
/* running in identity mapping and in the first 64k of memory
and in compatibility mode. This must be position independent */
/* Follows 14.7 "Leaving Long Mode" in the AMD x86-64 manual, volume 2
and 8.9.2 "Switching Back to Real-Address Mode" in the Intel IA32
manual, volume 2 */
/* ebx: self pointer to warm_reboot */
.globl warm_reboot
warm_reboot:
addl %ebx, R64(real_mode_desc) /* relocate tables */
addl %ebx,2+R64(warm_gdt_desc)
movq %cr0,%rax
btr $31,%rax
movq %rax,%cr0 /* disable paging */
jmp 1f /* flush prefetch queue */
.code32
1: movl $MSR_EFER,%ecx
rdmsr
andl $~((1<<_EFER_LME)|(1<<_EFER_SCE)|(1<<_EFER_NX)),%eax
wrmsr /* disable long mode in EFER */
xorl %eax,%eax
movl %eax,%cr3 /* flush tlb */
/* Running protected mode without paging now */
wbinvd /* flush caches. Needed? */
lidt R(warm_idt_desc)
lgdt R(warm_gdt_desc)
movl $0x10,%ecx /* load segment registers with real mode settings */
movl %ecx,%ds
movl %ecx,%es
movl %ecx,%fs
movl %ecx,%gs
movl %ecx,%ss
lea R(real_mode_desc),%eax
ljmp *(%eax)
.code16:
real_mode:
xorl %eax,%eax
movl %eax,%cr0
/* some people claim $0xf000,0xfff0 is better. Use what 32bit linux uses. */
/* code as bytes because gas has problems with it */
.byte 0xea,0xf0,0xff,0x00,0xf0 /* ljmp 0xf000:0xfff0 */
real_mode_desc:
.long real_mode - warm_reboot
.short 8
warm_gdt_desc:
.short 8*3
.long warm_gdt - warm_reboot
warm_gdt:
.quad 0
.quad 0x00009a000000ffff /* 16-bit real-mode 64k code at 0x00000000 */
.quad 0x000092000100ffff /* 16-bit real-mode 64k data at 0x00000100 */
warm_idt_desc:
.short 0x3ff
.long 0
.globl warm_reboot_end
warm_reboot_end:
......@@ -207,3 +207,12 @@ EXPORT_SYMBOL(init_level4_pgt);
extern unsigned long __supported_pte_mask;
EXPORT_SYMBOL(__supported_pte_mask);
#ifdef CONFIG_DISCONTIGMEM
EXPORT_SYMBOL(memnode_shift);
EXPORT_SYMBOL(memnodemap);
EXPORT_SYMBOL(node_data);
EXPORT_SYMBOL(fake_node);
#endif
EXPORT_SYMBOL(clear_page);
......@@ -6,16 +6,13 @@
#define FIX_ALIGNMENT 1
#define movnti movq /* write to cache for now */
#define prefetch prefetcht2
#include <asm/current.h>
#include <asm/offset.h>
#include <asm/thread_info.h>
/* Standard copy_to_user with segment limit checking */
.globl copy_to_user
.p2align
.p2align 4
copy_to_user:
GET_THREAD_INFO(%rax)
movq %rdi,%rcx
......@@ -27,7 +24,7 @@ copy_to_user:
/* Standard copy_from_user with segment limit checking */
.globl copy_from_user
.p2align
.p2align 4
copy_from_user:
GET_THREAD_INFO(%rax)
movq %rsi,%rcx
......@@ -58,23 +55,23 @@ bad_to_user:
* rdx count
*
* Output:
* eax uncopied bytes or 0 if successful.
* eax uncopied bytes or 0 if successfull.
*/
.globl copy_user_generic
.p2align 4
copy_user_generic:
/* Put the first cacheline into cache. This should handle
the small movements in ioctls etc., but not penalize the bigger
filesystem data copies too much. */
pushq %rbx
prefetch (%rsi)
xorl %eax,%eax /*zero for the exception handler */
#ifdef FIX_ALIGNMENT
/* check for bad alignment of destination */
movl %edi,%ecx
andl $7,%ecx
jnz bad_alignment
after_bad_alignment:
jnz .Lbad_alignment
.Lafter_bad_alignment:
#endif
movq %rdx,%rcx
......@@ -82,133 +79,133 @@ after_bad_alignment:
movl $64,%ebx
shrq $6,%rdx
decq %rdx
js handle_tail
jz loop_no_prefetch
loop:
prefetch 64(%rsi)
js .Lhandle_tail
loop_no_prefetch:
s1: movq (%rsi),%r11
s2: movq 1*8(%rsi),%r8
s3: movq 2*8(%rsi),%r9
s4: movq 3*8(%rsi),%r10
d1: movnti %r11,(%rdi)
d2: movnti %r8,1*8(%rdi)
d3: movnti %r9,2*8(%rdi)
d4: movnti %r10,3*8(%rdi)
.p2align 4
.Lloop:
.Ls1: movq (%rsi),%r11
.Ls2: movq 1*8(%rsi),%r8
.Ls3: movq 2*8(%rsi),%r9
.Ls4: movq 3*8(%rsi),%r10
.Ld1: movq %r11,(%rdi)
.Ld2: movq %r8,1*8(%rdi)
.Ld3: movq %r9,2*8(%rdi)
.Ld4: movq %r10,3*8(%rdi)
s5: movq 4*8(%rsi),%r11
s6: movq 5*8(%rsi),%r8
s7: movq 6*8(%rsi),%r9
s8: movq 7*8(%rsi),%r10
d5: movnti %r11,4*8(%rdi)
d6: movnti %r8,5*8(%rdi)
d7: movnti %r9,6*8(%rdi)
d8: movnti %r10,7*8(%rdi)
addq %rbx,%rsi
addq %rbx,%rdi
.Ls5: movq 4*8(%rsi),%r11
.Ls6: movq 5*8(%rsi),%r8
.Ls7: movq 6*8(%rsi),%r9
.Ls8: movq 7*8(%rsi),%r10
.Ld5: movq %r11,4*8(%rdi)
.Ld6: movq %r8,5*8(%rdi)
.Ld7: movq %r9,6*8(%rdi)
.Ld8: movq %r10,7*8(%rdi)
decq %rdx
jz loop_no_prefetch
jns loop
handle_tail:
leaq 64(%rsi),%rsi
leaq 64(%rdi),%rdi
jns .Lloop
.p2align 4
.Lhandle_tail:
movl %ecx,%edx
andl $63,%ecx
shrl $3,%ecx
jz handle_7
jz .Lhandle_7
movl $8,%ebx
loop_8:
s9: movq (%rsi),%r8
d9: movq %r8,(%rdi)
addq %rbx,%rdi
addq %rbx,%rsi
.p2align 4
.Lloop_8:
.Ls9: movq (%rsi),%r8
.Ld9: movq %r8,(%rdi)
decl %ecx
jnz loop_8
leaq 8(%rdi),%rdi
leaq 8(%rsi),%rsi
jnz .Lloop_8
handle_7:
.Lhandle_7:
movl %edx,%ecx
andl $7,%ecx
jz ende
loop_1:
s10: movb (%rsi),%bl
d10: movb %bl,(%rdi)
jz .Lende
.p2align 4
.Lloop_1:
.Ls10: movb (%rsi),%bl
.Ld10: movb %bl,(%rdi)
incq %rdi
incq %rsi
decl %ecx
jnz loop_1
jnz .Lloop_1
ende:
sfence
.Lende:
popq %rbx
ret
#ifdef FIX_ALIGNMENT
/* align destination */
bad_alignment:
.p2align 4
.Lbad_alignment:
movl $8,%r9d
subl %ecx,%r9d
movl %r9d,%ecx
subq %r9,%rdx
jz small_align
js small_align
align_1:
s11: movb (%rsi),%bl
d11: movb %bl,(%rdi)
jz .Lsmall_align
js .Lsmall_align
.Lalign_1:
.Ls11: movb (%rsi),%bl
.Ld11: movb %bl,(%rdi)
incq %rsi
incq %rdi
decl %ecx
jnz align_1
jmp after_bad_alignment
small_align:
jnz .Lalign_1
jmp .Lafter_bad_alignment
.Lsmall_align:
addq %r9,%rdx
jmp handle_7
jmp .Lhandle_7
#endif
/* table sorted by exception address */
.section __ex_table,"a"
.align 8
.quad s1,s1e
.quad s2,s2e
.quad s3,s3e
.quad s4,s4e
.quad d1,s1e
.quad d2,s2e
.quad d3,s3e
.quad d4,s4e
.quad s5,s5e
.quad s6,s6e
.quad s7,s7e
.quad s8,s8e
.quad d5,s5e
.quad d6,s6e
.quad d7,s7e
.quad d8,s8e
.quad s9,e_quad
.quad d9,e_quad
.quad s10,e_byte
.quad d10,e_byte
.quad .Ls1,.Ls1e
.quad .Ls2,.Ls2e
.quad .Ls3,.Ls3e
.quad .Ls4,.Ls4e
.quad .Ld1,.Ls1e
.quad .Ld2,.Ls2e
.quad .Ld3,.Ls3e
.quad .Ld4,.Ls4e
.quad .Ls5,.Ls5e
.quad .Ls6,.Ls6e
.quad .Ls7,.Ls7e
.quad .Ls8,.Ls8e
.quad .Ld5,.Ls5e
.quad .Ld6,.Ls6e
.quad .Ld7,.Ls7e
.quad .Ld8,.Ls8e
.quad .Ls9,.Le_quad
.quad .Ld9,.Le_quad
.quad .Ls10,.Le_byte
.quad .Ld10,.Le_byte
#ifdef FIX_ALIGNMENT
.quad s11,e_byte
.quad d11,e_byte
.quad .Ls11,.Le_byte
.quad .Ld11,.Le_byte
#endif
.quad e5,e_zero
.quad .Le5,.Le_zero
.previous
/* compute 64-offset for main loop. 8 bytes accuracy with error on the
pessimistic side. this is gross. it would be better to fix the
interface. */
/* eax: zero, ebx: 64 */
s1e: addl $8,%eax
s2e: addl $8,%eax
s3e: addl $8,%eax
s4e: addl $8,%eax
s5e: addl $8,%eax
s6e: addl $8,%eax
s7e: addl $8,%eax
s8e: addl $8,%eax
.Ls1e: addl $8,%eax
.Ls2e: addl $8,%eax
.Ls3e: addl $8,%eax
.Ls4e: addl $8,%eax
.Ls5e: addl $8,%eax
.Ls6e: addl $8,%eax
.Ls7e: addl $8,%eax
.Ls8e: addl $8,%eax
addq %rbx,%rdi /* +64 */
subq %rax,%rdi /* correct destination with computed offset */
......@@ -216,22 +213,22 @@ s8e: addl $8,%eax
addq %rax,%rdx /* add offset to loopcnt */
andl $63,%ecx /* remaining bytes */
addq %rcx,%rdx /* add them */
jmp zero_rest
jmp .Lzero_rest
/* exception on quad word loop in tail handling */
/* ecx: loopcnt/8, %edx: length, rdi: correct */
e_quad:
.Le_quad:
shll $3,%ecx
andl $7,%edx
addl %ecx,%edx
/* edx: bytes to zero, rdi: dest, eax:zero */
zero_rest:
.Lzero_rest:
movq %rdx,%rcx
e_byte:
.Le_byte:
xorl %eax,%eax
e5: rep
.Le5: rep
stosb
/* when there is another exception while zeroing the rest just return */
e_zero:
.Le_zero:
movq %rdx,%rax
jmp ende
jmp .Lende
......@@ -226,17 +226,18 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long error_code)
bad_area_nosemaphore:
/* User mode accesses just cause a SIGSEGV */
if (error_code & 4) {
#ifdef CONFIG_IA32_EMULATION
/* 32bit vsyscall. map on demand. */
if (test_thread_flag(TIF_IA32) &&
address >= 0xffffe000 && address < 0xffffefff-7) {
address >= 0xffffe000 && address < 0xffffe000 + PAGE_SIZE) {
if (map_syscall32(mm, address) < 0)
goto out_of_memory2;
return;
}
#endif
/* User mode accesses just cause a SIGSEGV */
if (error_code & 4) {
printk(KERN_INFO
"%s[%d] segfault at rip:%lx rsp:%lx adr:%lx err:%lx\n",
tsk->comm, tsk->pid, regs->rip, regs->rsp, address,
......
......@@ -39,7 +39,7 @@
#include <asm/proto.h>
#include <asm/smp.h>
#define Dprintk(x...) printk(x)
#define Dprintk(x...)
struct mmu_gather mmu_gathers[NR_CPUS];
......@@ -105,7 +105,7 @@ static void set_pte_phys(unsigned long vaddr,
pml4_t *level4;
pgd_t *pgd;
pmd_t *pmd;
pte_t *pte;
pte_t *pte, new_pte;
Dprintk("set_pte_phys %lx to %lx\n", vaddr, phys);
......@@ -132,11 +132,13 @@ static void set_pte_phys(unsigned long vaddr,
return;
}
}
new_pte = pfn_pte(phys >> PAGE_SHIFT, prot);
pte = pte_offset_kernel(pmd, vaddr);
/* CHECKME: */
if (pte_val(*pte))
if (!pte_none(*pte) &&
pte_val(*pte) != (pte_val(new_pte) & __supported_pte_mask))
pte_ERROR(*pte);
set_pte(pte, pfn_pte(phys >> PAGE_SHIFT, prot));
set_pte(pte, new_pte);
/*
* It's enough to flush this one mapping.
......@@ -340,6 +342,35 @@ void __init paging_init(void)
}
#endif
/* Unmap a kernel mapping if it exists. This is useful to avoid prefetches
from the CPU leading to inconsistent cache lines. address and size
must be aligned to 2MB boundaries.
Does nothing when the mapping doesn't exist. */
void __init clear_kernel_mapping(unsigned long address, unsigned long size)
{
unsigned long end = address + size;
BUG_ON(address & ~LARGE_PAGE_MASK);
BUG_ON(size & ~LARGE_PAGE_MASK);
for (; address < end; address += LARGE_PAGE_SIZE) {
pgd_t *pgd = pgd_offset_k(address);
if (!pgd || pgd_none(*pgd))
continue;
pmd_t *pmd = pmd_offset(pgd, address);
if (!pmd || pmd_none(*pmd))
continue;
if (0 == (pmd_val(*pmd) & _PAGE_PSE)) {
/* Could handle this, but it should not happen currently. */
printk(KERN_ERR
"clear_kernel_mapping: mapping has been split. will leak memory\n");
pmd_ERROR(*pmd);
}
set_pmd(pmd, __pmd(0));
}
__flush_tlb_all();
}
static inline int page_is_ram (unsigned long pagenr)
{
int i;
......
......@@ -87,10 +87,8 @@ int __init k8_scan_nodes(unsigned long start, unsigned long end)
if (limit > end_pfn_map << PAGE_SHIFT)
limit = end_pfn_map << PAGE_SHIFT;
if (limit <= base) {
printk(KERN_INFO "Node %d beyond memory map\n", nodeid);
if (limit <= base)
continue;
}
base >>= 16;
base <<= 24;
......
......@@ -15,7 +15,7 @@
#include <asm/dma.h>
#include <asm/numa.h>
#define Dprintk(x...) printk(x)
#define Dprintk(x...)
struct pglist_data *node_data[MAXNODE];
bootmem_data_t plat_node_bdata[MAX_NUMNODES];
......@@ -104,8 +104,11 @@ void __init setup_node_bootmem(int nodeid, unsigned long start, unsigned long en
reserve_bootmem_node(NODE_DATA(nodeid), nodedata_phys, pgdat_size);
reserve_bootmem_node(NODE_DATA(nodeid), bootmap_start, bootmap_pages<<PAGE_SHIFT);
if (nodeid + 1 > numnodes)
if (nodeid + 1 > numnodes) {
numnodes = nodeid + 1;
printk(KERN_INFO
"setup_node_bootmem: enlarging numnodes to %d\n", numnodes);
}
nodes_present |= (1UL << nodeid);
}
......@@ -121,7 +124,7 @@ void __init setup_node_zones(int nodeid)
start_pfn = node_start_pfn(nodeid);
end_pfn = node_end_pfn(nodeid);
printk("setting up node %d %lx-%lx\n", nodeid, start_pfn, end_pfn);
printk(KERN_INFO "setting up node %d %lx-%lx\n", nodeid, start_pfn, end_pfn);
/* All nodes > 0 have a zero length zone DMA */
dma_end_pfn = __pa(MAX_DMA_ADDRESS) >> PAGE_SHIFT;
......
......@@ -63,29 +63,53 @@ static void flush_kernel_map(void *address)
__flush_tlb_one(address);
}
static inline void flush_map(unsigned long address)
{
on_each_cpu(flush_kernel_map, (void *)address, 1, 1);
}
struct deferred_page {
struct deferred_page *next;
struct page *fpage;
unsigned long address;
};
static struct deferred_page *df_list; /* protected by init_mm.mmap_sem */
static inline void save_page(unsigned long address, struct page *fpage)
{
struct deferred_page *df;
df = kmalloc(sizeof(struct deferred_page), GFP_KERNEL);
if (!df) {
flush_map(address);
__free_page(fpage);
} else {
df->next = df_list;
df->fpage = fpage;
df->address = address;
df_list = df;
}
}
/*
* No more special protections in this 2/4MB area - revert to a
* large page again.
*/
static inline void revert_page(struct page *kpte_page, unsigned long address)
static void revert_page(struct page *kpte_page, unsigned long address)
{
pgd_t *pgd;
pmd_t *pmd;
pte_t large_pte;
pgd = pgd_offset_k(address);
if (!pgd) BUG();
pmd = pmd_offset(pgd, address);
if (!pmd) BUG();
if ((pmd_val(*pmd) & _PAGE_GLOBAL) == 0) BUG();
BUG_ON(pmd_val(*pmd) & _PAGE_PSE);
large_pte = mk_pte_phys(__pa(address) & LARGE_PAGE_MASK, PAGE_KERNEL_LARGE);
set_pte((pte_t *)pmd, large_pte);
}
static int
__change_page_attr(unsigned long address, struct page *page, pgprot_t prot,
struct page **oldpage)
__change_page_attr(unsigned long address, struct page *page, pgprot_t prot)
{
pte_t *kpte;
struct page *kpte_page;
......@@ -107,6 +131,7 @@ __change_page_attr(unsigned long address, struct page *page, pgprot_t prot,
struct page *split = split_large_page(address, prot);
if (!split)
return -ENOMEM;
atomic_inc(&kpte_page->count);
set_pte(kpte,mk_pte(split, PAGE_KERNEL));
}
} else if ((kpte_flags & _PAGE_PSE) == 0) {
......@@ -115,39 +140,12 @@ __change_page_attr(unsigned long address, struct page *page, pgprot_t prot,
}
if (atomic_read(&kpte_page->count) == 1) {
*oldpage = kpte_page;
save_page(address, kpte_page);
revert_page(kpte_page, address);
}
return 0;
}
static inline void flush_map(unsigned long address)
{
on_each_cpu(flush_kernel_map, (void *)address, 1, 1);
}
struct deferred_page {
struct deferred_page *next;
struct page *fpage;
unsigned long address;
};
static struct deferred_page *df_list; /* protected by init_mm.mmap_sem */
static inline void save_page(unsigned long address, struct page *fpage)
{
struct deferred_page *df;
df = kmalloc(sizeof(struct deferred_page), GFP_KERNEL);
if (!df) {
flush_map(address);
__free_page(fpage);
} else {
df->next = df_list;
df->fpage = fpage;
df->address = address;
df_list = df;
}
}
/*
* Change the page attributes of an page in the linear mapping.
*
......@@ -164,24 +162,19 @@ static inline void save_page(unsigned long address, struct page *fpage)
int change_page_attr(struct page *page, int numpages, pgprot_t prot)
{
int err = 0;
struct page *fpage, *fpage2;
int i;
down_write(&init_mm.mmap_sem);
for (i = 0; i < numpages; i++, page++) {
for (i = 0; i < numpages; !err && i++, page++) {
unsigned long address = (unsigned long)page_address(page);
fpage = NULL;
err = __change_page_attr(address, page, prot, &fpage);
err = __change_page_attr(address, page, prot);
if (err)
break;
/* Handle kernel mapping too which aliases part of the lowmem */
if (!err && page_to_phys(page) < KERNEL_TEXT_SIZE) {
if (page_to_phys(page) < KERNEL_TEXT_SIZE) {
unsigned long addr2 = __START_KERNEL_map + page_to_phys(page);
fpage2 = NULL;
err = __change_page_attr(addr2, page, prot, &fpage2);
if (fpage2)
save_page(addr2, fpage2);
err = __change_page_attr(addr2, page, prot);
}
if (fpage)
save_page(address, fpage);
}
up_write(&init_mm.mmap_sem);
return err;
......
......@@ -378,8 +378,9 @@ static struct irq_info *pirq_get_info(struct pci_dev *dev)
return NULL;
}
static void pcibios_test_irq_handler(int irq, void *dev_id, struct pt_regs *regs)
static irqreturn_t pcibios_test_irq_handler(int irq, void *dev_id, struct pt_regs *regs)
{
return IRQ_NONE;
}
static int pcibios_lookup_irq(struct pci_dev *dev, int assign)
......
......@@ -127,7 +127,7 @@ SECTIONS
/* Sections to be discarded */
/DISCARD/ : {
*(.exit.data)
*(.exit.text)
/* *(.exit.text) */
*(.exitcall.exit)
*(.eh_frame)
}
......
......@@ -9,7 +9,7 @@
#ifdef CONFIG_X86_LOCAL_APIC
#define APIC_DEBUG 1
#define APIC_DEBUG 0
#if APIC_DEBUG
#define Dprintk(x...) printk(x)
......
......@@ -84,8 +84,9 @@
movq \offset+72(%rsp),%rax
.endm
#define REST_SKIP 6*8
.macro SAVE_REST
subq $6*8,%rsp
subq $REST_SKIP,%rsp
movq %rbx,5*8(%rsp)
movq %rbp,4*8(%rsp)
movq %r12,3*8(%rsp)
......@@ -94,7 +95,6 @@
movq %r15,(%rsp)
.endm
#define REST_SKIP 6*8
.macro RESTORE_REST
movq (%rsp),%r15
movq 1*8(%rsp),%r14
......
#ifndef _ASM_X86_64_COMPAT_H
#define _ASM_X86_64_COMPAT_H
/*
* Architecture specific compatibility types
*/
#include <linux/types.h>
#include <linux/sched.h>
#define COMPAT_USER_HZ 100
......
......@@ -58,7 +58,7 @@
We can slow the instruction pipeline for instructions coming via the
gdt or the ldt if we want to. I am not sure why this is an advantage */
#define DR_CONTROL_RESERVED (0xFFFFFFFFFC00) /* Reserved by Intel */
#define DR_CONTROL_RESERVED (0xFFFFFFFF0000FC00UL) /* Reserved */
#define DR_LOCAL_SLOWDOWN (0x100) /* Local slow the pipeline */
#define DR_GLOBAL_SLOWDOWN (0x200) /* Global slow the pipeline */
......
......@@ -50,7 +50,7 @@ extern void contig_e820_setup(void);
extern unsigned long e820_end_of_ram(void);
extern void e820_reserve_resources(void);
extern void e820_print_map(char *who);
extern int e820_mapped(unsigned long start, unsigned long end, int type);
extern int e820_mapped(unsigned long start, unsigned long end, unsigned type);
extern void e820_bootmem_free(pg_data_t *pgdat, unsigned long start,unsigned long end);
......
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
......@@ -18,6 +18,7 @@
#define __USER_CS 0x33 /* 6*8+3 */
#define __USER32_DS __USER_DS
#define __KERNEL16_CS (GDT_ENTRY_KERNELCS16 * 8)
#define __KERNEL_COMPAT32_CS 0x8
#define GDT_ENTRY_TLS 1
#define GDT_ENTRY_TSS 8 /* needs two entries */
......
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment