- 20 Aug, 2009 28 commits
-
-
Benjamin Herrenschmidt authored
That patch used to just add a hook to page table flushing but pulling that string brought out a whole bunch of issues, so it now does that and more: - We now make the RCU batching of page freeing SMP only, as I believe it was intended initially. We make a few more things compile to nothing on !CONFIG_SMP - Some macros are turned into functions, though that forced me to out of line a few stuffs due to unsolvable include depenencies, however it's probably better that way anyway, it's not -that- critical code path. - 32-bit didn't call pte_free_finish() on tlb_flush() which means that it wouldn't push out the batch to RCU for delayed freeing when a bunch of page tables have been freed, they would just stay in there until the batch gets full. 64-bit BookE will use that hook to maintain the virtually linear page tables or the indirect entries in the TLB when using the HW loader. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
Those definitions are currently declared extern in the .c file where they are used, move them to a header file instead. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
Currently, a single ifdef covers SLB related bits and more generic ppc64 related bits, split this in two separate ifdef's since 64-bit BookE will need one but not the other. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
Our 64-bit hash context handling has no init function, but 64-bit Book3E will use the common mmu_context_nohash.c code which does, so define an empty inline mmu_context_init() for 64-bit server and call it from our 64-bit setup_arch() Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Kumar Gala <galak@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
We need to pass down whether the page is direct or indirect and we'll need to pass the page size to _tlbil_va and _tlbivax_bcast We also add a new low level _tlbil_pid_noind() which does a TLB flush by PID but avoids flushing indirect entries if possible This implements those new prototypes but defines them with inlines or macros so that no additional arguments are actually passed on current processors. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
The way I intend to use tophys/tovirt on 64-bit BookE is different from the "trick" that we currently play for 32-bit BookE so change the condition of definition of these macros to make it so. Also, make sure we only use rfid and mtmsrd instead of rfi and mtmsr for 64-bit server processors, not all 64-bit processors. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Kumar Gala <galak@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
This adds some code to do early ioremap's using page tables instead of bolting entries in the hash table. This will be used by the upcoming 64-bits BookE port. The patch also changes the test for early vs. late ioremap to use slab_is_available() instead of our old hackish mem_init_done. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
This adds various additional bit definitions for various MMU related SPRs used on Book3E. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
This adds the opcode definitions to ppc-opcode.h for the two instructions tlbivax and tlbsrx. as defined by Book3E 2.06 Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
The current "no hash" MMU context management code is written with the assumption that one CPU == one TLB. This is not the case on implementations that support HW multithreading, where several linux CPUs can share the same TLB. This adds some basic support for this to our context management and our TLB flushing code. It also cleans up the optional debugging output a bit Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
enter_prom() used to save and restore registers such as CTR, XER etc.. which are volatile, or SRR0,1... which we don't care about. This removes a bunch of useless code and while at it turns an mtmsrd into an MTMSRD macro which will be useful to Book3E. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
A misplaced #endif causes more definitions than intended to be protected by #ifndef __ASSEMBLY__. This breaks upcoming 64-bit BookE support patch when using 64k pages. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
The truncate syscall has a signed long parameter, so when using a 32- bit userspace with a 64-bit kernel the argument is zero-extended instead of sign-extended. Adding the compat_sys_truncate function fixes the issue. This was noticed during an LSB truncate test failure. The test was checking for the correct error number set when truncate is called with a length of -1. The test can be found at: http://bzr.linuxfoundation.org/lsb/devel/runtime-test?cmd=inventory;rev=stewb%40linux-foundation.org-20090626205411-sfb23cc0tjj7jzgm;path=modules/vsx-pcts/tset/POSIX.os/files/truncate/ BenH: Added compat_sys_ftruncate() as well, same issue. Signed-off-by: Chase Douglas <cndougla@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Lucian Adrian Grijincu authored
dtc was moved in 9fffb55f from arch/powerpc/boot/ to scripts/dtc/ This patch updates the wrapper script to point to the new location of dtc. Signed-off-by: Lucian Adrian Grijincu <lgrijincu@ixiacom.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Frans Pop authored
Signed-off-by: Frans Pop <elendil@planet.nl> Acked-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
This change the SPRG used to store the PACA on ppc64 from SPRG3 to SPRG1. SPRG3 is user readable on most processors and we want to use it for other things. We change the scratch SPRG used by exception vectors from SRPG1 to SPRG2. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
The code for setting up the IPIs for SMP PowerSurge marchines bitrot, it needs to properly map the HW interrupt number Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
The current definitions set ranges and defaults for 32 and 64-bit only using "PPC_STD_MMU" which means hash based MMU. This uselessly restrict the usefulness for the upcoming 64-bit BookE port, but more than that, it's broken on 32-bit since the only 32-bit platform supporting multiple page sizes currently is 44x which does -not- have PPC_STD_MMU_32 set. This fixes it by using PPC64 and PPC32 instead. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
roel kluin authored
Replace strncpy() and explicit null-termination by strlcpy() Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
The STAB code used on Power3 and RS/64 uses a second scratch SPRG to save a GPR in order to decide whether to go to do_stab_bolted_* or to handle a normal data access exception. This prevents our scheme of freeing SPRG3 which is user visible for user uses since we cannot use SPRG0 which, on RS/64, seems to be read-only for supervisor mode (like POWER4). This reworks the STAB exception entry to use the PACA as temporary storage instead. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
The kernel uses SPRG registers for various purposes, typically in low level assembly code as scratch registers or to hold per-cpu global infos such as the PACA or the current thread_info pointer. We want to be able to easily shuffle the usage of those registers as some implementations have specific constraints realted to some of them, for example, some have userspace readable aliases, etc.. and the current choice isn't always the best. This patch should not change any code generation, and replaces the usage of SPRN_SPRGn everywhere in the kernel with a named replacement and adds documentation next to the definition of the names as to what those are used for on each processor family. The only parts that still use the original numbers are bits of KVM or suspend/resume code that just blindly needs to save/restore all the SPRGs. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
The file include/asm/exception.h contains definitions that are specific to exception handling on 64-bit server type processors. This renames the file to exception-64s.h to reflect that fact and avoid confusion. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Anton Blanchard authored
TASK_UNMAPPED_BASE is not used with the new top down mmap layout. We can reuse this preload slot by loading in the segment at 0x10000000, where almost all PowerPC binaries are linked at. On a microbenchmark that bounces a token between two 64bit processes over pipes and calls gettimeofday each iteration (to access the VDSO), both the 32bit and 64bit context switch rate improves (tested on a 4GHz POWER6): 32bit: 273k/sec -> 283k/sec 64bit: 277k/sec -> 284k/sec Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Anton Blanchard authored
With the new top down layout it is likely that the pc and stack will be in the same segment, because the pc is most likely in a library allocated via a top down mmap. Right now we bail out early if these segments match. Rearrange the SLB preload code to sanity check all SLB preload addresses are not in the kernel, then check all addresses for conflicts. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Anton Blanchard authored
On 64bit applications the VDSO is the only thing in segment 0. Since the VDSO is position independent we can remove the hint and let get_unmapped_area pick an area. This will mean the vdso will be near other mmaps and will share an SLB entry: 10000000-10001000 r-xp 00000000 08:06 5778459 /root/context_switch_64 10010000-10011000 r--p 00000000 08:06 5778459 /root/context_switch_64 10011000-10012000 rw-p 00001000 08:06 5778459 /root/context_switch_64 fffa92ae000-fffa92b0000 rw-p 00000000 00:00 0 fffa92b0000-fffa9453000 r-xp 00000000 08:06 4334051 /lib64/power6/libc-2.9.so fffa9453000-fffa9462000 ---p 001a3000 08:06 4334051 /lib64/power6/libc-2.9.so fffa9462000-fffa9466000 r--p 001a2000 08:06 4334051 /lib64/power6/libc-2.9.so fffa9466000-fffa947c000 rw-p 001a6000 08:06 4334051 /lib64/power6/libc-2.9.so fffa947c000-fffa9480000 rw-p 00000000 00:00 0 fffa9480000-fffa94a8000 r-xp 00000000 08:06 4333852 /lib64/ld-2.9.so fffa94b3000-fffa94b4000 rw-p 00000000 00:00 0 fffa94b4000-fffa94b7000 r-xp 00000000 00:00 0 [vdso] <----- here I am fffa94b7000-fffa94b8000 r--p 00027000 08:06 4333852 /lib64/ld-2.9.so fffa94b8000-fffa94bb000 rw-p 00028000 08:06 4333852 /lib64/ld-2.9.so fffa94bb000-fffa94bc000 rw-p 00000000 00:00 0 fffe4c10000-fffe4c25000 rw-p 00000000 00:00 0 [stack] On a microbenchmark that bounces a token between two 64bit processes over pipes and calls gettimeofday each iteration (to access the VDSO), our context switch rate goes from 268k to 277k ctx switches/sec (tested on a 4GHz POWER6). Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Geoff Thorpe authored
The bitops.h functions that operate on a single bit in a bitfield are implemented by operating on the corresponding word location. In all cases the inner logic is valid if the mask being applied has more than one bit set, so this patch exposes those inner operations. Indeed, set_bits() was already available, but it duplicated code from set_bit() (rather than making the latter a wrapper) - it was also missing the PPC405_ERR77() workaround and the "volatile" address qualifier present in other APIs. This corrects that, and exposes the other multi-bit equivalents. One advantage of these multi-bit forms is that they allow word-sized variables to essentially be their own spinlocks, eg. very useful for state machines where an atomic "flags" variable can obviate the need for any additional locking. Signed-off-by: Geoff Thorpe <geoff@geoffthorpe.net> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Michael Ellerman authored
The workaround enabled by CONFIG_MPIC_BROKEN_REGREAD does not work on non-broken MPICs. The symptom is no interrupts being received. The fix is twofold. Firstly the code was broken for multiple isus, we need to index into the shadow array with the src_no, not the idx. Secondly, we always do the read, but only use the VECPRI_MASK and VECPRI_ACTIVITY bits from the hardware, the rest of "val" comes from the shadow. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Gerhard Pircher authored
This allows to remove the ppc_md.init() hook in the setup code. Signed-off-by: Gerhard Pircher <gerhard_pircher@gmx.net> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 19 Aug, 2009 2 commits
-
-
Linus Torvalds authored
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: security: Fix prompt for LSM_MMAP_MIN_ADDR security: Make LSM_MMAP_MIN_ADDR default match its help text.
-
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpuLinus Torvalds authored
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu: use the right flag for get_vm_area() percpu, sparc64: fix sparse possible cpu map handling init: set nr_cpu_ids before setup_per_cpu_areas()
-
- 18 Aug, 2009 10 commits
-
-
Linus Torvalds authored
Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, mce: Don't initialize MCEs on unknown CPUs x86, mce: don't log boot MCEs on Pentium M (model == 13) CPUs x86: Annotate section mismatch warnings in kernel/apic/x2apic_uv_x.c x86, mce: therm_throt: Don't log redundant normality x86: Fix UV BAU destination subnode id
-
Bo Liu authored
If node_load[] is cleared everytime build_zonelists() is called,node_load[] will have no help to find the next node that should appear in the given node's fallback list. Because of the bug, zonelist's node_order is not calculated as expected. This bug affects on big machine, which has asynmetric node distance. [synmetric NUMA's node distance] 0 1 2 0 10 12 12 1 12 10 12 2 12 12 10 [asynmetric NUMA's node distance] 0 1 2 0 10 12 20 1 12 10 14 2 20 14 10 This (my bug) is very old but no one has reported this for a long time. Maybe because the number of asynmetric NUMA is very small and they use cpuset for customizing node memory allocation fallback. [akpm@linux-foundation.org: fix CONFIG_NUMA=n build] Signed-off-by: Bo Liu <bo-liu@hotmail.com> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Joe Perches authored
Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Graff Yang authored
According to the POSIX (1003.1-2008), the file descriptor shall have been opened with read permission, regardless of the protection options specified to mmap(). The ltp test cases mmap06/07 need this. Signed-off-by: Graff Yang <graff.yang@gmail.com> Acked-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Greg Ungerer <gerg@snapgear.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Ben Dooks authored
Since the changes to the bitbang driver, there is the possibility we will be called with either the speed_hz or bpw values zero. We take these to mean that the default values (8 bits per word, or maximum bus speed). Signed-off-by: Ben Dooks <ben@simtec.co.uk> Cc: David Brownell <david-b@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Ben Dooks authored
Currently the clock rate calculation may round as pleased, which means that it is possible that we will round down and end up with a faster clock rate than intended. Change the calculation to use DIV_ROUND_UP() to ensure that we end up with a clock rate either the same as or lower than the user requested one. Signed-off-by: Ben Dooks <ben@simtec.co.uk> Cc: David Brownell <david-b@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrew Morton authored
There are a number of individual MMC drivers listed in MAINTAINERS. I didn't modify those records. Perhaps I should have. Cc: <linux-mmc@vger.kernel.org> Cc: Manuel Lauss <manuel.lauss@gmail.com> Cc: Nicolas Pitre <nico@cam.org> Cc: Pierre Ossman <drzeus@drzeus.cx> Cc: Pavel Pisa <ppisa@pikron.com> Cc: Jarkko Lavinen <jarkko.lavinen@nokia.com> Cc: Ben Dooks <ben-linux@fluff.org> Cc: Sascha Sommer <saschasommer@freenet.de> Cc: Ian Molton <ian@mnementh.co.uk> Cc: Joseph Chan <JosephChan@via.com.tw> Cc: Harald Welte <HaraldWelte@viatech.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
KOSAKI Motohiro authored
The commit 2ff05b2b (oom: move oom_adj value) moveed the oom_adj value to the mm_struct. It was a very good first step for sanitize OOM. However Paul Menage reported the commit makes regression to his job scheduler. Current OOM logic can kill OOM_DISABLED process. Why? His program has the code of similar to the following. ... set_oom_adj(OOM_DISABLE); /* The job scheduler never killed by oom */ ... if (vfork() == 0) { set_oom_adj(0); /* Invoked child can be killed */ execve("foo-bar-cmd"); } .... vfork() parent and child are shared the same mm_struct. then above set_oom_adj(0) doesn't only change oom_adj for vfork() child, it's also change oom_adj for vfork() parent. Then, vfork() parent (job scheduler) lost OOM immune and it was killed. Actually, fork-setting-exec idiom is very frequently used in userland program. We must not break this assumption. Then, this patch revert commit 2ff05b2b and related commit. Reverted commit list --------------------- - commit 2ff05b2b (oom: move oom_adj value from task_struct to mm_struct) - commit 4d8b9135 (oom: avoid unnecessary mm locking and scanning for OOM_DISABLE) - commit 81236810 (oom: only oom kill exiting tasks with attached memory) - commit 933b787b (mm: copy over oom_adj value at fork time) Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Paul Menage <menage@google.com> Cc: David Rientjes <rientjes@google.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Rik van Riel <riel@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Jeff Layton authored
get_sb_pseudo sets s_maxbytes to ~0ULL which becomes negative when cast to a signed value. Fix it to use MAX_LFS_FILESIZE which casts properly to a positive signed value. Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Steve French <smfrench@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Robert Love <rlove@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Joe Perches authored
Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Benny Halevy <bhalevy@panasas.com> Cc: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-