[PATCH] ppc64: change bad choice of VSID_MULTIPLIER

We recently changed the VSID allocation on PPC64 to use a new scheme based on a multiplicative hash. It turns out our choice of multiplier (the largest 28-bit prime) wasn't so great: with large contiguous mappings, we can get very poor hash scattering. In particular, earlier machines (without 16M pages) which had a reasonable amount of RAM (>2G or so) wouldn't boot, because the linear mapping overflowed some hash buckets.

This patch changes the multiplier to something which seems to work better (it is, rather arbitrarily, the median of the primes between 2^27 and 2^28). Some more theory should almost certainly go into the choice of this constant, to avoid more pathological cases. But for now, this choice fixes a serious bug, and seems to do at least as well at scattering as the old choice on a handful of simple testcases.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

Commit 2c2d4b3f, authored Sep 30, 2004 by David Gibson, committed by Linus Torvalds Sep 30, 2004
Showing with 8 additions and 9 deletions

arch/ppc64/kernel/head.S +3 -3
include/asm-ppc64/mmu.h +1 -1
include/asm-ppc64/mmu_context.h +4 -5
--- a/arch/ppc64/kernel/head.S
+++ b/arch/ppc64/kernel/head.S
@@ -551,14 +551,14 @@ __end_systemcfg:
 	.llong	0		/* Reserved */
 	.llong	0		/* Reserved */
 	.llong	(KERNELBASE>>SID_SHIFT)
-	.llong	0x40bffffd5	/* KERNELBASE VSID */
+	.llong	0x408f92c94	/* KERNELBASE VSID */
 	/* We have to list the bolted VMALLOC segment here, too, so that it
 	 * will be restored on shared processor switch */
 	.llong	(VMALLOCBASE>>SID_SHIFT)
-	.llong	0xb0cffffd1	/* VMALLOCBASE VSID */
+	.llong	0xf09b89af5	/* VMALLOCBASE VSID */
 	.llong	8192		/* # pages to map (32 MB) */
 	.llong	0		/* Offset from start of loadarea to start of map */
-	.llong	0x40bffffd50000	/* VPN of first page to map */
+	.llong	0x408f92c940000	/* VPN of first page to map */
 	. = 0x6100
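
The three constants touched in this hunk all follow from VSID_MULTIPLIER, which is why they have to change together with it. As a rough cross-check, the sketch below recomputes them in plain C. It assumes values that this diff does not show: SID_SHIFT = 28, 4K pages (PAGE_SHIFT = 12), KERNELBASE = 0xC000000000000000, VMALLOCBASE = 0xD000000000000000, and a kernel proto-VSID that is simply the address shifted right by SID_SHIFT. The helper names are made up for illustration, and a plain `%` stands in for the kernel's divide-free reduction.

```c
#include <assert.h>
#include <stdint.h>

#define VSID_BITS       36
#define VSID_MODULUS    ((1ULL << VSID_BITS) - 1)
#define OLD_MULTIPLIER  268435399ULL    /* value removed by this patch */
#define NEW_MULTIPLIER  200730139ULL    /* value added by this patch */

/* Assumed values, not shown in this diff. */
#define SID_SHIFT       28
#define PAGE_SHIFT      12
#define KERNELBASE      0xC000000000000000ULL
#define VMALLOCBASE     0xD000000000000000ULL

/* Scramble a proto-VSID below 2^36; the product (36 + 28 bits) fits in 64 bits. */
uint64_t scramble(uint64_t proto_vsid, uint64_t multiplier)
{
	return (proto_vsid * multiplier) % VSID_MODULUS;
}

/* Hypothetical helper: VSID of the segment containing a kernel address. */
uint64_t kernel_vsid(uint64_t ea, uint64_t multiplier)
{
	return scramble(ea >> SID_SHIFT, multiplier);
}

/* Hypothetical helper: VPN of the first 4K page of that segment. */
uint64_t first_vpn(uint64_t ea, uint64_t multiplier)
{
	return kernel_vsid(ea, multiplier) << (SID_SHIFT - PAGE_SHIFT);
}

/* Returns 1 if the recomputed values equal the old and new head.S constants. */
int head_constants_match(void)
{
	return kernel_vsid(KERNELBASE, OLD_MULTIPLIER) == 0x40bffffd5ULL
	    && kernel_vsid(VMALLOCBASE, OLD_MULTIPLIER) == 0xb0cffffd1ULL
	    && first_vpn(KERNELBASE, OLD_MULTIPLIER) == 0x40bffffd50000ULL
	    && kernel_vsid(KERNELBASE, NEW_MULTIPLIER) == 0x408f92c94ULL
	    && kernel_vsid(VMALLOCBASE, NEW_MULTIPLIER) == 0xf09b89af5ULL
	    && first_vpn(KERNELBASE, NEW_MULTIPLIER) == 0x408f92c940000ULL;
}
```

Calling head_constants_match() from a small test main() should return 1 under the assumptions above. If the multiplier is ever changed again, the same calculation gives the values to paste into head.S.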

--- a/include/asm-ppc64/mmu.h
+++ b/include/asm-ppc64/mmu.h
@@ -202,7 +202,7 @@ extern void htab_finish_init(void);
 #define SLB_VSID_KERNEL		(SLB_VSID_KP|SLB_VSID_C)
 #define SLB_VSID_USER		(SLB_VSID_KP|SLB_VSID_KS)
-#define VSID_MULTIPLIER	ASM_CONST(268435399)	/* largest 28-bit prime */
+#define VSID_MULTIPLIER	ASM_CONST(200730139)	/* 28-bit prime */
 #define VSID_BITS	36
 #define VSID_MODULUS	((1UL<<VSID_BITS)-1)
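
Two properties of these definitions matter for the scramble: the multiplier must be co-prime to VSID_MODULUS for the mapping to be 1:1, and a modulus of the form 2^n-1 allows reduction without a divide. The sketch below illustrates both in user-space C. It is one standard way to do the reduction, not necessarily the kernel's exact code (which is not part of this hunk), and the function names are hypothetical. Note that 2^36-1 = 3^3 * 5 * 7 * 13 * 19 * 37 * 73 * 109, so any 28-bit prime is co-prime to it.

```c
#include <assert.h>
#include <stdint.h>

#define VSID_MULTIPLIER 200730139ULL
#define VSID_BITS       36
#define VSID_MODULUS    ((1ULL << VSID_BITS) - 1)

/* Euclid's algorithm; a result of 1 means a and b are co-prime. */
uint64_t gcd_u64(uint64_t a, uint64_t b)
{
	while (b != 0) {
		uint64_t t = a % b;
		a = b;
		b = t;
	}
	return a;
}

/* Reduce any 64-bit x modulo 2^36-1 using only shifts, adds and masks. */
uint64_t mod_fold(uint64_t x)
{
	/* 2^36 == 1 (mod 2^36-1), so the high bits can be added to the low 36. */
	x = (x >> VSID_BITS) + (x & VSID_MODULUS);
	/*
	 * x is now below 2^36 + 2^28, so at most one subtraction of the
	 * modulus remains.  Do it branch-free: (x + 1) >> 36 is 1 exactly
	 * when x >= 2^36-1, and adding 1 then masking subtracts the modulus.
	 */
	return (x + ((x + 1) >> VSID_BITS)) & VSID_MODULUS;
}

/* Scramble a proto-VSID below 2^36 (so the product cannot overflow). */
uint64_t scramble_fold(uint64_t proto_vsid)
{
	return mod_fold(proto_vsid * VSID_MULTIPLIER);
}
```

For proto-VSIDs below 2^36, scramble_fold(p) should agree with (p * VSID_MULTIPLIER) % VSID_MODULUS, and gcd_u64(VSID_MULTIPLIER, VSID_MODULUS) should be 1.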

--- a/include/asm-ppc64/mmu_context.h
+++ b/include/asm-ppc64/mmu_context.h
@@ -108,11 +108,10 @@ static inline void activate_mm(struct mm_struct *prev, struct mm_struct *next)
 *
 * This scramble is only well defined for proto-VSIDs below
 * 0xFFFFFFFFF, so both proto-VSID and actual VSID 0xFFFFFFFFF are
- * reserved.  VSID_MULTIPLIER is prime (the largest 28-bit prime, in
- * fact), so in particular it is co-prime to VSID_MODULUS, making this
- * a 1:1 scrambling function.  Because the modulus is 2^n-1 we can
- * compute it efficiently without a divide or extra multiply (see
- * below).
+ * reserved.  VSID_MULTIPLIER is prime, so in particular it is
+ * co-prime to VSID_MODULUS, making this a 1:1 scrambling function.
+ * Because the modulus is 2^n-1 we can compute it efficiently without
+ * a divide or extra multiply (see below).
 *
 * This scheme has several advantages over older methods:
 *