[ARM PATCH] 2046/1: fix nwfpe for double arithmetic on big-endian platforms

Patch from Lennert Buytenhek

Hi,

I need the patch below (against 2.6.8-rc1-ds1) to make nwfpe properly emulate arithmetic with doubles on a big endian ARM platform.

From reading the mailing list archives and from helpful comments I've received from people on this list, I gather that this has come up in the past, but it appears that Russell King was never really convinced as to why this patch is needed. I think I understand what's going on, and will try to explain.

On little endian ARM, the double value 1.0 looks like this when stored in memory in FPA word ordering:

	bytes: 0x00 0x00 0xf0 0x3f 0x00 0x00 0x00 0x00
	u32s:  0x3ff00000 0x00000000
	u64:   0x000000003ff00000

On big endian, it looks like this:

	bytes: 0x3f 0xf0 0x00 0x00 0x00 0x00 0x00 0x00
	u32s:  0x3ff00000 0x00000000
	u64:   0x3ff0000000000000

It appears to be this way because once upon a time, somebody decided that the sub-words of a double will use native endian word ordering within themselves, but the two separate words will always be stored with the most significant one first. God knows why they did it this way, but they did.

Anyway. The key observation is that nwfpe internally stores double values in the type 'float64', which is basically just a typedef for unsigned long long. It never accesses 'float64's on the byte level by casting pointers around or anything like that, it just uses direct u64 arithmetic primitives (add, shift, or, and) for float64 manipulations and that's it.

So. For little endian platforms, 1.0 looks like:

	0x00 0x00 0xf0 0x3f 0x00 0x00 0x00 0x00

But since nwfpe treats it as a u64, it wants it to look like:

	0x00 0x00 0x00 0x00 0x00 0x00 0xf0 0x3f

So, that's why the current code swaps the words around when getting doubles from userspace and putting them back (see fpa11_cpdt.c, loadDouble and storeDouble.)

On big endian, 1.0 looks like:

	0x3f 0xf0 0x00 0x00 0x00 0x00 0x00 0x00

Since nwfpe treats it as a u64, it wants it to look like:

	0x3f 0xf0 0x00 0x00 0x00 0x00 0x00 0x00

Hey! That's exactly the same. So in this case, it shouldn't be swapping the halves around. However, it currently does that swapping unconditionally, and that's why floating point emulation messes up.

This is how I understand things -- hope it makes sense to other people too.

cheers,
Lennert
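
The argument above can be checked with a small user-space model. This is only an illustrative sketch, not part of the patch: the helper names (u64_from_halves, model_load_double, check_layouts) are hypothetical, and the host endianness is simulated with a flag so the same code gives the same answers on any machine.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Value of a u64 whose two 32-bit halves, in address order, are p0 and p1,
 * as seen by a host of the given endianness (simulated, not detected).
 */
static uint64_t u64_from_halves(uint32_t p0, uint32_t p1, int big_endian)
{
	if (big_endian)
		return ((uint64_t)p0 << 32) | p1;	/* p[0] is the high half */
	return ((uint64_t)p1 << 32) | p0;		/* p[0] is the low half */
}

/*
 * Model of loadDouble: mem[] holds the two words in FPA order, i.e.
 * mem[0] carries sign & exponent. swap_words selects whether the words
 * are exchanged while copying them into the float64.
 */
static uint64_t model_load_double(const uint32_t mem[2], int big_endian,
				  int swap_words)
{
	uint32_t p0 = swap_words ? mem[1] : mem[0];
	uint32_t p1 = swap_words ? mem[0] : mem[1];

	return u64_from_halves(p0, p1, big_endian);
}

/* Walk through the four cases for the double value 1.0. */
static void check_layouts(void)
{
	const uint32_t one[2] = { 0x3ff00000u, 0x00000000u };
	const uint64_t want = 0x3ff0000000000000ULL;	/* IEEE 754 1.0 */

	/* Little endian: the swap is required. */
	assert(model_load_double(one, 0, 1) == want);
	assert(model_load_double(one, 0, 0) == 0x000000003ff00000ULL);

	/* Big endian: the words must be copied straight through. */
	assert(model_load_double(one, 1, 0) == want);
	assert(model_load_double(one, 1, 1) == 0x000000003ff00000ULL);
}
```

The last assertion is the failure described above: swapping unconditionally on a big endian host leaves the sign and exponent in the low half of the u64.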

Commit 6349ff20 authored Aug 25, 2004 by Lennert Buytenhek Committed by Russell King Aug 25, 2004

Showing with 10 additions and 0 deletions

arch/arm/nwfpe/fpa11_cpdt.c +10 -0

--- a/arch/arm/nwfpe/fpa11_cpdt.c
+++ b/arch/arm/nwfpe/fpa11_cpdt.c
@@ -42,8 +42,13 @@ static inline void loadDouble(const unsigned int Fn, const unsigned int __user *
 	unsigned int *p;
 	p = (unsigned int *) &fpa11->fpreg[Fn].fDouble;
 	fpa11->fType[Fn] = typeDouble;
+#ifdef __ARMEB__
+	get_user(p[0], &pMem[0]);	/* sign & exponent */
+	get_user(p[1], &pMem[1]);
+#else
 	get_user(p[0], &pMem[1]);
 	get_user(p[1], &pMem[0]);	/* sign & exponent */
+#endif
 }

 #ifdef CONFIG_FPE_NWFPE_XP
@@ -140,8 +145,13 @@ static inline void storeDouble(const unsigned int Fn, unsigned int __user *pMem)
 		val.f = fpa11->fpreg[Fn].fDouble;
 	}

+#ifdef __ARMEB__
+	put_user(val.i[0], &pMem[0]);	/* msw */
+	put_user(val.i[1], &pMem[1]);	/* lsw */
+#else
 	put_user(val.i[1], &pMem[0]);	/* msw */
 	put_user(val.i[0], &pMem[1]);	/* lsw */
+#endif
 }

 #ifdef CONFIG_FPE_NWFPE_XP
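
For readers who want to try the conditional word ordering outside the kernel, the following is a rough user-space analogue of the two patched helpers. It is a sketch under stated assumptions: load_double_words, store_double_words and HOST_BIG_ENDIAN are hypothetical names, plain assignments stand in for get_user/put_user, and the __BYTE_ORDER__ macros predefined by GCC and Clang stand in for the kernel's __ARMEB__ test.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for the __ARMEB__ check in the patch; relies on
 * the __BYTE_ORDER__ macros that GCC and Clang predefine.
 */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define HOST_BIG_ENDIAN 1
#else
#define HOST_BIG_ENDIAN 0
#endif

/* FPA memory words (mem[0] = sign & exponent) -> u64 register image. */
static uint64_t load_double_words(const uint32_t mem[2])
{
	uint32_t p[2];
	uint64_t reg;

#if HOST_BIG_ENDIAN
	p[0] = mem[0];	/* sign & exponent already sits in the high half */
	p[1] = mem[1];
#else
	p[0] = mem[1];
	p[1] = mem[0];	/* sign & exponent must move to the high half */
#endif
	memcpy(&reg, p, sizeof(reg));	/* avoids pointer-cast aliasing */
	return reg;
}

/* u64 register image -> FPA memory words, most significant word first. */
static void store_double_words(uint64_t reg, uint32_t mem[2])
{
	uint32_t i[2];

	memcpy(i, &reg, sizeof(reg));
#if HOST_BIG_ENDIAN
	mem[0] = i[0];	/* msw */
	mem[1] = i[1];	/* lsw */
#else
	mem[0] = i[1];	/* msw */
	mem[1] = i[0];	/* lsw */
#endif
}
```

Loading the FPA words 0x3ff00000 0x00000000 should yield the u64 0x3ff0000000000000 on either kind of host, and storing that value should reproduce the original two words.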