Commit b2944c72 authored by Luck, Tony; committed by Kamal Mostafa

EDAC/sb_edac: Fix computation of channel address

commit eb1af3b7 upstream.

Large memory Haswell-EX systems with multiple DIMMs per channel were
sometimes reporting the wrong DIMM.

Found three problems:

 1) Debug printouts for socket and channel interleave were not interpreting
    the register fields correctly. The socket interleave field is a 2^X
    value (0=1, 1=2, 2=4, 3=8). The channel interleave is X+1 (0=1, 1=2,
    2=3, 3=4).

 2) Actual use of the socket interleave value didn't interpret it as 2^X.

 3) Conversion of address to channel address was complicated, and wrong.
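
The three points above can be illustrated with a small standalone sketch. This is not the driver code: the helper names `sock_ways`, `chan_ways` and `channel_address` are hypothetical, and `shiftup` is taken as-is from the diff (its meaning is not shown in this excerpt). The arithmetic mirrors the lines added by this commit, assuming 64-bit unsigned addresses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not the macros used in sb_edac. */

/* Socket interleave field X encodes 2^X ways: 0=1, 1=2, 2=4, 3=8. */
static unsigned int sock_ways(unsigned int field)
{
	assert(field <= 3);
	return 1u << field;
}

/* Channel interleave field X encodes X+1 ways: 0=1, 1=2, 2=3, 3=4. */
static unsigned int chan_ways(unsigned int field)
{
	assert(field <= 3);
	return field + 1;
}

/*
 * Sketch of the corrected conversion: subtract the offset, drop the
 * low (6 + shiftup) bits, divide by the total interleave, then restore
 * the low bits from the original address.
 */
static uint64_t channel_address(uint64_t addr, uint64_t offset,
				unsigned int ch_way, unsigned int sck_way,
				unsigned int shiftup)
{
	unsigned int low_bits = 6 + shiftup;
	uint64_t ch_addr = addr - offset;

	assert(ch_way != 0 && sck_way != 0);

	ch_addr >>= low_bits;
	ch_addr /= (uint64_t)ch_way * sck_way;
	ch_addr <<= low_bits;
	ch_addr |= addr & ((UINT64_C(1) << low_bits) - 1);

	return ch_addr;
}
```

For example, under these assumptions `channel_address(0x1000, 0, 2, 2, 0)` drops the six low bits (64), divides by the 4-way total interleave (16) and shifts back, giving 0x400.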

Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-edac@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
parent 9dc52359
@@ -1077,8 +1077,8 @@ static void get_memory_layout(const struct mem_ctl_info *mci)
 edac_dbg(0, "TAD#%d: up to %u.%03u GB (0x%016Lx), socket interleave %d, memory interleave %d, TGT: %d, %d, %d, %d, reg=0x%08x\n",
 n_tads, gb, (mb*1000)/1024,
 ((u64)tmp_mb) << 20L,
-(u32)TAD_SOCK(reg),
-(u32)TAD_CH(reg),
+(u32)(1 << TAD_SOCK(reg)),
+(u32)TAD_CH(reg) + 1,
 (u32)TAD_TGT0(reg),
 (u32)TAD_TGT1(reg),
 (u32)TAD_TGT2(reg),
@@ -1356,7 +1356,7 @@ static int get_memory_error_data(struct mem_ctl_info *mci,
 }
 ch_way = TAD_CH(reg) + 1;
-sck_way = TAD_SOCK(reg) + 1;
+sck_way = 1 << TAD_SOCK(reg);
 if (ch_way == 3)
 idx = addr >> 6;
@@ -1413,7 +1413,7 @@ static int get_memory_error_data(struct mem_ctl_info *mci,
 n_tads,
 addr,
 limit,
-(u32)TAD_SOCK(reg),
+sck_way,
 ch_way,
 offset,
 idx,
@@ -1428,18 +1428,12 @@ static int get_memory_error_data(struct mem_ctl_info *mci,
 offset, addr);
 return -EINVAL;
 }
-addr -= offset;
-/* Store the low bits [0:6] of the addr */
-ch_addr = addr & 0x7f;
-/* Remove socket wayness and remove 6 bits */
-addr >>= 6;
-addr = div_u64(addr, sck_xch);
-#if 0
-/* Divide by channel way */
-addr = addr / ch_way;
-#endif
-/* Recover the last 6 bits */
-ch_addr |= addr << 6;
+
+ch_addr = addr - offset;
+ch_addr >>= (6 + shiftup);
+ch_addr /= ch_way * sck_way;
+ch_addr <<= (6 + shiftup);
+ch_addr |= addr & ((1 << (6 + shiftup)) - 1);
 /*
 * Step 3) Decode rank