Commit 8489b17c authored by Qiuxu Zhuo, committed by Borislav Petkov

EDAC, sb_edac: Fix reporting for patrol scrubber errors



sb_edac sometimes reports the wrong DIMM for a memory error found by
the patrol scrubber. That is because the hardware provides only a 4KB
page-aligned address for the error case.

This means that the EDAC driver will point at the DIMM matching offset
0x0 in the 4KB page, but because of interleaving across channels and
ranks, the actual DIMM involved may be different if the error is on some
other cache line within the page.
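
To illustrate the effect described above, here is a minimal sketch using a toy interleaving model. It assumes consecutive 64-byte cache lines rotate across channels; the real address decode in the memory controller is considerably more involved. The names `toy_channel` and `toy_page_align` are hypothetical and are not part of sb_edac.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_CACHE_LINE_SHIFT 6                  /* 64-byte cache lines */
#define TOY_PAGE_MASK_4K     (~(uint64_t)0xfff) /* clears the low 12 bits */

/* Toy model (assumption): cache lines are interleaved round-robin
 * across nchan channels. Not the real hardware decode. */
static unsigned toy_channel(uint64_t addr, unsigned nchan)
{
	assert(nchan != 0);
	return (unsigned)((addr >> TOY_CACHE_LINE_SHIFT) % nchan);
}

/* What the patrol scrubber reports: only the 4KB page-aligned address. */
static uint64_t toy_page_align(uint64_t addr)
{
	return addr & TOY_PAGE_MASK_4K;
}
```

In this model, with four channels, an error at 0x12345040 sits on channel 1, while the page-aligned address 0x12345000 decodes to channel 0, so a decode based on the reported address would blame the wrong channel.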

Therefore, reconstruct the socket/iMC/channel information from the "mce"
structure passed to the EDAC driver. The DIMM cannot be determined, so
pass "dimm=-1" to the EDAC core. It will report that all the DIMMs on
that channel may be affected.
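
The approach above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this patch: `struct toy_mce`, `struct toy_location`, `TOY_FIRST_IMC_BANK` and the linear bank-to-iMC mapping are all hypothetical, and the real mapping differs per CPU generation. The only architectural detail relied on is the memory controller error code layout (`1MMM CCCC` in the low bits of the status, with MMM = 100 for scrubbing errors and CCCC = 1111 for an unspecified channel).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the fields of interest in "struct mce". */
struct toy_mce {
	uint64_t status;   /* machine check status register value */
	uint8_t  bank;     /* machine check bank that logged the error */
	uint8_t  socketid; /* socket that logged the error */
};

struct toy_location {
	int socket, imc, channel, dimm;
};

/* Assumption for this sketch: iMC banks start here, one bank per iMC. */
#define TOY_FIRST_IMC_BANK 7

/* Memory controller error codes look like 0000_0000_1MMM_CCCC
 * (bit 12 is ignored); MMM == 4 marks a memory scrubbing error. */
static int toy_is_scrub_error(uint64_t status)
{
	uint16_t code = (uint16_t)(status & 0xffff);

	return (code & 0xef80) == 0x0080 && ((code >> 4) & 0x7) == 4;
}

/* Fill *loc from the machine check record; returns 0 on success, or -1
 * if this is not a scrub error or the location cannot be derived. */
static int toy_locate(const struct toy_mce *m, struct toy_location *loc)
{
	assert(m != NULL && loc != NULL);

	if (!toy_is_scrub_error(m->status) || m->bank < TOY_FIRST_IMC_BANK)
		return -1;
	if ((m->status & 0xf) == 0xf)	/* channel not specified */
		return -1;

	loc->socket  = m->socketid;
	loc->imc     = m->bank - TOY_FIRST_IMC_BANK;
	loc->channel = (int)(m->status & 0xf);
	loc->dimm    = -1;	/* unknown: all DIMMs on the channel are suspect */
	return 0;
}
```

Passing -1 for the DIMM mirrors the commit's choice to report every DIMM on the channel rather than guess a single one from an address that is only page-accurate.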

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: Aristeu Rozanski <aris@redhat.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20180907230828.13901-3-tony.luck@intel.com


[ Improve comments on the functions to convert bank number
  to memory controller number. Minor cleanup to commit message. ]
Signed-off-by: Tony Luck <tony.luck@intel.com>
[ Massage commit message more. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
parent dcc960b2