Commit 5f8cf647 authored by Oliver O'Halloran, committed by Michael Ellerman

selftests/powerpc: Squash spurious errors due to device removal

For drivers that don't have the error handling callbacks, we implement
recovery by removing the device and re-probing it. This removes the sysfs
directory for the PCI device, which causes the following spurious error
to be printed when checking the PE state:

Breaking 0005:03:00.0...
./eeh-basic.sh: line 13: can't open /sys/bus/pci/devices/0005:03:00.0/eeh_pe_state: no such file
0005:03:00.0, waited 0/60
0005:03:00.0, waited 1/60
0005:03:00.0, waited 2/60
0005:03:00.0, waited 3/60
0005:03:00.0, waited 4/60
0005:03:00.0, waited 5/60
0005:03:00.0, waited 6/60
0005:03:00.0, waited 7/60
0005:03:00.0, Recovered after 8 seconds

We currently try to avoid this by checking if the PE state file exists
before reading from it. This is, however, inherently racy, so rework the
state checking so that we read from the file only once and squash any
errors that occur while reading.
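
The read-once approach described above can be sketched in isolation. This is a minimal illustration, not the selftest itself: `read_state_once` is a hypothetical helper, the temporary file stands in for the sysfs `eeh_pe_state` attribute, and the `0x00 0x00` contents are made-up sample values.

```shell
#!/bin/sh
# Hypothetical helper, not part of the selftest: read a state file once,
# discard any "no such file" error, and print its first two fields.
read_state_once() {
	path="$1"

	# Single read; stderr is discarded so a vanished file stays silent.
	state="$(cat "$path" 2>/dev/null)"
	if [ -z "$state" ]; then
		return 1
	fi

	# Split the snapshot we already hold instead of reopening the file.
	fw="$(echo "$state" | cut -d' ' -f1)"
	sw="$(echo "$state" | cut -d' ' -f2)"
	echo "fw=$fw sw=$sw"
}

# Demo: a temporary file stands in for the sysfs attribute (assumption).
tmp="$(mktemp)"
echo "0x00 0x00" > "$tmp"
read_state_once "$tmp"

# Simulate device removal: the file is gone, yet no error is printed.
rm -f "$tmp"
read_state_once "$tmp" || echo "state unavailable"
```

Because both fields come from one captured string, the file cannot disappear between an existence check and a later read, which is the window the old code left open.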

Fixes: 85d86c8a ("selftests/powerpc: Add basic EEH selftest")
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200727010127.23698-1-oohall@gmail.com
parent c27f2fd1
@@ -5,12 +5,17 @@ pe_ok() {
 	local dev="$1"
 	local path="/sys/bus/pci/devices/$dev/eeh_pe_state"
 
-	if ! [ -e "$path" ] ; then
+	# if a driver doesn't support the error handling callbacks then the
+	# device is recovered by removing and re-probing it. This causes the
+	# sysfs directory to disappear so read the PE state once and squash
+	# any potential error messages
+	local eeh_state="$(cat $path 2>/dev/null)"
+	if [ -z "$eeh_state" ]; then
 		return 1;
 	fi
 
-	local fw_state="$(cut -d' ' -f1 < $path)"
-	local sw_state="$(cut -d' ' -f2 < $path)"
+	local fw_state="$(echo $eeh_state | cut -d' ' -f1)"
+	local sw_state="$(echo $eeh_state | cut -d' ' -f2)"
 
 	# If EEH_PE_ISOLATED or EEH_PE_RECOVERING are set then the PE is in an
 	# error state or being recovered. Either way, not ok.