Commit c872a1f9 authored by Kamenee Arumugam's avatar Kamenee Arumugam Committed by Doug Ledford

IB/Hfi1: Read CCE Revision register to verify the device is responsive

When Hfi1 device is unresponsive, reading the RcvArrayCnt register
will return all 1's. This value is then used to remap chip's RcvArray.
The incorrect all ones value used in remapping RcvArray
will cause warn on as shown by trace below:

[<ffffffff81685eac>] dump_stack+0x19/0x1b
[<ffffffff81085820>] warn_slowpath_common+0x70/0xb0
[<ffffffff810858bc>] warn_slowpath_fmt+0x5c/0x80
[<ffffffff81065c29>] __ioremap_caller+0x279/0x320
[<ffffffff8142873c>] ? _dev_info+0x6c/0x90
[<ffffffffa021d155>] ? hfi1_pcie_ddinit+0x1d5/0x330 [hfi1]
[<ffffffff81065d62>] ioremap_wc+0x32/0x40
[<ffffffffa021d155>] hfi1_pcie_ddinit+0x1d5/0x330 [hfi1]
[<ffffffffa0204851>] hfi1_init_dd+0x1d1/0x2440 [hfi1]
[<ffffffff813503dc>] ? pci_write_config_word+0x1c/0x20

Read CCE revision register first to verify that WFR device is
responsive. If the read return "all ones", bail out from init
and fail the driver load.
Reviewed-by: default avatarMike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: default avatarMichael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: default avatarKamenee Arumugam <kamenee.arumugam@intel.com>
Signed-off-by: default avatarDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
parent a74d5307
......@@ -15038,13 +15038,6 @@ struct hfi1_devdata *hfi1_init_dd(struct pci_dev *pdev,
if (ret < 0)
goto bail_cleanup;
/* verify that reads actually work, save revision for reset check */
dd->revision = read_csr(dd, CCE_REVISION);
if (dd->revision == ~(u64)0) {
dd_dev_err(dd, "cannot read chip CSRs\n");
ret = -EINVAL;
goto bail_cleanup;
}
dd->majrev = (dd->revision >> CCE_REVISION_CHIP_REV_MAJOR_SHIFT)
& CCE_REVISION_CHIP_REV_MAJOR_MASK;
dd->minrev = (dd->revision >> CCE_REVISION_CHIP_REV_MINOR_SHIFT)
......
......@@ -183,6 +183,14 @@ int hfi1_pcie_ddinit(struct hfi1_devdata *dd, struct pci_dev *pdev)
return -ENOMEM;
}
dd_dev_info(dd, "UC base1: %p for %x\n", dd->kregbase1, RCV_ARRAY);
/* verify that reads actually work, save revision for reset check */
dd->revision = readq(dd->kregbase1 + CCE_REVISION);
if (dd->revision == ~(u64)0) {
dd_dev_err(dd, "Cannot read chip CSRs\n");
goto nomem;
}
dd->chip_rcv_array_count = readq(dd->kregbase1 + RCV_ARRAY_CNT);
dd_dev_info(dd, "RcvArray count: %u\n", dd->chip_rcv_array_count);
dd->base2_start = RCV_ARRAY + dd->chip_rcv_array_count * 8;
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment