    powerpc/eeh: Fix crash on converting OF node to edev · 1e38b714
    Gavin Shan authored
    The kernel crash was reported by Alexey. He was testing a feature
    with a private kernel, in which he had added code to pci_pm_reset()
    to read the CSR back after writing it. The bug can be reproduced on
    a Fibre Channel card (Fibre Channel: Emulex Corporation Saturn-X:
    LightPulse Fibre Channel Host Adapter (rev 03)) with the following
    commands:
    
    	# echo 1 > /sys/devices/pci0004:01/0004:01:00.0/reset
    	# rmmod lpfc
    	# modprobe lpfc
    
    The background of the test case is that the additional config
    space reads in pci_pm_reset() caused an EEH error, but the error
    was not detected until "modprobe lpfc". In that case, all the PCI
    devices on the PCI bus (0004:01) were removed and added back after
    the PE reset, and the EEH devices were then rebuilt from the OF
    nodes. Unfortunately, there were some child OF nodes under the
    PCI device (0004:01:00.0) that had no attached PCI_DN, since
    they are invisible from the PCI domain. We still tried to convert
    those OF nodes to EEH devices without checking for an attached
    PCI_DN, which eventually caused the following kernel crash:
    
    Unable to handle kernel paging request for data at address 0x00000030
    Faulting instruction address: 0xc00000000004d888
    cpu 0x0: Vector: 300 (Data Access) at [c000000fc797b950]
        pc: c00000000004d888: .eeh_add_device_tree_early+0x78/0x140
        lr: c00000000004d880: .eeh_add_device_tree_early+0x70/0x140
        sp: c000000fc797bbd0
       msr: 8000000000009032
       dar: 30
     dsisr: 40000000
      current = 0xc000000fc78d9f70
      paca    = 0xc00000000edb0000   softe: 0        irq_happened: 0x00
        pid   = 2951, comm = eehd
    enter ? for help
    [c000000fc797bc50] c00000000004d848 .eeh_add_device_tree_early+0x38/0x140
    [c000000fc797bcd0] c00000000004d848 .eeh_add_device_tree_early+0x38/0x140
    [c000000fc797bd50] c000000000051b54 .pcibios_add_pci_devices+0x34/0x190
    [c000000fc797bde0] c00000000004fb10 .eeh_reset_device+0x100/0x160
    [c000000fc797be70] c0000000000502dc .eeh_handle_event+0x19c/0x300
    [c000000fc797bf00] c000000000050570 .eeh_event_handler+0x130/0x1a0
    [c000000fc797bf90] c000000000020138 .kernel_thread+0x54/0x70
    
    The patch changes of_node_to_eeh_dev() to return NULL if the
    passed OF node has no attached PCI_DN.
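
The guard described above can be sketched roughly as follows. This is a
minimal user-space illustration, not the actual diff: the struct layouts
are simplified stand-ins for the kernel's definitions, the config_addr
field is a placeholder, and the extra NULL check on dn itself is an
assumption. The faulting address 0x30 in the trace is consistent with
reading a field at a small offset from a NULL PCI_DN pointer, which is
the dereference this check avoids.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; layouts are illustrative only. */
struct eeh_dev {
	int config_addr;	/* placeholder field, not taken from the patch */
};

struct pci_dn {
	struct eeh_dev *edev;	/* EEH device attached to this PCI_DN */
};

struct device_node {
	void *data;		/* points to a struct pci_dn, or NULL if none is attached */
};

#define PCI_DN(dn)	((struct pci_dn *)(dn)->data)

/* Return the EEH device for an OF node, or NULL when no PCI_DN is attached. */
static inline struct eeh_dev *of_node_to_eeh_dev(struct device_node *dn)
{
	/* Child OF nodes that are invisible from the PCI domain have no PCI_DN. */
	if (!dn || !PCI_DN(dn))
		return NULL;

	return PCI_DN(dn)->edev;
}
```

With a check of this shape, callers that walk the OF tree receive NULL
for such child nodes instead of dereferencing a missing PCI_DN, so each
caller needs to treat a NULL result as "skip this node".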
    
    Cc: stable@vger.kernel.org
    Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
    Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
    Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
eeh.c 26.3 KB