    PCI: fix kernel oops on bridge removal · 7ae0567f
    Kenji Kaneshige authored
    Fix the following kernel oops, which happens when removing a PCI
    bridge with pciehp loaded. It should also occur with any other hotplug
    driver that is implemented as a bridge's driver.
    
    [  459.997257] pciehp 0000:2f:04.0:pcie24: unloading service driver pciehp
    [  459.997495] general protection fault: 0000 [#1] SMP
    [  459.997737] last sysfs file: /sys/devices/pci0000:00/0000:00:04.0/0000:2e:00.0/0000:2f:04.0/remove
    [  459.997964] CPU 4
    [  459.998129] Modules linked in: pciehp ipv6 autofs4 hidp rfcomm l2cap bluetooth sunrpc cpufreq_ondemand acpi_cpufreq dm_mirror dm_region_hash dm_log dm_multipath scsi_dh dm_mod sbs sbshc battery ac parport_pc lp parport mptspi mptscsih mptbase scsi_transport_spi e1000e sg sr_mod cdrom button serio_raw i2c_i801 i2c_core shpchp pcspkr ata_piix libata megaraid_sas sd_mod scsi_mod crc_t10dif ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode]
    [  459.998129] Pid: 56, comm: events/4 Not tainted 2.6.29-rc8-kk #1 PRIMERGY
    [  459.998129] RIP: 0010:[<ffffffff803bf047>]  [<ffffffff803bf047>] pci_slot_release+0x37/0x100
    [  459.998129] RSP: 0018:ffff88083b3bf9e0  EFLAGS: 00010246
    [  459.998129] RAX: ffff88083adc5158 RBX: ffff880836c1bc80 RCX: 6b6b6b6b6b6b6b6b
    [  459.998129] RDX: 0000000000000000 RSI: ffffffff803a77f0 RDI: ffff880836c1bc48
    [  459.998129] RBP: ffff88083b3bfa00 R08: 0000000000000002 R09: 0000000000000000
    [  459.998129] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880836c1bc48
    [  459.998129] R13: ffff880836c1bc20 R14: ffff880836c1bc48 R15: ffff880836d1ec38
    [  459.998129] FS:  0000000000000000(0000) GS:ffff88083ccc3770(0000) knlGS:0000000000000000
    [  459.998129] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
    [  459.998129] CR2: 00007f1562f1d558 CR3: 0000000838090000 CR4: 00000000000006e0
    [  459.998129] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [  459.998129] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    [  459.998129] Process events/4 (pid: 56, threadinfo ffff88083b3be000, task ffff88083b3b3e40)
    [  459.998129] Stack:
    [  459.998129]  ffff880836c1bc80 ffff880836c1bc48 ffffffff80793320 ffff88083b0d0960
    [  459.998129]  ffff88083b3bfa30 ffffffff803a788a ffff880836c1bc80 ffffffff803a77f0
    [  459.998129]  ffff880836c1bc20 ffff880836d1ec38 ffff88083b3bfa50 ffffffff803a8ce7
    [  459.998129] Call Trace:
    [  459.998129]  [<ffffffff803a788a>] kobject_release+0x9a/0x290
    [  459.998129]  [<ffffffff803a77f0>] ? kobject_release+0x0/0x290
    [  459.998129]  [<ffffffff803a8ce7>] kref_put+0x37/0x80
    [  459.998129]  [<ffffffff803a76f7>] kobject_put+0x27/0x60
    [  459.998129]  [<ffffffff803bebcc>] ? pci_destroy_slot+0x3c/0xc0
    [  459.998129]  [<ffffffff803bebd5>] pci_destroy_slot+0x45/0xc0
    [  459.998129]  [<ffffffff803c797d>] pci_hp_deregister+0x13d/0x210
    [  459.998129]  [<ffffffffa031141d>] cleanup_slots+0x2d/0x80 [pciehp]
    [  459.998129]  [<ffffffffa0311735>] pciehp_remove+0x15/0x30 [pciehp]
    [  459.998129]  [<ffffffff803c4c99>] pcie_port_remove_service+0x69/0x90
    [  459.998129]  [<ffffffff80441da9>] __device_release_driver+0x59/0x90
    [  459.998129]  [<ffffffff80441edb>] device_release_driver+0x2b/0x40
    [  459.998129]  [<ffffffff804419d6>] bus_remove_device+0xa6/0x120
    [  459.998129]  [<ffffffff8043e46b>] device_del+0x12b/0x190
    [  459.998129]  [<ffffffff803c4d90>] ? remove_iter+0x0/0x40
    [  459.998129]  [<ffffffff8043e4f6>] device_unregister+0x26/0x70
    [  459.998129]  [<ffffffff803c4dbf>] remove_iter+0x2f/0x40
    [  459.998129]  [<ffffffff8043ddf3>] device_for_each_child+0x33/0x60
    [  459.998129]  [<ffffffff8033ee30>] ? sysfs_schedule_callback_work+0x0/0x50
    [  459.998129]  [<ffffffff803c4d30>] pcie_port_device_remove+0x30/0x80
    [  459.998129]  [<ffffffff803c55a1>] pcie_portdrv_remove+0x11/0x20
    [  459.998129]  [<ffffffff803bfeb2>] pci_device_remove+0x32/0x70
    [  459.998129]  [<ffffffff80441da9>] __device_release_driver+0x59/0x90
    [  459.998129]  [<ffffffff80441edb>] device_release_driver+0x2b/0x40
    [  459.998129]  [<ffffffff804419d6>] bus_remove_device+0xa6/0x120
    [  459.998129]  [<ffffffff8043e46b>] device_del+0x12b/0x190
    [  459.998129]  [<ffffffff8043e4f6>] device_unregister+0x26/0x70
    [  459.998129]  [<ffffffff803ba969>] pci_stop_dev+0x49/0x60
    [  459.998129]  [<ffffffff803baab0>] pci_remove_bus_device+0x40/0xc0
    [  459.998129]  [<ffffffff803c10d9>] remove_callback+0x29/0x40
    [  459.998129]  [<ffffffff8033ee4f>] sysfs_schedule_callback_work+0x1f/0x50
    [  459.998129]  [<ffffffff8025769a>] run_workqueue+0x15a/0x230
    [  459.998129]  [<ffffffff80257648>] ? run_workqueue+0x108/0x230
    [  459.998129]  [<ffffffff8025846f>] worker_thread+0x9f/0x100
    [  459.998129]  [<ffffffff8025bce0>] ? autoremove_wake_function+0x0/0x40
    [  459.998129]  [<ffffffff802583d0>] ? worker_thread+0x0/0x100
    [  459.998129]  [<ffffffff8025b89d>] kthread+0x4d/0x80
    [  459.998129]  [<ffffffff8020d4ba>] child_rip+0xa/0x20
    [  459.998129]  [<ffffffff8020cebc>] ? restore_args+0x0/0x30
    [  459.998129]  [<ffffffff8025b850>] ? kthread+0x0/0x80
    [  459.998129]  [<ffffffff8020d4b0>] ? child_rip+0x0/0x20
    [  459.998129] Code: 56 49 89 fe 41 55 4c 8d 6f d8 41 54 53 74 09 f6 05 b8 05 c7 00 08 75 72 49 8b 45 00 48 8b 48 28 eb 05 66 90 48 89 f1 49 8b 45 00 <48> 8b 31 48 83 c0 28 0f 18 0e 48 39 c1 74 1c 8b 41 38 41 0f b6
    [  459.998129] RIP  [<ffffffff803bf047>] pci_slot_release+0x37/0x100
    [  459.998129]  RSP <ffff88083b3bf9e0>
    [  460.018595] ---[ end trace 5a08d2095374aedc ]---
    
    pci_remove_bus_device() removes all buses and devices under the
    bridge, and then removes the bridge. So the remove() callback of a
    hotplug driver implemented as a bridge's driver is executed after the
    struct pci_bus of the bridge's secondary bus has been removed. The
    remove() callback of such a driver unregisters the slot using
    pci_destroy_slot(), and the slot's release callback refers to the
    struct pci_bus that was already freed. This is the cause of the
    kernel oops.
    
    This patch solves the problem by stopping bus drivers before removing the
    bridge and its child bus and devices.
    Acked-by: Alex Chiang <achiang@hp.com>
    Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
    Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
remove.c 3.62 KB