    mlxsw: pci: Add shutdown method in PCI driver · c1020d3c
    Danielle Ratson authored
    On an arm64 platform with the Spectrum ASIC, after loading and executing
    a new kernel via kexec, the following trace [1] is observed. This seems
    to be caused by the device not being properly shut down before the new
    kernel is executed.
    
    Fix this by implementing a shutdown method which mirrors the remove
    method, as recommended by the kexec maintainer [2][3].
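
As a rough illustration of what "a shutdown method which mirrors the remove method" can look like, here is a minimal userspace sketch. It is not the actual mlxsw change: `sketch_dev`, `sketch_driver`, and every `sketch_*` function are hypothetical stand-ins for the kernel's PCI driver callbacks, and the `dma_active` flag is an assumption about what "properly shut down" means for the device. The sketch only shows the idea that the shutdown callback can point at the same teardown routine as remove, so the path taken before kexec quiesces the device too.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a PCI device; not the kernel's struct pci_dev. */
struct sketch_dev {
	bool dma_active;     /* assumed: device may still access host memory */
	int  teardown_calls; /* how many times the teardown routine ran */
};

/* Hypothetical stand-in for a driver's callback table. */
struct sketch_driver {
	int  (*probe)(struct sketch_dev *dev);
	void (*remove)(struct sketch_dev *dev);
	void (*shutdown)(struct sketch_dev *dev);
};

/* Bring the device up; from here on it is considered active. */
static int sketch_probe(struct sketch_dev *dev)
{
	dev->dma_active = true;
	dev->teardown_calls = 0;
	return 0;
}

/* Teardown path: quiesce the device so it stops touching host memory. */
static void sketch_remove(struct sketch_dev *dev)
{
	dev->dma_active = false;
	dev->teardown_calls++;
}

/* Fill in the callbacks; shutdown mirrors remove by reusing its function. */
static void sketch_driver_init(struct sketch_driver *drv)
{
	drv->probe = sketch_probe;
	drv->remove = sketch_remove;
	drv->shutdown = sketch_remove;
}

/* Models the pre-kexec path, which invokes shutdown rather than remove. */
static void sketch_kexec(const struct sketch_driver *drv,
			 struct sketch_dev *dev)
{
	if (drv->shutdown)
		drv->shutdown(dev);
}
```

Calling `sketch_driver_init()`, then `probe`, then `sketch_kexec()` leaves the device quiesced after a single teardown call. Without the `shutdown` assignment, `sketch_kexec()` would do nothing and the device would still be marked active when the new kernel starts.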
    
    [1]
    BUG: Bad page state in process devlink pfn:22f73d
    page:fffffe00089dcf40 refcount:-1 mapcount:0 mapping:0000000000000000 index:0x0
    flags: 0x2ffff00000000000()
    raw: 2ffff00000000000 0000000000000000 ffffffff089d0201 0000000000000000
    raw: 0000000000000000 0000000000000000 ffffffffffffffff 0000000000000000
    page dumped because: nonzero _refcount
    Modules linked in:
    CPU: 1 PID: 16346 Comm: devlink Tainted: G B 5.8.0-rc6-custom-273020-gac6b365b1bf5 #44
    Hardware name: Marvell Armada 7040 TX4810M (DT)
    Call trace:
     dump_backtrace+0x0/0x1d0
     show_stack+0x1c/0x28
     dump_stack+0xbc/0x118
     bad_page+0xcc/0xf8
     check_free_page_bad+0x80/0x88
     __free_pages_ok+0x3f8/0x418
     __free_pages+0x38/0x60
     kmem_freepages+0x200/0x2a8
     slab_destroy+0x28/0x68
     slabs_destroy+0x60/0x90
     ___cache_free+0x1b4/0x358
     kfree+0xc0/0x1d0
     skb_free_head+0x2c/0x38
     skb_release_data+0x110/0x1a0
     skb_release_all+0x2c/0x38
     consume_skb+0x38/0x130
     __dev_kfree_skb_any+0x44/0x50
     mlxsw_pci_rdq_fini+0x8c/0xb0
     mlxsw_pci_queue_fini.isra.0+0x28/0x58
     mlxsw_pci_queue_group_fini+0x58/0x88
     mlxsw_pci_aqs_fini+0x2c/0x60
     mlxsw_pci_fini+0x34/0x50
     mlxsw_core_bus_device_unregister+0x104/0x1d0
     mlxsw_devlink_core_bus_device_reload_down+0x2c/0x48
     devlink_reload+0x44/0x158
     devlink_nl_cmd_reload+0x270/0x290
     genl_rcv_msg+0x188/0x2f0
     netlink_rcv_skb+0x5c/0x118
     genl_rcv+0x3c/0x50
     netlink_unicast+0x1bc/0x278
     netlink_sendmsg+0x194/0x390
     __sys_sendto+0xe0/0x158
     __arm64_sys_sendto+0x2c/0x38
     el0_svc_common.constprop.0+0x70/0x168
     do_el0_svc+0x28/0x88
     el0_sync_handler+0x88/0x190
     el0_sync+0x140/0x180
    
    [2]
    https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1195432.html
    
    [3]
    https://patchwork.kernel.org/project/linux-scsi/patch/20170212214920.28866-1-anton@ozlabs.org/#20116693
    
    Cc: Eric Biederman <ebiederm@xmission.com>
    Signed-off-by: Danielle Ratson <danieller@nvidia.com>
    Signed-off-by: Ido Schimmel <idosch@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
pci.c 54.2 KB