    can: mcp251xfd: fix infinite loop when xmit fails · d8fb63e4
    Vitor Soares authored
    When the mcp251xfd_start_xmit() function fails, the driver stops
    processing messages, and the interrupt routine does not return,
    running indefinitely even after killing the running application.
    
    Error messages:
    [  441.298819] mcp251xfd spi2.0 can0: ERROR in mcp251xfd_start_xmit: -16
    [  441.306498] mcp251xfd spi2.0 can0: Transmit Event FIFO buffer not empty. (seq=0x000017c7, tef_tail=0x000017cf, tef_head=0x000017d0, tx_head=0x000017d3).
    ... and repeat forever.
    
    The issue can be triggered when multiple devices share the same SPI
    interface and there is concurrent access to the bus.
    
    The problem occurs because tx_ring->head increments even if
    mcp251xfd_start_xmit() fails. Consequently, the driver skips one TX
    packet while still expecting a response in
    mcp251xfd_handle_tefif_one().
    
    Resolve the issue by starting a workqueue to write the TX object
    synchronously if the error is -EBUSY. For any other error, decrement
    tx_ring->head, remove the skb from the echo stack, and drop the message.
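
The head accounting described above can be illustrated with a minimal, self-contained sketch. It is not the driver code: struct tx_ring_sketch, the write_* stubs, and both xmit_* functions are hypothetical names introduced only for illustration, and the deferred synchronous write is assumed to succeed immediately instead of running from a workqueue.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the driver's TX bookkeeping. */
struct tx_ring_sketch {
	unsigned int head;     /* slots claimed by the transmit path */
	unsigned int written;  /* frames that actually reached the controller */
	unsigned int echo;     /* skbs currently held on the echo stack */
	unsigned int dropped;  /* frames dropped after a hard error */
};

/* Hypothetical stand-ins for the asynchronous SPI write: 0 or -errno. */
typedef int (*write_fn)(void);
static int write_ok(void)       { return 0; }
static int write_busy(void)     { return -EBUSY; }
static int write_io_error(void) { return -EIO; }

/* The completion handler expects one event per claimed slot. */
static bool ring_consistent(const struct tx_ring_sketch *ring)
{
	return ring->head == ring->written;
}

/* Behaviour described as the bug: head advances even if the write fails. */
static int xmit_buggy(struct tx_ring_sketch *ring, write_fn write_async)
{
	int err;

	ring->echo++;
	ring->head++;
	err = write_async();
	if (!err)
		ring->written++;
	return err;
}

/* Behaviour described as the fix. */
static int xmit_fixed(struct tx_ring_sketch *ring, write_fn write_async)
{
	int err;

	ring->echo++;
	ring->head++;
	err = write_async();
	if (!err) {
		ring->written++;
		return 0;
	}
	if (err == -EBUSY) {
		/* Bus contended: keep the slot and write the frame
		 * synchronously later (assumed to succeed in this sketch). */
		ring->written++;
		return 0;
	}
	/* Any other error: roll back the slot, release the echo skb,
	 * and drop the frame. */
	assert(ring->head > 0 && ring->echo > 0);
	ring->head--;
	ring->echo--;
	ring->dropped++;
	return err;
}
```

With xmit_buggy() a single -EBUSY leaves head one ahead of written, which corresponds to the completion handler waiting for an event that never arrives. With xmit_fixed() the two counters stay equal for both the -EBUSY path and the drop path.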
    
    Fixes: 55e5b97f ("can: mcp25xxfd: add driver for Microchip MCP25xxFD SPI CAN")
    Cc: stable@vger.kernel.org
    Signed-off-by: Vitor Soares <vitor.soares@toradex.com>
    Link: https://lore.kernel.org/all/20240517134355.770777-1-ivitro@gmail.com
    [mkl: use more imperative wording in patch description]
    Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
mcp251xfd-core.c 56.3 KB