• Aaron Young's avatar
    ldmvsw: tx queue stuck in stopped state after LDC reset · 8778b276
    Aaron Young authored
    The following patch fixes an issue with the ldmvsw driver where
    the network connection of a guest domain becomes non-functional after
    the guest domain has panic'd and rebooted.
    
    The root cause was determined to be from the following series of
    events:
    
    1. Guest domain panics - resulting in the guest no longer processing
       network packets (from ldmvsw driver)
    2. The ldmvsw driver (in the control domain) eventually exerts flow
       control due to no more available tx drings and stops the tx queue
       for the guest domain
    3. The LDC of the network connection for the guest is reset when
       the guest domain reboots after the panic.
    4. The LDC reset event is received by the ldmvsw driver and the ldmvsw
       responds by clearing the tx queue for the guest.
    5. ldmvsw waits indefinitely for a DATA ACK from the guest - which is
       the normal method to re-enable the tx queue. But the ACK never comes
       because the tx queue was cleared due to the LDC reset.
    
    To fix this issue, in addition to clearing the tx queue, re-enable the
    tx queue on a LDC reset. This prevents the ldmvsw from getting caught in
    this deadlocked state of waiting for a DATA ACK which will never come.
    Signed-off-by: default avatarAaron Young <Aaron.Young@oracle.com>
    Acked-by: default avatarSowmini Varadhan <sowmini.varadhan@oracle.com>
    Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    8778b276
sunvnet_common.c 42.9 KB