    net: phy: Fix deadlocking in phy_error() invocation · a0e026e7
    Serge Semin authored
    Since commit 91a7cda1 ("net: phy: Fix race condition on link status
    change"), every phy_error() invocation from a PHY-driver threaded IRQ
    handler has caused a nested-mutex-lock deadlock, because since that
    change those handlers are called with the phydev->lock mutex held.
    Here is the call chain:
    
    IRQ: phy_interrupt()
         +-> mutex_lock(&phydev->lock); <--------------------+
             drv->handle_interrupt()                         | Deadlock due
             +-> ERROR: phy_error()                          + to the nested
                        +-> phy_process_error()              | mutex lock
                            +-> mutex_lock(&phydev->lock); <-+
                                phydev->state = PHY_ERROR;
                                mutex_unlock(&phydev->lock);
             mutex_unlock(&phydev->lock);
    
    The problem can be easily reproduced by calling phy_error() from any
    PHY-device threaded interrupt handler. Fix it by dropping the
    phydev->lock locking from phy_process_error() and printing an error
    message to the system log if the caller does not hold the mutex.
    
    Note that for the fix to work correctly within the PHY subsystem
    itself, phydev->lock locking must be added to the phy_error_precise()
    function.
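
The shape of the fix described above can be sketched in userspace as follows. This is an illustration under assumptions, not the actual patch: all `fake_*`, `new_process_error()` and `locked_error()` names are hypothetical, `fake_is_locked()` only approximates the kernel's mutex_is_locked() with a trylock probe, and the `warnings` counter exists purely so the log path can be observed.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for struct phy_device. */
struct fake_phy {
    pthread_mutex_t lock;
    int state;                  /* 0 = running, 1 = error */
    int warnings;               /* number of "lock not held" messages */
};

/* Rough analogue of mutex_is_locked(): reports whether anyone holds
 * the mutex. A successful probe is released again immediately. */
static int fake_is_locked(pthread_mutex_t *m)
{
    int ret = pthread_mutex_trylock(m);

    if (ret == 0) {
        pthread_mutex_unlock(m);
        return 0;
    }
    return ret == EBUSY;
}

/* Post-fix shape: no locking here, only a check and a log message. */
static void new_process_error(struct fake_phy *phy)
{
    if (!fake_is_locked(&phy->lock)) {
        fprintf(stderr, "fake_phy: state changed without the lock held\n");
        phy->warnings++;
    }
    phy->state = 1;
}

/* Models a caller that does not already hold the lock (the role the
 * commit message assigns to phy_error_precise()): it locks around
 * the error path itself. */
static void locked_error(struct fake_phy *phy)
{
    pthread_mutex_lock(&phy->lock);
    new_process_error(phy);
    pthread_mutex_unlock(&phy->lock);
}

/* Models the interrupt path: the handler runs with the lock held and
 * can now reach the error path without a nested lock. */
static void fake_interrupt(struct fake_phy *phy)
{
    int ret = pthread_mutex_lock(&phy->lock);

    assert(ret == 0);
    new_process_error(phy);
    pthread_mutex_unlock(&phy->lock);
}
```

Both locked paths set the error state silently, while a call made without the lock still changes the state but leaves a message in the log, which is the trade-off the commit message describes.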
    
    Link: https://lore.kernel.org/netdev/20230816180944.19262-1-fancer.lancer@gmail.com
    Fixes: 91a7cda1 ("net: phy: Fix race condition on link status change")
    Suggested-by: Andrew Lunn <andrew@lunn.ch>
    Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
    Reviewed-by: Andrew Lunn <andrew@lunn.ch>
    Signed-off-by: David S. Miller <davem@davemloft.net>
phy.c 41.1 KB