• Claudiu Beznea's avatar
    net: ravb: Keep reverse order of operations in ravb_remove() · edf9bc39
    Claudiu Beznea authored
    On RZ/G3S SMARC Carrier II board having RGMII connections b/w Ethernet
    MACs and PHYs it has been discovered that doing unbind/bind for ravb
    driver in a loop leads to wrong speed and duplex for Ethernet links and
    broken connectivity (the connectivity cannot be restored even with
    bringing interface down/up). Before doing unbind/bind the Ethernet
    interfaces were configured though systemd. The sh instructions used to
    do unbind/bind were:
    
    $ cd /sys/bus/platform/drivers/ravb/
    $ while :; do echo 11c30000.ethernet > unbind ; \
      echo 11c30000.ethernet > bind; done
    
    It has been discovered that there is a race b/w IOCTLs initialized by
    systemd at the response of success binding and the
    "ravb_write(ndev, CCC_OPC_RESET, CCC)" call in ravb_remove() as
    follows:
    
    1/ as a result of bind success the user space open/configures the
       interfaces tough an IOCTL; the following stack trace has been
       identified on RZ/G3S:
    
    Call trace:
    dump_backtrace+0x9c/0x100
    show_stack+0x20/0x38
    dump_stack_lvl+0x48/0x60
    dump_stack+0x18/0x28
    ravb_open+0x70/0xa58
    __dev_open+0xf4/0x1e8
    __dev_change_flags+0x198/0x218
    dev_change_flags+0x2c/0x80
    devinet_ioctl+0x640/0x708
    inet_ioctl+0x1e4/0x200
    sock_do_ioctl+0x50/0x108
    sock_ioctl+0x240/0x358
    __arm64_sys_ioctl+0xb0/0x100
    invoke_syscall+0x50/0x128
    el0_svc_common.constprop.0+0xc8/0xf0
    do_el0_svc+0x24/0x38
    el0_svc+0x34/0xb8
    el0t_64_sync_handler+0xc0/0xc8
    el0t_64_sync+0x190/0x198
    
    2/ this call may execute concurrently with ravb_remove() as the
       unbind/bind operation was executed in a loop
    3/ if the operation mode is changed to RESET (through
       ravb_write(ndev, CCC_OPC_RESET, CCC) call in ravb_remove())
       while the above ravb_open() is in progress it may lead to MAC
       (or PHY, or MAC-PHY connection, the right point hasn't been identified
       at the moment) to be broken, thus the Ethernet connectivity fails to
       restore.
    
    The simple fix for this is to move ravb_write(ndev, CCC_OPC_RESET, CCC))
    after unregister_netdev() to avoid resetting the controller while the
    netdev interface is still registered.
    
    To avoid future issues in ravb_remove(), the patch follows the proper order
    of operations in ravb_remove(): reverse order compared with ravb_probe().
    This avoids described races as the IOCTLs as well as unregister_netdev()
    (called now at the beginning of ravb_remove()) calls rtnl_lock() before
    continuing and IOCTLs check (though devinet_ioctl()) if device is still
    registered just after taking the lock:
    
    int devinet_ioctl(struct net *net, unsigned int cmd, struct ifreq *ifr)
    {
    	// ...
    
            rtnl_lock();
    
            ret = -ENODEV;
            dev = __dev_get_by_name(net, ifr->ifr_name);
            if (!dev)
                    goto done;
    
    	// ...
    done:
            rtnl_unlock();
    out:
            return ret;
    }
    
    Fixes: c156633f ("Renesas Ethernet AVB driver proper")
    Reviewed-by: default avatarSergey Shtylyov <s.shtylyov@omp.ru>
    Signed-off-by: default avatarClaudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
    Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
    edf9bc39
ravb_main.c 79.4 KB