    netpoll: Close race condition between poll_one_napi and napi_disable · 2d8bff12
    Neil Horman authored
    Drivers might call napi_disable while not holding the napi instance poll_lock.
    In those instances, it's possible for a race condition to exist between
    poll_one_napi and napi_disable.  That is to say, poll_one_napi only tests the
    NAPI_STATE_SCHED bit to see if there is work to do during a poll, and as such
    the following may happen:
    
    CPU0				CPU1
    ndo_tx_timeout			napi_poll_dev
     napi_disable			 poll_one_napi
      test_and_set_bit (ret 0)
    				  test_bit (ret 1)
       reset adapter		   napi_poll_routine
    
    If the adapter gets a tx timeout without a napi instance scheduled, it's possible
    for the adapter to think it has exclusive access to the hardware (as the napi
    instance is now scheduled via the napi_disable call), while the netpoll code
    thinks there is simply work to do.  The result is parallel hardware access
    leading to corrupt data structures in the driver, and a crash.
    
    Additionally, there is another, more critical race between netpoll and
    napi_disable.  The disabled napi state is actually identical to the scheduled
    state for a given napi instance.  The implication being that, if a napi instance
    is disabled, a netconsole instance would see the napi state of the device as
    having been scheduled, and poll it, likely while the driver was doing something
    requiring exclusive access.  In the case above, it's fairly clear that not having
    the rings in a state ready to be polled will cause any number of crashes.
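
The second race can be illustrated with a small userspace model. This is only a sketch under stated assumptions: the bit number and every `model_*` name are hypothetical and are not kernel code. It assumes, as the description above says, that a disabled instance is left with nothing but the scheduled bit set, and that the pre-fix netpoll check tests only that bit.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit number only, not the kernel's actual value. */
enum { MODEL_SCHED = 0 };

/* Normal scheduling, modeled: sets the SCHED bit. */
unsigned long model_schedule(unsigned long state)
{
    return state | (1UL << MODEL_SCHED);
}

/* Pre-fix disable, modeled: the only trace it leaves is the same SCHED bit. */
unsigned long model_old_disable(unsigned long state)
{
    return state | (1UL << MODEL_SCHED);
}

/* Pre-fix netpoll check, modeled: looks at SCHED and nothing else. */
bool model_old_netpoll_would_poll(unsigned long state)
{
    return ((state >> MODEL_SCHED) & 1UL) != 0;
}

/* Walk-through: the two states are indistinguishable to netpoll. */
void model_old_demo(void)
{
    unsigned long scheduled = model_schedule(0UL);
    unsigned long disabled = model_old_disable(0UL);

    assert(scheduled == disabled);
    assert(model_old_netpoll_would_poll(disabled));
}
```

In this model netpoll has no way to tell "work is pending" from "the driver has taken the instance offline", which is the ambiguity the extra bit is meant to remove.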
    
    The fix should be pretty easy.  netpoll uses its own bit to indicate that
    the napi instance is in a state of being serviced by netpoll (NAPI_STATE_NPSVC).
    We can just gate disabling on that bit as well as the sched bit.  That should
    prevent netpoll from conducting a napi poll if we convert its set bit to a
    test_and_set_bit operation to provide mutual exclusion.
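
The gating idea can be sketched in userspace with C11 atomics. This is a minimal model and not the patch itself: the bit numbers, the `model_*` names, and the busy-wait loops are assumptions made for illustration (a kernel implementation would sleep between attempts rather than spin).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Bit numbers are illustrative only, not the kernel's actual values. */
enum { MODEL_SCHED = 0, MODEL_NPSVC = 1 };

struct model_napi {
    atomic_ulong state;
};

/* Atomically set bit nr and return its previous value. */
bool model_test_and_set_bit(int nr, atomic_ulong *addr)
{
    unsigned long mask = 1UL << nr;
    return (atomic_fetch_or(addr, mask) & mask) != 0;
}

void model_clear_bit(int nr, atomic_ulong *addr)
{
    atomic_fetch_and(addr, ~(1UL << nr));
}

/* Disable claims SCHED and then NPSVC. This model busy-waits. */
void model_napi_disable(struct model_napi *n)
{
    while (model_test_and_set_bit(MODEL_SCHED, &n->state))
        ;
    while (model_test_and_set_bit(MODEL_NPSVC, &n->state))
        ;
}

/* Netpoll side: returns true only if it won the right to poll. */
bool model_poll_one_napi(struct model_napi *n)
{
    if (!(atomic_load(&n->state) & (1UL << MODEL_SCHED)))
        return false;   /* no work scheduled */
    if (model_test_and_set_bit(MODEL_NPSVC, &n->state))
        return false;   /* NPSVC already held: disabled or being serviced */
    /* ... the driver's poll routine would run here ... */
    model_clear_bit(MODEL_NPSVC, &n->state);
    return true;
}

/* Single-threaded walk-through of the two cases. */
void model_demo(void)
{
    struct model_napi n;

    atomic_init(&n.state, 1UL << MODEL_SCHED);  /* genuinely scheduled */
    assert(model_poll_one_napi(&n));            /* netpoll may poll */

    atomic_store(&n.state, 0UL);
    model_napi_disable(&n);                     /* driver disables */
    assert(!model_poll_one_napi(&n));           /* netpoll backs off */
}
```

Because both sides use an atomic test-and-set on the same bit, at most one of them can observe the bit as previously clear, so the disable path and the netpoll path cannot both believe they own the instance.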
    
    Change notes:
    V2)
    	Remove a trailing whitespace
    	Resubmit with proper subject prefix
    
    V3)
    	Clean up spacing nits
    
    Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
    CC: "David S. Miller" <davem@davemloft.net>
    CC: jmaxwell@redhat.com
    Tested-by: jmaxwell@redhat.com
    Signed-off-by: David S. Miller <davem@davemloft.net>