    hvsock: fix epollout hang from race condition · cb359b60
    Sunil Muthuswamy authored
    Currently, hvsock can enter a state where epoll_wait() on EPOLLOUT never
    returns even though the hvsock socket is writable. This is caused by a
    race condition that can be hit with the following sequence:
    - fd = socket(hvsocket)
    - fd_out = dup(fd)
    - fd_in = dup(fd)
    - start a writer thread that writes data to fd_out, waiting for space with
      epoll_wait(fd_out, EPOLLOUT)
    - start a reader thread that reads data from fd_in, waiting for data with
      epoll_wait(fd_in, EPOLLIN)
    - on the host, two threads read data from and write data to the hvsocket
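
The descriptor setup above can be sketched in userspace as follows. This is
only an illustration of the pattern, not a reproducer: it assumes an AF_UNIX
socketpair() as a stand-in for the hvsocket, runs the writer and reader steps
once on a single thread instead of looping in two threads, and the name
roundtrip_once() is hypothetical.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns the number of bytes read back through fd_in, or -1 on error. */
long roundtrip_once(void)
{
    int sv[2];
    char buf[4];
    struct epoll_event ev_out = { .events = EPOLLOUT };
    struct epoll_event ev_in = { .events = EPOLLIN };
    struct epoll_event got;
    long ret = -1;

    /* Assumption: sv[0] plays the guest socket, sv[1] plays the host. */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    /* Two descriptors, one underlying socket. */
    int fd_out = dup(sv[0]);
    int fd_in = dup(sv[0]);
    int ep_out = epoll_create1(0);
    int ep_in = epoll_create1(0);
    assert(fd_out >= 0 && fd_in >= 0 && ep_out >= 0 && ep_in >= 0);

    if (epoll_ctl(ep_out, EPOLL_CTL_ADD, fd_out, &ev_out) < 0 ||
        epoll_ctl(ep_in, EPOLL_CTL_ADD, fd_in, &ev_in) < 0)
        goto out;

    /* Writer step: wait for EPOLLOUT, then write. */
    if (epoll_wait(ep_out, &got, 1, 1000) != 1 ||
        write(fd_out, "ping", 4) != 4)
        goto out;

    /* Peer ("host") step: echo the data back. */
    if (read(sv[1], buf, sizeof(buf)) != 4 ||
        write(sv[1], buf, 4) != 4)
        goto out;

    /* Reader step: wait for EPOLLIN on the other descriptor, then read. */
    if (epoll_wait(ep_in, &got, 1, 1000) != 1)
        goto out;
    ret = read(fd_in, buf, sizeof(buf));

out:
    close(ep_out);
    close(ep_in);
    close(fd_out);
    close(fd_in);
    close(sv[0]);
    close(sv[1]);
    return ret;
}
```

Because fd_out and fd_in refer to the same socket, a poll for EPOLLIN on
fd_in goes through the same socket-level poll path as a poll for EPOLLOUT on
fd_out, which is what lets the two checks interfere.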
    
    Call stack of the space check (innermost frame first):
    hvs_stream_has_space
    hvs_notify_poll_out
    vsock_poll
    sock_poll
    ep_poll
    
    Race condition:
    check for EPOLLOUT from ep_poll():
    	assume there is no writable space in the socket
    	hvs_stream_has_space() returns 0
    check for EPOLLIN from ep_poll():
    	assume the socket now has some free space, but less than
    		HVS_PKT_LEN(HVS_SEND_BUF_SIZE)
    	hvs_stream_has_space() clears the channel pending send size
    	the host does not notify the guest because the pending send size
    		has been cleared, so hvsocket never marks the socket
    		writable
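
The lost notification can be replayed with a toy model. This is a simplified
sketch, not the kernel code: struct toy_chan, TOY_THRESHOLD and all toy_*
functions are hypothetical, and the model assumes that a pending send size of
0 means "do not notify" and that the pre-fix space check set the threshold
when it found no space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_THRESHOLD 4096u /* stand-in for HVS_PKT_LEN(HVS_SEND_BUF_SIZE) */

struct toy_chan {
    uint32_t free_bytes;        /* writable space in the send ring */
    uint32_t pending_send_size; /* 0: host must not notify the guest */
};

/* Pre-fix behaviour as modelled here: every check rewrites the threshold. */
static uint32_t toy_has_space_old(struct toy_chan *c)
{
    c->pending_send_size = c->free_bytes > 0 ? 0 : TOY_THRESHOLD;
    return c->free_bytes;
}

/* Host consumes n bytes; returns true if it would notify the guest. */
static bool toy_host_drain(struct toy_chan *c, uint32_t n)
{
    uint32_t before = c->free_bytes;

    c->free_bytes += n;
    return c->pending_send_size != 0 &&
           before <= c->pending_send_size &&
           c->free_bytes > c->pending_send_size;
}

/* Replays the sequence above; returns true if the writer gets woken. */
bool toy_replay_race(void)
{
    struct toy_chan c = { 0, 0 };
    bool woken = false;

    /* EPOLLOUT check: ring is full, the writer goes to sleep. */
    uint32_t space = toy_has_space_old(&c);
    assert(space == 0);
    (void)space;

    /* Host frees a little space, still below the threshold. */
    woken |= toy_host_drain(&c, 64);

    /* EPOLLIN check on the other fd sees 64 bytes and clears the threshold. */
    (void)toy_has_space_old(&c);

    /* Host frees plenty of space, but nobody asked to be notified. */
    woken |= toy_host_drain(&c, 2 * TOY_THRESHOLD);

    return woken;
}
```

In this model toy_replay_race() returns false: the ring ends up with more
than TOY_THRESHOLD bytes free, yet no notification is ever generated for the
sleeping writer.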
    
    From this point on, epoll_wait() on EPOLLOUT never returns, even if the
    socket write buffer is empty.
    
    The fix is to set the pending size to the default size once and never
    change it. This way the host always notifies the guest whenever the
    writable space becomes bigger than the pending size. The host is already
    optimized to notify the guest *only* when the pending size threshold
    boundary is crossed, not every time space is freed.
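
A toy model of the fixed behaviour, under the same caveats: struct toy_chan,
TOY_THRESHOLD and the toy_* functions are hypothetical, and the host-side
rule is modelled as "notify when free space crosses from at or below the
threshold to above it". It is a sketch of the idea, not the kernel
implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_THRESHOLD 4096u /* stand-in for the default pending size */

struct toy_chan {
    uint32_t free_bytes;        /* writable space in the send ring */
    uint32_t pending_send_size; /* set once at init, never changed */
};

static void toy_chan_init(struct toy_chan *c)
{
    c->free_bytes = 0;
    c->pending_send_size = TOY_THRESHOLD;
}

/* Post-fix space check in this model: read-only, no threshold updates. */
static uint32_t toy_has_space(const struct toy_chan *c)
{
    return c->free_bytes;
}

/* Host consumes n bytes; notifies only when the threshold is crossed. */
static bool toy_host_drain(struct toy_chan *c, uint32_t n)
{
    uint32_t before = c->free_bytes;

    c->free_bytes += n;
    return before <= c->pending_send_size &&
           c->free_bytes > c->pending_send_size;
}

/* Replays the racy sequence and counts host notifications. */
int toy_count_signals(void)
{
    struct toy_chan c;
    int signals = 0;

    toy_chan_init(&c);

    (void)toy_has_space(&c);                   /* EPOLLOUT check: no space */
    signals += toy_host_drain(&c, 64);         /* below threshold: silent */
    (void)toy_has_space(&c);                   /* EPOLLIN check: no side effect */
    signals += toy_host_drain(&c, 2 * TOY_THRESHOLD); /* crosses: notify */
    signals += toy_host_drain(&c, 64);         /* already above: silent */

    /* The threshold is untouched by any of the checks. */
    assert(c.pending_send_size == TOY_THRESHOLD);
    return signals;
}
```

Here the interleaved EPOLLIN check can no longer cancel the writer's wakeup,
and the host still notifies once per crossing rather than on every drain.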
    
    This change also reduces CPU usage somewhat, since hvs_stream_has_space()
    is in the hot path of send:
    vsock_stream_sendmsg()->hvs_stream_has_space()
    Earlier, hvs_stream_has_space() was setting/clearing the pending size on
    every call.

    Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
    Reviewed-by: Dexuan Cui <decui@microsoft.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
hyperv_transport.c 22.9 KB