    libceph: avoid truncation due to racing banners · 6d4221b5
    Jim Schutt authored
    Because the Ceph client messenger uses a non-blocking connect, it is
    possible for the sending of the client banner to race with the
    arrival of the banner sent by the peer.
    
    When ceph_sock_state_change() notices the connect has completed, it
    schedules work to process the socket via con_work().  During this
    time the peer is writing its banner, and arrival of the peer banner
    races with con_work().
    
    If con_work() calls try_read() before the peer banner arrives, there
    is nothing for it to do, after which con_work() calls try_write() to
    send the client's banner.  In this case Ceph's protocol negotiation
    can complete successfully.
    
    The server-side messenger immediately sends its banner and addresses
    after accepting a connect request, *before* actually attempting to
    read or verify the banner from the client.  As a result, it is
    possible for the banner from the server to arrive before con_work()
    calls try_read().  If that happens, try_read() will read the banner
    and prepare protocol negotiation info via prepare_write_connect().
    prepare_write_connect() calls con_out_kvec_reset(), which discards
    the as-yet-unsent client banner.  Next, con_work() calls
    try_write(), which sends the protocol negotiation info rather than
    the banner that the peer is expecting.
    
    The result is that the peer sees an invalid banner, and the client
    reports "negotiation failed".
    
    Fix this by moving con_out_kvec_reset() out of
    prepare_write_connect() to its callers at all locations except the
    one where the banner might still need to be sent.
    
    [elder@inktank.com: added note about server-side behavior]
    Signed-off-by: Jim Schutt <jaschut@sandia.gov>
    Reviewed-by: Alex Elder <elder@inktank.com>
messenger.c 70.9 KB