    libceph: fix messenger retry · 5bdca4e0
    Sage Weil authored
    In ancient times, the messenger could both initiate and accept connections.
    An artifact of that was data structures to store/process an incoming
    ceph_msg_connect request and send an outgoing ceph_msg_connect_reply.
    Sadly, the negotiation code was referencing those structures and ignoring
    important information (like the peer's connect_seq) from the correct ones.
    
    Among other things, this fixes tight reconnect loops where the server sends
    RETRY_SESSION and we (the client) retry with the same connect_seq as last
    time.  This bug is pretty easily triggered by injecting socket failures on the
    MDS and running some fs workload like workunits/direct_io/test_sync_io.
    Signed-off-by: Sage Weil <sage@inktank.com>
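
The retry loop described in the message can be illustrated with a small, self-contained sketch. This is not the code from messenger.c: the struct layouts, tag values, the toy server_reply() and the negotiate() driver are all hypothetical simplifications, and the rule that the server asks for a retry while the client's connect_seq is behind its own is an assumption made for illustration. The only point is the difference between reading connect_seq from a leftover "incoming request" structure and reading it from the peer's reply.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the structures named in the
 * commit message; the real ones carry more fields. */
struct msg_connect {            /* request sent by the connecting side */
    uint32_t connect_seq;
};

struct msg_connect_reply {      /* reply sent by the accepting side */
    uint8_t  tag;
    uint32_t connect_seq;
};

enum { TAG_READY = 1, TAG_RETRY_SESSION = 2 };  /* illustrative values */

struct connection {
    uint32_t connect_seq;               /* sent with the next request */
    struct msg_connect out_connect;     /* our outgoing request */
    struct msg_connect_reply in_reply;  /* the peer's reply */
    struct msg_connect in_connect;      /* leftover from the days of accepting
                                           connections; a client never fills it */
};

/* Toy server (assumption): ask for a retry while the client is behind. */
static void server_reply(uint32_t server_seq, const struct msg_connect *req,
                         struct msg_connect_reply *reply)
{
    reply->connect_seq = server_seq;
    reply->tag = req->connect_seq < server_seq ? TAG_RETRY_SESSION : TAG_READY;
}

/* Buggy flavor: consults the stale structure, so connect_seq never advances. */
static void handle_retry_buggy(struct connection *con)
{
    con->connect_seq = con->in_connect.connect_seq;
}

/* Fixed flavor: adopts the connect_seq the peer actually sent. */
static void handle_retry_fixed(struct connection *con)
{
    con->connect_seq = con->in_reply.connect_seq;
}

/* Returns the number of attempts needed to reach TAG_READY,
 * or -1 if max_attempts is exhausted (the "tight reconnect loop"). */
static int negotiate(struct connection *con, uint32_t server_seq,
                     void (*handle_retry)(struct connection *),
                     int max_attempts)
{
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        con->out_connect.connect_seq = con->connect_seq;
        server_reply(server_seq, &con->out_connect, &con->in_reply);
        if (con->in_reply.tag == TAG_READY)
            return attempt;
        handle_retry(con);
    }
    return -1;
}
```

In this sketch, calling negotiate() on a zero-initialized connection with handle_retry_buggy never reaches TAG_READY when the server's sequence is ahead, while handle_retry_fixed succeeds on the second attempt because the client picks up the server's connect_seq from the reply.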
messenger.c 63.7 KB