    ceph: avoid reopening osd connections when address hasn't changed · 87b315a5
    Sage Weil authored
    We get a fault callback on _every_ TCP connection fault.  Normally, we
    want to reopen the connection when that happens.  If the address we have
    is bad, however, and every connection attempt fails with connection
    refused or a similar error, explicitly closing and reopening the msgr
    connection just prevents the messenger's backoff logic from kicking in.
    The result can be a console full of
    
    [ 3974.417106] ceph: osd11 10.3.14.138:6800 connection failed
    [ 3974.423295] ceph: osd11 10.3.14.138:6800 connection failed
    [ 3974.429709] ceph: osd11 10.3.14.138:6800 connection failed
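
The effect described above can be illustrated with a small user-space model. This is only a sketch, not the messenger's actual code: the names `conn_model`, `conn_fail`, `conn_reopen`, `delay_after` and both delay constants are hypothetical, and it assumes a simple doubling backoff whose state is lost whenever the connection is closed and reopened.

```c
#include <assert.h>

/* Hypothetical constants; the real messenger's values may differ. */
#define BASE_DELAY_MS 250UL
#define MAX_DELAY_MS  (5UL * 60UL * 1000UL)

struct conn_model {
    unsigned long delay_ms;   /* 0 means "no backoff accumulated yet" */
};

/* A failed connect attempt: double the retry delay, capped at the maximum. */
static unsigned long conn_fail(struct conn_model *c)
{
    if (c->delay_ms == 0)
        c->delay_ms = BASE_DELAY_MS;
    else if (c->delay_ms < MAX_DELAY_MS / 2)
        c->delay_ms *= 2;
    else
        c->delay_ms = MAX_DELAY_MS;
    return c->delay_ms;
}

/* Assumption: closing and reopening discards the accumulated backoff. */
static void conn_reopen(struct conn_model *c)
{
    c->delay_ms = 0;
}

/* Retry delay chosen after n consecutive failed attempts. */
static unsigned long delay_after(int n, int reopen_on_fault)
{
    struct conn_model c = { 0 };
    unsigned long d = 0;

    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        d = conn_fail(&c);
        if (reopen_on_fault)
            conn_reopen(&c);   /* the fault handler resets the connection */
    }
    return d;
}
```

In this model, reopening on every fault pins the retry delay at the base value no matter how many attempts fail, which matches the tightly spaced log lines above. Leaving the connection alone lets the delay grow toward the cap.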
    
    Instead, if we get a fault, and have outstanding requests, but the osd
    address hasn't changed and the connection never successfully connected in
    the first place, do nothing to the osd connection.  The messenger layer
    will back off and retry periodically, because we never connected and thus
    the lossy bit is not set.
    
    In that case, just touch each request's r_stamp so that handle_timeout
    can tell the request is still alive and kicking.
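
The decision described in this message can be sketched as a simplified user-space model. It is not the code in osd_client.c: `osd_model`, `req_model`, `osd_fault`, `ever_connected` and `reopen_count` are hypothetical names introduced for illustration, and only `r_stamp` comes from the text above. What happens when no requests are outstanding is not stated here, so the model simply does nothing in that case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct req_model {
    unsigned long r_stamp;       /* last time the request was known alive */
    struct req_model *next;
};

struct osd_model {
    char addr[32];               /* address we last tried to connect to */
    bool ever_connected;         /* did this connection ever succeed? */
    int reopen_count;            /* how often we closed and reopened */
    struct req_model *requests;  /* outstanding requests */
};

/* Returns true if the connection was reopened, false if it was left alone. */
static bool osd_fault(struct osd_model *osd, const char *map_addr,
                      unsigned long now)
{
    assert(osd != NULL && map_addr != NULL);

    /* Assumption: with nothing outstanding, this model does nothing. */
    if (osd->requests == NULL)
        return false;

    /* Same address and never connected: let the messenger back off and
     * retry, and only refresh the stamps so the requests look alive. */
    if (strcmp(osd->addr, map_addr) == 0 && !osd->ever_connected) {
        for (struct req_model *r = osd->requests; r != NULL; r = r->next)
            r->r_stamp = now;
        return false;
    }

    /* Otherwise reopen the connection against the current address. */
    strncpy(osd->addr, map_addr, sizeof(osd->addr) - 1);
    osd->addr[sizeof(osd->addr) - 1] = '\0';
    osd->ever_connected = false;
    osd->reopen_count++;
    return true;
}
```

A fault on an unchanged, never-connected address returns false and only updates the stamps. A fault after the address changes reopens the connection.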
    Signed-off-by: Sage Weil <sage@newdream.net>
osd_client.c 39.5 KB