    libceph: change how "safe" callback is used · 26be8808
    Alex Elder authored
    An osd request currently has two callbacks.  They inform the
    initiator of the request when we've received confirmation from the
    target osd that a request was received, and when the osd indicates
    all changes described by the request are durable.
    
    The only time the second callback is used is in the ceph file system
    for a synchronous write.  There's a race that makes some handling of
    this case unsafe.  This patch addresses this problem.  The error
    handling for this callback is also kind of gross, and this patch
    changes that as well.
    
    In ceph_sync_write(), if a safe callback is requested we want to add
    the request on the ceph inode's unsafe items list.  Because items on
    this list must have their tid set (by ceph_osd_start_request()), the
    request is added *after* the call to that function returns.  The
    problem with this is that there's a race between starting the
    request and adding it to the unsafe items list; the request may
    already be complete before ceph_sync_write() even begins to put it
    on the list.
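
    One way this race can play out is sketched below as a small
    single-threaded model.  Everything in it (struct request,
    unsafe_count, start_request_fast_osd(), old_sync_write()) is a
    hypothetical stand-in rather than the real libceph code, and the
    "fast osd" is simulated by invoking the safe callback before the
    start function returns.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for an osd request. */
struct request {
	unsigned long tid;                     /* assigned when the request starts */
	bool on_unsafe_list;
	void (*safe_cb)(struct request *req);  /* old-style "safe" callback */
};

/* Models the length of the inode's unsafe items list. */
static int unsafe_count;

static void unsafe_list_add(struct request *req)
{
	req->on_unsafe_list = true;
	unsafe_count++;
}

static void unsafe_list_del(struct request *req)
{
	if (!req->on_unsafe_list)
		return;		/* nothing to remove: the add has not happened yet */
	req->on_unsafe_list = false;
	unsafe_count--;
}

/* Old-style safe callback: drop the request from the unsafe list. */
static void old_safe_cb(struct request *req)
{
	unsafe_list_del(req);
}

/*
 * Models starting a request whose "safe" reply arrives before the
 * caller regains control.
 */
static void start_request_fast_osd(struct request *req)
{
	static unsigned long next_tid = 1;

	assert(req->safe_cb);
	req->tid = next_tid++;
	req->safe_cb(req);	/* the request is already durable */
}

/* Old ordering: start first (to obtain a tid), then add to the list. */
static int old_sync_write(void)
{
	struct request req = { .safe_cb = old_safe_cb };

	start_request_fast_osd(&req);
	unsafe_list_add(&req);	/* too late: the safe callback already ran */
	return unsafe_count;	/* a stale entry is left behind */
}
```

    In this model old_sync_write() leaves one entry on the list that
    nothing will ever remove, because the only callback that would
    remove it has already fired.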
    
    To address this, we change the way the "safe" callback is used.
    Rather than just calling it when the request is "safe", we use it to
    notify the initiator of the bounds (start and end) of the period during
    which the request is *unsafe*.  So the initiator gets notified just
    before the request gets sent to the osd (when it is "unsafe"), and
    again when it's known the results are durable (it's no longer
    unsafe).  The first call will get made in __send_request(), just
    before the request message gets sent to the messenger for the first
    time.  That function is only called by __send_queued(), which is
    always called with the osd client's request mutex held.
    
    We then have this callback function insert the request on the ceph
    inode's unsafe list when we're told the request is unsafe.  This
    will avoid the race because this call will be made under protection
    of the osd client's request mutex.  It also nicely groups the setup
    and cleanup of the state associated with managing unsafe requests.
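
    The scheme described above can be sketched roughly as follows.
    This is a user-space model under stated assumptions, not the
    actual patch: the names (unsafe_cb_t, sync_write_unsafe_cb(),
    send_request(), handle_safe_reply(), new_sync_write()) are
    hypothetical, the request mutex is modeled by a plain flag, and the
    tid is assumed to be assigned before the request is sent.

```c
#include <assert.h>
#include <stdbool.h>

struct request;

/* New-style callback: unsafe == true on entry to the unsafe period,
 * unsafe == false once the results are known to be durable. */
typedef void (*unsafe_cb_t)(struct request *req, bool unsafe);

/* Hypothetical, simplified stand-in for an osd request. */
struct request {
	unsigned long tid;
	bool on_unsafe_list;
	unsafe_cb_t unsafe_cb;
};

static int unsafe_count;		/* models the inode's unsafe list length */
static bool request_mutex_held;		/* stand-in for the request mutex */

/* Setup and cleanup of the unsafe-list state live in one place. */
static void sync_write_unsafe_cb(struct request *req, bool unsafe)
{
	if (unsafe) {
		assert(request_mutex_held);	/* first call is made under the mutex */
		assert(req->tid != 0);		/* list entries must have a tid */
		req->on_unsafe_list = true;
		unsafe_count++;
	} else if (req->on_unsafe_list) {
		req->on_unsafe_list = false;
		unsafe_count--;
	}
}

/* Models the send path: notify "unsafe" just before the first send. */
static void send_request(struct request *req)
{
	assert(request_mutex_held);
	if (req->unsafe_cb)
		req->unsafe_cb(req, true);
	/* ... the message would be handed to the messenger here ... */
}

/* Models the reply path once the osd reports the changes are durable. */
static void handle_safe_reply(struct request *req)
{
	if (req->unsafe_cb)
		req->unsafe_cb(req, false);
}

static void start_request(struct request *req)
{
	static unsigned long next_tid = 1;

	request_mutex_held = true;
	req->tid = next_tid++;
	send_request(req);
	request_mutex_held = false;
}

/* Returns the list length after completion; *in_flight gets the
 * length observed while the request was still unsafe. */
static int new_sync_write(int *in_flight)
{
	struct request req = { .unsafe_cb = sync_write_unsafe_cb };

	start_request(&req);
	*in_flight = unsafe_count;
	handle_safe_reply(&req);
	return unsafe_count;
}
```

    Because the list insertion happens inside the send path, the
    request is on the list before any reply can be processed, and the
    matching removal is handled by the same callback.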
    
    The name of the "safe" callback field is changed to "unsafe" to
    better reflect its new purpose.  It has a Boolean "unsafe" parameter
    to indicate whether the request is becoming unsafe or is now safe.
    Because the "msg" parameter wasn't used, we drop that.
    
    This resolves the original problem reported in:
        http://tracker.ceph.com/issues/4706
    
    Reported-by: Yan, Zheng <zheng.z.yan@intel.com>
    Signed-off-by: Alex Elder <elder@inktank.com>
    Reviewed-by: Yan, Zheng <zheng.z.yan@intel.com>
    Reviewed-by: Sage Weil <sage@inktank.com>