Commit 5f684a49 authored by Kirill Smelkov's avatar Kirill Smelkov

wcfs: Server.stop: Make sure to remove mount entry even if we had to use FUSE abort

Server.stop currently tries to unmount, and if that fails invokes FUSE
abort and kills wcfs.go . However it does not call unmount the second
time after such abort, and this way the filesystem remains mounted (in
ENOTCONN state) and rmdir(mountpoint) fails.

-> Fix it by calling unmount the second time if we had to abort FUSE
connection. In that second try use lazy unmounting, because regular
unmount can still fail with "Device or resource busy" since there
could be still client file descriptors left pointing to the mounted
filesystem. With lazy mode unmounting + followup rmdir, hopefully,
always succeeds.

Here is example test run where one test timed out, FUSE connection was
aborted, but neither the filesystem was unmounted, nor mountpoint
directory was deleted, which led to all followup tests failing in setup
assert that testmountpoint does not exist:

https://nexedijs.erp5.net/#/test_result_module/20211112-1ACEA62D/22

This patch should fix those followup failures + fix another leakage of
WCFS mounts in real services.
parent 54f6e741
...@@ -417,7 +417,7 @@ def __stop(wcsrv, ctx, _onstuck): ...@@ -417,7 +417,7 @@ def __stop(wcsrv, ctx, _onstuck):
# unmount and wait for wcfs to exit # unmount and wait for wcfs to exit
# kill wcfs and abort FUSE connection if clean unmount fails # kill wcfs and abort FUSE connection if clean unmount fails
# at the end make sure mountpoint directory is removed # at the end make sure mount entry and mountpoint directory are removed
def _(): def _():
# when stop runs: # when stop runs:
...@@ -426,6 +426,15 @@ def __stop(wcsrv, ctx, _onstuck): ...@@ -426,6 +426,15 @@ def __stop(wcsrv, ctx, _onstuck):
_rmdir_ifexists(wcsrv.mountpoint) _rmdir_ifexists(wcsrv.mountpoint)
defer(_) defer(_)
def _():
# second unmount, if first unmount failed and we had to abort FUSE connection
# -z (lazy) because this one has to succeed, but there could be still
# client file descriptors left pointing to the mounted filesystem.
if _is_mountpoint(wcsrv.mountpoint):
log.warn("-> unmount -z ...")
_fuse_unmount(wcsrv.mountpoint, "-z")
defer(_)
def _(): def _():
if wcsrv._fuseabort is not None: if wcsrv._fuseabort is not None:
wcsrv._fuseabort.close() wcsrv._fuseabort.close()
...@@ -539,9 +548,11 @@ def _rmdir_ifexists(path): ...@@ -539,9 +548,11 @@ def _rmdir_ifexists(path):
raise raise
# _fuse_unmount calls `fusermount -u` + logs details if unmount failed. # _fuse_unmount calls `fusermount -u` + logs details if unmount failed.
#
# Additional options to fusermount can be passed via optv.
@func @func
def _fuse_unmount(mntpt): def _fuse_unmount(mntpt, *optv):
ret, out = _sysproccallout(["fusermount", "-u", mntpt]) ret, out = _sysproccallout(["fusermount", "-u"] + list(optv) + [mntpt])
if ret != 0: if ret != 0:
# unmount failed, usually due to "device is busy". # unmount failed, usually due to "device is busy".
# Log which files are still opened and reraise # Log which files are still opened and reraise
...@@ -560,7 +571,10 @@ def _fuse_unmount(mntpt): ...@@ -560,7 +571,10 @@ def _fuse_unmount(mntpt):
defer(_) defer(_)
out = out.rstrip() # kill trailing \n\n out = out.rstrip() # kill trailing \n\n
emsg = "fuse_unmount %s: failed: %s" % (mntpt, out) opts = ' '.join(optv)
if opts != '':
opts += ' '
emsg = "fuse_unmount %s%s: failed: %s" % (opts, mntpt, out)
log.warn(emsg) log.warn(emsg)
raise RuntimeError("%s\n(more details logged)" % emsg) raise RuntimeError("%s\n(more details logged)" % emsg)
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment