Commit 171a66f2 authored by Kirill Smelkov's avatar Kirill Smelkov

runUnitTest: Close ZODB DB before shutdown

We are already closing Storage on shutdown (see nearby Storage.close call in
the patch), but the DB handle was not closed. With classic ZODB it does not
really matter in practice, because not closing DB is only leaking RAM resources
and the program is anyway terminated soon.

However with wendelin.core 2 things are different: in addition to ZODB storage
server, there is also synthetic WCFS filesystem from which files are opened
and memory-mmaped. In runUnitTest we start both ZODB and WCFS servers and we
also shut down them both in the end. The filesystem server can be cleanly
unmounted and shutdown only when there are no opened files left on it.

Wendelin.core 2 client works by complementing each ZODB connection (zconn) with
WCFS-level connection (wconn) to WCFS server. Those two zconn and wconn are
kept in sync by wendelin.core 2 client logic: whenever zconn adjusts its view
of the database, so does wconn. And whenever zconn is garbage collected, so
does wconn is closed to release resources and close corresponding files opened
on WCFS. In addition to garbage-collection, wconn is also closed when zconn.db
- the ZODB DB handle via which zconn was created - is closed. This is needed to
be able to reliably trigger freeing WCFS resources, because even after DB is
closed, zconn can still stay alive forever being referenced from some python
object - e.g. a frame or traceback or something else.

The latter scenario actually happens during runUnitTest run. As the result it
leads to inability to unmount and stop WCFS server cleanly:

    $ ./runUnitTest --load --save --with_wendelin_core -v erp5_core_test:testWendelinCore
    ...
    test (erp5.component.test.erp5_version.testWendelinCore.TestWendelinCoreBasic) ... ok

    ----------------------------------------------------------------------
    Ran 1 test in 0.105s

    OK
    F0618 19:05:46.359140   35468 wcfs/client/wcfs.cpp:486] CRITICAL: pinner: wcfs /dev/shm/wcfs/b53b61099c740b452b383db6df6dce4ad6d23ba2: wlink25: recvReq: link is down
    F0618 19:05:46.359173   35468 wcfs/client/wcfs.cpp:487] CRITICAL: wcfs server will likely kill us soon.
    CRITICAL: pinner: wcfs /dev/shm/wcfs/b53b61099c740b452b383db6df6dce4ad6d23ba2: wlink25: recvReq: link is down
    CRITICAL: wcfs server will likely kill us soon.
    Traceback (most recent call last):
      File ".../bin/runUnitTest", line 312, in <module>
        sys.exit(runUnitTest.main())
      File ".../parts/erp5/Products/ERP5Type/tests/runUnitTest.py", line 926, in main
        run_only=run_only,
      File ".../parts/erp5/Products/ERP5Type/tests/runUnitTest.py", line 709, in runUnitTestList
        wcfs_server.stop()
      ...
      File ".../parts/wendelin.core/wcfs/__init__.py", line 543, in _fuse_unmount
        raise RuntimeError("%s\n(more details logged)" % emsg)
    RuntimeError: fuse_unmount /dev/shm/wcfs/b53b61099c740b452b383db6df6dce4ad6d23ba2: failed: fusermount: failed to unmount /dev/shm/wcfs/b53b61099c740b452b383db6df6dce4ad6d23ba2: Device or resource busy
    (more details logged)

    # logs
    2021-06-18 19:05:45.978 INFO root wcfs: unmount/stop wcfs pid32981 @ /dev/shm/wcfs/b53b61099c740b452b383db6df6dce4ad6d23ba2
    2021-06-18 19:05:46.068 WARNING root fuse_unmount /dev/shm/wcfs/b53b61099c740b452b383db6df6dce4ad6d23ba2: failed: fusermount: failed to unmount /dev/shm/wcfs/b53b61099c740b452b383db6df6dce4ad6d23ba2: Device or resource busy
    2021-06-18 19:05:46.068 WARNING root # lsof /dev/shm/wcfs/b53b61099c740b452b383db6df6dce4ad6d23ba2
    2021-06-18 19:05:46.357 WARNING root COMMAND     PID       USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
    runUnitTe 32175 slapuser34   24r   REG   0,48      111    4 /dev/shm/wcfs/b53b61099c740b452b383db6df6dce4ad6d23ba2/.wcfs/zurl
    runUnitTe 32175 slapuser34   25u   REG   0,48        0    7 /dev/shm/wcfs/b53b61099c740b452b383db6df6dce4ad6d23ba2/head/watch
    runUnitTe 32175 slapuser34   26r   REG   0,48  2097152    9 /dev/shm/wcfs/b53b61099c740b452b383db6df6dce4ad6d23ba2/head/bigfile/00000000000078b4

    2021-06-18 19:05:46.358 WARNING root -> kill -TERM wcfs.go ...
    2021-06-18 19:05:46.358 WARNING root -> abort FUSE connection ...

I've debugged things a bit and even with
ZODB@bbd03b3a ZODB connection stays alive
being referenced from some frame objects.

-> Fix this problem by explicitly closing ZODB DB nearby tests shutdown before call to wcfs_server.stop
parent 530e8b4e
...@@ -604,7 +604,8 @@ def runUnitTestList(test_list, verbosity=1, debug=0, run_only=None): ...@@ -604,7 +604,8 @@ def runUnitTestList(test_list, verbosity=1, debug=0, run_only=None):
from Products.ERP5Type.tests.utils import DbFactory from Products.ERP5Type.tests.utils import DbFactory
root_db_name, = cfg.dbtab.databases.keys() root_db_name, = cfg.dbtab.databases.keys()
DbFactory(root_db_name).addMountPoint('/') db_factory = DbFactory(root_db_name)
db_factory.addMountPoint('/')
TestRunner = unittest.TextTestRunner TestRunner = unittest.TextTestRunner
...@@ -683,6 +684,7 @@ def runUnitTestList(test_list, verbosity=1, debug=0, run_only=None): ...@@ -683,6 +684,7 @@ def runUnitTestList(test_list, verbosity=1, debug=0, run_only=None):
raise raise
finally: finally:
ProcessingNodeTestCase.unregisterNode() ProcessingNodeTestCase.unregisterNode()
db_factory.close()
Storage.close() Storage.close()
if node_pid_list is not None: if node_pid_list is not None:
# Wait that child processes exit. Stop ZEO storage (if any) after all # Wait that child processes exit. Stop ZEO storage (if any) after all
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment