- 20 Dec, 2021 4 commits
-
-
Jérome Perrin authored
Even if we don't want to spawn full sshd to verify that ptys work, we can still minimally test that inside user/group database is setup as expected. /reviewed-by @kirr /reviewed-on !13
-
Kirill Smelkov authored
Even though libc.opentty stopped to insist on chown(group=tty) for /dev/pts/*, openssh still wants to do it and fails, preventing sshd from working. Fix it by spawning test workload with sole current user and group being present in the password database. We anyway don't have IDs for users/groups different from current uid/gid mapped from current namespace, and the kernel, when seeing a file owned by those, maps them to "nobody/nogroup" for existing files, and rejects chown to those original IDs obtained from parent's namespace /etc/{passwd,group} as EINVAL. For the same reason we don't try to mount our own /dev/pts instance, because we have only current uid/gid mapped to parent namespace and gid=5 maps to nogroup in parent. With existing /dev/pts mount entries are only listed as having nogroup, and from outside they _are_ owned by parent's tty group. If we would mount /dev/pts anew, parent won't see our /dev/pts/* at all which moves us a bit more far from desired behaviour. Still keep root and nobody/nogroup as @jerome reports that without those users Go tests fail on Debian9: nexedi/slapos!1095 (comment 147177) nexedi/slapos!1095 (comment 147201) See added comment about all this for more details. This patch fixes sshd to work under tryn.py(*) No test because libc.openpty works both with and without this patch, and it would need too spawn whole sshd under test to verify this. (*) the diff for sshd 1) running successfully under regular user, and 2) previously failing under trun.py without this patch is below: diff --git a/regular-nopam.txt b/trun-nopam.txt index 378ccb6..5b96c08 100644 --- a/regular-nopam.txt +++ b/trun-nopam.txt @@ -1,4 +1,4 @@ -(neo) (z-dev) (g.env) kirr@deca:~/tmp/trashme/sshd$ /sbin/sshd -d -p 2222 -h `pwd`/ssh_host_rsa_key -o UsePAM=no +kirr@deca:~/tmp/trashme/sshd$ /sbin/sshd -d -p 2222 -h `pwd`/ssh_host_rsa_key -o UsePAM=no debug1: sshd version OpenSSH_8.4, OpenSSL 1.1.1k 25 Mar 2021 debug1: private host key #0: ssh-rsa SHA256:y+ujVDqqFBXTclDM2NLy4GME7wReutLcUYOWAeriXdc debug1: setgroups() failed: Operation not permitted @@ -91,35 +91,13 @@ debug1: session_input_channel_req: session 0 req pty-req debug1: Allocating pty. debug1: session_new: session 0 debug1: SELinux support disabled -Attempt to write login records by non-root user (aborting) -debug1: session_pty_req: session 0 alloc /dev/pts/2 -debug1: server_input_channel_req: channel 0 request env reply 0 -debug1: session_by_channel: session 0 channel 0 -debug1: session_input_channel_req: session 0 req env -debug1: server_input_channel_req: channel 0 request shell reply 1 -debug1: session_by_channel: session 0 channel 0 -debug1: session_input_channel_req: session 0 req shell -Starting session: shell on pts/2 for kirr from 127.0.0.1 port 44106 id 0 -debug1: Setting controlling tty using TIOCSCTTY. - -debug1: Received SIGCHLD. -debug1: session_by_pid: pid 693948 -debug1: session_exit_message: session 0 channel 0 pid 693948 -debug1: session_exit_message: release channel 0 -debug1: session_by_tty: session 0 tty /dev/pts/2 -debug1: session_pty_cleanup2: session 0 release /dev/pts/2 -Attempt to write login records by non-root user (aborting) -debug1: session_by_channel: session 0 channel 0 -debug1: session_close_by_channel: channel 0 child 0 -Close session: user kirr from 127.0.0.1 port 44106 id 0 -debug1: channel 0: free: server-session, nchannels 1 -Received disconnect from 127.0.0.1 port 44106:11: disconnected by user -Disconnected from user kirr 127.0.0.1 port 44106 +chown(/dev/pts/2, 1000, 5) failed: Invalid argument debug1: do_cleanup debug1: temporarily_use_uid: 1000/1000 (e=1000/1000) debug1: restore_uid: (unprivileged) +debug1: session_pty_cleanup2: session 0 release /dev/pts/2 +Attempt to write login records by non-root user (aborting) +debug1: audit_event: unhandled event 12 debug1: do_cleanup debug1: temporarily_use_uid: 1000/1000 (e=1000/1000) debug1: restore_uid: (unprivileged) -debug1: audit_event: unhandled event 12 (see nexedi/slapos!1095 (comment 147018)) /helped-and-reviewed-by @jerome /reviewed-on nexedi/nxdtest!13
-
Kirill Smelkov authored
In the previous patch we asked unshare to keep capabilities so that FUSE mounting works from under regular user. However full set of capabilities is too much, and in particular if cap_dac_override is present(*), it means that writes to files that have read-only permission, is not rejected by kernel. -> Adjust trun to retain only those capabilities that we actually need = CAP_SYS_ADMIN to mount things. This should fix the following Go build failure: --- FAIL: TestReadOnlyWriteFile (0.00s) ioutil_test.go:90: Expected an error when writing to read-only file /tmp/TestReadOnlyWriteFile3940340549/blurp.txt FAIL FAIL io/ioutil 0.053s P.S. And if we would unshare to root instead (unshare -Umr) it should be still a good idea to drop extra capabilities, as we still want to reject writes to read-only files. (*) see https://man7.org/linux/man-pages/man7/capabilities.7.html /helped-and-reviewed-by @jerome /reviewed-on nexedi/nxdtest!13
-
Kirill Smelkov authored
and detect leaked temporary files and mount entries after each test run. Background Currently we have several testing-related problems that are all connected to /tmp and similar directories: Problem 1: many tests create temporary files for each run. Usually tests are careful to remove them on teardown, but due to bugs, many kind of tests, test processes being hard-killed (SIGKILL, or SIGSEGV) and other reasons, in practice this cleanup does not work 100% reliably and there is steady growth of files leaked on /tmp on testnodes. Problem 2: due to using shared /tmp and /dev/shm, the isolation in between different test runs of potentially different users is not strong. For example @jerome reports that due to leakage of faketime's shared segments separate test runs affect each other and fail: https://erp5.nexedi.net/bug_module/20211125-1C8FE17 Problem 3: many tests depend on /tmp being a tmpfs instance. This are for example wendelin.core tests which are intensively writing to database, and, if /tmp is resided on disk, timeout due to disk IO stalls in fsync on every commit. The stalls are as much as >30s and lead to ~2.5x overall slowdown for test runs. However the main problem is spike of increased latency which, with close to 100% probability, always render some test as missing its deadline. This topic is covered in https://erp5.com/group_section/forum/Using-tmpfs-for--tmp-on-testnodes-JTocCtJjOd -------- There are many ways to try to address each problem separately, but they all come with limitations and drawbacks. We discussed things with @tomo and @jerome, and it looks like that all those problems can be addressed in one go if we run tests under user namespaces with private mounts for /tmp and /dev/shm. Even though namespaces is generally no-go in Nexedi, they seem to be ok to use in tests. For example they are already used via private_tmpfs option in SlapOS: https://lab.nexedi.com/nexedi/slapos/blob/1876c150/slapos/recipe/librecipe/execute.py#L87-103 https://lab.nexedi.com/nexedi/slapos/blob/1876c150/software/neoppod/instance-neo-input-schema.json#L121-124 https://lab.nexedi.com/nexedi/slapos/blob/1876c150/software/neoppod/instance-neo.cfg.in#L11-16 https://lab.nexedi.com/nexedi/slapos/blob/1876c150/software/neoppod/instance-neo.cfg.in#L30-34 https://lab.nexedi.com/nexedi/slapos/blob/1876c150/software/neoppod/instance-neo.cfg.in#L170-177 ... https://lab.nexedi.com/nexedi/slapos/blob/1876c150/stack/erp5/instance-zope.cfg.in#L227-230 Thomas says that using private tmpfs for each test would be a better solution than implementing tmpfs for whole /tmp on testnodes. He also reports that @jp is OK to use namespaces for test as long as there is a fallback if namespaces aren't available. -> So let's do that: teach nxdtest to run each test case in its own private environment with privately-mounted /tmp and /dev/shm if we can detect that user namespaces are available. In an environment where user namespaces are indeed available this addresses all 3 problems because isolation and being-tmpfs are there by design, and even if some files will leak, the kernel will free everything when test terminates and the filesystem is automatically unmounted. We also detect such leakage and report a warning so that such problems do not go completely unnoticed. Implementation We leverage unshare(1) for simplicity. I decided to preserve uid/gid instead of becoming uid=0 (= `unshare -Umr`) for better traceability, so that it is clear from test output under which real slapuser a test is run(*). Not changing uid requires to activate ambient capabilities so that mounting filesystems, including FUSE-based needed by wendelin.core, continue to work under regular non-zero uid. Please see https://git.kernel.org/linus/58319057b784 for details on this topic. And please refer to added trun.py for details on how per-test namespace is setup. Using FUSE inside user namespaces requires Linux >= 4.18 (see https://git.kernel.org/linus/da315f6e0398 and https://git.kernel.org/linus/8cb08329b080), so if we are really to use this patch we'll have to upgrade kernel on our testnodes, at least where wendelin.core is used in tests. "no namespaces" detection is implemented via first running `unshare ... true` with the same unshare options that are going to be used to create and enter new user namespace for real. If that fails, we fallback into "no namespaces" mode where no private /tmp and /dev/shm are mounted(%). (*) for example nxdtest logs information about the system on startup: date: Mon, 29 Nov 2021 17:27:04 MSK xnode: slapuserX@test.node ... (%) Here is how nxdtest is run in fallback mode on my Debian 11 with user namespaces disabled via `sysctl kernel.unprivileged_userns_clone=0` (neo) (z-dev) (g.env) kirr@deca:~/src/wendelin/nxdtest$ nxdtest date: Thu, 02 Dec 2021 14:04:30 MSK xnode: kirr@deca.navytux.spb.ru uname: Linux deca 5.10.0-9-amd64 #1 SMP Debian 5.10.70-1 (2021-09-30) x86_64 cpu: Intel(R) Core(TM) i7-7600U CPU @ 2.80GHz >>> pytest $ python -m pytest # user namespaces not available. isolation and many checks will be deactivated. <--- NOTE ===================== test session starts ====================== platform linux2 -- Python 2.7.18, pytest-4.6.11, py-1.10.0, pluggy-0.13.1 rootdir: /home/kirr/src/wendelin/nxdtest plugins: timeout-1.4.2 collected 23 items nxdtest/nxdtest_pylint_test.py .... [ 17%] nxdtest/nxdtest_pytest_test.py ... [ 30%] nxdtest/nxdtest_test.py ......xx [ 65%] nxdtest/nxdtest_unittest_test.py ........ [100%] ============= 21 passed, 2 xfailed in 2.67 seconds ============= ok pytest 3.062s # 23t 0e 0f 0s # ran 1 test case: 1·ok /helped-by @tomo /helped-and-reviewed-by @jerome /reviewed-on nexedi/nxdtest!13
-
- 09 Dec, 2021 2 commits
-
-
Kirill Smelkov authored
Also log if master told us that we have nothing to do, and if the mode to run is local. This should make it a bit more clear what is going on just by looking at nxdtest log. See previous patch for more details and context. For the reference: here is how updated output looks like in the normal case: date: Thu, 09 Dec 2021 04:20:37 CET xnode: slapuser7@rapidspace-testnode-001 uname: Linux rapidspace-testnode-001 4.9.0-16-amd64 #1 SMP Debian 4.9.272-1 (2021-06-21) x86_64 cpu: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz # running for test_result_module/20211209-170FD3998 >>> pytest $ python -m pytest ============================= test session starts ============================== platform linux2 -- Python 2.7.18, pytest-4.6.11, py-1.9.0, pluggy-0.13.1 rootdir: /srv/slapgrid/slappart7/t/dfp/soft/47cc86af27d234f0464630f2a0d22a6f/parts/zodbtools-dev collected 46 items zodbtools/test/test_analyze.py . [ 2%] zodbtools/test/test_commit.py .. [ 6%] zodbtools/test/test_dump.py ... [ 13%] zodbtools/test/test_restore.py .. [ 17%] zodbtools/test/test_tidrange.py ............................. [ 80%] zodbtools/test/test_zodb.py ......... [100%] ========================== 46 passed in 9.15 seconds =========================== ok pytest 12.433s # 46t 0e 0f 0s # ran 1 test case: 1·ok /reviewed-by @jerome /reviewed-on nexedi/nxdtest!15
-
Kirill Smelkov authored
Nxdtest logs system information (bd91f6f1 "Include system information into log output") and run summary at the end (9f413221 "Emit run summary at the end"). However all that information currently is printed only if master is successfully connected and actually tells to run the tests. This behaviour is not very useful, because if nxdtest log output on testnode is empty, it is not clear - whether it was "we have to do nothing", or nxdtest stuck somewhere or something else. For example https://nexedijs.erp5.net/#/test_result_module/20211208-47D165B7/12 is currently marked as Running for many long hours already. And the log on testnode regarding nxdtest run is just: 2021-12-08 15:42:17,314 INFO $ PATH=/srv/slapgrid/slappart13/srv/slapos/soft/2956f419073cb2249ed953507fa6b173/bin:/opt/slapos/parts/bison/bin:/opt/slapos/parts/bzip2/bin:/opt/slapos/parts/gettext/bin:/opt/slapos/parts/glib/bin:/opt/slapos/parts/libxml2/bin:/opt/slapos/parts/libxslt/bin:/opt/slapos/parts/m4/bin:/opt/slapos/parts/ncurses/bin:/opt/slapos/parts/openssl/bin:/opt/slapos/parts/pkgconfig/bin:/opt/slapos/parts/python2.7/bin:/opt/slapos/parts/readline/bin:/opt/slapos/parts/sqlite3/bin:/opt/slapos/parts/swig/bin:/opt/slapos/bin:/opt/slapos/parts/patch/bin:/opt/slapos/parts/socat/bin:/usr/bin:/usr/sbin:/sbin:/bin SLAPOS_TEST_LOG_DIRECTORY=/srv/slapgrid/slappart13/var/log/testnode/dgd-xStX9safSG SLAPOS_TEST_SHARED_PART_LIST=/srv/slapgrid/slappart13/srv/shared:/srv/slapgrid/slappart13/t/dgd/shared /bin/sh /srv/slapgrid/slappart13/t/dgd/i/0/bin/runTestSuite --master_url $DISTRIBUTOR_URL --revision slapos=13977-ec686a708633f689382426063c21efbe3b2eab04,slapos.core=8698-91edab77ed36c160da8017cfdc1673fe7a8e10de --test_node_title rapidspace-testnode-008-3Nodes-DEPLOYTASK0 --test_suite SLAPOS-SR-TEST --test_suite_title SlapOS.SoftwareReleases.IntegrationTest-kirr.Python2 --project_title 'Rapid.Space Project' without anything else. With this patch nxdtest would print system information and report how many tests it had run, if its invocation did not stuck. In this patch we only move code that calls system_info and defer summary log before code that connects to master. In the following patch we'll add more logging around connecting to master. /reviewed-by @jerome /reviewed-on nexedi/nxdtest!15
-
- 11 Nov, 2021 1 commit
-
-
Kirill Smelkov authored
Sometimes there is zero testcases to be executed on testnodes, and log output from nxdtest is just date: Wed, 10 Nov 2021 12:31:50 MSK xnode: kirr@deca.navytux.spb.ru uname: Linux deca 5.10.0-9-amd64 #1 SMP Debian 5.10.70-1 (2021-09-30) x86_64 cpu: Intel(R) Core(TM) i7-7600U CPU @ 2.80GHz it is not clear from such output did the run ended or the test got stuck. After this patch it becomes date: Wed, 10 Nov 2021 12:31:50 MSK xnode: kirr@deca.navytux.spb.ru uname: Linux deca 5.10.0-9-amd64 #1 SMP Debian 5.10.70-1 (2021-09-30) x86_64 cpu: Intel(R) Core(TM) i7-7600U CPU @ 2.80GHz # ran 0 test cases. And in general, when there are several testcases to be run, it is helpful to indicate end of such run and to print brief summary of result status for all ran test cases. Example output: wendelin.core$ nxdtest -k test.wcfs date: Wed, 10 Nov 2021 12:35:34 MSK xnode: kirr@deca.navytux.spb.ru uname: Linux deca 5.10.0-9-amd64 #1 SMP Debian 5.10.70-1 (2021-09-30) x86_64 cpu: Intel(R) Core(TM) i7-7600U CPU @ 2.80GHz >>> test.wcfs/fs:1 ... ok test.wcfs/fs:1 25.035s # 35t 0e 0f 0s >>> test.wcfs/fs:2 ... ok test.wcfs/fs:2 21.033s # 35t 0e 0f 0s >>> test.wcfs/fs: ... ok test.wcfs/fs: 21.056s # 35t 0e 0f 0s # ran 3 test cases: 3·ok /reviewed-by @jerome /reviewed-on !12
-
- 13 Aug, 2021 2 commits
-
-
Jérome Perrin authored
This parses pylint output with a simple regexp and counts one failure per message reported. This has not been tested yet, but we decided to apply this commit already. /acked-by @kirr /reviewed-on nexedi/nxdtest!11
-
Jérome Perrin authored
When using compile with the actual file path, we can have better tracebacks in case of errors. before: Traceback (most recent call last): File "/srv/slapgrid/slappart3/srv/runner/software/9544feb19475590d240ba2d32743c0a0/bin/nxdtest", line 22, in <module> sys.exit(nxdtest.main()) File "/srv/slapgrid/slappart3/srv/runner/software/9544feb19475590d240ba2d32743c0a0/parts/nxdtest/nxdtest/__init__.py", line 142, in main tenv = loadNXDTestFile('.nxdtest') File "/srv/slapgrid/slappart3/srv/runner/software/9544feb19475590d240ba2d32743c0a0/parts/nxdtest/nxdtest/__init__.py", line 75, in loadNXDTestFile six.exec_(src, g) File "<string>", line 77, in <module> NameError: name 'Pylint' is not defined after: Traceback (most recent call last): File "/srv/slapgrid/slappart3/srv/runner/software/9544feb19475590d240ba2d32743c0a0/bin/nxdtest", line 22, in <module> sys.exit(nxdtest.main()) File "/srv/slapgrid/slappart3/srv/runner/software/9544feb19475590d240ba2d32743c0a0/parts/nxdtest/nxdtest/__init__.py", line 142, in main tenv = loadNXDTestFile('.nxdtest') File "/srv/slapgrid/slappart3/srv/runner/software/9544feb19475590d240ba2d32743c0a0/parts/nxdtest/nxdtest/__init__.py", line 75, in loadNXDTestFile six.exec_(compile(src, os.path.realpath(path), 'exec'), g) File "/srv/slapgrid/slappart3/srv/runner/instance/slappart8/var/nxdtest/.nxdtest", line 77, in <module> summaryf=Pylint.summary, NameError: name 'Pylint' is not defined /reviewed-by @kirr /reviewed-in nexedi/nxdtest!10
-
- 12 Aug, 2021 1 commit
-
-
Kirill Smelkov authored
For every TestCase nxdtest spawns test process to run with stdout/stderr redirected to pipes that nxdtest reads. Nxdtest, in turn, tees those pipes to its stdout/stderr until the pipes become EOF. If the test process, in turn, spawns other processes, those other processes will inherit opened pipes, and so the pipes won't become EOF untill _all_ spawned test processes (main test process + other processes that it spawns) exit. Thus, if there will be any process, that the main test process spawned, but did not terminated upon its own exit, nxdtest will get stuck waiting for pipes to become EOF which won't happen at all if a spawned test subprocess persists not to terminate. I hit this problem for real on a Wendelin.core 2 test - there the main test processes was segfaulting and so did not instructed other spawned processes (ZEO, WCFS, ...) to terminate. As the result the whole test was becoming stuck instead of being promptly reported as failed: runTestSuite: Makefile:175: recipe for target 'test.wcfs' failed runTestSuite: make: *** [test.wcfs] Segmentation fault runTestSuite: wcfs: 2021/08/09 17:32:09 zlink [::1]:52052 - [::1]:23386: recvPkt: EOF runTestSuite: E0809 17:32:09.376800 38082 wcfs.go:2574] zwatch zeo://localhost:23386: zlink [::1]:52052 - [::1]:23386: recvPkt: EOF runTestSuite: E0809 17:32:09.377431 38082 wcfs.go:2575] zwatcher failed -> switching filesystem to EIO mode (TODO) <LONG WAIT> runTestSuite: PROCESS TOO LONG OR DEAD, GOING TO BE TERMINATED -> Fix it. /reviewed-by @jerome /reviewed-on nexedi/nxdtest!9
-
- 01 Dec, 2020 1 commit
-
-
Jérome Perrin authored
re.match only find matches where the pattern appears at the beginning of the string, whereas re.search matches if the pattern appears anywhere in the string. This is behavior is consistent with pytest, go test and ERP5's runUnitTest For more details, see the discussion from nexedi/nxdtest!6 (comment 121409) /reviewed-on: nexedi/nxdtest!8
-
- 26 Nov, 2020 1 commit
-
-
Kirill Smelkov authored
The reason is that with threading.Thread if exception happens in that spawned thread, this error is not propagated to main driver, while with sync.WorkGroup an exception from any spawned worker is propagated back to main. For example with the following injected error --- a/nxdtest/__init__.py +++ b/nxdtest/__init__.py @@ -267,6 +267,7 @@ def main(): # tee, similar to tee(1) utility, copies data from fin to fout appending them to buf. def tee(ctx, fin, fout, buf): + 1/0 while 1: before this patch nxdtest behaves like ... (neo) (z4-dev) (g.env) kirr@deco:~/src/wendelin/nxdtest$ nxdtest date: Tue, 24 Nov 2020 14:55:08 MSK xnode: kirr@deco.navytux.spb.ru uname: Linux deco 5.9.0-2-amd64 #1 SMP Debian 5.9.6-1 (2020-11-08) x86_64 cpu: Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz >>> pytest $ python -m pytest Exception in thread Thread-2: Traceback (most recent call last): File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner self.run() File "/usr/lib/python2.7/threading.py", line 754, in run self.__target(*self.__args, **self.__kwargs) File "/home/kirr/src/wendelin/nxdtest/nxdtest/__init__.py", line 270, in tee 1/0 ZeroDivisionError: integer division or modulo by zero Exception in thread Thread-1: Traceback (most recent call last): File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner self.run() File "/usr/lib/python2.7/threading.py", line 754, in run self.__target(*self.__args, **self.__kwargs) File "/home/kirr/src/wendelin/nxdtest/nxdtest/__init__.py", line 270, in tee 1/0 ZeroDivisionError: integer division or modulo by zero error pytest 0.583s # 1t 1e 0f 0s (neo) (z4-dev) (g.env) kirr@deco:~/src/wendelin/nxdtest$ echo $? 0 Here the error in another thread is only printed, but nxdtest is not aborted. Above it reported "error", but e.g. when testing pygolang/py3 and raising an error in tee it even reported it was succeeding ( nexedi/nxdtest!6 (comment 121393) ): slapuser34@vifibcloud-rapidspace-hosting-007:~/srv/runner/instance/slappart0$ ./bin/runTestSuite date: Tue, 24 Nov 2020 12:51:23 MSK xnode: slapuser34@vifibcloud-rapidspace-hosting-007 uname: Linux vifibcloud-rapidspace-hosting-007 4.19.0-6-amd64 #1 SMP Debian 4.19.67-2+deb10u2 (2019-11-11) x86_64 cpu: Intel(R) Xeon(R) CPU E5-2678 v3 @ 2.50GHz >>> thread $ python -m pytest Exception in thread Thread-1: Traceback (most recent call last): File "/srv/slapgrid/slappart34/srv/runner/shared/python3/5497998c60d97cbbf748337ccce21db2/lib/python3.7/threading.py", line 926, in _bootstrap_inner self.run() File "/srv/slapgrid/slappart34/srv/runner/shared/python3/5497998c60d97cbbf748337ccce21db2/lib/python3.7/threading.py", line 870, in run self._target(*self._args, **self._kwargs) File "/srv/slapgrid/slappart34/srv/runner/software/44fe7dd3f13ecd100894c6368a35c055/parts/nxdtest/nxdtest/__init__.py", line 268, in tee fout.write(data) TypeError: write() argument must be str, not bytes ok thread 9.145s # 1t 0e 0f 0s >>> gevent $ gpython -m pytest Exception in thread Thread-3: Traceback (most recent call last): File "/srv/slapgrid/slappart34/srv/runner/shared/python3/5497998c60d97cbbf748337ccce21db2/lib/python3.7/threading.py", line 926, in _bootstrap_inner self.run() File "/srv/slapgrid/slappart34/srv/runner/shared/python3/5497998c60d97cbbf748337ccce21db2/lib/python3.7/threading.py", line 870, in run self._target(*self._args, **self._kwargs) File "/srv/slapgrid/slappart34/srv/runner/software/44fe7dd3f13ecd100894c6368a35c055/parts/nxdtest/nxdtest/__init__.py", line 268, in tee fout.write(data) TypeError: write() argument must be str, not bytes ok gevent 21.980s # 1t 0e 0f 0s After this patch nxdtest correctly handles and propagates an error originating in spawned thread back to main driver: (neo) (z4-dev) (g.env) kirr@deco:~/src/wendelin/nxdtest$ nxdtest date: Tue, 24 Nov 2020 14:54:19 MSK xnode: kirr@deco.navytux.spb.ru uname: Linux deco 5.9.0-2-amd64 #1 SMP Debian 5.9.6-1 (2020-11-08) x86_64 cpu: Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz >>> pytest $ python -m pytest Traceback (most recent call last): File "/home/kirr/src/wendelin/venv/z4-dev/bin/nxdtest", line 11, in <module> load_entry_point('nxdtest', 'console_scripts', 'nxdtest')() File "/home/kirr/src/wendelin/nxdtest/nxdtest/__init__.py", line 230, in main wg.wait() File "golang/_sync.pyx", line 237, in golang._sync.PyWorkGroup.wait pyerr_reraise(pyerr) File "golang/_sync.pyx", line 217, in golang._sync.PyWorkGroup.go.pyrunf f(pywg._pyctx, *argv, **kw) File "/home/kirr/src/wendelin/nxdtest/nxdtest/__init__.py", line 270, in tee 1/0 ZeroDivisionError: integer division or modulo by zero (neo) (z4-dev) (g.env) kirr@deco:~/src/wendelin/nxdtest$ echo $? 1 NOTE sync.WorkGroup requires every worker to handle context cancellation, so that whenever there is an error, all other workers are canceled. We add such cancellation handling to tee but only lightly: before going to block in read/write syscalls we check for whether ctx is canceled or not. However the proper handling would be to switch file descriptors into non-block mode and to select at every IO point on both potential IO events and potential cancellation. This is left as TODO for the future. /reviewed-on nexedi/nxdtest!7
-
- 24 Nov, 2020 5 commits
-
-
Jérome Perrin authored
These are the necessary changes to run `SlapOS.Eggs.UnitTest-*` and `SlapOS.SoftwareReleases.IntegrationTest-*` using nxdtest See merge request nexedi/nxdtest!6 /reviewed-by @kirr
-
Jérome Perrin authored
If test program output on stderr (which is unbuffered), the output of the test program will appear before output from nxdtest advertising the program that is about to be executed, because nxdtest stdout is buffered (testnode does not set PYTHONUNBUFFERED, and eventhough nxdtest sets PYTHONUNBUFFERED in its own environ, this only applies to sub processes)
-
Jérome Perrin authored
-
Jérome Perrin authored
While pytest sends everything in stdout, some other programs send on stderr.
-
Jérome Perrin authored
While treating output as text would not really be impossible, treating it as bytes seems a better choice because: - we don't have to make assumptions about what output encoding the test program is using for output - `tee` can just read stream output bytes by bytes without having to worry about multi-bytes characters - testnode protocol uses xmlrpc.client.Binary, which uses bytes. Because using bufsize=1 implies reading subprocess output as text, we use bufsize=0 instead in the subprocess.Popen call, to prevent buffering. To make manipulation of strings and bytes easier, we add a dependency on pygolang, so that we can use its strings utility functions. Also add a few tests to verify general functionality.
-
- 09 Nov, 2020 1 commit
-
-
Kirill Smelkov authored
For example wendelin.core 2 uses pytest_unconfigure to unmount wcfs servers that it spawned: https://lab.nexedi.com/kirr/wendelin.core/blob/1be4d730/conftest.py#L27-53 This usually leads to some test as shown in added test. -> Skip that text backwards and actually detect pytest summary line instead of returning empty dict, because empty dict forces Nexedi ERP5 to treat such test run to have UNKNOWN status, e.g. https://erp5.nexedi.net/test_result_module/20201108-C170850/11 /cc @jerome /reviewed-on nexedi/nxdtest!5
-
- 02 Nov, 2020 1 commit
-
-
Kirill Smelkov authored
This allows to establish tests for Nxdtest itself under Nexedi testing infrastructure. /cc @jerome /reviewed-on nexedi/nxdtest!4
-
- 30 Oct, 2020 2 commits
-
-
Kirill Smelkov authored
lab.nexedi.com/kirr/nxdtest -> lab.nexedi.com/nexedi/nxdtest nexedi/slapos!839 (comment 119120) nexedi/slapos!839 (comment 119142) /cc @jerome /reviewed-on nexedi/nxdtest!3
-
Kirill Smelkov authored
/cc @jerome /reviewed-on !1
-
- 28 Oct, 2020 1 commit
-
-
Kirill Smelkov authored
Verify pytest summary-parsing functionality.
-
- 20 Oct, 2020 1 commit
-
-
Kirill Smelkov authored
Create real nxdtest module, so that things could be imported from there. As the result switch `nxdtest` program from scripts to entry_point.
-
- 08 Oct, 2020 1 commit
-
-
Kirill Smelkov authored
Here is e.g. corresponding summary: ===== 12 failed, 14 passed, 4 xfailed, 12 error in 19.58 seconds =====
-
- 07 Oct, 2020 1 commit
-
-
Kirill Smelkov authored
When checking logs on testnode, it is very useful to know which kernel version is running inside (wendelin.core uses FUSE and depends on having patched FUSE module for kernels < 5.2), which CPU is there, etc. Format of printed output follows go benchmarking format and coincides with start of `neotest info-local` (see footers on figures in http://navytux.spb.ru/~kirr/neo.html and here: https://lab.nexedi.com/kirr/neo/blob/1f26678b/go/neo/t/neotest#L585-874)
-
- 05 Oct, 2020 5 commits
-
-
Kirill Smelkov authored
- To list which tests are in there, - To execute only selected tests. Apply only to local mode. Very handy for debugging.
-
Kirill Smelkov authored
We started to print test result summary line for a testcase run since 0153635b (Teach nxdtest to run tests locally). However it is not present in log when nxdtest is run under master. -> Include summary lines everywhere for uniformity, with reason similar to bd1333bb (log += title and argv for ran testcase).
-
Kirill Smelkov authored
This reverts commit 34e96b1d. Reason for revert is that since bd1333bb (log += title and argv for ran testcase) we always emit details of to-be-run command to log in the beginning of testcase run.
-
Kirill Smelkov authored
We started to display command and envadj in the log in the previous patch. However only command - without envadj - was reported to master for test result. -> Make it uniform: include envadj into details everywhere.
-
Kirill Smelkov authored
When there are several or many testcases, it is hard to understand - by seeing just log or console output - which part of the test suite was running. It also helps to see the exact command that was spawned. Example output for pygolang. Before: (neo) (z-dev) (g.env) kirr@deco:~/src/tools/go/pygolang$ nxdtest ============================= test session starts ============================== platform linux2 -- Python 2.7.18, pytest-4.6.11, py-1.9.0, pluggy-0.13.1 rootdir: /home/kirr/src/tools/go/pygolang collected 112 items golang/_gopath_test.py .. [ 1%] golang/context_test.py .. [ 3%] golang/cxx_test.py .. [ 5%] golang/errors_test.py ........ [ 12%] golang/fmt_test.py ... [ 15%] golang/golang_test.py ................................................ [ 58%] golang/io_test.py . [ 58%] golang/strconv_test.py .. [ 60%] golang/strings_test.py ..... [ 65%] golang/sync_test.py ............. [ 76%] golang/time_test.py ........ [ 83%] golang/pyx/build_test.py ... [ 86%] golang/pyx/runtime_test.py . [ 87%] gpython/gpython_test.py ssssss.sssssss [100%] ==================== 99 passed, 13 skipped in 5.42 seconds ===================== ok thread 5.656s # 112t 0e 0f 13s ============================= test session starts ============================== platform linux2 -- Python 2.7.18, pytest-4.6.11, py-1.9.0, pluggy-0.13.1 rootdir: /home/kirr/src/tools/go/pygolang collected 112 items golang/_gopath_test.py .. [ 1%] golang/context_test.py .. [ 3%] golang/cxx_test.py .. [ 5%] golang/errors_test.py ........ [ 12%] golang/fmt_test.py ... [ 15%] golang/golang_test.py ................................................ [ 58%] golang/io_test.py . [ 58%] golang/strconv_test.py .. [ 60%] golang/strings_test.py ..... [ 65%] golang/sync_test.py ............. [ 76%] golang/time_test.py ........ [ 83%] golang/pyx/build_test.py ... [ 86%] golang/pyx/runtime_test.py . [ 87%] gpython/gpython_test.py .............. [100%] ========================= 112 passed in 17.35 seconds ========================== ok gevent 17.768s # 112t 0e 0f 0s After: (neo) (z-dev) (g.env) kirr@deco:~/src/tools/go/pygolang$ nxdtest >>> thread $ python -m pytest ============================= test session starts ============================== platform linux2 -- Python 2.7.18, pytest-4.6.11, py-1.9.0, pluggy-0.13.1 rootdir: /home/kirr/src/tools/go/pygolang collected 112 items golang/_gopath_test.py .. [ 1%] golang/context_test.py .. [ 3%] golang/cxx_test.py .. [ 5%] golang/errors_test.py ........ [ 12%] golang/fmt_test.py ... [ 15%] golang/golang_test.py ................................................ [ 58%] golang/io_test.py . [ 58%] golang/strconv_test.py .. [ 60%] golang/strings_test.py ..... [ 65%] golang/sync_test.py ............. [ 76%] golang/time_test.py ........ [ 83%] golang/pyx/build_test.py ... [ 86%] golang/pyx/runtime_test.py . [ 87%] gpython/gpython_test.py ssssss.sssssss [100%] ==================== 99 passed, 13 skipped in 5.27 seconds ===================== ok thread 5.508s # 112t 0e 0f 13s >>> gevent $ gpython -m pytest ============================= test session starts ============================== platform linux2 -- Python 2.7.18, pytest-4.6.11, py-1.9.0, pluggy-0.13.1 rootdir: /home/kirr/src/tools/go/pygolang collected 112 items golang/_gopath_test.py .. [ 1%] golang/context_test.py .. [ 3%] golang/cxx_test.py .. [ 5%] golang/errors_test.py ........ [ 12%] golang/fmt_test.py ... [ 15%] golang/golang_test.py ................................................ [ 58%] golang/io_test.py . [ 58%] golang/strconv_test.py .. [ 60%] golang/strings_test.py ..... [ 65%] golang/sync_test.py ............. [ 76%] golang/time_test.py ........ [ 83%] golang/pyx/build_test.py ... [ 86%] golang/pyx/runtime_test.py . [ 87%] gpython/gpython_test.py .............. [100%] ========================= 112 passed in 17.32 seconds ========================== ok gevent 17.729s # 112t 0e 0f 0s
-
- 01 Oct, 2020 1 commit
-
-
Kirill Smelkov authored
failure is assertion failure. error is any other error. Source: https://docs.python.org/3/library/unittest.html#organizing-test-code -> "If the test fails, an exception will be raised ... identify the test case as a failure. Any other exceptions will be treated as errors" While searching ERP5 code for failure_count to clarify its meaning, I've also discovered expected_failures and unexpected_successes (*) so they are also added to the table. (*) see e.g. https://lab.nexedi.com/nexedi/erp5/blob/1e31c091/product/ERP5Type/tests/ERP5TypeTestSuite.py#L6-8
-
- 29 Sep, 2020 9 commits
-
-
Kirill Smelkov authored
-
Kirill Smelkov authored
Instead of just traceback saying that e.g. "No such command" without information which command it was tried to spawn, stderr will now include what was tried to be run. This makes diagnostic easier.
-
Kirill Smelkov authored
Many project that use pytest for testing won't need to redefine it over and over again in their .nxdtest file.
-
Kirill Smelkov authored
It is very handy to debug both .nxdtest and nxdtest tool itself, and to be able to run tests locally during development instead of (in addition to) waiting for the test result from test nodes. -> Teach nxdtest to run and report tests locally if no --master_url was provided. Idea is taken from runTestSuite.in from NEO, Cython+, Buildout and Rina: https://lab.nexedi.com/nexedi/slapos/blob/3b14e028/software/neoppod/runTestSuite.in#L95-105 https://lab.nexedi.com/nexedi/slapos/blob/3b14e028/software/neoppod/runTestSuite.in#L55-75 https://lab.nexedi.com/nexedi/slapos/blob/3b14e028/software/neoppod/stress-testing/runTestSuite.in#L17-37 https://lab.nexedi.com/nexedi/slapos/blob/3b14e028/software/cython-test/runTestSuite.in#L38-58 https://lab.nexedi.com/nexedi/slapos/blob/3b14e028/software/buildout-testing/runTestSuite.in#L21-39 https://lab.nexedi.com/nexedi/slapos/blob/3b14e028/software/build-rina/runTestSuite.in#L29-49 Instead of copying code over and over again, let's stick to UNIX approach and make it to be in one tool only that does its job (hopefully) good.
-
Kirill Smelkov authored
Previously, if --master_url was not provided, the program would crash with UnboundLocalError for status, since for that case list of tests was empty. Fix it by moving status initialization higher by one level. See next patch where we'll add meaning for nxdtest run with no --master_url set.
-
Kirill Smelkov authored
It is frequently needed to set some environment variable without completely resetting everything else in the environment, because that everything else can carry settings like PATH, PYTHONPATH, RPATH, etc for the program to be able to run at all. See nexedi/pygolang@b938af8b for recent example.
-
Kirill Smelkov authored
After previous patch that summary function is no-longer used, is specific to NEO/py and so should be defined in .nxdtest file in that project. lab.nexedi.com/kirr/neo will be adjusted correspondingly.
-
Kirill Smelkov authored
Make the tool generic: load list of testcases to run from .nxdtest file and process the correspondingly. See added top-level documentation for details.
-
Kirill Smelkov authored
Amends e9cc7e07 (Add minimal packaging setup)
-