- 25 Sep, 2024 9 commits
-
-
Kirill Smelkov authored
When verifying WCFS we are checking that mtime of particular bigfile corresponds to last transaction that modified its data. However there is a peculiarity that py and go construct time from TID a bit differently with the difference going up to 1µs(*). With that wcfs_test.py, when asserting on timestamps, tries to verify mtime returned by wcfs to tid of corresponding transaction with that 1µs of tolerance. There is a bug, however, in the implementation of tolerance logic. In that logic, when tid is converted to timestamp on py side it is rounded to 6th decimal digit. And it used to work on py2 only due to the fact that on py2 current time is measured to that same 6th decimal digit - to the microsecond precision. That fact is relevant here because ZODB constructs transaction IDs from current time, and if the time precision is only 1µs so will tid, when converted back to time, also have that 1µs precision. However py3, after https://github.com/python/cpython/commit/ad95c2d25c5f, started to use clock_gettime which has 1ns precision instead of 1µs precision of gittimeofday used previously. This results in the breakage of the above logic as demonstrated by the following failure: @func def test_wcfs_basic(): t = tDB(); zf = t.zfile defer(t.close) # >>> lookup non-BigFile -> must be rejected with raises(OSError) as exc: t.wc._stat("head/bigfile/%s" % h(t.nonzfile._p_oid)) assert exc.value.errno == EINVAL # >>> file initially empty f = t.open(zf) f.assertCache([]) > f.assertData ([], mtime=t.at0) wcfs_test.py:1337: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ t = <wcfs.wcfs_test.tFile object at 0x7f674f0d7490>, dataokv = [], mtime = @at0 (03fb8b696044aebb) def assertData(t, dataokv, mtime=None): st = os.fstat(t.f.fileno()) assert st.st_blksize == t.blksize assert st.st_size == len(dataokv)*t.blksize if mtime is not None: > assert st.st_mtime == tidtime(mtime) E assert 1727274802.5628808 == 1727274802.562881 E + where 1727274802.5628808 = os.stat_result(st_mode=33060, st_ino=10, st_dev=71, st_nlink=1, st_uid=1000, st_gid=1000, st_size=0, st_atime=0, st_mtime=1727274802, st_ctime=1727274802).st_mtime E + and 1727274802.562881 = tidtime(@at0 (03fb8b696044aebb)) Here t.at0 is 03fb8b696044aebb which corresponds to 1727274802.5628808 timestamp with 7 digits after point. However tidtime, due to round(·, 6) returns another float with only 6 digits after point which results in the check failure. -> Fix that by adjusting the assert to explicitly verify that the difference in between st_mtime returned by wcfs and tidtime of corresponding transaction ≤ 1µs. (*) see neo@9112f21e for details
-
Kirill Smelkov authored
WCFS tests follow bytestring model from the beginning because it operates with binary data and binary messages to/from WCFS server. It all works ok on py2, but running tests on py3 yielded several problems. For example str and bytes were mixed in the innermost part of _assertBlk def _(ctx, ev): assert t.cached()[blk] == cached ev.append('read pre') # access data with released GIL so that the thread that reads data from # head/watch can receive pin message. Be careful to handle cancellation, # so that on error in another worker we don't get stuck and the # error can be propagated to wait and reported. # # we handle cancellation by spawning read in another thread and # waiting for either ctx cancel, or read thread to complete. This # way on ctx cancel (e.g. assertion failure in another worker), the # read thread can remain running even after _assertBlk returns, and # in particular till the point where the whole test is marked as # failed and shut down. But on test shutdown .fmmap is unmapped for # all opened tFiles, and so read will hit SIGSEGV. Prepare to catch # that SIGSEGV here. have_read = chan(1) def _(): try: b = read_exfault_nogil(blkview[0:1]) except SegmentationFault: b = 'FAULT' t._blkaccess(blk) have_read.send(b) go(_) _, _rx = select( ctx.done().recv, # 0 have_read.recv, # 1 ) if _ == 0: raise ctx.err() b = _rx > ev.append('read ' + b) E TypeError: can only concatenate str (not "bytes") to str bytes input was split with str delimiter def _loadStats(t): # -> {} stats = {} for l in t.wc._read(".wcfs/stats").splitlines(): # key : value > k, v = l.split(':') E TypeError: a bytes-like object is required, not 'str' and str object rejected when assigning to C `char*`: Traceback (most recent call last): File "golang/_golang.pyx", line 156, in golang._golang.__goviac File "wcfs/internal/wcfs_test.pyx", line 58, in wendelin.wcfs.internal.wcfs_test._tWCFS._abort_ontimeout TypeError: expected bytes, str found -> Fix all those overlooks by consistently using bytestrings everywhere. On py3 the implementation depends on nexedi/pygolang!21, but on py2 it works both with and without pygolang bstr patches. Preliminary history: vnmabus/wendelin.core@af56bd31 but it takes the reverse approach and mixes in pytest.approx for timestamp assert which is unrelated to the topic and is not correct as default pytest.approx behaviour is to use 1e-6 relative precision which results in ~1700s seconds tolerance instead of intended 1µs: In [1]: import pytest In [2]: import time In [3]: t = time.time() In [4]: t Out[4]: 1727271692.0636573 In [5]: t == t Out[5]: True In [6]: t == pytest.approx(t+1000) Out[6]: True In [7]: t == pytest.approx(t+2000) Out[7]: False So using pytest.approx resulted in tests to become accepting faulty wcfs behaviour. Timestamp assert will be handled properly in the next patch. Co-authored-by: Carlos Ramos Carreño <carlos.ramos@nexedi.com>
-
Kirill Smelkov authored
wcfs/client/_wcfs.pyx provides Cython wrapper over C++ WCFS client that works with bytes-based std::string messages. On py2 everything works ok, but on py3, due to this, it rejects str given as input argument, e.g. as follows: ```python _____________________________ test_join_autostart ______________________________ @func def test_join_autostart(): zurl = testzurl with raises(RuntimeError, match="wcfs: join .*: server not running"): wcfs.join(zurl, autostart=False) assert wcfs._wcregistry == {} def _(): assert wcfs._wcregistry == {} defer(_) > wc = wcfs.join(zurl, autostart=True) wcfs/wcfs_test.py:164: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ wcfs/__init__.py:225: in join wc = WCFS(mntpt, fwcfs, wcsrv) ../../venvs/wendelin.core/lib/python3.9/site-packages/decorator.py:232: in fun return caller(func, *(extras + args), **kw) ../pygolang/golang/__init__.py:125: in _ return f(*argv, **kw) wcfs/__init__.py:167: in __init__ wc.mountpoint = mountpoint wcfs/client/_wcfs.pyx:44: in wendelin.wcfs.client._wcfs.PyWCFS.mountpoint.__set__ def __set__(PyWCFS pywc, string v): _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > ??? E TypeError: expected bytes, str found ``` because by default Cython treats std::string as related to bytes on py side. -> Fix it by accepting both str and bytes as input for all methods arguments related to strings. For returned strings care to return them as strings, not bytes, to which Cython converts std::string by default because calling code expects returned messages to have string semantic. Though we return the data as bytestring, not unicode, as the rest of the testsuite also assumes binary messages reception from WCFS server. NOTE even though it was me to originally suggest in private to use cython: c_string_type=str, c_string_encoding=utf8 so that str type is accepted as input, later, when having a broader look, I realized that there are two problems with the above directives. First the directives affect not only the input, but also any std::string returned becomes returned as unicode instead of bytes/bytestr previously. However as explained above the higher level expects binary semantic from returned messages. And second if WCFS sends a message with invalid UTF-8 data, it will result in exception thrown on the client instead of actually returning sent data to the caller. This makes debugging more difficult and last thing I want to happen is, when WCFS sends some garbage, to get a UnicodeDecodeError instead of actually seeing the message and higher level assert saying that that message is unexpected with providing details. So do all the in- and out- conversions by hand instead with controlling desired semantics ourselves. On py3 the implementation depends on nexedi/pygolang!21, but on py2 it works both with and without pygolang bstr patches. Preliminary history: vnmabus/wendelin.core@47c27b03Co-authored-by: Carlos Ramos Carreño <carlos.ramos@nexedi.com>
-
Kirill Smelkov authored
Treegen.py was using bytes valv and iterating it to get tree keys to be set, but on py3 iterating bytes yields integer not bytes and so with py3 e.g. TestΔBTail was failing as --- FAIL: TestΔBTail (1.21s) panic: root['treegen/values']: key ['a']: expected str, got int64 [recovered] panic: file:///tmp/TestΔBTail1569960846/001/1.fs: @03fb8a9c91bba733: get blktab: root['treegen/values']: key ['a']: expected str, got int64 [recovered] -> Fix it by reworking treegen.py to operate in strings domain and adjusting treeenv.go loader to use pickle.AsString correspondingly. On py3 the implementation depends on nexedi/pygolang!21, but on py2 it works both with and without pygolang bstr patches. Preliminary history: vnmabus/wendelin.core@eca099bc vnmabus/wendelin.core@5a2b639c vnmabus/wendelin.core@fadc5a07 but there we were operating in mixed string/bytes mode with cluttering the code with .encode('ascii') calls and potentially missing to handle some logic right because on py3 str == bytes returns False, not raises, and with mixed types it is easy to miss some logic where conversion is needed if the outcome depends on values comparison. Switching to work in strings domain uniformly avoids those problem in principle and by doing only one thing locally. Co-authored-by: Carlos Ramos Carreño <carlos.ramos@nexedi.com>
-
Kirill Smelkov authored
Previously TestZBlk was xfailing as e.g. --- FAIL: TestZBlk/py3_pickle3 (0.00s) panic: ZBlk0(0000000000000002): loadBlkData: wendelin.bigfile.file_zodb.ZBlk0(0000000000000002): activate: pysetstate: expect str; got ogórek.Bytes [recovered] panic: ZBlk0(0000000000000002): loadBlkData: wendelin.bigfile.file_zodb.ZBlk0(0000000000000002): activate: pysetstate: expect str; got ogórek.Bytes because on py3 ZBlk data is saved as bytes while zblk.go code was trying to decode it as bytestring (str from py2). -> Fix that by adjusting ZBlk/go to follow py3 model with ZBlk data being considered to be bytes but also accepting bytestr from py2 for that. Now on go side loading ZBlk works for all py2_pickle{1,2,3} and py3_pickle3 data.
-
Kirill Smelkov authored
This brings ZODB/go updates to support Python3 from neo!8 that we will need to support Python3(*). For now do not add any py3 support on wcfs side - just update neo/go with preserving wcfs/py2 in working state. Note that this ZODB/go update brings backward incompatible changes to activate StrictUnicode and PyDict modes(+) on unpickling and also changes API of zodb.Map to follow that PyDict mode. We thus need to update go bits inside wcfs because else compilation and tests, expectedly, start to break in the following ways # lab.nexedi.com/nexedi/wendelin.core/wcfs/internal/xbtree/xbtreetest internal/xbtree/xbtreetest/treeenv.go:292:24: zroot.Data undefined (type *zodb.Map has no field or method Data) # lab.nexedi.com/nexedi/wendelin.core/wcfs/internal/zdata [lab.nexedi.com/nexedi/wendelin.core/wcfs/internal/zdata.test] ./δftail_test.go:737:16: zroot.Data undefined (type *zodb.Map has no field or method Data) --- FAIL: TestZBlk (0.00s) --- FAIL: TestZBlk/py2_pickle1 (0.00s) panic: ZBlk0(0000000000000002): loadBlkData: wendelin.bigfile.file_zodb.ZBlk0(0000000000000002): activate: pysetstate: expect str; got ogórek.ByteString [recovered] panic: ZBlk0(0000000000000002): loadBlkData: wendelin.bigfile.file_zodb.ZBlk0(0000000000000002): activate: pysetstate: expect str; got ogórek.ByteString --- FAIL: TestΔFtail (0.98s) panic: root['treegen/values']: key ["a"]: expected str, got ogórek.ByteString [recovered] panic: file:///tmp/TestΔFtail4275282346/001/1.fs: @03fb8a25642d1e88: get blktab: root['treegen/values']: key ["a"]: expected str, got ogórek.ByteString [recovered] panic: file:///tmp/TestΔFtail4275282346/001/1.fs: @03fb8a25642d1e88: get blktab: root['treegen/values']: key ["a"]: expected str, got ogórek.ByteString --- FAIL: TestΔBTail (1.13s) panic: root['treegen/values']: key ["a"]: expected str, got ogórek.ByteString [recovered] panic: file:///tmp/TestΔBTail3238267558/001/1.fs: @03fb8a212cd06933: get blktab: root['treegen/values']: key ["a"]: expected str, got ogórek.ByteString [recovered] panic: file:///tmp/TestΔBTail3238267558/001/1.fs: @03fb8a212cd06933: get blktab: root['treegen/values']: key ["a"]: expected str, got ogórek.ByteString The changes are minimal just to keep wcfs/py2 working ok. We will do more changes in the follow-up patches to support py3. All the changes in go.mod and go.sum are produced automatically by sole `go get lab.nexedi.com/kirr/neo/go@t+ypy3`. (*) see neo@efde5253...6235fb60 for full list of changes. (+) see neo@078b6262, neo@1e87fe5b, https://github.com/kisielk/og-rek/pull/68 and https://github.com/kisielk/og-rek/pull/75 for details.
-
Kirill Smelkov authored
Our "as int64" code was put into ogórek as generalized utility in https://github.com/kisielk/og-rek/commit/010fbd2e.
-
Kirill Smelkov authored
Going from v1.2.0 to e691997e3596 brings in StrictUnicode + PyDict modes and other improvements: https://github.com/kisielk/og-rek/compare/v1.2.0...e691997e3596. We will need those improvements in the next patches.
-
Kirill Smelkov authored
Previously we were testing ZBlk loading on Go side only with data generated by python2 and pickle protocol=2. However even on py2 there are more pickle protocols that are in use, and also there is python3. -> Modernize testdata/zblk_test_gen.py to use run_with_all_zodb_pickle_kinds that was recently added as part of nexedi/zodbtools@f9d36ba7 and generate test data with both python2 and python3. It is handy to use py2py3-venv(*) to prepare python environment to do that. Adjust tests on Go side to verify how ZBlk is loaded for all generated zkinds. py2_pickle1, py2_pickle2 and py2_pickle3 are handled well. ZBlk test for py3_pickle3 currently fails with --- FAIL: TestZBlk/py3_pickle3 (0.01s) panic: ZBlk0(0000000000000002): loadBlkData: wendelin.bigfile.file_zodb.ZBlk0(0000000000000002): activate: pysetstate: expect str; got ogórek.Bytes [recovered] and so is marked with "xfail". We will fix tests for py3_pickle3 in follow-up patches. (*) see nexedi/zodbtools@fac2f190
-
- 19 Sep, 2024 2 commits
-
-
Levin Zimmermann authored
/reviewed-by @kirr /reviewed-on nexedi/wendelin.core!32
-
Levin Zimmermann authored
This patch updates NEO/go to include a patch that fixes non-deterministic crashs for 'wendelin.core'. Its tradeoff is however a moderate memory leak. This leak is going to be fixed later when a new golang version is released. See more details about this here: neo@ee23551d /reviewed-by @kirr /reviewed-on nexedi/wendelin.core!32
-
- 18 Sep, 2024 1 commit
-
-
Kirill Smelkov authored
sendReq has two phases: a) send request, and b) read reply. When there is an error on the first phase, e.g. client does not read what wcfs is trying to send, it returns an error like pin #2 @03fb63abd6d65b33: sendReq: send .2: context deadline exceeded however when there is an error on the second phase, e.g. client does not reply to wcfs request, it currently returns an error like pin #2 @03fb63abd6d65b33: sendReq: context deadline exceeded which is not clear to interpret about which part was problematic. After this patch the error for the second case becomes pin #2 @03fb63abd6d65b33: sendReq: waiting for reply: context deadline exceeded which is easier to interpret. /reviewed-by @levin.zimmermann /reviewed-on nexedi/wendelin.core!31
-
- 17 Sep, 2024 28 commits
-
-
Levin Zimmermann authored
This patch updates: - github.com/golang/glog: we already wanted to do so in nexedi/wendelin.core!23, but we deferred it to keep go 1.18 support. However in recent patches we already dropped go 1.18 support and we can therefore update glog now. - lab.nexedi.com/kirr/neo/go: add fix in handshake, see here for more information: neo@d75f4ac2 and neo@03db1d8a This patch doesn't update: - github.com/hanwen/go-fuse: This was updated upstream and Kirill already reviewed and integrated patches in custom branch. However when updating go-fuse to v2.4.3-0.20240904154523-9546fc238dc6 (this is go-fuse@9546fc23), WCFS tests fail on my machine [1] => let's defer update - github.com/kisielk/og-rek: there are new patches that will be needed in the future, but we didn't update NEO/go og-rek dependency yet, so let's defer the update in wendelin.core until we updated og-rek in NEO/go - github.com/johncgriffin/overflow: no update on upstream - github.com/pkg/errors: no update on upstream - github.com/stretchr/testify: This was already updated with nexedi/wendelin.core@c559ec1a 'testify' seems to have a major release in the future which may break some of our test code, but for now major version 1 is still the stable release. ---- kirr: I confirm that go-fuse@9546fc23 brings in regression to WCFS tests. It seems I missed some error in that go-fuse update and it will need to be bisected and debugged. --- [1] Test failure log: ========================================== FAILURES ========================================== ______________________________________ test_wcfs_basic _______________________________________ @func def test_wcfs_basic(): t = tDB(); zf = t.zfile defer(t.close) # >>> lookup non-BigFile -> must be rejected with raises(OSError) as exc: t.wc._stat("head/bigfile/%s" % h(t.nonzfile._p_oid)) assert exc.value.errno == EINVAL # >>> file initially empty f = t.open(zf) f.assertCache([]) f.assertData ([], mtime=t.at0) # >>> (@at1) commit data -> we can see it on wcfs at1 = t.commit(zf, {2:'c1'}) f.assertCache([0,0,0]) # initially not cached f.assertData (['','','c1'], mtime=t.head) # >>> (@at2) commit again -> we can see both latest and snapshotted states # NOTE blocks e(4) and f(5) will be accessed only in the end at2 = t.commit(zf, {2:'c2', 3:'d2', 5:'f2'}) # f @head > f.assertCache([1,1,0,0,0,0]) wcfs/wcfs_test.py:1341: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ t = <wcfs.wcfs_test.tFile instance at 0x7ff61457b960>, incorev = [1, 1, 0, 0, 0, 0] def assertCache(t, incorev): > assert t.cached() == incorev E assert [0, 0, 0, 0, 0, 0] == [1, 1, 0, 0, 0, 0] E At index 0 diff: 0 != 1 E Use -v to get the full diff wcfs/wcfs_test.py:791: AssertionError ------------------------------------ Captured stdout call ------------------------------------ M: commit -> @at0 (03fb5dfbe3c1cd55) M: commit -> @at1 (03fb5dfbe4936a66) M: f<0000000000000002> [2] M: commit -> @at2 (03fb5dfbe4d01166) M: f<0000000000000002> [2, 3, 5] >>> Change history by file: f<0000000000000002>: 0 1 2 3 4 5 6 7 a b c d e f g h @at0 (03fb5dfbe3c1cd55) @at1 (03fb5dfbe4936a66) 2 @at2 (03fb5dfbe4d01166) 2 3 5 ------------------------------------ Captured stderr call ------------------------------------ I0917 12:43:53.392222 124283 wcfs.go:2752] start "/dev/shm/wcfs/0ca22ca24e4cff2d01c10aa546fe5d5ac64bce72" "file:///tmp/testdb_fs.z5ZoMH/1.fs" I0917 12:43:53.392282 124283 wcfs.go:2758] (built with go1.21.13) W0917 12:43:53.392404 124283 storage.go:232] zodb: FIXME: open file:///tmp/testdb_fs.z5ZoMH/1.fs: raw cache is not ready for invalidations -> NoCache forced W0917 12:43:53.567807 124283 wcfs.go:2331] /head/bigfile: lookup "0000000000000001": bigfopen 0000000000000001 @03fb5dfbe3c1cd55: invalid argument: ZODB.Broken("persistent.Persistent") is not a ZBigFile I0917 12:43:53.710208 124283 wcfs.go:2933] stop "/dev/shm/wcfs/0ca22ca24e4cff2d01c10aa546fe5d5ac64bce72" "file:///tmp/testdb_fs.z5ZoMH/1.fs" ------------------------------------- Captured log call -------------------------------------- WARNING ZODB.FileStorage:FileStorage.py:412 Ignoring index for /tmp/testdb_fs.z5ZoMH/1.fs _________________________________ test_wcfs_watch_vs_access __________________________________ @func def test_wcfs_watch_vs_access(): t = tDB(); zf = t.zfile; at0=t.at0 defer(t.close) f = t.open(zf) at1 = t.commit(zf, {2:'c1'}) at2 = t.commit(zf, {2:'c2', 3:'d2', 5:'f2'}) at3 = t.commit(zf, {0:'a3', 2:'c3', 5:'f3'}) f.assertData(['a3','','c3','d2','x','x']) f.assertCache([1,1,1,1,0,0]) # watched + commit -> read -> receive pin messages. # read vs pin ordering is checked by assertBlk. # # f(5) is kept not accessed to check later how wcfs.go handles δFtail # rebuild after it sees not yet accessed ZBlk that has change history. wl3 = t.openwatch(); w3 = wl3.watch(zf, at3); assert at3 == t.head assert w3.at == at3 assert w3.pinned == {} wl3_ = t.openwatch(); w3_ = wl3_.watch(zf, at3) assert w3_.at == at3 assert w3_.pinned == {} wl2 = t.openwatch(); w2 = wl2.watch(zf, at2) assert w2.at == at2 assert w2.pinned == {0:at0, 2:at2} # w_assertPin asserts on state of .pinned for {w3,w3_,w2} def w_assertPin(pinw3, pinw3_, pinw2): assert w3.pinned == pinw3 assert w3_.pinned == pinw3_ assert w2.pinned == pinw2 f.assertCache([1,1,1,1,0,0]) at4 = t.commit(zf, {1:'b4', 2:'c4', 5:'f4', 6:'g4'}) > f.assertCache([1,0,0,1,0,0,0]) wcfs/wcfs_test.py:1702: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ t = <wcfs.wcfs_test.tFile instance at 0x7ff614512050>, incorev = [1, 0, 0, 1, 0, 0, ...] def assertCache(t, incorev): > assert t.cached() == incorev E assert [0, 0, 0, 0, 0, 0, ...] == [1, 0, 0, 1, 0, 0, ...] E At index 0 diff: 0 != 1 E Use -v to get the full diff wcfs/wcfs_test.py:791: AssertionError ------------------------------------ Captured stdout call ------------------------------------ M: commit -> @at0 (03fb5dfc0fd82300) M: commit -> @at1 (03fb5dfc10b92ecc) M: f<0000000000000049> [2] M: commit -> @at2 (03fb5dfc10cee9dd) M: f<0000000000000049> [2, 3, 5] M: commit -> @at3 (03fb5dfc1100c999) M: f<0000000000000049> [0, 2, 5] C: setup watch f<0000000000000049> @at3 (03fb5dfc1100c999) C: setup watch f<0000000000000049> @at3 (03fb5dfc1100c999) C: setup watch f<0000000000000049> @at2 (03fb5dfc10cee9dd) M: commit -> @at4 (03fb5dfc120ed611) M: f<0000000000000049> [1, 2, 5, 6] >>> Change history by file: f<0000000000000049>: 0 1 2 3 4 5 6 7 a b c d e f g h @at0 (03fb5dfc0fd82300) @at1 (03fb5dfc10b92ecc) 2 @at2 (03fb5dfc10cee9dd) 2 3 5 @at3 (03fb5dfc1100c999) 0 2 5 @at4 (03fb5dfc120ed611) 1 2 5 6 ------------------------------------ Captured stderr call ------------------------------------ I0917 12:44:03.733037 125217 wcfs.go:2752] start "/dev/shm/wcfs/0ca22ca24e4cff2d01c10aa546fe5d5ac64bce72" "file:///tmp/testdb_fs.z5ZoMH/1.fs" I0917 12:44:03.733126 125217 wcfs.go:2758] (built with go1.21.13) W0917 12:44:03.733418 125217 storage.go:232] zodb: FIXME: open file:///tmp/testdb_fs.z5ZoMH/1.fs: raw cache is not ready for invalidations -> NoCache forced I0917 12:44:04.475273 125217 wcfs.go:2933] stop "/dev/shm/wcfs/0ca22ca24e4cff2d01c10aa546fe5d5ac64bce72" "file:///tmp/testdb_fs.z5ZoMH/1.fs" ============================ 2 failed, 42 passed in 55.81 seconds ============================ I0917 12:44:17.882140 125540 wcfs.go:2933] stop "/dev/shm/wcfs/c4d833a0bdea4c51decf5425b8ad2cc4d017280f" "file:///tmp/testdb_fs.bvHBy9/1.fs" make: *** [Makefile:174: test.wcfs] Error 1 /reviewed-by @kirr /reviewed-on nexedi/wendelin.core!30
-
Kirill Smelkov authored
The WCFS documentation specifies [1]: - - - 8> - - - 8> - - - If a client, on purpose or due to a bug or being stopped, is slow to respond with ack to file invalidation notification, it creates a problem because the server will become blocked waiting for pin acknowledgments, and thus all other clients, that try to work with the same file, will get stuck. [...] Lacking OS primitives to change address space of another process and not being able to work it around with ptrace in userspace, wcfs takes approach to kill a slow client on 30 seconds timeout by default. - - - <8 - - - <8 - - - But before, this protection wasn't implemented yet: one faulty client could therefore freeze the whole system. With this work this protection is implemented now: faulty clients are killed after the timeout or any other misbehaviour in their pin handlers. Working on this topic also resulted in several fixes and improvements around isolation protocol implementation on the server side. See individual patches for details. [1] https://lab.nexedi.com/nexedi/wendelin.core/blob/38dde766/wcfs/wcfs.go#L186-208Co-authored-by: Levin Zimmermann <levin.zimmermann@nexedi.com> /reviewed-on nexedi/wendelin.core!18
-
Levin Zimmermann authored
I would only suggest one very tiny change. In go.mod we have: module lab.nexedi.com/nexedi/wendelin.core/wcfs go 1.14 I think this needs to be updated to go 1.19 due to atomic.Int64. And maybe we just need general go mod tidy update. /reviewed-by @kirr /reviewed-on nexedi/wendelin.core!18
-
Kirill Smelkov authored
The WCFS documentation specifies [1]: - - - 8> - - - 8> - - - If a client, on purpose or due to a bug or being stopped, is slow to respond with ack to file invalidation notification, it creates a problem because the server will become blocked waiting for pin acknowledgments, and thus all other clients, that try to work with the same file, will get stuck. [...] Lacking OS primitives to change address space of another process and not being able to work it around with ptrace in userspace, wcfs takes approach to kill a slow client on 30 seconds timeout by default. - - - <8 - - - <8 - - - But before this patch, this protection wasn't implemented yet: one faulty client could therefore freeze the whole system. With this patch this protection is implemented now: faulty clients are killed after the timeout or any other misbehaviour in their pin handlers. [1] https://lab.nexedi.com/nexedi/wendelin.core/blob/38dde766/wcfs/wcfs.go#L186-208 Preliminary history: levin.zimmermann/wendelin.core@24904e82 levin.zimmermann/wendelin.core@b02dcadcCo-authored-by: Levin Zimmermann <levin.zimmermann@nexedi.com> /discussed-on nexedi/wendelin.core!18
-
Kirill Smelkov authored
If a pin misbehaves or there is IO error or anything else, we want to stop all communication on the watchlink, cancel on in-flight pin handlers, and (TODO) kill the client with SIGBUS. This patch organizes WatchLink shutdown on any pin error. This functionality is indirectly tested by test_Wcfs_watch_robust and will be also indirectly tested by faultyprotection tests. It would be good to have dedicated tests probably, but that is, hopefully, TODO. /reviewed-by @levin.zimmermann /reviewed-on nexedi/wendelin.core!18
-
Kirill Smelkov authored
Pinning is critical operation whose failure will soon lead to client being killed with SIGBUS. WCFS correctness also depend fundamentally on pin operation, if started, to be handled by the client. -> rework the READ handler not to cancel pin if a READ interrupt comes in from the OS client. Do this via organizing WatchLink.serveCtx and running pins under this context instead of under READ context. Later we will adjust pins to also cancel this context on any error. Test is, hopefully, TODO. /reviewed-by @levin.zimmermann /reviewed-on nexedi/wendelin.core!18
-
Kirill Smelkov authored
When serve is completing and going to exit, it sends an error message to the client without any timeout. So if the client is not reading from the channel, wcfs will get stuck waiting for the message to be consumed. -> Fix that by trying to send that last error only during 1 second and ignoring errors if any Test is, hopefully, TODO. /reviewed-by @levin.zimmermann /reviewed-on nexedi/wendelin.core!18
-
Kirill Smelkov authored
Bring in more structure: - final watchlink cleanup is done in its own block - cancelling spawned handlers is done in another block - add more comments explaining things /reviewed-by @levin.zimmermann /reviewed-on nexedi/wendelin.core!18
-
Kirill Smelkov authored
Previously we were using .sk.CloseRead() to interrupt sk.Read(), but that is not necessary since .sk, relying on xio.Pipe, implements xio.Reader natively with full support for cancellation. The original code to cancel via CloseRead comes from mid 2019 and predates kirr/go123@7ad867a3 kirr/go123@0e368363 kirr/go123@0bdac628 kirr/go123@9db4dfac kirr/go123@d2dc6c09 And in kirr/wendelin.core@b17aeb8c and 6f0cdaff (wcfs: Provide isolation to clients), it seems, I missed to update WatchLink.serve code to that. Do that now because it simplifies code flow organization a bit. /reviewed-by @levin.zimmermann /reviewed-on nexedi/wendelin.core!18
-
Kirill Smelkov authored
So far we were testing only against faulty client that reads pin notification ok, but does not reply to the notification. But there could be more problems: 1) a client does not read pin notification at all 2) a client closes watchlink abruptly after reading pin notification 3) a client replies to pin notification but the reply is not "ack" The first problem, if not handled leads to whole set of clients to become stuck on reading the same block as the faulty client. The other problems also indicate breakage of the isolation protocol from the client side and that wcfs can no longer be sure that it provides good uncorrupted data to the client. In the first case, similarly to "no reply" situation we need to kill the client to make progress while maintaining safety as well. In the cases 2 and 3 we cannot maintain safety if the faulty client remains in the set of live and served clients, so it is also logical to send SIGBUS/SIGKILL to it. Killing a client with SIGBUS is similar to how OS kernel sends SIGBUS when a memory-mapped file is accessed and loading file data results in EIO. It is also similar to wendelin.core 1 where SIGBUS is raised if loading file block results in an error. Extend tests to cover all explained scenarios. /reviewed-by @levin.zimmermann /reviewed-on !18
-
Kirill Smelkov authored
wcfs: tests: Add test to exercies faulty client that does not reply to pin triggered by readPinWatchers Levin writes: This patch extends the test scope of 'test_wcfs_pintimeout_kill'. Before this patch, the test only ensured that a client that does not respond to pin requests during the initial watch request [1] is killed. Now it also tests that a faulty client is killed when a block is invalidated. Since there are no other situations where the WCFS server sends pin requests to a client, the tests now cover all situations where a faulty client might not respond. This patch therefore aims to increase the security that WCFS is not blocked by a faulty client. [1] See nexedi/wendelin.core!18 Preliminary history: 9d42efffCo-authored-by: Levin Zimmermann <levin.zimmermann@nexedi.com> /discussed-on nexedi/wendelin.core!18
-
Kirill Smelkov authored
We will need to use this utilitin from several places in the next patch. /reviewed-by @levin.zimmermann /reviewed-on nexedi/wendelin.core!18
-
Kirill Smelkov authored
Currently assertBlk uses default timeout() to wait for READ operation to complete. That works well everywhere except that in faulty protection tests wcfs server will first need to wait for its own pintimeout time to kill the faulty client and only then return read result to all non-faulty clients. This way corresponding test, when one client fails to handle pin notification well triggered due to READ operations, will need to use adjusted longer timeout for the good client when doing assertBlk. Adjust assertBlk to allow specifying custom timeout as preparatory step for that. /reviewed-by @levin.zimmermann /reviewed-on !18
-
Kirill Smelkov authored
And make sure that that good client can setup its watch ok even through there simultaneously is a faulty client that should get killed. /reviewed-by @levin.zimmermann /reviewed-on nexedi/wendelin.core!18
-
Kirill Smelkov authored
If we don't the whole testing process will become killed when wcfs becomes taught to kill clients that do not handle pin notifications well. Use multiprocessing to do so and to be able to interoperate with spawned test process by sending/receiving objects to/from it. Preliminary history: aef0f0e1Co-authored-by: Levin Zimmermann <levin.zimmermann@nexedi.com> /discussed-on nexedi/wendelin.core!18
-
Kirill Smelkov authored
If wcfs kills client that did not respond to pin notification in pintimeout time, we need to wait strictly _more_ than that time to detect whether client was killed or not. And in practice, due to noise in operating system load and other factors, that waiting time should be significantly greater to detect lack of expected event. However we were waiting for exactly 1·pintimeout time and were claiming that there was no pinkill event right after that. -> Wait for 2·pintimeout instead of 1·pintimeout to make pinkill detection robust. /reviewed-by @levin.zimmermann /reviewed-on nexedi/wendelin.core!18
-
Kirill Smelkov authored
The default "pin timeout" is 30s and we are going to add many tests that exercise pinkilling functionality soon. If every such test takes 2·pintimeout time = 60s, it will result in significant time increase needed to run WCFS tests. Avoid that by adjusting pin timeout to one order of magnitude smaller pintimeout=3s during faulty protection tests. /reviewed-by @levin.zimmermann /reviewed-on nexedi/wendelin.core!18
-
Kirill Smelkov authored
This testing helper limits whole test time to detect FUSE-related deadlocks via aborting FUSE connection on timeout. It is working good so far. But soon we will need pinkill-related tests, where timeout will need to be detected independently of FUSE connection. Expose tWCFS.ctx for tests to be able to use this context and do things limited in time. Adjust FUSE aborting to correlate exactly with this context cancellation. /reviewed-by @levin.zimmermann /reviewed-on nexedi/wendelin.core!18
-
Kirill Smelkov authored
We are going to add more tests on this topic + supporting infrastructure. It makes sense to move everything related to dedicated test file first as a preparatory step because wcfs_test.py feels already overloaded. Plain code movement. /reviewed-by @levin.zimmermann /reviewed-on nexedi/wendelin.core!18
-
Kirill Smelkov authored
WCFS allows issuing simultaneous watch requests and when two watch requests are simultaneously issued for the same file there was a race in their handling: the code was relying on w.atMu.W to protect setupWatch from concurrent readPinWatcher, and also, seemingly from another setupWatch running on the same file. But there is a bug about that: lacking atomic primitive to downgrade RWMutex from wlock to rlock, atMu.W was first fully unlocked and then rlocked again. The code prepare wrt readPinWatcher to start running in that unlock->rlock time window, but it was not prepared wrt another setupWatch starting to run on the same file in that pause time. -> Fix that via using dedicated Watch.setupMu lock that protects setupWatch from setupWatch. Test is, hopefully, TODO. My mistake from 6f0cdaff (wcfs: Provide isolation to clients) /reviewed-by @levin.zimmermann /reviewed-on nexedi/wendelin.core!18
-
Kirill Smelkov authored
Inside readPinWatchers: https://lab.nexedi.com/nexedi/wendelin.core/-/blob/wendelin.core-2.0.alpha3-26-g79e6f7b9/wcfs/wcfs.go#L1536-1591 if δFtail.BlkRevAt would return an error, then f.watchMu was not RUnlocked back, and wg.Wait was not called at all. -> Fix that by scheduling unlock and wg wait right after f.watchMu is rlocked and workgroup is created. Test is, hopefully, TODO. My mistake from 6f0cdaff (wcfs: Provide isolation to clients) /reviewed-by @levin.zimmermann /reviewed-on nexedi/wendelin.core!18
-
Kirill Smelkov authored
The code was already behaving like that but there was XXX to do it. Add test to verify it is actually done. Opened WatchLink handle is released after RELEASE because read in WatchLink.serve, after RELEASE, returns EOF and then the code inside WCFS does all necessary WatchLink-related cleanup: https://lab.nexedi.com/nexedi/wendelin.core/-/blob/wendelin.core-2.0.alpha3-26-g79e6f7b9/wcfs/wcfs.go#L1828-1872 /reviewed-by @levin.zimmermann /reviewed-on nexedi/wendelin.core!18
-
Kirill Smelkov authored
This was marked as TODO in server code and not implemented. Without this cleanup zheadSockTab was growing indefinitely after every open/close and leaking memory. -> Fix it via registering RELEASE handler to FUSE and removing corresponding zheadSockTab entry from there. /reviewed-by @levin.zimmermann /reviewed-on !18
-
Kirill Smelkov authored
Report there number of inside-WCFS instances, e.g. number of tracked BigFiles, WatchLinks etc, and also number of counted events, for example how many times a pin event happened. Soon we will need this statistics to implement tests e.g. for pinkilling and other functionalities, and it might be also useful to have in general. /reviewed-by @levin.zimmermann /reviewed-on !18
-
Kirill Smelkov authored
ZWatcher says it does not need to lock wlinkMu because it is already holding zheadMu and setupWatch runs with zheadMu locked. That is indeed true, but the mistake here is that it i not only setupWatch that makes access to wlinkTab. For example WatchNode.Open registers new entries there only under wlinkMu: https://lab.nexedi.com/nexedi/wendelin.core/-/blob/wendelin.core-2.0.alpha3-26-g79e6f7b9/wcfs/wcfs.go#L1819-1822 -> Fix it by always using wlinkMu when accessing wlinkTab. My mistake from 6f0cdaff (wcfs: Provide isolation to clients) Test is, hopefully, TODO. /reviewed-by @levin.zimmermann /reviewed-on nexedi/wendelin.core!18
-
Kirill Smelkov authored
Previously we were protecting access to zheadSockTab with zheadMu because this table was accessed from only two places: when opening .wcfs/zhead and in zwatcher. Soon we are going to add another place that will access this table and still using big zheadMu seem less and less logical. -> Switch to using dedicated lock to protect table of .wcfs/zhead opens as preparatory step for that. /reviewed-by @levin.zimmermann /reviewed-on nexedi/wendelin.core!18
-
Kirill Smelkov authored
Currently zwatcher failure leads to wcfs starting to provide stale data instead of uptodate data. Fix that by detecting zwatcher failures and explicitly switching the filesystem to a mode where any access to anything returns "input/output error". Zwatcher can fail on e.g. failure to retrieve transactions from ZODB storage or any other failure. With this patch we make sure this does not go unnoticed. /reviewed-by @levin.zimmermann /reviewed-on nexedi/wendelin.core!18
-
Kirill Smelkov authored
go-fuse added functionality to handle Init.MaxPages in https://github.com/hanwen/go-fuse/commit/265a39266958. /reviewed-by @levin.zimmermann /reviewed-on nexedi/wendelin.core!18
-