wcfs: Implement protection against faulty client + related fixes and improvements

The WCFS documentation specifies [1]:

- - - 8> - - - 8> - - -

If a client, on purpose or due to a bug or being stopped, is slow to respond with ack to file invalidation notification, it creates a problem because the server will become blocked waiting for pin acknowledgments, and thus all other clients, that try to work with the same file, will get stuck.

[...]

Lacking OS primitives to change address space of another process and not being able to work it around with ptrace in userspace, wcfs takes approach to kill a slow client on 30 seconds timeout by default.

- - - <8 - - - <8 - - -

But before, this protection wasn't implemented yet: one faulty client could therefore freeze the whole system. With this work this protection is implemented now: faulty clients are killed after the timeout or any other misbehaviour in their pin handlers.

Working on this topic also resulted in several fixes and improvements around isolation protocol implementation on the server side. See individual patches for details.

[1] https://lab.nexedi.com/nexedi/wendelin.core/blob/38dde766/wcfs/wcfs.go#L186-208

Co-authored-by: Levin Zimmermann <levin.zimmermann@nexedi.com>

/reviewed-on nexedi/wendelin.core!18
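
The protection described in the message above can be illustrated with a minimal sketch. It is not the wcfs implementation (the server itself lives in wcfs.go); it only shows the general pattern of waiting for a pin acknowledgment with a timeout and killing the client when the ack does not arrive. The names `wait_pin_ack`, `ack`, and `kill` are hypothetical, and the demo uses short timeouts instead of the documented 30-second default so that it finishes quickly.

```python
import threading

DEFAULT_PIN_TIMEOUT = 30.0  # seconds; the default quoted from the WCFS documentation


def wait_pin_ack(ack, kill, timeout=DEFAULT_PIN_TIMEOUT):
    """Wait for a client's pin acknowledgment; kill the client on timeout.

    ack  -- threading.Event set when the (hypothetical) client acknowledges
    kill -- hypothetical callback that terminates the faulty client
    Returns True if the client acknowledged in time, False if it was killed.
    """
    if ack.wait(timeout):
        return True
    # No ack within the timeout: stop waiting so other clients are not blocked.
    kill()
    return False


killed = []

# Well-behaved client: acknowledges shortly after the notification.
good_ack = threading.Event()
threading.Timer(0.01, good_ack.set).start()
good_ok = wait_pin_ack(good_ack, lambda: killed.append("good"), timeout=1.0)

# Faulty client: never acknowledges, so the server gives up and kills it.
bad_ack = threading.Event()
bad_ok = wait_pin_ack(bad_ack, lambda: killed.append("bad"), timeout=0.05)
```

In this sketch only the client that never acknowledges is passed to `kill`; the well-behaved client is unaffected, which is the property the commit message describes for other clients working with the same file.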
New file: wcfs/wcfs_faultyprot_test.py (mode 0 → 100644)