    wcfs: tests: Extend faulty protection tests with more kinds of faulty clients · c91fb14e
    Kirill Smelkov authored
    So far we were testing only against a faulty client that reads the pin
    notification successfully but does not reply to it. But there could
    be more problems:
    
    1) a client does not read pin notification at all
    2) a client closes watchlink abruptly after reading pin notification
    3) a client replies to pin notification but the reply is not "ack"
    
    The first problem, if not handled, leads to the whole set of clients
    becoming stuck reading the same block as the faulty client. The other
    problems also indicate breakage of the isolation protocol on the client
    side, meaning that wcfs can no longer be sure that it provides good,
    uncorrupted data to the client.
    
    In the first case, similarly to the "no reply" situation, we need to kill
    the client to make progress while maintaining safety. In cases 2 and 3
    we cannot maintain safety if the faulty client remains in the set of
    live and served clients, so it is also logical to send SIGBUS/SIGKILL
    to it.
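
The four faulty-client outcomes discussed above (the original "no reply" case plus cases 1-3) can be summarised as a small decision function. This is only an illustrative sketch: `classify_pin_outcome` and its arguments are hypothetical names introduced here, not part of wcfs or its tests, and the real server detects these conditions through timeouts and watchlink I/O errors rather than through plain function arguments.

```python
def classify_pin_outcome(delivered, reply):
    """Decide what to do with a client after a pin notification was sent.

    Hypothetical model, not the wcfs implementation:
      delivered -- True if the client read the notification before the deadline
      reply     -- None if no reply arrived before the deadline,
                   ""   if the watchlink was closed (EOF) instead of a reply,
                   otherwise the text of the reply
    Returns "ok" for a well-behaved client, else "kill: <reason>".
    """
    if not delivered:
        return "kill: pin notification was not read"
    if reply is None:
        return "kill: no reply to pin notification"
    if reply == "":
        return "kill: watchlink closed after pin notification"
    if reply != "ack":
        return "kill: reply is not ack"
    return "ok"
```

In this model every outcome except a delivered notification followed by "ack" ends with the client being killed, which mirrors the reasoning above: either progress or safety cannot be guaranteed while such a client stays among the served ones.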
    
    Killing a client with SIGBUS is similar to how the OS kernel sends SIGBUS
    when a memory-mapped file is accessed and loading file data results in
    EIO. It is also similar to wendelin.core 1, where SIGBUS is raised if
    loading a file block results in an error.
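
For illustration, the effect of SIGBUS on a process that does not handle it can be observed from a parent process. The sketch below is not taken from wcfs_test.py; `kill_with_sigbus` and `demo` are hypothetical helpers, and the example assumes a POSIX system where the default action for SIGBUS is to terminate the process.

```python
import os
import signal
import subprocess
import sys


def kill_with_sigbus(proc):
    """Send SIGBUS to proc and return its exit status as seen by the parent."""
    os.kill(proc.pid, signal.SIGBUS)
    return proc.wait(timeout=10)


def demo():
    # The child stands in for a faulty client: it announces that it is
    # running and then sleeps without ever doing anything else.
    child = subprocess.Popen(
        [sys.executable, "-c",
         "import time; print('ready', flush=True); time.sleep(60)"],
        stdout=subprocess.PIPE,
    )
    child.stdout.readline()      # wait until the child is really running
    rc = kill_with_sigbus(child)
    child.stdout.close()
    return rc
```

On POSIX, `subprocess` reports termination by signal N as a return code of -N, so a test can check that `demo()` equals `-signal.SIGBUS` to confirm that the process was terminated by SIGBUS rather than exiting on its own.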
    
    Extend tests to cover all of the scenarios explained above.
    
    /reviewed-by @levin.zimmermann
    /reviewed-on nexedi/wendelin.core!18
wcfs_test.py 73.7 KB