    MDEV-20928 mtr test galera.galera_var_innodb_disallow_writes test failure · c9928cc0
    sjaakola authored
    The sporadic test hangs are caused by a mutex deadlock between InnoDB
    background threads and the two test connections.
    The test sets the variable innodb_disallow_writes, which blocks all writes
    to the filesystem. The test logic is to execute an INSERT, which should
    hang because filesystem writes are blocked, and to verify through another
    session, with a SELECT, that the INSERT is hanging. The SELECT session
    then releases the innodb_disallow_writes blocking.
    
    However, blocking filesystem writes also affects InnoDB background threads,
    and they may hang while holding other resources locked.
    For example, in one observed hang, buffer pool access was blocked.
    If the buffer pool is blocked, the test connections are blocked as well,
    and the SELECT session cannot proceed to release innodb_disallow_writes.
    
    The fix in this commit is a refactoring of the test logic.
    The test now first sets innodb_disallow_writes, and then records
    a hash of the data directory's filesystem contents. This serves as a
    checksum of the state of the data in the data directory.
    
    Then some SQL load is attempted on both nodes; these sessions will block
    because of the frozen filesystem state. The test then sleeps briefly to
    allow InnoDB background threads to loop and possibly hit the
    innodb_disallow_writes blocking as well.
    
    After the sleep, the test records the filesystem checksum a second time,
    and then releases the innodb_disallow_writes blocking.
    
    Finally, the two checksums are compared; they should be identical, which
    verifies that nothing was written to the data directory during the test.
    
    The checksum is computed by running md5sum over every file that the find
    command locates in the data directory. The per-file hashes are then hashed
    together by one more md5sum.
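
    The hash-of-hashes idea can be sketched in Python as follows. This is only
    an illustration, not the code used by the test, which relies on the find
    and md5sum commands. The function name datadir_checksum and the sorted
    traversal order are assumptions made for this sketch.

```python
import hashlib
import os


def datadir_checksum(datadir):
    """Return one MD5 hex digest summarizing every file under datadir.

    Hypothetical helper: mirrors "md5sum of each file, then md5sum of all
    the per-file hashes", but does not call find or md5sum.
    """
    per_file = []
    for root, dirs, files in os.walk(datadir):
        dirs.sort()  # make the traversal order deterministic
        for name in sorted(files):
            path = os.path.join(root, name)
            digest = hashlib.md5()
            with open(path, "rb") as fh:
                # Read in 1 MiB chunks so large tablespace files fit in memory.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
            rel = os.path.relpath(path, datadir)
            per_file.append(f"{digest.hexdigest()}  {rel}\n")
    # Hash the list of per-file hashes into a single value.
    combined = hashlib.md5(os.fsencode("".join(per_file)))
    return combined.hexdigest()
```

    Calling datadir_checksum() once after setting innodb_disallow_writes and
    once more before releasing it, and comparing the two results, corresponds
    to the check described above. Equal values mean that no file content
    under the data directory changed in between.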
    
    The test therefore depends on md5sum and find. find may behave differently
    on some OS distributions, e.g. FreeBSD may be problematic.
galera_var_innodb_disallow_writes.result 838 Bytes