    Propagate cancellation to spawned test jobs · 938b5455
    Kirill Smelkov authored
    A user might cancel a test result in the ERP5 UI if, e.g., some
    misbehaviour is detected and a new revision is ready to be tested. This
    works by test_result.start() returning None, indicating that there are
    no more test_result_lines to exercise. Master also indicates this
    cancellation via test_result.isAlive() returning False, but until now we
    were not using that information and were always waiting for completion
    of the current test job that was already spawned.
    
    This works well in practice if individual tests are not long, but e.g.
    for SlapOS.SoftwareReleases.IntegrationTest-* it is not good, because
    there an individual test might take _hours_ to execute.
    
    -> Fix it by first setting up a global context, to which we propagate
    cancellation from test_result.isAlive, and by using that context as the
    base for all other activities. This should terminate the spawned test
    process if test_result is canceled.
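
    The idea can be illustrated with a minimal, standard-library-only
    sketch. It is not the code of this patch: the patch works with a
    context, while the sketch below swaps in threading.Event as the
    cancellation signal. The names watch_alive, run_job, is_alive and poll
    are hypothetical and exist only for illustration.

```python
import subprocess
import sys
import threading


def watch_alive(is_alive, cancel, interval):
    """Call is_alive() every `interval` seconds; set `cancel` once it returns False."""
    # cancel.wait() returns True as soon as the event is set, which ends the loop.
    while not cancel.wait(interval):
        if not is_alive():
            cancel.set()


def run_job(argv, is_alive, interval=5 * 60, poll=0.1):
    """Run argv as a subprocess; terminate it early if is_alive() turns False.

    Returns (returncode, cancelled).
    """
    cancel = threading.Event()
    watcher = threading.Thread(target=watch_alive, args=(is_alive, cancel, interval))
    watcher.daemon = True
    watcher.start()
    proc = subprocess.Popen(argv)
    try:
        while proc.poll() is None:
            # Wake up every `poll` seconds to see whether the job has finished.
            if cancel.wait(poll):
                proc.terminate()   # cancellation reached the spawned job
                proc.wait()
                return proc.returncode, True
        return proc.returncode, False
    finally:
        cancel.set()               # also stops the watcher thread
        watcher.join()


if __name__ == "__main__":
    # Hypothetical usage: the "master" reports cancellation right away.
    long_job = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_job(long_job, lambda: False, interval=0.05))
```

    In this sketch the job is polled locally every `poll` seconds, while
    is_alive, standing in for the query to master, is called only once per
    `interval`.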
    
    The check interval is chosen to be 5 minutes so as not to overload
    master. @jerome says that
    
        We now have 341 active test nodes, but sometimes we are using
        more, we did in the past to stress test some new machines.
    
        For the developer, if we reduce the waiting time from a few hours to 1
        minutes or 5 minutes seems more or less equivalent.
    
    With 350 testnodes, and each nxdtest checking its test_result status
    via an isAlive query to master every 5 minutes, this results in ~1
    isAlive request/second to master on average.
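
    The estimate above can be rechecked with a few lines of arithmetic.
    The function name average_request_rate is hypothetical; the inputs are
    the 350 testnodes and the 5-minute interval mentioned above.

```python
def average_request_rate(n_testnodes, check_interval_s):
    """Average number of isAlive requests per second arriving at master."""
    return float(n_testnodes) / check_interval_s


rate = average_request_rate(350, 5 * 60)
print("%.2f requests/second" % rate)  # prints "1.17 requests/second"
```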
    
    Had to switch from time to golang.time in order to use time.after().
    Because of that, time() and sleep() are changed to time.now() and
    time.sleep() correspondingly.
    
    /helped-and-reviewed-by @jerome
    /reviewed-on nexedi/nxdtest!14