-
Jon Olav Hauglid authored
The problem was that a statement could cause an assert if it was aborted by KILL QUERY while it waited on a metadata lock. This assert checks that a statement either sends OK or an error to the client. If the bug was triggered on release builds, it caused OK to be sent to the client instead of ER_QUERY_INTERRUPTED. The root cause of the problem was that there are two separate ways to tell if a statement is killed: thd->killed and mysys_var->abort. KILL QUERY causes both to be set, thd->killed before mysys_var->abort. Also, both values are reset at the end of statement execution. This means that it is possible for KILL QUERY to first set thd->killed, then have the killed statement reset both thd->killed and mysys_var->abort and finally have KILL QUERY set mysys_var->abort. This means that the connection with the killed statement will start executing the next statement with the two values out of sync - i.e. thd->killed not set but mysys_var->abort set. Since mysys_var->abort is used to check if a wait for a metadata lock should be aborted, the next statement would immediately abort any such waiting. When waiting is aborted, no OK message is sent and thd->killed is checked to see if ER_QUERY_INTERRUPTED should be sent to the client. But since the->killed had been reset, neither OK nor an error message was sent to the client. This then triggered the assert. This patch fixes the problem by changing the metadata lock waiting code to check thd->killed. No test case added as reproducing the assert is dependent on very exact timing of two (or more) threads. The patch has been checked using RQG and the grammar posted on the bug report.
13109514