MDEV-20010 Equal on two RANK window functions create wrong result
The problematic query outlined a bug in window functions sorting optimization. When multiple window functions are present in a query, we sort the sorting key (as defined by PARTITION BY and ORDER BY) from generic to specific. SELECT RANK() OVER (ORDER BY const_col) as r1, RANK() OVER (ORDER BY const_col, a) as r2, RANK() OVER (PARTITION BY c) as r3, RANK() OVER (PARTITION BY c ORDER BY b) as r4 FROM table; For these functions, the sorting we need to do for window function computations are: [(const_col), (const_col, a)] and [(c), (c, b)]. Instead of doing 4 different sort order, the sorts grouped within [] are compatible and we can use the most *specific* sort to cover both window functions. The bug was caused by an incorrect flagging of which sort is most specific for a compatible group of functions. In our specific test case, instead of picking (const_col, a) as the most specific sort, it would only sort by (const_col), which lead to wrong results for rank function. By ensuring that we pick the last sort key before an "incompatible sort" flag is met in our "ordered array of sorting specifications", we guarantee correct results.
Showing
Please register or sign in to comment