• Alexander Barkov's avatar
    MDEV-10306 Wrong results with combination of CONCAT, SUBSTR and CONVERT in subquery · 3a37afec
    Alexander Barkov authored
    The bug happens because of a combination of unfortunate circumstances:
    
    1. Arguments args[0] and args[2] of Item_func_concat point recursively
    (through Item_direct_view_ref's) to the same Item_func_conv_charset.
    Both args[0]->args[0]->ref[0] and args[2]->args[0]->ref[0] refer to
    this Item_func_conv_charset.
    
    2. When Item_func_concat::args[0]->val_str() is called,
    Item_func_conv_charset::val_str() writes its result to
    Item_func_conc_charset::tmp_value.
    
    3. Then, for optimization purposes (to avoid copying),
    Item_func_substr::val_str() initializes Item_func_substr::tmp_value
    to point to the buffer fragment owned by Item_func_conv_charset::tmp_value
    Item_func_substr::tmp_value is returned as a result of
    Item_func_concat::args[0]->val_str().
    
    4. Due to optimization to avoid memory reallocs,
    Item_func_concat::val_str() remembers the result of args[0]->val_str()
    in "res" and further uses "res" to collect the return value.
    
    5. When Item_func_concat::args[2]->val_str() is called,
    Item_func_conv_charset::tmp_value gets overwritten (see #1),
    which effectively overwrites args[0]'s Item_func_substr::tmp_value (see #3),
    which effectively overwrites "res" (see #4).
    
    This patch does the following:
    
    a. Changes Item_func_conv_charset::val_str(String *str) to use
       tmp_value and str the other way around. After this change tmp_value
       is used to store a temporary result, while str is used to return the value.
       The fixes the second problem (without SUBSTR):
         SELECT CONCAT(t2,'-',t2) c2
           FROM (SELECT CONVERT(t USING latin1) t2 FROM t1) sub;
       As Item_func_concat::val_str() supplies two different buffers when calling
       args[0]->val_str() and args[2]->val_str(), in the new reduction the result
       created during args[0]->val_str() does not get overwritten by
       args[2]->val_str().
    
    b. Fixing the same problem in val_str() for similar classes
    
       Item_func_to_base64
       Item_func_from_base64
       Item_func_weight_string
       Item_func_hex
       Item_func_unhex
       Item_func_quote
       Item_func_compress
       Item_func_uncompress
       Item_func_des_encrypt
       Item_func_des_decrypt
       Item_func_conv_charset
       Item_func_reverse
       Item_func_soundex
       Item_func_aes_encrypt
       Item_func_aes_decrypt
       Item_func_buffer
    
    c. Fixing Item_func::val_str_from_val_str_ascii() the same way.
       Now Item_str_ascii_func::ascii_buff is used for temporary value,
       while the parameter passed to val_str() is used to return the result.
       This fixes the same problem when conversion (from ASCII to e.g. UCS2)
       takes place. See the ctype_ucs.test for example queries that returned
       wrong results before the fix.
    
    d. Some Item_func descendand classes had temporary String buffers
       (tmp_value and tmp_str), but did not really use them.
       Removing these temporary buffers from:
    
       Item_func_decode_histogram
       Item_func_format
       Item_func_binlog_gtid_pos
       Item_func_spatial_collection:
    
    e. Removing Item_func_buffer::tmp_value, because it's not used any more.
    
    f. Renaming Item_func_[un]compress::buffer to "tmp_value",
       for consistency with other classes.
    
    Note, this patch does not fix the following classes
    (although they have a similar problem):
    
       Item_str_conv
       Item_func_make_set
       Item_char_typecast
    
    They have a complex implementations and simple swapping between "tmp_value"
    and "str" won't work. These classes will be fixed separately.
    3a37afec
item_geofunc.cc 41.5 KB