    MDEV-32113: utf8mb3_key_col=utf8mb4_value cannot be used for ref · 4941ac91
    Sergei Petrunia authored
    (Variant#3: Allow cross-charset comparisons, use a special
    CHARSET_INFO to create lookup keys. Review input addressed.)
    
    Equalities that compare utf8mb{3,4}_general_ci strings, like:
    
      WHERE ... utf8mb3_key_col=utf8mb4_value    (MB3-4-CMP)
    
    can now be used to construct ref[const] access and also participate
    in multiple-equalities.
    This means that utf8mb3_key_col can be used for key lookups when
    compared with a utf8mb4 constant, field or expression using the '=' or
    '<=>' comparison operators.
    
    This is controlled by optimizer_switch='cset_narrowing=on', which is
    OFF by default.
    
    IMPLEMENTATION
    Item value comparison in (MB3-4-CMP) is done using utf8mb4_general_ci.
    This is valid because any utf8mb3 value is also a utf8mb4 value.
    
    When constructing the index lookup value for utf8mb3_key_col, we do
    "Charset Narrowing": characters that are in the Basic Multilingual Plane
    (BMP) are copied as-is, as they can be represented in utf8mb3. Characters
    that are outside the BMP cannot be represented in utf8mb3 and are replaced
    with U+FFFD, the "Replacement Character".
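    
    The narrowing rule above can be sketched as follows. This is an
    illustration only, not the server code: the real implementation works on
    byte strings through a special CHARSET_INFO, and the names narrow_to_bmp
    and REPLACEMENT_CHAR are hypothetical.

```python
# Illustrative sketch of "Charset Narrowing" on Python strings.
# narrow_to_bmp and REPLACEMENT_CHAR are hypothetical names, not server identifiers.

REPLACEMENT_CHAR = "\uFFFD"  # U+FFFD, the Replacement Character


def narrow_to_bmp(value: str) -> str:
    """Copy BMP characters as-is; replace anything above U+FFFF with U+FFFD."""
    return "".join(
        ch if ord(ch) <= 0xFFFF else REPLACEMENT_CHAR
        for ch in value
    )


if __name__ == "__main__":
    # U+1F600 lies outside the BMP, so it cannot be stored in utf8mb3.
    print(ascii(narrow_to_bmp("a\U0001F600b")))
```

    A value that contains only BMP characters passes through unchanged, so
    the lookup key for such a value is the value itself.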
    
    In utf8mb4_general_ci, the Replacement Character compares as equal to any
    character that is not in the BMP. Because of this, the constructed lookup
    value will find all index records that would be considered equal by the
    original condition (MB3-4-CMP).
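    
    The argument can be checked on a toy model. The sketch below assumes a
    heavily simplified weight function (ASCII case folding only, every
    non-BMP character weighted as U+FFFD); it is not the real
    utf8mb4_general_ci weight table, and all names in it are hypothetical.

```python
# Toy model: a narrowed lookup key matches the same rows as the original condition.
# toy_weight is a simplified stand-in for utf8mb4_general_ci weights (assumption).

REPLACEMENT_WEIGHT = 0xFFFD


def toy_weight(ch: str) -> int:
    cp = ord(ch)
    if cp > 0xFFFF:
        # Non-BMP characters all share the weight of U+FFFD.
        return REPLACEMENT_WEIGHT
    if ch.isascii():
        # Case-insensitive comparison, modelled for ASCII only.
        return ord(ch.upper())
    return cp


def toy_equal(a: str, b: str) -> bool:
    return [toy_weight(c) for c in a] == [toy_weight(c) for c in b]


def narrow_to_bmp(value: str) -> str:
    return "".join(c if ord(c) <= 0xFFFF else "\uFFFD" for c in value)


# Values a utf8mb3 key column could hold (BMP characters only).
index_rows = ["abc", "a\uFFFD", "A\uFFFD", "a?"]

value = "a\U0001F600"            # utf8mb4 value with a non-BMP character
lookup_key = narrow_to_bmp(value)

by_condition = [r for r in index_rows if toy_equal(r, value)]
by_lookup = [r for r in index_rows if toy_equal(r, lookup_key)]
```

    In this model both lists contain the same rows, namely the two rows
    whose second character is U+FFFD.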
    
    Approved-by: Monty <monty@mariadb.org>