    Bug#57737 Character sets: search fails with like, contraction, index · e3dee8a7
    Alexander Barkov authored
    Problem: LIKE on an indexed column wrongly excluded matching rows,
    because my_like_range_utf32/utf16 returned wrong ranges for contractions.
    The contraction-related code existed in my_like_range_ucs2/utf8
    but was missing from the utf32/utf16 versions
    (it was forgotten during the mysql-6.0 push/revert mess).
    
    Fix:
    The patch removes individual functions my_like_range_ucs2,
    my_like_range_utf16, my_like_range_utf32 and introduces a single function
    my_like_range_generic() instead. The new function handles contractions
    correctly. It can handle any character set with cs->min_sort_char and
    cs->max_sort_char represented in Unicode code points.
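
To make the range idea concrete, here is a minimal sketch, not the actual my_like_range_generic() code. It works on arrays of code points rather than encoded bytes. The names like_range_sketch, fill_tail, codepoint_t, contraction_head_fn and toy_head_is_c are hypothetical, and padding with a space after a wildcard-free pattern is an assumption. The contraction branch shows one conservative approach: if a literal that can start a contraction (such as "c" in Czech, where "ch" sorts after "h") is directly followed by a wildcard, the range is widened from that position instead of trusting the literal.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long codepoint_t;            /* hypothetical code point type */
typedef int (*contraction_head_fn)(codepoint_t wc); /* hypothetical predicate */

/* Toy predicate for illustration only: treats 'c' as a contraction head. */
int toy_head_is_c(codepoint_t wc) { return wc == 'c'; }

/* Fill positions [from, res_len) of both keys with the given characters. */
static void fill_tail(codepoint_t *min_out, codepoint_t *max_out,
                      size_t from, size_t res_len,
                      codepoint_t min_c, codepoint_t max_c)
{
  for (size_t i = from; i < res_len; i++)
  {
    min_out[i] = min_c;
    max_out[i] = max_c;
  }
}

/* Build the lowest and highest keys that a LIKE pattern can match. */
void like_range_sketch(const codepoint_t *pat, size_t pat_len,
                       codepoint_t escape, codepoint_t w_one, codepoint_t w_many,
                       codepoint_t min_sort_char, codepoint_t max_sort_char,
                       contraction_head_fn is_head,   /* may be NULL */
                       codepoint_t *min_out, codepoint_t *max_out,
                       size_t res_len)
{
  size_t out = 0;
  assert(min_out != NULL && max_out != NULL);

  for (size_t i = 0; i < pat_len && out < res_len; i++)
  {
    codepoint_t wc = pat[i];
    int escaped = 0;

    if (wc == escape && i + 1 < pat_len)
    {
      wc = pat[++i];                 /* escaped character is a literal */
      escaped = 1;
    }
    if (!escaped && wc == w_many)
    {
      /* '%': everything from here on can be anything */
      fill_tail(min_out, max_out, out, res_len, min_sort_char, max_sort_char);
      return;
    }
    if (!escaped && wc == w_one)
    {
      /* '_': exactly one arbitrary character */
      min_out[out] = min_sort_char;
      max_out[out] = max_sort_char;
      out++;
      continue;
    }
    if (is_head != NULL && is_head(wc) && i + 1 < pat_len &&
        (pat[i + 1] == w_one || pat[i + 1] == w_many))
    {
      /* Contraction head followed by a wildcard: the literal alone does
         not bound the sort order, so widen the range from this position. */
      fill_tail(min_out, max_out, out, res_len, min_sort_char, max_sort_char);
      return;
    }
    min_out[out] = max_out[out] = wc;  /* plain literal */
    out++;
  }
  /* Assumption: pad both keys with spaces when no '%' was seen. */
  fill_tail(min_out, max_out, out, res_len, ' ', ' ');
}
```

With the toy predicate, the pattern "ac%" yields a range that is fixed only at "a", so a row starting with "ach" is not excluded by the index range even though "ch" sorts outside the interval implied by "c".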
    
    added:
      @ mysql-test/include/ctype_czech.inc
      @ mysql-test/include/ctype_like_ignorable.inc
      @ mysql-test/r/ctype_like_range.result
      @ mysql-test/t/ctype_like_range.test
      - Adding tests.
    
    
    modified:
    
      @ include/m_ctype.h
      - Adding helper functions for contractions.
      - Prototypes: removing ucs2,utf16,utf32 functions, adding generic function.
      @ mysql-test/r/ctype_uca.result
      @ mysql-test/r/ctype_utf16_uca.result
      @ mysql-test/r/ctype_utf32_uca.result
      @ mysql-test/t/ctype_uca.test
      @ mysql-test/t/ctype_utf16_uca.test
      @ mysql-test/t/ctype_utf32_uca.test
      - Adding tests.
    
      @ strings/ctype-mb.c
      - The pad function did not write the last character.
      - Implementing my_like_range_generic(), a universal replacement
        for three separate functions
        my_like_range_ucs2(), my_like_range_utf16() and my_like_range_utf32(),
        with correct contraction handling.
    
      @ strings/ctype-ucs2.c
      - my_fill_mb2 did not write the high byte, because previously
        it was only used to write characters in the ASCII range.
        Now it writes the high byte as well
        (needed to populate cs->max_sort_char correctly).
      - Adding DBUG_ASSERT()
      - Removing character set specific functions:
        my_like_range_ucs2(), my_like_range_utf16() and my_like_range_utf32().
      - Using my_like_range_generic() instead of the old functions.
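
The my_fill_mb2 change above can be illustrated with a minimal sketch of filling a buffer with a big-endian two-byte character, high byte included. This is not the actual my_fill_mb2 code: the name fill_mb2_sketch, its signature, and zeroing an odd trailing byte are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Fill buf with repeated copies of a 2-byte big-endian character.
   Hypothetical helper; the real function's signature may differ. */
void fill_mb2_sketch(unsigned char *buf, size_t len, unsigned fill)
{
  size_t i;
  assert(buf != NULL);
  assert(fill <= 0xFFFF);

  for (i = 0; i + 1 < len; i += 2)
  {
    buf[i]     = (unsigned char) (fill >> 8);    /* high byte, formerly skipped */
    buf[i + 1] = (unsigned char) (fill & 0xFF);  /* low byte */
  }
  if (i < len)
    buf[i] = 0;  /* assumption: zero an odd trailing byte */
}
```

Filling with a non-ASCII value such as 0xFFFF only produces the intended bytes when the high byte is written, which is why a maximum sort character outside the ASCII range depends on this fix.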
    
      @ strings/ctype-uca.c
      - Using generic function instead of the old character set specific ones.
    
      @ sql/item_create.cc
      @ sql/item_strfunc.cc
      @ sql/item_strfunc.h
      - Adding SQL functions LIKE_RANGE_MIN and LIKE_RANGE_MAX,
        available only in debug builds, to verify that like_range()
        works correctly for all character sets and collations.
ctype-uca.c 534 KB