    BUG#19580 - FULLTEXT search produces wrong results on UTF-8 columns · 528e85a4
    unknown authored
    The problem was that MySQL lacked a true ctype implementation. As a
    result, many multibyte punctuation and whitespace characters were
    treated as word characters.
    
    This fix uses the recently added CTYPE table for Unicode character
    sets (WL1386) to detect Unicode punctuation and whitespace
    characters correctly.
    
    Note: this is an incompatible change, since it changes parser
    behavior. Existing fulltext indexes must be rebuilt with the
    REPAIR TABLE statement.
    
    
    mysql-test/r/fulltext2.result:
      Testcase for BUG#19580.
    mysql-test/t/fulltext2.test:
      Testcase for BUG#19580.
    storage/myisam/ft_parser.c:
      Use WL1386 "CTYPE table for unicode character sets" functionality.
    storage/myisam/ft_update.c:
      Use WL1386 "CTYPE table for unicode character sets" functionality.
      
      Revert the fix for BUG#16489 "utf8 + fulltext leads to corrupt
      index file.". It is no longer needed now that we have a true
      ctype implementation.
    storage/myisam/ftdefs.h:
      Use WL1386 "CTYPE table for unicode character sets" functionality.
      
      Rework the true_word_char macro so that it accepts a ctype
      instead of a charset as its first parameter. It no longer uses
      my_isalnum; it checks the ctype directly.
      Remove the obsolete word_char macro.
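
The ctype-based word test described above can be sketched roughly as follows. This is a minimal illustration, not the code from ftdefs.h: the SK_* flag values, sk_ctype() and sk_count_words() are hypothetical names invented for the sketch, and sk_ctype() is a toy stand-in for a real Unicode CTYPE table lookup that covers only a handful of code points. The special case for '_' is likewise an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ctype bits for this sketch (not the server's definitions). */
#define SK_UPPER 0x01 /* uppercase letter */
#define SK_LOWER 0x02 /* lowercase letter */
#define SK_DIGIT 0x04 /* numeric digit */
#define SK_SPACE 0x08 /* whitespace */
#define SK_PUNCT 0x10 /* punctuation */

/* Takes a ctype, not a charset: a character belongs to a word if its
   ctype marks it as a letter or digit, or if it is an underscore
   (the underscore rule is an assumption of this sketch). */
#define sk_true_word_char(ctype, ch) \
  ((((ctype) & (SK_UPPER | SK_LOWER | SK_DIGIT)) != 0) || ((ch) == '_'))

/* Toy stand-in for a Unicode CTYPE table lookup. */
static int sk_ctype(unsigned long cp)
{
  if (cp >= 'A' && cp <= 'Z') return SK_UPPER;
  if (cp >= 'a' && cp <= 'z') return SK_LOWER;
  if (cp >= '0' && cp <= '9') return SK_DIGIT;
  if (cp == ' ' || cp == 0x3000UL) return SK_SPACE; /* U+3000 IDEOGRAPHIC SPACE */
  if (cp == ',' || cp == 0x3001UL) return SK_PUNCT; /* U+3001 IDEOGRAPHIC COMMA */
  return 0; /* unclassified in this toy table */
}

/* Counts words in an array of code points by tracking transitions
   from non-word characters to word characters. */
static size_t sk_count_words(const unsigned long *cps, size_t n)
{
  size_t i, words = 0;
  int in_word = 0;

  assert(cps != NULL || n == 0);
  for (i = 0; i < n; i++)
  {
    int is_word = sk_true_word_char(sk_ctype(cps[i]), cps[i]);
    if (is_word && !in_word)
      words++;
    in_word = is_word;
  }
  return words;
}
```

With a table like this, multibyte separators such as U+3000 and U+3001 end a word just as an ASCII space or comma does, which is the behavior the fix is after.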