• Sujatha's avatar
    MDEV-19925: Column ... cannot be converted from type 'varchar(20)' to type 'varchar(20)' · e7b71e0d
    Sujatha authored
    Cherry picking:
    Bug#25135304: RBR: WRONG FIELD LENGTH IN ERROR MESSAGE
    commit 47bd3f7cf3c8518f62b1580ec65af2ba7ac13b95
    
    Description:
    ============
    In row based replication, when replicating from a table with a field with
    character set set to UTF8mb3 to the same table with the same field set to
    character set UTF8mb4 I get a confusing error message:
    
    For VARCHAR: VARCHAR(1) 'utf8mb3' to VARCHAR(1) 'utf8mb4'
    "Column 0 of table 'test.t1' cannot be converted from type 'varchar(3)' to
    type 'varchar(1)'"
    
    Similar issue with CHAR type as well.
    
    Issue with respect to BLOB types:
    
    For BLOB: LONGBLOB to TINYBLOB - Error message displays incorrect blob type.
    "Column 0 of table 'test.t1' cannot be converted from type 'tinyblob' to type
    'tinyblob'"
    
    For BINARY to BINARY - Error message displays incorrect type for master side
    field.
    "Column 0 of table 'test.t' cannot be converted from type 'char(1)' to type
    'binary(10)'"
    Similar issue exists for VARBINARY type. It is displayed as 'VARCHAR'.
    
    Analysis:
    =========
    In Row based replication charset information is not sent as part of metadata
    from master to slave.
    
    For VARCHAR field its character length is converted into equivalent
    octets/bytes and stored internally. At the time of displaying the data to user
    it is converted back to original character length.
    
    For example:
    VARCHAR(2)- utf8mb3 is stored as:2*3 = VARCHAR(6)
    At the time of displaying it to user
    VARCHAR(6)- charset utf8mb3:6/3= VARCHAR(2).
    
    At present the internally converted octect length is sent from master to slave
    with out providing the charset information. On slave side if the type
    conversion fails 'show_sql_type' function is used to get the type specific
    information from metadata. Since there is no charset information is available
    the filed type is displayed as VARCHAR(6).
    
    This results in confused error message.
    
    For CHAR fields
    CHAR(1)- utf8mb3 - CHAR(3)
    CHAR(1)- utf8mb4 - CHAR(4)
    
    'show_sql_type' function which retrieves type information from metadata uses
    (bytes/local charset length) to get actual character length. If slave's chaset
    is 'utf8mb4' then
    
    CHAR(3/4)-->CHAR(0)
    CHAR(4/4)-->CHAR(1).
    
    This results in confused error message.
    
    Analysis for BLOB type issue:
    
    BLOB's length is represented in two forms.
    1. Actual length
    i.e
      (length < 256) type= MYSQL_TYPE_TINY_BLOB;
      (length < 65536) type= MYSQL_TYPE_BLOB; ...
    
    2. packlength - The number of bytes used to represent the length of the blob
      1- tinyblob
      2- blob ...
    
    In row based replication only the packlength is written in the binary log. On
    the slave side this packlength is interpreted as actual length of the blob.
    Hence the length is always < 256 and the type is displayed as tiny blob.
    
    Analysis for BINARY to BINARY type issue:
    The character set information is needed to identify a filed's type as char or
    binary. Since master side character set information is not available on the
    slave side both binary and char fields are displayed as char.
    
    Fix:
    ===
    For CHAR and VARCHAR fields display their length in octets for both source and
    target fields. For target field display the charset information if it is
    relevant.
    
    For blob type changed the code to use the packlength and display appropriate
    blob type in error message.
    
    For binary and varbinary fields use the slave side character set as reference
    to map them to binary or varbinary fields.
    e7b71e0d
field.h 146 KB