- 13 Apr, 2021 1 commit
-
-
sjaakola authored
After the merging of MDEV-24915, 10.6 branch has regressions with handling of concurrent write load against two or more cluster nodes. These regressions may surface as cluster hanging, node crashes or data inconsistency. With some test scenarios, the only visible symptom could be that the BF victim aborting happens only by innodb lock wait timeout expiration. This would result only to poor performance (by default 50 sec hang for each BF conflict), and could be somewhat difficult to diagnose. This pull request has following fixes to handle concurrent write load from multiple nodes: In lock_wait_wsrep_kill(), the victim trx was expected to be only in TRX_STATE_ACTIVE state. With the delayed BF conflict handling, it can happen that victim has advanced into pre commit state. This was fixed by choosing victim both in TRX_STATE_ACTIVE and TRX_STATE_PREPARED states. Victim transaction may be in several different states at the time of detected lock conflict, and due to delayed BF aborting practice in MDEV-24915, the victim may advance further before the actual BF aborting takes place. The BF aborting in MDEV-24915 did not wake the victim, if it was in the state of waiting for some other lock (than the one that was blocking the high priority thread). This anomaly caused the innodb lock wait timeout expiration delays and poor performance symptom. To fix this, lock_wait_wsrep_kill() now looks if victim is in lock waiting state, and uses lock_cancel_waiting_and_release() to cancel this lock wait. wsrep_bf_abort() checks if the victim has active transaction (in wsrep-lib), and starts a new transaction if there was no active transaction before. Due to late BF aborting, the victim may have e.g. failed in certification and is already aborting or has aborted at this stage. This has caused problems in testing where BF aborter tries to BF abort himself. The fix in wsrep_bf_abort() now skips the BF abort, if victim is aborting or has aborted. Victim may not have started transaction yet in wsrep context, but it may have acquired MDL locks (due to DDL execution), and this has caused BF conflict. Such case does not require aborting in wsrep or replication provider state. BF aborting could cause BF-BF conflict scenario, if victim was already aborted and changed to replayer having high priority as well. This BF-BF conflict scenario is now avoided in lock_wait_wsrep() where we now check if blocking lock holder is also high priority and is ordered before, caller should wait for the lock in this situation. The natural innodb deadlock resolving algorithm could pick BF thread as deadlock victim. This is fixed by giving max weigh to BF threads in Deadlock::report(). MDEV-24341 has changed excution paths in do_command() and this affects BF aborted victim execution. This PR fixes one assert in do_command(): DBUG_ASSERT(!thd->async_state.pending_ops()) Which fired if the thd was BF aborted earlier. This assert is now changed to allow pending_ops() if thd was BF aborted before. With these fixes, long term highly conflicting write load could be run against to node cluster. If binlogging is configured, log_slave_updates should be also set.
-
- 11 Apr, 2021 1 commit
-
-
Otto Kekäläinen authored
Touch attribute file to fix errors like: Can't open ./demoCA/index.txt.attr for reading, No such file or directory 140553384993216:error:02001002:system library: fopen:No such file or directory:../crypto/bio/bss_file.c:72: fopen('./demoCA/index.txt.attr','r') 140553384993216:error:2006D080:BIO routines: BIO_new_file:no such file:../crypto/bio/bss_file.c:79: Check that the request matches the signature
-
- 10 Apr, 2021 1 commit
-
-
Daniel Black authored
This reverts commit 51630d59. Monty has more comprehensive fix coming in his branch.
-
- 09 Apr, 2021 2 commits
-
-
Daniel Black authored
Move skip_locked near other bools
-
Marko Mäkelä authored
In commit 8ea923f5 (MDEV-24818) when we optimized multi-statement INSERT transactions into empty tables, we would roll back the entire transaction on any error. But, we would fail to invalidate any SAVEPOINT that had been requested in the past. trx_t::savepoints_discard(): Renamed from trx_roll_savepoints_free(). row_mysql_handle_errors(): If we were in bulk insert, invoke trx_t::savepoints_discard(). In this way, a future attempt of ROLLBACK TO SAVEPOINT will return an error.
-
- 08 Apr, 2021 15 commits
-
-
Daniel Black authored
This fixes the MariaDB-client RPM to conflict with < MariaDB-server-10.6 as a number of files where moved from the server to client component. see commits 09202b2e -> fa0ad2fb
-
Marko Mäkelä authored
In commit 8ea923f5 (MDEV-24818) when we optimized multi-statement INSERT into an empty table, we would sometimes wrongly enable bulk insert into a table that is actually already using row-level locking and undo logging. trx_has_lock_x(): New predicate, to check if the transaction of the current thread is holding an exclusive lock on a table. trx_undo_report_row_operation(): Only invoke trx_mod_table_time_t::start_bulk_insert() if trx_has_lock_x() holds.
-
Sujatha authored
Step 3: ====== Preserve worker pool information on either STOP SLAVE/Error. In case STOP SLAVE is executed worker threads will be gone, hence worker threads will be unavailable. Querying the table at this stage will give empty rows. To address this case when worker threads are about to stop, due to an error or forced stop, create a backup pool and preserve the data which is relevant to populate performance schema table. Clear the backup pool upon slave start.
-
Sujatha authored
Step2: ===== Add two extra columns mentioned below. --------------------------------------------------------------------------- |Column Name: | Description: | |-------------------------------------------------------------------------| | | | |WORKER_IDLE_TIME | Total idle time in seconds that the worker | | | thread has spent waiting for work from | | | co-ordinator thread | | | | |LAST_TRANS_RETRY_COUNT | Total number of retries attempted by last | | | transaction | ---------------------------------------------------------------------------
-
Sujatha authored
Step1: ===== Backport 'replication_applier_status_by_worker' from upstream. Iterate through rpl_parallel_thread_pool and display slave worker thread specific information as part of 'replication_applier_status_by_worker' table. --------------------------------------------------------------------------- |Column Name: | Description: | |-------------------------------------------------------------------------| | | | |CHANNEL_NAME | Name of replication channel through which the | | | transaction is received. | | | | |THREAD_ID | Thread_Id as displayed in 'performance_schema. | | | threads' table for thread with name | | | 'thread/sql/rpl_parallel_thread' | | | | | | THREAD_ID will be NULL when worker threads are | | | stopped due to an error/force stop | | | | |SERVICE_STATE | Thread is running or not | | | | |LAST_SEEN_TRANSACTION | Last GTID executed by worker | | | | |LAST_ERROR_NUMBER | Last Error that occured on a particular worker | | | | |LAST_ERROR_MESSAGE | Last error specific message | | | | |LAST_ERROR_TIMESTAMP | Time stamp of last error | | | | --------------------------------------------------------------------------- CHANNEL_NAME will be empty when the worker has not processed any transaction. Channel_name points to valid source channel_name when it is processing a transaction/event group.
-
Marko Mäkelä authored
In commit e71e6133 (MDEV-24671), lock_sys.wait_mutex was moved above lock_sys.mutex (which was later replaced with lock_sys.latch) in the latching order. In commit 7cf4419f (MDEV-24789), a potential hang was introduced to Galera. The function lock_wait() would hold lock_sys.wait_mutex while invoking wsrep_is_BF_lock_timeout(), which in turn could acquire LockMutexGuard for some diagnostic printout. wsrep_is_BF_lock_timeout(): Do not invoke trx_print_latched() or LockMutexGuard. lock_sys_t::wr_lock(), lock_sys_t::rd_lock(): Assert that the current thread is not holding lock_sys.wait_mutex. Unfortunately, RW-locks are not covered by SAFE_MUTEX. Reviewed by: Jan Lindström
-
Marko Mäkelä authored
-
Daniel Black authored
-
Martin Hansson authored
Imported the following tests from Oracle MySQL: * mysql-test/t/locking_clause.test * mysql-test/suite/innodb/t/skip_locked_nowait.test * mysql-test/t/locking_part.test Ported to MariaDB by Daniel Black, changes include: * Removed 'FOR SHARE OF /FOR UPDATE OF' tests that are part of MDEV-17514 (not yet implemented) * mysql-test/t/locking_clause.test removes broken the limit test (outstanding MDEV-25244) bug. * mysql-test/suite/innodb/t/skip_locked_nowait.test - removed high priority transactions * mysql-test/t/locking_part.test imported to mysql-test/suite/innodb/t/partition_locking.test changed ER_LOCK_NOWAIT -> ER_LOCK_WAIT_TIMEOUT consistent with already implemented WAIT/NOWAIT feature. Changed "FOR SHARE SKIP LOCKED" -> "LOCK IN SHARE MODE SKIP LOCKED".
-
Daniel Black authored
Adds an implementation for SELECT ... FOR UPDATE SKIP LOCKED / SELECT ... LOCK IN SHARED MODE SKIP LOCKED This is implemented only InnoDB at the moment, not in RockDB yet. This adds a new hander flag HA_CAN_SKIP_LOCKED than will be used when the storage engine advertises the flag. When a storage engine indicates this flag it will get TL_WRITE_SKIP_LOCKED and TL_READ_SKIP_LOCKED transaction types. The Lex structure has been updated to store both the FOR UPDATE/LOCK IN SHARE as well as the SKIP LOCKED so the SHOW CREATE VIEW implementation is simplier. "SELECT FOR UPDATE ... SKIP LOCKED" combined with CREATE TABLE AS or INSERT.. SELECT on the result set is not safe for STATEMENT based replication. MIXED replication will replicate this as row based events." Thanks to guidance from Facebook commit https://github.com/facebook/mysql-5.6/commit/193896c466d43fd905a62a60f1d73fd9c551a6e4 This helped verify basic test case, and components that need implementing (even though every part was implemented differently). Thanks Marko for guidance on simplier InnoDB implementation. Reviewers: Marko, Monty
-
Daniel Black authored
Use < TL_FIRST_WRITE for determining a READ transaction. Use TL_FIRST_WRITE as the relative operator replacing TL_WRITE_ALLOW_WRITE as the minimium WRITE lock type.
-
Marko Mäkelä authored
-
Marko Mäkelä authored
-
Marko Mäkelä authored
-
Marko Mäkelä authored
-
- 07 Apr, 2021 6 commits
-
-
Monty authored
MDEV-25334 FTWRL/Backup blocks DDL on temporary tables with binlog enabled, assertion fails in Diagnostics_area::set_error_status Fixed by adding a MDL_BACKUP_COMMIT lock before altering temporary tables whose creation was logged to binary log (in which case the ALTER TABLE must also be logged)
-
Marko Mäkelä authored
A consistency check for fil_space_t::name is causing recovery failures in MDEV-25180 (Atomic ALTER TABLE). So, we'd better remove that field altogether. fil_space_t::name was more or less a copy of dict_table_t::name (except for some special cases), and it was not being used for anything useful. There used to be a name_hash, but it had been removed already in commit a75dbfd7 (MDEV-12266). We will also remove os_normalize_path(), OS_PATH_SEPARATOR, OS_PATH_SEPATOR_ALT. On Microsoft Windows, we will treat \ and / roughly in the same way. The intention is that for per-table tablespaces, the filenames will always follow the pattern prefix/databasename/tablename.ibd. (Any \ in the prefix must not be converted.) ut_basename_noext(): Remove (unused function). read_link_file(): Replaces RemoteDatafile::read_link_file(). We will ensure that the last two path component separators are forward slashes (converting up to 2 trailing backslashes on Microsoft Windows), so that everywhere else we can assume that data file names end in "/databasename/tablename.ibd". Note: On Microsoft Windows, path names that start with \\?\ must not contain / as path component separators. Previously, such paths did work in the DATA DIRECTORY argument of InnoDB tables. Reviewed by: Vladislav Vaintroub
-
Alexander Barkov authored
Problem: The problem happened because of a conceptual flaw in the server code: a. The table level CHARSET/COLLATE clause affected all data types, including numeric and temporal ones: CREATE TABLE t1 (a INT) CHARACTER SET utf8 [COLLATE utf8_general_ci]; In the above example, the Column_definition_attributes (and then the FRM record) for the column "a" erroneously inherited "utf8" as its character set. b. The "ALTER TABLE t1 CONVERT TO CHARACTER SET csname" statement also erroneously affected Column_definition_attributes::charset for numeric and temporal data types and wrote "csname" as their character set into FRM files. So now we have arbitrary non-relevant charset ID values for numeric and temporal data types in all FRM files in the world :) The code in the server and the other engines did not seem to be affected by this flaw. Only InnoDB inplace ALTER was affected. Solution: Fixing the code in the way that only character string data types (CHAR,VARCHAR,TEXT,ENUM,SET): - inherit the table level CHARSET/COLLATE clause - get the charset value according to "CONVERT TO CHARACTER SET csname". Numeric and temporal data types now always get &my_charset_numeric in Column_definition_attributes::charset and always write its ID into FRM files: - no matter what the table level CHARSET/COLLATE clause is, and - no matter what "CONVERT TO CHARACTER SET" says. Details: 1. Adding helper classes to pass small parts of HA_CREATE_INFO into Type_handler methods: - Column_derived_attributes - to pass table level CHARSET/COLLATE, so columns that do not have explicit CHARSET/COLLATE clauses can derive them from the table level, e.g. CREATE TABLE t1 (a VARCHAR(1), b CHAR(1)) CHARACTER SET utf8; - Column_bulk_alter_attributes - to pass bulk attribute changes generated by the ALTER related code. These bulk changes affect multiple columns at the same time: ALTER TABLE ... CONVERT TO CHARACTER SET csname; Note, passing the whole HA_CREATE_INFO directly to Type_handler would not be good: HA_CREATE_INFO is huge and would need not desired dependencies in sql_type.h and sql_type.cc. The Type_handler API should use smallest possible data types! 2. Type_handler::Column_definition_prepare_stage1() is now responsible to set Column_definition::charset properly, according to the data type, for example: - For string data types, Column_definition_attributes::charset is set from the table level CHARSET/COLLATE clause (if not specified explicitly in the column definition). - For numeric and temporal fields, Column_definition_attributes::charset is set to &my_charset_numeric, no matter what the table level CHARSET/COLLATE says. - For GEOMETRY, Column_definition_attributes::charset is set to &my_charset_bin, no matter what the table level CHARSET/COLLATE says. Previously this code (setting `charset`) was outside of of Column_definition_prepare_stage1(), namely in mysql_prepare_create_table(), and was erroneously called for all data types. 3. Adding Type_handler::Column_definition_bulk_alter(), to handle "ALTER TABLE .. CONVERT TO". Previously this code was inside get_sql_field_charset() and was erroneously called for all data types. 4. Removing the Schema_specification_st parameter from Type_handler::Column_definition_redefine_stage1(). Column_definition_attributes::charset is now fully properly initialized by Column_definition_prepare_stage1(). So we don't need access to the table level CHARSET/COLLATE clause in Column_definition_redefine_stage1() any more. 5. Other changes: - Removing global function get_sql_field_charset() - Moving the part of the former get_sql_field_charset(), which was responsible to inherit the table level CHARSET/COLLATE clause to new methods: -- Column_definition_attributes::explicit_or_derived_charset() and -- Column_definition::prepare_charset_for_string(). This code is only needed for string data types. Previously it was erroneously called for all data types. - Moving another part, which was responsible to apply the "CONVERT TO" clause, to Type_handler_general_purpose_string::Column_definition_bulk_alter(). - Replacing the call for get_sql_field_charset() in sql_partition.cc to sql_field->explicit_or_derived_charset() - it is perfectly enough. The old code was redundant: get_sql_field_charset() was called from sql_partition.cc only when there were no a "CONVERT TO CHARACTER SET" clause involved, so its purpose was only to inherit the table level CHARSET/COLLATE clause. - Moving the code handling the BINCMP_FLAG flag from mysql_prepare_create_table() to Column_definition::prepare_charset_for_string(): This code is responsible to resolve the BINARY comparison style into the corresponding _bin collation, to do the following transparent rewrite: CREATE TABLE t1 (a VARCHAR(10) BINARY) CHARSET utf8; -> CREATE TABLE t1 (a VARCHAR(10) CHARACTER SET utf8 COLLATE utf8_bin); This code is only needed for string data types. Previously it was erroneously called for all data types. 6. Renaming Table_scope_and_contents_source_pod_st::table_charset to alter_table_convert_to_charset, because the only purpose it's used for is handlering "ALTER .. CONVERT". The new name is much more self-descriptive.
-
Daniel Black authored
-
Daniel Black authored
-
Daniel Black authored
SI_USER is, however in FreeBSD there are a couple of non-kernel user signal infomations above SI_KERNEL. Put a fallback just in case there is nothing available.
-
- 06 Apr, 2021 2 commits
-
-
Jan Lindström authored
Added handling for sql_safe_updated i.e. we disable it while we do wsrep_schema operations.
-
Monty authored
MDEV-17913 Encrypted transactional Aria tables remain corrupt after crash recovery, automatic repairment does not work This was because of a wrong test in encryption code that wrote random numbers over the LSN for pages for transactional Aria tables during repair. The effect was that after an ALTER TABLE ENABLE KEYS of a encrypted recovery of the tables would not work. Fixed by changing testing of !share->now_transactional to !share->base.born_transactional. Other things: - Extended Aria check_table() to check for wrong (= too big) LSN numbers. - If check_table() failed just because of wrong LSN or TRN numbers, a following repair table will just do a zerofill which is much faster. - Limit number of LSN errors in one check table to MAX_LSN_ERROR (10). - Removed old obsolete test of 'if (error_count & 2)'. Changed error_count and warning_count from bits to numbers of errors/warnings as this is more useful.
-
- 05 Apr, 2021 2 commits
-
-
mkaruza authored
`WSREP_CLIENT` is used as condition for starting ALTER/OPTIMIZE/REPAIR TOI. Using this condition async replicated affected DDL's will not be replicated. Fixed by removing this condition. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
-
Daniele Sciascia authored
This patch makes the following changes around variable wsrep_on: 1) Variable wsrep_on can no longer be updated from a session that has an active transaction running. The original behavior allowed cases like this: BEGIN; INSERT INTO t1 VALUES (1); SET SESSION wsrep_on = OFF; INSERT INTO t1 VALUES (2); COMMIT; With regular transactions this would result in no replication events (not even value 1). With streaming replication it would be unnecessarily complex to achieve the same behavior. In the above example, it would be possible for value 1 to be already replicated if it happened to fill a separate fragment, while value 2 wouldn't. 2) Global variable wsrep_on no longer affects current sessions, only subsequent ones. This is to avoid a similar case to the above, just using just by using global wsrep_on instead session wsrep_on: --connection conn_1 BEGIN; INSERT INTO t1 VALUES(1); --connection conn_2 SET GLOBAL wsrep_on = OFF; --connection conn_1 INSERT INTO t1 VALUES(2); COMMIT; The above example results in the transaction to be replicated, as global wsrep_on will only affect the session wsrep_on of new connections. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
-
- 02 Apr, 2021 1 commit
-
-
Monty authored
MDEV-17913 Encrypted transactional Aria tables remain corrupt after crash recovery, automatic repairment does not work This was because of a wrong test in encryption code that wrote random numbers over the LSN for pages for transactional Aria tables during repair. The effect was that after an ALTER TABLE ENABLE KEYS of a encrypted recovery of the tables would not work. The test cases will be pushed into 10.5 as it requires of several changes to check table that safer not to backport.
-
- 01 Apr, 2021 3 commits
-
-
Marko Mäkelä authored
-
Marko Mäkelä authored
log_flush_notify(): Restore the reload of log_requests.start that was accidentally removed in commit 8c2e3259. Thanks to Elena Stepanova for a test case (repeatedly running FLUSH LOGS concurrently with InnoDB write transactions).
-
Marko Mäkelä authored
In commit 1bd41158 the intention was to move the trx_t::is_wsrep() check to the caller of the function wsrep_is_BF_lock_timeout().
-
- 31 Mar, 2021 6 commits
-
-
Marko Mäkelä authored
Merge commit 176aaf93 accidentally broke the WITH_WSREP release build.
-
Marko Mäkelä authored
-
Marko Mäkelä authored
buf_page_get_low(): Do not try to re-evict the page if it is multiply buffer-fixed. In commit 7cffb5f6 (MDEV-23399) a livelock was introduced. If multiple threads are concurrently requesting the same secondary index leaf page in buf_page_get_low() and innodb_change_buffering_debug is set, all threads would try to evict the page in a busy loop, never succeeding because the block is buffer-fixed by other threads. Thanks to Roel Van de Paar for reporting the original failure and Elena Stepanova for producing an "rr replay" trace.
-
Marko Mäkelä authored
-
Marko Mäkelä authored
-
Marko Mäkelä authored
-