Commit ac4ad9bd authored by unknown

WL#3072 Maria Recovery

Misc fixes to the execution of UNDOs in the UNDO phase:
- into the CLR_END, store the LSN of the _previous_ UNDO (we debated
which was best; so far we're going with "previous", and later we can
change to "current" if needed), and store the type of record which is
being undone (needed to know how to update state.records when we see
the CLR_END during the REDO phase).
- declare all UNDOs and CLR_END as "compressed".
- when executing an UNDO in the UNDO phase, state.records is updated
in a hook when writing the CLR_END (needed for "recovery of the state"),
and so is trn->undo_lsn (needed for when we have checkpoints).
- bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum
into the re-inserted row, so maria_chk -r threw the row away).
- modifications of ma_test1: where to stop is now driven by --testflag;
--test-undo just tells how to stop (flush data, flush log, or nothing).
- ma_test_recovery: testing of the UNDO phase, more testing of the
REDO phase, identification of a bug.
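
The checksum bugfix listed above can be illustrated with a minimal sketch.
This is not the Maria code: row_t, handler_t, toy_checksum() and write_row()
are hypothetical stand-ins. It only shows why the writer should compute the
checksum from the record it is handed, instead of reading a value cached in
the handler's current row.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for MARIA_ROW and MARIA_HA; not the real structs. */
typedef struct { uint32_t checksum; } row_t;
typedef struct { row_t cur_row; } handler_t;

/* Toy checksum (sum of bytes); the real engine calls share->calc_checksum. */
static uint32_t toy_checksum(const unsigned char *record, size_t len)
{
  uint32_t sum= 0;
  for (size_t i= 0; i < len; i++)
    sum+= record[i];
  return sum;
}

/*
  Writes the one-byte checksum header for 'record' into 'out' and returns it.
  The checksum is computed here, from the record being written, and cached
  in 'row'. 'row' is not necessarily &info->cur_row (it is a different row
  when an UNDO_ROW_DELETE re-inserts a row), so info->cur_row.checksum is
  never read.
*/
static unsigned char write_row(handler_t *info, row_t *row,
                               const unsigned char *record, size_t len,
                               unsigned char *out)
{
  (void) info;                          /* deliberately unused */
  row->checksum= toy_checksum(record, len);
  *out= (unsigned char) row->checksum;  /* least significant byte */
  return *out;
}
```

With this shape, every caller of the writer gets a correct checksum without
having to remember to compute it first.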


storage/maria/ma_blockrec.c:
  - bugfix: execution of UNDO_ROW_DELETE didn't store the correct
  checksum into the row (leading to "maria_chk -r" eliminating the
  re-inserted row; the net effect was that rollback appeared to have
  rolled back no deletion). The reason was that write_block_record() used
  info->cur_row.checksum, while "row" can be != &info->cur_row
  (case of UNDO_ROW_DELETE). After fixing this, problems with
  _ma_update_block_record() appeared; indeed the checksum was computed
  by allocate_and_write_block_record() while _ma_update_block_record()
  calls write_block_record() directly. The solution is to compute the
  checksum in write_block_record() instead.
  - when executing an UNDO, we now pass the LSN of the _previous_ UNDO
  to block_format functions. This LSN can be 0 (if the UNDO being executed
  was the transaction's first UNDO), so "undo_lsn==0" can no longer be
  used to indicate "this is not UNDO work". We use undo_lsn==LSN_ERROR
  instead (this is an impossible LSN).
  - store into CLR_END the type of log record which was undone
  (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has
  to update state.records if it sees this CLR_END in the REDO phase.
  - when writing the CLR_END in _ma_apply_undo_row_insert(),
  the place to store file's id is log_data+LSN_STORE_SIZE.
  - in _ma_apply_undo_row_insert(), the records-- is moved
  to a hook when writing the CLR_END (this way it is under log's mutex
  which is needed for "recovery of the state")
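
A rough sketch of the CLR_END header described above: previous UNDO's LSN
first, then the file id, then one byte for the type of the undone record.
All TOY_* names, the sizes (7 and 2 bytes, which at least agree with the old
fixed length of 9) and the sentinel value are assumptions made for
illustration, not values taken from the Maria headers.

```c
#include <stdint.h>
#include <string.h>

/* Illustrative sizes and values; assumptions, not copied from Maria. */
#define TOY_LSN_STORE_SIZE    7
#define TOY_FILEID_STORE_SIZE 2
#define TOY_CLR_END_SIZE (TOY_LSN_STORE_SIZE + TOY_FILEID_STORE_SIZE + 1)
#define TOY_LSN_FIRST_UNDO 0ULL  /* "no earlier UNDO": a legal previous LSN */
#define TOY_LSN_NOT_UNDO   1ULL  /* sentinel: caller is not executing an UNDO */

enum toy_undone_type { TOY_UNDO_ROW_INSERT= 1, TOY_UNDO_ROW_DELETE= 2 };

/* Stores the low 7 bytes of 'lsn', most significant byte first. */
static void toy_lsn_store(unsigned char *dst, uint64_t lsn)
{
  for (int i= TOY_LSN_STORE_SIZE - 1; i >= 0; i--)
  {
    dst[i]= (unsigned char) (lsn & 0xFF);
    lsn>>= 8;
  }
}

/*
  Fills 'buf' with a CLR_END header and returns 1, or returns 0 without
  touching 'buf' when prev_undo_lsn is the "not an UNDO" sentinel.
  The file id bytes are left zeroed here; the commit passes
  log_data + LSN_STORE_SIZE to translog_write_record() as the place for it.
*/
static int toy_build_clr_end(unsigned char buf[TOY_CLR_END_SIZE],
                             uint64_t prev_undo_lsn,
                             enum toy_undone_type undone)
{
  if (prev_undo_lsn == TOY_LSN_NOT_UNDO)
    return 0;
  memset(buf, 0, TOY_CLR_END_SIZE);
  toy_lsn_store(buf, prev_undo_lsn);
  buf[TOY_LSN_STORE_SIZE + TOY_FILEID_STORE_SIZE]= (unsigned char) undone;
  return 1;
}
```

The point of the separate sentinel is visible in the first branch: a previous
LSN of 0 still produces a CLR_END, which a plain "if (undo_lsn)" test would
have skipped.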
storage/maria/ma_loghandler.c:
  - all UNDOs, and CLR_END, start with the LSN of another UNDO; so
  we can declare them "compressed".
  - write_hook_for_clr_end() to set trn->undo_lsn (to the previous
  UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's
  lock), and also update, if appropriate, state.records.
  - reset share->id to 0 when deassigning; not useful for now but
  sounds logical.
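
What the CLR_END write hook does can be sketched as below. hook_trn,
hook_state and hook_for_clr_end() are hypothetical simplifications; the real
hook takes the log-record parts and the table handler, and runs under the
log's lock.

```c
#include <stdint.h>

enum hook_undone_type { HOOK_UNDO_ROW_INSERT= 1, HOOK_UNDO_ROW_DELETE= 2 };

typedef struct { uint64_t undo_lsn; } hook_trn;   /* stand-in for TRN */
typedef struct { uint64_t records; } hook_state;  /* stand-in for state */

/*
  Meant to run while the (hypothetical) log lock is held, right after the
  CLR_END has been appended. Returns 0 on success, 1 on an unknown type
  (in which case nothing is modified).
*/
static int hook_for_clr_end(hook_trn *trn, hook_state *state,
                            uint64_t prev_undo_lsn,
                            enum hook_undone_type undone)
{
  switch (undone) {
  case HOOK_UNDO_ROW_INSERT:  /* an insert was rolled back: one row fewer */
    state->records--;
    break;
  case HOOK_UNDO_ROW_DELETE:  /* a delete was rolled back: one row more */
    state->records++;
    break;
  default:
    return 1;
  }
  trn->undo_lsn= prev_undo_lsn;  /* next UNDO to execute, 0 if none remain */
  return 0;
}
```

Doing both updates in one place, under the same lock as the log write, keeps
the row count and trn->undo_lsn consistent with what is in the log.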
storage/maria/ma_recovery.c:
  - if no table is found for a REDO, it's not an error; for an UNDO, it is.
  - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn
  and sometimes state.records.
  - in the UNDO phase, when we execute an UNDO_ROW_INSERT:
    * update trn->undo_lsn only after executing the record
    * store the _previous_ undo_lsn into the CLR_END
  - at the end of the REDO phase, when we recreate TRN objects, they
  already have their long id in the log (either via a
  LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), so we don't
  write a new, useless LOGREC_LONG_TRANSACTION_ID for them.
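
Why storing the _previous_ UNDO's LSN helps can be seen in a toy model of the
rollback chain. The record type and chain_rollback() below are hypothetical;
LSNs are modelled as 1-based array positions, which is an assumption made
only to keep the sketch short.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory UNDO record: LSNs are 1-based array positions. */
typedef struct {
  uint64_t prev_undo_lsn;  /* 0 means this was the transaction's first UNDO */
  int      executed;       /* set once the UNDO has been applied */
} chain_undo_rec;

/*
  Executes at most 'max_steps' UNDOs starting at 'undo_lsn' and returns the
  LSN where rollback must continue (0 when fully rolled back). The returned
  value is what the last CLR_END would carry, so a restart can resume there.
*/
static uint64_t chain_rollback(chain_undo_rec *recs, uint64_t undo_lsn,
                               size_t max_steps)
{
  while (undo_lsn != 0 && max_steps-- > 0)
  {
    chain_undo_rec *rec= &recs[undo_lsn - 1];
    rec->executed= 1;              /* apply the UNDO first ... */
    undo_lsn= rec->prev_undo_lsn;  /* ... then advance, as the CLR_END does */
  }
  return undo_lsn;
}
```

Calling chain_rollback() with a small max_steps models a crash in the middle
of a rollback; calling it again with the returned LSN finishes the job
without re-executing any UNDO.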
storage/maria/ma_test1.c:
  * where to stop execution is now driven by --testflag and not --test-undo
  (ma_test2 already has --testflag for the same purpose). This allows
  us to do a clean stop (with commit) at any point.
  * --test-undo=# tells how to abort (flush all pages (which implies
  flushing log) or only log or nothing); all such "ways of crashing"
  are tested in ma_test_recovery
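
The three "ways of crashing" can be summarised in a small table-like sketch.
abort_mode and describe_abort_mode() are hypothetical names; the expectations
encoded here are the likely outcomes described in the ma_test1 comments, not
guarantees.

```c
/* Hypothetical summary of the three --test-undo modes described above. */
typedef struct {
  int flush_data;   /* data pages forced to disk (which also flushes log) */
  int flush_log;    /* log forced to disk */
  int expect_redo;  /* recovery is expected to have REDOs to apply */
  int expect_undo;  /* recovery is expected to have UNDOs to execute */
} abort_mode;

/* Returns 0 and fills 'out' for modes 1..3, returns 1 for anything else. */
static int describe_abort_mode(int test_undo, abort_mode *out)
{
  switch (test_undo) {
  case 1:  /* pages and log on disk: REDOs skipped, UNDOs applied */
    *out= (abort_mode){ 1, 1, 0, 1 };
    return 0;
  case 2:  /* only log on disk: REDOs and UNDOs executed */
    *out= (abort_mode){ 0, 1, 1, 1 };
    return 0;
  case 3:  /* nothing flushed: recovery likely has nothing to do */
    *out= (abort_mode){ 0, 0, 0, 0 };
    return 0;
  default:
    return 1;
  }
}
```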
storage/maria/ma_test_recovery:
  * Testing execution of UNDOs, with and without BLOBs.
  * Testing idempotency of REDOs.
  * See @todo for a probable bug with BLOBs.
  * maria_chk -rq instead of -r, as with -q it nicely stops on any
  problem in the data file (like the checksum bug; see the comment for
  ma_blockrec.c).
  * Testing if log was written by UNDO phase (often expected),
  not written by REDO phase (always expected).
  * Less output on the screen, compares with expected output in the end.
  * some shell thingies like "set --" and $# are courtesy of
  Danny and Pekka.
storage/maria/maria_read_log.c:
  when only displaying the records, don't do an UNDO phase
storage/maria/ma_test_recovery.expected:
  This is the expected output of a great part of ma_test_recovery.
  ma_test_recovery compares its output to the expected output
  and tells if different.
  If we look at this file it mentions differences in checksum
  (normal, it's not recovered yet) and in records count
  (getting a correct records' count when recovery starts on an
  already existing table, like when testing rollback,
  is coded but not yet pushed).
parent 58ac5254
...@@ -1659,7 +1659,7 @@ static my_bool free_full_page_range(MARIA_HA *info, ulonglong page, uint count) ...@@ -1659,7 +1659,7 @@ static my_bool free_full_page_range(MARIA_HA *info, ulonglong page, uint count)
@param map_blocks On which pages the record should be stored @param map_blocks On which pages the record should be stored
@param row_pos Position on head page where to put head part of @param row_pos Position on head page where to put head part of
record record
@param undo_lsn <> 0 if we are in UNDO @param undo_lsn <> LSN_ERROR if we are executing an UNDO
@note @note
On return all pinned pages are released. On return all pinned pages are released.
...@@ -1729,7 +1729,10 @@ static my_bool write_block_record(MARIA_HA *info, ...@@ -1729,7 +1729,10 @@ static my_bool write_block_record(MARIA_HA *info,
if (share->base.pack_fields) if (share->base.pack_fields)
store_key_length_inc(data, row->field_lengths_length); store_key_length_inc(data, row->field_lengths_length);
if (share->calc_checksum) if (share->calc_checksum)
*(data++)= (uchar) info->cur_row.checksum; {
row->checksum= (info->s->calc_checksum)(info, record);
*(data++)= (uchar) (row->checksum); /* store least significant byte */
}
memcpy(data, record, share->base.null_bytes); memcpy(data, record, share->base.null_bytes);
data+= share->base.null_bytes; data+= share->base.null_bytes;
memcpy(data, row->empty_bits, share->base.pack_bytes); memcpy(data, row->empty_bits, share->base.pack_bytes);
...@@ -2283,19 +2286,25 @@ static my_bool write_block_record(MARIA_HA *info, ...@@ -2283,19 +2286,25 @@ static my_bool write_block_record(MARIA_HA *info,
{ {
LEX_STRING *log_array= info->log_row_parts; LEX_STRING *log_array= info->log_row_parts;
if (undo_lsn) if (undo_lsn != LSN_ERROR)
{ {
uchar log_data[LSN_STORE_SIZE + FILEID_STORE_SIZE]; uchar log_data[LSN_STORE_SIZE + FILEID_STORE_SIZE + 1];
/* undo_lsn must be first for compression to work */ /* undo_lsn must be first for compression to work */
lsn_store(log_data, undo_lsn); lsn_store(log_data, undo_lsn);
/*
Store if this CLR is about an UNDO_INSERT, UNDO_DELETE or UNDO_UPDATE;
in the first/second case, Recovery, when it sees the CLR_END in the
REDO phase, may decrement/increment the records' count.
*/
/** @todo when Monty has UNDO_UPDATE coded, revisit this */
log_data[LSN_STORE_SIZE + FILEID_STORE_SIZE]= LOGREC_UNDO_ROW_DELETE;
log_array[TRANSLOG_INTERNAL_PARTS + 0].str= (char*) log_data; log_array[TRANSLOG_INTERNAL_PARTS + 0].str= (char*) log_data;
log_array[TRANSLOG_INTERNAL_PARTS + 0].length= sizeof(log_data); log_array[TRANSLOG_INTERNAL_PARTS + 0].length= sizeof(log_data);
if (translog_write_record(&lsn, LOGREC_CLR_END, if (translog_write_record(&lsn, LOGREC_CLR_END,
info->trn, info, sizeof(log_data), info->trn, info, sizeof(log_data),
TRANSLOG_INTERNAL_PARTS + 1, log_array, TRANSLOG_INTERNAL_PARTS + 1, log_array,
log_data+ FILEID_STORE_SIZE)) log_data + LSN_STORE_SIZE))
goto disk_err; goto disk_err;
} }
else else
...@@ -2425,7 +2434,7 @@ static my_bool write_block_record(MARIA_HA *info, ...@@ -2425,7 +2434,7 @@ static my_bool write_block_record(MARIA_HA *info,
@param info Maria handler @param info Maria handler
@param record Record to write @param record Record to write
@param row Information about fields in 'record' @param row Information about fields in 'record'
@param undo_lsn <> 0 if in undo @param undo_lsn <> LSN_ERROR if we are executing an UNDO
@return @return
@retval 0 ok @retval 0 ok
...@@ -2449,8 +2458,6 @@ static my_bool allocate_and_write_block_record(MARIA_HA *info, ...@@ -2449,8 +2458,6 @@ static my_bool allocate_and_write_block_record(MARIA_HA *info,
PAGECACHE_LOCK_WRITE, &row_pos)) PAGECACHE_LOCK_WRITE, &row_pos))
DBUG_RETURN(1); DBUG_RETURN(1);
row->lastpos= ma_recordpos(blocks->block->page, row_pos.rownr); row->lastpos= ma_recordpos(blocks->block->page, row_pos.rownr);
if (info->s->calc_checksum)
row->checksum= (info->s->calc_checksum)(info,record);
if (write_block_record(info, (uchar*) 0, record, row, if (write_block_record(info, (uchar*) 0, record, row,
blocks, blocks->block->org_bitmap_value != 0, blocks, blocks->block->org_bitmap_value != 0,
&row_pos, undo_lsn)) &row_pos, undo_lsn))
...@@ -2482,7 +2489,8 @@ MARIA_RECORD_POS _ma_write_init_block_record(MARIA_HA *info, ...@@ -2482,7 +2489,8 @@ MARIA_RECORD_POS _ma_write_init_block_record(MARIA_HA *info,
DBUG_ENTER("_ma_write_init_block_record"); DBUG_ENTER("_ma_write_init_block_record");
calc_record_size(info, record, &info->cur_row); calc_record_size(info, record, &info->cur_row);
if (allocate_and_write_block_record(info, record, &info->cur_row, 0)) if (allocate_and_write_block_record(info, record,
&info->cur_row, LSN_ERROR))
DBUG_RETURN(HA_OFFSET_ERROR); DBUG_RETURN(HA_OFFSET_ERROR);
DBUG_RETURN(info->cur_row.lastpos); DBUG_RETURN(info->cur_row.lastpos);
} }
...@@ -2669,7 +2677,7 @@ my_bool _ma_update_block_record(MARIA_HA *info, MARIA_RECORD_POS record_pos, ...@@ -2669,7 +2677,7 @@ my_bool _ma_update_block_record(MARIA_HA *info, MARIA_RECORD_POS record_pos,
if (cur_row->extents_count && free_full_pages(info, cur_row)) if (cur_row->extents_count && free_full_pages(info, cur_row))
goto err; goto err;
DBUG_RETURN(write_block_record(info, oldrec, record, new_row, blocks, DBUG_RETURN(write_block_record(info, oldrec, record, new_row, blocks,
1, &row_pos, 0)); 1, &row_pos, LSN_ERROR));
} }
/* /*
Allocate all size in block for record Allocate all size in block for record
...@@ -2702,7 +2710,7 @@ my_bool _ma_update_block_record(MARIA_HA *info, MARIA_RECORD_POS record_pos, ...@@ -2702,7 +2710,7 @@ my_bool _ma_update_block_record(MARIA_HA *info, MARIA_RECORD_POS record_pos,
row_pos.data= buff + uint2korr(dir); row_pos.data= buff + uint2korr(dir);
row_pos.length= head_length; row_pos.length= head_length;
DBUG_RETURN(write_block_record(info, oldrec, record, new_row, blocks, 1, DBUG_RETURN(write_block_record(info, oldrec, record, new_row, blocks, 1,
&row_pos, 0)); &row_pos, LSN_ERROR));
err: err:
_ma_unpin_all_pages(info, 0); _ma_unpin_all_pages(info, 0);
...@@ -4825,7 +4833,7 @@ my_bool _ma_apply_undo_row_insert(MARIA_HA *info, LSN undo_lsn, ...@@ -4825,7 +4833,7 @@ my_bool _ma_apply_undo_row_insert(MARIA_HA *info, LSN undo_lsn,
ulonglong page; ulonglong page;
uint rownr; uint rownr;
LEX_STRING log_array[TRANSLOG_INTERNAL_PARTS + 1]; LEX_STRING log_array[TRANSLOG_INTERNAL_PARTS + 1];
uchar log_data[LSN_STORE_SIZE + FILEID_STORE_SIZE], *buff; uchar log_data[LSN_STORE_SIZE + FILEID_STORE_SIZE + 1], *buff;
my_bool res= 1; my_bool res= 1;
MARIA_PINNED_PAGE page_link; MARIA_PINNED_PAGE page_link;
LSN lsn; LSN lsn;
...@@ -4858,16 +4866,16 @@ my_bool _ma_apply_undo_row_insert(MARIA_HA *info, LSN undo_lsn, ...@@ -4858,16 +4866,16 @@ my_bool _ma_apply_undo_row_insert(MARIA_HA *info, LSN undo_lsn,
/* undo_lsn must be first for compression to work */ /* undo_lsn must be first for compression to work */
lsn_store(log_data, undo_lsn); lsn_store(log_data, undo_lsn);
log_data[LSN_STORE_SIZE + FILEID_STORE_SIZE]= LOGREC_UNDO_ROW_INSERT;
log_array[TRANSLOG_INTERNAL_PARTS + 0].str= (char*) log_data; log_array[TRANSLOG_INTERNAL_PARTS + 0].str= (char*) log_data;
log_array[TRANSLOG_INTERNAL_PARTS + 0].length= sizeof(log_data); log_array[TRANSLOG_INTERNAL_PARTS + 0].length= sizeof(log_data);
if (translog_write_record(&lsn, LOGREC_CLR_END, if (translog_write_record(&lsn, LOGREC_CLR_END,
info->trn, info, sizeof(log_data), info->trn, info, sizeof(log_data),
TRANSLOG_INTERNAL_PARTS + 1, log_array, TRANSLOG_INTERNAL_PARTS + 1, log_array,
log_data+ FILEID_STORE_SIZE)) log_data + LSN_STORE_SIZE))
goto err; goto err;
info->s->state.state.records--;
res= 0; res= 0;
err: err:
_ma_unpin_all_pages(info, lsn); _ma_unpin_all_pages(info, lsn);
......
...@@ -213,6 +213,9 @@ static my_bool write_hook_for_redo(enum translog_record_type type, ...@@ -213,6 +213,9 @@ static my_bool write_hook_for_redo(enum translog_record_type type,
static my_bool write_hook_for_undo(enum translog_record_type type, static my_bool write_hook_for_undo(enum translog_record_type type,
TRN *trn, MARIA_HA *tbl_info, LSN *lsn, TRN *trn, MARIA_HA *tbl_info, LSN *lsn,
struct st_translog_parts *parts); struct st_translog_parts *parts);
static my_bool write_hook_for_clr_end(enum translog_record_type type,
TRN *trn, MARIA_HA *tbl_info, LSN *lsn,
struct st_translog_parts *parts);
static my_bool translog_page_validator(uchar *page_addr, uchar* data_ptr); static my_bool translog_page_validator(uchar *page_addr, uchar* data_ptr);
...@@ -414,7 +417,8 @@ static LOG_DESC INIT_LOGREC_REDO_UNDELETE_ROW= ...@@ -414,7 +417,8 @@ static LOG_DESC INIT_LOGREC_REDO_UNDELETE_ROW=
"redo_undelete_row", LOGREC_NOT_LAST_IN_GROUP, NULL, NULL}; "redo_undelete_row", LOGREC_NOT_LAST_IN_GROUP, NULL, NULL};
static LOG_DESC INIT_LOGREC_CLR_END= static LOG_DESC INIT_LOGREC_CLR_END=
{LOGRECTYPE_FIXEDLENGTH, 9, 9, NULL, write_hook_for_redo, NULL, 0, {LOGRECTYPE_PSEUDOFIXEDLENGTH, LSN_STORE_SIZE + FILEID_STORE_SIZE + 1,
LSN_STORE_SIZE + FILEID_STORE_SIZE + 1, NULL, write_hook_for_clr_end, NULL, 1,
"clr_end", LOGREC_LAST_IN_GROUP, NULL, NULL}; "clr_end", LOGREC_LAST_IN_GROUP, NULL, NULL};
static LOG_DESC INIT_LOGREC_PURGE_END= static LOG_DESC INIT_LOGREC_PURGE_END=
...@@ -422,16 +426,16 @@ static LOG_DESC INIT_LOGREC_PURGE_END= ...@@ -422,16 +426,16 @@ static LOG_DESC INIT_LOGREC_PURGE_END=
"purge_end", LOGREC_LAST_IN_GROUP, NULL, NULL}; "purge_end", LOGREC_LAST_IN_GROUP, NULL, NULL};
static LOG_DESC INIT_LOGREC_UNDO_ROW_INSERT= static LOG_DESC INIT_LOGREC_UNDO_ROW_INSERT=
{LOGRECTYPE_FIXEDLENGTH, {LOGRECTYPE_PSEUDOFIXEDLENGTH,
LSN_STORE_SIZE + FILEID_STORE_SIZE + PAGE_STORE_SIZE + DIRPOS_STORE_SIZE, LSN_STORE_SIZE + FILEID_STORE_SIZE + PAGE_STORE_SIZE + DIRPOS_STORE_SIZE,
LSN_STORE_SIZE + FILEID_STORE_SIZE + PAGE_STORE_SIZE + DIRPOS_STORE_SIZE, LSN_STORE_SIZE + FILEID_STORE_SIZE + PAGE_STORE_SIZE + DIRPOS_STORE_SIZE,
NULL, write_hook_for_undo, NULL, 0, NULL, write_hook_for_undo, NULL, 1,
"undo_row_insert", LOGREC_LAST_IN_GROUP, NULL, NULL}; "undo_row_insert", LOGREC_LAST_IN_GROUP, NULL, NULL};
static LOG_DESC INIT_LOGREC_UNDO_ROW_DELETE= static LOG_DESC INIT_LOGREC_UNDO_ROW_DELETE=
{LOGRECTYPE_VARIABLE_LENGTH, 0, {LOGRECTYPE_VARIABLE_LENGTH, 0,
LSN_STORE_SIZE + FILEID_STORE_SIZE + PAGE_STORE_SIZE + DIRPOS_STORE_SIZE, LSN_STORE_SIZE + FILEID_STORE_SIZE + PAGE_STORE_SIZE + DIRPOS_STORE_SIZE,
NULL, write_hook_for_undo, NULL, 0, NULL, write_hook_for_undo, NULL, 1,
"undo_row_delete", LOGREC_LAST_IN_GROUP, NULL, NULL}; "undo_row_delete", LOGREC_LAST_IN_GROUP, NULL, NULL};
static LOG_DESC INIT_LOGREC_UNDO_ROW_UPDATE= static LOG_DESC INIT_LOGREC_UNDO_ROW_UPDATE=
...@@ -451,8 +455,8 @@ static LOG_DESC INIT_LOGREC_UNDO_KEY_INSERT= ...@@ -451,8 +455,8 @@ static LOG_DESC INIT_LOGREC_UNDO_KEY_INSERT=
"undo_key_insert", LOGREC_LAST_IN_GROUP, NULL, NULL}; "undo_key_insert", LOGREC_LAST_IN_GROUP, NULL, NULL};
static LOG_DESC INIT_LOGREC_UNDO_KEY_DELETE= static LOG_DESC INIT_LOGREC_UNDO_KEY_DELETE=
{LOGRECTYPE_VARIABLE_LENGTH, 0, 15, NULL, write_hook_for_undo, NULL, 0, {LOGRECTYPE_VARIABLE_LENGTH, 0, 15, NULL, write_hook_for_undo, NULL, 1,
"undo_key_delete", LOGREC_LAST_IN_GROUP, NULL, NULL}; // QQ: why not compressed? "undo_key_delete", LOGREC_LAST_IN_GROUP, NULL, NULL};
static LOG_DESC INIT_LOGREC_PREPARE= static LOG_DESC INIT_LOGREC_PREPARE=
{LOGRECTYPE_VARIABLE_LENGTH, 0, 0, NULL, NULL, NULL, 0, {LOGRECTYPE_VARIABLE_LENGTH, 0, 0, NULL, NULL, NULL, 0,
...@@ -6303,6 +6307,46 @@ static my_bool write_hook_for_undo(enum translog_record_type type ...@@ -6303,6 +6307,46 @@ static my_bool write_hook_for_undo(enum translog_record_type type
*/ */
} }
/**
@brief Sets transaction's undo_lsn, first_undo_lsn if needed
@todo move it to a separate file
@return Operation status, always 0 (success)
*/
static my_bool write_hook_for_clr_end(enum translog_record_type type
__attribute__ ((unused)),
TRN *trn, MARIA_HA *tbl_info
__attribute__ ((unused)),
LSN *lsn
__attribute__ ((unused)),
struct st_translog_parts *parts)
{
char *ptr= parts->parts[TRANSLOG_INTERNAL_PARTS + 0].str;
enum translog_record_type undone_record_type=
ptr[LSN_STORE_SIZE + FILEID_STORE_SIZE];
DBUG_ASSERT(trn->trid != 0);
/** @todo depending on what we are undoing, update "records" or not */
trn->undo_lsn= lsn_korr(ptr);
switch (undone_record_type) {
case LOGREC_UNDO_ROW_DELETE:
tbl_info->s->state.state.records++;
break;
case LOGREC_UNDO_ROW_INSERT:
tbl_info->s->state.state.records--;
break;
default:
DBUG_ASSERT(0);
}
if (trn->undo_lsn == LSN_IMPOSSIBLE) /* has fully rolled back */
trn->first_undo_lsn= LSN_WITH_FLAGS_TO_FLAGS(trn->first_undo_lsn);
return 0;
}
/** /**
@brief Gives a 2-byte-id to MARIA_SHARE and logs this fact @brief Gives a 2-byte-id to MARIA_SHARE and logs this fact
...@@ -6375,6 +6419,15 @@ int translog_assign_id_to_share(MARIA_SHARE *share, TRN *trn) ...@@ -6375,6 +6419,15 @@ int translog_assign_id_to_share(MARIA_SHARE *share, TRN *trn)
sizeof(log_array)/sizeof(log_array[0]), sizeof(log_array)/sizeof(log_array[0]),
log_array, NULL))) log_array, NULL)))
return 1; return 1;
/*
Note that we first set share->id then write the record. The checkpoint
record does not include any share with id==0; this is ok because:
checkpoint_start_log_horizon is either before or after the above
record. If before, ok to not include the share, as the record will be
seen for sure during the REDO phase. If after, Checkpoint will see all
data as it was after this record was written, including the id!=0, so
share will be included.
*/
} }
pthread_mutex_unlock(&share->intern_lock); pthread_mutex_unlock(&share->intern_lock);
return 0; return 0;
...@@ -6400,6 +6453,7 @@ void translog_deassign_id_from_share(MARIA_SHARE *share) ...@@ -6400,6 +6453,7 @@ void translog_deassign_id_from_share(MARIA_SHARE *share)
my_atomic_rwlock_rdlock(&LOCK_id_to_share); my_atomic_rwlock_rdlock(&LOCK_id_to_share);
my_atomic_storeptr((void **)&id_to_share[share->id], 0); my_atomic_storeptr((void **)&id_to_share[share->id], 0);
my_atomic_rwlock_rdunlock(&LOCK_id_to_share); my_atomic_rwlock_rdunlock(&LOCK_id_to_share);
share->id= 0;
} }
......
This diff is collapsed.
...@@ -15,11 +15,12 @@ ...@@ -15,11 +15,12 @@
/* Testing of the basic functions of a MARIA table */ /* Testing of the basic functions of a MARIA table */
#include "maria.h" #include "maria_def.h"
#include <my_getopt.h> #include <my_getopt.h>
#include <m_string.h> #include <m_string.h>
#include "ma_control_file.h" #include "ma_control_file.h"
#include "ma_loghandler.h" #include "ma_loghandler.h"
#include "trnman.h"
extern PAGECACHE *maria_log_pagecache; extern PAGECACHE *maria_log_pagecache;
extern const char *maria_data_root; extern const char *maria_data_root;
...@@ -28,7 +29,7 @@ extern const char *maria_data_root; ...@@ -28,7 +29,7 @@ extern const char *maria_data_root;
static void usage(); static void usage();
static int rec_pointer_size=0, flags[50]; static int rec_pointer_size=0, flags[50], testflag;
static int key_field=FIELD_SKIP_PRESPACE,extra_field=FIELD_SKIP_ENDSPACE; static int key_field=FIELD_SKIP_PRESPACE,extra_field=FIELD_SKIP_ENDSPACE;
static int key_type=HA_KEYTYPE_NUM; static int key_type=HA_KEYTYPE_NUM;
static int create_flag=0; static int create_flag=0;
...@@ -223,6 +224,9 @@ static int run_test(const char *filename) ...@@ -223,6 +224,9 @@ static int run_test(const char *filename)
if (maria_commit(file) || maria_begin(file)) if (maria_commit(file) || maria_begin(file))
goto err; goto err;
if (testflag == 1)
goto end;
/* Insert 2 rows with null values */ /* Insert 2 rows with null values */
if (null_fields) if (null_fields)
{ {
...@@ -240,16 +244,10 @@ static int run_test(const char *filename) ...@@ -240,16 +244,10 @@ static int run_test(const char *filename)
flags[0]=2; flags[0]=2;
} }
if (die_in_middle_of_transaction == 1) if (testflag == 2)
{ {
/* printf("terminating after inserts\n");
Ensure we get changed pages and log to disk goto end;
As commit record is not done, the undo entries needs to be rolled back.
*/
_ma_flush_table_files(file, MARIA_FLUSH_DATA, FLUSH_RELEASE,
FLUSH_RELEASE);
printf("Dying on request after insert without maria_close()\n");
exit(1);
} }
if (!skip_update) if (!skip_update)
...@@ -304,6 +302,8 @@ static int run_test(const char *filename) ...@@ -304,6 +302,8 @@ static int run_test(const char *filename)
maria_scan_end(file); maria_scan_end(file);
} }
if (testflag == 3)
goto end;
if (!silent) if (!silent)
printf("- Reopening file\n"); printf("- Reopening file\n");
if (maria_commit(file)) if (maria_commit(file))
...@@ -321,6 +321,12 @@ static int run_test(const char *filename) ...@@ -321,6 +321,12 @@ static int run_test(const char *filename)
for (i=0 ; i <= 10 ; i++) for (i=0 ; i <= 10 ; i++)
{ {
/*
If you want to debug the problem in ma_test_recovery with BLOBs
(see @todo there), you can break out of the loop after just one
delete, it is enough, like this:
if (i==1) break;
*/
/* testing */ /* testing */
if (remove_count-- == 0) if (remove_count-- == 0)
{ {
...@@ -355,19 +361,14 @@ static int run_test(const char *filename) ...@@ -355,19 +361,14 @@ static int run_test(const char *filename)
} }
} }
} }
}
if (die_in_middle_of_transaction == 2) if (testflag == 4)
{ {
/* printf("terminating after deletes\n");
Ensure we get changed pages and log to disk goto end;
As commit record is not done, the undo entries needs to be rolled back.
*/
_ma_flush_table_files(file, MARIA_FLUSH_DATA, FLUSH_RELEASE,
FLUSH_RELEASE);
printf("Dying on request after delete without maria_close()\n");
exit(1);
}
} }
if (!silent) if (!silent)
printf("- Reading rows with key\n"); printf("- Reading rows with key\n");
record[1]= 0; /* For nicer printf */ record[1]= 0; /* For nicer printf */
...@@ -412,6 +413,39 @@ static int run_test(const char *filename) ...@@ -412,6 +413,39 @@ static int run_test(const char *filename)
i-1,error,my_errno,read_record+1); i-1,error,my_errno,read_record+1);
} }
} }
end:
if (die_in_middle_of_transaction)
{
/* As commit record is not done, UNDO entries needs to be rolled back */
switch (die_in_middle_of_transaction) {
case 1:
/*
Flush changed pages go to disk. That will also flush log. Recovery
will skip REDOs and apply UNDOs.
*/
_ma_flush_table_files(file, MARIA_FLUSH_DATA, FLUSH_RELEASE,
FLUSH_RELEASE);
break;
case 2:
/*
Just flush log. Pages are likely to not be on disk. Recovery will
then execute REDOs and UNDOs.
*/
if (translog_flush(file->trn->undo_lsn))
goto err;
break;
case 3:
/*
Flush nothing. Pages and log are likely to not be on disk. Recovery
will then do nothing.
*/
break;
}
printf("Dying on request without maria_commit()/maria_close()\n");
exit(0);
}
if (maria_commit(file)) if (maria_commit(file))
goto err; goto err;
if (maria_close(file)) if (maria_close(file))
...@@ -676,11 +710,13 @@ static struct my_option my_long_options[] = ...@@ -676,11 +710,13 @@ static struct my_option my_long_options[] =
(uchar**) &skip_delete, 0, GET_BOOL, NO_ARG, 0, 0, 0, 0, 0, 0}, (uchar**) &skip_delete, 0, GET_BOOL, NO_ARG, 0, 0, 0, 0, 0, 0},
{"skip-update", 'D', "Don't test updates", (uchar**) &skip_update, {"skip-update", 'D', "Don't test updates", (uchar**) &skip_update,
(uchar**) &skip_update, 0, GET_BOOL, NO_ARG, 0, 0, 0, 0, 0, 0}, (uchar**) &skip_update, 0, GET_BOOL, NO_ARG, 0, 0, 0, 0, 0, 0},
{"testflag", 't', "Stop test at specified stage", (uchar**) &testflag,
(uchar**) &testflag, 0, GET_INT, REQUIRED_ARG, 0, 0, 0, 0, 0, 0},
{"test-undo", 'A', {"test-undo", 'A',
"Abort hard after doing inserts. Used for testing recovery with undo", "Abort hard. Used for testing recovery with undo",
(uchar**) &die_in_middle_of_transaction, (uchar**) &die_in_middle_of_transaction,
(uchar**) &die_in_middle_of_transaction, (uchar**) &die_in_middle_of_transaction,
0, GET_INT, OPT_ARG, 0, 0, 0, 0, 0, 0}, 0, GET_INT, REQUIRED_ARG, 0, 0, 0, 0, 0, 0},
{"transactional", 'T', {"transactional", 'T',
"Test in transactional mode. (Only works with block format)", "Test in transactional mode. (Only works with block format)",
(uchar**) &transactional, (uchar**) &transactional, 0, GET_BOOL, NO_ARG, (uchar**) &transactional, (uchar**) &transactional, 0, GET_BOOL, NO_ARG,
...@@ -768,12 +804,6 @@ get_one_option(int optid, const struct my_option *opt __attribute__((unused)), ...@@ -768,12 +804,6 @@ get_one_option(int optid, const struct my_option *opt __attribute__((unused)),
case 'K': /* Use key cacheing */ case 'K': /* Use key cacheing */
pagecacheing=1; pagecacheing=1;
break; break;
case 'A':
if (!argument)
die_in_middle_of_transaction= 1;
else
die_in_middle_of_transaction= atoi(argument);
break;
case 'V': case 'V':
printf("test1 Ver 1.2 \n"); printf("test1 Ver 1.2 \n");
exit(0); exit(0);
......
...@@ -7,58 +7,201 @@ then ...@@ -7,58 +7,201 @@ then
maria_path="." maria_path="."
fi fi
tmp=$maria_path/tmp # test data is always put in the current directory or a tmp subdirectory of it
tmp="./tmp"
if test '!' -d $tmp if test '!' -d $tmp
then then
mkdir $tmp mkdir $tmp
fi fi
echo "MARIA RECOVERY TESTS - success is if exit code is 0" echo "MARIA RECOVERY TESTS"
check_table_is_same()
{
# Computes checksum of new table and compares to checksum of old table
# Shows any difference in table's state (info from the index's header)
$maria_path/maria_chk -dvv $table | grep -v "Creation time:" > $tmp/maria_chk_message.txt 2>&1
# save the index file (because we want to test idempotency afterwards)
cp $table.MAI tmp/
# In the repair below it's good to use -q because it will die on any
# incorrectness of the data file if UNDO was badly applied.
# QQ: Remove the following line when we also can recover the index file
$maria_path/maria_chk -s -rq $table
$maria_path/maria_chk -s -e $table
checksum2=`$maria_path/maria_chk -dss $table`
if test "$checksum" != "$checksum2"
then
echo "checksum differs for $table before and after recovery"
return 1;
fi
diff $tmp/maria_chk_message.good.txt $tmp/maria_chk_message.txt > $tmp/maria_chk_diff.txt || true
if [ -s $tmp/maria_chk_diff.txt ]
then
echo "Differences in maria_chk -dvv, recovery not yet perfect !"
echo "========DIFF START======="
cat $tmp/maria_chk_diff.txt
echo "========DIFF END======="
fi
mv tmp/$table.MAI .
}
apply_log()
{
# applies log, can verify if applying did write to log or not
shouldchangelog=$1
if [ "$shouldchangelog" != "shouldnotchangelog" ] &&
[ "$shouldchangelog" != "shouldchangelog" ] &&
[ "$shouldchangelog" != "dontknow" ]
then
echo "bad argument '$shouldchangelog'"
return 1
fi
log_md5=`md5sum maria_log.*`
echo "applying log"
$maria_path/maria_read_log -a > $tmp/maria_read_log_$table.txt
log_md5_2=`md5sum maria_log.*`
if [ "$log_md5" != "$log_md5_2" ]
then
if [ "$shouldchangelog" == "shouldnotchangelog" ]
then
echo "maria_read_log should not have modified the log"
return 1
fi
else
if [ "$shouldchangelog" == "shouldchangelog" ]
then
echo "maria_read_log should have modified the log"
return 1
fi
fi
}
# To not flood the screen, we redirect all the commands below to a text file
# and just give a final error if their output is not as expected
(
# this message is to remember about the problem with -b (see @todo below)
echo "!!!!!!!! REMEMBER to FIX this BLOB issue !!!!!!!"
echo "Testing the REDO PHASE ALONE"
# runs a program inserting/deleting rows, then moves the resulting table # runs a program inserting/deleting rows, then moves the resulting table
# elsewhere; applies the log and checks that the data file is # elsewhere; applies the log and checks that the data file is
# identical to the saved original. # identical to the saved original.
# Does not test the index file as we don't have logging for it yet. # Does not test the index file as we don't have logging for it yet.
for prog in "$maria_path/ma_test1 $silent -M -T -c" "$maria_path/ma_test2 $silent -L -K -W -P -M -T -c" "$maria_path/ma_test2 $silent -M -T -c -b" set -- "$maria_path/ma_test1 $silent -M -T -c" "$maria_path/ma_test2 $silent -L -K -W -P -M -T -c" "$maria_path/ma_test2 $silent -M -T -c -b"
while [ $# != 0 ]
do do
rm -f maria_log.* maria_log_control prog=$1
rm maria_log.* maria_log_control
echo "TEST WITH $prog" echo "TEST WITH $prog"
$prog $prog
# derive table's name from program's name # derive table's name from program's name
table=`echo $prog | sed -e 's;.*ma_\(test[0-9]\).*;\1;' ` table=`echo $prog | sed -e 's;.*ma_\(test[0-9]\).*;\1;' `
$maria_path/maria_chk -dvv $table > $tmp/maria_chk_message.good.txt 2>&1 $maria_path/maria_chk -dvv $table | grep -v "Creation time:"> $tmp/maria_chk_message.good.txt 2>&1
checksum=`$maria_path/maria_chk -dss $table` checksum=`$maria_path/maria_chk -dss $table`
mv -f $table.MAD $tmp/$table.MAD.good mv $table.MAD $tmp/$table.MAD.good
rm $table.MAI rm $table.MAI
echo "applying log" apply_log "shouldnotchangelog"
$maria_path/maria_read_log -a > $tmp/maria_read_log_$table.txt
$maria_path/maria_chk -dvv $table > $tmp/maria_chk_message.txt 2>&1
cmp $table.MAD $tmp/$table.MAD.good cmp $table.MAD $tmp/$table.MAD.good
check_table_is_same
echo "testing idempotency"
apply_log "shouldnotchangelog"
cmp $table.MAD $tmp/$table.MAD.good
check_table_is_same
shift
done
# QQ: Remove the following line when we also can recovert the index file echo "Testing the REDO AND UNDO PHASE"
$maria_path/maria_chk -s -r $table # The test programs look like:
# work; commit (time T1); work; exit-without-commit (time T2)
# We first run the test program and let it exit after T1's commit.
# Then we run it again and let it exit at T2. Then we compare
# and expect identity.
$maria_path/maria_chk -s -e $table for blobs in "" "-b" # we test table without blobs and then table with blobs
checksum2=`$maria_path/maria_chk -dss $table` do
if test "$checksum" != "$checksum2" for test_undo in 1 2 3
do
# first iteration tests rollback of insert, second tests rollback of delete
set -- "$maria_path/ma_test1 $silent -M -T -c -N $blobs" "--testflag=1" "--testflag=2" "$maria_path/ma_test1 $silent -M -T -c -N --debug=d:t:i:o,/tmp/ma_test1.trace $blobs" "--testflag=3" "--testflag=4"
# -N (create NULL fields) is needed because --test-undo adds it anyway
while [ $# != 0 ]
do
prog=$1
commit_run_args=$2
abort_run_args=$3;
rm maria_log.* maria_log_control
echo "TEST WITH $prog $commit_run_args (commit at end)"
$prog $commit_run_args
# derive table's name from program's name
table=`echo $prog | sed -e 's;.*ma_\(test[0-9]\).*;\1;' `
$maria_path/maria_chk -dvv $table | grep -v "Creation time:"> $tmp/maria_chk_message.good.txt 2>&1
checksum=`$maria_path/maria_chk -dss $table`
mv $table.MAD $tmp/$table.MAD.good
rm $table.MAI
rm maria_log.* maria_log_control
echo "TEST WITH $prog $abort_run_args --test-undo=$test_undo (additional aborted work)"
$prog $abort_run_args --test-undo=$test_undo
cp $table.MAD $tmp/$table.MAD.before_undo
if [ $test_undo -lt 3 ]
then then
echo "checksum differs for $table before and after recovery" apply_log "shouldchangelog" # should undo aborted work
exit 1; else
# probably nothing to undo went to log or data file
apply_log "dontknow"
fi fi
cp $table.MAD $tmp/$table.MAD.after_undo
# It is impossible to do a "cmp" between .good and .after_undo,
# because the UNDO phase generated log
# records whose LSN tagged pages. Another reason is that rolling back
# INSERT only marks the rows free, does not empty them (optimization), so
# traces of the INSERT+rollback remain.
# When "recovery of the table's state" is ready, we can test it like this: check_table_is_same
# diff $tmp/maria_chk_message.good.txt $tmp/maria_chk_message.txt > $tmp/maria_chk_diff.txt || true echo "testing idempotency"
# if [ -s $tmp/maria_chk_diff.txt ] apply_log "shouldnotchangelog"
# then cmp $table.MAD $tmp/$table.MAD.after_undo
# echo "Differences in maria_chk -dvv, recovery not yet perfect !" check_table_is_same
# echo "========DIFF START=======" echo "testing applying of CLRs to recreate table"
# cat $tmp/maria_chk_diff.txt rm $table.MA?
# echo "========DIFF END=======" apply_log "shouldnotchangelog"
# fi # the cmp below fails with blobs! @todo RECOVERY BUG find out why.
rm -f $table.* $tmp/maria_chk_*.txt $tmp/maria_read_log_$table.txt # It is probably serious; REDOs shouldn't place rows in different
# positions from what the run-time code did. Indeed it may lead to
# more or less free space...
# Execution of UNDO re-inserted rows at different positions than
# originally. This generated REDOs which do not insert at the same
# positions as the execution of UNDOs, but at the same positions
# as before the row was originally deleted.
if [ "$blobs" == "" ]
then
cmp $table.MAD $tmp/$table.MAD.after_undo
fi
check_table_is_same
shift 3
done
done
done done
rm -f $table.* $tmp/$table* $tmp/maria_chk_*.txt $tmp/maria_read_log_$table.txt
) > $tmp/ma_test_recovery.output
diff $maria_path/ma_test_recovery.expected $tmp/ma_test_recovery.output > /dev/null || diff_failed=1
if [ "$diff_failed" == "1" ]
then
echo "UNEXPECTED OUTPUT OF TESTS, FAILED"
echo "For more info, do diff $maria_path/ma_test_recovery.expected $tmp/ma_test_recovery.output"
exit 1
fi
echo "ALL RECOVERY TESTS OK" echo "ALL RECOVERY TESTS OK"
# this message is to remember about the problem with -b (see @todo above)
echo "!!!!!!!! BUT REMEMBER to FIX this BLOB issue !!!!!!!"
This diff is collapsed.
...@@ -93,7 +93,8 @@ int main(int argc, char **argv) ...@@ -93,7 +93,8 @@ int main(int argc, char **argv)
*/ */
fprintf(stdout, "TRACE of the last maria_read_log\n"); fprintf(stdout, "TRACE of the last maria_read_log\n");
if (maria_apply_log(lsn, opt_display_and_apply, stdout, TRUE)) if (maria_apply_log(lsn, opt_display_and_apply, stdout,
opt_display_and_apply))
goto err; goto err;
fprintf(stdout, "%s: SUCCESS\n", my_progname); fprintf(stdout, "%s: SUCCESS\n", my_progname);
......