Commit 31d592ba authored by Marko Mäkelä

MDEV-18349 InnoDB file size changes are not safe when file system crashes

When InnoDB invoked posix_fallocate() to extend data files, it
did not follow up with a call to fsync() to persist the file system
metadata. If file system recovery was needed after a crash, the file
size could be incorrect.

When the setting innodb_flush_method=O_DIRECT_NO_FSYNC
that was introduced in MariaDB 10.0.11 (and MySQL 5.6) is enabled,
InnoDB would wrongly skip fsync() after extending files.

Furthermore, the merge commit d8b45b0c
inadvertently removed XtraDB error checking for posix_fallocate()
which this fix is restoring.

fil_flush(): Add the parameter bool metadata=false to request that
fil_buffering_disabled() be ignored.
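
The effect of the new parameter can be modelled with a small stand-alone sketch. Everything here is an assumption made for illustration: Space, buffering_disabled, flush_count and flush_space() are hypothetical stand-ins, not the real fil_space_t or fil_flush().

```cpp
// Hypothetical stand-in for a tablespace; not the real fil_space_t.
struct Space {
    bool buffering_disabled;  // models fil_buffering_disabled(),
                              // e.g. under O_DIRECT_NO_FSYNC
    int  flush_count;         // number of syncs issued so far
};

// Sketch of the contract: with metadata=false the sync is skipped when
// OS buffering is disabled; with metadata=true it is always issued.
// Returns true if a sync was issued.
bool flush_space(Space& space, bool metadata = false)
{
    if (space.buffering_disabled && !metadata) {
        return false;  // ordinary flush request: skip the sync
    }
    ++space.flush_count;  // real code would sync each data file here
    return true;
}
```

Because the parameter defaults to false, existing callers keep their behaviour, and only a caller that has just changed the file size needs to pass true.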

fil_extend_space_to_desired_size(): Invoke fil_flush() with the
extra parameter. After successful posix_fallocate(), invoke
os_file_flush(). Note: The bookkeeping for fil_flush() would not be
updated in the posix_fallocate() code path, so the "redundant"
fil_flush() should be a no-op.
parent 6786fb00
 /*****************************************************************************
 Copyright (c) 1995, 2017, Oracle and/or its affiliates. All Rights Reserved.
-Copyright (c) 2014, 2017, MariaDB Corporation. All Rights Reserved.
+Copyright (c) 2014, 2019, MariaDB Corporation.
 This program is free software; you can redistribute it and/or modify it under
 the terms of the GNU General Public License as published by the Free Software
@@ -4914,6 +4914,8 @@ fil_extend_space_to_desired_size(
 				" failed with error %d",
 				node->name, start_offset, len + start_offset,
 				err);
+		} else {
+			os_file_flush(node->handle);
 		}
 		DBUG_EXECUTE_IF("ib_os_aio_func_io_failure_28",
@@ -5025,7 +5027,7 @@ fil_extend_space_to_desired_size(
 		size_after_extend, *actual_size); */
 	mutex_exit(&fil_system->mutex);
-	fil_flush(space_id);
+	fil_flush(space_id, true);
 	return(success);
 }
@@ -5641,21 +5643,16 @@ fil_aio_wait(
 }
 #endif /* UNIV_HOTBACKUP */
-/**********************************************************************//**
-Flushes to disk possible writes cached by the OS. If the space does not exist
-or is being dropped, does not do anything. */
-UNIV_INTERN
-void
-fil_flush(
-/*======*/
-	ulint space_id) /*!< in: file space id (this can be a group of
-			log files or a tablespace of the database) */
+/** Make persistent possible writes cached by the OS.
+If the space does not exist or is being dropped, do nothing.
+@param[in] space_id tablespace identifier
+@param[in] metadata whether to update file system metadata */
+UNIV_INTERN void fil_flush(ulint space_id, bool metadata)
 {
 	fil_space_t* space;
 	fil_node_t* node;
 	pfs_os_file_t file;
 	mutex_enter(&fil_system->mutex);
 	space = fil_space_get_by_id(space_id);
@@ -5684,8 +5681,10 @@ fil_flush(
 		}
 #endif /* UNIV_DEBUG */
-		mutex_exit(&fil_system->mutex);
-		return;
+		if (!metadata) {
+			mutex_exit(&fil_system->mutex);
+			return;
+		}
 	}
 	space->n_pending_flushes++; /*!< prevent dropping of the space while
......
 /*****************************************************************************
 Copyright (c) 1995, 2017, Oracle and/or its affiliates. All Rights Reserved.
-Copyright (c) 2017, 2018, MariaDB Corporation.
+Copyright (c) 2017, 2019, MariaDB Corporation.
 This program is free software; you can redistribute it and/or modify it under
 the terms of the GNU General Public License as published by the Free Software
@@ -972,15 +972,11 @@ fil_aio_wait(
 /*=========*/
 	ulint segment); /*!< in: the number of the segment in the aio
 			array to wait for */
-/**********************************************************************//**
-Flushes to disk possible writes cached by the OS. If the space does not exist
-or is being dropped, does not do anything. */
-UNIV_INTERN
-void
-fil_flush(
-/*======*/
-	ulint space_id); /*!< in: file space id (this can be a group of
-			log files or a tablespace of the database) */
+/** Make persistent possible writes cached by the OS.
+If the space does not exist or is being dropped, do nothing.
+@param[in] space_id tablespace identifier
+@param[in] metadata whether to update file system metadata */
+UNIV_INTERN void fil_flush(ulint space_id, bool metadata = false);
 /**********************************************************************//**
 Flushes to disk writes in file spaces of the given type possibly cached by
 the OS. */
......
 /*****************************************************************************
 Copyright (c) 1995, 2017, Oracle and/or its affiliates. All Rights Reserved.
-Copyright (c) 2014, 2017, MariaDB Corporation. All Rights Reserved.
+Copyright (c) 2014, 2019, MariaDB Corporation.
 This program is free software; you can redistribute it and/or modify it under
 the terms of the GNU General Public License as published by the Free Software
@@ -4947,6 +4947,17 @@ fil_extend_space_to_desired_size(
 		} while (err == EINTR
 			 && srv_shutdown_state == SRV_SHUTDOWN_NONE);
+		success = !err;
+		if (!success) {
+			ib_logf(IB_LOG_LEVEL_ERROR, "extending file %s"
+				" from " INT64PF " to " INT64PF " bytes"
+				" failed with error %d",
+				node->name, start_offset, len + start_offset,
+				err);
+		} else {
+			os_file_flush(node->handle);
+		}
 		DBUG_EXECUTE_IF("ib_os_aio_func_io_failure_28",
 			success = FALSE; os_has_said_disk_full = TRUE;);
@@ -5056,7 +5067,7 @@ fil_extend_space_to_desired_size(
 		size_after_extend, *actual_size); */
 	mutex_exit(&fil_system->mutex);
-	fil_flush(space_id);
+	fil_flush(space_id, true);
 	return(success);
 }
@@ -5705,21 +5716,16 @@ fil_aio_wait(
 }
 #endif /* UNIV_HOTBACKUP */
-/**********************************************************************//**
-Flushes to disk possible writes cached by the OS. If the space does not exist
-or is being dropped, does not do anything. */
-UNIV_INTERN
-void
-fil_flush(
-/*======*/
-	ulint space_id) /*!< in: file space id (this can be a group of
-			log files or a tablespace of the database) */
+/** Make persistent possible writes cached by the OS.
+If the space does not exist or is being dropped, do nothing.
+@param[in] space_id tablespace identifier
+@param[in] metadata whether to update file system metadata */
+UNIV_INTERN void fil_flush(ulint space_id, bool metadata)
 {
 	fil_space_t* space;
 	fil_node_t* node;
 	pfs_os_file_t file;
 	mutex_enter(&fil_system->mutex);
 	space = fil_space_get_by_id(space_id);
@@ -5748,8 +5754,10 @@ fil_flush(
 		}
 #endif /* UNIV_DEBUG */
-		mutex_exit(&fil_system->mutex);
-		return;
+		if (!metadata) {
+			mutex_exit(&fil_system->mutex);
+			return;
+		}
 	}
 	space->n_pending_flushes++; /*!< prevent dropping of the space while
......
 /*****************************************************************************
 Copyright (c) 1995, 2017, Oracle and/or its affiliates. All Rights Reserved.
-Copyright (c) 2017, 2018, MariaDB Corporation.
+Copyright (c) 2017, 2019, MariaDB Corporation.
 This program is free software; you can redistribute it and/or modify it under
 the terms of the GNU General Public License as published by the Free Software
@@ -976,15 +976,11 @@ fil_aio_wait(
 /*=========*/
 	ulint segment); /*!< in: the number of the segment in the aio
 			array to wait for */
-/**********************************************************************//**
-Flushes to disk possible writes cached by the OS. If the space does not exist
-or is being dropped, does not do anything. */
-UNIV_INTERN
-void
-fil_flush(
-/*======*/
-	ulint space_id); /*!< in: file space id (this can be a group of
-			log files or a tablespace of the database) */
+/** Make persistent possible writes cached by the OS.
+If the space does not exist or is being dropped, do nothing.
+@param[in] space_id tablespace identifier
+@param[in] metadata whether to update file system metadata */
+UNIV_INTERN void fil_flush(ulint space_id, bool metadata = false);
 /**********************************************************************//**
 Flushes to disk writes in file spaces of the given type possibly cached by
 the OS. */
......