Commit 0f24c9fc authored by heikki@donna.mysql.fi

os0file.c Use O_SYNC if possible, second choice fdatasync, and last choice fsync

configure.in	Check if fdatasync exists
manual.texi	Updated defragmenting doc
manual.texi	Corrected fillfactor contract threshold for a page to 1/2
parent 265164db
@@ -25283,7 +25283,7 @@ of the index records.
 If index records are inserted in a sequential (ascending or descending)
 order, the resulting index pages will be about 15/16 full.
 If records are inserted in a random order, then the pages will be
-1/2 - 15/16 full. If the fillfactor of an index page drops below 1/4,
+1/2 - 15/16 full. If the fillfactor of an index page drops below 1/2,
 InnoDB will try to contract the index tree to free the page.
 @subsubsection Insert buffering
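As a rough illustration of the corrected threshold in the hunk above, the check "fillfactor drops below 1/2" can be sketched in C. The function name, the byte-count parameters, and the 16384-byte page size used in the usage note are assumptions made for this sketch; they are not taken from the InnoDB source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an InnoDB function): returns 1 when the
   used part of a page is strictly below one half of the page size,
   which is the documented point at which a contraction is attempted. */
int page_below_contract_threshold(size_t used_bytes, size_t page_size)
{
	assert(page_size > 0 && used_bytes <= page_size);

	/* used/page_size < 1/2, written without division */
	return used_bytes * 2 < page_size;
}
```

With an assumed 16384-byte page, 8191 used bytes is below the threshold while exactly 8192 is not. Under the old documented value of 1/4 the cutoff would have been 4096 bytes.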
@@ -25440,13 +25440,20 @@ consistent read.
 If there are random insertions or deletions
 in the indexes of a table, the indexes
-may become fragmented. By this we mean that the physical ordering
+may become fragmented. By fragmentation we mean that the physical ordering
 of the index pages on the disk is not close to the alphabetical ordering
-of the records on the pages. It can speed up index scans if you
+of the records on the pages, or that there are many unused pages in the
+64-page blocks which were allocated to the index.
+It can speed up index scans if you
 periodically use @code{mysqldump} to dump the table to
 a text file, drop the table, and reload it from the dump.
+Another way to do the defragmenting is to @code{ALTER} the table type to
+@code{MyISAM} and back to @code{InnoDB} again.
+Note that a @code{MyISAM} table must fit in a single file
+on your operating system.
-Note that if the insertions to and index are always ascending and
+If the insertions to and index are always ascending and
 records are deleted only from the end, then the the file space management
 algorithm of InnoDB guarantees that fragmentation in the index will
 not occur.
@@ -37,6 +37,7 @@ AC_PROG_INSTALL
 AC_CHECK_HEADERS(aio.h sched.h)
 AC_CHECK_SIZEOF(int, 4)
 AC_CHECK_FUNCS(sched_yield)
+AC_CHECK_FUNCS(fdatasync)
 #AC_C_INLINE Already checked in MySQL
 AC_C_BIGENDIAN
@@ -299,6 +299,13 @@ try_again:
 	UT_NOT_USED(purpose);
+	/* On Linux opening a file in the O_SYNC mode seems to be much
+	more efficient than calling an explicit fsync or fdatasync after
+	each write */
+#ifdef O_SYNC
+	create_flag = create_flag | O_SYNC;
+#endif
 	if (create_mode == OS_FILE_CREATE) {
 		file = open(name, create_flag, S_IRUSR | S_IWUSR | S_IRGRP
 				| S_IWGRP | S_IROTH | S_IWOTH);
@@ -492,8 +499,18 @@ os_file_flush(
 #else
 	int	ret;
-	ret = fsync(file);
+#ifdef O_SYNC
+	/* We open all files with the O_SYNC option, which means there
+	should be no need for fsync or fdatasync. In practice such a need
+	may be because on a Linux Xeon computer "donna" the OS seemed to be
+	fooled to believe that 500 disk writes/second are possible. */
+	ret = 0;
+#elif defined(HAVE_FDATASYNC)
+	ret = fdatasync(file);
+#else
+	ret = fsync(file);
+#endif
 	if (ret == 0) {
 		return(TRUE);
 	}
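The two os0file.c hunks work together: the file is opened with O_SYNC when the platform defines it, and the flush routine then falls back to fdatasync, and finally to fsync, only when O_SYNC is unavailable. A minimal standalone sketch of that preference order follows. The function names sketch_open_sync and sketch_flush are hypothetical and do not appear in the commit. HAVE_FDATASYNC is the macro that autoconf's AC_CHECK_FUNCS(fdatasync) defines when the function is found, so in this standalone form it is only set if the build supplies it.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open (creating if needed) a file for reading and
   writing, requesting synchronous writes where O_SYNC is defined. */
int sketch_open_sync(const char *name)
{
	int	flags = O_RDWR | O_CREAT;

#ifdef O_SYNC
	flags |= O_SYNC;	/* first choice: every write is synchronous */
#endif
	return open(name, flags, S_IRUSR | S_IWUSR);
}

/* Hypothetical helper: flush following the same preference order as the
   diff. Returns 0 on success, -1 on failure. */
int sketch_flush(int fd)
{
	assert(fd >= 0);
#if defined(O_SYNC)
	/* The file was opened with O_SYNC, so no explicit flush is issued. */
	(void) fd;
	return 0;
#elif defined(HAVE_FDATASYNC)
	return fdatasync(fd);	/* second choice: flush data only */
#else
	return fsync(fd);	/* last choice: flush data and metadata */
#endif
}
```

Note that on a platform with O_SYNC the flush becomes a no-op, so durability relies entirely on the open flag being honoured by the OS and disk. The comment in the commit itself expresses doubt about this on the author's test machine.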