Commit 66d27243 authored by Hisashi Hifumi, committed by Linus Torvalds

[PATCH] BUG on error handlings in Ext3 under I/O failure condition

I found bugs in the error handling of functions around the ext3 file
system. When disk I/O failures occur, these bugs cause synchronous write
I/O operations to complete incorrectly.  Both 2.4 and 2.6 have this
problem.

I carried out the following experiment:

1.  Mount an ext3 file system on a SCSI disk in ordered mode.
2.  Open a file on the file system with the O_SYNC|O_RDWR|O_TRUNC|O_CREAT
     flags.
3.  Write 512 bytes of data to the file by calling write() every 5 seconds,
     and examine the return values from write().
4.  Disconnect the SCSI cable, and examine the messages from the kernel.
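
Steps 2 and 3 above can be sketched as a small user-space helper. This is
only an illustration under stated assumptions, not the program used in the
original experiment: the function name sync_write_probe(), its parameters,
the fill byte and the 0644 file mode are all hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): write `count` blocks of
 * 512 bytes to `path`, sleeping `interval_sec` seconds between writes,
 * and print the outcome of every write() call.
 *
 * Returns the number of failed or short writes, or -1 if open() fails.
 */
int sync_write_probe(const char *path, int count, unsigned int interval_sec)
{
	char buf[512];
	int failures = 0;
	int fd, i;

	memset(buf, 'x', sizeof(buf));

	/* Same flags as in step 2 of the experiment; 0644 is an assumption. */
	fd = open(path, O_SYNC | O_RDWR | O_TRUNC | O_CREAT, 0644);
	if (fd < 0) {
		perror("open");
		return -1;
	}

	for (i = 0; i < count; i++) {
		ssize_t n = write(fd, buf, sizeof(buf));

		if (n == (ssize_t)sizeof(buf)) {
			printf("write %d: ok (%zd bytes)\n", i, n);
		} else if (n < 0) {
			printf("write %d: failed: %s\n", i, strerror(errno));
			failures++;
		} else {
			printf("write %d: short write (%zd bytes)\n", i, n);
			failures++;
		}

		if (interval_sec)
			sleep(interval_sec);
	}

	close(fd);
	return failures;
}
```

Calling sync_write_probe() with a path on the ext3 mount, a large count and
an interval of 5 would approximate the experiment. With the bug present,
the "ok" lines would be expected to continue for a while after the cable is
disconnected.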

After the SCSI cable is disconnected, write() must fail.  But the result
was different: write() kept succeeding for a while even though the kernel
messages reported SCSI I/O errors.

Applying the following modifications solved the above problem.

Signed-off-by: Hisashi Hifumi <hifumi.hisashi@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
parent b0c7ad66
@@ -311,10 +311,10 @@ int file_fsync(struct file *filp, struct dentry *dentry, int datasync)
 {
 	struct inode * inode = dentry->d_inode;
 	struct super_block * sb;
-	int ret;
+	int ret, err;
 	/* sync the inode to buffers */
-	write_inode_now(inode, 0);
+	ret = write_inode_now(inode, 0);
 	/* sync the superblock to buffers */
 	sb = inode->i_sb;
@@ -324,7 +324,9 @@ int file_fsync(struct file *filp, struct dentry *dentry, int datasync)
 	unlock_super(sb);
 	/* .. finally sync the buffers to disk */
-	ret = sync_blockdev(sb->s_bdev);
+	err = sync_blockdev(sb->s_bdev);
+	if (!ret)
+		ret = err;
 	return ret;
 }
@@ -558,22 +558,24 @@ void sync_inodes(int wait)
  * dirty. This is primarily needed by knfsd.
  */
-void write_inode_now(struct inode *inode, int sync)
+int write_inode_now(struct inode *inode, int sync)
 {
+	int ret;
 	struct writeback_control wbc = {
 		.nr_to_write = LONG_MAX,
 		.sync_mode = WB_SYNC_ALL,
 	};
 	if (inode->i_mapping->backing_dev_info->memory_backed)
-		return;
+		return 0;
 	might_sleep();
 	spin_lock(&inode_lock);
-	__writeback_single_inode(inode, &wbc);
+	ret = __writeback_single_inode(inode, &wbc);
 	spin_unlock(&inode_lock);
 	if (sync)
 		wait_on_inode(inode);
+	return ret;
 }
 EXPORT_SYMBOL(write_inode_now);
@@ -642,8 +644,11 @@ int generic_osync_inode(struct inode *inode, struct address_space *mapping, int
 		need_write_inode_now = 1;
 	spin_unlock(&inode_lock);
-	if (need_write_inode_now)
-		write_inode_now(inode, 1);
+	if (need_write_inode_now) {
+		err2 = write_inode_now(inode, 1);
+		if (!err)
+			err = err2;
+	}
 	else
 		wait_on_inode(inode);
@@ -337,6 +337,9 @@ void journal_commit_transaction(journal_t *journal)
 	}
 	spin_unlock(&journal->j_list_lock);
+	if (err)
+		__journal_abort_hard(journal);
 	journal_write_revoke_records(journal, commit_transaction);
 	jbd_debug(3, "JBD: commit phase 2\n");
@@ -1344,7 +1344,7 @@ static inline void invalidate_remote_inode(struct inode *inode)
 		invalidate_inode_pages(inode->i_mapping);
 }
 extern int invalidate_inode_pages2(struct address_space *mapping);
-extern void write_inode_now(struct inode *, int);
+extern int write_inode_now(struct inode *, int);
 extern int filemap_fdatawrite(struct address_space *);
 extern int filemap_flush(struct address_space *);
 extern int filemap_fdatawait(struct address_space *);
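
The file_fsync() and generic_osync_inode() hunks share one pattern: every
stage is still run, but the first non-zero error code is the one returned
to the caller. A minimal user-space sketch of that pattern follows. The
names sync_stage() and sync_all() are hypothetical stand-ins and are not
kernel functions.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for one sync stage, such as write_inode_now() or
 * sync_blockdev() in the patch. It simply returns the result it is given
 * so that the combining logic can be exercised in user space.
 */
static int sync_stage(int simulated_result)
{
	return simulated_result;
}

/*
 * Run both stages unconditionally and return the first error seen,
 * mirroring the "if (!ret) ret = err;" idiom used in the patch.
 */
int sync_all(int inode_result, int blockdev_result)
{
	int ret, err;

	ret = sync_stage(inode_result);     /* first stage */
	err = sync_stage(blockdev_result);  /* second stage always runs */
	if (!ret)
		ret = err;                  /* keep the earliest failure */
	return ret;
}
```

With this shape, a failure in the second stage is no longer hidden by a
successful first stage, and a failure in the first stage is not overwritten
by a later success: sync_all(0, -EIO) and sync_all(-EIO, 0) both yield -EIO.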