[PATCH] Fix inode size accounting race
Since Jan removed the lock_kernel()s in inode_add_bytes() and inode_sub_bytes(), these functions have been racy. One problematic workload has been discovered in which concurrent writepage and truncate on SMP quickly causes i_blocks to go negative. writepage() does not take i_sem, and it seems that for ext2, there are no other locks in force when inode_add_bytes() is called. Putting the BKL back in there is not acceptable. To fix this race I have added a new spinlock "i_lock" to the inode. That lock is presently used to protect i_bytes and i_blocks. We could use it to protect i_size as well. The splitting of the used disk space into i_blocks and i_bytes is silly - we should nuke all that and just have a bare loff_t i_usedbytes. Later.
Showing
Please register or sign in to comment