[PATCH] posix locks oops fix
From: Andreas Gruenbacher <agruen@suse.de>
There is a race between unshare_files() and the following steal_locks().
As a consequence, steal_locks() may steal some additional FL_POSIX locks
that don't belong to the current thread. This triggers a BUG in
locks_remove_flock().
In detail, the current thread shares its files struct with other threads.
This causes unshare_files() to associate the current thread with a copy of
its files_struct. The copy shares all file objects with the original files
struct. In the time between unshare_files() and steal_locks(), another
thread creates a new file and a FL_POSIX lock on it. The current thread
gets into steal_locks() and takes over all FL_POSIX locks that refer to the
previous files_struct, including the new lock. We do
put_files_struct(original files_struct). This causes the file handle to
the new file to be closed. We get into locks_remove_posix() and miss the
lock, because its fl_owner field now refers to the new files_struct.
Finally we get into locks_remove_flock(), and stumble upon the lock.
While looking into this bug report I gathered the following data with a
SUSE kernel (oops and LKCD dump from Chris):
kernel BUG at fs/locks.c:1736!
invalid operand: 0000 [#1]
SMP
CPU: 0
EIP: 0060:[<c01844fb>] Tainted: G U
EFLAGS: 00010246 (2.6.5-0-testing)
EIP is at locks_remove_flock+0x8b/0x130
eax: f7b89998 ebx: f61df3fc ecx: f61df354 edx: 00000000
esi: f61df354 edi: f6702b80 ebp: f6179c24 esp: f6179c08
ds: 007b es: 007b ss: 0068
Process owcimomd (pid: 1713, threadinfo=f6178000 task=f66d0d60)
Stack: c1e1fdac c1e1fdac f7fe83c0 00000296 f6702b80 f7fe87c0 f61df354 f6179c3c
c016ce00 f61ddadc f6702b80 00000000 f6703b00 f6179c54 c0168b1f c0000000
0000026f 00000012 f6703b00 f6179c6c c0124ba7 00000001 f6179e5c f6179d88
Call Trace:
[<c016ce00>] __fput+0x30/0x120
[<c0168b1f>] filp_close+0x4f/0x90
[<c0124ba7>] put_files_struct+0x67/0xc0
[<c019d285>] load_elf_binary+0x3f5/0x1596
[<c018a5af>] update_atime+0x9f/0xc0
[<c01478fd>] __generic_file_aio_read+0x1cd/0x200
[<c0145060>] file_read_actor+0x0/0xd0
[<c01784b7>] search_binary_handler+0x97/0x270
[<c017a072>] do_execve+0x172/0x200
[<c0105fb2>] sys_execve+0x32/0x70
[<c0107e21>] sysenter_past_esp+0x52/0x71
Code: 0f 0b c8 06 eb 74 35 c0 eb db b8 00 e0 ff ff 21 e0 8b 10 8b
put_files_struct+0x67 is equivalent to fs/binfmt_elf.c:681 in 2.6.6
current->files == fl->fl_owner
fl->fl_file = 0xf6702b80 (a valid struct file)
current->files =
max_fds=32
max_fdset=1024
next_fd=3
fd=[0xf6927080 0xf6951b80 0xf6951b80 0 ...]
Here's a proposed fix. As a side effect, steal_locks no longer walks the
global list of locks, but only the locks of all open inodes.
What are the reasons (other than historic ones) for not getting rid of
fl_owner and using fl_pid instead, by the way? I think that would clean up
the whole mess with file locks a bit.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Showing
Please register or sign in to comment