Commits · 47e393622bbdd48aa21837eb2c55ee1c359e080c · Kirill Smelkov / linux

12 Apr, 2015 34 commits

aio_run_iocb(): kill dead check · 47e39362

Al Viro authored Mar 31, 2015

We check if ->ki_pos is positive.  However, by that point we have
already done rw_verify_area(), which would have rejected a negative
position unless the file was one of /dev/mem, /dev/kmem or /proc/kcore.
None of those have vectored rw methods, so we would've bailed
out even earlier.

This check had been introduced before rw_verify_area() had been added there
- in fact, it was a subset of checks done on sync paths by rw_verify_area()
(back then the /dev/mem exception didn't exist at all).  The rest of checks
(mandatory locking, etc.) hadn't been added until later.  Unfortunately,
by the time the call of rw_verify_area() got added, the /dev/mem exception
had already appeared, so it wasn't obvious that the older explicit check
downstream had become dead code.  It *is* dead code, though, since the few
files for which the exception applies do not have ->aio_{read,write}() or
->{read,write}_iter() and for them we won't reach that check anyway.

What's more, even if we ever introduce vectored methods for /dev/mem
and friends, they'll have to cope with negative positions anyway, since
readv(2) and writev(2) are using the same checks as read(2) and write(2) -
i.e. rw_verify_area().

Let's bury it.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

47e39362

ioctx_alloc(): remove pointless check · 08397acd

Al Viro authored Mar 31, 2015

Way, way back kiocb used to be picked from arrays, so ioctx_alloc()
checked for multiplication overflow when calculating the size of
such array.  By the time fs/aio.c went into the tree (in 2002) they
were already allocated one-by-one by kmem_cache_alloc(), so that
check had already become pointless.  Let's bury it...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

08397acd
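
A check of the kind described above guards the multiplication used to size an array. A minimal userspace sketch of such a guard follows; `sk_array_size_overflows()` is a hypothetical helper for illustration, not the code that was removed from ioctx_alloc().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): report whether
 * nr * elem_size would wrap around in size_t.  Such a check only
 * matters when the result is used to size one contiguous array;
 * objects allocated one at a time never need it.
 */
static bool sk_array_size_overflows(size_t nr, size_t elem_size)
{
	return elem_size != 0 && nr > SIZE_MAX / elem_size;
}
```

Once each object comes from its own allocation, no product is ever computed, so a guard like this has nothing left to protect.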

lustre: kill unused members of struct vvp_thread_info · 23602adf
Al Viro authored Mar 30, 2015
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
23602adf

expand __fuse_direct_write() in both callers · 812408fb

Al Viro authored Mar 30, 2015

it's actually shorter that way *and* later we'll want iocb in scope
of generic_write_check() caller.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

812408fb

fuse: switch fuse_direct_io_file_operations to ->{read,write}_iter() · 15316263
Al Viro authored Mar 30, 2015
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
15316263
cuse: switch to iov_iter · cfa86a74
Al Viro authored Mar 21, 2015
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
cfa86a74
Merge branch 'for-davem' into for-next · 39c853eb
Al Viro authored Apr 11, 2015

39c853eb
sg_start_req(): use import_iovec() · fdc81f45
Al Viro authored Mar 21, 2015
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
fdc81f45

sg_start_req(): make sure that there's not too many elements in iovec · 451a2886

Al Viro authored Mar 21, 2015

Unfortunately, allowing an arbitrary 16-bit value makes it possible to
overflow the calculation of the total number of pages in
bio_map_user_iov(): we rely on there being no more than PAGE_SIZE members
of the sum in the first loop there.  If that sum wraps around, we end up
allocating too small an array of pointers to pages, and it's easy to
overflow it in the second loop.

X-Coverup: TINC (and there's no lumber cartel either)
Cc: stable@vger.kernel.org # way, way back
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

451a2886
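
The failure mode is a page count summed over many segments. A rough userspace sketch of a bounded version of that first loop is given here; the names, the 4 KiB page size and the cap of 1024 segments are assumptions made for illustration, not values taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define SK_PAGE_SHIFT 12	/* assumed 4 KiB pages */
#define SK_MAX_SEGS   1024	/* assumed cap on the number of segments */

struct sk_seg {
	uint64_t base;		/* start address of the segment */
	uint64_t len;		/* length in bytes */
};

/*
 * Count the pages touched by all segments.  Returns -1 if there are
 * too many segments or a segment wraps the address space.  With the
 * segment count capped, the 64-bit sum cannot wrap: each segment
 * spans at most 2^52 pages and 1024 * 2^52 still fits in int64_t.
 */
static int64_t sk_count_pages(const struct sk_seg *segs, size_t nr_segs)
{
	uint64_t total = 0;

	if (nr_segs > SK_MAX_SEGS)
		return -1;

	for (size_t i = 0; i < nr_segs; i++) {
		uint64_t base = segs[i].base, len = segs[i].len;

		if (len == 0)
			continue;
		if (len - 1 > UINT64_MAX - base)
			return -1;	/* segment wraps around */

		uint64_t first = base >> SK_PAGE_SHIFT;
		uint64_t last = (base + len - 1) >> SK_PAGE_SHIFT;

		total += last - first + 1;
	}
	return (int64_t)total;
}
```

The point of the cap is that the later allocation is sized from this sum, so the sum has to be trustworthy before the second loop starts filling the array.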

blk_rq_map_user(): use import_single_range() · 8f7e885a
Al Viro authored Mar 21, 2015
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
8f7e885a

sg_io(): use import_iovec() · e272b89f

Al Viro authored Mar 21, 2015

... and don't skip access_ok() validation.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

e272b89f

process_vm_access: switch to {compat_,}import_iovec() · 17d17e72
Al Viro authored Mar 21, 2015
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
17d17e72
switch keyctl_instantiate_key_common() to iov_iter · b353a1f7
Al Viro authored Mar 17, 2015
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
b353a1f7
switch {compat_,}do_readv_writev() to {compat_,}import_iovec() · 0504c074
Al Viro authored Mar 21, 2015
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
0504c074
aio_setup_vectored_rw(): switch to {compat_,}import_iovec() · 32a56afa
Al Viro authored Mar 21, 2015
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
32a56afa
vmsplice_to_user(): switch to import_iovec() · 345995fa
Al Viro authored Mar 21, 2015
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
345995fa

kill aio_setup_single_vector() · d4fb392f

Al Viro authored Mar 21, 2015

identical to import_single_range()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

d4fb392f

Merge branch 'iov_iter' into for-next · 36e9f653
Al Viro authored Apr 11, 2015

36e9f653

aio: simplify arguments of aio_setup_..._rw() · a96114fa

Al Viro authored Mar 20, 2015

We don't need req in either of those. We don't need nr_segs in caller.
We don't really need len in caller either - iov_iter_count(&iter) will do.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

a96114fa

aio: lift iov_iter_init() into aio_setup_..._rw() · 4c185ce0

Al Viro authored Mar 20, 2015

the only non-trivial detail is that we do it before rw_verify_area(),
so we'd better cap the length ourselves in aio_setup_single_rw()
case (for vectored case rw_copy_check_uvector() will do that for us).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

4c185ce0

lift iov_iter into {compat_,}do_readv_writev() · ac15ac06

Al Viro authored Mar 20, 2015

get it closer to matching {compat_,}rw_copy_check_uvector().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ac15ac06

Merge branch 'iocb' into for-next · c0fec3a9
Al Viro authored Apr 11, 2015

c0fec3a9

NFS: fix BUG() crash in notify_change() with patch to chown_common() · c1b8940b

Andrew Elble authored Feb 23, 2015

We have observed a BUG() crash in fs/attr.c:notify_change(). The crash
occurs during an rsync into a filesystem that is exported via NFS.

1.) fs/attr.c:notify_change() modifies the caller's version of attr.
2.) 6de0ec00 ("VFS: make notify_change pass ATTR_KILL_S*ID to
    setattr operations") introduced a BUG() restriction such that "no
    function will ever call notify_change() with both ATTR_MODE and
    ATTR_KILL_S*ID set". Under some circumstances though, it will have
    assisted in setting the caller's version of attr to this very
    combination.
3.) 27ac0ffe ("locks: break delegations on any attribute
    modification") introduced code to handle breaking
    delegations. This can result in notify_change() being re-called. attr
    _must_ be explicitly reset to avoid triggering the BUG() established
    in #2.
4.) The path that triggers this is via fs/open.c:chown_common().
    The combination of attr flags set here and in the first call to
    notify_change() along with a later failed break_deleg_wait()
    results in notify_change() being called again via retry_deleg
    without resetting attr.

Solution is to move retry_deleg in chown_common() a bit further up to
ensure attr is completely reset.

There are other places where this seemingly could occur, such as
fs/utimes.c:utimes_common(), but the attr flags are not initially
set in such a way to trigger this.

Fixes: 27ac0ffe ("locks: break delegations on any attribute modification")
Reported-by: Eric Meddaugh <etmsys@rit.edu>
Tested-by: Eric Meddaugh <etmsys@rit.edu>
Signed-off-by: Andrew Elble <aweits@rit.edu>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

c1b8940b
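
The interaction in points 1.) to 4.) can be modelled in a few lines of userspace C. This is only a sketch under simplifying assumptions: every `sk_` name is hypothetical, a return value of -1 stands in for the BUG(), and a flag argument stands in for where the retry label sits relative to the attr setup.

```c
enum { SK_MODE = 1u << 0, SK_KILL = 1u << 1 };

struct sk_attr {
	unsigned valid;
};

/*
 * Model of the callee: refuses MODE together with KILL on entry
 * (-1 stands in for BUG()), modifies the caller's attr, and asks for
 * a retry (returns 1) while *retries_left is non-zero.
 */
static int sk_notify_change(struct sk_attr *attr, int *retries_left)
{
	if ((attr->valid & SK_MODE) && (attr->valid & SK_KILL))
		return -1;
	if (attr->valid & SK_KILL)
		attr->valid |= SK_MODE;	/* changes the caller's copy */
	if (*retries_left > 0) {
		(*retries_left)--;
		return 1;
	}
	return 0;
}

/*
 * Model of the caller.  reset_on_retry != 0 corresponds to placing
 * the retry label above the attr setup; 0 corresponds to placing it
 * below, so the second call sees the already modified attr.
 */
static int sk_chown(int reset_on_retry, int retries)
{
	struct sk_attr attr;
	int err;

	attr.valid = SK_KILL;
retry:
	if (reset_on_retry)
		attr.valid = SK_KILL;
	err = sk_notify_change(&attr, &retries);
	if (err == 1)
		goto retry;
	return err;
}
```

With no retry both variants succeed, which matches the observation that the crash needs a failed delegation break to show up.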

dcache: return -ESTALE not -EBUSY on distributed fs race · 3d330dc1

J. Bruce Fields authored Feb 10, 2015

On a distributed filesystem it's possible for lookup to discover that a
directory it just found is already cached elsewhere in the directory
hierarchy.  The dcache won't let us keep the directory in both places,
so we have to move the dentry to the new location from the place we
previously had it cached.

If the parent has changed, then this requires all the same locks as we'd
need to do a cross-directory rename.  But we're already in lookup
holding one parent's i_mutex, so it's too late to acquire those locks in
the right order.

The (unreliable) solution in __d_unalias is to trylock() the required
locks and return -EBUSY if it fails.

I see no particular reason for returning -EBUSY, and -ESTALE is already
the result of some other lookup races on NFS.  I think -ESTALE is the
more helpful error return.  It also allows us to take advantage of the
logic Jeff Layton added in c6a94284 "vfs: fix renameat to retry on
ESTALE errors" and ancestors, which hopefully resolves some of these
errors before they're returned to userspace.

I can reproduce these cases using NFS with:

	ssh root@$client '
		mount -olookupcache=pos '$server':'$export' /mnt/
		mkdir /mnt/TO
		mkdir /mnt/DIR
		touch /mnt/DIR/test.txt
		while true; do
			strace -e open cat /mnt/DIR/test.txt 2>&1 | grep EBUSY
		done
	'
	ssh root@$server '
		while true; do
			mv $export/DIR $export/TO/DIR
			mv $export/TO/DIR $export/DIR
		done
	'

It also helps to add some other concurrent use of the directory on the
client (e.g., "ls /mnt/TO").  And you can replace the server-side mv's
by client-side mv's that are repeatedly killed.  (If the client is
interrupted while waiting for the RENAME response then it's left with a
dentry that has to go under one parent or the other, but it doesn't yet
know which.)
Acked-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

3d330dc1
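
The trylock-or-fail pattern described above can be sketched in userspace C. The lock type and function names here are hypothetical stand-ins built on C11 atomics, not the kernel's mutexes or the actual body of __d_unalias.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical minimal lock, standing in for a kernel mutex. */
struct sk_lock {
	atomic_flag held;
};

static bool sk_trylock(struct sk_lock *l)
{
	/* test_and_set returns the previous value: false means we got it */
	return !atomic_flag_test_and_set(&l->held);
}

static void sk_unlock(struct sk_lock *l)
{
	atomic_flag_clear(&l->held);
}

/*
 * We cannot block on these locks (wrong order), so try each one and
 * give up with -ESTALE if either is contended, releasing whatever was
 * already taken.  *moved is set only when both locks were obtained.
 */
static int sk_move_if_uncontended(struct sk_lock *rename_lock,
				  struct sk_lock *old_parent_lock,
				  int *moved)
{
	if (!sk_trylock(rename_lock))
		return -ESTALE;
	if (!sk_trylock(old_parent_lock)) {
		sk_unlock(rename_lock);
		return -ESTALE;
	}
	*moved = 1;	/* the dentry move would happen here */
	sk_unlock(old_parent_lock);
	sk_unlock(rename_lock);
	return 0;
}
```

Returning an error the caller already knows how to retry on is what makes the unreliable trylock tolerable.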

NTFS: Version 2.1.32 - Update file write from aio_write to write_iter. · a632f559
Anton Altaparmakov authored Mar 11, 2015
```
Signed-off-by: Anton Altaparmakov <anton@tuxera.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
a632f559

VFS: Add iov_iter_fault_in_multipages_readable() · 171a0203

Anton Altaparmakov authored Mar 11, 2015

Similar to iov_iter_fault_in_readable() but differs in that it is
not limited to faulting in the first iovec and instead faults in
"bytes" bytes iterating over the iovecs as necessary.

Also, instead of only faulting in the first and last page of the
range, all pages are faulted in.

This function is needed by NTFS when it does multi page file
writes.
Signed-off-by: Anton Altaparmakov <anton@tuxera.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

171a0203
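
As a userspace analogy only, the walk described above (every page, across as many iovecs as needed to cover "bytes" bytes) might look like the sketch below. `sk_touch_all_pages()` is a hypothetical name, the 4 KiB page size is an assumption, and reading a byte stands in for the kernel's fault-in primitive.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

#define SK_PAGE_SIZE 4096u	/* assumed page size */

/*
 * Read one byte from every page covered by the first `bytes` bytes
 * of the iovec array, moving on to later iovecs as needed.  Returns
 * the number of pages touched.
 */
static size_t sk_touch_all_pages(const struct iovec *iov, size_t nr,
				 size_t bytes)
{
	size_t touched = 0;

	for (size_t i = 0; i < nr && bytes > 0; i++) {
		const volatile unsigned char *p = iov[i].iov_base;
		size_t len = iov[i].iov_len < bytes ? iov[i].iov_len : bytes;
		const volatile unsigned char *end = p + len;

		while (p < end) {
			size_t step;

			(void)*p;	/* touch this page */
			touched++;
			/* distance to the next page boundary */
			step = SK_PAGE_SIZE - ((uintptr_t)p % SK_PAGE_SIZE);
			if (step >= (size_t)(end - p))
				break;
			p += step;
		}
		bytes -= len;
	}
	return touched;
}
```

A variant that only probed the first and last page of the first iovec would miss the middle pages and every later segment, which is the gap the new helper is described as closing.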

drop bogus check in file_open_root() · e5b811e3

Al Viro authored Mar 08, 2015

For one thing, LOOKUP_DIRECTORY will be dealt with in do_last().
For another, name can be an empty string, but not NULL - no callers
pass that and it would oops immediately if they did.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

e5b811e3

switch security_inode_getattr() to struct path * · 3f7036a0
Al Viro authored Mar 08, 2015
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
3f7036a0
constify tomoyo_realpath_from_path() · 22473862
Al Viro authored Mar 08, 2015
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
22473862
whack-a-mole: there's no point doing set_fs(USER_DS) in sigframe setup · 74008b36
Al Viro authored Feb 23, 2015
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
74008b36

whack-a-mole: no need to set_fs(USER_DS) in {start,flush}_thread() · a555ad45

Al Viro authored Feb 23, 2015

flush_old_exec() has already done that.  Back in 2011 a bunch of
instances like that had been kicked out, but that hadn't taken
care of then-out-of-tree architectures, obviously, and they served
as a reinfection vector...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

a555ad45

remove incorrect comment in lookup_one_len() · 9e7543e9
Al Viro authored Feb 23, 2015
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
9e7543e9
namei.c: fold do_path_lookup() into both callers · 74eb8cc5
Al Viro authored Feb 23, 2015
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
74eb8cc5

kill struct filename.separate · fd2f7cb5

Al Viro authored Feb 22, 2015

just make const char iname[] the last member and compare name->name with
name->iname instead of checking name->separate

We need to make sure that an out-of-line name doesn't end up allocated
adjacent to the struct filename referring to it; fortunately, that's easy
to achieve - just allocate that struct filename with one byte in ->iname[],
so that ->iname[0] will be inside the same object and thus have an address
different from that of the out-of-line name
[spotted by Boqun Feng <boqun.feng@gmail.com>]
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

fd2f7cb5
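
The layout trick can be illustrated in userspace C. This is a sketch with hypothetical names and an assumed inline limit of 32 bytes; it mirrors the idea (flexible array as last member, pointer comparison instead of a flag) rather than the kernel's struct filename.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SK_INLINE_MAX 32	/* assumed inline limit, for illustration */

struct sk_name {
	const char *name;	/* points at iname[] or at a separate buffer */
	char iname[];		/* flexible array, must be the last member */
};

static bool sk_is_inline(const struct sk_name *n)
{
	/* no separate flag: the pointer itself tells us */
	return n->name == n->iname;
}

static struct sk_name *sk_getname(const char *s)
{
	size_t len = strlen(s) + 1;
	struct sk_name *n;

	if (len <= SK_INLINE_MAX) {
		n = malloc(sizeof(*n) + len);
		if (!n)
			return NULL;
		memcpy(n->iname, s, len);
		n->name = n->iname;
	} else {
		char *copy;

		/*
		 * Reserve one byte of iname[] so &iname[0] lies inside
		 * this object and can never equal the address of the
		 * separately allocated name.
		 */
		n = malloc(sizeof(*n) + 1);
		if (!n)
			return NULL;
		copy = malloc(len);
		if (!copy) {
			free(n);
			return NULL;
		}
		memcpy(copy, s, len);
		n->iname[0] = '\0';
		n->name = copy;
	}
	return n;
}

static void sk_putname(struct sk_name *n)
{
	if (!n)
		return;
	if (!sk_is_inline(n))
		free((void *)n->name);
	free(n);
}
```

Without the reserved byte, an out-of-line buffer placed right after a zero-length iname[] could compare equal to it and be misread as inline.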

11 Apr, 2015 3 commits

new helper: msg_data_left() · 01e97e65

Al Viro authored Dec 15, 2014

convert open-coded instances
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

01e97e65

Merge remote-tracking branch 'dh/afs' into for-davem · a2dd3793
Al Viro authored Apr 11, 2015

a2dd3793

get rid of the size argument of sock_sendmsg() · d8725c86

Al Viro authored Dec 11, 2014

it's equal to iov_iter_count(&msg->msg_iter) in all cases
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

d8725c86

09 Apr, 2015 3 commits

switch kernel_sendmsg() and kernel_recvmsg() to iov_iter_kvec() · 6aa24814

Al Viro authored Mar 21, 2015

For kernel_sendmsg() that eliminates the need to play with set_fs();
for kernel_recvmsg() it does *not* - a couple of callers are using
it with non-NULL ->msg_control, which would be treated as userland
address on recvmsg side of things.

In all cases we are really setting a kvec-backed iov_iter, though.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

6aa24814

net: switch importing msghdr from userland to {compat_,}import_iovec() · da184284
Al Viro authored Mar 21, 2015
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
da184284
net: switch sendto() and recvfrom() to import_single_range() · 602bd0e9
Al Viro authored Mar 21, 2015
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
602bd0e9