- 10 Feb, 2015 40 commits
-
-
Kirill A. Shutemov authored
remap_file_pages(2) was invented to be able efficiently map parts of huge file into limited 32-bit virtual address space such as in database workloads. Nonlinear mappings are pain to support and it seems there's no legitimate use-cases nowadays since 64-bit systems are widely available. Let's drop it and get rid of all these special-cased code. The patch replaces the syscall with emulation which creates new VMA on each remap_file_pages(), unless they it can be merged with an adjacent one. I didn't find *any* real code that uses remap_file_pages(2) to test emulation impact on. I've checked Debian code search and source of all packages in ALT Linux. No real users: libc wrappers, mentions in strace, gdb, valgrind and this kind of stuff. There are few basic tests in LTP for the syscall. They work just fine with emulation. To test performance impact, I've written small test case which demonstrate pretty much worst case scenario: map 4G shmfs file, write to begin of every page pgoff of the page, remap pages in reverse order, read every page. The test creates 1 million of VMAs if emulation is in use, so I had to set vm.max_map_count to 1100000 to avoid -ENOMEM. Before: 23.3 ( +- 4.31% ) seconds After: 43.9 ( +- 0.85% ) seconds Slowdown: 1.88x I believe we can live with that. Test case: #define _GNU_SOURCE #include <assert.h> #include <stdlib.h> #include <stdio.h> #include <sys/mman.h> #define MB (1024UL * 1024) #define SIZE (4096 * MB) int main(int argc, char **argv) { unsigned long *p; long i, pass; for (pass = 0; pass < 10; pass++) { p = mmap(NULL, SIZE, PROT_READ|PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0); if (p == MAP_FAILED) { perror("mmap"); return -1; } for (i = 0; i < SIZE / 4096; i++) p[i * 4096 / sizeof(*p)] = i; for (i = 0; i < SIZE / 4096; i++) { if (remap_file_pages(p + i * 4096 / sizeof(*p), 4096, 0, (SIZE - 4096 * (i + 1)) >> 12, 0)) { perror("remap_file_pages"); return -1; } } for (i = SIZE / 4096 - 1; i >= 0; i--) assert(p[i * 4096 / sizeof(*p)] == SIZE / 4096 - i - 1); munmap(p, SIZE); } return 0; } [akpm@linux-foundation.org: fix spello] [sasha.levin@oracle.com: initialize populate before usage] [sasha.levin@oracle.com: grab file ref to prevent race while mmaping] Signed-off-by: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Dave Jones <davej@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Armin Rigo <arigo@tunes.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrew Morton authored
CONFIG_COMPACTION=y, CONFIG_DEBUG_FS=n: mm/vmstat.c:690: warning: 'frag_start' defined but not used mm/vmstat.c:702: warning: 'frag_next' defined but not used mm/vmstat.c:710: warning: 'frag_stop' defined but not used mm/vmstat.c:715: warning: 'walk_zones_in_node' defined but not used It's all a bit of a tangly mess and it's unclear why CONFIG_COMPACTION figures in there at all. Move frag_start/frag_next/frag_stop and migratetype_names[] into the existing CONFIG_PROC_FS block. walk_zones_in_node() gets a special ifdef. Also move the #include lines up to where #include lines live. [axel.lin@ingics.com: fix build error when !CONFIG_PROC_FS] Signed-off-by: Axel Lin <axel.lin@ingics.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Tested-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Vaishali Thakkar authored
Here, free memory is allocated using kmem_cache_zalloc. So, use kmem_cache_free instead of kfree. This is done using Coccinelle and semantic patch used is as follows: @@ expression x,E,c; @@ x = \(kmem_cache_alloc\|kmem_cache_zalloc\|kmem_cache_alloc_node\)(c,...) ... when != x = E when != &x ?-kfree(x) +kmem_cache_free(c,x) Signed-off-by: Vaishali Thakkar <vthakkar1994@gmail.com> Acked-by: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Acked-by: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Kim Phillips authored
Signed-off-by: Kim Phillips <kim.phillips@freescale.com> Acked-by: Christoph Lameter <cl@linux.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Joonsoo Kim authored
compound_head() is implemented with assumption that there would be race condition when checking tail flag. This assumption is only true when we try to access arbitrary positioned struct page. The situation that virt_to_head_page() is called is different case. We call virt_to_head_page() only in the range of allocated pages, so there is no race condition on tail flag. In this case, we don't need to handle race condition and we can reduce overhead slightly. This patch implements compound_head_fast() which is similar with compound_head() except tail flag race handling. And then, virt_to_head_page() uses this optimized function to improve performance. I saw 1.8% win in a fast-path loop over kmem_cache_alloc/free, (14.063 ns -> 13.810 ns) if target object is on tail page. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Joonsoo Kim authored
We had to insert a preempt enable/disable in the fastpath a while ago in order to guarantee that tid and kmem_cache_cpu are retrieved on the same cpu. It is the problem only for CONFIG_PREEMPT in which scheduler can move the process to other cpu during retrieving data. Now, I reach the solution to remove preempt enable/disable in the fastpath. If tid is matched with kmem_cache_cpu's tid after tid and kmem_cache_cpu are retrieved by separate this_cpu operation, it means that they are retrieved on the same cpu. If not matched, we just have to retry it. With this guarantee, preemption enable/disable isn't need at all even if CONFIG_PREEMPT, so this patch removes it. I saw roughly 5% win in a fast-path loop over kmem_cache_alloc/free in CONFIG_PREEMPT. (14.821 ns -> 14.049 ns) Below is the result of Christoph's slab_test reported by Jesper Dangaard Brouer. * Before Single thread testing ===================== 1. Kmalloc: Repeatedly allocate then free test 10000 times kmalloc(8) -> 49 cycles kfree -> 62 cycles 10000 times kmalloc(16) -> 48 cycles kfree -> 64 cycles 10000 times kmalloc(32) -> 53 cycles kfree -> 70 cycles 10000 times kmalloc(64) -> 64 cycles kfree -> 77 cycles 10000 times kmalloc(128) -> 74 cycles kfree -> 84 cycles 10000 times kmalloc(256) -> 84 cycles kfree -> 114 cycles 10000 times kmalloc(512) -> 83 cycles kfree -> 116 cycles 10000 times kmalloc(1024) -> 81 cycles kfree -> 120 cycles 10000 times kmalloc(2048) -> 104 cycles kfree -> 136 cycles 10000 times kmalloc(4096) -> 142 cycles kfree -> 165 cycles 10000 times kmalloc(8192) -> 238 cycles kfree -> 226 cycles 10000 times kmalloc(16384) -> 403 cycles kfree -> 264 cycles 2. Kmalloc: alloc/free test 10000 times kmalloc(8)/kfree -> 68 cycles 10000 times kmalloc(16)/kfree -> 68 cycles 10000 times kmalloc(32)/kfree -> 69 cycles 10000 times kmalloc(64)/kfree -> 68 cycles 10000 times kmalloc(128)/kfree -> 68 cycles 10000 times kmalloc(256)/kfree -> 68 cycles 10000 times kmalloc(512)/kfree -> 74 cycles 10000 times kmalloc(1024)/kfree -> 75 cycles 10000 times kmalloc(2048)/kfree -> 74 cycles 10000 times kmalloc(4096)/kfree -> 74 cycles 10000 times kmalloc(8192)/kfree -> 75 cycles 10000 times kmalloc(16384)/kfree -> 510 cycles * After Single thread testing ===================== 1. Kmalloc: Repeatedly allocate then free test 10000 times kmalloc(8) -> 46 cycles kfree -> 61 cycles 10000 times kmalloc(16) -> 46 cycles kfree -> 63 cycles 10000 times kmalloc(32) -> 49 cycles kfree -> 69 cycles 10000 times kmalloc(64) -> 57 cycles kfree -> 76 cycles 10000 times kmalloc(128) -> 66 cycles kfree -> 83 cycles 10000 times kmalloc(256) -> 84 cycles kfree -> 110 cycles 10000 times kmalloc(512) -> 77 cycles kfree -> 114 cycles 10000 times kmalloc(1024) -> 80 cycles kfree -> 116 cycles 10000 times kmalloc(2048) -> 102 cycles kfree -> 131 cycles 10000 times kmalloc(4096) -> 135 cycles kfree -> 163 cycles 10000 times kmalloc(8192) -> 238 cycles kfree -> 218 cycles 10000 times kmalloc(16384) -> 399 cycles kfree -> 262 cycles 2. Kmalloc: alloc/free test 10000 times kmalloc(8)/kfree -> 65 cycles 10000 times kmalloc(16)/kfree -> 66 cycles 10000 times kmalloc(32)/kfree -> 65 cycles 10000 times kmalloc(64)/kfree -> 66 cycles 10000 times kmalloc(128)/kfree -> 66 cycles 10000 times kmalloc(256)/kfree -> 71 cycles 10000 times kmalloc(512)/kfree -> 72 cycles 10000 times kmalloc(1024)/kfree -> 71 cycles 10000 times kmalloc(2048)/kfree -> 71 cycles 10000 times kmalloc(4096)/kfree -> 71 cycles 10000 times kmalloc(8192)/kfree -> 65 cycles 10000 times kmalloc(16384)/kfree -> 511 cycles Most of the results are better than before. Note that this change slightly worses performance in !CONFIG_PREEMPT, roughly 0.3%. Implementing each case separately would help performance, but, since it's so marginal, I didn't do that. This would help maintanance since we have same code for all cases. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Christoph Lameter <cl@linux.com> Tested-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Dmitry Monakhov authored
__generic_block_fiemap may spin very long time for large sparse files. Without this patch an unprivileged user may abuse system resources simply by spawning a vast number of unkilable busyloops (works on ext2/ext3): truncate --size 1T test for ((i=0;i<1024;i++)) do filefrag test > /dev/null & done Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Srinivas Eeda authored
A tiny race between BAST and unlock message causes the NULL dereference. A node sends an unlock request to master and receives a response. Before processing the response it receives a BAST from the master. Since both requests are processed by different threads it creates a race. While the BAST is being processed, lock can get freed by unlock code. This patch makes bast to return immediately if lock is found but unlock is pending. The code should handle this race. We also have to fix master node to skip sending BAST after receiving unlock message. Below is the crash stack BUG: unable to handle kernel NULL pointer dereference at 0000000000000048 IP: o2dlm_blocking_ast_wrapper+0xd/0x16 dlm_do_local_bast+0x8e/0x97 [ocfs2_dlm] dlm_proxy_ast_handler+0x838/0x87e [ocfs2_dlm] o2net_process_message+0x395/0x5b8 [ocfs2_nodemanager] o2net_rx_until_empty+0x762/0x90d [ocfs2_nodemanager] worker_thread+0x14d/0x1ed [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com> Reviewed-by: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Joseph Qi <joseph.qi@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
alex chen authored
In ocfs2_dentry_convert_worker, we should prune the dcache before deleting the dentry of directory, otherwise, in the following cases the inode of directory will still remain in orphan directory until the device being umounted. Mount point: /mnt/ocfs2 Node A Node B mkdir /mnt/ocfs2/testdir ocfs2_mkdir ->ocfs2_mknod ->ocfs2_dentry_attach_lock ->ocfs2_dentry_lock(dentry, 0) ... ... touch /mnt/ocfs2/testdir/testfile unlink /mnt/test/testdir/testfile rmdir /mnt/ocfs2/testdir ocfs2_unlink ->ocfs2_remote_dentry_delete ->ocfs2_dentry_lock(dentry, 1) ... ... ... ... ocfs2_downconvert_thread ->ocfs2_unblock_lock ->ocfs2_dentry_convert_worker ->ocfs2_find_local_alias ->dget_dlock ->d_delete Here the dentry can not be released because the children's dentry is negative but still exist. Finally, this inode will still remain in orphan directory until its children are destroyed. So before deleting dentry of directory, we should prune the dcache to remove unused children of the parent dentry by shrink_dcache_parent(). Signed-off-by: Alex Chen <alex.chen@huawei.com> Reviewed-by: Joseph Qi <joseph.qi@huawei.com> Reviewed-by: joyce.xue <xuejiufei@huawei.com> Reviewed-by: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Fabian Frederick authored
resv_lock is only used in reservations.c Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Daeseok Youn authored
Signed-off-by: Daeseok Youn <daeseok.youn@gmail.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Daeseok Youn authored
mlog_errno() is called twice when some functions are failed. Signed-off-by: Daeseok Youn <daeseok.youn@gmail.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Daeseok Youn authored
Signed-off-by: Daeseok Youn <daeseok.youn@gmail.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Dan Carpenter authored
Smatch complains that, if o2net_tx_can_proceed() returns false, then "sc" and "ret" are uninialized or maybe we are re-using the data from previous iteration. I do not know if we can hit this bug in real life but checking the return value is harmless and we may as well silence the static checker warning. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Jan Kara authored
The assigned value is never used. Coverity-id 1226847. Signed-off-by: Jan Kara <jack@suse.cz> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
alex chen authored
Add a mount option to support JBD2 feature: JBD2_FEATURE_INCOMPAT_ASYNC_COMMIT. When this feature is opened, journal commit block can be written to disk without waiting for descriptor blocks, which can improve journal commit performance. This option will enable 'journal_checksum' internally. Using the fs_mark benchmark, using journal_async_commit shows a 50% improvement, the files per second go up from 215.2 to 317.5. test script: fs_mark -d /mnt/ocfs2/ -s 10240 -n 1000 default: FSUse% Count Size Files/sec App Overhead 0 1000 10240 215.2 17878 with journal_async_commit option: FSUse% Count Size Files/sec App Overhead 0 1000 10240 317.5 17881 Signed-off-by: Alex Chen <alex.chen@huawei.com> Signed-off-by: Weiwei Wang <wangww631@huawei.comm> Reviewed-by: Joseph Qi <joseph.qi@huawei.com> Reviewed-by: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
alex chen authored
Similar to ocfs2_write_end_nolock() which is metioned at commit 136f49b9 ("ocfs2: fix journal commit deadlock"), we should unlock pages before ocfs2_commit_trans() in ocfs2_convert_inline_data_to_extents. Otherwise, it will cause a deadlock with journal commit threads. Signed-off-by: Alex Chen <alex.chen@huawei.com> Reviewed-by: Joseph Qi <joseph.qi@huawei.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Rickard Strandqvist authored
Remove dlm_joined() that is not used anywhere. This was partially found by using a static code analysis program called cppcheck. Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Rickard Strandqvist authored
Remove ol_dqblk_file_block() that is not used anywhere. This was partially found by using a static code analysis program called cppcheck. Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Rickard Strandqvist authored
Remove ocfs2_xattr_bucket_get_val() that is not used anywhere. This was partially found by using a static code analysis program called cppcheck. Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
alex chen authored
Use snprintf format specifier "%lu" instead of "%ld" for argument of type 'unsigned long'. Signed-off-by: Alex Chen <alex.chen@huawei.com> Reviewed-by: Joseph Qi <joseph.qi@huawei.com> Reviewed-by: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Junxiao Bi authored
O2NET_CONN_IDLE_DELAY is not defined, connection attempts will not be canceled due to timeout. Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Junxiao Bi authored
Variable "why" is not yet initialized at line 615, fix it. Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Fabian Frederick authored
else is unnecessary after return. Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Xue jiufei authored
When the recovery master is down, the owner of $RECOVERY calls dlm_do_local_recovery_cleanup() to prune any $RECOVERY entries for dead nodes. The lock is in the granted list and the refcount must be 2. We should put twice to remove this lock. Otherwise, it will lead to a memory leak. Signed-off-by: joyce.xue <xuejiufei@huawei.com> Reported-by: yangwenfang <vicky.yangwenfang@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Kevin Cernekee authored
Defining these macros way down in arch/sh/.../irq.c doesn't cause kernel/irq/generic-chip.c to use them. As far as I can tell this code has no effect. Fixes: 332fd7c4 ("genirq: Generic chip: Change irq_reg_{readl,writel} arguments") Signed-off-by: Kevin Cernekee <cernekee@gmail.com> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> (cpp/asm comparison) Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jason Cooper <jason@lakedaemon.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Rob Landley authored
What sh4 actually wants is HAVE_PATA_PLATFORM, so select that instead. Signed-off-by: Rob Landley <rob@landley.net> Acked-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Jan Kara authored
Commit e9fd702a ("audit: convert audit watches to use fsnotify instead of inotify") broke handling of renames in audit. Audit code wants to update inode number of an inode corresponding to watched name in a directory. When something gets renamed into a directory to a watched name, inotify previously passed moved inode to audit code however new fsnotify code passes directory inode where the change happened. That confuses audit and it starts watching parent directory instead of a file in a directory. This can be observed for example by doing: cd /tmp touch foo bar auditctl -w /tmp/foo touch foo mv bar foo touch foo In audit log we see events like: type=CONFIG_CHANGE msg=audit(1423563584.155:90): auid=1000 ses=2 op="updated rules" path="/tmp/foo" key=(null) list=4 res=1 ... type=PATH msg=audit(1423563584.155:91): item=2 name="bar" inode=1046884 dev=08:0 2 mode=0100644 ouid=0 ogid=0 rdev=00:00 nametype=DELETE type=PATH msg=audit(1423563584.155:91): item=3 name="foo" inode=1046842 dev=08:0 2 mode=0100644 ouid=0 ogid=0 rdev=00:00 nametype=DELETE type=PATH msg=audit(1423563584.155:91): item=4 name="foo" inode=1046884 dev=08:0 2 mode=0100644 ouid=0 ogid=0 rdev=00:00 nametype=CREATE ... and that's it - we see event for the first touch after creating the audit rule, we see events for rename but we don't see any event for the last touch. However we start seeing events for unrelated stuff happening in /tmp. Fix the problem by passing moved inode as data in the FS_MOVED_FROM and FS_MOVED_TO events instead of the directory where the change happens. This doesn't introduce any new problems because noone besides audit_watch.c cares about the passed value: fs/notify/fanotify/fanotify.c cares only about FSNOTIFY_EVENT_PATH events. fs/notify/dnotify/dnotify.c doesn't care about passed 'data' value at all. fs/notify/inotify/inotify_fsnotify.c uses 'data' only for FSNOTIFY_EVENT_PATH. kernel/audit_tree.c doesn't care about passed 'data' at all. kernel/audit_watch.c expects moved inode as 'data'. Fixes: e9fd702a ("audit: convert audit watches to use fsnotify instead of inotify") Signed-off-by: Jan Kara <jack@suse.cz> Cc: Paul Moore <paul@paul-moore.com> Cc: Eric Paris <eparis@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Zhang Zhen authored
The inotify interface has changed a lot. The user interface was too old, and the kernel interface was removed by Eric Paris in commit: 2dfc1cae inotify: remove inotify in kernel interface. Signed-off-by: Zhang Zhen <zhenzhang.zhang@huawei.com> Cc: Wang Kai <morgan.wang@huawei.com> Cc: Eric Paris <eparis@parisplace.org> Cc: Robert Love <robert.w.love@intel.com> Cc: John McCutchan <john@johnmccutchan.com> Cc: Heinrich Schuchardt <xypron.glpk@gmx.de> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Lino Sanfilippo authored
Currently FAN_ONDIR is always set on a mark's ignored mask when the event mask is extended without FAN_MARK_ONDIR being set. This may result in events for directories being ignored unexpectedly for call sequences like fanotify_mark(fd, FAN_MARK_ADD, FAN_OPEN | FAN_ONDIR , AT_FDCWD, "dir"); fanotify_mark(fd, FAN_MARK_ADD, FAN_CLOSE, AT_FDCWD, "dir"); Also FAN_MARK_ONDIR is only honored when adding events to a mark's mask, but not for event removal. Fix both issues by not setting FAN_ONDIR implicitly on the ignore mask any more. Instead treat FAN_ONDIR as any other event flag and require FAN_MARK_ONDIR to be set by the user for both event mask and ignore mask. Furthermore take FAN_MARK_ONDIR into account when set for event removal. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Eric Paris <eparis@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Lino Sanfilippo authored
If removing bits from a mark's ignored mask, the concerning inodes/vfsmounts mask is not affected. So don't recalculate it. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Eric Paris <eparis@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Lino Sanfilippo authored
In fanotify_mark_remove_from_mask() a mark is destroyed if only one of both bitmasks (mask or ignored_mask) of a mark is cleared. However the other mask may still be set and contain information that should not be lost. So only destroy a mark if both masks are cleared. Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Eric Paris <eparis@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Kirill A. Shutemov authored
After commit 944d9fec ("hugetlb: add support for gigantic page allocation at runtime") we can allocate 1G pages at runtime if CMA is enabled. Let's register 1G pages into hugetlb even if the user hasn't requested them explicitly at boot time with hugepagesz=1G. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/tj/libataLinus Torvalds authored
Pull libata changes from Tejun Heo: "Mostly driver-specific changes. Nothing too noteworthy. This pull request contains three merges from for-3.19-fixes. The first two are to pull ahci_xgene and sata_dwc_460ex fix commits which are depended upon by later changes. The last one is to pull in a fix patch which missed the v3.19-rc window" * 'for-3.20' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: (24 commits) ahci_xgene: Fix the dma state machine lockup for the ATA_CMD_SMART PIO mode command. ata: libahci: Use of_platform_device_create only if supported sata_mv: Delete unnecessary checks before the function call "phy_power_off" ata: Delete unnecessary checks before the function call "pci_dev_put" ata: pata_platform: fix owner module reference mismatch for scsi host ata: ahci_platform: fix owner module reference mismatch for scsi host pata_pdc2027x: Use 64-bit timekeeping ata: libahci: Fix devres cleanup on failure ata: libahci: Allow using multiple regulators Documentation: bindings: Add the regulator property to the sub-nodes AHCI bindings ata: libahci: Clean-up the ahci_platform_en/disable_phys functions sata_rcar: extend PM methods sata_dwc_460ex: disable compilation on ARM and ARM64 ata: libata-core: Remove unused function sata_dwc_460ex: convert to devm_kzalloc in ->probe() sata_dwc_460ex: remove extra message sata_dwc_460ex: use np local variable in ->probe() sata_dwc_460ex: fix most of the sparse warnings sata_dwc_460ex: enable COMPILE_TEST for the driver sata_dwc_460ex: remove redundant dev_set_drvdata ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqLinus Torvalds authored
Pull workqueue changes from Tejun Heo: "One cosmetic cleanup patch" * 'for-3.20' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue.h: remove loops of single statement macros
-
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroupLinus Torvalds authored
Pull cgroup changes from Tejun Heo: "Three mostly trivial patches. The biggest change is that blkio is now initialized before memcg which will be needed to make memcg and blkcg cooperate on writeback IOs" * 'for-3.20' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: add dummy css_put() for !CONFIG_CGROUPS cgroup: reorder SUBSYS(blkio) in cgroup_subsys.h Update of Documentation/cgroups/00-INDEX
-
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpuLinus Torvalds authored
Pull percpu changes from Tejun Heo: "Nothing too interesting. One cleanup patch and another to add a trivial state check function" * 'for-3.20' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu_ref: implement percpu_ref_is_dying() percpu_ref: remove unnecessary ACCESS_ONCE() in percpu_ref_tryget_live()
-
git://git.kernel.org/pub/scm/linux/kernel/git/bp/bpLinus Torvalds authored
Pull EDAC updates from Borislav Petkov: - a new synopsys_edac.c driver for the Synopsys DDR controller, from Punnaiah Choudary Kalluri. - minor fixes/cleanups all around - Mauro and I are adding the repo URLs to MAINTAINERS since people asked for trees to base upcoming work on. * tag 'edac_for_3.20' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: EDAC: Add repo URLs to MAINTAINERS EDAC, mv64x60_edac: Fix an error code in probe() EDAC: edac_mc_sysfs: Make stuff static EDAC: Fix the leak of mci->bus->name when bus_register fails edac: i5100_edac: Remove unused i5100_recmema_dm_buf_id EDAC, synps: Add EDAC support for zynq ddr ecc controller mpc85xx_edac: Fix a typo in comments EDAC, mce_amd_inj: Fix sparse non static symbol warning
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 RAS update from Ingo Molnar: "The changes in this cycle were: - allow mmcfg access to APEI error injection handlers - improve MCE error messages - smaller cleanups" * 'x86-ras-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, mce: Fix sparse errors x86, mce: Improve timeout error messages ACPI, EINJ: Enhance error injection tolerance level
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 mm cleanups from Ingo Molnar: "Two cleanups: simplify parse_setup_data() and sanitize_e820_map() usage" * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, e820: Clean up sanitize_e820_map() users x86, setup: Let early_memremap() handle page alignment
-