- 22 Dec, 2014 1 commit
-
-
Nakajima Akira authored
In spite of different file type, if file is same name and same inode number, old inode cache is used. This causes that you can not cd directory, can not cat SymbolicLink. So this patch is that if file type is different, return error. Reproducible sample : 1. create file 'a' at cifs client. 2. repeat rm and mkdir 'a' 4 times at server, then direcotry 'a' having same inode number is created. (Repeat 4 times, then same inode number is recycled.) (When server is under RHEL 6.6, 1 time is O.K. Always same inode number is recycled.) 3. ls -li at client, then you can not cd directory, can not remove directory. SymbolicLink has same problem. Bug link: https://bugzilla.kernel.org/show_bug.cgi?id=90011Signed-off-by: Nakajima Akira <nakajima.akira@nttcom.co.jp> Acked-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: Steve French <steve.french@primarydata.com>
-
- 14 Dec, 2014 2 commits
-
-
Kevin Cernekee authored
Commit 2ae83bf9 ("[CIFS] Fix setting time before epoch (negative time values)") changed "u64 t" to "s64 t", which makes do_div() complain about a pointer signedness mismatch: CC fs/cifs/netmisc.o In file included from ./arch/mips/include/asm/div64.h:12:0, from include/linux/kernel.h:124, from include/linux/list.h:8, from include/linux/wait.h:6, from include/linux/net.h:23, from fs/cifs/netmisc.c:25: fs/cifs/netmisc.c: In function ‘cifs_NTtimeToUnix’: include/asm-generic/div64.h:43:28: warning: comparison of distinct pointer types lacks a cast [enabled by default] (void)(((typeof((n)) *)0) == ((uint64_t *)0)); \ ^ fs/cifs/netmisc.c:941:22: note: in expansion of macro ‘do_div’ ts.tv_nsec = (long)do_div(t, 10000000) * 100; Introduce a temporary "u64 abs_t" variable to fix this. Signed-off-by: Kevin Cernekee <cernekee@gmail.com> Signed-off-by: Steve French <steve.french@primarydata.com>
-
Sachin Prabhu authored
We have encountered failures when When testing smb2 mounts on ppc64 machines when using both Samba as well as Windows 2012. On poking around, the problem was determined to be caused by the high endian MessageID passed in the header for smb2. On checking the corresponding MID for smb1 is converted to LE before being sent on the wire. We have tested this patch successfully on a ppc64 machine. Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
-
- 13 Dec, 2014 37 commits
-
-
git://git.kernel.dk/linux-blockLinus Torvalds authored
Pull block layer driver updates from Jens Axboe: - NVMe updates: - The blk-mq conversion from Matias (and others) - A stack of NVMe bug fixes from the nvme tree, mostly from Keith. - Various bug fixes from me, fixing issues in both the blk-mq conversion and generic bugs. - Abort and CPU online fix from Sam. - Hot add/remove fix from Indraneel. - A couple of drbd fixes from the drbd team (Andreas, Lars, Philipp) - With the generic IO stat accounting from 3.19/core, converting md, bcache, and rsxx to use those. From Gu Zheng. - Boundary check for queue/irq mode for null_blk from Matias. Fixes cases where invalid values could be given, causing the device to hang. - The xen blkfront pull request, with two bug fixes from Vitaly. * 'for-3.19/drivers' of git://git.kernel.dk/linux-block: (56 commits) NVMe: fix race condition in nvme_submit_sync_cmd() NVMe: fix retry/error logic in nvme_queue_rq() NVMe: Fix FS mount issue (hot-remove followed by hot-add) NVMe: fix error return checking from blk_mq_alloc_request() NVMe: fix freeing of wrong request in abort path xen/blkfront: remove redundant flush_op xen/blkfront: improve protection against issuing unsupported REQ_FUA NVMe: Fix command setup on IO retry null_blk: boundary check queue_mode and irqmode block/rsxx: use generic io stats accounting functions to simplify io stat accounting md: use generic io stats accounting functions to simplify io stat accounting drbd: use generic io stats accounting functions to simplify io stat accounting md/bcache: use generic io stats accounting functions to simplify io stat accounting NVMe: Update module version major number NVMe: fail pci initialization if the device doesn't have any BARs NVMe: add ->exit_hctx() hook NVMe: make setup work for devices that don't do INTx NVMe: enable IO stats by default NVMe: nvme_submit_async_admin_req() must use atomic rq allocation NVMe: replace blk_put_request() with blk_mq_free_request() ...
-
git://git.kernel.dk/linux-blockLinus Torvalds authored
Pull block driver core update from Jens Axboe: "This is the pull request for the core block IO changes for 3.19. Not a huge round this time, mostly lots of little good fixes: - Fix a bug in sysfs blktrace interface causing a NULL pointer dereference, when enabled/disabled through that API. From Arianna Avanzini. - Various updates/fixes/improvements for blk-mq: - A set of updates from Bart, mostly fixing buts in the tag handling. - Cleanup/code consolidation from Christoph. - Extend queue_rq API to be able to handle batching issues of IO requests. NVMe will utilize this shortly. From me. - A few tag and request handling updates from me. - Cleanup of the preempt handling for running queues from Paolo. - Prevent running of unmapped hardware queues from Ming Lei. - Move the kdump memory limiting check to be in the correct location, from Shaohua. - Initialize all software queues at init time from Takashi. This prevents a kobject warning when CPUs are brought online that weren't online when a queue was registered. - Single writeback fix for I_DIRTY clearing from Tejun. Queued with the core IO changes, since it's just a single fix. - Version X of the __bio_add_page() segment addition retry from Maurizio. Hope the Xth time is the charm. - Documentation fixup for IO scheduler merging from Jan. - Introduce (and use) generic IO stat accounting helpers for non-rq drivers, from Gu Zheng. - Kill off artificial limiting of max sectors in a request from Christoph" * 'for-3.19/core' of git://git.kernel.dk/linux-block: (26 commits) bio: modify __bio_add_page() to accept pages that don't start a new segment blk-mq: Fix uninitialized kobject at CPU hotplugging blktrace: don't let the sysfs interface remove trace from running list blk-mq: Use all available hardware queues blk-mq: Micro-optimize bt_get() blk-mq: Fix a race between bt_clear_tag() and bt_get() blk-mq: Avoid that __bt_get_word() wraps multiple times blk-mq: Fix a use-after-free blk-mq: prevent unmapped hw queue from being scheduled blk-mq: re-check for available tags after running the hardware queue blk-mq: fix hang in bt_get() blk-mq: move the kdump check to blk_mq_alloc_tag_set blk-mq: cleanup tag free handling blk-mq: use 'nr_cpu_ids' as highest CPU ID count for hwq <-> cpu map blk: introduce generic io stat accounting help function blk-mq: handle the single queue case in blk_mq_hctx_next_cpu genhd: check for int overflow in disk_expand_part_tbl() blk-mq: add blk_mq_free_hctx_request() blk-mq: export blk_mq_free_request() blk-mq: use get_cpu/put_cpu instead of preempt_disable/preempt_enable ...
-
Linus Torvalds authored
Merge tag 'trace-seq-buf-3.19-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixlet from Steven Rostedt: "Remove unnecessary preempt_disable in printk()" * tag 'trace-seq-buf-3.19-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: printk: Do not disable preemption for accessing printk_func
-
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-traceLinus Torvalds authored
Pull tracing fixes from Steven Rostedt: "Here's two fixes: 1) Discovered by Fengguang Wu's tests. I changed the parameters to the function graph x86 prepare_ftrace_return call but forgot to update the call from entry_32 (i386 version). This patch corrects that. 2) I was tracing some code and found that the sched_switch tracepoint was showing tasks in the INTERRUPTIBLE state as RUNNING. This was due to the updates to convert preempt_count into a per_cpu variable. The tracepoint logic was made to use the tasks saved_preempt_count which could hold a stale "PREEMPT_ACTIVE", instead of using the current preempt_count() call" * tag 'trace-fixes-v3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing/sched: Check preempt_count() for current when reading task->state ftrace/x86: Update i386 call to prepare_ftrace_return()
-
git://git.infradead.org/users/pcmoore/auditLinus Torvalds authored
Pull audit updates from Paul Moore: "Two small patches from the audit next branch; only one of which has any real significant code changes, the other is simply a MAINTAINERS update for audit. The single code patch is pretty small and rather straightforward, it changes the audit "version" number reported to userspace from an integer to a bitmap which is used to indicate the functionality of the running kernel. This really doesn't have much impact on the kernel, but it will make life easier for the audit userspace folks. Thankfully we were still on a version number which allowed us to do this without breaking userspace" * 'upstream' of git://git.infradead.org/users/pcmoore/audit: audit: convert status version to a feature bitmap audit: add Paul Moore to the MAINTAINERS entry
-
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds authored
Pull crypto update from Herbert Xu: - The crypto API is now documented :) - Disallow arbitrary module loading through crypto API. - Allow get request with empty driver name through crypto_user. - Allow speed testing of arbitrary hash functions. - Add caam support for ctr(aes), gcm(aes) and their derivatives. - nx now supports concurrent hashing properly. - Add sahara support for SHA1/256. - Add ARM64 version of CRC32. - Misc fixes. * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (77 commits) crypto: tcrypt - Allow speed testing of arbitrary hash functions crypto: af_alg - add user space interface for AEAD crypto: qat - fix problem with coalescing enable logic crypto: sahara - add support for SHA1/256 crypto: sahara - replace tasklets with kthread crypto: sahara - add support for i.MX53 crypto: sahara - fix spinlock initialization crypto: arm - replace memset by memzero_explicit crypto: powerpc - replace memset by memzero_explicit crypto: sha - replace memset by memzero_explicit crypto: sparc - replace memset by memzero_explicit crypto: algif_skcipher - initialize upon init request crypto: algif_skcipher - removed unneeded code crypto: algif_skcipher - Fixed blocking recvmsg crypto: drbg - use memzero_explicit() for clearing sensitive data crypto: drbg - use MODULE_ALIAS_CRYPTO crypto: include crypto- module prefix in template crypto: user - add MODULE_ALIAS crypto: sha-mb - remove a bogus NULL check crytpo: qat - Fix 64 bytes requests ...
-
Linus Torvalds authored
Merge second patchbomb from Andrew Morton: - the rest of MM - misc fs fixes - add execveat() syscall - new ratelimit feature for fault-injection - decompressor updates - ipc/ updates - fallocate feature creep - fsnotify cleanups - a few other misc things * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (99 commits) cgroups: Documentation: fix trivial typos and wrong paragraph numberings parisc: percpu: update comments referring to __get_cpu_var percpu: update local_ops.txt to reflect this_cpu operations percpu: remove __get_cpu_var and __raw_get_cpu_var macros fsnotify: remove destroy_list from fsnotify_mark fsnotify: unify inode and mount marks handling fallocate: create FAN_MODIFY and IN_MODIFY events mm/cma: make kmemleak ignore CMA regions slub: fix cpuset check in get_any_partial slab: fix cpuset check in fallback_alloc shmdt: use i_size_read() instead of ->i_size ipc/shm.c: fix overly aggressive shmdt() when calls span multiple segments ipc/msg: increase MSGMNI, remove scaling ipc/sem.c: increase SEMMSL, SEMMNI, SEMOPM ipc/sem.c: change memory barrier in sem_lock() to smp_rmb() lib/decompress.c: consistency of compress formats for kernel image decompress_bunzip2: off by one in get_next_block() usr/Kconfig: make initrd compression algorithm selection not expert fault-inject: add ratelimit option ratelimit: add initialization macro ...
-
SeongJae Park authored
Signed-off-by: SeongJae Park <sj38.park@gmail.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Christoph Lameter authored
__get_cpu_var was removed. Update comments to refer to this_cpu_ptr() instead. Signed-off-by: Christoph Lameter <cl@linux.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Christoph Lameter authored
Update the documentation to reflect changes due to the availability of this_cpu operations. Signed-off-by: Christoph Lameter <cl@linux.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Christoph Lameter authored
No user is left in the kernel source tree. Therefore we can drop the definitions. This is the final merge of the transition away from __get_cpu_var. After this patch the kernel will not build if anyone uses __get_cpu_var. Signed-off-by: Christoph Lameter <cl@linux.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Jan Kara authored
destroy_list is used to track marks which still need waiting for srcu period end before they can be freed. However by the time mark is added to destroy_list it isn't in group's list of marks anymore and thus we can reuse fsnotify_mark->g_list for queueing into destroy_list. This saves two pointers for each fsnotify_mark. Signed-off-by: Jan Kara <jack@suse.cz> Cc: Eric Paris <eparis@redhat.com> Cc: Heinrich Schuchardt <xypron.glpk@gmx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Jan Kara authored
There's a lot of common code in inode and mount marks handling. Factor it out to a common helper function. Signed-off-by: Jan Kara <jack@suse.cz> Cc: Eric Paris <eparis@redhat.com> Cc: Heinrich Schuchardt <xypron.glpk@gmx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Heinrich Schuchardt authored
The fanotify and the inotify API can be used to monitor changes of the file system. System call fallocate() modifies files. Hence it should trigger the corresponding fanotify (FAN_MODIFY) and inotify (IN_MODIFY) events. The most interesting case is FALLOC_FL_COLLAPSE_RANGE because this value allows to create arbitrary file content from random data. This patch adds the missing call to fsnotify_modify(). The FAN_MODIFY and IN_MODIFY event will be created when fallocate() succeeds. It will even be created if the file length remains unchanged, e.g. when calling fanotify with flag FALLOC_FL_KEEP_SIZE. This logic was primarily chosen to keep the coding simple. It resembles the logic of the write() system call. When we call write() we always create a FAN_MODIFY event, even in the case of overwriting with identical data. Events FAN_MODIFY and IN_MODIFY do not provide any guarantee that data was actually changed. Furthermore even if if the filesize remains unchanged, fallocate() may influence whether a subsequent write() will succeed and hence the fallocate() call may be considered a modification. The fallocate(2) man page teaches: After a successful call, subsequent writes into the range specified by offset and len are guaranteed not to fail because of lack of disk space. So calling fallocate(fd, FALLOC_FL_KEEP_SIZE, offset, len) may result in different outcomes of a subsequent write depending on the values of offset and len. Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Jan Kara <jack@suse.cz> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Eric Paris <eparis@parisplace.org> Cc: John McCutchan <john@johnmccutchan.com> Cc: Robert Love <rlove@rlove.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Dave Chinner <david@fromorbit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Thierry Reding authored
kmemleak will add allocations as objects to a pool. The memory allocated for each object in this pool is periodically searched for pointers to other allocated objects. This only works for memory that is mapped into the kernel's virtual address space, which happens not to be the case for most CMA regions. Furthermore, CMA regions are typically used to store data transferred to or from a device and therefore don't contain pointers to other objects. Without this, the kernel crashes on the first execution of the scan_gray_list() because it tries to access highmem. Perhaps a more appropriate fix would be to reject any object that can't map to a kernel virtual address? [akpm@linux-foundation.org: add comment] [akpm@linux-foundation.org: fix comment, per Catalin] [sfr@canb.auug.org.au: include linux/io.h for phys_to_virt()] Signed-off-by: Thierry Reding <treding@nvidia.com> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Vladimir Davydov authored
If we fail to allocate from the current node's stock, we look for free objects on other nodes before calling the page allocator (see get_any_partial). While checking other nodes we respect cpuset constraints by calling cpuset_zone_allowed. We enforce hardwall check. As a result, we will fallback to the page allocator even if there are some pages cached on other nodes, but the current cpuset doesn't have them set. However, the page allocator uses softwall check for kernel allocations, so it may allocate from one of the other nodes in this case. Therefore we should use softwall cpuset check in get_any_partial to conform with the cpuset check in the page allocator. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Acked-by: Zefan Li <lizefan@huawei.com> Acked-by: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Vladimir Davydov authored
fallback_alloc is called on kmalloc if the preferred node doesn't have free or partial slabs and there's no pages on the node's free list (GFP_THISNODE allocations fail). Before invoking the reclaimer it tries to locate a free or partial slab on other allowed nodes' lists. While iterating over the preferred node's zonelist it skips those zones which hardwall cpuset check returns false for. That means that for a task bound to a specific node using cpusets fallback_alloc will always ignore free slabs on other nodes and go directly to the reclaimer, which, however, may allocate from other nodes if cpuset.mem_hardwall is unset (default). As a result, we may get lists of free slabs grow without bounds on other nodes, which is bad, because inactive slabs are only evicted by cache_reap at a very slow rate and cannot be dropped forcefully. To reproduce the issue, run a process that will walk over a directory tree with lots of files inside a cpuset bound to a node that constantly experiences memory pressure. Look at num_slabs vs active_slabs growth as reported by /proc/slabinfo. To avoid this we should use softwall cpuset check in fallback_alloc. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Acked-by: Zefan Li <lizefan@huawei.com> Acked-by: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Dave Hansen authored
Andrew Morton noted http://lkml.kernel.org/r/20141104142027.a7a0d010772d84560b445f59@linux-foundation.org that the shmdt uses inode->i_size outside of i_mutex being held. There is one more case in shm.c in shm_destroy(). This converts both users over to use i_size_read(). Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Dave Hansen authored
This is a highly-contrived scenario. But, a single shmdt() call can be induced in to unmapping memory from mulitple shm segments. Example code is here: http://www.sr71.net/~dave/intel/shmfun.c The fix is pretty simple: Record the 'struct file' for the first VMA we encounter and then stick to it. Decline to unmap anything not from the same file and thus the same segment. I found this by inspection and the odds of anyone hitting this in practice are pretty darn small. Lightly tested, but it's a pretty small patch. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Manfred Spraul <manfred@colorfullife.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Manfred Spraul authored
SysV can be abused to allocate locked kernel memory. For most systems, a small limit doesn't make sense, see the discussion with regards to SHMMAX. Therefore: increase MSGMNI to the maximum supported. And: If we ignore the risk of locking too much memory, then an automatic scaling of MSGMNI doesn't make sense. Therefore the logic can be removed. The code preserves auto_msgmni to avoid breaking any user space applications that expect that the value exists. Notes: 1) If an administrator must limit the memory allocations, then he can set MSGMNI as necessary. Or he can disable sysv entirely (as e.g. done by Android). 2) MSGMAX and MSGMNB are intentionally not increased, as these values are used to control latency vs. throughput: If MSGMNB is large, then msgsnd() just returns and more messages can be queued before a task switch to a task that calls msgrcv() is forced. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Rafael Aquini <aquini@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Manfred Spraul authored
a) SysV can be abused to allocate locked kernel memory. For most systems, a small limit doesn't make sense, see the discussion with regards to SHMMAX. Therefore: Increase the sysv sem limits so that all known applications will work with these defaults. b) With regards to the maximum supported: Some of the specified hard limits are not correct anymore, therefore the patch updates the documentation. - SEMMNI must stay below IPCMNI, which is 32768. As for SHMMAX: Stay a bit below this limit. - SEMMSL was limited to 8k, to ensure that the kmalloc for the kernel array was limited to 16 kB (order=2) This doesn't apply anymore: - the allocation size isn't sizeof(short)*nsems anymore. - ipc_alloc falls back to vmalloc - SEMOPM should stay below 1000, to limit the kmalloc in semtimedop() to an order=1 allocation. Therefore: Leave it at 500 (order=0 allocation). Note: If an administrator must limit the memory allocations, then he can set the values as necessary. Or he can disable sysv entirely (as e.g. done by Android). Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Acked-by: Rafael Aquini <aquini@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Manfred Spraul authored
When I fixed bugs in the sem_lock() logic, I was more conservative than necessary. Therefore it is safe to replace the smp_mb() with smp_rmb(). And: With smp_rmb(), semop() syscalls are up to 10% faster. The race we must protect against is: sem->lock is free sma->complex_count = 0 sma->sem_perm.lock held by thread B thread A: A: spin_lock(&sem->lock) B: sma->complex_count++; (now 1) B: spin_unlock(&sma->sem_perm.lock); A: spin_is_locked(&sma->sem_perm.lock); A: XXXXX memory barrier A: if (sma->complex_count == 0) Thread A must read the increased complex_count value, i.e. the read must not be reordered with the read of sem_perm.lock done by spin_is_locked(). Since it's about ordering of reads, smp_rmb() is sufficient. [akpm@linux-foundation.org: update sem_lock() comment, from Davidlohr] Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Acked-by: Rafael Aquini <aquini@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Haesung Kim authored
Magic number of compress formats for kernel image is defined by two bytes. These numbers are written in hexadecimal number, nevertheless magic number for only gunzip is written in octal number. The formats should be consistent for readability. Therefore, magic numbers for gunzip are also defined by hexadecimal number. Signed-off-by: Haesung Kim <matia.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Dan Carpenter authored
"origPtr" is used as an offset into the bd->dbuf[] array. That array is allocated in start_bunzip() and has "bd->dbufSize" number of elements so the test here should be >= instead of >. Later we check "origPtr" again before using it as an offset so I don't know if this bug can be triggered in real life. Fixes: bc22c17e ('bzip2/lzma: library support for gzip, bzip2 and lzma decompression') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Alain Knaff <alain@knaff.lu> Cc: Yinghai Lu <yinghai@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andi Kleen authored
The kernel has support for (nearly) every compression algorithm known to man, each to handle some particular microscopic niche. Unfortunately all of these always get compiled in if you want to support INITRDs, and can be only disabled when CONFIG_EXPERT is set. I don't see why I need to set EXPERT just to properly configure the initrd compression algorithms, and not always include every possible algorithm Usually the initrd is just compressed with gzip anyways, at least that's true on all distributions I use. Remove the dependencies for initrd compression on CONFIG_EXPERT. Make the various options just default y, which should be good enough to not break any previous configuration. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Dmitry Monakhov authored
Current debug levels are not optimal. Especially if one want to provoke big numbers of faults(broken device simulator) then any verbose level will produce giant numbers of identical logging messages. Let's add ratelimit parameter for that purpose. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Acked-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Dmitry Monakhov authored
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Cc: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Fabian Frederick authored
linux kernel doesn't manage page sizes below 4kb. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Fabian Frederick authored
Based on ext2_direct_IO Tested with O_DIRECT file open and sysbench/mariadb with 1% written queries improvement (update_non_index test) on a volume created with mkaffs. Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Fabian Frederick authored
-Remove ErrorBuffer and use %pV -Add __printf to enable argument mistmatch warnings Original patch by Joe Perches. Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Fabian Frederick authored
-Move file_operations to avoid forward declarations. -Remove unused declarations. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Riku Voipio authored
Following the suggestions from Andrew Morton and Stephen Rothwell, Dont expand the ARCH list in kernel/gcov/Kconfig. Instead, define a ARCH_HAS_GCOV_PROFILE_ALL bool which architectures can enable. set ARCH_HAS_GCOV_PROFILE_ALL on Architectures where it was previously allowed + ARM64 which I tested. Signed-off-by: Riku Voipio <riku.voipio@linaro.org> Cc: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Masanari Iida authored
Remove unnecessary KERN_ERR from pr_err() within kexec.c. Signed-off-by: Masanari Iida <standby24x7@gmail.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
David Drysdale authored
Signed-off-by: David Drysdale <drysdale@google.com> Acked-by: David S. Miller <davem@davemloft.net> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
David Drysdale authored
Signed-off-by: David Drysdale <drysdale@google.com> Cc: Meredydd Luff <meredydd@senatehouse.org> Cc: Shuah Khan <shuah.kh@samsung.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Rich Felker <dalias@aerifal.cx> Cc: Christoph Hellwig <hch@infradead.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
David Drysdale authored
Hook up x86-64, i386 and x32 ABIs. Signed-off-by: David Drysdale <drysdale@google.com> Cc: Meredydd Luff <meredydd@senatehouse.org> Cc: Shuah Khan <shuah.kh@samsung.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Rich Felker <dalias@aerifal.cx> Cc: Christoph Hellwig <hch@infradead.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
David Drysdale authored
This patchset adds execveat(2) for x86, and is derived from Meredydd Luff's patch from Sept 2012 (https://lkml.org/lkml/2012/9/11/528). The primary aim of adding an execveat syscall is to allow an implementation of fexecve(3) that does not rely on the /proc filesystem, at least for executables (rather than scripts). The current glibc version of fexecve(3) is implemented via /proc, which causes problems in sandboxed or otherwise restricted environments. Given the desire for a /proc-free fexecve() implementation, HPA suggested (https://lkml.org/lkml/2006/7/11/556) that an execveat(2) syscall would be an appropriate generalization. Also, having a new syscall means that it can take a flags argument without back-compatibility concerns. The current implementation just defines the AT_EMPTY_PATH and AT_SYMLINK_NOFOLLOW flags, but other flags could be added in future -- for example, flags for new namespaces (as suggested at https://lkml.org/lkml/2006/7/11/474). Related history: - https://lkml.org/lkml/2006/12/27/123 is an example of someone realizing that fexecve() is likely to fail in a chroot environment. - http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=514043 covered documenting the /proc requirement of fexecve(3) in its manpage, to "prevent other people from wasting their time". - https://bugzilla.redhat.com/show_bug.cgi?id=241609 described a problem where a process that did setuid() could not fexecve() because it no longer had access to /proc/self/fd; this has since been fixed. This patch (of 4): Add a new execveat(2) system call. execveat() is to execve() as openat() is to open(): it takes a file descriptor that refers to a directory, and resolves the filename relative to that. In addition, if the filename is empty and AT_EMPTY_PATH is specified, execveat() executes the file to which the file descriptor refers. This replicates the functionality of fexecve(), which is a system call in other UNIXen, but in Linux glibc it depends on opening "/proc/self/fd/<fd>" (and so relies on /proc being mounted). The filename fed to the executed program as argv[0] (or the name of the script fed to a script interpreter) will be of the form "/dev/fd/<fd>" (for an empty filename) or "/dev/fd/<fd>/<filename>", effectively reflecting how the executable was found. This does however mean that execution of a script in a /proc-less environment won't work; also, script execution via an O_CLOEXEC file descriptor fails (as the file will not be accessible after exec). Based on patches by Meredydd Luff. Signed-off-by: David Drysdale <drysdale@google.com> Cc: Meredydd Luff <meredydd@senatehouse.org> Cc: Shuah Khan <shuah.kh@samsung.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Rich Felker <dalias@aerifal.cx> Cc: Christoph Hellwig <hch@infradead.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-