- 06 May, 2011 2 commits
-
-
Hillf Danton authored
For a given node, when constructing the cpumask for its sched_domain to span, if there is no best node available after searching, further effort can be saved, based on a small change in the return value of find_next_best_node(). Signed-off-by: Hillf Danton <dhillf@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Yong Zhang <yong.zhang0@gmail.com> Link: http://lkml.kernel.org/r/BANLkTi%3DqPWxRAa6%2BdT3ohEP6Z%3D0v%2Be4EXA@mail.gmail.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
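A minimal sketch of the idea, not the exact patch (names follow the scheduler's NUMA topology code; treating a negative return as the "no best node" sentinel is an assumption): once find_next_best_node() signals that nothing is left, the span-building loop stops instead of burning the remaining iterations.

    /* Sketch: build the span for a node's sched_domain, stopping early
     * when find_next_best_node() reports no best node (here: < 0). */
    static void sched_domain_node_span(int node, struct cpumask *span)
    {
            nodemask_t used_nodes = NODE_MASK_NONE;
            int i;

            cpumask_clear(span);
            node_set(node, used_nodes);
            cpumask_or(span, span, cpumask_of_node(node));

            for (i = 1; i < SD_NODES_PER_DOMAIN; i++) {
                    int next_node = find_next_best_node(node, &used_nodes);

                    if (next_node < 0)      /* nothing left worth adding */
                            break;          /* save the further effort */
                    cpumask_or(span, span, cpumask_of_node(next_node));
            }
    }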
-
Rakib Mullick authored
cfs_rq->nr_spread_over is only used when CONFIG_SCHED_DEBUG is set. So wrap it with CONFIG_SCHED_DEBUG. Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1304528026.15681.3.camel@localhost.localdomain Signed-off-by: Ingo Molnar <mingo@elte.hu>
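The shape of such a change is a straightforward #ifdef wrap; a sketch with surrounding context abbreviated (the field and its only writer, check_spread(), as they appear in the CFS code):

    struct cfs_rq {
            /* ... */
    #ifdef CONFIG_SCHED_DEBUG
            unsigned int nr_spread_over;    /* only read by sched debug output */
    #endif
            /* ... */
    };

    static void check_spread(struct cfs_rq *cfs_rq, struct sched_entity *se)
    {
    #ifdef CONFIG_SCHED_DEBUG
            s64 d = se->vruntime - cfs_rq->min_vruntime;

            if (d < 0)
                    d = -d;
            if (d > 3 * sysctl_sched_latency)
                    schedstat_inc(cfs_rq, nr_spread_over);
    #endif
    }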
-
- 04 May, 2011 1 commit
-
-
Vladimir Davydov authored
It's passed across multiple functions but is never really used, so remove it. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1304447467-29200-1-git-send-email-vdavydov@parallels.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 26 Apr, 2011 1 commit
-
-
Hillf Danton authored
The rq variable, though computed for each possible cpu, is never used in the function, so it can be removed. This also eliminates a build warning. Signed-off-by: Hillf Danton <dhillf@gmail.com> Reviewed-by: Yong Zhang <yong.zhang0@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/BANLkTin-FfQfqW5ym1iuEmrk8s777Y1LAg@mail.gmail.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 24 Apr, 2011 1 commit
-
-
Jonathan Corbet authored
Neil Brown pointed out that lock_depth somehow escaped the BKL removal work. Let's get rid of it now. Note that the perf scripting utilities still have a bunch of code for dealing with common_lock_depth in tracepoints; I have left that in place in case anybody wants to use that code with older kernels. Suggested-by: Neil Brown <neilb@suse.de> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20110422111910.456c0e84@bike.lwn.net Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 21 Apr, 2011 2 commits
-
-
Rakib Mullick authored
scheduler_tick() is no longer called by fork code - this got discarded a long time ago by commit bc947631 ("sched: improve efficiency of sched_fork()"). So, remove the comment which still claims otherwise. Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/BANLkTimO4iGP0QpaHO1HHF1QOnVcQpc0cw@mail.gmail.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Ingo Molnar authored
Merge reason: Pick up upstream fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 19 Apr, 2011 4 commits
-
-
Peter Zijlstra authored
Valdis Kletnieks reported a new RCU debug warning in the scheduler. Since commit dce840a0 ("sched: Dynamically allocate sched_domain/sched_group data-structures") the sched_domain trees are protected by RCU instead of RCU-sched. This means that we need to include rcu_read_lock() protection when we iterate them, since disabling preemption doesn't suffice anymore. Reported-by: Valdis.Kletnieks@vt.edu Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1302882741.2388.241.camel@twins Signed-off-by: Ingo Molnar <mingo@elte.hu>
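The pattern the fix needs is ordinary RCU read-side protection around the domain walk; a hedged sketch of the shape (the real patch touches several call sites):

    /* Disabling preemption no longer pins the sched_domain tree: it is
     * now protected by regular RCU, so wrap the iteration explicitly. */
    rcu_read_lock();
    for_each_domain(cpu, sd) {
            /* ... inspect sd->span, sd->flags, etc. ... */
    }
    rcu_read_unlock();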
-
Venkatesh Pallipadi authored
When a task in a taskgroup sleeps, pick_next_task starts all the way back at the root and picks the task/taskgroup with the min vruntime across all runnable tasks. But when there are many frequently sleeping tasks across different taskgroups, it makes better sense to stay with the same taskgroup for its slice period (or until all tasks in the taskgroup sleep) instead of switching to another taskgroup on each sleep after a short runtime. This helps specifically where taskgroups correspond to a process with multiple threads. The change reduces the number of CR3 switches in this case. Example: two taskgroups with 2 threads each, which run for 2ms and sleep for 1ms. Looking at sched:sched_switch shows:

BEFORE: taskgroup_1 threads [5004, 5005], taskgroup_2 threads [5016, 5017]
cpu-soaker-5004 [003] 3683.391089
cpu-soaker-5016 [003] 3683.393106
cpu-soaker-5005 [003] 3683.395119
cpu-soaker-5017 [003] 3683.397130
cpu-soaker-5004 [003] 3683.399143
cpu-soaker-5016 [003] 3683.401155
cpu-soaker-5005 [003] 3683.403168
cpu-soaker-5017 [003] 3683.405170

AFTER: taskgroup_1 threads [21890, 21891], taskgroup_2 threads [21934, 21935]
cpu-soaker-21890 [003] 865.895494
cpu-soaker-21935 [003] 865.897506
cpu-soaker-21934 [003] 865.899520
cpu-soaker-21935 [003] 865.901532
cpu-soaker-21934 [003] 865.903543
cpu-soaker-21935 [003] 865.905546
cpu-soaker-21891 [003] 865.907548
cpu-soaker-21890 [003] 865.909560
cpu-soaker-21891 [003] 865.911571
cpu-soaker-21890 [003] 865.913582
cpu-soaker-21891 [003] 865.915594
cpu-soaker-21934 [003] 865.917606

A similar problem exists when there are multiple taskgroups and, say, a task A preempts the currently running task B of taskgroup_1. On schedule, pick_next_task can pick an unrelated task from taskgroup_2. Here it would be better to give some preference to task B in pick_next_task. A simple (maybe extreme) benchmark I tried was tbench with 2 tbench client processes with 2 threads each, running on a single CPU. Avg throughput across 5 50-sec runs was:

BEFORE: 105.84 MB/sec
AFTER: 112.42 MB/sec

Signed-off-by: Venkatesh Pallipadi <venki@google.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1302802253-25760-1-git-send-email-venki@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
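One plausible sketch of the sleep-side hint (a simplified view of the dequeue path; the actual patch also covers the preempt path): when a task sleeps but its group still has runnable tasks, mark the group entity as next buddy so pick_next_task_fair() keeps descending into the same group.

    /* Sketch: in dequeue_task_fair(), when a sleeping task leaves a
     * still-busy group, hint the scheduler to stay with that group. */
    static void dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
    {
            struct sched_entity *se = &p->se;
            int task_sleep = flags & DEQUEUE_SLEEP;

            for_each_sched_entity(se) {
                    struct cfs_rq *cfs_rq = cfs_rq_of(se);

                    dequeue_entity(cfs_rq, se, flags);
                    if (cfs_rq->load.weight) {
                            /* The group still has runnable entities:
                             * prefer it on the next pick instead of
                             * re-walking from the root and switching
                             * taskgroups (and CR3) after a short run. */
                            if (task_sleep && parent_entity(se))
                                    set_next_buddy(parent_entity(se));
                            break;
                    }
                    flags |= DEQUEUE_SLEEP;
            }
    }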
-
Venkatesh Pallipadi authored
Make set_*_buddy() work on non-task sched_entities, to facilitate the use of next_buddy to cache a group entity in cases where one of the tasks within that entity sleeps or gets preempted. Also, set_skip_buddy() incorrectly checked that the yielding task's policy was not SCHED_IDLE; yielding should happen even when the yielding task is SCHED_IDLE, so this change removes the policy check on the yielding task. Signed-off-by: Venkatesh Pallipadi <venki@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1302744070-30079-2-git-send-email-venki@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
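A sketch of the skip-buddy half of the change as seen from the yield path (function names as in CFS; the exact placement of the old check is an assumption):

    static void yield_task_fair(struct rq *rq)
    {
            struct task_struct *curr = rq->curr;

            /* Before, the hint was suppressed for SCHED_IDLE yielders,
             * roughly:
             *         if (curr->policy != SCHED_IDLE)
             *                 set_skip_buddy(&curr->se);
             * After: yield means yield, whatever the policy. */
            set_skip_buddy(&curr->se);
    }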
-
Linus Torvalds authored
-
- 18 Apr, 2011 29 commits
-
-
Linus Torvalds authored
* 'for-39-rc4' of git://codeaurora.org/quic/kernel/davidb/linux-msm:
  msm: timer: fix missing return value
  msm: Remove extraneous ffa device check
-
Linus Torvalds authored
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: xen-kbdfront - fix mouse getting stuck after save/restore
  Input: estimate number of events per packet
  Input: evdev - indicate buffer overrun with SYN_DROPPED
  Input: document event types and codes and their intended use
  Input: add KEY_IMAGES specifically for AL Image Browser
  Input: twl4030_keypad - fix potential NULL dereference in twl4030_kp_probe()
  Input: h3600_ts - fix error handling at connect
  Input: twl4030_keypad - avoid potential NULL-pointer dereference
-
Linus Torvalds authored
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  block: add blk_run_queue_async
  block: blk_delay_queue() should use kblockd workqueue
  md: fix up raid1/raid10 unplugging.
  md: incorporate new plugging into raid5.
  md: provide generic support for handling unplug callbacks.
  md - remove old plugging code.
  md/dm - remove remains of plug_fn callback.
  md: use new plugging interface for RAID IO.
  block: drop queue lock before calling __blk_run_queue() for kblockd punt
  Revert "block: add callback function for unplug notification"
  block: Enhance new plugging support to support general callbacks
-
Linus Torvalds authored
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/powermac: Build fix with SMP and CPU hotplug
  powerpc/perf_event: Skip updating kernel counters if register value shrinks
  powerpc: Don't write protect kernel text with CONFIG_DYNAMIC_FTRACE enabled
  powerpc: Fix oops if scan_dispatch_log is called too early
  powerpc/pseries: Use a kmem cache for DTL buffers
  powerpc/kexec: Fix regression causing compile failure on UP
  powerpc/85xx: disable Suspend support if SMP enabled
  powerpc/e500mc: Remove CPU_FTR_MAYBE_CAN_NAP/CPU_FTR_MAYBE_CAN_DOZE
  powerpc/book3e: Fix CPU feature handling on 64-bit e5500
  powerpc: Check device status before adding serial device
  powerpc/85xx: Don't add disabled PCIe devices
-
Linus Torvalds authored
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: (24 commits)
  Btrfs: fix free space cache leak
  Btrfs: avoid taking the chunk_mutex in do_chunk_alloc
  Btrfs end_bio_extent_readpage should look for locked bits
  Btrfs: don't force chunk allocation in find_free_extent
  Btrfs: Check validity before setting an acl
  Btrfs: Fix incorrect inode nlink in btrfs_link()
  Btrfs: Check if btrfs_next_leaf() returns error in btrfs_real_readdir()
  Btrfs: Check if btrfs_next_leaf() returns error in btrfs_listxattr()
  Btrfs: make uncache_state unconditional
  btrfs: using cached extent_state in set/unlock combinations
  Btrfs: avoid taking the trans_mutex in btrfs_end_transaction
  Btrfs: fix subvolume mount by name problem when default mount subvolume is set
  fix user annotation in ioctl.c
  Btrfs: check for duplicate iov_base's when doing dio reads
  btrfs: properly handle overlapping areas in memmove_extent_buffer
  Btrfs: fix memory leaks in btrfs_new_inode()
  Btrfs: check for duplicate iov_base's when doing dio reads
  Btrfs: reuse the extent_map we found when calling btrfs_get_extent
  Btrfs: do not use async submit for small DIO io's
  Btrfs: don't split dio bios if we don't have to
  ...
-
Linus Torvalds authored
Rather than pass in some random truncated offset to the pid-related functions, check that the offset is in range up-front. This is just cleanup, the previous commit fixed the real problem. Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Linus Torvalds authored
next_pidmap() just quietly accepted whatever 'last' pid that was passed in, which is not all that safe when one of the users is /proc. Admittedly the proc code should do some sanity checking on the range (and that will be the next commit), but that doesn't mean that the helper functions should just do that pidmap pointer arithmetic without checking the range of its arguments. So clamp 'last' to PID_MAX_LIMIT. The fact that we then do "last+1" doesn't really matter, the for-loop does check against the end of the pidmap array properly (it's only the actual pointer arithmetic overflow case we need to worry about, and going one bit beyond isn't going to overflow). [ Use PID_MAX_LIMIT rather than pid_max as per Eric Biederman ] Reported-by: Tavis Ormandy <taviso@cmpxchg8b.com> Analyzed-by: Robert Święcki <robert@swiecki.net> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
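A hedged sketch of the guard (kernel/pid.c internals abbreviated; whether the upstream fix clamps or bails out with -1 is a detail this sketch glosses over):

    int next_pidmap(struct pid_namespace *pid_ns, unsigned int last)
    {
            int offset;
            struct pidmap *map, *end;

            /* Range-check up front: a huge 'last' from a careless caller
             * must not drive the pointer arithmetic below out of the
             * pidmap array. */
            if (last >= PID_MAX_LIMIT)
                    return -1;

            offset = (last + 1) & BITS_PER_PAGE_MASK;
            map = &pid_ns->pidmap[(last + 1) / BITS_PER_PAGE];
            end = &pid_ns->pidmap[PIDMAP_ENTRIES];
            /* ... walk the maps, bounded by 'end', as before ... */
            return -1;
    }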
-
Igor Mammedov authored
Mouse gets "stuck" after restore of PV guest but buttons are in working condition. If driver has been configured for ABS coordinates at start it will get XENKBD_TYPE_POS events and then suddenly after restore it'll start getting XENKBD_TYPE_MOTION events, that will be dropped later and they won't get into user-space. Regression was introduced by hunk 5 and 6 of 5ea5254a ("Input: xen-kbdfront - advertise either absolute or relative coordinates"). Driver on restore should ask xen for request-abs-pointer again if it is available. So restore parts that did it before 5ea5254a. Acked-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: Igor Mammedov <imammedo@redhat.com> [v1: Expanded the commit description] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
-
Jeff Brown authored
Calculate a default based on the number of ABS axes, REL axes, and MT slots for the device during input device registration. Signed-off-by: Jeff Brown <jeffbrown@android.com> Reviewed-by: Henrik Rydberg <rydberg@euromail.se> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
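A hedged sketch of such an estimate (the shape mirrors the input core's helper, but the MT-slot handling and the mtsize field are assumptions, and the real heuristic adds further terms for SYN events and keys):

    /* Sketch: derive a default events-per-packet estimate from the
     * capabilities the driver declared before registration. */
    static unsigned int input_estimate_events_per_packet(struct input_dev *dev)
    {
            unsigned int mt_slots = dev->mtsize ? dev->mtsize : 1;
            unsigned int events = 0;
            int i;

            if (test_bit(EV_ABS, dev->evbit))
                    for (i = 0; i < ABS_CNT; i++)
                            if (test_bit(i, dev->absbit))
                                    /* MT axes can fire once per slot */
                                    events += input_is_mt_axis(i) ? mt_slots : 1;

            if (test_bit(EV_REL, dev->evbit))
                    for (i = 0; i < REL_CNT; i++)
                            if (test_bit(i, dev->relbit))
                                    events++;

            return events;
    }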
-
Chris Mason authored
The free space caching code was recently reworked to cache all the pages it needed instead of using find_get_page everywhere. One loop was missed though, so it ended up leaking pages. This fixes it to use our page array instead of find_get_page. Signed-off-by: Chris Mason <chris.mason@oracle.com>
-
Ingo Molnar authored
Merge reason: the rq locking changes are stable, propagate them into the .40 queue. Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Christoph Hellwig authored
Instead of overloading __blk_run_queue to force an offload to kblockd, add a new blk_run_queue_async helper to do it explicitly. I've kept the blk_queue_stopped check for now, but I suspect it's not needed as the check we do when the workqueue item runs should be enough. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
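A sketch of the helper's likely shape, following the commit's description (the delay_work member and the exact workqueue call are assumptions based on the blk_delay_queue change in the same series):

    /* Sketch: run a single device queue from workqueue (kblockd)
     * context instead of inline. The stopped check mirrors
     * __blk_run_queue(), though it may be redundant with the check
     * done when the work item actually runs. */
    void blk_run_queue_async(struct request_queue *q)
    {
            if (likely(!blk_queue_stopped(q)))
                    queue_delayed_work(kblockd_workqueue, &q->delay_work, 0);
    }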
-
Jens Axboe authored
Reported-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
-
NeilBrown authored
We just need to make sure that an unplug event wakes up the md thread, which is exactly what mddev_check_plugged does. Also remove some plug-related code that is no longer needed. Signed-off-by: NeilBrown <neilb@suse.de>
-
NeilBrown authored
In raid5, plugging is used for two things:
 1/ collecting writes that require a bitmap update
 2/ collecting writes in the hope that we can create full stripes, or at least more-full ones
We now release these different sets of stripes when plug_cnt is zero. Also, in make_request, we call mddev_check_plugged to hopefully increase plug_cnt, and wake up the thread at the end if plugging wasn't achieved for some reason. Signed-off-by: NeilBrown <neilb@suse.de>
-
NeilBrown authored
When an md device adds a request to a queue, it can call mddev_check_plugged. If this succeeds then we know that the md thread will be woken up shortly, and ->plug_cnt will be non-zero until then, so some processing can be delayed. If it fails, then no unplug callback is expected and the make_request function needs to do whatever is required to make the request happen. Signed-off-by: NeilBrown <neilb@suse.de>
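A hedged sketch of the calling pattern the text describes (the surrounding function is a stand-in for a raid personality's request function; only mddev_check_plugged and md_wakeup_thread are named by the source):

    /* Sketch: defer work only when a plug is in place; otherwise act now. */
    static void make_request(struct mddev *mddev, struct bio *bio)
    {
            /* ... queue the request internally ... */

            if (mddev_check_plugged(mddev)) {
                    /* A plug is active: the md thread will be woken up
                     * shortly, and ->plug_cnt stays non-zero until then,
                     * so processing can safely be delayed. */
                    return;
            }

            /* No unplug callback is coming: kick the md thread ourselves. */
            md_wakeup_thread(mddev->thread);
    }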
-
NeilBrown authored
md has some plugging infrastructure for RAID5 to use because the normal plugging infrastructure required a 'request_queue', and when called from dm, RAID5 doesn't have one of those available. This relied on the ->unplug_fn callback which doesn't exist any more. So remove all of that code, both in md and raid5. Subsequent patches will restore the plugging functionality. Signed-off-by: NeilBrown <neilb@suse.de>
-
NeilBrown authored
Now that unplugging is done differently, the unplug_fn callback is never called, so it can be completely discarded. Signed-off-by: NeilBrown <neilb@suse.de>
-
NeilBrown authored
md/raid submits a lot of IO from its various raid threads, so add start/finish plug calls to those threads so that some plugging happens. Signed-off-by: NeilBrown <neilb@suse.de>
-
Jens Axboe authored
If we know we are going to punt to kblockd, we can drop the queue lock before calling into __blk_run_queue() since it only does a safe bit test and a workqueue call. Since kblockd needs to grab this very lock as one of the first things it does, it's a good optimization to drop the lock before waking kblockd. Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
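A rough sketch of the caller-side shape (the function and its from_schedule flag are assumptions; only the "drop the lock before waking kblockd" rule comes from the source):

    static void queue_unplugged(struct request_queue *q, bool from_schedule)
    {
            if (from_schedule) {
                    /* Punting to kblockd: the queue run is only a safe bit
                     * test plus a workqueue call, and kblockd's first act
                     * is to take this very lock, so drop it first. */
                    spin_unlock(q->queue_lock);
                    blk_run_queue_async(q);
            } else {
                    __blk_run_queue(q);
                    spin_unlock(q->queue_lock);
            }
    }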
-
Jens Axboe authored
MD can't use this since it really requires us to be able to keep more than a single piece of state for the unplug. Commit 048c9374 added the required support for MD, so get rid of this now unused code. This reverts commit f7566457. Conflicts: block/blk-core.c Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
-
NeilBrown authored
md/raid requires an unplug callback, but as it does not use requests, the current code cannot provide one. So allow arbitrary callbacks to be attached to the blk_plug. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
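A sketch of what an arbitrary plug callback looks like from the caller's side (the struct layout follows the commit's intent; the md-side wrapper and its naming are hypothetical):

    /* Sketch: a caller embeds a blk_plug_cb, chains it on the current
     * plug, and blk_finish_plug() later invokes the callback once. */
    struct blk_plug_cb {
            struct list_head list;
            void (*callback)(struct blk_plug_cb *cb);
    };

    struct md_plug_cb {
            struct blk_plug_cb cb;          /* embedded, for container_of */
            struct mddev *mddev;
    };

    static void md_unplug_cb(struct blk_plug_cb *cb)
    {
            struct md_plug_cb *mdcb = container_of(cb, struct md_plug_cb, cb);

            /* md keeps more state than a single request_queue could carry,
             * which is why the reverted single-callback API wasn't enough. */
            md_wakeup_thread(mdcb->mddev->thread);
    }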
-
Benjamin Herrenschmidt authored
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Eric B Munson authored
Because of speculative event roll back, it is possible for some event counters to decrease between reads on POWER7. This causes a problem with the way that counters are updated. Delta values are calculated in a 64-bit value and the top 32 bits are masked. If the register value has decreased, this leaves us with a very large positive value added to the kernel counters. This patch protects against this by skipping the update if the delta would be negative. This can lead to a lack of precision in the counter values, but from my testing the loss is typically fewer than 10 samples at a time. Signed-off-by: Eric B Munson <emunson@mgebm.net> Cc: stable@kernel.org Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
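A sketch of the guard (the 256-event rollback window used to distinguish a rollback from a genuine 32-bit wraparound is an assumption of this sketch):

    static u64 check_and_compute_delta(u64 prev, u64 val)
    {
            u64 delta = (val - prev) & 0xfffffffful;

            /* A rolled-back counter shrinks, but only by a bounded number
             * of speculative events; anything else that looks "smaller"
             * is a wraparound. On rollback, skip the update rather than
             * fold a huge bogus positive delta into the kernel counter. */
            if (prev > val && (prev - val) < 256)
                    delta = 0;

            return delta;
    }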
-
Stefan Roese authored
This problem was noticed on an MPC855T platform. Ftrace would oops when trying to write to the kernel text segment. Many thanks to Joakim for finding the root cause of this problem. Signed-off-by: Stefan Roese <sr@denx.de> Cc: Joakim Tjernlund <joakim.tjernlund@transmode.se> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Anton Blanchard authored
We currently enable interrupts before the dispatch log for the boot cpu is set up. If a timer interrupt comes in early enough, we oops in scan_dispatch_log:

Unable to handle kernel paging request for data at address 0x00000010
...
.scan_dispatch_log+0xb0/0x170
.account_system_vtime+0xa0/0x220
.irq_enter+0x88/0xc0
.do_IRQ+0x48/0x230

The patch below adds a check to scan_dispatch_log to ensure the dispatch log has been allocated. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: <stable@kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
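The fix amounts to a bail-out at the top of the scan; a minimal sketch (taking the current dispatch log pointer from the paca, as the surrounding code does):

    static u64 scan_dispatch_log(u64 stop_tb)
    {
            struct dtl_entry *dtl = local_paca->dtl_curr;

            /* Too early: the boot cpu's dispatch log is not allocated
             * yet, so there is nothing to scan. */
            if (!dtl)
                    return 0;

            /* ... walk entries up to stop_tb as before ... */
            return 0;
    }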
-
Nishanth Aravamudan authored
PAPR specifies that DTL buffers cannot cross AMS environments (aka CMO in the PAPR) and cannot cross a memory entitlement granule boundary (4k). This is found in section 14.11.3.2 H_REGISTER_VPA of the PAPR. kmalloc does not guarantee an alignment of the allocation beyond 8 bytes, though (at least in my understanding). Create a special kmem cache for DTL buffers with the alignment requirement. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
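A minimal sketch of such a cache (the cache name and init-function shape are assumptions; the point is that kmem_cache_create's align argument gives the guarantee kmalloc lacks):

    /* An alignment equal to the buffer size keeps each DTL buffer
     * inside one 4k entitlement granule, which plain kmalloc() with
     * its 8-byte alignment cannot guarantee. */
    static struct kmem_cache *dtl_cache;

    static int __init dtl_init_cache(void)
    {
            dtl_cache = kmem_cache_create("dtl", DISPATCH_LOG_BYTES,
                                          DISPATCH_LOG_BYTES, 0, NULL);
            return dtl_cache ? 0 : -ENOMEM;
    }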
-
Paul Gortmaker authored
Recent commit b987812b caused a compile failure on UP because a considerably large block of the file was included within CONFIG_SMP, leaving a stub function unexposed on UP builds when it needed to be. Relocate the stub to the #else /* ! CONFIG_SMP */ section and also annotate the relevant else/endif so that nobody else falls into the same trap I did. Reported-by: Michael Guntsche <mike@it-loops.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
-