- 13 Jan, 2014 11 commits
-
-
Juri Lelli authored
Data from tests confirmed that the original active load balancing logic didn't scale neither in the number of CPU nor in the number of tasks (as sched_rt does). Here we provide a global data structure to keep track of deadlines of the running tasks in the system. The structure is composed by a bitmask showing the free CPUs and a max-heap, needed when the system is heavily loaded. The implementation and concurrent access scheme are kept simple by design. However, our measurements show that we can compete with sched_rt on large multi-CPUs machines [1]. Only the push path is addressed, the extension to use this structure also for pull decisions is straightforward. However, we are currently evaluating different (in order to decrease/avoid contention) data structures to solve possibly both problems. We are also going to re-run tests considering recent changes inside cpupri [2]. [1] http://retis.sssup.it/~jlelli/papers/Ospert11Lelli.pdf [2] http://www.spinics.net/lists/linux-rt-users/msg06778.htmlSigned-off-by: Juri Lelli <juri.lelli@gmail.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1383831828-15501-14-git-send-email-juri.lelli@gmail.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Dario Faggioli authored
In order of deadline scheduling to be effective and useful, it is important that some method of having the allocation of the available CPU bandwidth to tasks and task groups under control. This is usually called "admission control" and if it is not performed at all, no guarantee can be given on the actual scheduling of the -deadline tasks. Since when RT-throttling has been introduced each task group have a bandwidth associated to itself, calculated as a certain amount of runtime over a period. Moreover, to make it possible to manipulate such bandwidth, readable/writable controls have been added to both procfs (for system wide settings) and cgroupfs (for per-group settings). Therefore, the same interface is being used for controlling the bandwidth distrubution to -deadline tasks and task groups, i.e., new controls but with similar names, equivalent meaning and with the same usage paradigm are added. However, more discussion is needed in order to figure out how we want to manage SCHED_DEADLINE bandwidth at the task group level. Therefore, this patch adds a less sophisticated, but actually very sensible, mechanism to ensure that a certain utilization cap is not overcome per each root_domain (the single rq for !SMP configurations). Another main difference between deadline bandwidth management and RT-throttling is that -deadline tasks have bandwidth on their own (while -rt ones doesn't!), and thus we don't need an higher level throttling mechanism to enforce the desired bandwidth. This patch, therefore: - adds system wide deadline bandwidth management by means of: * /proc/sys/kernel/sched_dl_runtime_us, * /proc/sys/kernel/sched_dl_period_us, that determine (i.e., runtime / period) the total bandwidth available on each CPU of each root_domain for -deadline tasks; - couples the RT and deadline bandwidth management, i.e., enforces that the sum of how much bandwidth is being devoted to -rt -deadline tasks to stay below 100%. This means that, for a root_domain comprising M CPUs, -deadline tasks can be created until the sum of their bandwidths stay below: M * (sched_dl_runtime_us / sched_dl_period_us) It is also possible to disable this bandwidth management logic, and be thus free of oversubscribing the system up to any arbitrary level. Signed-off-by: Dario Faggioli <raistlin@linux.it> Signed-off-by: Juri Lelli <juri.lelli@gmail.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1383831828-15501-12-git-send-email-juri.lelli@gmail.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Dario Faggioli authored
Some method to deal with rt-mutexes and make sched_dl interact with the current PI-coded is needed, raising all but trivial issues, that needs (according to us) to be solved with some restructuring of the pi-code (i.e., going toward a proxy execution-ish implementation). This is under development, in the meanwhile, as a temporary solution, what this commits does is: - ensure a pi-lock owner with waiters is never throttled down. Instead, when it runs out of runtime, it immediately gets replenished and it's deadline is postponed; - the scheduling parameters (relative deadline and default runtime) used for that replenishments --during the whole period it holds the pi-lock-- are the ones of the waiting task with earliest deadline. Acting this way, we provide some kind of boosting to the lock-owner, still by using the existing (actually, slightly modified by the previous commit) pi-architecture. We would stress the fact that this is only a surely needed, all but clean solution to the problem. In the end it's only a way to re-start discussion within the community. So, as always, comments, ideas, rants, etc.. are welcome! :-) Signed-off-by: Dario Faggioli <raistlin@linux.it> Signed-off-by: Juri Lelli <juri.lelli@gmail.com> [ Added !RT_MUTEXES build fix. ] Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1383831828-15501-11-git-send-email-juri.lelli@gmail.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Peter Zijlstra authored
Turn the pi-chains from plist to rb-tree, in the rt_mutex code, and provide a proper comparison function for -deadline and -priority tasks. This is done mainly because: - classical prio field of the plist is just an int, which might not be enough for representing a deadline; - manipulating such a list would become O(nr_deadline_tasks), which might be to much, as the number of -deadline task increases. Therefore, an rb-tree is used, and tasks are queued in it according to the following logic: - among two -priority (i.e., SCHED_BATCH/OTHER/RR/FIFO) tasks, the one with the higher (lower, actually!) prio wins; - among a -priority and a -deadline task, the latter always wins; - among two -deadline tasks, the one with the earliest deadline wins. Queueing and dequeueing functions are changed accordingly, for both the list of a task's pi-waiters and the list of tasks blocked on a pi-lock. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Dario Faggioli <raistlin@linux.it> Signed-off-by: Juri Lelli <juri.lelli@gmail.com> Signed-off-again-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1383831828-15501-10-git-send-email-juri.lelli@gmail.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Dario Faggioli authored
It is very likely that systems that wants/needs to use the new SCHED_DEADLINE policy also want to have the scheduling latency of the -deadline tasks under control. For this reason a new version of the scheduling wakeup latency, called "wakeup_dl", is introduced. As a consequence of applying this patch there will be three wakeup latency tracer: * "wakeup", that deals with all tasks in the system; * "wakeup_rt", that deals with -rt and -deadline tasks only; * "wakeup_dl", that deals with -deadline tasks only. Signed-off-by: Dario Faggioli <raistlin@linux.it> Signed-off-by: Juri Lelli <juri.lelli@gmail.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1383831828-15501-9-git-send-email-juri.lelli@gmail.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Harald Gustafsson authored
Make it possible to specify a period (different or equal than deadline) for -deadline tasks. Relative deadlines (D_i) are used on task arrivals to generate new scheduling (absolute) deadlines as "d = t + D_i", and periods (P_i) to postpone the scheduling deadlines as "d = d + P_i" when the budget is zero. This is in general useful to model (and schedule) tasks that have slow activation rates (long periods), but have to be scheduled soon once activated (short deadlines). Signed-off-by: Harald Gustafsson <harald.gustafsson@ericsson.com> Signed-off-by: Dario Faggioli <raistlin@linux.it> Signed-off-by: Juri Lelli <juri.lelli@gmail.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1383831828-15501-7-git-send-email-juri.lelli@gmail.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Dario Faggioli authored
Make the core scheduler and load balancer aware of the load produced by -deadline tasks, by updating the moving average like for sched_rt. Signed-off-by: Dario Faggioli <raistlin@linux.it> Signed-off-by: Juri Lelli <juri.lelli@gmail.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1383831828-15501-6-git-send-email-juri.lelli@gmail.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Juri Lelli authored
Introduces data structures relevant for implementing dynamic migration of -deadline tasks and the logic for checking if runqueues are overloaded with -deadline tasks and for choosing where a task should migrate, when it is the case. Adds also dynamic migrations to SCHED_DEADLINE, so that tasks can be moved among CPUs when necessary. It is also possible to bind a task to a (set of) CPU(s), thus restricting its capability of migrating, or forbidding migrations at all. The very same approach used in sched_rt is utilised: - -deadline tasks are kept into CPU-specific runqueues, - -deadline tasks are migrated among runqueues to achieve the following: * on an M-CPU system the M earliest deadline ready tasks are always running; * affinity/cpusets settings of all the -deadline tasks is always respected. Therefore, this very special form of "load balancing" is done with an active method, i.e., the scheduler pushes or pulls tasks between runqueues when they are woken up and/or (de)scheduled. IOW, every time a preemption occurs, the descheduled task might be sent to some other CPU (depending on its deadline) to continue executing (push). On the other hand, every time a CPU becomes idle, it might pull the second earliest deadline ready task from some other CPU. To enforce this, a pull operation is always attempted before taking any scheduling decision (pre_schedule()), as well as a push one after each scheduling decision (post_schedule()). In addition, when a task arrives or wakes up, the best CPU where to resume it is selected taking into account its affinity mask, the system topology, but also its deadline. E.g., from the scheduling point of view, the best CPU where to wake up (and also where to push) a task is the one which is running the task with the latest deadline among the M executing ones. In order to facilitate these decisions, per-runqueue "caching" of the deadlines of the currently running and of the first ready task is used. Queued but not running tasks are also parked in another rb-tree to speed-up pushes. Signed-off-by: Juri Lelli <juri.lelli@gmail.com> Signed-off-by: Dario Faggioli <raistlin@linux.it> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1383831828-15501-5-git-send-email-juri.lelli@gmail.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Dario Faggioli authored
Introduces the data structures, constants and symbols needed for SCHED_DEADLINE implementation. Core data structure of SCHED_DEADLINE are defined, along with their initializers. Hooks for checking if a task belong to the new policy are also added where they are needed. Adds a scheduling class, in sched/dl.c and a new policy called SCHED_DEADLINE. It is an implementation of the Earliest Deadline First (EDF) scheduling algorithm, augmented with a mechanism (called Constant Bandwidth Server, CBS) that makes it possible to isolate the behaviour of tasks between each other. The typical -deadline task will be made up of a computation phase (instance) which is activated on a periodic or sporadic fashion. The expected (maximum) duration of such computation is called the task's runtime; the time interval by which each instance need to be completed is called the task's relative deadline. The task's absolute deadline is dynamically calculated as the time instant a task (better, an instance) activates plus the relative deadline. The EDF algorithms selects the task with the smallest absolute deadline as the one to be executed first, while the CBS ensures each task to run for at most its runtime every (relative) deadline length time interval, avoiding any interference between different tasks (bandwidth isolation). Thanks to this feature, also tasks that do not strictly comply with the computational model sketched above can effectively use the new policy. To summarize, this patch: - introduces the data structures, constants and symbols needed; - implements the core logic of the scheduling algorithm in the new scheduling class file; - provides all the glue code between the new scheduling class and the core scheduler and refines the interactions between sched/dl and the other existing scheduling classes. Signed-off-by: Dario Faggioli <raistlin@linux.it> Signed-off-by: Michael Trimarchi <michael@amarulasolutions.com> Signed-off-by: Fabio Checconi <fchecconi@gmail.com> Signed-off-by: Juri Lelli <juri.lelli@gmail.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1383831828-15501-4-git-send-email-juri.lelli@gmail.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Dario Faggioli authored
Add the syscalls needed for supporting scheduling algorithms with extended scheduling parameters (e.g., SCHED_DEADLINE). In general, it makes possible to specify a periodic/sporadic task, that executes for a given amount of runtime at each instance, and is scheduled according to the urgency of their own timing constraints, i.e.: - a (maximum/typical) instance execution time, - a minimum interval between consecutive instances, - a time constraint by which each instance must be completed. Thus, both the data structure that holds the scheduling parameters of the tasks and the system calls dealing with it must be extended. Unfortunately, modifying the existing struct sched_param would break the ABI and result in potentially serious compatibility issues with legacy binaries. For these reasons, this patch: - defines the new struct sched_attr, containing all the fields that are necessary for specifying a task in the computational model described above; - defines and implements the new scheduling related syscalls that manipulate it, i.e., sched_setattr() and sched_getattr(). Syscalls are introduced for x86 (32 and 64 bits) and ARM only, as a proof of concept and for developing and testing purposes. Making them available on other architectures is straightforward. Since no "user" for these new parameters is introduced in this patch, the implementation of the new system calls is just identical to their already existing counterpart. Future patches that implement scheduling policies able to exploit the new data structure must also take care of modifying the sched_*attr() calls accordingly with their own purposes. Signed-off-by: Dario Faggioli <raistlin@linux.it> [ Rewrote to use sched_attr. ] Signed-off-by: Juri Lelli <juri.lelli@gmail.com> [ Removed sched_setscheduler2() for now. ] Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1383831828-15501-3-git-send-email-juri.lelli@gmail.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Ingo Molnar authored
Pick up the latest fixes before applying new changes. Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 12 Jan, 2014 1 commit
-
-
Rik van Riel authored
Thomas Hellstrom bisected a regression where erratic 3D performance is experienced on virtual machines as measured by glxgears. It identified commit 58d081b5 ("sched/numa: Avoid overloading CPUs on a preferred NUMA node") as the problem which had modified the behaviour of effective_load. Effective load calculates the difference to the system-wide load if a scheduling entity was moved to another CPU. The task group is not heavier as a result of the move but overall system load can increase/decrease as a result of the change. Commit 58d081b5 ("sched/numa: Avoid overloading CPUs on a preferred NUMA node") changed effective_load to make it suitable for calculating if a particular NUMA node was compute overloaded. To reduce the cost of the function, it assumed that a current sched entity weight of 0 was uninteresting but that is not the case. wake_affine() uses a weight of 0 for sync wakeups on the grounds that it is assuming the waking task will sleep and not contribute to load in the near future. In this case, we still want to calculate the effective load of the sched entity hierarchy. As effective_load is no longer used by task_numa_compare since commit fb13c7ee (sched/numa: Use a system-wide search to find swap/migration candidates), this patch simply restores the historical behaviour. Reported-and-tested-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Rik van Riel <riel@redhat.com> [ Wrote changelog] Signed-off-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20140106113912.GC6178@suse.deSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
- 10 Jan, 2014 20 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds authored
Pull networking fixes from David Miller: "Famouse last words: "final pull request" :-) I'm sending this because Jason Wang's fixes are pretty important 1) Add missing per-cpu stats initialization to ip6_vti. Otherwise lockdep spits out a call trace. From Li RongQing. 2) Fix NULL oops in wireless hwsim, from Javier Lopez 3) TIPC deferred packet queue unlink must NULL out skb->next to avoid crashes. From Erik Hugne 4) Fix access to uninitialized buffer in nf_nat netfilter code, from Daniel Borkmann 5) Fix lifetime of ipv6 loopback and SIT tunnel addresses, otherwise they basically timeout immediately. From Hannes Frederic Sowa 6) Fix DMA unmapping of TSO packets in bnx2x driver, from Michal Schmidt 7) Do not allow L2 forwarding offload via macvtap device, the way things are now it will not end up being forwaded at all. From Jason Wang 8) Fix transmit queue selection via ndo_dfwd_start_xmit(), fixing things like applying NETIF_F_LLTX to the wrong device (!!) and eliding the proper transmit watchdog handling 9) qlcnic driver was not updating tx statistics at all, from Manish Chopra" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: qlcnic: Fix ethtool statistics length calculation qlcnic: Fix bug in TX statistics net: core: explicitly select a txq before doing l2 forwarding macvlan: forbid L2 fowarding offload for macvtap bnx2x: fix DMA unmapping of TSO split BDs ipv6: add link-local, sit and loopback address with INFINITY_LIFE_TIME bnx2x: prevent WARN during driver unload tipc: correctly unlink packets from deferred packet queue ipv6: pcpu_tstats.syncp should be initialised in ip6_vti.c netfilter: only warn once on wrong seqadj usage netfilter: nf_nat: fix access to uninitialized buffer in IRC NAT helper NFC: Fix target mode p2p link establishment iwlwifi: add new devices for 7265 series mac80211: move "bufferable MMPDU" check to fix AP mode scan mac80211_hwsim: Fix NULL pointer dereference
-
git://oss.sgi.com/xfs/xfsLinus Torvalds authored
Pull xfs bugfixes from Ben Myers: "Here we have a bugfix for an off-by-one in the remote attribute verifier that results in a forced shutdown which you can hit with v5 superblock by creating a 64k xattr, and a fix for a missing destroy_work_on_stack() in the allocation worker. It's a bit late, but they are both fairly straightforward" * tag 'xfs-for-linus-v3.13-rc8' of git://oss.sgi.com/xfs/xfs: xfs: Calling destroy_work_on_stack() to pair with INIT_WORK_ONSTACK() xfs: fix off-by-one error in xfs_attr3_rmt_verify
-
Linus Torvalds authored
Merge branch 'leds-fixes-for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds Pull LED fix from Bryan Wu: "Pali Rohár and Pavel Machek reported the LED of Nokia N900 doesn't work with our latest 3.13-rc6 kernel. Milo fixed the regression here" * 'leds-fixes-for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds: leds: lp5521/5523: Remove duplicate mutex
-
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pmLinus Torvalds authored
Pull ACPI and power management fixes from Rafael Wysocki: - Recent commits modifying the lists of C-states in the intel_idle driver introduced bugs leading to crashes on some systems. Two fixes from Jiang Liu. - The ACPI AC driver should receive all types of notifications, but recent change made it ignore some of them. Fix from Alexander Mezin. - intel_pstate's validity checks for MSRs it depends on are not sufficient to catch the lack of support in nested KVM setups, so they are extended to cover that case. From Dirk Brandewie. - NEC LZ750/LS has a botched up _BIX method in its ACPI tables, so our ACPI battery driver needs a quirk for it. From Lan Tianyu. - The tpm_ppi driver sometimes leaks memory allocated by acpi_get_name(). Fix from Jiang Liu. * tag 'pm+acpi-3.13-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: intel_idle: close avn_cstates array with correct marker Revert "intel_idle: mark states tables with __initdata tag" ACPI / Battery: Add a _BIX quirk for NEC LZ750/LS intel_pstate: Add X86_FEATURE_APERFMPERF to cpu match parameters. ACPI / TPM: fix memory leak when walking ACPI namespace ACPI / AC: change notification handler type to ACPI_ALL_NOTIFY
-
git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-fixesLinus Torvalds authored
Pull MFD fix from Samuel Ortiz: "This is the 2nd MFD pull request for 3.13 It only contains one fix for the rtsx_pcr driver. Without it we see a kernel panic on some machines, when resuming from suspend to RAM" * tag 'mfd-fixes-3.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-fixes: mfd: rtsx_pcr: Disable interrupts before cancelling delayed works
-
Milo Kim authored
It can be a problem when a pattern is loaded via the firmware interface. LP55xx common driver has already locked the mutex in 'lp55xx_firmware_loaded()'. So it should be deleted. On the other hand, locks are required in store_engine_load() on updating program memory. Reported-by: Pali Rohár <pali.rohar@gmail.com> Reported-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Milo Kim <milo.kim@ti.com> Signed-off-by: Bryan Wu <cooloney@gmail.com> Cc: <stable@vger.kernel.org>
-
Chuansheng Liu authored
In case CONFIG_DEBUG_OBJECTS_WORK is defined, it is needed to call destroy_work_on_stack() which frees the debug object to pair with INIT_WORK_ONSTACK(). Signed-off-by: Liu, Chuansheng <chuansheng.liu@intel.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com> (cherry picked from commit 6f96b306)
-
Jie Liu authored
With CRC check is enabled, if trying to set an attributes value just equal to the maximum size of XATTR_SIZE_MAX would cause the v3 remote attr write verification procedure failure, which would yield the back trace like below: <snip> XFS (sda7): Internal error xfs_attr3_rmt_write_verify at line 191 of file fs/xfs/xfs_attr_remote.c <snip> Call Trace: [<ffffffff816f0042>] dump_stack+0x45/0x56 [<ffffffffa0d99c8b>] xfs_error_report+0x3b/0x40 [xfs] [<ffffffffa0d96edd>] ? _xfs_buf_ioapply+0x6d/0x390 [xfs] [<ffffffffa0d99ce5>] xfs_corruption_error+0x55/0x80 [xfs] [<ffffffffa0dbef6b>] xfs_attr3_rmt_write_verify+0x14b/0x1a0 [xfs] [<ffffffffa0d96edd>] ? _xfs_buf_ioapply+0x6d/0x390 [xfs] [<ffffffffa0d97315>] ? xfs_bdstrat_cb+0x55/0xb0 [xfs] [<ffffffffa0d96edd>] _xfs_buf_ioapply+0x6d/0x390 [xfs] [<ffffffff81184cda>] ? vm_map_ram+0x31a/0x460 [<ffffffff81097230>] ? wake_up_state+0x20/0x20 [<ffffffffa0d97315>] ? xfs_bdstrat_cb+0x55/0xb0 [xfs] [<ffffffffa0d9726b>] xfs_buf_iorequest+0x6b/0xc0 [xfs] [<ffffffffa0d97315>] xfs_bdstrat_cb+0x55/0xb0 [xfs] [<ffffffffa0d97906>] xfs_bwrite+0x46/0x80 [xfs] [<ffffffffa0dbfa94>] xfs_attr_rmtval_set+0x334/0x490 [xfs] [<ffffffffa0db84aa>] xfs_attr_leaf_addname+0x24a/0x410 [xfs] [<ffffffffa0db8893>] xfs_attr_set_int+0x223/0x470 [xfs] [<ffffffffa0db8b76>] xfs_attr_set+0x96/0xb0 [xfs] [<ffffffffa0db13b2>] xfs_xattr_set+0x42/0x70 [xfs] [<ffffffff811df9b2>] generic_setxattr+0x62/0x80 [<ffffffff811e0213>] __vfs_setxattr_noperm+0x63/0x1b0 [<ffffffff81307afe>] ? evm_inode_setxattr+0xe/0x10 [<ffffffff811e0415>] vfs_setxattr+0xb5/0xc0 [<ffffffff811e054e>] setxattr+0x12e/0x1c0 [<ffffffff811c6e82>] ? final_putname+0x22/0x50 [<ffffffff811c708b>] ? putname+0x2b/0x40 [<ffffffff811cc4bf>] ? user_path_at_empty+0x5f/0x90 [<ffffffff811bdfd9>] ? __sb_start_write+0x49/0xe0 [<ffffffff81168589>] ? vm_mmap_pgoff+0x99/0xc0 [<ffffffff811e07df>] SyS_setxattr+0x8f/0xe0 [<ffffffff81700c2d>] system_call_fastpath+0x1a/0x1f Tests: setfattr -n user.longxattr -v `perl -e 'print "A"x65536'` testfile This patch fix it to check the remote EA size is greater than the XATTR_SIZE_MAX rather than more than or equal to it, because it's valid if the specified EA value size is equal to the limitation as per VFS setxattr interface. Signed-off-by: Jie Liu <jeff.liu@oracle.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com> (cherry picked from commit 85dd0707)
-
Shahed Shaikh authored
o Consider number of Tx queues while calculating the length of Tx statistics as part of ethtool stats. o Calculate statistics lenght properly for 82xx and 83xx adapter Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Manish Chopra authored
o Driver was not updating TX stats so it was not populating statistics in `ifconfig` command output. Signed-off-by: Manish Chopra <manish.chopra@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jason Wang authored
Currently, the tx queue were selected implicitly in ndo_dfwd_start_xmit(). The will cause several issues: - NETIF_F_LLTX were removed for macvlan, so txq lock were done for macvlan instead of lower device which misses the necessary txq synchronization for lower device such as txq stopping or frozen required by dev watchdog or control path. - dev_hard_start_xmit() was called with NULL txq which bypasses the net device watchdog. - dev_hard_start_xmit() does not check txq everywhere which will lead a crash when tso is disabled for lower device. Fix this by explicitly introducing a new param for .ndo_select_queue() for just selecting queues in the case of l2 forwarding offload. netdev_pick_tx() was also extended to accept this parameter and dev_queue_xmit_accel() was used to do l2 forwarding transmission. With this fixes, NETIF_F_LLTX could be preserved for macvlan and there's no need to check txq against NULL in dev_hard_start_xmit(). Also there's no need to keep a dedicated ndo_dfwd_start_xmit() and we can just reuse the code of dev_queue_xmit() to do the transmission. In the future, it was also required for macvtap l2 forwarding support since it provides a necessary synchronization method. Cc: John Fastabend <john.r.fastabend@intel.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: e1000-devel@lists.sourceforge.net Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jason Wang authored
L2 fowarding offload will bypass the rx handler of real device. This will make the packet could not be forwarded to macvtap device. Another problem is the dev_hard_start_xmit() called for macvtap does not have any synchronization. Fix this by forbidding L2 forwarding for macvtap. Cc: John Fastabend <john.r.fastabend@intel.com> Cc: Neil Horman <nhorman@tuxdriver.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: John Fastabend <john.r.fastabend@intel.com.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wirelessDavid S. Miller authored
John W. Linville says: ==================== For the mac80211 bits, Johannes says: "I have a fix from Javier for mac80211_hwsim when used with wmediumd userspace, and a fix from Felix for buffering in AP mode." For the NFC bits, Samuel says: "This pull request only contains one fix for a regression introduced with commit e29a9e2a. Without this fix, we can not establish a p2p link in target mode. Only initiator mode works." For the iwlwifi bits, Emmanuel says: "It only includes new device IDs so it's not vital. If you have a pull request to net.git anyway, I'd happy to have this in." ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michal Schmidt authored
bnx2x triggers warnings with CONFIG_DMA_API_DEBUG=y: WARNING: CPU: 0 PID: 2253 at lib/dma-debug.c:887 check_unmap+0xf8/0x920() bnx2x 0000:28:00.0: DMA-API: device driver frees DMA memory with different size [device address=0x00000000da2b389e] [map size=1490 bytes] [unmap size=66 bytes] The reason is that bnx2x splits a TSO BD into two BDs (headers + data) using one DMA mapping for both, but it uses only the length of the first BD when unmapping. This patch fixes the bug by unmapping the whole length of the two BDs. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.linaro.org/people/mike.turquette/linuxLinus Torvalds authored
Pull clock fixes from Mike Turquette: "Late fixes for clock drivers. All of these fixes are for user-visible regressions, typically boot failures or other unsafe system configuration that causes badness" * tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mike.turquette/linux: clk: clk-divider: fix divisor > 255 bug clk: exynos: File scope reg_save array should depend on PM_SLEEP clk: samsung: exynos5250: Add CLK_IGNORE_UNUSED flag for the sysreg clock ARM: dts: exynos5250: Fix MDMA0 clock number clk: samsung: exynos5250: Add MDMA0 clocks clk: samsung: exynos5250: Fix ACP gate register offset clk: exynos5250: fix sysmmu_mfc{l,r} gate clocks clk: samsung: exynos4: Correct SRC_MFC register
-
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-socLinus Torvalds authored
Pull ARM SoC fixes from Olof Johansson: "A few fixes for Renesas platforms to fixup DMA masks (this started causing errors once the DMA API added checks for valid masks in 3.13)" * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: ARM: shmobile: mackerel: Fix coherent DMA mask ARM: shmobile: kzm9g: Fix coherent DMA mask ARM: shmobile: armadillo: Fix coherent DMA mask
-
Hannes Frederic Sowa authored
In the past the IFA_PERMANENT flag indicated, that the valid and preferred lifetime where ignored. Since change fad8da3e ("ipv6 addrconf: fix preferred lifetime state-changing behavior while valid_lft is infinity") we honour at least the preferred lifetime on those addresses. As such the valid lifetime gets recalculated and updated to 0. If loopback address is added manually this problem does not occur. Also if NetworkManager manages IPv6, those addresses will get added via inet6_rtm_newaddr and thus will have a correct lifetime, too. Reported-by: François-Xavier Le Bail <fx.lebail@yahoo.com> Reported-by: Damien Wyart <damien.wyart@gmail.com> Fixes: fad8da3e ("ipv6 addrconf: fix preferred lifetime state-changing behavior while valid_lft is infinity") Cc: Yasushi Asano <yasushi.asano@jp.fujitsu.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Yuval Mintz authored
Starting with commit 80c33ddd "net: add might_sleep() call to napi_disable" bnx2x fails the might_sleep tests causing a stack trace to appear whenever the driver is unloaded, as local_bh_disable() is being called before napi_disable(). This changes the locking schematics related to CONFIG_NET_RX_BUSY_POLL, preventing the need for calling local_bh_disable() and thus eliminating the issue. Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Rafael J. Wysocki authored
* pm-cpuidle: intel_idle: close avn_cstates array with correct marker Revert "intel_idle: mark states tables with __initdata tag"
-
Jiang Liu authored
Close avn_cstates array with correct marker to avoid overflow in function intel_idle_cpu_init(). [rjw: The problem was introduced when commit 22e580d0 was merged on top of eba682a5 (intel_idle: shrink states tables).] Fixes: 22e580d0 (intel_idle: Fixed C6 state on Avoton/Rangeley processors) Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 09 Jan, 2014 4 commits
-
-
John W. Linville authored
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
-
Jiang Liu authored
This reverts commit 9d046ccb. Commit 9d046ccb marks all state tables with __initdata, but the state table may be accessed when doing CPU online, which then causing system crash as below: [ 204.188841] BUG: unable to handle kernel paging request at ffffffff8227cce8 [ 204.196844] IP: [<ffffffff814aa1c0>] intel_idle_cpu_init+0x40/0x130 [ 204.203996] PGD 1e11067 PUD 1e12063 PMD 455859063 PTE 800000000227c062 [ 204.211638] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC [ 204.216975] Modules linked in: x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd gpio_ich microcode joydev sb_edac edac_core ipmi_si lpc_ich ipmi_msghandler lp tpm_tis parport wmi mac_hid acpi_pad hid_generic ixgbe isci usbhid dca hid libsas ptp ahci libahci scsi_transport_sas megaraid_sas pps_core mdio [ 204.262815] CPU: 11 PID: 1489 Comm: bash Not tainted 3.13.0-rc7+ #48 [ 204.269993] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRIVTIN1.86B.0047.L09.1312061514 12/06/2013 [ 204.281646] task: ffff8804303a24a0 ti: ffff880440fac000 task.ti: ffff880440fac000 [ 204.290311] RIP: 0010:[<ffffffff814aa1c0>] [<ffffffff814aa1c0>] intel_idle_cpu_init+0x40/0x130 [ 204.300184] RSP: 0018:ffff880440fadd28 EFLAGS: 00010286 [ 204.306192] RAX: ffffffff8227cca0 RBX: ffffe8fff1a03400 RCX: 0000000000000007 [ 204.314244] RDX: ffff88045f400000 RSI: 0000000000000009 RDI: 0000000000001120 [ 204.322296] RBP: ffff880440fadd38 R08: 0000000000000000 R09: 0000000000000001 [ 204.330411] R10: 0000000000000001 R11: 0000000000000000 R12: 000000000000001e [ 204.338482] R13: 00000000ffffffdb R14: 0000000000000001 R15: 0000000000000000 [ 204.346743] FS: 00007f64f7b0c740(0000) GS:ffff88045ce00000(0000) knlGS:0000000000000000 [ 204.355919] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 204.362449] CR2: ffffffff8227cce8 CR3: 0000000444ab0000 CR4: 00000000001407e0 [ 204.370520] Stack: [ 204.372853] 000000000000001e ffffffff81f10240 ffff880440fadd50 ffffffff814aa307 [ 204.381519] ffffffff81ea80e0 ffff880440fadda0 ffffffff8185a230 0000000000000000 [ 204.390196] 000000000000001e 0000000000000002 0000000000000002 0000000000000000 [ 204.398856] Call Trace: [ 204.401683] [<ffffffff814aa307>] cpu_hotplug_notify+0x57/0x70 [ 204.408638] [<ffffffff8185a230>] notifier_call_chain+0x100/0x150 [ 204.415553] [<ffffffff810a7dae>] __raw_notifier_call_chain+0xe/0x10 [ 204.422772] [<ffffffff81072163>] cpu_notify+0x23/0x50 [ 204.428616] [<ffffffff810723b2>] _cpu_up+0x132/0x1a0 [ 204.434361] [<ffffffff8107249d>] cpu_up+0x7d/0xa0 [ 204.439819] [<ffffffff81836c9c>] cpu_subsys_online+0x3c/0x90 [ 204.446345] [<ffffffff81554625>] device_online+0x45/0xa0 [ 204.452471] [<ffffffff815546ce>] online_store+0x4e/0x80 [ 204.458511] [<ffffffff815519a8>] dev_attr_store+0x18/0x30 [ 204.464744] [<ffffffff812a68f1>] sysfs_write_file+0x151/0x1c0 [ 204.471681] [<ffffffff81217ef1>] vfs_write+0xe1/0x160 [ 204.477524] [<ffffffff8121889c>] SyS_write+0x4c/0x90 [ 204.483270] [<ffffffff8185f2ed>] system_call_fastpath+0x1a/0x1f [ 204.490081] Code: 41 54 41 89 fc 8b 3d 48 25 85 01 53 48 8b 1d 30 25 85 01 48 03 1c c5 40 90 fb 81 48 8b 05 19 25 85 01 c7 43 0c 01 00 00 00 66 90 <48> 83 78 48 00 74 4f 41 83 c0 01 41 39 f0 7e 10 48 c7 c7 38 79 [ 204.515723] RIP [<ffffffff814aa1c0>] intel_idle_cpu_init+0x40/0x130 [ 204.522996] RSP <ffff880440fadd28> [ 204.526976] CR2: ffffffff8227cce8 [ 204.530766] ---[ end trace 336f56cc3d1cfc8c ]--- Fixes: 9d046ccb (intel_idle: mark states tables with __initdata tag) Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linuxLinus Torvalds authored
Pull parisc fix from Helge Deller: "This patch fixes the kmap/kunmap implementation on parisc and finally makes AIO work on parisc" * 'parisc-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Ensure full cache coherency for kmap/kunmap
-
git://git.kernel.org/pub/scm/linux/kernel/git/tj/libataLinus Torvalds authored
Pull libata fixes from Tejun Heo: "Late fixes for libata. Nothing too interesting. Adding missing PM callbacks to satat_sis and an additional PCI ID for ahci" * 'for-3.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: sata_sis: missing PM support ahci: add PCI ID for Marvell 88SE9170 SATA controller
-
- 08 Jan, 2014 4 commits
-
-
John David Anglin authored
Helge Deller noted a few weeks ago problems with the AIO support on parisc. This change is the result of numerous iterations on how best to deal with this problem. The solution adopted here is to provide full cache coherency in a uniform manner on all parisc systems. This involves calling flush_dcache_page() on kmap operations and flush_kernel_dcache_page() on kunmap operations. As a result, the copy_user_page() and clear_user_page() functions can be removed and the overall code is simpler. The change ensures that both userspace and kernel aliases to a mapped page are invalidated and flushed. This is necessary for the correct operation of PA8800 and PA8900 based systems which do not support inequivalent aliases. With this change, I have observed no cache related issues on c8000 and rp3440. It is now possible for example to do kernel builds with "-j64" on four way systems. On systems using XFS file systems, the patch recently posted by Mikulas Patocka to "fix crash using XFS on loopback" is needed to avoid a hang caused by an uninitialized lock passed to flush_dcache_page() in the page struct. Signed-off-by: John David Anglin <dave.anglin@bell.net> Cc: stable@vger.kernel.org # v3.9+ Signed-off-by: Helge Deller <deller@gmx.de>
-
git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-fixesJohn W. Linville authored
Samuel Ortiz <sameo@linux.intel.com> says: "This is the first NFC fixes pull request for 3.13. It only contains one fix for a regression introduced with commit e29a9e2a. Without this fix, we can not establish a p2p link in target mode. Only initiator mode works." Signed-off-by: John W. Linville <linville@tuxdriver.com>
-
James Hogan authored
Commit 6d9252bd (clk: Add support for power of two type dividers) merged in v3.6 added the _get_val function to convert a divisor value to a register field value depending on the flags. However it used the type u8 for the div field, causing divisors larger than 255 to be masked and the resultant clock rate to be too high. E.g. in my case an 11bit divider was supposed to divide 24.576 MHz down to 32.768KHz. The divisor was correctly calculated as 750 (0x2ee). This was masked to 238 (0xee) resulting in a frequency of 103.26KHz. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Rajendra Nayak <rnayak@ti.com> Cc: linux-arm-kernel@lists.infradead.org Cc: stable@vger.kernel.org Signed-off-by: Mike Turquette <mturquette@linaro.org>
-
git://anongit.freedesktop.org/nouveau/linux-2.6Dave Airlie authored
misc fixes for nouveau, one more msi rearm, regression fix for old bioses crash and leak fixes. * 'drm-nouveau-next' of git://anongit.freedesktop.org/nouveau/linux-2.6: drm/nouveau/nouveau: fix memory leak in nouveau_crtc_page_flip() drm/nouveau/bios: fix offset calculation for BMPv1 bioses drm/nouveau: return offset of allocated notifier drm/nouveau/bios: make jump conditional drm/nvce/mc: fix msi rearm on GF114 drm/nvc0/gr: fix mthd data submission drm/nouveau: populate master subdev pointer only when fully constructed
-