- 03 Oct, 2011 40 commits
-
Sudhakar Rajashekhara authored
commit 810198bc upstream. DA850/OMAP-L138 EMAC driver uses random mac address instead of a fixed one because the mac address is not stuffed into EMAC platform data. This patch provides a function which reads the mac address stored in SPI flash (registered as MTD device) and populates the EMAC platform data. The function which reads the mac address is registered as a callback which gets called upon addition of MTD device. NOTE: In case the MAC address stored in SPI flash is erased, follow the instructions at [1] to restore it. [1] http://processors.wiki.ti.com/index.php/GSG:_OMAP-L138_DVEVM_Additional_Procedures#Restoring_MAC_address_on_SPI_Flash Modifications in v2: Guarded registering the mtd_notifier only when MTD is enabled. Earlier this was handled using mtd_has_partitions() call, but this has been removed in Linux v3.0. Modifications in v3: a. Guarded da850_evm_m25p80_notify_add() function and da850evm_spi_notifier structure with CONFIG_MTD macros. b. Renamed da850_evm_register_mtd_user() function to da850_evm_setup_mac_addr() and removed the struct mtd_notifier argument to this function. c. Passed the da850evm_spi_notifier structure to register_mtd_user() function. Modifications in v4: Moved the da850_evm_setup_mac_addr() function within the first CONFIG_MTD ifdef construct. Signed-off-by:
Sudhakar Rajashekhara <sudhakar.raj@ti.com> Signed-off-by:
Sekhar Nori <nsekhar@ti.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Linus Walleij authored
commit bb9ea778 upstream. I was intrigued by the fact that the clock stood still on the Integrator, but it wasn't strange at all, because the timer was set up all wrong and probably has been for a while. With this patch the clock starts ticking again: make the timer periodic (reload), |= on the divisor bit and load the timer before starting it. Signed-off-by:
Linus Walleij <linus.walleij@linaro.org> Signed-off-by:
Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Guenter Roeck authored
commit ff71c182 upstream. Current calculation is completely wrong. Add missing brackets to fix it. Signed-off-by:
Guenter Roeck <guenter.roeck@ericsson.com> Acked-by:
Jean Delvare <khali@linux-fr.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Konrad Rzeszutek Wilk authored
commit ed467e69 upstream. We have hit a couple of customer bugs where they would like to use those parameters to run a UP kernel, but both of those options turn off important sources of interrupt information, so we end up not being able to boot. The correct way is to pass in 'dom0_max_vcpus=1' on the Xen hypervisor line and the kernel will patch itself to be a UP kernel. Fixes bug: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=637308 Acked-by:
Ian Campbell <Ian.Campbell@eu.citrix.com> Signed-off-by:
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Igor Mammedov authored
commit d198d499 upstream. If a vmalloc page fault happens inside an interrupt handler with interrupts disabled, then on the exit path from the exception handler, when there are no pending interrupts, the following code (arch/x86/xen/xen-asm_32.S:112):
    cmpw $0x0001, XEN_vcpu_info_pending(%eax)
    sete XEN_vcpu_info_mask(%eax)
will enable interrupts even if they have been previously disabled according to eflags from the bounce frame (arch/x86/xen/xen-asm_32.S:99):
    testb $X86_EFLAGS_IF>>8, 8+1+ESP_OFFSET(%esp)
    setz XEN_vcpu_info_mask(%eax)
The solution is to set XEN_vcpu_info_mask only when it should be set according to
    cmpw $0x0001, XEN_vcpu_info_pending(%eax)
but not to clear it if there aren't any pending events. A reproducer for the bug is attached to RHBZ 707552. Signed-off-by:
Igor Mammedov <imammedo@redhat.com> Acked-by:
Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by:
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Girish K S authored
commit 49bb1e61 upstream. This patch fixes a problem in the sdhci-s3c host driver for Samsung SoCs. During the card identification stage the mmc core driver enumerates for the best bus width in combination with the highest available data rate. It starts enumerating from the highest bus width (8) down to the lowest width (1). For a few MMC cards the 4-bit bus enumeration fails and the core then tries the 1-bit bus enumeration. When switched to 1-bit bus mode the host driver has to clear the previous bus width setting and apply the new setting. This patch clears the previous bus mode and applies the new mode setting. Signed-off-by:
Girish K S <girish.shivananjappa@linaro.org> Acked-by:
Jaehoon Chung <jh80.chung@samsung.com> Signed-off-by:
Chris Ball <cjb@laptop.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
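The clear-then-set step described in the sdhci-s3c entry above can be sketched in plain C. This is a minimal illustration, not the driver's code: the bit values and the helper name `apply_bus_width` are assumptions chosen for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative control-register bits; the real driver's values may differ. */
#define CTRL_4BITBUS 0x02u
#define CTRL_8BITBUS 0x20u

/* Hypothetical helper: return ctrl with exactly one bus-width mode selected. */
static uint8_t apply_bus_width(uint8_t ctrl, unsigned int width)
{
    assert(width == 1 || width == 4 || width == 8);

    /* Clear whatever width was selected before ... */
    ctrl = (uint8_t)(ctrl & ~(CTRL_4BITBUS | CTRL_8BITBUS));

    /* ... then apply the new one; 1-bit mode means neither bit is set. */
    if (width == 8)
        ctrl = (uint8_t)(ctrl | CTRL_8BITBUS);
    else if (width == 4)
        ctrl = (uint8_t)(ctrl | CTRL_4BITBUS);

    return ctrl;
}
```

Without the clearing step, falling back from 4-bit to 1-bit mode would leave the 4-bit flag set, which is the kind of stale setting the entry describes.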
-
Mika Westerberg authored
commit 50a50f92 upstream. The default multithread workqueue can cause the same work to be executed concurrently on different CPUs. This isn't really suitable for clock gating, as it might have already gated the clock, and gating it twice results in both host->clk_old and host->ios.clock being set to 0. To prevent this from happening we use system_nrt_wq instead. Signed-off-by:
Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by:
Linus Walleij <linus.walleij@linaro.org> Tested-by:
Chris Ball <cjb@laptop.org> Signed-off-by:
Chris Ball <cjb@laptop.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Mika Westerberg authored
commit 778e277c upstream. We have seen at least two different races when clock gating kicks in in the middle of an ios structure update. The first one happens when ios->clock is changed outside of the aggressive clock gating framework, for example via mmc_set_clock(). The race might happen when we run the following code:
    mmc_set_ios():
        ...
        if (ios->clock > 0)
            mmc_set_ungated(host);
Now if gating kicks in right after the condition check we end up setting host->clk_gated to false even though we have just gated the clock. Next time a request is started we try to ungate and restore the clock in mmc_host_clk_hold(). However, since we have host->clk_gated set to false, the original clock is not restored. This eventually will cause the host controller to hang since its clock is disabled while we are trying to issue a request. For example on Intel Medfield platform we see:
    [ 13.818610] mmc2: Timeout waiting for hardware interrupt.
    [ 13.818698] sdhci: =========== REGISTER DUMP (mmc2)===========
    [ 13.818753] sdhci: Sys addr: 0x00000000 | Version: 0x00008901
    [ 13.818804] sdhci: Blk size: 0x00000000 | Blk cnt: 0x00000000
    [ 13.818853] sdhci: Argument: 0x00000000 | Trn mode: 0x00000000
    [ 13.818903] sdhci: Present: 0x1fff0000 | Host ctl: 0x00000001
    [ 13.818951] sdhci: Power: 0x0000000d | Blk gap: 0x00000000
    [ 13.819000] sdhci: Wake-up: 0x00000000 | Clock: 0x00000000
    [ 13.819049] sdhci: Timeout: 0x00000000 | Int stat: 0x00000000
    [ 13.819098] sdhci: Int enab: 0x00ff00c3 | Sig enab: 0x00ff00c3
    [ 13.819147] sdhci: AC12 err: 0x00000000 | Slot int: 0x00000000
    [ 13.819196] sdhci: Caps: 0x6bee32b2 | Caps_1: 0x00000000
    [ 13.819245] sdhci: Cmd: 0x00000000 | Max curr: 0x00000000
    [ 13.819292] sdhci: Host ctl2: 0x00000000
    [ 13.819331] sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0x00000000
    [ 13.819377] sdhci: ===========================================
    [ 13.919605] mmc2: Reset 0x2 never completed.
and it never recovers.
The second race might happen while running mmc_power_off():
    static void mmc_power_off(struct mmc_host *host)
    {
        host->ios.clock = 0;
        host->ios.vdd = 0;

        [ clock gating kicks in here ]

        /*
         * Reset ocr mask to be the highest possible voltage supported for
         * this mmc host. This value will be used at next power up.
         */
        host->ocr = 1 << (fls(host->ocr_avail) - 1);

        if (!mmc_host_is_spi(host)) {
            host->ios.bus_mode = MMC_BUSMODE_OPENDRAIN;
            host->ios.chip_select = MMC_CS_DONTCARE;
        }
        host->ios.power_mode = MMC_POWER_OFF;
        host->ios.bus_width = MMC_BUS_WIDTH_1;
        host->ios.timing = MMC_TIMING_LEGACY;
        mmc_set_ios(host);
    }
If the clock gating worker kicks in while we have only partially updated the ios structure, the host controller gets an incomplete ios and might not work as expected. Again on Intel Medfield platform we get:
    [ 4.185349] kernel BUG at drivers/mmc/host/sdhci.c:1155!
    [ 4.185422] invalid opcode: 0000 [#1] PREEMPT SMP
    [ 4.185509] Modules linked in:
    [ 4.185565]
    [ 4.185608] Pid: 4, comm: kworker/0:0 Not tainted 3.0.0+ #240 Intel Corporation Medfield/iCDKA
    [ 4.185742] EIP: 0060:[<c136364e>] EFLAGS: 00010083 CPU: 0
    [ 4.185827] EIP is at sdhci_set_power+0x3e/0xd0
    [ 4.185891] EAX: f5ff98e0 EBX: f5ff98e0 ECX: 00000000 EDX: 00000001
    [ 4.185970] ESI: f5ff977c EDI: f5ff9904 EBP: f644fe98 ESP: f644fe94
    [ 4.186049] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
    [ 4.186125] Process kworker/0:0 (pid: 4, ti=f644e000 task=f644c0e0 task.ti=f644e000)
    [ 4.186219] Stack:
    [ 4.186257] f5ff98e0 f644feb0 c1365173 00000282 f5ff9460 f5ff96e0 f5ff96e0 f644feec
    [ 4.186418] c1355bd8 f644c0e0 c1499c3d f5ff96e0 f644fed4 00000006 f5ff96e0 00000286
    [ 4.186579] f644fedc c107922b f644feec 00000286 f5ff9460 f5ff9700 f644ff10 c135839e
    [ 4.186739] Call Trace:
    [ 4.186802] [<c1365173>] sdhci_set_ios+0x1c3/0x340
    [ 4.186883] [<c1355bd8>] mmc_gate_clock+0x68/0x120
    [ 4.186963] [<c1499c3d>] ? _raw_spin_unlock_irqrestore+0x4d/0x60
    [ 4.187052] [<c107922b>] ? trace_hardirqs_on+0xb/0x10
    [ 4.187134] [<c135839e>] mmc_host_clk_gate_delayed+0xbe/0x130
    [ 4.187219] [<c105ec09>] ? process_one_work+0xf9/0x5b0
    [ 4.187300] [<c135841d>] mmc_host_clk_gate_work+0xd/0x10
    [ 4.187379] [<c105ec82>] process_one_work+0x172/0x5b0
    [ 4.187457] [<c105ec09>] ? process_one_work+0xf9/0x5b0
    [ 4.187538] [<c1358410>] ? mmc_host_clk_gate_delayed+0x130/0x130
    [ 4.187625] [<c105f3c8>] worker_thread+0x118/0x330
    [ 4.187700] [<c1496cee>] ? preempt_schedule+0x2e/0x50
    [ 4.187779] [<c105f2b0>] ? rescuer_thread+0x1f0/0x1f0
    [ 4.187857] [<c1062cf4>] kthread+0x74/0x80
    [ 4.187931] [<c1062c80>] ? __init_kthread_worker+0x60/0x60
    [ 4.188015] [<c149acfa>] kernel_thread_helper+0x6/0xd
    [ 4.188079] Code: 81 fa 00 00 04 00 0f 84 a7 00 00 00 7f 21 81 fa 80 00 00 00 0f 84 92 00 00 00 81 fa 00 00 0
    [ 4.188780] EIP: [<c136364e>] sdhci_set_power+0x3e/0xd0 SS:ESP 0068:f644fe94
    [ 4.188898] ---[ end trace a7b23eecc71777e4 ]---
This BUG() comes from the fact that ios.power_mode still had its previous value (MMC_POWER_ON) while ios.vdd was already set to zero. We prevent these races by inhibiting the clock gating while we update the ios structure. Both problems can be reproduced by simply running the device in a reboot loop. Signed-off-by:
Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by:
Linus Walleij <linus.walleij@linaro.org> Tested-by:
Chris Ball <cjb@laptop.org> Signed-off-by:
Chris Ball <cjb@laptop.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Mika Westerberg authored
commit 08c14071 upstream. As per suggestion by Linus Walleij:
> If you think the names of the functions are confusing then
> you may rename them, say like this:
>
> mmc_host_clk_ungate() -> mmc_host_clk_hold()
> mmc_host_clk_gate() -> mmc_host_clk_release()
>
> Which would make the usecases more clear
(This is CC'd to stable@ because the next two patches, which fix observable races, depend on it.) Signed-off-by:
Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by:
Linus Walleij <linus.walleij@linaro.org> Signed-off-by:
Chris Ball <cjb@laptop.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Andrey Vagin authored
commit 20afc60f upstream. An event may occur when an mm is already released. I added an event in dequeue_entity() and caught a panic with the following backtrace:
    [ 434.421110] BUG: unable to handle kernel NULL pointer dereference at 0000000000000050
    [ 434.421258] IP: [<ffffffff810464ac>] __get_user_pages_fast+0x9c/0x120
    ...
    [ 434.421258] Call Trace:
    [ 434.421258] [<ffffffff8101ae81>] copy_from_user_nmi+0x51/0xf0
    [ 434.421258] [<ffffffff8109a0d5>] ? sched_clock_local+0x25/0x90
    [ 434.421258] [<ffffffff8101b048>] perf_callchain_user+0x128/0x170
    [ 434.421258] [<ffffffff811154cd>] ? __perf_event_header__init_id+0xed/0x100
    [ 434.421258] [<ffffffff81116690>] perf_prepare_sample+0x200/0x280
    [ 434.421258] [<ffffffff81118da8>] __perf_event_overflow+0x1b8/0x290
    [ 434.421258] [<ffffffff81065240>] ? tg_shares_up+0x0/0x670
    [ 434.421258] [<ffffffff8104fe1a>] ? walk_tg_tree+0x6a/0xb0
    [ 434.421258] [<ffffffff81118f44>] perf_swevent_overflow+0xc4/0xf0
    [ 434.421258] [<ffffffff81119150>] do_perf_sw_event+0x1e0/0x250
    [ 434.421258] [<ffffffff81119204>] perf_tp_event+0x44/0x70
    [ 434.421258] [<ffffffff8105701f>] ftrace_profile_sched_block+0xdf/0x110
    [ 434.421258] [<ffffffff8106121d>] dequeue_entity+0x2ad/0x2d0
    [ 434.421258] [<ffffffff810614ec>] dequeue_task_fair+0x1c/0x60
    [ 434.421258] [<ffffffff8105818a>] dequeue_task+0x9a/0xb0
    [ 434.421258] [<ffffffff810581e2>] deactivate_task+0x42/0xe0
    [ 434.421258] [<ffffffff814bc019>] thread_return+0x191/0x808
    [ 434.421258] [<ffffffff81098a44>] ? switch_task_namespaces+0x24/0x60
    [ 434.421258] [<ffffffff8106f4c4>] do_exit+0x464/0x910
    [ 434.421258] [<ffffffff8106f9c8>] do_group_exit+0x58/0xd0
    [ 434.421258] [<ffffffff8106fa57>] sys_exit_group+0x17/0x20
    [ 434.421258] [<ffffffff8100b202>] system_call_fastpath+0x16/0x1b
Signed-off-by:
Andrey Vagin <avagin@openvz.org> Signed-off-by:
Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1314693156-24131-1-git-send-email-avagin@openvz.org Signed-off-by:
Ingo Molnar <mingo@elte.hu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
WANG Cong authored
commit feff8fa0 upstream. This patch fixes the following memory leak:
    unreferenced object 0xffff880107266800 (size 512):
      comm "sched-powersave", pid 3718, jiffies 4323097853 (age 27495.450s)
      hex dump (first 32 bytes):
        00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
        00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
      backtrace:
        [<ffffffff81133940>] create_object+0x187/0x28b
        [<ffffffff814ac103>] kmemleak_alloc+0x73/0x98
        [<ffffffff811232ba>] __kmalloc_node+0x104/0x159
        [<ffffffff81044b98>] kzalloc_node.clone.97+0x15/0x17
        [<ffffffff8104cb90>] build_sched_domains+0xb7/0x7f3
        [<ffffffff8104d4df>] partition_sched_domains+0x1db/0x24a
        [<ffffffff8109ee4a>] do_rebuild_sched_domains+0x3b/0x47
        [<ffffffff810a00c7>] rebuild_sched_domains+0x10/0x12
        [<ffffffff8104d5ba>] sched_power_savings_store+0x6c/0x7b
        [<ffffffff8104d5df>] sched_mc_power_savings_store+0x16/0x18
        [<ffffffff8131322c>] sysdev_class_store+0x20/0x22
        [<ffffffff81193876>] sysfs_write_file+0x108/0x144
        [<ffffffff81135b10>] vfs_write+0xaf/0x102
        [<ffffffff81135d23>] sys_write+0x4d/0x74
        [<ffffffff814c8a42>] system_call_fastpath+0x16/0x1b
        [<ffffffffffffffff>] 0xffffffffffffffff
Signed-off-by:
WANG Cong <amwang@redhat.com> Signed-off-by:
Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1313671017-4112-1-git-send-email-amwang@redhat.com Signed-off-by:
Ingo Molnar <mingo@elte.hu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Thomas Gleixner authored
commit 9c40cef2 upstream. There is no real reason to run blk_schedule_flush_plug() with interrupts and preemption disabled. Move it into schedule() and call it when the task is going voluntarily to sleep. There might be false positives when the task is woken between that call and actually scheduling, but that's not really different from being woken immediately after switching away. This fixes a deadlock in the scheduler where the blk_schedule_flush_plug() callchain enables interrupts and thereby allows a wakeup to happen of the task that's going to sleep. Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Signed-off-by:
Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/n/tip-dwfxtra7yg1b5r65m32ywtct@git.kernel.org Signed-off-by:
Ingo Molnar <mingo@elte.hu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Thomas Gleixner authored
commit c259e01a upstream. Block-IO and workqueues call into notifier functions from the scheduler core code with interrupts and preemption disabled. These calls should be made before entering the scheduler core. To simplify this, separate the scheduler core code into __schedule(). __schedule() is directly called from the places which set PREEMPT_ACTIVE and from schedule(). This allows us to add the work checks into schedule(), so they are only called when a task voluntarily goes to sleep. Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Signed-off-by:
Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20110622174918.813258321@linutronix.de Signed-off-by:
Ingo Molnar <mingo@elte.hu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
John Stultz authored
commit 938f97bc upstream. Thomas earlier submitted a fix to limit the RTC PIE freq, but picked 5000Hz out of the air. Willy noticed that we should instead use the 8192Hz max from the rtc man documentation. Cc: Willy Tarreau <w@1wt.eu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by:
John Stultz <john.stultz@linaro.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
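The frequency limit from the RTC entry above can be sketched as a simple range check. The 8192 Hz ceiling comes from the commit message; the struct, the helper name `set_pie_freq`, and the choice to reject (rather than clamp) out-of-range values are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Ceiling taken from the commit message (rtc man page maximum). */
#define MAX_PIE_FREQ_HZ 8192

/* Hypothetical stand-in for a device that stores its periodic IRQ rate. */
struct rtc_sketch {
    int irq_freq;
};

/* Returns 0 on success, -1 if freq is outside (0, MAX_PIE_FREQ_HZ]. */
static int set_pie_freq(struct rtc_sketch *rtc, int freq)
{
    assert(rtc != NULL);

    if (freq <= 0 || freq > MAX_PIE_FREQ_HZ)
        return -1;          /* leave the stored frequency untouched */

    rtc->irq_freq = freq;
    return 0;
}
```

With the earlier 5000 Hz limit, a caller asking for 8192 Hz would have been refused even though the documentation allows it.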
-
John Stultz authored
commit 6af7e471 upstream. It's possible to jam up the alarm timers by setting very small interval timers, which will cause the alarmtimer subsystem to spend all of its time firing and restarting timers. This can effectively lock up a box. A deeper fix is needed, closely mimicking the hrtimer code, but for now just cap the interval to 100us to avoid userland hanging the system. CC: Thomas Gleixner <tglx@linutronix.de> Signed-off-by:
John Stultz <john.stultz@linaro.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
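The interval cap from the alarmtimer entry above amounts to enforcing a floor on the repeat period. A minimal sketch, assuming the interval is held in nanoseconds and that a zero interval means "do not repeat" and is left alone; the helper name `cap_interval_ns` is hypothetical, and the real patch may treat zero differently.

```c
#include <assert.h>
#include <stdint.h>

/* 100us floor from the commit message, expressed in nanoseconds. */
#define MIN_INTERVAL_NS 100000

/* Hypothetical helper: raise very small repeat intervals to the floor. */
static int64_t cap_interval_ns(int64_t interval_ns)
{
    assert(interval_ns >= 0);

    /* Assumption of this sketch: 0 means one-shot and is not capped. */
    if (interval_ns != 0 && interval_ns < MIN_INTERVAL_NS)
        return MIN_INTERVAL_NS;

    return interval_ns;
}
```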
-
John Stultz authored
commit ea7802f6 upstream. Following common_timer_get, zero out the itimerspec passed in. CC: Thomas Gleixner <tglx@linutronix.de> Signed-off-by:
John Stultz <john.stultz@linaro.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
John Stultz authored
commit 971c90bf upstream. We don't check if old_setting is non null before assigning it, so correct this. CC: Thomas Gleixner <tglx@linutronix.de> Signed-off-by:
John Stultz <john.stultz@linaro.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
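The missing check from the old_setting entry above follows a common pattern for optional output parameters. A small sketch with hypothetical type and function names (`timer_setting_sketch`, `timer_set_sketch`), not the kernel's own:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an interval timer setting. */
struct timer_setting_sketch {
    long interval_ns;
    long value_ns;
};

/* Install new_setting; report the previous one only if the caller asked. */
static void timer_set_sketch(struct timer_setting_sketch *cur,
                             const struct timer_setting_sketch *new_setting,
                             struct timer_setting_sketch *old_setting)
{
    assert(cur != NULL && new_setting != NULL);

    /* old_setting is optional, so it must be checked before it is written. */
    if (old_setting != NULL)
        *old_setting = *cur;

    *cur = *new_setting;
}
```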
-
Troy Kisky authored
commit 425933b3 upstream. iomux-v3.c uses NO_PAD_CTRL as a 32 bit value, so it should not be shifted left by MUX_PAD_CTRL_SHIFT (41). Previously, anything requesting NO_PAD_CTRL would get their pad control register set to 0. Since it is a pad control mask, place it with the other mask values. Signed-off-by:
Troy Kisky <troy.kisky@boundarydevices.com> Acked-by:
Lothar Waßmann <LW@KARO-electronics.de> Tested-by:
Lothar Waßmann <LW@KARO-electronics.de> Signed-off-by:
Sascha Hauer <s.hauer@pengutronix.de> Cc: John Ogness <john.ogness@linutronix.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
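The NO_PAD_CTRL problem described above is easier to see with the packing written out. In this sketch only the shift of 41 comes from the commit message; the 18-bit field width, the flag position (bit 17), and all helper names are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PAD_CTRL_SHIFT 41                       /* from the commit message */
#define PAD_CTRL_FIELD 0x3ffffu                 /* assumed field width */
#define PAD_CTRL_MASK  ((uint64_t)PAD_CTRL_FIELD << PAD_CTRL_SHIFT)
#define NO_PAD_CTRL    (1u << 17)               /* assumed flag bit, not pre-shifted */

/* Pack a 32-bit pad control value into the 64-bit configuration word. */
static uint64_t pack_pad_ctrl(uint32_t ctrl)
{
    assert((ctrl & ~PAD_CTRL_FIELD) == 0);
    return (uint64_t)ctrl << PAD_CTRL_SHIFT;
}

/* Recover the 32-bit pad control value from the configuration word. */
static uint32_t unpack_pad_ctrl(uint64_t cfg)
{
    return (uint32_t)((cfg & PAD_CTRL_MASK) >> PAD_CTRL_SHIFT);
}

/* The flag is tested after unpacking, so it must live in the 32-bit domain. */
static int should_write_pad_ctrl(uint64_t cfg)
{
    return (unpack_pad_ctrl(cfg) & NO_PAD_CTRL) == 0;
}
```

A flag that is defined already shifted into the upper half of the 64-bit word truncates to zero when compared against the unpacked 32-bit value, so the test never fires and the register gets written anyway.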
-
Carolyn Wyborny authored
commit 6d337dce upstream. This patch fixes a problem where WOL would fail on second port of i350 device. Reported-by:
Martin Wilck <martin.wilck@ts.fujitsu.com> Reported-by: Stefan Assmann <sassmann@redhat.com> Signed-off-by:
Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by:
Aaron Brown <aaron.f.brown@intel.com> Signed-off-by:
Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Mel Gorman authored
commit 76d3fbf8 upstream. With zone_reclaim_mode enabled, it's possible for zones to be considered full in the zonelist_cache so they are skipped in the future. If the process enters direct reclaim, the ZLC may still consider zones to be full even after reclaiming pages. Reconsider all zones for allocation if direct reclaim returns successfully. Signed-off-by:
Mel Gorman <mgorman@suse.de> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Christoph Lameter <cl@linux.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Cc: Stefan Priebe <s.priebe@profihost.ag> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Mel Gorman authored
commit cd38b115 upstream. There have been a small number of complaints about significant stalls while copying large amounts of data on NUMA machines reported on a distribution bugzilla. In these cases, zone_reclaim was enabled by default due to large NUMA distances. In general, the complaints have not been about the workload itself unless it was a file server (in which case the recommendation was to disable zone_reclaim). The stalls are mostly due to significant amounts of time spent scanning the preferred zone for pages to free. After a failure, it might fall back to another node (as zonelists are often node-ordered rather than zone-ordered) but stall quickly again when the next allocation attempt occurs. In bad cases, each page allocated results in a full scan of the preferred zone. Patch 1 checks the preferred zone for recent allocation failure which is particularly important if zone_reclaim has failed recently. This avoids rescanning the zone in the near future and instead falling back to another node. This may hurt node locality in some cases but a failure to zone_reclaim is more expensive than a remote access. Patch 2 clears the zlc information after direct reclaim. Otherwise, zone_reclaim can mark zones full, direct reclaim can reclaim enough pages but the zone is still not considered for allocation. This was tested on a 24-thread 2-node x86_64 machine. The tests were focused on large amounts of IO. All tests were bound to the CPUs on node-0 to avoid disturbances due to processes being scheduled on different nodes.
The kernels tested are
    3.0-rc6-vanilla   Vanilla 3.0-rc6
    zlcfirst          Patch 1 applied
    zlcreconsider     Patches 1+2 applied

FS-Mark
./fs_mark -d /tmp/fsmark-10813 -D 100 -N 5000 -n 208 -L 35 -t 24 -S0 -s 524288
                   fsmark-3.0-rc6          3.0-rc6                 3.0-rc6
                   vanilla                 zlcfirst                zlcreconsider
Files/s  min       54.90 ( 0.00%)          49.80 (-10.24%)         49.10 (-11.81%)
Files/s  mean      100.11 ( 0.00%)         135.17 (25.94%)         146.93 (31.87%)
Files/s  stddev    57.51 ( 0.00%)          138.97 (58.62%)         158.69 (63.76%)
Files/s  max       361.10 ( 0.00%)         834.40 (56.72%)         802.40 (55.00%)
Overhead min       76704.00 ( 0.00%)       76501.00 ( 0.27%)       77784.00 (-1.39%)
Overhead mean      1485356.51 ( 0.00%)     1035797.83 (43.40%)     1594680.26 (-6.86%)
Overhead stddev    1848122.53 ( 0.00%)     881489.88 (109.66%)     1772354.90 ( 4.27%)
Overhead max       7989060.00 ( 0.00%)     3369118.00 (137.13%)    10135324.00 (-21.18%)

MMTests Statistics: duration
User/Sys Time Running Test (seconds)    501.49     493.91     499.93
Total Elapsed Time (seconds)            2451.57    2257.48    2215.92

MMTests Statistics: vmstat
Page Ins                    46268       63840       66008
Page Outs                   90821596    90671128    88043732
Swap Ins                    0           0           0
Swap Outs                   0           0           0
Direct pages scanned        13091697    8966863     8971790
Kswapd pages scanned        0           1830011     1831116
Kswapd pages reclaimed      0           1829068     1829930
Direct pages reclaimed      13037777    8956828     8648314
Kswapd efficiency           100%        99%         99%
Kswapd velocity             0.000       810.643     826.346
Direct efficiency           99%         99%         96%
Direct velocity             5340.128    3972.068    4048.788
Percentage direct scans     100%        83%         83%
Page writes by reclaim      0           3           0
Slabs scanned               796672      720640      720256
Direct inode steals         7422667     7160012     7088638
Kswapd inode steals         0           1736840     2021238

Test completes far faster with a large increase in the number of files created per second. Standard deviation is high as a small number of iterations were much higher than the mean. The number of pages scanned by zone_reclaim is reduced and kswapd is used for more work.
LARGE DD
                   3.0-rc6          3.0-rc6          3.0-rc6
                   vanilla          zlcfirst         zlcreconsider
download tar       59 ( 0.00%)      59 ( 0.00%)      55 ( 7.27%)
dd source files    527 ( 0.00%)     296 (78.04%)     320 (64.69%)
delete source      36 ( 0.00%)      19 (89.47%)      20 (80.00%)

MMTests Statistics: duration
User/Sys Time Running Test (seconds)    125.03    118.98    122.01
Total Elapsed Time (seconds)            624.56    375.02    398.06

MMTests Statistics: vmstat
Page Ins                    3594216f     439368        407032
Page Outs                   23380832     23380488      23377444
Swap Ins                    0            0             0
Swap Outs                   0            436           287
Direct pages scanned        17482342     69315973      82864918
Kswapd pages scanned        0            519123        575425
Kswapd pages reclaimed      0            466501        522487
Direct pages reclaimed      5858054      2732949       2712547
Kswapd efficiency           100%         89%           90%
Kswapd velocity             0.000        1384.254      1445.574
Direct efficiency           33%          3%            3%
Direct velocity             27991.453    184832.737    208171.929
Percentage direct scans     100%         99%           99%
Page writes by reclaim      0            5082          13917
Slabs scanned               17280        29952         35328
Direct inode steals         115257       1431122       332201
Kswapd inode steals         0            0             979532

This test downloads a large tarfile and copies it with dd a number of times - similar to the most recent bug report I've dealt with. Time to completion is reduced. The number of pages scanned directly is still disturbingly high with a low efficiency but this is likely due to the number of dirty pages encountered. The figures could probably be improved with more work around how kswapd is used and how dirty pages are handled but that is separate work and this result is significant on its own.
Streaming Mapped Writer

MMTests Statistics: duration
User/Sys Time Running Test (seconds)    124.47     111.67     112.64
Total Elapsed Time (seconds)            2138.14    1816.30    1867.56

MMTests Statistics: vmstat
Page Ins                    90760        89124        89516
Page Outs                   121028340    120199524    120736696
Swap Ins                    0            86           55
Swap Outs                   0            0            0
Direct pages scanned        114989363    96461439     96330619
Kswapd pages scanned        56430948     56965763     57075875
Kswapd pages reclaimed      27743219     27752044     27766606
Direct pages reclaimed      49777        46884        36655
Kswapd efficiency           49%          48%          48%
Kswapd velocity             26392.541    31363.631    30561.736
Direct efficiency           0%           0%           0%
Direct velocity             53780.091    53108.759    51581.004
Percentage direct scans     67%          62%          62%
Page writes by reclaim      385          122          1513
Slabs scanned               43008        39040        42112
Direct inode steals         0            10           8
Kswapd inode steals         733          534          477

This test just creates a large file mapping and writes to it linearly. Time to completion is again reduced. The gains are mostly down to two things. In many cases, there is less scanning as zone_reclaim simply gives up faster due to recent failures. The second reason is that memory is used more efficiently. Instead of scanning the preferred zone every time, the allocator falls back to another zone and uses it instead improving overall memory utilisation. This patch: initialise ZLC for first zone eligible for zone_reclaim. The zonelist cache (ZLC) is used among other things to record if zone_reclaim() failed for a particular zone recently. The intention is to avoid a high cost scanning extremely long zonelists or scanning within the zone uselessly. Currently the zonelist cache is setup only after the first zone has been considered and zone_reclaim() has been called. The objective was to avoid a costly setup but zone_reclaim is itself quite expensive. If it is failing regularly such as the first eligible zone having mostly mapped pages, the cost in scanning and allocation stalls is far higher than the ZLC initialisation step.
This patch initialises ZLC before the first eligible zone calls zone_reclaim(). Once initialised, it is checked whether the zone failed zone_reclaim recently. If it has, the zone is skipped. As the first zone is now being checked, additional care has to be taken about zones marked full. A zone can be marked "full" because it should not have enough unmapped pages for zone_reclaim but this is excessive as direct reclaim or kswapd may succeed where zone_reclaim fails. Only mark zones "full" after zone_reclaim fails if it failed to reclaim enough pages after scanning. Signed-off-by:
Mel Gorman <mgorman@suse.de> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Christoph Lameter <cl@linux.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Cc: Stefan Priebe <s.priebe@profihost.ag> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Alex Deucher authored
commit d054ac16 upstream. If the bios or OS sets the pci max read request size to 0 or an invalid value (6,7), it can result in a hang or slowdown. Check and set it to something sane if it's invalid. Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=42162 v2: use pci reg defines from include/linux/pci_regs.h Signed-off-by:
Alex Deucher <alexander.deucher@amd.com> Reviewed-by:
Michel Dänzer <michel.daenzer@amd.com> Signed-off-by:
Dave Airlie <airlied@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
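The check described in the max read request size entry above can be sketched against a raw PCIe Device Control value. The field position (bits 14:12) and the 128-byte-times-power-of-two encoding follow the PCI Express specification; the helper names and the 512-byte fallback are assumptions of this sketch, and the entry itself does not say which value the driver picks.

```c
#include <assert.h>
#include <stdint.h>

/* Max_Read_Request_Size field of the PCIe Device Control register. */
#define DEVCTL_READRQ_MASK  0x7000u
#define DEVCTL_READRQ_SHIFT 12
#define FALLBACK_CODE       2u      /* assumed fallback: 512 bytes */

/* Extract the 3-bit encoded size. */
static unsigned int readrq_code(uint16_t devctl)
{
    return (devctl & DEVCTL_READRQ_MASK) >> DEVCTL_READRQ_SHIFT;
}

/* Decode a valid code (0..5) to bytes: 128, 256, ..., 4096. */
static unsigned int readrq_bytes(unsigned int code)
{
    assert(code <= 5);
    return 128u << code;
}

/* Replace the codes the entry calls out (0, 6, 7) with the fallback. */
static uint16_t sanitize_readrq(uint16_t devctl)
{
    unsigned int code = readrq_code(devctl);

    if (code == 0 || code == 6 || code == 7)
        devctl = (uint16_t)((devctl & ~DEVCTL_READRQ_MASK) |
                            (FALLBACK_CODE << DEVCTL_READRQ_SHIFT));
    return devctl;
}
```

Only the three field bits are rewritten; every other bit of the register value passes through unchanged.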
-
Dave Airlie authored
commit 9adceaa5 upstream. On some Power rv100 cards, we have no ATY OF table, but we have no combios table either, and hence we refuse all modes on VGA-0 since we end up with a 0 max pixel clock. Signed-off-by:
Dave Airlie <airlied@redhat.com> Reviewed-by:
Alex Deucher <alexdeucher@gmail.com> Reviewed-by:
Jerome Glisse <jglisse@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
NeilBrown authored
commit 1b6afa17 upstream. I don't know what I was thinking putting 'rcu' after a dynamically sized array! The array could still be in use when we call rcu_free() (That is the point) so we mustn't corrupt it. Signed-off-by:
NeilBrown <neilb@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
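The layout rule behind the entry above: a structure that ends in a dynamically sized array must keep its fixed members, including the RCU bookkeeping, in front of that array, so that array elements never share storage with them. A userspace sketch with hypothetical stand-in types (`rcu_head_sketch`, `conf_sketch`), not the kernel's definitions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's RCU callback head. */
struct rcu_head_sketch {
    struct rcu_head_sketch *next;
    void (*func)(struct rcu_head_sketch *);
};

struct dev_info_sketch {
    unsigned long long end_sector;
};

/* Fixed-size members first, the dynamically sized array last. */
struct conf_sketch {
    struct rcu_head_sketch rcu;
    unsigned long long array_sectors;
    struct dev_info_sketch disks[];     /* flexible array member */
};

/* Allocate a zeroed conf with room for ndisks array elements. */
static struct conf_sketch *alloc_conf_sketch(size_t ndisks)
{
    assert(ndisks > 0);
    return calloc(1, sizeof(struct conf_sketch) +
                     ndisks * sizeof(struct dev_info_sketch));
}
```

With this ordering, readers that still walk `disks[]` during the grace period and the code that fills in `rcu` touch disjoint memory.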
-
Srinivas Kandagatla authored
commit 43c734be upstream. This patch fixes L2 cache size calculations for L2C-210, L2C-310 and PL310 by changing L2X0_AUX_CTRL_WAY_SIZE_MASK from 2 bits to 3 bits. The Auxiliary Control Register for L2C-210, L2C-310 and PL310 has 3 bits [19:17] for the way size, but the existing code only uses 2 bits to get this value. This results in incorrect cache size calculations. It also results in performing operations on the whole cache when we erroneously decide that the range is big enough (due to l2x0_size being too small), and in printing an incorrect cache size. Signed-off-by:
Srinivas Kandagatla <srinivas.kandagatla@st.com> Acked-by:
Will Deacon <will.deacon@arm.com> Signed-off-by:
Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
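The effect of the too-narrow mask in the L2 cache entry above can be shown with a small calculation. The field position [19:17] comes from the commit message; the decoding of the field as 2^(field + 3) KB per way and the helper name `cache_size_kb` are assumptions of this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define WAY_SIZE_SHIFT     17
#define WAY_SIZE_MASK_2BIT (0x3u << WAY_SIZE_SHIFT)  /* old, too narrow */
#define WAY_SIZE_MASK_3BIT (0x7u << WAY_SIZE_SHIFT)  /* covers bits [19:17] */

/* Total cache size in KB for a given aux register value and mask. */
static unsigned int cache_size_kb(uint32_t aux, uint32_t mask,
                                  unsigned int ways)
{
    unsigned int field;

    assert(ways > 0);
    field = (aux & mask) >> WAY_SIZE_SHIFT;

    /* Assumed decoding: way size in KB is 2^(field + 3). */
    return ways * (1u << (field + 3));
}
```

For a field value of 5 and 8 ways, the 3-bit mask gives 8 * 256 KB = 2048 KB, while the 2-bit mask drops the top bit, reads the field as 1, and reports 8 * 16 KB = 128 KB.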
-
Jerome Glisse authored
commit a49a50da upstream. For some reason the SPI block is in a broken state after module unloading. This leads to broken rendering after reloading the module. Fix this by resetting the SPI block in the CP resume function. Signed-off-by:
Jerome Glisse <jglisse@redhat.com> Reviewed-by:
Alex Deucher <alexander.deucher@amd.com> Signed-off-by:
Dave Airlie <airlied@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Alex Deucher authored
commit 302a8e8b upstream. Fixes resume on Compaq Presario V5245EU. Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=41642 Signed-off-by:
Alex Deucher <alexander.deucher@amd.com> Signed-off-by:
Dave Airlie <airlied@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
David S. Miller authored
commit 1a8e0da5 upstream. Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Axel Lin authored
commit d04156bc upstream. Also add a default case in tps65910_list_voltage_dcdc to silence 'volt' may be used uninitialized in this function warning. Signed-off-by:
Axel Lin <axel.lin@gmail.com> Acked-by:
Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by:
Liam Girdwood <lrg@slimlogic.co.uk> Cc: Johan Hovold <jhovold@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Kjetil Oftedal authored
commit 38f7f8f0 upstream. On Sun4d systems running in SMP mode, IRQ 14 is used for timer interrupts and has a specialized interrupt handler. IPI is currently set to use IRQ 14 as well, which causes it to trigger the timer interrupt handler, and not the IPI interrupt handler. The IPI interrupt is therefore changed to IRQ 13, which is the highest normally handled interrupt. This IRQ is also used for SBUS interrupts; however, there is nothing in the IPI/SBUS interrupt handlers that indicates that they will not handle sharing the interrupt. (IRQ 13 is indicated as audio interrupt, which is unlikely to be found in a sun4d system.) Signed-off-by:
Kjetil Oftedal <oftedal@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Ian Campbell authored
commit 4a0342ca upstream. CC arch/sparc/kernel/pcic.o arch/sparc/kernel/pcic.c: In function 'pcic_probe': arch/sparc/kernel/pcic.c:359:33: error: array subscript is above array bounds [-Werror=array-bounds] arch/sparc/kernel/pcic.c:359:8: error: array subscript is above array bounds [-Werror=array-bounds] arch/sparc/kernel/pcic.c:360:33: error: array subscript is above array bounds [-Werror=array-bounds] arch/sparc/kernel/pcic.c:360:8: error: array subscript is above array bounds [-Werror=array-bounds] arch/sparc/kernel/pcic.c:361:33: error: array subscript is above array bounds [-Werror=array-bounds] arch/sparc/kernel/pcic.c:361:8: error: array subscript is above array bounds [-Werror=array-bounds] cc1: all warnings being treated as errors I'm not particularly familiar with sparc but t_nmi (defined in head_32.S via the TRAP_ENTRY macro) and pcic_nmi_trap_patch (defined in entry.S) both appear to be 4 instructions long and I presume from the usage that instructions are int sized. Signed-off-by:
Ian Campbell <ian.campbell@citrix.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: sparclinux@vger.kernel.org Reviewed-by:
Sam Ravnborg <sam@ravnborg.org> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
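The array-bounds errors quoted above are what a compiler reports when code indexes past an object whose declared size is smaller than the accesses need. Declaring the object with its real length (four int-sized instructions, per the message's own presumption) makes the indexed accesses provably in bounds. The sketch below only illustrates that idea in portable C; the names trap_slot, trap_patch and install_trap_patch are hypothetical, and the instruction words are placeholder values.

```c
#include <assert.h>
#include <stddef.h>

#define TRAP_WORDS 4 /* assumed: four int-sized instructions per entry */

/* Destination and source are both declared with their full length,
 * so every index in 0..TRAP_WORDS-1 is visibly in bounds. */
static unsigned int trap_slot[TRAP_WORDS];
static const unsigned int trap_patch[TRAP_WORDS] = {
    0x01000000u, 0x01000000u, 0x01000000u, 0x01000000u /* placeholder words */
};

/* Copy the patch over the trap entry one word at a time. */
static void install_trap_patch(void)
{
    for (size_t i = 0; i < TRAP_WORDS; i++)
        trap_slot[i] = trap_patch[i];
}
```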
-
David S. Miller authored
[ Upstream commit 178a2960 ] Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
David S. Miller authored
commit 5598473a upstream. If we can't push the pending register windows onto the user's stack, we disallow signal delivery even if the signal would be delivered on a valid separate signal stack. Add a register window save area in the signal frame, and store any unsavable windows there. On sigreturn, if any windows are still queued up in the signal frame, try to push them back onto the stack, and if that fails, kill the process immediately. This allows the debug/tst-longjmp_chk2 glibc test case to pass. Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Mikael Pettersson authored
commit 3f6aa0b1 upstream. The sparc32 version of arch_write_unlock() is just a plain assignment. Unfortunately this allows the compiler to schedule side-effects in a protected region to occur after the HW-level unlock, which is broken. E.g., the following trivial test case gets miscompiled: #include <linux/spinlock.h> rwlock_t lock; int counter; void foo(void) { write_lock(&lock); ++counter; write_unlock(&lock); } Fixed by adding a compiler memory barrier to arch_write_unlock(). The sparc64 version combines the barrier and assignment into a single asm(), and implements the operation as a static inline, so that's what I did too. Compile-tested with sparc32_defconfig + CONFIG_SMP=y. Signed-off-by:
Mikael Pettersson <mikpe@it.uu.se> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
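The difference between a plain-assignment unlock and one with a compiler memory barrier can be sketched in portable GCC/Clang C. This is not the sparc32 implementation (which, per the message, uses a single asm() combining barrier and store); the function names here are hypothetical, and the sketch addresses compiler reordering only, not hardware memory ordering.

```c
#include <assert.h>

/*
 * Plain store: nothing stops the compiler from moving ordinary
 * (non-volatile) memory writes from the critical section to after
 * this store.
 */
static inline void unlock_plain(volatile unsigned int *lock)
{
    *lock = 0;
}

/*
 * Store preceded by a compiler barrier. The empty asm with a "memory"
 * clobber tells GCC/Clang that memory may be read or written here, so
 * pending writes (such as ++counter in the test case) must be emitted
 * before the lock word is cleared.
 */
static inline void unlock_with_barrier(volatile unsigned int *lock)
{
    __asm__ __volatile__("" : : : "memory");
    *lock = 0;
}
```

On a single thread both variants behave identically; the difference only shows in the generated instruction order when another CPU observes the lock word.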
-
Mikael Pettersson authored
commit a0fba3eb upstream. The sparc64 spinlock_64.h contains a number of operations defined first as static inline functions, and then as macros with the same names and parameters as the functions. Maybe this was needed at some point in the past, but now nothing seems to depend on these macros (checked with a recursive grep looking for ifdefs on these names). Other archs don't define these identity-macros. So this patch deletes these unnecessary macros. Compile-tested with sparc64_defconfig. Signed-off-by:
Mikael Pettersson <mikpe@it.uu.se> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Stanislaw Gruszka authored
commit 674db134 upstream. Patch should fix this oops: BUG: unable to handle kernel NULL pointer dereference at 000000a0 IP: [<f81b30c9>] rt2800usb_get_txwi+0x19/0x70 [rt2800usb] *pdpt = 0000000000000000 *pde = f000ff53f000ff53 Oops: 0000 [#1] SMP Pid: 198, comm: kworker/u:3 Tainted: G W 3.0.0-wl+ #9 LENOVO 6369CTO/6369CTO EIP: 0060:[<f81b30c9>] EFLAGS: 00010283 CPU: 1 EIP is at rt2800usb_get_txwi+0x19/0x70 [rt2800usb] EAX: 00000000 EBX: f465e140 ECX: f4494960 EDX: ef24c5f8 ESI: 810f21f5 EDI: f1da9960 EBP: f4581e80 ESP: f4581e70 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 Process kworker/u:3 (pid: 198, ti=f4580000 task=f4494960 task.ti=f4580000) Call Trace: [<f804790f>] rt2800_txdone_entry+0x2f/0xf0 [rt2800lib] [<c045110d>] ? warn_slowpath_common+0x7d/0xa0 [<f81b3a38>] ? rt2800usb_work_txdone+0x288/0x360 [rt2800usb] [<f81b3a38>] ? rt2800usb_work_txdone+0x288/0x360 [rt2800usb] [<f81b3a13>] rt2800usb_work_txdone+0x263/0x360 [rt2800usb] [<c046a8d6>] process_one_work+0x186/0x440 [<c046a85a>] ? process_one_work+0x10a/0x440 [<f81b37b0>] ? rt2800usb_probe_hw+0x120/0x120 [rt2800usb] [<c046c283>] worker_thread+0x133/0x310 [<c04885db>] ? trace_hardirqs_on+0xb/0x10 [<c046c150>] ? manage_workers+0x1e0/0x1e0 [<c047054c>] kthread+0x7c/0x90 [<c04704d0>] ? __init_kthread_worker+0x60/0x60 [<c0826b42>] kernel_thread_helper+0x6/0x1 The oops might happen because we check rt2x00queue_empty(queue) twice, but this condition can change between the checks, so rt2800_txdone_entry() can process an entry that was already processed by rt2800usb_txdone_entry_check() -> rt2x00lib_txdone_noinfo(), which has set entry->skb to NULL. Reported-by:
Justin Piszcz <jpiszcz@lucidpixels.com> Signed-off-by:
Stanislaw Gruszka <sgruszka@redhat.com> Acked-by:
Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by:
John W. Linville <linville@tuxdriver.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Stanislaw Gruszka authored
commit 4b1bfb7d upstream. Patch should fix this oops: BUG: unable to handle kernel NULL pointer dereference at 000000a0 IP: [<f8e06078>] rt2800usb_write_tx_desc+0x18/0xc0 [rt2800usb] *pdpt = 000000002408c001 *pde = 0000000024079067 *pte = 0000000000000000 Oops: 0000 [#1] SMP EIP: 0060:[<f8e06078>] EFLAGS: 00010282 CPU: 0 EIP is at rt2800usb_write_tx_desc+0x18/0xc0 [rt2800usb] EAX: 00000035 EBX: ef2bef10 ECX: 00000000 EDX: d40958a0 ESI: ef1865f8 EDI: ef1865f8 EBP: d4095878 ESP: d409585c DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 Call Trace: [<f8da5e85>] rt2x00queue_write_tx_frame+0x155/0x300 [rt2x00lib] [<f8da424c>] rt2x00mac_tx+0x7c/0x370 [rt2x00lib] [<c04882b2>] ? mark_held_locks+0x62/0x90 [<c081f645>] ? _raw_spin_unlock_irqrestore+0x35/0x60 [<c04884ba>] ? trace_hardirqs_on_caller+0x5a/0x170 [<c04885db>] ? trace_hardirqs_on+0xb/0x10 [<f8d618ac>] __ieee80211_tx+0x5c/0x1e0 [mac80211] [<f8d631fc>] ieee80211_tx+0xbc/0xe0 [mac80211] [<f8d63163>] ? ieee80211_tx+0x23/0xe0 [mac80211] [<f8d632e1>] ieee80211_xmit+0xc1/0x200 [mac80211] [<f8d63220>] ? ieee80211_tx+0xe0/0xe0 [mac80211] [<c0487d45>] ? lock_release_holdtime+0x35/0x1b0 [<f8d63986>] ? ieee80211_subif_start_xmit+0x446/0x5f0 [mac80211] [<f8d637dd>] ieee80211_subif_start_xmit+0x29d/0x5f0 [mac80211] [<f8d63924>] ? ieee80211_subif_start_xmit+0x3e4/0x5f0 [mac80211] [<c0760188>] ? sock_setsockopt+0x6a8/0x6f0 [<c0760000>] ? sock_setsockopt+0x520/0x6f0 [<c076daef>] dev_hard_start_xmit+0x2ef/0x650 The oops might happen because new entries are put into a queue (rt2x00queue_write_tx_frame()) in parallel with entries being removed after transmission finishes (rt2800usb_work_txdone()). There are cases when _txdone may process an entry that was not fully sent and set entry->skb to NULL. To fix this, check in _txdone whether the entry has flags that indicate a pending transmission, and wait until the flags are cleared. Reported-by:
Justin Piszcz <jpiszcz@lucidpixels.com> Signed-off-by:
Stanislaw Gruszka <sgruszka@redhat.com> Acked-by:
Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by:
John W. Linville <linville@tuxdriver.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
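The fix described above (have the completion path leave alone any entry still flagged as in flight) can be sketched as follows. This is a single-threaded simplification with hypothetical names: ENTRY_TX_PENDING, struct tx_entry and txdone_drain are not the driver's identifiers, the real code uses its own flag bits and atomic bit operations, and instead of waiting this sketch simply stops at the first pending entry so the caller can retry later.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bit marking an entry whose transmission is in flight. */
#define ENTRY_TX_PENDING (1u << 0)

struct tx_entry {
    unsigned int flags;
    void *skb; /* stands in for the packet buffer pointer */
};

/* An entry must not be completed while its transmission is pending. */
static bool tx_entry_busy(const struct tx_entry *e)
{
    return (e->flags & ENTRY_TX_PENDING) != 0;
}

/*
 * Complete entries in queue order and stop at the first one that is
 * still pending, so its skb is never cleared underneath the sender.
 * Returns the number of entries completed.
 */
static size_t txdone_drain(struct tx_entry *q, size_t n)
{
    size_t done = 0;

    while (done < n && !tx_entry_busy(&q[done])) {
        q[done].skb = NULL; /* completion releases the buffer */
        done++;
    }
    return done;
}
```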
-
Daniel Schwierzeck authored
commit fbe5e29e upstream. This oops has already been fixed with commit 27141666 atm: [br2684] Fix oops due to skb->dev being NULL It happens that if a packet arrives in a VC between the call to open it on the hardware and the call to change the backend to br2684, br2684_regvcc processes the packet and oopses dereferencing skb->dev because it is NULL before the call to br2684_push(). but has been introduced again with commit b6211ae7 atm: Use SKB queue and list helpers instead of doing it by-hand. Signed-off-by:
Daniel Schwierzeck <daniel.schwierzeck@googlemail.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Tejun Heo authored
commit 6d0e194d upstream. On AVERATEC 3200, pata_via causes memory corruption with ATAPI DMA, which often leads to random kernel oopses. The cause of the problem is not well understood yet, and only a small subset of machines using the controller seems affected. Blacklist ATAPI DMA on the machine. Signed-off-by:
Tejun Heo <tj@kernel.org> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=11426 Reported-and-tested-by:
Jim Bray <jimsantelmo@gmail.com> Cc: Alan Cox <alan@linux.intel.com> Signed-off-by:
Jeff Garzik <jgarzik@pobox.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
John Stanley authored
commit 4b00e4b3 upstream. Two additional savage4 variants were added, but the S3_SAVAGE4_SERIES macro was incompletely modified, resulting in a false positive detection of a savage4 card regardless of which savage card is actually present. For non-savage4 series cards, such as a Savage/IX-MV card, this results in garbled video and/or a hard-hang at boot time. Fix this by changing an '||' to an '&&' in the S3_SAVAGE4_SERIES macro. Signed-off-by:
John P. Stanley <jpsinthemix@verizon.net> Reviewed-by:
Tormod Volden <debian.tormod@gmail.com> [ The macros have incomplete parenthesis too, but whatever .. -Linus ] Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
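Why swapping '||' for '&&' matters in a range-check macro can be shown with a small sketch. The enum ordering and macro names below are hypothetical and are not copied from the savagefb headers; the point is only that "at or above the first member OR at or below the last member" is true for every value, while the '&&' form is true only inside the range.

```c
#include <assert.h>

/* Hypothetical chip ordering, for illustration only. */
enum chip {
    CHIP_SAVAGE3D,
    CHIP_SAVAGE_MX,
    CHIP_SAVAGE4,       /* first member of the series */
    CHIP_PROSAVAGE,
    CHIP_PROSAVAGE_DDR, /* last member of the series */
    CHIP_SAVAGE2000
};

/* Broken form: every chip satisfies at least one side of the '||'. */
#define IS_SAVAGE4_BROKEN(c) \
    (((c) >= CHIP_SAVAGE4) || ((c) <= CHIP_PROSAVAGE_DDR))

/* Fixed form: true only for chips between the two bounds, inclusive. */
#define IS_SAVAGE4_FIXED(c) \
    (((c) >= CHIP_SAVAGE4) && ((c) <= CHIP_PROSAVAGE_DDR))
```

Both macros parenthesize their argument and the whole expression, which also avoids the kind of precedence surprise hinted at in the bracketed note above.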
-