- 31 Jul, 2012 2 commits
-
-
Eric Dumazet authored
Input path is mostly run under RCU and doesn't touch dst refcnt. But output path on forwarding or UDP workloads hits dst refcount badly, and we have a lot of false sharing, for example in ipv4_mtu() when reading rt->rt_pmtu. Using a percpu cache for nh_rth_output gives a nice performance increase at a small cost. 24 udpflood test on my 24 cpu machine (dummy0 output device) (each process sends 1.000.000 udp frames, 24 processes are started): before: 5.24 s, after: 2.06 s. For reference, time on linux-3.5: 6.60 s. Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
commit 404e0a8b (net: ipv4: fix RCU races on dst refcounts) tried to solve a race but added a problem at device/fib dismantle time : We really want to call dst_free() as soon as possible, even if sockets still have dst in their cache. dst_release() calls in free_fib_info_rcu() are not welcomed. Root of the problem was that now we also cache output routes (in nh_rth_output), we must use call_rcu() instead of call_rcu_bh() in rt_free(), because output route lookups are done in process context. Based on feedback and initial patch from David Miller (adding another call_rcu_bh() call in fib, but it appears it was not the right fix) I left the inet_sk_rx_dst_set() helper and added __rcu attributes to nh_rth_output and nh_rth_input to better document what is going on in this code. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 30 Jul, 2012 18 commits
-
-
stephen hemminger authored
Simple table that can be marked const. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
After IP route cache removal, rt_cache_rebuild_count is no longer used. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
commit c6cffba4 (ipv4: Fix input route performance regression.) added various fatal races with dst refcounts. Crashes happen on tcp workloads if routes are added/deleted at the same time. The dst_free() calls from free_fib_info_rcu() are clearly racy. We need instead regular dst refcounting (dst_release()) and make sure dst_release() is aware of RCU grace periods: Add DST_RCU_FREE flag so that dst_release() respects an RCU grace period before dst destruction for cached dst. Introduce a new inet_sk_rx_dst_set() helper, using atomic_inc_not_zero() to make sure we don't increase a zero refcount (on a dst currently waiting an rcu grace period before destruction). rt_cache_route() must take a reference on the new cached route, and release it if it was not able to install it. With this patch, my machines survive various benchmarks. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
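The "increment only if not zero" idea behind atomic_inc_not_zero() can be sketched in userspace with C11 atomics. This is a minimal illustration, not the kernel implementation: the function name refcount_inc_not_zero() is a hypothetical stand-in, and the real helper operates on the kernel's atomic_t type.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical userspace stand-in for the kernel's atomic_inc_not_zero().
 * Takes a reference only if the count is still non-zero, so an object
 * that is already waiting for destruction is never revived.
 */
static bool refcount_inc_not_zero(atomic_int *refcnt)
{
    int old = atomic_load(refcnt);

    /* Retry until we either observe zero or win the compare-and-swap. */
    while (old != 0) {
        if (atomic_compare_exchange_weak(refcnt, &old, old + 1))
            return true;  /* reference taken */
        /* A failed CAS reloads 'old'; the loop condition re-checks it. */
    }
    return false;         /* count was zero: caller must not use the object */
}
```

A caller that gets false back treats the cached entry as absent and falls back to a full lookup, which is the behaviour the commit message describes for a dst that is waiting out its grace period.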
-
Eric Dumazet authored
early_demux() handlers should be called in RCU context, and as we use skb_dst_set_noref(skb, dst), caller must not exit from RCU context before dst use (skb_dst(skb)) or release (skb_drop(dst)) Therefore, rcu_read_lock()/rcu_read_unlock() pairs around ->early_demux() are confusing and not needed : Protocol handlers are already in an RCU read lock section. (__netif_receive_skb() does the rcu_read_lock() ) Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Mathias Krause authored
The tun module leaks up to 36 bytes of memory by not fully initializing a structure located on the stack that gets copied to user memory by the TUNGETIFF and SIOCGIFHWADDR ioctl()s. Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
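The usual remedy for this class of leak is to zero the whole structure before filling in the fields that are meant to be exposed. The sketch below shows that pattern in plain C; struct demo_reply and fill_reply() are hypothetical names for illustration and do not reproduce the real struct ifreq layout or the actual tun patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical reply structure; NOT the real struct ifreq layout. */
struct demo_reply {
    char name[16];
    unsigned short flags;
    unsigned char hwaddr[14];
};

/*
 * Zero every byte first (unused fields and padding included), then set
 * only the fields we intend to hand back. Anything left untouched is
 * guaranteed to be zero rather than stale stack contents.
 */
static void fill_reply(struct demo_reply *r, const char *name,
                       unsigned short flags)
{
    memset(r, 0, sizeof(*r));
    strncpy(r->name, name, sizeof(r->name) - 1); /* stays NUL-terminated */
    r->flags = flags;
}
```

Without the memset(), the hwaddr field and the tail of name would carry whatever happened to be on the stack when the structure was copied out.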
-
Michael Chan authored
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
Spinlock should be taken before checking for tp->hw_stats. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
After Power-on-reset, the 5719's TX DMA length registers may contain uninitialized values and cause TX DMA to stall. Check for invalid values and set a register bit to flush the TX channels. The bit needs to be turned off after the DMA channels have been flushed. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
The workaround was mis-applied to all 5719 and 5720 chips. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
to prevent PHY access conflict with APE firmware. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Li Wei authored
When userspace uses RTM_GETROUTE to dump the route table, with an already expired route entry, we always get an 'expires' value (2147157) calculated based on INT_MAX. The reason for this problem is in the following statement: rt->dst.expires - jiffies < INT_MAX. gcc promoted the type of both sides of '<' to unsigned long, thus a small negative value would be considered greater than INT_MAX. With the help of Eric Dumazet, do the out of bound checks in rtnl_put_cacheinfo(), _after_ conversion to clock_t. Signed-off-by: Li Wei <lw@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
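The comparison pitfall described above can be reproduced outside the kernel. The following sketch uses hypothetical helper names (fits_buggy, fits_signed) and plain unsigned long values in place of rt->dst.expires and jiffies; it illustrates the type-conversion issue only and is not the actual rtnl_put_cacheinfo() change.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Buggy form: both operands of '<' end up as unsigned long, so when
 * 'expires' is already in the past the subtraction wraps to a huge
 * value and the test fails.
 */
static bool fits_buggy(unsigned long expires, unsigned long now)
{
    return expires - now < INT_MAX;
}

/*
 * Safer form (an assumption for illustration): convert the difference
 * to a signed type first, then do the bounds check on the signed value.
 */
static bool fits_signed(unsigned long expires, unsigned long now)
{
    long delta = (long)(expires - now);

    return delta < INT_MAX;
}
```

With expires = 100 and now = 105, fits_buggy() reports that the value does not fit, while fits_signed() sees a delta of -5 and accepts it.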
-
Karsten Keil authored
The test for the fillempty condition was wrong in one place. Changed the variable to the right boolean type. Signed-off-by: Karsten Keil <keil@b1-systems.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Devendra Naga authored
The driver checks whether the dev_seeq pointer holds an error that can be read with PTR_ERR, and returns that error in the failure case, otherwise 0 on success. PTR_RET does the same thing, so use PTR_RET instead of open-coding it. Signed-off-by: Devendra Naga <develkernel412222@gmail.com> Acked-by: David Howells <dhowells@redhat.com> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
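For readers unfamiliar with the error-pointer convention, here is a userspace approximation of what IS_ERR, PTR_ERR and PTR_RET do. The demo_* names and DEMO_MAX_ERRNO are hypothetical; the kernel's own definitions live in its headers and may differ in detail.

```c
#include <errno.h>
#include <stdint.h>

/* Assumption: error codes occupy the top 4095 values of the address space. */
#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *demo_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* True if the pointer value falls in the reserved error range. */
static inline int demo_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

/* Recover the negative errno value from an error pointer. */
static inline long demo_ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

/* The open-coded "if error return it, else return 0" pattern, in one call. */
static inline int demo_ptr_ret(const void *p)
{
    return demo_is_err(p) ? (int)demo_ptr_err(p) : 0;
}
```

A valid pointer yields 0 from demo_ptr_ret(), while demo_err_ptr(-ENOMEM) yields -ENOMEM, which is the two-branch logic the commit replaces with a single helper.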
-
Devendra Naga authored
casting the void pointer is redundant (Documentation/CodingStyle) Signed-off-by: Devendra Naga <develkernel412222@gmail.com> Acked-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lin Ming authored
The first parameter struct trie *t is not used anymore. Remove it. Signed-off-by: Lin Ming <mlin@ss.pku.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lin Ming authored
It should print size of struct rt_trie_node * allocated instead of size of struct rt_trie_node. Signed-off-by: Lin Ming <mlin@ss.pku.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
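The distinction matters because an array of child slots stores pointers, not whole nodes. A minimal sketch, with a hypothetical struct demo_node standing in for struct rt_trie_node:

```c
#include <stddef.h>

/* Hypothetical node type; the real struct rt_trie_node differs. */
struct demo_node {
    unsigned long parent;
    unsigned int key;
    void *payload[4];
};

/* Bytes actually used by an array of n child pointers. */
static size_t slot_array_bytes(size_t n)
{
    return n * sizeof(struct demo_node *);
}

/* Bytes that would be reported if the struct size were used by mistake. */
static size_t node_array_bytes(size_t n)
{
    return n * sizeof(struct demo_node);
}
```

On typical platforms the second figure is several times larger than the first, so statistics computed with the struct size overstate the memory in use.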
-
brenohl@br.ibm.com authored
This patch fills the net_device vlan_features with the proper hardware features, thus improving the vlan interface performance. With the patch applied, I can see around 148% improvement on a TCP_STREAM test, from 3.5 Gb/s to 8.7 Gb/s. On TCP_RR, I see an 11% improvement, from 18k to 20k. The CPU utilization is almost the same in both cases, from the comparison above. Signed-off-by: Breno Leitao <brenohl@br.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 28 Jul, 2012 2 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds authored
Pull networking fixes from David Miller: "Several bug fixes, some to new features appearing in this merge window, some that have been around for a while. I have a short list of known problems that need to be sorted out, but all of them can be solved easily during the run up to 3.6-final. I'll be offline until Sunday afternoon, but nothing need hold up 3.6-rc1 and the close of the merge window, networking wise, at this point. 1) Fix interface check in ipv4 TCP early demux, from Eric Dumazet. 2) Fix a long standing bug in TCP DMA to userspace offload that can hang applications using MSG_TRUNC, from Jiri Kosina. 3) Don't allow TCP_USER_TIMEOUT to be negative, from Hangbin Liu. 4) Don't use GFP_KERNEL under spinlock in kaweth driver, from Dan Carpenter" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: tcp: perform DMA to userspace only if there is a task waiting for it Revert "openvswitch: potential NULL deref in sample()" ipv4: fix TCP early demux net: fix rtnetlink IFF_PROMISC and IFF_ALLMULTI handling USB: kaweth.c: use GFP_ATOMIC under spin_lock tcp: Add TCP_USER_TIMEOUT negative value check bcma: add missing iounmap on error path bcma: fix regression in interrupt assignment on mips mac80211_hwsim: fix possible race condition in usage of info->control.sta & control.vif
-
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Linus Torvalds authored
Pull ext4 updates from Ted Ts'o: "The usual collection of bug fixes and optimizations. Perhaps of greatest note is a speed up for parallel, non-allocating DIO writes, since we no longer take the i_mutex lock in that case. For bug fixes, we fix an incorrect overhead calculation which caused slightly incorrect results for df(1) and statfs(2). We also fixed bugs in the metadata checksum feature." * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (23 commits) ext4: undo ext4_calc_metadata_amount if we fail to claim space ext4: don't let i_reserved_meta_blocks go negative ext4: fix hole punch failure when depth is greater than 0 ext4: remove unnecessary argument from __ext4_handle_dirty_metadata() ext4: weed out ext4_write_super ext4: remove unnecessary superblock dirtying ext4: convert last user of ext4_mark_super_dirty() to ext4_handle_dirty_super() ext4: remove useless marking of superblock dirty ext4: fix ext4 mismerge back in January ext4: remove dynamic array size in ext4_chksum() ext4: remove unused variable in ext4_update_super() ext4: make quota as first class supported feature ext4: don't take the i_mutex lock when doing DIO overwrites ext4: add a new nolock flag in ext4_map_blocks ext4: split ext4_file_write into buffered IO and direct IO ext4: remove an unused statement in ext4_mb_get_buddy_page_lock() ext4: fix out-of-date comments in extents.c ext4: use s_csum_seed instead of i_csum_seed for xattr block ext4: use proper csum calculation in ext4_rename ext4: fix overhead calculation used by ext4_statfs() ...
-
- 27 Jul, 2012 18 commits
-
-
git://git.linaro.org/people/rmk/linux-arm
Linus Torvalds authored
Pull ARM updates from Russell King: "First ARM push of this merge window, post me coming back from holiday. This is what has been in linux-next for the last few weeks. Not much to say which isn't described by the commit summaries." * 'for-linus' of git://git.linaro.org/people/rmk/linux-arm: (32 commits) ARM: 7463/1: topology: Update cpu_power according to DT information ARM: 7462/1: topology: factorize the update of sibling masks ARM: 7461/1: topology: Add arch_scale_freq_power function ARM: 7456/1: ptrace: provide separate functions for tracing syscall {entry,exit} ARM: 7455/1: audit: move syscall auditing until after ptrace SIGTRAP handling ARM: 7454/1: entry: don't bother with syscall tracing on ret_from_fork path ARM: 7453/1: audit: only allow syscall auditing for pure EABI userspace ARM: 7452/1: delay: allow timer-based delay implementation to be selected ARM: 7451/1: arch timer: implement read_current_timer and get_cycles ARM: 7450/1: dcache: select DCACHE_WORD_ACCESS for little-endian ARMv6+ CPUs ARM: 7449/1: use generic strnlen_user and strncpy_from_user functions ARM: 7448/1: perf: remove arm_perf_pmu_ids global enumeration ARM: 7447/1: rwlocks: remove unused branch labels from trylock routines ARM: 7446/1: spinlock: use ticket algorithm for ARMv6+ locking implementation ARM: 7445/1: mm: update CONTEXTIDR register to contain PID of current process ARM: 7444/1: kernel: add arch-timer C3STOP feature ARM: 7460/1: remove asm/locks.h ARM: 7439/1: head.S: simplify initial page table mapping ARM: 7437/1: zImage: Allow DTB command line concatenation with ATAG_CMDLINE ARM: 7436/1: Do not map the vectors page as write-through on UP systems ...
-
Russell King authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
David S. Miller authored
John W. Linville says: ==================== These fixes are intended for the 3.6 stream. Hauke Mehrtens provides a pair of bcma fixes, one to fix a build regression on mips and another to correct a pair of missing iounmap calls. Thomas Huehn offers a mac80211_hwsim fix to avoid a possible use-after-free bug. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Kosina authored
Back in 2006, commit 1a2449a8 ("[I/OAT]: TCP recv offload to I/OAT") added support for receive offloading to IOAT dma engine if available. The code in tcp_rcv_established() tries to perform early DMA copy if applicable. It however does so without checking whether the userspace task is actually expecting the data in the buffer. This is not a problem under normal circumstances, but there is a corner case where this doesn't work -- and that's when MSG_TRUNC flag to recvmsg() is used. If the IOAT dma engine is not used, the code properly checks whether there is a valid ucopy.task and the socket is owned by userspace, but misses the check in the dmaengine case. This problem can be observed trivially in practice -- for example 'tbench' is a good reproducer, as it makes heavy use of MSG_TRUNC. On systems utilizing IOAT, you will soon find tbench waiting indefinitely in sk_wait_data(), as the data has already been early-copied in tcp_rcv_established() using dma engine. This patch introduces the same check we are performing in the simple iovec copy case to the IOAT case as well. It fixes the indefinite recvmsg(MSG_TRUNC) hangs. Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jesse Gross authored
This reverts commit 5b3e7e6c. The problem that the original commit was attempting to fix can never happen in practice because validation is done on a per-flow basis rather than a per-packet basis. Adding additional checks at runtime is unnecessary and inconsistent with the rest of the code. CC: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
commit 92101b3b (ipv4: Prepare for change of rt->rt_iif encoding.) invalidated TCP early demux, because rx_dst_ifindex is not properly initialized and checked. Also remove the use of inet_iif(skb) in favor of skb->skb_iif. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Benc authored
When device flags are set using rtnetlink, IFF_PROMISC and IFF_ALLMULTI flags are handled specially. Function dev_change_flags sets IFF_PROMISC and IFF_ALLMULTI bits in dev->gflags according to the passed value but do_setlink passes a result of rtnl_dev_combine_flags which takes those bits from dev->flags. This can be easily triggered by running 'tcpdump -i eth0 &' followed by 'ip l s up eth0': ip sets the IFF_UP flag in ifi_flags and ifi_change, which is combined with IFF_PROMISC by rtnl_dev_combine_flags, causing __dev_change_flags to set IFF_PROMISC in gflags. Reported-by: Max Matveev <makc@redhat.com> Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Dan Carpenter authored
The problem is that we call this with a spin lock held. The call tree is: kaweth_start_xmit() holds kaweth->device_lock. -> kaweth_async_set_rx_mode() -> kaweth_control() -> kaweth_internal_control_msg() The kaweth_internal_control_msg() function is only called from kaweth_control() which used GFP_ATOMIC for its allocations. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Hangbin Liu authored
TCP_USER_TIMEOUT is a TCP level socket option that takes an unsigned int. But patch "tcp: Add TCP_USER_TIMEOUT socket option" (dca43c75) didn't check for negative values. If a user assigns -1 to it, the option is set successfully and the socket waits for 4294967295 milliseconds. This patch adds a negative value check to avoid this issue. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Linus Torvalds authored
Pull TTY/Serial patches from Greg Kroah-Hartman: "Here's the "tiny" set of patches for 3.6-rc1 for the tty layer and serial drivers. They were cherry-picked from the tty-next branch of the tty git tree, as they are small and "obvious" fixes. The larger changes, as mentioned before, will be saved for the 3.7-rc1 merge window. All of these changes have been in the linux-next releases for quite a while. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>" * tag 'tty-3.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: pch_uart: Fix parity setting issue pch_uart: Fix rx error interrupt setting issue pch_uart: Fix missing break for 16 byte fifo tty ldisc: Close/Reopen race prevention should check the proper flag pch_uart: Add eg20t_port lock field, avoid recursive spinlocks vt: fix race in vt_waitactive() serial/of-serial: Add LPC3220 standard UART compatible string serial/8250: Add LPC3220 standard UART type serial_core: Update buffer overrun statistics. serial: samsung: Fixed wrong comparison for baudclk_rate
-
git://github.com/congwang/linux
Linus Torvalds authored
Pull final kmap_atomic cleanups from Cong Wang: "This should be the final round of cleanup, as the definitions of enum km_type finally get removed from the whole tree. The patches have been in linux-next for a long time." * 'kmap_atomic' of git://github.com/congwang/linux: pipe: remove KM_USER0 from comments vmalloc: remove KM_USER0 from comments feature-removal-schedule.txt: remove kmap_atomic(page, km_type) tile: remove km_type definitions um: remove km_type definitions asm-generic: remove km_type definitions avr32: remove km_type definitions frv: remove km_type definitions powerpc: remove km_type definitions arm: remove km_type definitions highmem: remove the deprecated form of kmap_atomic tile: remove usage of enum km_type frv: remove the second parameter of kmap_atomic_primary() jbd2: remove the second argument of kmap_atomic
-
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Linus Torvalds authored
Pull powerpc fixes from Benjamin Herrenschmidt: "Here's a handful of powerpc patches, a couple of regression fixes for problems introduced in the main batch in this merge window, a couple of defconfig updates, and some trivials. The radeonfb one is something that was long standing in SLES which I forgot to pickup earlier." * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc/ftrace: Trace function graph entry before updating index radeonfb: Add quirk for the graphics adapter in some JSxx powerpc: Lack of firmware flash support is not an error powerpc: Enable pseries hardware RNG and crypto modules powerpc: Update g5_defconfig powerpc/kvm/bookehv: Fix build regression powerpc: Set stack limit properly in crit_transfer_to_handler
-
Linus Torvalds authored
Merge tag 'cpumask-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus Pull cpumask changes from Rusty Russell: "Trivial comment changes to cpumask code. I guess it's getting boring." Boring is good. * tag 'cpumask-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: cpumask: cpulist_parse() comments correction init: add comments to keep initcall-names in sync with initcall levels cpumask: add a few comments of cpumask functions
-
John W. Linville authored
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
-
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Linus Torvalds authored
Pull ARM SoC fixes from Olof Johansson: "A mixed bag of fixes, some for merge window fallout (tegra, MXS), and a short series of fixes for marvell platforms that didn't make it in before 3.5." * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: ARM: mxs: fix compile error caused by prom_update_property change ARM: dt: tegra trimslice: enable USB2 port ARM: dt: tegra trimslice: add vbus-gpio property ARM: vt8500: Add maintainer for VT8500 architecture ARM: Kirkwood: Replace mrvl with marvell ARM: Orion: fix driver probe error handling with respect to clk ARM: Dove: Fixup ge00 initialisation ARM: Kirkwood: Fix PHY disable clk problems ARM: Kirkwood: Ensure runit clock always ticks. ARM: versatile: Don't use platform clock for Integrator & VE ARM: tegra: harmony: add regulator supply name and its input supply
-
git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds
Linus Torvalds authored
Pull LED subsystem update from Bryan Wu. * 'for-3.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds: (50 commits) leds-lp8788: forgotten unlock at lp8788_led_work LEDS: propagate error codes in blinkm_detect() LEDS: memory leak in blinkm_led_common_set() leds: add new lp8788 led driver LEDS: add BlinkM RGB LED driver, documentation and update MAINTAINERS leds: max8997: Simplify max8997_led_set_mode implementation leds/leds-s3c24xx: use devm_gpio_request leds: convert Network Space v2 LED driver to devm_kzalloc() and cleanup error exit path leds: convert DAC124S085 LED driver to devm_kzalloc() leds: convert LM3530 LED driver to devm_kzalloc() and cleanup error exit path leds: convert TCA6507 LED driver to devm_kzalloc() leds: convert Freescale MC13783 LED driver to devm_kzalloc() and cleanup error exit path leds: convert ADP5520 LED driver to devm_kzalloc() and cleanup error exit path leds: convert PCA955x LED driver to devm_kzalloc() and cleanup error exit path leds: convert Sun Fire LED driver to devm_kzalloc() and cleanup error exit path leds: convert PCA9532 LED driver to devm_kzalloc() leds: convert LT3593 LED driver to devm_kzalloc() leds: convert Renesas TPU LED driver to devm_kzalloc() and cleanup error exit path leds: convert LP5523 LED driver to devm_kzalloc() and cleanup error exit path leds: convert PCA9633 LED driver to devm_kzalloc() ...
-
Steven Rostedt authored
As Colin Cross ported my x86 change to ARM, he also pointed out that powerpc is also behind in this fix. The commit 722b3c74 "ftrace/graph: Trace function entry before updating index" fixes an issue with function graph tracing for x86, where if the called entry function decides not to trace interrupts, it can fail the check if an interrupt comes in just after the curr_ret_stack is updated. The solution is to call the entry function first, then update the curr_ret_stack if the entry function wants to be traced. Cc: Colin Cross <ccross@android.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Tony Breeds authored
These devices are set to 640x480 by firmware, switch them to 800x600@60 so that the graphical installer can run on remote console. Reported by IBM during SLES10 SP2 beta testing: https://bugzilla.novell.com/show_bug.cgi?id=461002 LTC50817 Signed-off-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-