- 29 Oct, 2008 12 commits
-
-
Eric Dumazet authored
Goals are : 1) Optimizing handling of incoming Unicast UDP frames, so that no memory writes should happen in the fast path. Note: Multicasts and broadcasts still will need to take a lock, because doing a full lockless lookup in this case is difficult. 2) No expensive operations in the socket bind/unhash phases : - No expensive synchronize_rcu() calls. - No added rcu_head in socket structure, increasing memory needs, but more important, forcing us to use call_rcu() calls, that have the bad property of making sockets structure cold. (rcu grace period between socket freeing and its potential reuse make this socket being cold in CPU cache). David did a previous patch using call_rcu() and noticed a 20% impact on TCP connection rates. Quoting Cristopher Lameter : "Right. That results in cacheline cooldown. You'd want to recycle the object as they are cache hot on a per cpu basis. That is screwed up by the delayed regular rcu processing. We have seen multiple regressions due to cacheline cooldown. The only choice in cacheline hot sensitive areas is to deal with the complexity that comes with SLAB_DESTROY_BY_RCU or give up on RCU." - Because udp sockets are allocated from dedicated kmem_cache, use of SLAB_DESTROY_BY_RCU can help here. Theory of operation : --------------------- As the lookup is lockfree (using rcu_read_lock()/rcu_read_unlock()), special attention must be taken by readers and writers. Use of SLAB_DESTROY_BY_RCU is tricky too, because a socket can be freed, reused, inserted in a different chain or in worst case in the same chain while readers could do lookups in the same time. In order to avoid loops, a reader must check each socket found in a chain really belongs to the chain the reader was traversing. If it finds a mismatch, lookup must start again at the begining. This *restart* loop is the reason we had to use rdlock for the multicast case, because we dont want to send same message several times to the same socket. We use RCU only for fast path. Thus, /proc/net/udp still takes spinlocks. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
UDP sockets are hashed in a 128 slots hash table. This hash table is protected by *one* rwlock. This rwlock is readlocked each time an incoming UDP message is handled. This rwlock is writelocked each time a socket must be inserted in hash table (bind time), or deleted from this table (close time) This is not scalable on SMP machines : 1) Even in read mode, lock() and unlock() are atomic operations and must dirty a contended cache line, shared by all cpus. 2) A writer might be starved if many readers are 'in flight'. This can happen on a machine with some NIC receiving many UDP messages. User process can be delayed a long time at socket creation/dismantle time. This patch prepares RCU migration, by introducing 'struct udp_table and struct udp_hslot', and using one spinlock per chain, to reduce contention on central rwlock. Introducing one spinlock per chain reduces latencies, for port randomization on heavily loaded UDP servers. This also speedup bindings to specific ports. udp_lib_unhash() was uninlined, becoming to big. Some cleanups were done to ease review of following patch (RCUification of UDP Unicast lookups) Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Harvey Harrison authored
Open code NIP6_FMT in the one call inside sscanf and one user of NIP6() that could use %p6 in the netfilter code. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Harvey Harrison authored
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Harvey Harrison authored
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Harvey Harrison authored
Replace all uses of IPOIB_GID_FMT, IPOIB_GID_RAW_ARG() and IPOIB_GID_ARG() Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Harvey Harrison authored
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Stephen Hemminger authored
This enables more ethtool information. The speed and settings of the underlying device are propagated up. This makes services like SNMP that use ethtool to get speed setting, work when managing a vlan, without adding silly heurtistics into SNMP daemon. For the driver info, just use existing driver strings. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Lezcano authored
The veth network device is stored in a list in the netdev private. AFAICS, this list is never used so I removed this list from the code. Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Lezcano authored
The veth private structure contains a netdev pointer refering to its peer. This field is never used and it is pointless because if we can access, the veth_priv, that means we already have the netdev which is stored in veth_priv->dev. Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Harvey Harrison authored
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Harvey Harrison authored
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 28 Oct, 2008 14 commits
-
-
Harvey Harrison authored
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Harvey Harrison authored
The iscsi_ibft.c changes are almost certainly a bugfix as the pointer 'ip' is a u8 *, so they never print the last 8 bytes of the IPv6 address, and the eight bytes they do print have a zero byte with them in each 16-bit word. Other than that, this should cause no difference in functionality. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Harvey Harrison authored
The define in kernel.h can be done away with at a later time. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Harvey Harrison authored
Takes a pointer to a IPv6 address and formats it in the usual colon-separated hex format: xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx Each 16 bit word is printed in network-endian byteorder. %#p6 is also supported and will omit the colons. %p6 is a replacement for NIP6_FMT and NIP6() %#p6 is a replacement for NIP6_SEQFMT and NIP6() Note that NIP6() took a struct in6_addr whereas this takes a pointer to a struct in6_addr. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Martin Willi authored
Add new_mapping() implementation to the netlink xfrm_mgr to notify address/port changes detected in UDP encapsulated ESP packets. Signed-off-by: Martin Willi <martin@strongswan.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alexey Dobriyan authored
call_rcu() will unconditionally rewrite RCU head anyway. Applies to struct neigh_parms struct neigh_table struct net struct cipso_v4_doi struct in_ifaddr struct in_device rt->u.dst Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alexey Dobriyan authored
ifdef out * struct sk_buff::sp (pointer) * struct dst_entry::xfrm (pointer) * struct sock::sk_policy (2 pointers) Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Patrick McHardy authored
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric W. Biederman authored
To make testing of the network namespace simpler allow the network namespace code and the sysfs code to be compiled and run at the same time. To do this only virtual devices are allowed in the additional network namespaces and those virtual devices are not placed in the kobject tree. Since virtual devices don't actually do anything interesting hardware wise that needs device management there should be no loss in keeping them out of the kobject tree and by implication sysfs. The gain in ease of testing and code coverage should be significant. Changelog: v2: As pointed out by Benjamin Thery it only makes sense to call device_rename in the initial network namespace for now. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Benjamin Thery <benjamin.thery@bull.net> Tested-by: Serge Hallyn <serue@us.ibm.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Johannes Berg authored
A number of places still use %02x:...:%02x because it's in debug statements or for no real reason. Make a few of them use %pM. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Johannes Berg authored
This converts pretty much everything to print_mac. There were a few things that had conflicts which I have just dropped for now, no harm done. I've built an allyesconfig with this and looked at the files that weren't built very carefully, but it's a huge patch. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Johannes Berg authored
Also remove a few stray DECLARE_MAC_BUF that were no longer used at all. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Harvey Harrison authored
Add format specifiers for printing out six colon-separated bytes: MAC addresses (%pM): xx:xx:xx:xx:xx:xx %#pM is also supported and omits the colon separators. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Neil Horman authored
This is a patch to provide on demand route cache rebuilding. Currently, our route cache is rebulid periodically regardless of need. This introduced unneeded periodic latency. This patch offers a better approach. Using code provided by Eric Dumazet, we compute the standard deviation of the average hash bucket chain length while running rt_check_expire. Should any given chain length grow to larger that average plus 4 standard deviations, we trigger an emergency hash table rebuild for that net namespace. This allows for the common case in which chains are well behaved and do not grow unevenly to not incur any latency at all, while those systems (which may be being maliciously attacked), only rebuild when the attack is detected. This patch take 2 other factors into account: 1) chains with multiple entries that differ by attributes that do not affect the hash value are only counted once, so as not to unduly bias system to rebuilding if features like QOS are heavily used 2) if rebuilding crosses a certain threshold (which is adjustable via the added sysctl in this patch), route caching is disabled entirely for that net namespace, since constant rebuilding is less efficient that no caching at all Tested successfully by me. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 27 Oct, 2008 14 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6Linus Torvalds authored
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6: firewire: fw-sbp2: fix races firewire: fw-sbp2: delay first login to avoid retries firewire: fw-ohci: initialization failure path fixes firewire: fw-ohci: don't leak dma memory on module removal firewire: fix struct fw_node memory leak firewire: Survive more than 256 bus resets
-
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6Linus Torvalds authored
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6: ALSA: ASoC: Blackfin: update SPORT0 port selector (v2) ALSA: hda - Restore default pin configs for realtek codecs sound: use a common working email address pci: use pci_ioremap_bar() in sound/
-
Takashi Iwai authored
Merge branches 'topic/fix/asoc', 'topic/fix/hda', 'topic/fix/misc' and 'topic/pci-ioremap-bar' into for-linus
-
Cliff Cai authored
- Setting the TFS pin selector for SPORT 0 based on whether the selected port id F or G. If the port is F then no conflict should exist for the TFS. When Port G is selected and EMAC then there is a conflict between the PHY interrupt line and TFS. Current settings prevent the conflict by ignoring the TFS pin when Port G is selected. This allows both ssm2602 using Port G and EMAC concurrently. - some code cleanup Signed-off-by: Cliff Cai <cliff.cai@analog.com> Signed-off-by: Bryan Wu <cooloney@kernel.org> Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
-
Takashi Iwai authored
Some machines have broken BIOS resume that doesn't restore the default pin configuration properly, which results in a wrong detection of HP pin. This causes a silent speaker output due to missing HP detection. Related bug: Novell bug#406101 https://bugzilla.novell.com/show_bug.cgi?id=406101 This patch fixes the issue by saving/restoring the default pin configs by the driver itself. Signed-off-by: Takashi Iwai <tiwai@suse.de>
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds authored
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: syncookies: fix inclusion of tcp options in syn-ack libertas: free sk_buff with kfree_skb btsdio: free sk_buff with kfree_skb Phonet: do not reply to indication reset packets Phonet: include generic link-layer header size in MAX_PHONET_HEADER
-
Alan Cox authored
Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Alan Cox authored
Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Eric Miao authored
The leds-da903x LED driver was missing the proper #include of linux/workqueue.h, but happened to compile on ARM due to implied includes through other header files. We do need the explict include on other architectures (reported at least for x86-64). Reported-tested-and-acked-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Eric Miao <eric.miao@marvell.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Alan Cox authored
Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
-
Florian Westphal authored
David Miller noticed that commit 33ad798c '(tcp: options clean up') did not move the req->cookie_ts check. This essentially disabled commit 4dfc2817 '[Syncookies]: Add support for TCP options via timestamps.'. This restores the original logic. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sergio Luis authored
free sk_buff with kfree_skb, instead of kree Signed-off-by: Sergio Luis <sergio@larces.uece.br> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sergio Luis authored
free sk_buff with kfree_skb, instead of kree Signed-off-by: Sergio Luis <sergio@larces.uece.br> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Remi Denis-Courmont authored
This fixes a potential error packet loop. Signed-off-by: Remi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-