- 19 Jul, 2012 40 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
David S. Miller authored
Conflicts: drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
-
Yuchung Cheng authored
In trusted networks, e.g., an intranet or data center, the client does not need a Fast Open cookie to mitigate DoS attacks. In cookie-less mode, sendmsg() with the MSG_FASTOPEN flag will send SYN-data regardless of cookie availability. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Yuchung Cheng authored
On paths with firewalls dropping SYN packets with data or experimental TCP options, Fast Open connections will experience SYN timeouts and bad performance. The solution is to track such incidents in the cookie cache and disable Fast Open temporarily. Since only the original SYN includes data and/or the Fast Open option, the SYN-ACK carries tell-tale signs (see tcp_rcv_fastopen_synack()) that allow such drops to be detected. If a path has recurring Fast Open SYN drops, Fast Open is disabled for 2^(recurring_losses) minutes, starting from four minutes up to roughly one and a half days. sendmsg() with the MSG_FASTOPEN flag will still succeed, but it behaves as connect() followed by write(). Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
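A minimal sketch of the back-off window implied by the description above; the exact cap is an assumption inferred from "roughly one and a half days" (2^11 minutes is about 34 hours), and the function name is illustrative:

	static unsigned int fastopen_blackhole_minutes(unsigned int recurring_losses)
	{
		/* cap the exponent so the window tops out around 2^11 minutes (~1.4 days) */
		unsigned int shift = recurring_losses > 11 ? 11 : recurring_losses;
		unsigned int mins = 1U << shift;	/* 2^(recurring_losses) minutes */

		return mins < 4 ? 4 : mins;		/* never less than four minutes */
	}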
-
Yuchung Cheng authored
sendmsg() (or sendto()) with MSG_FASTOPEN is a combo of connect(2) and write(2). The application should replace connect() with it to send data in the opening SYN packet. For a blocking socket, sendmsg() blocks until all the data are buffered locally and the handshake is completed, like a connect() call. It returns similar errno values to connect() if the TCP handshake fails. For a non-blocking socket, it returns the number of bytes queued (and transmitted in the SYN-data packet) if a cookie is available. If a cookie is not available, it transmits a data-less SYN packet with the Fast Open cookie request option and returns -EINPROGRESS like connect(). Using MSG_FASTOPEN on a connecting or connected socket will result in similar errno values to repeated connect() calls. Therefore the application should only use this flag on new sockets. The buffer size of sendmsg() is independent of the MSS of the connection. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
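A minimal user-space sketch of the calling convention described above (illustrative only; error handling is trimmed, and MSG_FASTOPEN is defined locally in case the libc headers predate it):

	#include <sys/socket.h>
	#include <netinet/in.h>
	#include <arpa/inet.h>
	#include <errno.h>
	#include <unistd.h>

	#ifndef MSG_FASTOPEN
	#define MSG_FASTOPEN 0x20000000	/* value from the kernel headers */
	#endif

	static int fastopen_connect_send(const char *ip, int port,
					 const void *buf, size_t len)
	{
		struct sockaddr_in srv = { .sin_family = AF_INET,
					   .sin_port = htons(port) };
		int fd;
		ssize_t n;

		inet_pton(AF_INET, ip, &srv.sin_addr);
		fd = socket(AF_INET, SOCK_STREAM, 0);
		if (fd < 0)
			return -1;

		/* Replaces connect()+write(): the payload rides in the SYN when a
		 * cookie is cached.  On a blocking socket this returns the bytes
		 * queued or -1 with a connect()-style errno; the EINPROGRESS check
		 * matters for the non-blocking fallback described above. */
		n = sendto(fd, buf, len, MSG_FASTOPEN,
			   (struct sockaddr *)&srv, sizeof(srv));
		if (n < 0 && errno != EINPROGRESS) {
			close(fd);
			return -1;
		}
		return fd;
	}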
-
Yuchung Cheng authored
On receiving the SYN-ACK after SYN-data, the client needs to a) update the cached MSS and cookie (if included in the SYN-ACK) and b) retransmit any data not yet acknowledged by the SYN-ACK in the final ACK of the handshake. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Yuchung Cheng authored
This patch implements sending SYN-data in tcp_connect(). The data comes from tcp_sendmsg() with the MSG_FASTOPEN flag (implemented in a later patch). The length of the cookie in tcp_fastopen_req, initialized to 0, controls the type of the SYN. If the cookie is not cached (len == 0), the host sends a data-less SYN with the Fast Open cookie request option to solicit a cookie from the remote. If a cookie is available (len > 0), the host sends a SYN-data with the Fast Open cookie option. If the cookie length is negative, the SYN will not include any Fast Open option (for fallback operation). To deal with middleboxes that may drop SYNs with data or experimental TCP options, the SYN-data is only sent once. SYN retransmits do not include data or Fast Open options, and the connection falls back to the regular TCP handshake. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
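The cookie-length convention above can be summarized in a small sketch (names are illustrative; this is not the actual tcp_connect() code):

	enum fastopen_syn_type { SYN_COOKIE_REQUEST, SYN_DATA, SYN_PLAIN };

	static enum fastopen_syn_type fastopen_syn_type(int cookie_len)
	{
		if (cookie_len == 0)
			return SYN_COOKIE_REQUEST;	/* no cached cookie: solicit one */
		if (cookie_len > 0)
			return SYN_DATA;		/* cached cookie: carry data in the SYN */
		return SYN_PLAIN;			/* negative: plain SYN, no Fast Open option */
	}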
-
Yuchung Cheng authored
With help from Eric Dumazet, add Fast Open metrics to the tcp metrics cache. The basic ones are the MSS and the cookie. A later patch will cache more to handle unfriendly middleboxes. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Yuchung Cheng authored
This patch implements the common code for both the client and server. 1. TCP Fast Open option processing. Since Fast Open does not have an option number assigned by IANA yet, it shares the experimental option code 254 by implementing draft-ietf-tcpm-experimental-options with a 16-bit magic number 0xF989. This enables global experiments without clashing over the two scarce experimental option codes available for TCP. When the draft status becomes standard (maybe), the client should switch to the newly assigned option number while the server supports both numbers during the transition. 2. The new sysctl tcp_fastopen. 3. A placeholder init function. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
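A sketch of the shared experimental option layout described above (the struct name is illustrative; the cookie is variable length and absent in a pure cookie request):

	#include <stdint.h>

	struct tcp_fastopen_exp_option {
		uint8_t  kind;		/* 254, the shared experimental option code */
		uint8_t  len;		/* 4 plus the number of cookie bytes */
		uint16_t magic;		/* 0xF989, in network byte order on the wire */
		uint8_t  cookie[];	/* Fast Open cookie, if any */
	} __attribute__((packed));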
-
Thadeu Lima de Souza Cascardo authored
In its receive path, the mlx4_en driver maps each page chunk that it pushes to the hardware and unmaps it when pushing it up the stack. This limits throughput to about 3Gbps on a Power7 8-core machine. One solution is to map the entire allocated page at once. However, this requires that we keep track of every page fragment we give to a descriptor. We also need to work with the discipline that all fragments will be released (in the sense that they will not be reused by the driver anymore) in the order they are allocated to the driver. This requires that we don't reuse any fragments; every single one of them must be reallocated. We do that by releasing all the processed fragments and starting the refill only after we have finished processing the descriptors. We must also guarantee that we either refill all fragments in a descriptor or none at all, without giving up a page fragment that has already been handed out. Otherwise, we would break the discipline of only releasing the fragments in the order they were allocated. This has passed page allocation fault injection (restricted to the driver by using required-start and required-end) and device hotplug while 16 TCP streams were able to deliver more than 9Gbps. Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
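A self-contained illustration of the all-or-none refill discipline described above (plain C with malloc standing in for the page-fragment allocator; not the actual mlx4_en code):

	#include <stdlib.h>
	#include <stdbool.h>

	#define FRAGS_PER_DESC 4

	struct rx_desc {
		void *frag[FRAGS_PER_DESC];
	};

	static bool refill_desc(struct rx_desc *d, size_t frag_size)
	{
		void *tmp[FRAGS_PER_DESC];
		int i;

		/* allocate every fragment before committing any of them */
		for (i = 0; i < FRAGS_PER_DESC; i++) {
			tmp[i] = malloc(frag_size);
			if (!tmp[i]) {
				while (--i >= 0)
					free(tmp[i]);	/* nothing was handed out yet */
				return false;		/* leave the descriptor empty */
			}
		}
		/* commit the whole descriptor, preserving release-in-order */
		for (i = 0; i < FRAGS_PER_DESC; i++)
			d->frag[i] = tmp[i];
		return true;
	}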
-
Michal Schmidt authored
Dynamically allocated sysfs attributes must be initialized using sysfs_attr_init(), otherwise lockdep complains: BUG: key <address> not in .data! Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Acked-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
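The pattern the fix applies looks roughly like this (a sketch; the attribute name and show callback are hypothetical):

	struct device_attribute *attr;

	attr = kzalloc(sizeof(*attr), GFP_KERNEL);
	if (!attr)
		return -ENOMEM;
	sysfs_attr_init(&attr->attr);	/* sets up the lockdep key for a dynamically allocated attribute */
	attr->attr.name = "example";	/* hypothetical attribute name */
	attr->attr.mode = 0444;
	attr->show = example_show;	/* hypothetical show callback */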
-
stephen hemminger authored
Update the references to bridge utilities and web pages to their current locations. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Bjørn Mork authored
Commit 9ac32e1b ("firmware: convert e100 driver to request_firmware()") did a straight conversion of the in-driver ucode to external files. This introduced the possibility of the driver failing to enable an interface due to missing ucode. There was no evaluation of the importance of the ucode at the time. Based on comments in earlier versions of this driver, and in the source code for the FreeBSD fxp driver, we can assume that the ucode implements the "CPU Cycle Saver" feature on supported adapters. Although generally wanted, this is an optional feature. The ucode source is not available, preventing it from being included in free distributions. This creates unnecessary problems for the end users. Doing a network install based on a free distribution installer requires the user to download and insert the ucode into the installer. Making the ucode optional when possible improves the user experience and driver usability. The ucode for some adapters includes a bugfix, making it essential. We continue to fail for these adapters unless the ucode is available. Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
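A hedged sketch of the resulting behaviour (the ucode_required() helper is hypothetical; request_firmware() is the existing mechanism the driver already uses):

	err = request_firmware(&fw, fw_name, &nic->pdev->dev);
	if (err) {
		if (ucode_required(nic))	/* hypothetical: adapters whose ucode carries a bugfix */
			return err;		/* keep failing as before */
		netif_err(nic, probe, nic->netdev,
			  "CPU cycle saver ucode missing, continuing without it\n");
		return 0;			/* optional feature: carry on without ucode */
	}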
-
Christian Riesch authored
Since commit 16626b0c the asix driver depends on phylib. Select phylib when the asix driver is selected. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Cc: kernel-janitors@vger.kernel.org Signed-off-by: Christian Riesch <christian.riesch@omicron.at> Tested-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Christian Riesch authored
This patch adds the asix_set_eeprom() function to provide support for programming the configuration EEPROM via ethtool. Signed-off-by: Christian Riesch <christian.riesch@omicron.at> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Christian Riesch authored
The current code for reading the EEPROM via ethtool in the asix driver has a few issues. It cannot handle odd length values (accesses must be aligned to 16-bit boundaries) and it interprets the offset provided by ethtool as a 16-bit word offset instead of a byte offset. The new code for asix_get_eeprom() introduced by this patch is modeled after the code in drivers/net/ethernet/atheros/atl1e/atl1e_ethtool.c and provides read access to the entire EEPROM with arbitrary offsets and lengths. Signed-off-by: Christian Riesch <christian.riesch@omicron.at> Signed-off-by: David S. Miller <davem@davemloft.net>
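A sketch of the word-aligned access pattern described above (read_eeprom_word() is a hypothetical helper standing in for the device access; eeprom and data are the usual ethtool get_eeprom arguments):

	int first_word = eeprom->offset >> 1;
	int last_word = (eeprom->offset + eeprom->len - 1) >> 1;
	__le16 *buf;
	int i;

	buf = kmalloc((last_word - first_word + 1) * sizeof(*buf), GFP_KERNEL);
	if (!buf)
		return -ENOMEM;

	for (i = first_word; i <= last_word; i++)
		buf[i - first_word] = read_eeprom_word(dev, i);

	/* copy out only the requested bytes, honouring an odd starting offset */
	memcpy(data, (u8 *)buf + (eeprom->offset & 1), eeprom->len);
	kfree(buf);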
-
Dinh Nguyen authored
Because there are multiple variants of the stmmac/dwmac driver, the dts bindings should be updated to include the version of the IP used. Signed-off-by: Dinh Nguyen <dinguyen@altera.com> Acked-by: Stefan Roese <sr@denx.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
brenohl@br.ibm.com authored
The cxgb3 interface has bad performance when a VLAN is set. On my current setup, a PowerLinux 7R2, I am able to get around 7 Gbps on a TCP_STREAM test (8 instances, 4k messages). With this patch, I am able to reach 9.5 Gbps. Signed-off-by: Breno Leitao <brenohl@br.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
stephen hemminger authored
The Ethernet II wrapper is only used by the IPX protocol; it may once have been used by Appletalk, but is not currently. Therefore it makes sense to move it to the IPX dust bin and drop the exports. Build tested only. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
include/net/dst_ops.h:28:20: warning: ‘struct sock’ declared inside parameter list Signed-off-by: David S. Miller <davem@davemloft.net>
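The usual fix for this class of warning is a forward declaration ahead of the prototypes that take the pointer; a sketch (the prototype below is illustrative, not the actual dst_ops.h hunk):

	struct sock;	/* forward declaration keeps the type in file scope */

	void example_op(struct sock *sk, unsigned int mtu);	/* illustrative prototype */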
-
Eric Dumazet authored
tcp_v4_send_reset() and tcp_v4_send_ack() use a single socket per network namespace. This leads to bad behavior on multiqueue NICs, because many cpus contend for the socket lock and, once the socket lock is acquired, extra false sharing on various socket fields slows down the operations. To better resist attacks, we use a percpu socket. Each cpu can run without contention, using appropriate memory (local node). Additional features: 1) We also mirror the queue_mapping of the incoming skb, so that answers use the same queue if possible. 2) Setting the SOCK_USE_WRITE_QUEUE socket flag speeds up sock_wfree(). 3) We now limit the number of in-flight RST/ACK [1] packets per cpu, instead of per namespace, and we honor the sysctl_wmem_default limit dynamically. (Prior to this patch, the sysctl_wmem_default value was copied at boot time, so any further change would not affect the tcp_sock limit.) [1] These packets are only generated when no socket was matched for the incoming packet. Reported-by: Bill Sommerfeld <wsommerfeld@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
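A sketch of the per-cpu control-socket idea (the variable name is illustrative):

	static DEFINE_PER_CPU(struct sock *, ipv4_tcp_ctl_sk);

	/* in the RST/ACK send path: each cpu uses its own socket, so there is
	 * no lock contention and no cross-cpu false sharing on socket fields */
	struct sock *ctl_sk = __this_cpu_read(ipv4_tcp_ctl_sk);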
-
Julian Anastasov authored
Use a global seqlock for the nh_exceptions. Call fnhe_oldest() with the right hash chain. Correct the diff value for dst_set_expires(). v2: after suggestions from Eric Dumazet: * get rid of the fnhe_lock spinlock, rearrange update_or_create_fnhe * continue the daddr search in rt_bind_exception v3: * remove the daddr check before the seqlock in rt_bind_exception * restart the lookup in rt_bind_exception on a detected seqlock change, as suggested by David Miller Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
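The read/retry pattern behind the restart-on-change behaviour looks roughly like this (a generic seqlock sketch, not the exact route code):

	static DEFINE_SEQLOCK(fnhe_seqlock);

	unsigned int seq;

	do {
		seq = read_seqbegin(&fnhe_seqlock);
		/* walk the hash chain looking for an exception matching daddr */
	} while (read_seqretry(&fnhe_seqlock, seq));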
-
David S. Miller authored
Reported-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Amir Vadai authored
Use the RFS infrastructure and flow steering in HW to keep the CPU affinity of rx interrupts and the application per TCP stream. A flow steering filter is added to the HW whenever the RFS ndo callback is invoked by the core networking code. Because the invocation takes place in interrupt context, the actual setup of the HW is done from a workqueue. Whenever a new filter is added, the driver checks for expiry of existing filters. Since there is a window in time between the point where the core RFS code invokes the ndo callback and the point where the HW is configured from workqueue context, the 2nd, 3rd, etc. packets from that stream will cause the net core to invoke the callback again and again. To prevent inefficient/double configuration of the HW, the filters are kept in a database which is indexed using a hash function to enable fast access. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
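A sketch of the deferral pattern described above (lookup_or_create_filter(), priv->filter_wq and the filter struct are illustrative; the callback signature is the core ndo_rx_flow_steer one):

	static int en_rx_flow_steer(struct net_device *dev, const struct sk_buff *skb,
				    u16 rxq_index, u32 flow_id)
	{
		struct mlx4_en_priv *priv = netdev_priv(dev);
		struct filter *f;

		/* hash-indexed database: a cheap lookup prevents configuring the
		 * same flow again for the 2nd, 3rd, ... packets */
		f = lookup_or_create_filter(priv, skb, rxq_index, flow_id);
		if (IS_ERR(f))
			return PTR_ERR(f);

		/* we are in interrupt context: program the HW from a workqueue */
		queue_work(priv->filter_wq, &f->work);
		return f->id;
	}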
-
Amir Vadai authored
Enable callers of mlx4_assign_eq() to supply a pointer to a cpu_rmap. If supplied, the assigned IRQ is tracked using the rmap infrastructure. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Amir Vadai authored
Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Acked-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Amir Vadai authored
Define this macro in one common place instead of duplicating it across the code. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Julian Anastasov authored
ip_options_compile() can be called for forwarded packets; make sure the specific-destination address is a local one, as specified in RFC 1812, 4.2.2.2 Addresses in Options. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Julian Anastasov authored
Move fib_compute_spec_dst() to the only place where it is needed. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://neil.brown.name/md
Linus Torvalds authored
Pull three md bugfixes from NeilBrown: "One of the bugs was introduced in 3.5-rc1. Others have been there for longer." * tag 'md-3.5-fixes' of git://neil.brown.name/md: md/raid1: close some possible races on write errors during resync md: avoid crash when stopping md array races with closing other open fds. md: fix bug in handling of new_data_offset
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds authored
Pull networking changes from David Miller: "Ok, we should be good to go now" 1) We have to statically initialize the init_net device list head rather than do so in an initcall, otherwise netprio_cgroup crashes if it's built statically rather than modular (Mark D. Rustad) 2) Fix SKB null oopser in CIPSO ipv4 option processing (Paul Moore) 3) Qlogic maintainers update (Anirban Chakraborty) * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: net: Statically initialize init_net.dev_base_head MAINTAINERS: Changes in qlcnic and qlge maintainers list cipso: don't follow a NULL pointer when setsockopt() is called
-
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
Linus Torvalds authored
Pull HID update from Jiri Kosina: "A final round of changes for HID for 3.5: just device ID additions." * 'upstream-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: HID: hid-multitouch: add support for Zytronic panels HID: add Sennheiser BTD500USB device support HID: add battery quirk for Apple Wireless ANSI
-
Ezequiel Garcia authored
The strcpy was being used to set the name of the board. Since the destination char * was read-only and the name is set statically at compile time, this was both wrong and redundant. The type is changed from char * to const char * to prevent future errors. Reported-by: Radek Masin <radek@masin.eu> Signed-off-by: Ezequiel Garcia <elezegarcia@gmail.com> [ Taking directly due to vacations - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Benjamin Tissoires authored
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@enac.fr> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
-
NeilBrown authored
Commit 4367af55 ("md/raid1: clear bad-block record when write succeeds") added a possible 'reschedule_retry' call at the end of end_sync_write, but didn't add matching code at the end of sync_request_write. So if the writes complete very quickly, or scheduling makes it seem that way, then we can miss rescheduling the request and the resync could hang. Also, commit 73d5c38a ("md: avoid races when stopping resync") fixed a race condition in this same code in end_sync_write but didn't make the change in sync_request_write. This patch updates sync_request_write to fix both of those. Patch is suitable for 3.1 and later kernels. Reported-by: Alexander Lyakas <alex.bolshoy@gmail.com> Original-version-by: Alexander Lyakas <alex.bolshoy@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: NeilBrown <neilb@suse.de>
-
NeilBrown authored
md will refuse to stop an array if any other fd (or mounted fs) is using it. When any fs is unmounted or when the last open fd is closed, all pending IO will be flushed (e.g. the sync_blockdev call in __blkdev_put), so there will be no pending IO to worry about when the array is stopped. However, in order to send the STOP_ARRAY ioctl to stop the array, one must first get an open fd on the block device. If some fd is being used to write to the block device and it is closed after mdadm opens the block device, but before mdadm issues the STOP_ARRAY ioctl, then there will be no last-close on the md device, so __blkdev_put will not call sync_blockdev. If this happens, then IO can still be in flight while md tears down the array and bad things can happen (use-after-free and subsequent havoc). So in the case where do_md_stop is being called from an open file descriptor, call sync_blockdev after taking the mutex to ensure there will be no new openers. This is needed when setting a read-write device to read-only too. Cc: stable@vger.kernel.org Reported-by: majianpeng <majianpeng@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>
-
NeilBrown authored
Commit c6563a8c ("md: add possibility to change data-offset for devices") introduced a 'new_data_offset' attribute which should normally be the same as 'data_offset', but can be explicitly set to a different value to allow a reshape operation to move the data. Unfortunately, when 'data_offset' is explicitly set through sysfs, new_data_offset is not also set, so the two incorrectly become out of sync. One result of this is that trying to set the 'size' after the 'data_offset' would fail, because it is not permitted to set the size when 'data_offset' and 'new_data_offset' are different - as that can be confusing. Consequently, when mdadm tried to do this while assembling an IMSM array it would fail. This bug was introduced in 3.5-rc1. Reported-by: Brian Downing <bdowning@lavos.net> Bisected-by: Brian Downing <bdowning@lavos.net> Tested-by: Brian Downing <bdowning@lavos.net> Signed-off-by: NeilBrown <neilb@suse.de>
-
git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Linus Torvalds authored
Pull target fixes from Nicholas Bellinger: "This includes a bugfix from MDR to address a NULL pointer OOPs with FCoE aborts, along with a WRITE_SAME emulation bugfix for NOLB=0 cases, and persistent reservation return cleanups from Roland. All three patches are CC'ed to stable." * git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: target: Fix range calculation in WRITE SAME emulation when num blocks == 0 target: Clean up returning errors in PR handling code tcm_fc: Fix crash seen with aborts and large reads
-
Olaf Hering authored
The referenced html file does not exist anymore. Replace the URL with the current project homepage. Signed-off-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Yoichi Yuasa authored
Commit 37778088 ("bug.h: need linux/kernel.h for TAINT_WARN.") broke all MIPS builds:

	  CC      arch/mips/kernel/machine_kexec.o
	include/linux/log2.h: In function '__ilog2_u32':
	include/linux/log2.h:34:2: error: implicit declaration of function 'fls' [-Werror=implicit-function-declaration]
	include/linux/log2.h: In function '__ilog2_u64':
	include/linux/log2.h:42:2: error: implicit declaration of function 'fls64' [-Werror=implicit-function-declaration]
	...

Signed-off-by: Yoichi Yuasa <yuasa@linux-mips.org> Tested-by: John Crispin <blogic@openwrt.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Daney <ddaney@caviumnetworks.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Linus Torvalds authored
Commit a7a20d10 ("sd: limit the scope of the async probe domain") made the SCSI device probing run device discovery in its own async domain. However, as a result, the partition detection was no longer synchronized by async_synchronize_full() (which, despite the name, only synchronizes the global async space, not all of them). Which in turn meant that "wait_for_device_probe()" would not wait for the SCSI partitions to be parsed. And "wait_for_device_probe()" was what the boot time init code relied on for mounting the root filesystem. Now, most people never noticed this, because not only is it timing-dependent, but modern distributions all use initrd. So the root filesystem isn't actually on a disk at all. And then before they actually mount the final disk filesystem, they will have loaded the scsi-wait-scan module, which not only does the expected wait_for_device_probe(), but also does scsi_complete_async_scans(). [ Side note: scsi_complete_async_scans() had also been partially broken, but that was fixed in commit 43a8d39d ("fix async probe regression"), so that same commit a7a20d10 had actually broken setups even if you used scsi-wait-scan explicitly ] Solve this problem by just moving the scsi_complete_async_scans() call into wait_for_device_probe(). Everybody who wants to wait for device probing to finish really wants the SCSI probing to complete, so there's no reason not to do this. So now "wait_for_device_probe()" really does what the name implies, and properly waits for device probing to finish. This also removes the now unnecessary extra calls to scsi_complete_async_scans(). Reported-and-tested-by: Artem S. Tashkinov <t.artem@mailcity.com> Cc: Dan Williams <dan.j.williams@gmail.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: James Bottomley <jbottomley@parallels.com> Cc: Borislav Petkov <bp@amd64.org> Cc: linux-scsi <linux-scsi@vger.kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
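A hedged sketch of the end result (simplified; the internals of the real wait_for_device_probe() are elided):

	void wait_for_device_probe(void)
	{
		/* wait for pending driver probes to finish (elided) ... */
		async_synchronize_full();
		/* ... and now also wait for async SCSI scans, so callers see
		 * the partitions they expect */
		scsi_complete_async_scans();
	}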
-