- 01 May, 2012 8 commits
-
-
Eric Dumazet authored
TCP coalesce can check if skb to be merged has its skb->head mapped to a page fragment, instead of a kmalloc() area. We had to disable coalescing in this case, for performance reasons. We 'upgrade' skb->head as a fragment in itself. This reduces number of cache misses when user makes its copies, since a less sk_buff are fetched. This makes receive and ofo queues shorter and thus reduce cache line misses in TCP stack. This is a followup of patch "net: allow skb->head to be a page fragment" Tested with tg3 nic, with GRO on or off. We can see "TCPRcvCoalesce" counter being incremented. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Maciej Żenczykowski <maze@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Cc: Matt Carlson <mcarlson@broadcom.com> Cc: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
GRO can check if skb to be merged has its skb->head mapped to a page fragment, instead of a kmalloc() area. We 'upgrade' skb->head as a fragment in itself This avoids the frag_list fallback, and permits to build true GRO skb (one sk_buff and up to 16 fragments), using less memory. This reduces number of cache misses when user makes its copy, since a single sk_buff is fetched. This is a followup of patch "net: allow skb->head to be a page fragment" Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Maciej Żenczykowski <maze@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Cc: Matt Carlson <mcarlson@broadcom.com> Cc: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
This patch converts tg3 driver, one of our reference drivers, to use new build_skb() api in frag mode. Instead of using kmalloc() to allocate the memory block that will be used by build_skb() as skb->head, we use a page fragment. This is a followup of patch "net: allow skb->head to be a page fragment" This allows GRO, TCP coalescing, and splice() to be more efficient. Incidentally, this also removes SLUB slow path contention in kfree() Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Maciej Żenczykowski <maze@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Cc: Matt Carlson <mcarlson@broadcom.com> Cc: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
skb->head is currently allocated from kmalloc(). This is convenient but has the drawback the data cannot be converted to a page fragment if needed. We have three spots were it hurts : 1) GRO aggregation When a linear skb must be appended to another skb, GRO uses the frag_list fallback, very inefficient since we keep all struct sk_buff around. So drivers enabling GRO but delivering linear skbs to network stack aren't enabling full GRO power. 2) splice(socket -> pipe). We must copy the linear part to a page fragment. This kind of defeats splice() purpose (zero copy claim) 3) TCP coalescing. Recently introduced, this permits to group several contiguous segments into a single skb. This shortens queue lengths and save kernel memory, and greatly reduce probabilities of TCP collapses. This coalescing doesnt work on linear skbs (or we would need to copy data, this would be too slow) Given all these issues, the following patch introduces the possibility of having skb->head be a fragment in itself. We use a new skb flag, skb->head_frag to carry this information. build_skb() is changed to accept a frag_size argument. Drivers willing to provide a page fragment instead of kmalloc() data will set a non zero value, set to the fragment size. Then, on situations we need to convert the skb head to a frag in itself, we can check if skb->head_frag is set and avoid the copies or various fallbacks we have. This means drivers currently using frags could be updated to avoid the current skb->head allocation and reduce their memory footprint (aka skb truesize). (thats 512 or 1024 bytes saved per skb). This also makes bpf/netfilter faster since the 'first frag' will be part of skb linear part, no need to copy data. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Maciej Żenczykowski <maze@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Cc: Matt Carlson <mcarlson@broadcom.com> Cc: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Willem de Bruijn authored
Insert an skb_tx_timestamp call in both ndo_start_xmit routines Tested to work for the nv_start_xmit_optimized case Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Willem de Bruijn authored
Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Eilon Greenstein <eilong@broadcom.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Willem de Bruijn authored
Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Willem de Bruijn authored
Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 30 Apr, 2012 2 commits
-
-
Herbert Xu authored
Unfortunately it seems that I didn't properly test the case of an expired external querier in the recent multicast bridge series. The setup of the timer in that case is completely broken and leads to a NULL-pointer dereference. This patch fixes it. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 29 Apr, 2012 10 commits
-
-
Benjamin LaHaise authored
Now that encap_rcv() works on IPv6 UDP sockets, wire L2TP up to IPv6. Support has been tested with and without hardware offloading. This version fixes the L2TP over localhost issue with incorrect checksums being reported. Signed-off-by: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Benjamin LaHaise authored
Now that the sematics of udpv6_queue_rcv_skb() match IPv4's udp_queue_rcv_skb(), introduce the UDP encap_rcv() hook for IPv6. Signed-off-by: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Benjamin LaHaise authored
In order to make sure that when the encap_rcv() hook is introduced it is not called with the socket lock held, move socket locking from callers into udpv6_queue_rcv_skb(), matching what happens in IPv4. Signed-off-by: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Benjamin LaHaise authored
This is the first step in reworking the IPv6 UDP code to be structured more like the IPv4 UDP code. This patch creates __udpv6_queue_rcv_skb() with the equivalent sematics to __udp_queue_rcv_skb(), and wires it up to the backlog_rcv method. Signed-off-by: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
RongQing.Li authored
If I understand correct, NETIF_F_IP_CSUM only means the hardware will compute the TCP/UDP checksum, IP checksum is always computed in software So as a workround of hardware unable to compute small packages checksum, do not need to compute IP header checksum. Signed-off-by: RongQing.Li <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
RongQing.Li authored
PCH_GBE_ETH_ALEN is equal to ETH_ALEN, so we can replace it with ETH_ALEN. If they are not equal, it must be a bug, since this is ethernet, and the address has been already stored to mc_addr_list as ETH_ALEN bytes when call pch_gbe_mac_mc_addr_list_update. Signed-off-by: RongQing.Li <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nicolas Ferre authored
Use the gpio_to_irq() function to retrieve the phy IRQ line from the GPIO pin specification. This fix is needed now that we have moved to irqdomains on AT91. Reported-by: Jamie Iles <jamie@jamieiles.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com> Cc: Andrew Victor <avictor.za@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andrew Victor authored
The AT91RM9200 Ethernet controller still has a fixed IO mapping. So: * Remove the fixed IO mapping and AT91_VA_BASE_EMAC definition. * Pass the physical base-address via platform-resources to the driver. * Convert at91_ether.c driver to perform an ioremap(). * Ethernet PHY detection needs to be performed during the driver initialization process, it can no longer be done first. Signed-off-by: Andrew Victor <linux@maxim.org.za> Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jeffrin Jose authored
Fixed a coding style issue relating to spaces in net/core/sock.c Signed-off-by: Jeffrin Jose <ahiliation@yahoo.co.in> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 27 Apr, 2012 19 commits
-
-
Jacob Keller authored
This patch consolidates the case logic for checking whether a device supports WoL into a single place. Previously ethtool and probe used similar logic that was copied and maintained separately. This patch encapsulates the core logic into a function so that a user only has to update one place. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Matthew Vick authored
During igb_reset(), we initiate a hardware reset which will clear our flow control settings. For auto-negotiation, we re-negotiate them when linking up again, but we need to force them off properly for the forced speed case. Signed-off-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Bruce Allan authored
Previously, a workaround was added to address a hardware bug in the PCIm2PCI arbiter where a write by the driver of the Transmit/Receive Descriptor Tail register could happen concurrently with a write of any MAC CSR register by the Manageability Engine (ME) which could cause the Tail register to have an incorrect value. The arbiter is supposed to prevent the concurrent writes but there is a bug that can cause the Host (driver) access to be acknowledged later than it should. After further investigation, it was discovered that a driver write access of any MAC CSR register after being idle for some time can be lost when ME is accessing a MAC CSR register. When this happens, no further target access is claimed by the MAC which could hang the system. The workaround to check bit 24 in the FWSM register (set only when ME is accessing a MAC CSR register) and delay for a limited amount of time until it is cleared is now done for all driver writes of MAC CSR registers on 82579 with ME enabled. In the rare case when the driver is writing the Tail register and ME is accessing any MAC CSR register for a duration longer than the maximum delay, write the register and verify it has the correct value before continuing, otherwise reset the device. This patch also moves some pre-existing macros from the hardware-specific header file to the more appropriate generic driver header file. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Bruce Allan authored
In K1 mode (a MAC/PHY interconnect power mode), the 82579 device shuts down the Phase Lock Loop (PLL) of the interconnect to save power. When the PLL starts working, the 82579 device may start to transfer the packet through the interconnect before it is fully functional causing packet drops. This workaround disables shutting down the PLL in K1 mode for 1G link speed. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Matthew Vick authored
Performance testing has shown that enabling DMA burst on 82574 improves performance on small packets, so enable it by default. Signed-off-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Matthew Vick authored
80003ES2LAN has an errata such that far-end loopback may be activated by bit errors producing a reserved symbol. In order to disable far-end loopback quickly enough, disable it immediately following a reset. Signed-off-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Eric Dumazet authored
Use min_t()/max_t() macros, reformat two comments, use !!test_bit() to match !!sock_flag() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Shan Wei authored
All parameter descriptions in /proc/sys/net/core/* now is separated two places. So, merge them into Documentation/sysctl/net.txt. Signed-off-by: Shan Wei <davidshan@tencent.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ajit Khaparde authored
Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ajit Khaparde authored
logical speed returned by link_status_query needs to be multiplied by 10. Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ajit Khaparde authored
Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Linus Walleij authored
This has been used by me and others for ages, let's stop calling it experimental. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Rick Jones authored
As none of the callers of bond_update_speed_duplex (need to) check its return value, there is little point in it returning anything. Signed-off-by: Rick Jones <rick.jones2@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Manish Chopra authored
o 0x3, 0x7, 0xF, 0x1F, 0x3F, 0x7F and 0xFF are the allowed capture masks. o Updated driver version to 5.0.28 Signed-off-by: Manish chopra <manish.chopra@qlogic.com> Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jitendra Kalsaria authored
Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com> Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sucheta Chakraborty authored
o Without failing probe, register netdevice when device is in FAILED state. o Device will come up with minimum functionality. Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com> Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
hartleys authored
Include the header to pickup the definitions of the global symbols. Quiets the following sparse warnings: warning: symbol 'crush_find_rule' was not declared. Should it be static? warning: symbol 'crush_do_rule' was not declared. Should it be static? Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Cc: Sage Weil <sage@newdream.net> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
hartleys authored
Remove the custom DIVA_{INIT,EXIT}_FUNCTION defines and use the standard __init,__exit markup. Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Cc: Armin Schindler <mac@melware.de> Cc: Karsten Keil <isdn@linux-pingi.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Quoting Tore Anderson from : https://bugzilla.kernel.org/show_bug.cgi?id=42572 When RTAX_FEATURE_ALLFRAG is set on a route, the effective TCP segment size does not take into account the size of the IPv6 Fragmentation header that needs to be included in outbound packets, causing every transmitted TCP segment to be fragmented across two IPv6 packets, the latter of which will only contain 8 bytes of actual payload. RTAX_FEATURE_ALLFRAG is typically set on a route in response to receving a ICMPv6 Packet Too Big message indicating a Path MTU of less than 1280 bytes. 1280 bytes is the minimum IPv6 MTU, however ICMPv6 PTBs with MTU < 1280 are still valid, in particular when an IPv6 packet is sent to an IPv4 destination through a stateless translator. Any ICMPv4 Need To Fragment packets originated from the IPv4 part of the path will be translated to ICMPv6 PTB which may then indicate an MTU of less than 1280. The Linux kernel refuses to reduce the effective MTU to anything below 1280 bytes, instead it sets it to exactly 1280 bytes, and RTAX_FEATURE_ALLFRAG is also set. However, the TCP segment size appears to be set to 1240 bytes (1280 Path MTU - 40 bytes of IPv6 header), instead of 1232 (additionally taking into account the 8 bytes required by the IPv6 Fragmentation extension header). This in turn results in rather inefficient transmission, as every transmitted TCP segment now is split in two fragments containing 1232+8 bytes of payload. After this patch, all the outgoing packets that includes a Fragmentation header all are "atomic" or "non-fragmented" fragments, i.e., they both have Offset=0 and More Fragments=0. With help from David S. Miller Reported-by: Tore Anderson <tore@fud.no> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Maciej Żenczykowski <maze@google.com> Cc: Tom Herbert <therbert@google.com> Tested-by: Tore Anderson <tore@fud.no> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 26 Apr, 2012 1 commit
-