1. 30 Nov, 2012 3 commits
    • bonding: delete migrated IP addresses from the rlb hash table · e53665c6
      Jiri Bohac authored
      Bonding in balance-alb mode records information from ARP packets
      passing through the bond in a hash table (rx_hashtbl).
      
      In certain situations (e.g. link change of a slave),
      rlb_update_rx_clients() will send out ARP packets to update ARP
      caches of other hosts on the network to achieve RX load
      balancing.
      
      The problem is that once an IP address is recorded in the hash
      table, it stays there indefinitely. If this IP address is
      migrated to a different host in the network, bonding still sends
      out ARP packets that poison other systems' ARP caches with
      invalid information.
      
      This patch solves this by looking at all incoming ARP packets,
      and checking if the source IP address is one of the source
      addresses stored in the rx_hashtbl. If it is, but the MAC
      addresses differ, the corresponding hash table entries are
      removed. Thus, when an IP address is migrated, the first ARP
      broadcast by its new owner will purge the offending entries of
      rx_hashtbl.
      
      The hash table is hashed by ip_dst. To be able to do the above
      check efficiently (not walking the whole hash table), we need a
      reverse mapping (by ip_src).
      
      I added three new members in struct rlb_client_info:
         rx_hashtbl[x].src_first will point to the start of a list of
            entries for which hash(ip_src) == x.
         The list is linked with src_next and src_prev.
      
      When an incoming ARP packet arrives at rlb_arp_recv(),
      rlb_purge_src_ip() can quickly walk only the entries on the
      corresponding lists, i.e. the entries that are likely to contain
      the offending IP address.
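
The reverse mapping described above can be sketched in user-space C. This is a minimal illustration under stated assumptions, not the kernel code: only the field names src_first, src_next and src_prev come from the commit message, while `struct client`, `hash_ip()`, `tbl_insert()`, `tbl_purge_src()`, the `NIL` marker and the table size are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TBL_SIZE 256
#define NIL 0xffffffffu          /* hypothetical end-of-list marker */

struct client {                  /* simplified stand-in for rlb_client_info */
    uint32_t ip_src, ip_dst;
    uint8_t  mac_src[6];
    int      used;
    uint32_t src_first;          /* head of list of entries with hash(ip_src) == this index */
    uint32_t src_next, src_prev; /* links within that list */
};

static struct client tbl[TBL_SIZE];

/* Toy hash: XOR of the four address bytes (an assumption for this sketch). */
static uint32_t hash_ip(uint32_t ip)
{
    return (ip ^ (ip >> 8) ^ (ip >> 16) ^ (ip >> 24)) & 0xff;
}

static void tbl_init(void)
{
    memset(tbl, 0, sizeof(tbl));
    for (uint32_t i = 0; i < TBL_SIZE; i++)
        tbl[i].src_first = tbl[i].src_next = tbl[i].src_prev = NIL;
}

/* Put entry idx at the front of the list for hash(its ip_src). */
static void src_link(uint32_t idx)
{
    assert(idx < TBL_SIZE);
    uint32_t bucket = hash_ip(tbl[idx].ip_src);
    uint32_t old = tbl[bucket].src_first;

    tbl[idx].src_prev = NIL;
    tbl[idx].src_next = old;
    if (old != NIL)
        tbl[old].src_prev = idx;
    tbl[bucket].src_first = idx;
}

/* Remove entry idx from its ip_src list. */
static void src_unlink(uint32_t idx)
{
    struct client *e = &tbl[idx];

    if (e->src_prev == NIL)
        tbl[hash_ip(e->ip_src)].src_first = e->src_next;
    else
        tbl[e->src_prev].src_next = e->src_next;
    if (e->src_next != NIL)
        tbl[e->src_next].src_prev = e->src_prev;
    e->src_next = e->src_prev = NIL;
}

static void tbl_delete(uint32_t idx)
{
    src_unlink(idx);
    tbl[idx].used = 0;           /* src_first is kept: it belongs to the bucket */
}

/* The table itself is indexed by hash(ip_dst); a colliding entry is replaced. */
static void tbl_insert(uint32_t ip_src, uint32_t ip_dst, const uint8_t mac[6])
{
    uint32_t idx = hash_ip(ip_dst);

    if (tbl[idx].used)
        tbl_delete(idx);
    tbl[idx].ip_src = ip_src;
    tbl[idx].ip_dst = ip_dst;
    memcpy(tbl[idx].mac_src, mac, 6);
    tbl[idx].used = 1;
    src_link(idx);
}

/* Drop entries that record ip_src with a different MAC; returns how many. */
static int tbl_purge_src(uint32_t ip_src, const uint8_t mac[6])
{
    int purged = 0;
    uint32_t idx = tbl[hash_ip(ip_src)].src_first;

    while (idx != NIL) {
        uint32_t next = tbl[idx].src_next;   /* read before the entry is unlinked */

        if (tbl[idx].ip_src == ip_src && memcmp(tbl[idx].mac_src, mac, 6) != 0) {
            tbl_delete(idx);
            purged++;
        }
        idx = next;
    }
    return purged;
}
```

Calling `tbl_purge_src()` with the sender IP and MAC of each received ARP packet touches only the entries whose ip_src hashes to the same bucket, so the cost depends on the length of that one list rather than on the table size.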
      
      To avoid confusion, I renamed these existing fields of struct
      rlb_client_info:
      	next -> used_next
      	prev -> used_prev
      	rx_hashtbl_head -> rx_hashtbl_used_head
      
      (The current linked list is _not_ a list of hash table
      entries with colliding ip_dst. It's a list of entries that are
      being used; its purpose is to avoid walking the whole hash table
      when looking for used entries.)
      Signed-off-by: Jiri Bohac <jbohac@suse.cz>
      Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bonding: rlb mode of bond should not alter ARP originating via bridge · 567b871e
      zheng.li authored
      Do not modify or load balance ARP packets passing through balance-alb
      mode (wherein the ARP did not originate locally, and arrived via a bridge).
      
      Modifying pass-through ARP replies causes an incorrect MAC address
      to be placed into the ARP packet, rendering peers unable to communicate
      with the actual destination from which the ARP reply originated.
      
      Load balancing pass-through ARP requests causes an entry to be
      created for the peer in the rlb table, and bond_alb_monitor will
      occasionally issue ARP updates to all peers in the table instructing them
      as to which MAC address they should communicate with; this occurs when
      some event sets rx_ntt.  In the bridged case, however, the MAC address
      used for the update would be the MAC of the slave, not the actual source
      MAC of the originating destination.  This would render peers unable to
      communicate with the destinations beyond the bridge.
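
One way to picture the decision described above is a check on the ARP sender MAC: balance only when that MAC belongs to one of the bond's own interfaces. The following is a rough user-space sketch of that idea, not the patch itself; `struct slave_sketch`, `bond_owns_mac()` and `should_balance_arp()` are hypothetical names, and the assumption that "originated locally" can be decided from the sender MAC alone is a simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct slave_sketch {            /* hypothetical stand-in for a bond slave */
    uint8_t mac[6];
};

/* True if mac matches the hardware address of any slave in the array. */
static bool bond_owns_mac(const struct slave_sketch *slaves, size_t n,
                          const uint8_t mac[6])
{
    assert(slaves != NULL && mac != NULL);
    for (size_t i = 0; i < n; i++)
        if (memcmp(slaves[i].mac, mac, 6) == 0)
            return true;
    return false;
}

/*
 * An ARP packet whose sender MAC is not one of ours is assumed to have
 * arrived via a bridge: leave it unmodified and do not record the peer.
 */
static bool should_balance_arp(const struct slave_sketch *slaves, size_t n,
                               const uint8_t arp_sender_mac[6])
{
    return bond_owns_mac(slaves, n, arp_sender_mac);
}
```

With this shape, a pass-through ARP from a host behind the bridge yields `false`, so the transmit path would send it untouched and no rlb table entry would be created for it.
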
      Signed-off-by: Zheng Li <zheng.x.li@oracle.com>
      Cc: Jay Vosburgh <fubar@us.ibm.com>
      Cc: Andy Gospodarek <andy@greyhouse.net>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch · e7165030
      David S. Miller authored
      Conflicts:
      	net/ipv6/exthdrs_core.c
      
      Jesse Gross says:
      
      ====================
      This series of improvements for 3.8/net-next contains four components:
       * Support for modifying IPv6 headers
       * Support for matching and setting skb->mark for better integration with
         things like iptables
       * Ability to recognize the EtherType for RARP packets
       * Two small performance enhancements
      
      The movement of ipv6_find_hdr() into exthdrs_core.c causes two small merge
      conflicts.  I left it as is but can do the merge if you want.  The conflicts
      are:
       * ipv6_find_hdr() and ipv6_find_tlv() were both moved to the bottom of
         exthdrs_core.c.  Both should stay.
       * A new use of ipv6_find_hdr() was added to net/netfilter/ipvs/ip_vs_core.c
         after this patch.  The IPVS user has two instances of the old constant
         name IP6T_FH_F_FRAG which has been renamed to IP6_FH_F_FRAG.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 29 Nov, 2012 4 commits
    • core: make GRO methods static. · bb728820
      Rami Rosen authored
      This patch changes three methods to be static and removes their
      EXPORT_SYMBOLs in core/dev.c and their external declaration in
      netdevice.h. The methods, dev_gro_receive(), napi_frags_finish() and
      napi_skb_finish(), which are in the GRO rx path, are not used
      outside core/dev.c.
      Signed-off-by: Rami Rosen <ramirose@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • doc: make the description of how tcp_ecn works more explicit and clear · 7e3a2dc5
      Rick Jones authored
      Make the description of how tcp_ecn works a bit more explicit and clear.
      Signed-off-by: Rick Jones <rick.jones2@hp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • David S. Miller's avatar
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · e9296e89
      Linus Torvalds authored
      Pull networking fixes from David Miller:
       "Some more fixes trickled in over the past few days:
      
         1) PIM device names can overflow the IFNAMSIZ buffer unless we
            properly limit the allowed indexes, fix from Eric Dumazet.
      
         2) Under heavy load we can OOPS in icmp reply processing due to an
            unchecked inet_putpeer() call.  Fix from Neal Cardwell.
      
         3) SCTP round trip calculations need to use 64-bit math to avoid
            overflows, fix from Schoch Christian.
      
         4) Fix a memory leak and an error return flub in SCTP and IRDA
            triggerable by userspace.  Fix from Tommi Rantala and found by the
            syscall fuzzer (trinity).
      
         5) MLX4 driver gives bogus size to memcpy() call, fix from Amir
            Vadai.
      
         6) Fix length calculation in VHOST descriptor translation, from
            Michael S Tsirkin.
      
         7) Ambassador ATM driver loops forever while loading firmware, fix
            from Dan Carpenter.
      
         8) Over MTU packets in openvswitch warn about wrong device, fix from
            Jesse Gross.
      
         9) Netfilter IPSET's netlink code can overrun a string buffer because
            it's not properly limited to IFNAMSIZ.  Fix from Florian Westphal.
      
        10) PCAN USB driver sets wrong timestamp in SKB, from Oliver Hartkopp.
      
        11) Make sure the RX ifindex always has a valid value in the CAN BCM
            driver, even if we haven't received a frame yet.  Fix also from
            Oliver Hartkopp."
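
Items 1 and 9 above share one pattern: a generated or user-supplied name must fit in an IFNAMSIZ-sized buffer. A minimal user-space sketch of a bounded formatter follows. It is only an illustration: the "pimreg%u" format string and the constant 16 are assumptions about the kernel side, `format_pimreg_name()` is a hypothetical helper, and the actual fixes may instead restrict the accepted identifiers up front.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define IFNAMSIZ_SKETCH 16   /* assumed buffer size, including the trailing NUL */

/*
 * Format "pimreg<table_id>" into buf.
 * Returns 0 on success, -1 if the name would not fit.
 */
static int format_pimreg_name(char buf[IFNAMSIZ_SKETCH], unsigned int table_id)
{
    int n;

    assert(buf != NULL);
    /* snprintf never writes past the given size and reports the full length. */
    n = snprintf(buf, IFNAMSIZ_SKETCH, "pimreg%u", table_id);
    if (n < 0 || n >= IFNAMSIZ_SKETCH)
        return -1;           /* output was truncated (or an error occurred) */
    return 0;
}
```

With a 32-bit `unsigned int`, a nine-digit identifier still fits in 15 characters, while a ten-digit one needs 16 characters plus the NUL and is rejected.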
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
        team: fix hw_features setup
        atm: forever loop loading ambassador firmware
        vhost: fix length for cross region descriptor
        irda: irttp: fix memory leak in irttp_open_tsap() error path
        net: qmi_wwan: add Huawei E173
        net/mlx4_en: Can set maxrate only for TC0
        sctp: Error in calculation of RTTvar
        sctp: fix -ENOMEM result with invalid user space pointer in sendto() syscall
        sctp: fix memory leak in sctp_datamsg_from_user() when copy from user space fails
        net: ipmr: limit MRT_TABLE identifiers
        ipv4: avoid passing NULL to inet_putpeer() in icmpv4_xrlim_allow()
        can: bcm: initialize ifindex for timeouts without previous frame reception
        can: peak_usb: fix hwtstamp assignment
        netfilter: ipset: fix netiface set name overflow
        openvswitch: Store flow key len if ARP opcode is not request or reply.
        openvswitch: Print device when warning about over MTU packets.
  3. 28 Nov, 2012 33 commits