1. 28 Jan, 2015 35 commits
  2. 21 Jan, 2015 5 commits
    • David Vrabel's avatar
      xen-netfront: use correct linear area after linearizing an skb · db76bea3
      David Vrabel authored
      commit 11d3d2a1 upstream.
      
      Commit 97a6d1bb (xen-netfront: Fix
      handling packets on compound pages with skb_linearize) attempted to
      fix a problem where an skb that would have required too many slots
      would be dropped causing TCP connections to stall.
      
      However, it filled in the first slot using the original buffer and not
      the new one and would use the wrong offset and grant access to the
      wrong page.
      
      Netback would notice the malformed request and stop all traffic on the
      VIF, reporting:
      
          vif vif-3-0 vif3.0: txreq.offset: 85e, size: 4002, end: 6144
          vif vif-3-0 vif3.0: fatal error; disabling device
      Reported-by: default avatarAnthony Wright <anthony@overnetdata.com>
      Tested-by: default avatarAnthony Wright <anthony@overnetdata.com>
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      BugLink: http://bugs.launchpad.net/bugs/1317811Signed-off-by: default avatarStefan Bader <stefan.bader@canonical.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      db76bea3
    • Zoltan Kiss's avatar
      xen-netfront: Fix handling packets on compound pages with skb_linearize · b35b8e78
      Zoltan Kiss authored
      commit 97a6d1bb upstream.
      
      There is a long known problem with the netfront/netback interface: if the guest
      tries to send a packet which constitues more than MAX_SKB_FRAGS + 1 ring slots,
      it gets dropped. The reason is that netback maps these slots to a frag in the
      frags array, which is limited by size. Having so many slots can occur since
      compound pages were introduced, as the ring protocol slice them up into
      individual (non-compound) page aligned slots. The theoretical worst case
      scenario looks like this (note, skbs are limited to 64 Kb here):
      linear buffer: at most PAGE_SIZE - 17 * 2 bytes, overlapping page boundary,
      using 2 slots
      first 15 frags: 1 + PAGE_SIZE + 1 bytes long, first and last bytes are at the
      end and the beginning of a page, therefore they use 3 * 15 = 45 slots
      last 2 frags: 1 + 1 bytes, overlapping page boundary, 2 * 2 = 4 slots
      Although I don't think this 51 slots skb can really happen, we need a solution
      which can deal with every scenario. In real life there is only a few slots
      overdue, but usually it causes the TCP stream to be blocked, as the retry will
      most likely have the same buffer layout.
      This patch solves this problem by linearizing the packet. This is not the
      fastest way, and it can fail much easier as it tries to allocate a big linear
      area for the whole packet, but probably easier by an order of magnitude than
      anything else. Probably this code path is not touched very frequently anyway.
      Signed-off-by: default avatarZoltan Kiss <zoltan.kiss@citrix.com>
      Cc: Wei Liu <wei.liu2@citrix.com>
      Cc: Ian Campbell <Ian.Campbell@citrix.com>
      Cc: Paul Durrant <paul.durrant@citrix.com>
      Cc: netdev@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Cc: xen-devel@lists.xenproject.org
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      BugLink: http://bugs.launchpad.net/bugs/1317811Signed-off-by: default avatarStefan Bader <stefan.bader@canonical.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      b35b8e78
    • Florian Westphal's avatar
      netfilter: conntrack: disable generic tracking for known protocols · 25bde5dd
      Florian Westphal authored
      commit db29a950 upstream.
      
      Given following iptables ruleset:
      
      -P FORWARD DROP
      -A FORWARD -m sctp --dport 9 -j ACCEPT
      -A FORWARD -p tcp --dport 80 -j ACCEPT
      -A FORWARD -p tcp -m conntrack -m state ESTABLISHED,RELATED -j ACCEPT
      
      One would assume that this allows SCTP on port 9 and TCP on port 80.
      Unfortunately, if the SCTP conntrack module is not loaded, this allows
      *all* SCTP communication, to pass though, i.e. -p sctp -j ACCEPT,
      which we think is a security issue.
      
      This is because on the first SCTP packet on port 9, we create a dummy
      "generic l4" conntrack entry without any port information (since
      conntrack doesn't know how to extract this information).
      
      All subsequent packets that are unknown will then be in established
      state since they will fallback to proto_generic and will match the
      'generic' entry.
      
      Our originally proposed version [1] completely disabled generic protocol
      tracking, but Jozsef suggests to not track protocols for which a more
      suitable helper is available, hence we now mitigate the issue for in
      tree known ct protocol helpers only, so that at least NAT and direction
      information will still be preserved for others.
      
       [1] http://www.spinics.net/lists/netfilter-devel/msg33430.html
      
      Joint work with Daniel Borkmann.
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarDaniel Borkmann <dborkman@redhat.com>
      Acked-by: default avatarJozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      25bde5dd
    • Cong Wang's avatar
      macvlan: unregister net device when netdev_upper_dev_link() fails · 1bfbf30d
      Cong Wang authored
      commit da37705c upstream.
      
      rtnl_newlink() doesn't unregister it for us on failure.
      
      Cc: Patrick McHardy <kaber@trash.net>
      Cc: David S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarCong Wang <cwang@twopensource.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Cc: Zefan Li <lizefan@huawei.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      1bfbf30d
    • Jay Vosburgh's avatar
      net/core: Handle csum for CHECKSUM_COMPLETE VXLAN forwarding · 4fdd841d
      Jay Vosburgh authored
      [ Upstream commit 2c26d34b ]
      
      When using VXLAN tunnels and a sky2 device, I have experienced
      checksum failures of the following type:
      
      [ 4297.761899] eth0: hw csum failure
      [...]
      [ 4297.765223] Call Trace:
      [ 4297.765224]  <IRQ>  [<ffffffff8172f026>] dump_stack+0x46/0x58
      [ 4297.765235]  [<ffffffff8162ba52>] netdev_rx_csum_fault+0x42/0x50
      [ 4297.765238]  [<ffffffff8161c1a0>] ? skb_push+0x40/0x40
      [ 4297.765240]  [<ffffffff8162325c>] __skb_checksum_complete+0xbc/0xd0
      [ 4297.765243]  [<ffffffff8168c602>] tcp_v4_rcv+0x2e2/0x950
      [ 4297.765246]  [<ffffffff81666ca0>] ? ip_rcv_finish+0x360/0x360
      
      	These are reliably reproduced in a network topology of:
      
      container:eth0 == host(OVS VXLAN on VLAN) == bond0 == eth0 (sky2) -> switch
      
      	When VXLAN encapsulated traffic is received from a similarly
      configured peer, the above warning is generated in the receive
      processing of the encapsulated packet.  Note that the warning is
      associated with the container eth0.
      
              The skbs from sky2 have ip_summed set to CHECKSUM_COMPLETE, and
      because the packet is an encapsulated Ethernet frame, the checksum
      generated by the hardware includes the inner protocol and Ethernet
      headers.
      
      	The receive code is careful to update the skb->csum, except in
      __dev_forward_skb, as called by dev_forward_skb.  __dev_forward_skb
      calls eth_type_trans, which in turn calls skb_pull_inline(skb, ETH_HLEN)
      to skip over the Ethernet header, but does not update skb->csum when
      doing so.
      
      	This patch resolves the problem by adding a call to
      skb_postpull_rcsum to update the skb->csum after the call to
      eth_type_trans.
      Signed-off-by: default avatarJay Vosburgh <jay.vosburgh@canonical.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      4fdd841d