1. 07 Apr, 2017 5 commits
  2. 06 Apr, 2017 13 commits
    • netfilter: ctnetlink: Expectations must have a conntrack helper area · 2c62e0bc
      Gao Feng authored
      The expect check function __nf_ct_expect_check() requires master_help,
      so there is no point in going ahead in ctnetlink_alloc_expect() when
      there is no help.
      
      Commit bc01befd ("netfilter: ctnetlink: add support for
      user-space expectation helpers") allowed ctnetlink to create an
      expectation even when there is no master help, but the later commit
      3d058d7b ("netfilter: rework user-space expectation helper support")
      disabled that again.
      Signed-off-by: Gao Feng <fgao@ikuai8.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nat: avoid use of nf_conn_nat extension · 6e699867
      Florian Westphal authored
      A successful insert into the bysource hash sets the IPS_SRC_NAT_DONE
      status bit, so we can check that bit instead of the presence of the nat
      extension, which requires an extra dereference.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nat: nf_nat_mangle_{udp,tcp}_packet returns boolean · cba81cc4
      Gao Feng authored
      nf_nat_mangle_{udp,tcp}_packet() returns int, but many callers treat
      the result as a bool. Fix this by consistently handling the return
      value as a boolean.
      Signed-off-by: Gao Feng <fgao@ikuai8.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nf_ct_expect: Add nf_ct_remove_expect() · ec0e3f01
      Gao Feng authored
      Removing an expectation takes three statements, and that sequence is
      duplicated in several places in the current code. Add a common
      function, nf_ct_remove_expect(), to consolidate it.
      Signed-off-by: Gao Feng <fgao@ikuai8.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: expect: Make sure the max_expected limit is effective · 92f73221
      Gao Feng authored
      Because expecting, a member of nf_conn_help, is a u8, it overflows
      after reaching U8_MAX (255). The limit therefore has no effect when an
      expect policy sets max_expected above 255.
      
      Add a check for max_expected and return -EINVAL when it exceeds the
      limit.
      Signed-off-by: Gao Feng <fgao@ikuai8.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
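As a rough illustration of the max_expected commit above, the following userspace C sketch models an 8-bit counter wrapping past 255 and the kind of range check the commit message describes. The names `help_sketch`, `check_max_expected` and `count_expectations` are hypothetical stand-ins chosen for this example; this is not the kernel's code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the structure holding the u8 counter. */
struct help_sketch {
    uint8_t expecting;  /* 8-bit counter, so it can never exceed 255 */
};

/* Reject a policy limit that the 8-bit counter could never reach. */
int check_max_expected(unsigned int max_expected)
{
    if (max_expected > UINT8_MAX)
        return -EINVAL;
    return 0;
}

/* Model of the problem: count n expectations in a u8 and return the
 * value the counter ends up holding (it wraps modulo 256). */
uint8_t count_expectations(unsigned int n)
{
    struct help_sketch h = { .expecting = 0 };

    for (unsigned int i = 0; i < n; i++)
        h.expecting++;
    return h.expecting;
}
```

With a limit of 256, the counter in this model wraps back to 0 before it can ever equal the limit, which is why rejecting such a limit up front is the simpler fix.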
    • netfilter: nf_tables: add nft_is_base_chain() helper · f323d954
      Pablo Neira Ayuso authored
      This new helper function allows us to check if this is a basechain.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 6f14f443
      David S. Miller authored
      Mostly simple cases of overlapping changes (adding code nearby,
      a function whose name changes, for example).
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · ea6b1720
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Reject invalid updates to netfilter expectation policies, from Pablo
          Neira Ayuso.
      
       2) Fix memory leak in nfnl_cthelper, from Jeffy Chen.
      
       3) Don't do stupid things if we get a neigh_probe() on a neigh entry
          whose ops lack a solicit method. From Eric Dumazet.
      
       4) Don't transmit packets in r8152 driver when the carrier is off, from
          Hayes Wang.
      
       5) Fix ipv6 packet type detection in aquantia driver, from Pavel
          Belous.
      
       6) Don't write uninitialized data into hw registers in bna driver, from
          Arnd Bergmann.
      
       7) Fix locking in ping_unhash(), from Eric Dumazet.
      
       8) Make BPF verifier range checks able to understand certain sequences
          emitted by LLVM, from Alexei Starovoitov.
      
       9) Fix use after free in ipconfig, from Mark Rutland.
      
      10) Fix refcount leak on force commit in openvswitch, from Jarno
          Rajahalme.
      
      11) Fix various overflow checks in AF_PACKET, from Andrey Konovalov.
      
      12) Fix endianness bug in be2net driver, from Suresh Reddy.
      
      13) Don't forget to wake TX queues when processing a timeout, from
          Grygorii Strashko.
      
      14) ARP header on-stack storage is wrong in flow dissector, from Simon
          Horman.
      
      15) Lost retransmit and reordering SNMP stats in TCP can be
          underreported. From Yuchung Cheng.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (82 commits)
        nfp: fix potential use after free on xdp prog
        tcp: fix reordering SNMP under-counting
        tcp: fix lost retransmit SNMP under-counting
        sctp: get sock from transport in sctp_transport_update_pmtu
        net: ethernet: ti: cpsw: fix race condition during open()
        l2tp: fix PPP pseudo-wire auto-loading
        bnx2x: fix spelling mistake in macros HW_INTERRUT_ASSERT_SET_*
        l2tp: take reference on sessions being dumped
        tcp: minimize false-positives on TCP/GRO check
        sctp: check for dst and pathmtu update in sctp_packet_config
        flow dissector: correct size of storage for ARP
        net: ethernet: ti: cpsw: wake tx queues on ndo_tx_timeout
        l2tp: take a reference on sessions used in genetlink handlers
        l2tp: hold session while sending creation notifications
        l2tp: fix duplicate session creation
        l2tp: ensure session can't get removed during pppol2tp_session_ioctl()
        l2tp: fix race in l2tp_recv_common()
        sctp: use right in and out stream cnt
        bpf: add various verifier test cases for self-tests
        bpf, verifier: fix rejection of unaligned access checks for map_value_adj
        ...
    • nfp: fix potential use after free on xdp prog · c383bdd1
      Jakub Kicinski authored
      We should unregister the net_device first, before we give back
      our reference on xdp_prog.  Otherwise xdp_prog may be freed
      before .ndo_stop() disabled the datapath.  Found by code inspection.
      
      Fixes: ecd63a02 ("nfp: add XDP support in the driver")
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: Simon Horman <simon.horman@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bonding: attempt to better support longer hw addresses · faeeb317
      Jarod Wilson authored
      People are using bonding over Infiniband IPoIB connections, and who knows
      what else. Infiniband has a hardware address length of 20 octets
      (INFINIBAND_ALEN), and the network core defines a MAX_ADDR_LEN of 32.
      Various places in the bonding code are currently hard-wired to 6 octets
      (ETH_ALEN), such as the 3ad code, which I've left untouched here. Besides,
      only alb is currently possible on Infiniband links right now anyway, due
      to commit 1533e773, so the alb code is where most of the changes are.
      
      One major component of this change is the addition of a bond_hw_addr_copy
      function that takes a length argument, instead of using ether_addr_copy
      everywhere that hardware addresses need to be copied about. The other
      major component of this change is converting the bonding code from using
      struct sockaddr for address storage to struct sockaddr_storage, as the
      former has an address storage space of only 14, while the latter is 128
      minus a few, which is necessary to support bonding over devices with up to
      MAX_ADDR_LEN octet hardware addresses. Additionally, this probably fixes
      up some memory corruption issues with the current code, where it's
      possible to write an infiniband hardware address into a sockaddr declared
      on the stack.
      
      Lightly tested on a dual mlx4 IPoIB setup, which properly shows a 20-octet
      hardware address now:
      
      $ cat /proc/net/bonding/bond0
      Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
      
      Bonding Mode: fault-tolerance (active-backup) (fail_over_mac active)
      Primary Slave: mlx4_ib0 (primary_reselect always)
      Currently Active Slave: mlx4_ib0
      MII Status: up
      MII Polling Interval (ms): 100
      Up Delay (ms): 100
      Down Delay (ms): 100
      
      Slave Interface: mlx4_ib0
      MII Status: up
      Speed: Unknown
      Duplex: Unknown
      Link Failure Count: 0
      Permanent HW addr:
      80:00:02:08:fe:80:00:00:00:00:00:00:e4:1d:2d:03:00:1d:67:01
      Slave queue ID: 0
      
      Slave Interface: mlx4_ib1
      MII Status: up
      Speed: Unknown
      Duplex: Unknown
      Link Failure Count: 0
      Permanent HW addr:
      80:00:02:09:fe:80:00:00:00:00:00:01:e4:1d:2d:03:00:1d:67:02
      Slave queue ID: 0
      
      Also tested with a standard 1Gbps NIC bonding setup (with a mix of
      e1000 and e1000e cards), running LNST's bonding tests.
      
      CC: Jay Vosburgh <j.vosburgh@gmail.com>
      CC: Veaceslav Falico <vfalico@gmail.com>
      CC: Andy Gospodarek <andy@greyhouse.net>
      CC: netdev@vger.kernel.org
      Signed-off-by: Jarod Wilson <jarod@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
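To make the idea behind the bonding commit concrete, here is a minimal userspace sketch of a length-aware hardware address copy. It is written under assumptions: `hw_addr_copy_sketch` and the `SKETCH_*` constants are hypothetical names for this example, and the explicit destination-size check is an addition for illustration rather than something the commit message describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ETH_ALEN        6   /* Ethernet address length in octets */
#define SKETCH_INFINIBAND_ALEN 20  /* Infiniband address length in octets */
#define SKETCH_MAX_ADDR_LEN    32  /* largest length the network core allows */

/* Copy len octets of a hardware address into dst.
 * Returns 0 on success, or -1 if the address would not fit in dst
 * (the case that corrupts memory when the buffer is too small). */
int hw_addr_copy_sketch(uint8_t *dst, size_t dst_size,
                        const uint8_t *src, size_t len)
{
    if (len > dst_size || len > SKETCH_MAX_ADDR_LEN)
        return -1;
    memcpy(dst, src, len);
    return 0;
}
```

A 14-octet buffer, the address space the message attributes to struct sockaddr, cannot hold a 20-octet Infiniband address in this model, while a MAX_ADDR_LEN-sized buffer can.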
    • tcp: fix reordering SNMP under-counting · 2d2517ee
      Yuchung Cheng authored
      Currently the reordering SNMP counters only increase if a connection
      sees a higher degree than it has previously seen. Events whose
      reordering degree is not greater than the default system threshold
      are ignored. This significantly under-counts the number of reordering
      events and falsely conveys that reordering is rare on the network.
      
      This patch properly and faithfully records the number of reordering
      events detected by the TCP stack, just like the comment says "this
      exciting event is worth to be remembered". Note that even so, TCP
      still underestimates the actual number of reordering events, because
      it requires TS options or certain packet sequences to detect
      reordering (i.e. ACKing a never-retransmitted sequence in recovery or
      disordered state).
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
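The difference between the two counting policies in the reordering commit above can be modelled in a few lines of userspace C. This is a simplified sketch, not the TCP stack's code: `conn_sketch`, `conn_init` and `on_reordering` are hypothetical names, and the model assumes the per-connection degree starts at the default system threshold.

```c
#include <assert.h>

/* Hypothetical per-connection state for the model. */
struct conn_sketch {
    unsigned int reordering;   /* highest reordering degree seen so far */
    unsigned long events_old;  /* counts only new per-connection maxima */
    unsigned long events_new;  /* counts every detected event */
};

/* Start the connection at the default system threshold. */
void conn_init(struct conn_sketch *c, unsigned int default_threshold)
{
    c->reordering = default_threshold;
    c->events_old = 0;
    c->events_new = 0;
}

/* Record one detected reordering event of the given degree. */
void on_reordering(struct conn_sketch *c, unsigned int degree)
{
    c->events_new++;                /* policy described by the patch */
    if (degree > c->reordering) {   /* earlier policy: new maximum only */
        c->reordering = degree;
        c->events_old++;
    }
}
```

Feeding this model six events of degree 2, 3, 5, 4, 5 and 7 with a threshold of 3 leaves `events_old` at 2 and `events_new` at 6, which is the kind of gap the commit message calls under-counting.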
    • tcp: fix lost retransmit SNMP under-counting · ecde8f36
      Yuchung Cheng authored
      The lost retransmit SNMP stat under-counts retransmissions that use
      segment offloading. This patch fixes that so all
      retransmission-related SNMP counters are consistent.
      
      Fixes: 10d3be56 ("tcp-tso: do not split TSO packets at retransmit time")
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sfc: don't insert mc_list on low-latency firmware if it's too long · 148cbab6
      Edward Cree authored
      If the mc_list is longer than 256 addresses, we enter mc_promisc mode.
      If we're in mc_promisc mode and the firmware doesn't support cascaded
       multicast, normally we also insert our mc_list, to prevent stealing by
       another VI.  However, if the mc_list was too long, this isn't really
       helpful - the MC groups that didn't fit in the list can still get
       stolen, and having only some of them stealable will probably cause
       more confusing behaviour than having them all stealable.  Since
       inserting 256 multicast filters takes a long time and can lead to MCDI
       state machine timeouts, just skip the mc_list insert in this overflow
       condition.
      Signed-off-by: Edward Cree <ecree@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
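The decision described in the sfc commit above can be summarised as a small pure function. The sketch below is a model only: `mc_plan_sketch`, `plan_mc_filters` and `SKETCH_MC_LIST_MAX` are hypothetical names, the `want_promisc` input stands for any other reason to enter mc_promisc mode, and the choice not to insert the list when cascaded multicast is supported is an assumption inferred from the message, not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MC_LIST_MAX 256  /* list length limit quoted in the message */

struct mc_plan_sketch {
    bool mc_promisc;      /* use multicast-promiscuous filtering */
    bool insert_mc_list;  /* also insert one filter per list address */
};

struct mc_plan_sketch plan_mc_filters(size_t mc_count, bool want_promisc,
                                      bool cascaded_mc)
{
    struct mc_plan_sketch plan;
    bool overflow = mc_count > SKETCH_MC_LIST_MAX;

    plan.mc_promisc = want_promisc || overflow;
    if (!plan.mc_promisc)
        plan.insert_mc_list = true;       /* normal case: list only */
    else if (cascaded_mc)
        plan.insert_mc_list = false;      /* assumption: list not needed */
    else
        plan.insert_mc_list = !overflow;  /* skip an incomplete list */
    return plan;
}
```

In this model, a list of 300 addresses on firmware without cascaded multicast yields promiscuous mode with no per-address inserts, while a short list in promiscuous mode is still inserted to guard against stealing.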
  3. 05 Apr, 2017 22 commits