1. 02 Jun, 2018 11 commits
    • netfilter: nf_tables: handle chain name lookups via rhltable · 1b2470e5
      Florian Westphal authored
      If there is a large number of chains, list search is too slow, so
      add an rhlist table for this.
      
      This speeds up ruleset loading: for every new rule we have to check if
      the name already exists in current generation.
      
      We need to be able to cope with duplicate chain names in case a transaction
      drops the nfnl mutex (for request_module) and the abort of this old
      transaction is still pending.
      
      The list is kept -- we need a way to iterate chains even if hash resize is
      in progress without missing an entry.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
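The lookup scheme described in this commit can be illustrated with a minimal user-space sketch. It is not the kernel code: `struct chain`, `struct table`, `table_add`, `table_lookup` and `table_count` are hypothetical names, the hash table is a fixed-size bucket array rather than an rhltable, and the generation bitmask is a simplified stand-in for the kernel's genmask handling. It only shows the two ideas from the message: name lookups go through a hash that tolerates duplicate names, and a plain list is kept alongside for full iteration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NBUCKETS 64

/* Hypothetical stand-in for a chain; not the kernel's struct nft_chain. */
struct chain {
    const char *name;
    unsigned int genmask;      /* bit i set: chain is active in generation i */
    struct chain *hash_next;   /* bucket list; duplicate names share a bucket */
    struct chain *list_next;   /* plain list, kept for complete iteration */
};

struct table {
    struct chain *buckets[NBUCKETS];
    struct chain *list_head;
};

/* Simple string hash (djb2 variant), reduced to a bucket index. */
static unsigned int hash_name(const char *s)
{
    unsigned int h = 5381;

    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h % NBUCKETS;
}

/* Insert into both the hash bucket and the iteration list. */
static void table_add(struct table *t, struct chain *c)
{
    unsigned int b = hash_name(c->name);

    c->hash_next = t->buckets[b];
    t->buckets[b] = c;
    c->list_next = t->list_head;
    t->list_head = c;
}

/* Return the chain with this name that is active in generation gen. */
static struct chain *table_lookup(const struct table *t, const char *name,
                                  unsigned int gen)
{
    struct chain *c;

    assert(gen < 2);
    for (c = t->buckets[hash_name(name)]; c; c = c->hash_next)
        if (strcmp(c->name, name) == 0 && (c->genmask & (1u << gen)))
            return c;
    return NULL;
}

/* Walk the plain list; this does not depend on the hash layout. */
static size_t table_count(const struct table *t)
{
    size_t n = 0;
    const struct chain *c;

    for (c = t->list_head; c; c = c->list_next)
        n++;
    return n;
}
```

Two chains with the same name can coexist in this sketch as long as they are active in different generations; the lookup picks the one matching the requested generation, while the list walk still sees both.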
    • netfilter: nf_tables: add connlimit support · 290180e2
      Pablo Neira Ayuso authored
      This feature allows you to limit the maximum number of connections
      per arbitrary key. The connlimit expression is stateful, therefore it
      can be used from meters to dynamically populate a set; this provides
      a mapping to the iptables connlimit match. This patch also allows you
      to define static connlimit policies.
      
      This extension depends on the nf_conncount infrastructure.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
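As a rough illustration of "limit the maximum number of connections per arbitrary key", the following user-space sketch keeps a per-key counter and rejects new connections once the limit is reached. Everything here is an assumption made for illustration: `struct conn_table`, `conn_open` and `conn_close` are hypothetical names, the key is a plain 32-bit value, and the fixed-size array stands in for the nf_conncount infrastructure that the real extension depends on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_KEYS 16

/* One counter per key (for example, a source address). Hypothetical layout. */
struct conn_entry {
    uint32_t key;
    unsigned int count;
    bool used;
};

struct conn_table {
    struct conn_entry e[MAX_KEYS];
    unsigned int limit;        /* maximum connections allowed per key */
};

/* Find the entry for key; optionally claim a free slot if none exists. */
static struct conn_entry *conn_find(struct conn_table *t, uint32_t key,
                                    bool create)
{
    struct conn_entry *free_slot = NULL;

    for (size_t i = 0; i < MAX_KEYS; i++) {
        if (t->e[i].used && t->e[i].key == key)
            return &t->e[i];
        if (!t->e[i].used && !free_slot)
            free_slot = &t->e[i];
    }
    if (!create || !free_slot)
        return NULL;
    free_slot->used = true;
    free_slot->key = key;
    free_slot->count = 0;
    return free_slot;
}

/* Returns true if the connection is accepted, false if over the limit
 * (or if the table has no room for a new key). */
static bool conn_open(struct conn_table *t, uint32_t key)
{
    struct conn_entry *e;

    if (t->limit == 0)
        return false;
    e = conn_find(t, key, true);
    if (!e || e->count >= t->limit)
        return false;
    e->count++;
    return true;
}

/* Drop one connection for key; free the slot when the count reaches zero. */
static void conn_close(struct conn_table *t, uint32_t key)
{
    struct conn_entry *e = conn_find(t, key, false);

    if (e && e->count > 0 && --e->count == 0)
        e->used = false;
}
```

The sketch counts connections explicitly on open and close; the real expression instead derives the count from tracked connections, which is why it is stateful.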
    • netfilter: nf_tables: add destroy_clone expression · 371ebcbb
      Pablo Neira Ayuso authored
      Before this patch, cloned expressions are released via ->destroy. This
      is a problem for the new connlimit expression, since the ->destroy path
      drops a reference on the conntrack modules and unregisters hooks. The
      new ->destroy_clone provides context that this expression is being
      released from the packet path, so it mirrors ->clone(): no module
      reference is dropped and no hooks are unregistered, because that is
      done from the control plane, in the ->init() path.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
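The split between ->destroy and ->destroy_clone can be sketched in user space as follows. This is an assumption-laden illustration, not the kernel's expression API: `struct expr`, `struct expr_ops`, the `demo_*` callbacks and `expr_release_clone` are hypothetical, and two integer counters stand in for the module reference and the hook registration. The point is only that a clone owns private state but never took a reference, so releasing it must not drop one.

```c
#include <assert.h>
#include <stdlib.h>

struct expr;

/* Hypothetical ops table, loosely modeled on the callbacks named above. */
struct expr_ops {
    int  (*init)(struct expr *e);
    int  (*clone)(struct expr *dst, const struct expr *src);
    void (*destroy)(struct expr *e);
    void (*destroy_clone)(struct expr *e);   /* optional */
};

struct expr {
    const struct expr_ops *ops;
    unsigned int *count;       /* private per-expression state */
};

/* Stand-ins for the conntrack module reference and hook registration. */
static int module_refs;
static int hooks_registered;

/* Control plane: allocate state, take the reference, register the hook. */
static int demo_init(struct expr *e)
{
    e->count = calloc(1, sizeof(*e->count));
    if (!e->count)
        return -1;
    module_refs++;
    hooks_registered++;
    return 0;
}

/* Packet path: copy the state only; no reference, no hook. */
static int demo_clone(struct expr *dst, const struct expr *src)
{
    dst->count = malloc(sizeof(*dst->count));
    if (!dst->count)
        return -1;
    *dst->count = *src->count;
    dst->ops = src->ops;
    return 0;
}

/* Mirror of clone: release the state and nothing else. */
static void demo_destroy_clone(struct expr *e)
{
    free(e->count);
    e->count = NULL;
}

/* Mirror of init: release the state, then undo hook and reference. */
static void demo_destroy(struct expr *e)
{
    demo_destroy_clone(e);
    hooks_registered--;
    module_refs--;
}

static const struct expr_ops demo_ops = {
    .init          = demo_init,
    .clone         = demo_clone,
    .destroy       = demo_destroy,
    .destroy_clone = demo_destroy_clone,
};

/* Release a cloned expression, preferring destroy_clone when provided. */
static void expr_release_clone(struct expr *e)
{
    if (e->ops->destroy_clone)
        e->ops->destroy_clone(e);
    else
        e->ops->destroy(e);
}
```

Falling back to ->destroy when ->destroy_clone is absent is a design choice of this sketch; it keeps expressions without extra resources unchanged.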
    • netfilter: nf_tables: garbage collection for stateful expressions · 79b174ad
      Pablo Neira Ayuso authored
      Use the garbage collector to schedule removal of elements based on
      feedback from the expression that the element comes with. Therefore,
      the garbage collector is not guided by timeout expirations in this new
      mode.
      
      The new connlimit expression sets the NFT_EXPR_GC flag to enable this
      behaviour; the dynset expression needs to explicitly enable the garbage
      collector via the set->ops->gc_init call.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
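A garbage-collection pass driven by expression feedback rather than timeouts might look roughly like the sketch below. The names are hypothetical: `EXPR_F_GC` only mimics the role of NFT_EXPR_GC, `struct elem` is a flattened stand-in for a set element with an attached expression, and `connlimit_gc` assumes that "no connections left" is the feedback that makes an element collectable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag playing the role of NFT_EXPR_GC. */
#define EXPR_F_GC 0x1u

/* Flattened stand-in for a set element plus its stateful expression. */
struct elem {
    bool in_use;
    unsigned int expr_flags;
    unsigned int conn_count;
    bool (*expr_gc)(const struct elem *e);   /* true: element can go */
};

/* Feedback callback: collectable once no connections remain (assumption). */
static bool connlimit_gc(const struct elem *e)
{
    return e->conn_count == 0;
}

/* One GC pass: only elements whose expression opted in via the flag are
 * considered, and removal depends on the callback, not on a timeout. */
static size_t set_gc_run(struct elem *elems, size_t n)
{
    size_t reclaimed = 0;

    for (size_t i = 0; i < n; i++) {
        struct elem *e = &elems[i];

        if (!e->in_use || !(e->expr_flags & EXPR_F_GC) || !e->expr_gc)
            continue;
        if (e->expr_gc(e)) {
            e->in_use = false;
            reclaimed++;
        }
    }
    return reclaimed;
}
```

Elements whose expression does not set the flag are left alone by this pass, which matches the opt-in behaviour the message describes.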
    • netfilter: nf_tables: pass ctx to nf_tables_expr_destroy() · 3453c927
      Pablo Neira Ayuso authored
      nft_set_elem_destroy() can be called from call_rcu context. Annotate
      netns and table in set object so we can populate the context object.
      Moreover, pass context object to nf_tables_set_elem_destroy() from the
      commit phase, since it is already available from there.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nf_conncount: expose connection list interface · 5e5cbc7b
      Pablo Neira Ayuso authored
      This patch provides an interface to maintain the list of connections and
      the lookup function to obtain the number of connections in the list.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nf_tables: pass context to object destroy indirection · 00bfb320
      Pablo Neira Ayuso authored
      The new connlimit object needs this to properly deal with conntrack
      dependencies.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: Libify xt_TPROXY · 45ca4e0c
      Máté Eckl authored
      The extracted functions will likely be useful to implement tproxy
      support in nf_tables.
      
      Extracted functions:
      	- nf_tproxy_sk_is_transparent
      	- nf_tproxy_laddr4
      	- nf_tproxy_handle_time_wait4
      	- nf_tproxy_get_sock_v4
      	- nf_tproxy_laddr6
      	- nf_tproxy_handle_time_wait6
      	- nf_tproxy_get_sock_v6
      
      (nf_)tproxy_handle_time_wait6 also needed some refactoring, as its
      current implementation was xtables-specific.
      Signed-off-by: Máté Eckl <ecklm94@gmail.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: Decrease code duplication regarding transparent socket option · 8d6e5557
      Máté Eckl authored
      There is a function in include/net/netfilter/nf_socket.h to decide
      whether a socket has the IP(V6)_TRANSPARENT socket option set. However,
      this does the same as inet_sk_transparent() in include/net/tcp.h:
      
      include/net/tcp.h:1733
      /* This helper checks if socket has IP_TRANSPARENT set */
      static inline bool inet_sk_transparent(const struct sock *sk)
      {
      	switch (sk->sk_state) {
      	case TCP_TIME_WAIT:
      		return inet_twsk(sk)->tw_transparent;
      	case TCP_NEW_SYN_RECV:
      		return inet_rsk(inet_reqsk(sk))->no_srccheck;
      	}
      	return inet_sk(sk)->transparent;
      }
      
      tproxy_sk_is_transparent has also been refactored to use this function
      instead of reimplementing it.
      Signed-off-by: Máté Eckl <ecklm94@gmail.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next · 1ffdd8e1
      David S. Miller authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter/IPVS updates for net-next
      
      The following patchset contains Netfilter/IPVS updates for your net-next
      tree, the most relevant things in this batch are:
      
      1) Compile masquerade infrastructure into NAT module, from Florian Westphal.
         Same thing with the redirection support.
      
      2) Abort transaction if early initialization of the commit phase fails.
         Also from Florian.
      
      3) Get rid of synchronize_rcu() by using rule array in nf_tables, from
         Florian.
      
      4) Abort nf_tables batch if fatal signal is pending, from Florian.
      
      5) Use .call_rcu nfnetlink from nf_tables to make dumps fully lockless.
         From Florian Westphal.
      
      6) Support to match transparent sockets from nf_tables, from Máté Eckl.
      
      7) Audit support for nf_tables, from Phil Sutter.
      
      8) Validate chain dependencies from commit phase, fall back to fine grain
         validation only in case of errors.
      
      9) Attach dst to skbuff from netfilter flowtable packet path, from
         Jason A. Donenfeld.
      
      10) Use artificial maximum attribute cap to remove VLA from nfnetlink.
          Patch from Kees Cook.
      
      11) Add extension to allow to forward packets through neighbour layer.
      
      12) Add IPv6 conntrack helper support to IPVS, from Julian Anastasov.
      
      13) Add IPv6 FTP conntrack support to IPVS, from Julian Anastasov.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'mlx5e-updates-2018-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · f39c6b29
      David S. Miller authored
      Saeed Mahameed says:
      
      ====================
      mlx5e-updates-2018-06-01
      
      1) From Tariq, two patches to fix IPoIB issues introduced in
         "net/mlx5e: TX, Use actual WQE size for SQ edge fill"
      
      2) From Eran, Additional improvements to mlx5e statistics reporting
      
      3) From Maor, Increase aRFS flow tables size
      
      4) From Adi, Support MTU change for ethernet representors
      
      5) From Ilan and Adi, Handle QP error events in FPGA
      
      6) From Tariq, the last 10 patches mainly deal with RX buffer scheme improvements for legacy RQ
         to use only order-0 pages and fragmented SKBs for large MTUs.
      
      -  Tariq starts with some refactoring and removing HW LRO support from traditional
         (legacy) RQ, since it complicates the buffer scheme and removing it makes it smoother
         to move to cyclic descriptor buffer for traditional RQ.
      
      - Use cyclic WQ in legacy RQ, which has many benefits and paves the way for fragmented SKBs
        for large MTUs.
      
      - Enhance legacy Receive Queue memory scheme, such that only order-0 pages are used.
        Whenever possible, prefer using a linear SKB, and build it wrapping the WQE buffer.
        Otherwise (for example, jumbo frames on x86), use non-linear SKB, with as many frags
        as needed. In this case, multiple WQE scatter entries are used, up to a maximum of 4
        frags and 10KB of MTU.
      
      - TX statistics access improvements.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 01 Jun, 2018 29 commits