1. 10 Jan, 2022 21 commits
  2. 09 Jan, 2022 19 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next · 77bbcb60
      Jakub Kicinski authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter updates for net-next
      
      The following patchset contains Netfilter updates for net-next. This
      includes one patch to update ovs and act_ct to use nf_ct_put() instead
      of nf_conntrack_put().
      
      1) Add netns_tracker to nfnetlink_log and masquerade, from Eric Dumazet.
      
      2) Remove redundant rcu read-side lock in nf_tables packet path.
      
      3) Replace BUG() by WARN_ON_ONCE() in nft_payload.
      
      4) Consolidate rule verdict tracing.
      
      5) Replace WARN_ON() by WARN_ON_ONCE() in nf_tables core.
      
      6) Make counter support built-in in nf_tables.
      
      7) Add new field to conntrack object to identify locally generated
         traffic, from Florian Westphal.
      
      8) Prevent NAT from shadowing well-known ports, from Florian Westphal.
      
      9) Merge nf_flow_table_{ipv4,ipv6} into nf_flow_table_inet, also from
         Florian.
      
      10) Remove redundant pointer in nft_pipapo AVX2 support, from Colin Ian King.
      
      11) Replace opencoded max() in conntrack, from Jiapeng Chong.
      
      12) Update conntrack to use refcount_t API, from Florian Westphal.
      
      13) Move ip_ct_attach indirection into the nf_ct_hook structure.
      
      14) Constify several pointer objects in the netfilter codebase,
          from Florian Westphal.
      
      15) Tree-wide replacement of nf_conntrack_put() by nf_ct_put(), also
          from Florian.
      
      16) Fix egress splat due to incorrect rcu notation, from Florian.
      
      17) Move stateful fields of connlimit, last, quota, numgen and limit
          out of the expression data area.
      
      18) Build a blob to represent the ruleset in nf_tables; this is a
          requirement of the new register tracking infrastructure.
      
      19) Add NFT_REG32_NUM to define the maximum number of 32-bit registers.
      
      20) Add register tracking infrastructure to skip redundant
          store-to-register operations; this includes support for payload,
          meta and bitwise expressions.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next: (32 commits)
        netfilter: nft_meta: cancel register tracking after meta update
        netfilter: nft_payload: cancel register tracking after payload update
        netfilter: nft_bitwise: track register operations
        netfilter: nft_meta: track register operations
        netfilter: nft_payload: track register operations
        netfilter: nf_tables: add register tracking infrastructure
        netfilter: nf_tables: add NFT_REG32_NUM
        netfilter: nf_tables: add rule blob layout
        netfilter: nft_limit: move stateful fields out of expression data
        netfilter: nft_limit: rename stateful structure
        netfilter: nft_numgen: move stateful fields out of expression data
        netfilter: nft_quota: move stateful fields out of expression data
        netfilter: nft_last: move stateful fields out of expression data
        netfilter: nft_connlimit: move stateful fields out of expression data
        netfilter: egress: avoid a lockdep splat
        net: prefer nf_ct_put instead of nf_conntrack_put
        netfilter: conntrack: avoid useless indirection during conntrack destruction
        netfilter: make function op structures const
        netfilter: core: move ip_ct_attach indirection to struct nf_ct_hook
        netfilter: conntrack: convert to refcount_t api
        ...
      ====================
      
      Link: https://lore.kernel.org/r/20220109231640.104123-1-pablo@netfilter.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • netfilter: nft_meta: cancel register tracking after meta update · 4a80e026
      Pablo Neira Ayuso authored
      The meta expression might mangle the packet metadata, so cancel register
      tracking, since any metadata held in the registers may now be stale.
      
      Finer-grained cancellation of register tracking, by inspecting the meta
      type on the register, is also possible.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nft_payload: cancel register tracking after payload update · cc003c7e
      Pablo Neira Ayuso authored
      The payload expression might mangle the packet, so cancel register
      tracking, since any payload data held in the registers may now be stale.
      
      Finer-grained cancellation of register tracking, by inspecting the
      payload base, offset and length on the register, is also possible.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nft_bitwise: track register operations · be5650f8
      Pablo Neira Ayuso authored
      Check if the destination register already contains the result of this
      bitwise expression. If so, this redundant operation can be skipped.
      
      If the destination contains a different bitwise operation, cancel the
      register tracking information. If the destination contains no bitwise
      operation, update the register tracking information.
      
      Update the payload and meta expressions to check whether this bitwise
      operation has already been performed on the register. Hence, both the
      payload/meta and the bitwise expressions are reduced.
      
      There is also a special case: if source register != destination register
      and the source register has not been updated by a previous bitwise
      operation, then transfer the selector from the source register to the
      destination register.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
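The decision rules in the nft_bitwise commit message can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions, not the kernel code: `struct reg_state`, `struct bitwise_op`, `reg_cancel()` and `bitwise_track()` are hypothetical names, a selector is reduced to a plain integer id (0 meaning "unknown"), and a bitwise operation is reduced to a mask/xor pair. Cancelling the destination when the source is unknown is an extra assumption of this model.

```c
#include <assert.h>
#include <stdbool.h>

#define NREGS 4 /* illustrative register count, not the kernel's */

/* Hypothetical per-register tracking state. */
struct reg_state {
    int selector;          /* id of the loaded selector, 0 = unknown */
    bool has_bitwise;      /* a bitwise operation was recorded */
    unsigned int mask;
    unsigned int xor_val;
};

/* Hypothetical description of one bitwise expression. */
struct bitwise_op {
    int sreg, dreg;
    unsigned int mask, xor_val;
};

static void reg_cancel(struct reg_state *r)
{
    r->selector = 0;
    r->has_bitwise = false;
    r->mask = 0;
    r->xor_val = 0;
}

/* Returns true if the bitwise expression is redundant and can be skipped. */
static bool bitwise_track(struct reg_state *regs, const struct bitwise_op *op)
{
    struct reg_state *src, *dst;

    assert(op->sreg >= 0 && op->sreg < NREGS);
    assert(op->dreg >= 0 && op->dreg < NREGS);
    src = &regs[op->sreg];
    dst = &regs[op->dreg];

    /* Unknown input: the result is unknown too (assumption of this model). */
    if (src->selector == 0) {
        reg_cancel(dst);
        return false;
    }

    /* Destination already holds this selector with this same operation. */
    if (dst->selector == src->selector && dst->has_bitwise &&
        dst->mask == op->mask && dst->xor_val == op->xor_val)
        return true;

    /* Destination holds a different bitwise operation: cancel tracking. */
    if (dst->has_bitwise) {
        reg_cancel(dst);
        return false;
    }

    /* Special case: distinct registers, source untouched by a bitwise op. */
    if (op->sreg != op->dreg) {
        if (src->has_bitwise) {
            reg_cancel(dst);
            return false;
        }
        dst->selector = src->selector;
    }

    /* No bitwise operation recorded yet: update the tracking information. */
    dst->has_bitwise = true;
    dst->mask = op->mask;
    dst->xor_val = op->xor_val;
    return false;
}
```

In this model a repeated "load selector, apply mask" sequence is reduced on its second occurrence, on the assumption that the preceding load of the same selector was itself found redundant.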
    • netfilter: nft_meta: track register operations · 9b17afb2
      Pablo Neira Ayuso authored
      Check if the destination register already contains the data that this
      meta store expression would load. If so, this redundant operation can
      be skipped. If the destination contains a different selector, update
      the register tracking information.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nft_payload: track register operations · a7c176bf
      Pablo Neira Ayuso authored
      Check if the destination register already contains the data that this
      payload store expression would load. If so, this redundant operation can
      be skipped. If the destination contains a different selector, update
      the register tracking information.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nf_tables: add register tracking infrastructure · 12e4ecfa
      Pablo Neira Ayuso authored
      This patch adds new infrastructure to skip redundant selector store
      operations on the same register to achieve a performance boost in
      the packet path.
      
      This is particularly noticeable in pure linear rulesets, but it also
      helps in rulesets that already rely heavily on maps to avoid
      linear ruleset inspection.
      
      The idea is to keep data of the most recurrent store operations on
      registers and reuse it with cmp and lookup expressions.
      
      This infrastructure allows for dynamic ruleset updates, since the ruleset
      blob reduction happens in the kernel.
      
      Userspace still needs to be updated to maximize register utilization, so
      that it cooperates in improving register data reuse and reducing the
      number of store-to-register operations.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
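The core idea of skipping a store whose destination register already holds the same selector can be sketched in userspace C. This is a minimal model, not the kernel infrastructure: `struct store_op`, `struct track`, `track_store()` and `reduce_stores()` are hypothetical names, `REG_NUM` is an illustrative register count, and a selector is reduced to an integer id.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define REG_NUM 20 /* assumed register count, for illustration only */

/* Hypothetical "store selector into register" operation. */
struct store_op {
    int dreg;     /* destination register */
    int selector; /* what is loaded, e.g. an id standing for "ip saddr" */
};

/* Hypothetical tracking state: the selector each register last received. */
struct track {
    bool valid[REG_NUM];
    int selector[REG_NUM];
};

static void track_init(struct track *t)
{
    memset(t, 0, sizeof(*t));
}

/* Returns true if the store is redundant and can be skipped. */
static bool track_store(struct track *t, const struct store_op *op)
{
    assert(op->dreg >= 0 && op->dreg < REG_NUM);

    if (t->valid[op->dreg] && t->selector[op->dreg] == op->selector)
        return true;

    /* Different or unknown contents: remember what the register holds now. */
    t->valid[op->dreg] = true;
    t->selector[op->dreg] = op->selector;
    return false;
}

/* Compacts ops in place, dropping redundant stores; returns the new count. */
static size_t reduce_stores(struct store_op *ops, size_t n)
{
    struct track t;
    size_t i, kept = 0;

    track_init(&t);
    for (i = 0; i < n; i++) {
        if (!track_store(&t, &ops[i]))
            ops[kept++] = ops[i];
    }
    return kept;
}
```

Calling `reduce_stores()` on a linear sequence in which consecutive rules load the same selector into the same register keeps only the first of those loads. A store of a different selector to that register invalidates the earlier entry, so a later reload is kept.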
    • netfilter: nf_tables: add NFT_REG32_NUM · 642c8eff
      Pablo Neira Ayuso authored
      Add a definition for the maximum number of 32-bit registers that
      are used as a scratchpad memory area to store data.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nf_tables: add rule blob layout · 2c865a8a
      Pablo Neira Ayuso authored
      This patch adds a blob layout per chain to represent the ruleset in the
      packet datapath.
      
              size (unsigned long)
              struct nft_rule_dp
                struct nft_expr
                ...
              struct nft_rule_dp
                struct nft_expr
                ...
              struct nft_rule_dp (is_last=1)
      
      The new structure nft_rule_dp represents the rule in a more compact way
      (smaller memory footprint) compared to the control-plane nft_rule
      structure.
      
      The ruleset blob is a read-only data structure. The first field contains
      the blob size, followed by the rules containing expressions. There is a
      trailing rule, used by the tracing infrastructure, which is equivalent to
      the NULL rule marker in the previous representation. The blob size field
      does not include the size of this trailing rule marker.
      
      The ruleset blob is generated from the commit path.
      
      This patch reuses the infrastructure available since 0cbc06b3
      ("netfilter: nf_tables: remove synchronize_rcu in commit phase") to
      build the array of rules per chain.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
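The blob layout described in this commit (size field, variable-length rules, trailing marker not counted in the size) can be illustrated with a simplified userspace model. This is a sketch under stated assumptions: `struct rule_hdr` and `struct rule_blob` are hypothetical stand-ins for the kernel structures, whose real fields differ, and expression data is represented only by its length.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical simplified rule header; the kernel's nft_rule_dp differs. */
struct rule_hdr {
    uint32_t is_last; /* 1 only for the trailing marker */
    uint32_t dlen;    /* bytes of expression data that follow */
};

/* Hypothetical blob: size field first, then rules, then the marker. */
struct rule_blob {
    unsigned long size; /* excludes the trailing marker */
    unsigned char data[];
};

/* Build a blob from n rules whose expression data lengths are dlen[i]. */
static struct rule_blob *blob_build(const uint32_t *dlen, size_t n)
{
    struct rule_hdr last = { .is_last = 1, .dlen = 0 };
    unsigned long size = 0;
    struct rule_blob *blob;
    unsigned char *p;
    size_t i;

    for (i = 0; i < n; i++)
        size += sizeof(struct rule_hdr) + dlen[i];

    /* Allocate room for the rules plus the trailing marker. */
    blob = calloc(1, sizeof(*blob) + size + sizeof(struct rule_hdr));
    if (!blob)
        return NULL;
    blob->size = size;

    p = blob->data;
    for (i = 0; i < n; i++) {
        struct rule_hdr h = { .is_last = 0, .dlen = dlen[i] };

        memcpy(p, &h, sizeof(h));
        p += sizeof(h) + dlen[i]; /* expression bytes stay zeroed here */
    }
    memcpy(p, &last, sizeof(last));
    return blob;
}

/* Walk the blob until the marker; returns the number of rules seen. */
static size_t blob_count_rules(const struct rule_blob *blob)
{
    const unsigned char *p = blob->data;
    size_t count = 0;

    for (;;) {
        struct rule_hdr h;

        memcpy(&h, p, sizeof(h)); /* memcpy avoids unaligned reads */
        if (h.is_last)
            break;
        count++;
        p += sizeof(h) + h.dlen;
    }
    /* The marker sits exactly at offset `size`, which excludes it. */
    assert((unsigned long)(p - blob->data) == blob->size);
    return count;
}
```

Because the walk stops at the `is_last` marker, the reader needs neither a rule count nor a NULL-terminated pointer array, which matches the role the commit message gives to the trailing rule.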
    • netfilter: nft_limit: move stateful fields out of expression data · 3b9e2ea6
      Pablo Neira Ayuso authored
      In preparation for the rule blob representation.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nft_limit: rename stateful structure · 369b6cb5
      Pablo Neira Ayuso authored
      Rename struct nft_limit to struct nft_limit_priv.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nft_numgen: move stateful fields out of expression data · 567882eb
      Pablo Neira Ayuso authored
      In preparation for the rule blob representation.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nft_quota: move stateful fields out of expression data · ed0a0c60
      Pablo Neira Ayuso authored
      In preparation for the rule blob representation.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nft_last: move stateful fields out of expression data · 33a24de3
      Pablo Neira Ayuso authored
      In preparation for the rule blob representation.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nft_connlimit: move stateful fields out of expression data · 37f319f3
      Pablo Neira Ayuso authored
      In preparation for the rule blob representation.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
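The "move stateful fields out of expression data" commits share one pattern, sketched here in userspace C. The assumption, not stated in these one-line commit messages, is that the expression data ends up copied into the read-only datapath blob, so mutable counters have to live behind a pointer. `struct quota_state`, `struct quota_priv` and the `quota_*()` helpers are hypothetical names, not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical mutable state, allocated separately from the expression. */
struct quota_state {
    uint64_t consumed;
};

/* Hypothetical expression private data: configuration plus a pointer. */
struct quota_priv {
    uint64_t limit;
    struct quota_state *state;
};

static int quota_init(struct quota_priv *priv, uint64_t limit)
{
    priv->limit = limit;
    priv->state = calloc(1, sizeof(*priv->state));
    return priv->state ? 0 : -1;
}

/*
 * Returns 1 once the byte budget is exceeded. Only the pointed-to state is
 * written; the expression data itself can stay read-only. A real datapath
 * would need atomic updates, which this sketch omits.
 */
static int quota_eval(const struct quota_priv *priv, uint64_t bytes)
{
    assert(priv->state != NULL);
    priv->state->consumed += bytes;
    return priv->state->consumed > priv->limit;
}

static void quota_destroy(struct quota_priv *priv)
{
    free(priv->state);
    priv->state = NULL;
}
```

A byte-for-byte copy of `struct quota_priv` still points at the same `struct quota_state`, so the control-plane object and a datapath copy observe the same counter.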
    • netfilter: egress: avoid a lockdep splat · 6316136e
      Florian Westphal authored
      include/linux/netfilter_netdev.h:97 suspicious rcu_dereference_check() usage!
      2 locks held by sd-resolve/1100:
       0: ..(rcu_read_lock_bh){1:3}, at: ip_finish_output2
       1: ..(rcu_read_lock_bh){1:3}, at: __dev_queue_xmit
       __dev_queue_xmit+0 ..
      
      The helper has two callers, one uses rcu_read_lock, the other
      rcu_read_lock_bh().  Annotate the dereference to reflect this.
      
      Fixes: 42df6e1d ("netfilter: Introduce egress hook")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • net: prefer nf_ct_put instead of nf_conntrack_put · 408bdcfc
      Florian Westphal authored
      nf_ct_put() does the same as nf_conntrack_put(), but without the
      need for an indirect call.  The downside is a module dependency on
      nf_conntrack, but all of these already depend on conntrack anyway.
      
      Cc: Paul Blakey <paulb@mellanox.com>
      Cc: dev@openvswitch.org
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: conntrack: avoid useless indirection during conntrack destruction · 6ae7989c
      Florian Westphal authored
      nf_ct_put() results in a useless indirection:
      
      nf_ct_put -> nf_conntrack_put -> nf_conntrack_destroy -> rcu readlock +
      indirect call of ct_hooks->destroy().
      
      There are two _put helpers:
      nf_ct_put and nf_conntrack_put.  The latter is what should be used in
      code that MUST NOT cause a linker dependency on the conntrack module
      (e.g. calls from core network stack).
      
      Everyone else should call nf_ct_put() instead.
      
      A followup patch will convert a few nf_conntrack_put() calls to
      nf_ct_put(), in particular from modules that already have a conntrack
      dependency such as act_ct or even nf_conntrack itself.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
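The difference between the two _put helpers can be shown with a small userspace model. It is a sketch, not the kernel code: `struct conn`, `ct_put()`, `conntrack_put()`, `struct ct_hook` and `module_load()` are hypothetical names, and the reference count is a plain integer rather than an atomic refcount.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical connection object with a non-atomic reference count. */
struct conn {
    int refcnt;
    int destroyed;
};

/* "Module" side: the destructor lives with the conntrack code. */
static void conn_destroy(struct conn *ct)
{
    ct->destroyed = 1;
}

/* Direct put: the caller references conn_destroy(), i.e. links to the module. */
static void ct_put(struct conn *ct)
{
    if (ct && --ct->refcnt == 0)
        conn_destroy(ct);
}

/* "Core" side: knows only a hook pointer, so it has no link dependency. */
struct ct_hook {
    void (*destroy)(struct conn *ct);
};

static const struct ct_hook *ct_hook_ptr; /* set when the module loads */

static void conntrack_put(struct conn *ct)
{
    if (ct && --ct->refcnt == 0) {
        const struct ct_hook *hook = ct_hook_ptr;

        assert(hook != NULL);
        hook->destroy(ct); /* indirect call through the hook */
    }
}

static const struct ct_hook module_hook = { .destroy = conn_destroy };

static void module_load(void)
{
    ct_hook_ptr = &module_hook;
}
```

Both helpers end in the same destructor. The direct variant avoids the pointer load and indirect call, at the cost of requiring the caller to link against the code that defines `conn_destroy()`.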
    • netfilter: make function op structures const · 285c8a7a
      Florian Westphal authored
      No functional changes, these structures should be const.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>