Commit 5fccd64a authored by David S. Miller

Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next

Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

The following patchset contains a large Netfilter update for net-next,
to summarise:

1) Add support for stateful objects. This series provides a nf_tables
   native alternative to the extended accounting infrastructure for
   nf_tables. Two initial stateful objects are supported: counters and
   quotas. Objects are identified by a user-defined name; you can fetch
   and reset them at any time. You can also use maps to allow fast lookups
   using any arbitrary key combination. More info at:

   http://marc.info/?l=netfilter-devel&m=148029128323837&w=2



2) On-demand registration of nf_conntrack and defrag hooks per netns.
   Register nf_conntrack hooks if we have a stateful ruleset, i.e.
   state-based filtering or NAT. The new nf_conntrack_default_on sysctl
   enables this for newly created network namespaces. Default behaviour
   is not modified. Patches from Florian Westphal.

3) Allocate 4k chunks and then use these for x_tables counter allocation
   requests. This improves ruleset load time and also datapath ruleset
   evaluation. Patches from Florian Westphal.

4) Add support for eBPF to the existing x_tables bpf extension.
   From Willem de Bruijn.

5) Update layer 4 checksum if any of the pseudoheader fields is updated.
   This provides a limited form of 1:1 stateless NAT that makes sense in
   specific scenarios, e.g. load balancing.

6) Add support to flush sets in nf_tables. This series comes with a new
   set->ops->deactivate_one() indirection given that we have to walk
   over the list of set elements, then deactivate them one by one.
   The existing set->ops->deactivate() performs an element lookup that
   we don't need.

7) Two patches to avoid cloning packets, thus speeding up packet
   forwarding via nft_fwd from ingress. From Florian Westphal.

8) Two IPVS patches via Simon Horman: Decrement TTL in all modes to
   prevent infinite loops, patch from Dwip Banerjee. And one minor
   refactoring from Gao Feng.

9) Revisit recent log support for nf_tables netdev families: One patch
   to ensure that we correctly handle non-ethernet packets. Another
   patch to add missing logger definition for netdev. Patches from
   Liping Zhang.

10) Three patches for nft_fib: one to address insufficient register
    initialization and another to solve an incorrect (although harmless)
    byteswap operation. Moreover, update xt_rpfilter and nft_fib to match
    lbcast packets with zeronet as source, e.g. DHCP Discover packets
    (0.0.0.0 -> 255.255.255.255). Also from Liping Zhang.

11) Built-in DCCP, SCTP and UDPlite conntrack and NAT support, from
    Davide Caratti. While DCCP is rather hopeless lately, and UDPlite has
    been broken in many-cast mode for some little time, let's give them a
    chance by placing them at the same level as other existing protocols.
    Thus, users don't explicitly have to modprobe support for this and
    NAT rules work for them. Some people point to the lack of support in
    SOHO Linux-based routers, which makes deployment of new protocols
    harder. I guess other middleboxes out there on the Internet are also
    to blame. Anyway, let's see if this has any impact in the midrun.

12) Skip SCTP software checksum calculation if the NIC comes
    with SCTP checksum offload support. From Davide Caratti.

13) Initial core factoring to prepare conversion to hook array. Three
    patches from Aaron Conole.

14) Gao Feng made a wrong conversion to switch in the xt_multiport
    extension in a patch coming in the previous batch. Fix it in this
    batch.

15) Get vmalloc call in sync with kmalloc flags to avoid a warning
    and likely OOM killer intervention from x_tables. From Marcelo
    Ricardo Leitner.

16) Update Arturo Borrero's email address in all source code headers.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
parents 63c36c40 73c25fb1
@@ -96,6 +96,17 @@ nf_conntrack_max - INTEGER
 	Size of connection tracking table.  Default value is
 	nf_conntrack_buckets value * 4.
 
+nf_conntrack_default_on - BOOLEAN
+	0 - don't register conntrack in new net namespaces
+	1 - register conntrack in new net namespaces (default)
+
+	This controls whether newly created network namespaces have connection
+	tracking enabled by default.  It will be enabled automatically
+	regardless of this setting if the new net namespace requires
+	connection tracking, e.g. when NAT rules are created.
+	This setting is only visible in initial user namespace, it has no
+	effect on existing namespaces.
+
 nf_conntrack_tcp_be_liberal - BOOLEAN
 	0 - disabled (default)
 	not 0 - enabled
......
@@ -75,10 +75,39 @@ struct nf_hook_ops {
 
 struct nf_hook_entry {
 	struct nf_hook_entry __rcu *next;
-	struct nf_hook_ops ops;
+	nf_hookfn *hook;
+	void *priv;
 	const struct nf_hook_ops *orig_ops;
 };
 
+static inline void
+nf_hook_entry_init(struct nf_hook_entry *entry, const struct nf_hook_ops *ops)
+{
+	entry->next = NULL;
+	entry->hook = ops->hook;
+	entry->priv = ops->priv;
+	entry->orig_ops = ops;
+}
+
+static inline int
+nf_hook_entry_priority(const struct nf_hook_entry *entry)
+{
+	return entry->orig_ops->priority;
+}
+
+static inline int
+nf_hook_entry_hookfn(const struct nf_hook_entry *entry, struct sk_buff *skb,
+		     struct nf_hook_state *state)
+{
+	return entry->hook(entry->priv, skb, state);
+}
+
+static inline const struct nf_hook_ops *
+nf_hook_entry_ops(const struct nf_hook_entry *entry)
+{
+	return entry->orig_ops;
+}
+
 static inline void nf_hook_state_init(struct nf_hook_state *p,
 				      unsigned int hook,
 				      u_int8_t pf,
......
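The hunk above copies the hook function and its private pointer out of the registered ops into each list entry, while the priority is still read through orig_ops. As a rough userspace sketch of that layout (all names below are simplified stand-ins invented for illustration, not the kernel definitions, and RCU is left out entirely):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for sk_buff and nf_hook_ops. */
struct pkt { int len; };

typedef int hookfn(void *priv, struct pkt *p);

struct hook_ops {
	hookfn *hook;
	void *priv;
	int priority;
};

struct hook_entry {
	struct hook_entry *next;
	hookfn *hook;                    /* copied from ops, called directly */
	void *priv;                      /* copied from ops */
	const struct hook_ops *orig_ops; /* still consulted for the priority */
};

static void hook_entry_init(struct hook_entry *e, const struct hook_ops *ops)
{
	assert(e && ops && ops->hook);
	e->next = NULL;
	e->hook = ops->hook;
	e->priv = ops->priv;
	e->orig_ops = ops;
}

static int hook_entry_priority(const struct hook_entry *e)
{
	return e->orig_ops->priority;
}

static int hook_entry_call(const struct hook_entry *e, struct pkt *p)
{
	return e->hook(e->priv, p);
}

/* Example hook: adds the packet length to a counter passed via priv. */
static int count_bytes(void *priv, struct pkt *p)
{
	*(long *)priv += p->len;
	return 1;
}
```

Calling hook_entry_call() on an entry initialised from ops that point at count_bytes adds the packet length to the counter without touching the ops structure on the call path.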
@@ -25,7 +25,7 @@ enum ct_dccp_roles {
 #define CT_DCCP_ROLE_MAX (__CT_DCCP_ROLE_MAX - 1)
 
 #ifdef __KERNEL__
-#include <net/netfilter/nf_conntrack_tuple.h>
+#include <linux/netfilter/nf_conntrack_tuple_common.h>
 
 struct nf_ct_dccp {
 	u_int8_t role[IP_CT_DIR_MAX];
......
@@ -403,38 +403,14 @@ static inline unsigned long ifname_compare_aligned(const char *_a,
 	return ret;
 }
 
+struct xt_percpu_counter_alloc_state {
+	unsigned int off;
+	const char __percpu *mem;
+};
 
-/* On SMP, ip(6)t_entry->counters.pcnt holds address of the
- * real (percpu) counter.  On !SMP, its just the packet count,
- * so nothing needs to be done there.
- *
- * xt_percpu_counter_alloc returns the address of the percpu
- * counter, or 0 on !SMP. We force an alignment of 16 bytes
- * so that bytes/packets share a common cache line.
- *
- * Hence caller must use IS_ERR_VALUE to check for error, this
- * allows us to return 0 for single core systems without forcing
- * callers to deal with SMP vs. NONSMP issues.
- */
-static inline unsigned long xt_percpu_counter_alloc(void)
-{
-	if (nr_cpu_ids > 1) {
-		void __percpu *res = __alloc_percpu(sizeof(struct xt_counters),
-						    sizeof(struct xt_counters));
-		if (res == NULL)
-			return -ENOMEM;
-		return (__force unsigned long) res;
-	}
-	return 0;
-}
-static inline void xt_percpu_counter_free(u64 pcnt)
-{
-	if (nr_cpu_ids > 1)
-		free_percpu((void __percpu *) (unsigned long) pcnt);
-}
+bool xt_percpu_counter_alloc(struct xt_percpu_counter_alloc_state *state,
+			     struct xt_counters *counter);
+void xt_percpu_counter_free(struct xt_counters *cnt);
 
 static inline struct xt_counters *
 xt_get_this_cpu_counter(struct xt_counters *cnt)
......
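The new allocation state (an offset plus a memory pointer) matches item 3 of the commit message: counters are carved out of 4k chunks instead of being allocated one by one. The function bodies are not part of this header hunk, so the following is only a userspace sketch of the idea. It uses malloc() in place of the per-CPU allocator, assumes a 16-byte counter slot, and omits the free path; every name in it is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define BLOCK_SIZE 4096u /* chunk size, per the "4k chunks" in the message */
#define SLOT_SIZE  16u   /* assumed size of one packet/byte counter pair */

struct alloc_state {
	size_t off;          /* next free offset in the current chunk */
	char *mem;           /* current chunk, or NULL if a new one is needed */
	unsigned int chunks; /* illustration only: chunks allocated so far */
};

/* Hand out the next slot, allocating a fresh chunk only when needed. */
static bool counter_alloc(struct alloc_state *st, void **slot)
{
	assert(st && slot);
	if (!st->mem) {
		st->mem = malloc(BLOCK_SIZE);
		if (!st->mem)
			return false;
		st->off = 0;
		st->chunks++;
	}
	*slot = st->mem + st->off;
	st->off += SLOT_SIZE;
	/* Chunk exhausted: force a fresh chunk on the next request. */
	if (st->off > BLOCK_SIZE - SLOT_SIZE) {
		st->mem = NULL;
		st->off = 0;
	}
	return true;
}
```

With these sizes one chunk serves 256 requests, so loading a large ruleset needs far fewer allocator calls, and neighbouring counters sit next to each other in memory.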
@@ -19,6 +19,7 @@ static inline int nf_hook_ingress(struct sk_buff *skb)
 {
 	struct nf_hook_entry *e = rcu_dereference(skb->dev->nf_hooks_ingress);
 	struct nf_hook_state state;
+	int ret;
 
 	/* Must recheck the ingress hook head, in the event it became NULL
 	 * after the check in nf_hook_ingress_active evaluated to true.
@@ -29,7 +30,11 @@ static inline int nf_hook_ingress(struct sk_buff *skb)
 	nf_hook_state_init(&state, NF_NETDEV_INGRESS,
 			   NFPROTO_NETDEV, skb->dev, NULL, NULL,
 			   dev_net(skb->dev), NULL);
-	return nf_hook_slow(skb, &state, e);
+	ret = nf_hook_slow(skb, &state, e);
+	if (ret == 0)
+		return -1;
+
+	return ret;
 }
 
 static inline void nf_hook_ingress_init(struct net_device *dev)
......
...@@ -15,6 +15,15 @@ extern struct nf_conntrack_l3proto nf_conntrack_l3proto_ipv4; ...@@ -15,6 +15,15 @@ extern struct nf_conntrack_l3proto nf_conntrack_l3proto_ipv4;
extern struct nf_conntrack_l4proto nf_conntrack_l4proto_tcp4; extern struct nf_conntrack_l4proto nf_conntrack_l4proto_tcp4;
extern struct nf_conntrack_l4proto nf_conntrack_l4proto_udp4; extern struct nf_conntrack_l4proto nf_conntrack_l4proto_udp4;
extern struct nf_conntrack_l4proto nf_conntrack_l4proto_icmp; extern struct nf_conntrack_l4proto nf_conntrack_l4proto_icmp;
#ifdef CONFIG_NF_CT_PROTO_DCCP
extern struct nf_conntrack_l4proto nf_conntrack_l4proto_dccp4;
#endif
#ifdef CONFIG_NF_CT_PROTO_SCTP
extern struct nf_conntrack_l4proto nf_conntrack_l4proto_sctp4;
#endif
#ifdef CONFIG_NF_CT_PROTO_UDPLITE
extern struct nf_conntrack_l4proto nf_conntrack_l4proto_udplite4;
#endif
int nf_conntrack_ipv4_compat_init(void); int nf_conntrack_ipv4_compat_init(void);
void nf_conntrack_ipv4_compat_fini(void); void nf_conntrack_ipv4_compat_fini(void);
......
 #ifndef _NF_DEFRAG_IPV4_H
 #define _NF_DEFRAG_IPV4_H
 
-void nf_defrag_ipv4_enable(void);
+struct net;
+int nf_defrag_ipv4_enable(struct net *);
 
 #endif /* _NF_DEFRAG_IPV4_H */
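The enable function now takes a network namespace and can fail, which fits the per-netns defrag flags added to struct netns_nf elsewhere in this merge. The implementation is not shown in this header, so the sketch below is an assumption about the shape of such a call: a flag makes repeated enables idempotent, and registration errors are passed back to the caller. Locking is omitted and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-netns state. */
struct netns {
	bool defrag_enabled;  /* plays the role of a per-netns defrag flag */
	int hooks_registered; /* illustration only: how often we registered */
};

/* Stand-in for hook registration; a real one could return an error. */
static int register_defrag_hooks(struct netns *ns)
{
	ns->hooks_registered++;
	return 0;
}

/* Enable defragmentation once per namespace; later calls are no-ops. */
static int defrag_enable(struct netns *ns)
{
	int err;

	assert(ns != NULL);
	if (ns->defrag_enabled)
		return 0;

	err = register_defrag_hooks(ns);
	if (err == 0)
		ns->defrag_enabled = true;
	return err;
}
```

Under this assumption, several rules in one namespace can each request defragmentation while the hooks are registered only once.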
...@@ -6,6 +6,15 @@ extern struct nf_conntrack_l3proto nf_conntrack_l3proto_ipv6; ...@@ -6,6 +6,15 @@ extern struct nf_conntrack_l3proto nf_conntrack_l3proto_ipv6;
extern struct nf_conntrack_l4proto nf_conntrack_l4proto_tcp6; extern struct nf_conntrack_l4proto nf_conntrack_l4proto_tcp6;
extern struct nf_conntrack_l4proto nf_conntrack_l4proto_udp6; extern struct nf_conntrack_l4proto nf_conntrack_l4proto_udp6;
extern struct nf_conntrack_l4proto nf_conntrack_l4proto_icmpv6; extern struct nf_conntrack_l4proto nf_conntrack_l4proto_icmpv6;
#ifdef CONFIG_NF_CT_PROTO_DCCP
extern struct nf_conntrack_l4proto nf_conntrack_l4proto_dccp6;
#endif
#ifdef CONFIG_NF_CT_PROTO_SCTP
extern struct nf_conntrack_l4proto nf_conntrack_l4proto_sctp6;
#endif
#ifdef CONFIG_NF_CT_PROTO_UDPLITE
extern struct nf_conntrack_l4proto nf_conntrack_l4proto_udplite6;
#endif
#include <linux/sysctl.h> #include <linux/sysctl.h>
extern struct ctl_table nf_ct_ipv6_sysctl_table[]; extern struct ctl_table nf_ct_ipv6_sysctl_table[];
......
 #ifndef _NF_DEFRAG_IPV6_H
 #define _NF_DEFRAG_IPV6_H
 
-void nf_defrag_ipv6_enable(void);
+struct net;
+int nf_defrag_ipv6_enable(struct net *);
 
 int nf_ct_frag6_init(void);
 void nf_ct_frag6_cleanup(void);
......
@@ -181,6 +181,10 @@ static inline void nf_ct_put(struct nf_conn *ct)
 int nf_ct_l3proto_try_module_get(unsigned short l3proto);
 void nf_ct_l3proto_module_put(unsigned short l3proto);
 
+/* load module; enable/disable conntrack in this namespace */
+int nf_ct_netns_get(struct net *net, u8 nfproto);
+void nf_ct_netns_put(struct net *net, u8 nfproto);
+
 /*
  * Allocate a hashtable of hlist_head (if nulls == 0),
  * or hlist_nulls_head (if nulls == 1)
......
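The get/put pair above is how a ruleset asks for connection tracking in one namespace (item 2 of the commit message). A plausible way to picture it, assuming plain reference counting, is sketched below: the first get registers the hooks and the last put removes them. The real functions also load the protocol module and take an nfproto argument; both are left out here, and the names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-namespace conntrack state. */
struct netns_ct {
	unsigned int users; /* rules currently needing conntrack */
	bool hooks_active;  /* whether the hooks are registered */
};

/* First user in the namespace switches the hooks on. */
static int ct_netns_get(struct netns_ct *ns)
{
	assert(ns != NULL);
	if (ns->users++ == 0)
		ns->hooks_active = true;
	return 0;
}

/* Last user leaving switches them off again. */
static void ct_netns_put(struct netns_ct *ns)
{
	assert(ns != NULL && ns->users > 0);
	if (--ns->users == 0)
		ns->hooks_active = false;
}
```

A namespace with no stateful rules then never pays for the conntrack hooks on its packet path.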
...@@ -52,6 +52,10 @@ struct nf_conntrack_l3proto { ...@@ -52,6 +52,10 @@ struct nf_conntrack_l3proto {
int (*tuple_to_nlattr)(struct sk_buff *skb, int (*tuple_to_nlattr)(struct sk_buff *skb,
const struct nf_conntrack_tuple *t); const struct nf_conntrack_tuple *t);
/* Called when netns wants to use connection tracking */
int (*net_ns_get)(struct net *);
void (*net_ns_put)(struct net *);
/* /*
* Calculate size of tuple nlattr * Calculate size of tuple nlattr
*/ */
...@@ -63,18 +67,24 @@ struct nf_conntrack_l3proto { ...@@ -63,18 +67,24 @@ struct nf_conntrack_l3proto {
size_t nla_size; size_t nla_size;
/* Init l3proto pernet data */
int (*init_net)(struct net *net);
/* Module (if any) which this is connected to. */ /* Module (if any) which this is connected to. */
struct module *me; struct module *me;
}; };
extern struct nf_conntrack_l3proto __rcu *nf_ct_l3protos[AF_MAX]; extern struct nf_conntrack_l3proto __rcu *nf_ct_l3protos[AF_MAX];
#ifdef CONFIG_SYSCTL
/* Protocol pernet registration. */ /* Protocol pernet registration. */
int nf_ct_l3proto_pernet_register(struct net *net, int nf_ct_l3proto_pernet_register(struct net *net,
struct nf_conntrack_l3proto *proto); struct nf_conntrack_l3proto *proto);
#else
static inline int nf_ct_l3proto_pernet_register(struct net *n,
struct nf_conntrack_l3proto *p)
{
return 0;
}
#endif
void nf_ct_l3proto_pernet_unregister(struct net *net, void nf_ct_l3proto_pernet_unregister(struct net *net,
struct nf_conntrack_l3proto *proto); struct nf_conntrack_l3proto *proto);
......
...@@ -2,5 +2,6 @@ ...@@ -2,5 +2,6 @@
#define _NF_DUP_NETDEV_H_ #define _NF_DUP_NETDEV_H_
void nf_dup_netdev_egress(const struct nft_pktinfo *pkt, int oif); void nf_dup_netdev_egress(const struct nft_pktinfo *pkt, int oif);
void nf_fwd_netdev_egress(const struct nft_pktinfo *pkt, int oif);
#endif #endif
@@ -109,7 +109,9 @@ void nf_log_dump_packet_common(struct nf_log_buf *m, u_int8_t pf,
 			       const struct net_device *out,
 			       const struct nf_loginfo *loginfo,
 			       const char *prefix);
-void nf_log_l2packet(struct net *net, u_int8_t pf, unsigned int hooknum,
+void nf_log_l2packet(struct net *net, u_int8_t pf,
+		     __be16 protocol,
+		     unsigned int hooknum,
 		     const struct sk_buff *skb,
 		     const struct net_device *in,
 		     const struct net_device *out,
......
...@@ -54,6 +54,15 @@ extern const struct nf_nat_l4proto nf_nat_l4proto_udp; ...@@ -54,6 +54,15 @@ extern const struct nf_nat_l4proto nf_nat_l4proto_udp;
extern const struct nf_nat_l4proto nf_nat_l4proto_icmp; extern const struct nf_nat_l4proto nf_nat_l4proto_icmp;
extern const struct nf_nat_l4proto nf_nat_l4proto_icmpv6; extern const struct nf_nat_l4proto nf_nat_l4proto_icmpv6;
extern const struct nf_nat_l4proto nf_nat_l4proto_unknown; extern const struct nf_nat_l4proto nf_nat_l4proto_unknown;
#ifdef CONFIG_NF_NAT_PROTO_DCCP
extern const struct nf_nat_l4proto nf_nat_l4proto_dccp;
#endif
#ifdef CONFIG_NF_NAT_PROTO_SCTP
extern const struct nf_nat_l4proto nf_nat_l4proto_sctp;
#endif
#ifdef CONFIG_NF_NAT_PROTO_UDPLITE
extern const struct nf_nat_l4proto nf_nat_l4proto_udplite;
#endif
bool nf_nat_l4proto_in_range(const struct nf_conntrack_tuple *tuple, bool nf_nat_l4proto_in_range(const struct nf_conntrack_tuple *tuple,
enum nf_nat_manip_type maniptype, enum nf_nat_manip_type maniptype,
......
@@ -259,7 +259,8 @@ struct nft_expr;
  *	@lookup: look up an element within the set
  *	@insert: insert new element into set
  *	@activate: activate new element in the next generation
- *	@deactivate: deactivate element in the next generation
+ *	@deactivate: lookup for element and deactivate it in the next generation
+ *	@deactivate_one: deactivate element in the next generation
  *	@remove: remove element from set
  *	@walk: iterate over all set elements
  *	@privsize: function to return size of set private data
@@ -294,6 +295,9 @@ struct nft_set_ops {
 	void * (*deactivate)(const struct net *net,
 			     const struct nft_set *set,
 			     const struct nft_set_elem *elem);
+	bool (*deactivate_one)(const struct net *net,
+			       const struct nft_set *set,
+			       void *priv);
 	void (*remove)(const struct nft_set *set,
 		       const struct nft_set_elem *elem);
 	void (*walk)(const struct nft_ctx *ctx,
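Item 6 of the commit message explains why deactivate_one exists: a flush already walks every element, so looking each one up again would be wasted work. The toy model below illustrates that difference with an array-backed set. The types and function names are invented for the sketch, and the "active" flag stands in for the generation mask handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct elem { int key; bool active; };
struct set  { struct elem *elems; size_t n; };

/* Lookup-based variant: find the element by key, then deactivate it. */
static struct elem *set_deactivate(struct set *s, int key)
{
	for (size_t i = 0; i < s->n; i++) {
		if (s->elems[i].key == key && s->elems[i].active) {
			s->elems[i].active = false;
			return &s->elems[i];
		}
	}
	return NULL;
}

/* Direct variant: the caller already holds the element, no lookup. */
static bool set_deactivate_one(struct elem *e)
{
	if (!e->active)
		return false;
	e->active = false;
	return true;
}

/* Flush: one walk over the set, deactivating each element in place. */
static size_t set_flush(struct set *s)
{
	size_t done = 0;

	assert(s != NULL);
	for (size_t i = 0; i < s->n; i++)
		if (set_deactivate_one(&s->elems[i]))
			done++;
	return done;
}
```

In this model a flush built on set_deactivate() would cost one lookup per element on top of the walk; set_flush() avoids that by handing the element itself to the direct variant.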
...@@ -326,6 +330,7 @@ void nft_unregister_set(struct nft_set_ops *ops); ...@@ -326,6 +330,7 @@ void nft_unregister_set(struct nft_set_ops *ops);
* @name: name of the set * @name: name of the set
* @ktype: key type (numeric type defined by userspace, not used in the kernel) * @ktype: key type (numeric type defined by userspace, not used in the kernel)
* @dtype: data type (verdict or numeric type defined by userspace) * @dtype: data type (verdict or numeric type defined by userspace)
* @objtype: object type (see NFT_OBJECT_* definitions)
* @size: maximum set size * @size: maximum set size
* @nelems: number of elements * @nelems: number of elements
* @ndeact: number of deactivated elements queued for removal * @ndeact: number of deactivated elements queued for removal
...@@ -347,6 +352,7 @@ struct nft_set { ...@@ -347,6 +352,7 @@ struct nft_set {
char name[NFT_SET_MAXNAMELEN]; char name[NFT_SET_MAXNAMELEN];
u32 ktype; u32 ktype;
u32 dtype; u32 dtype;
u32 objtype;
u32 size; u32 size;
atomic_t nelems; atomic_t nelems;
u32 ndeact; u32 ndeact;
...@@ -416,6 +422,7 @@ void nf_tables_unbind_set(const struct nft_ctx *ctx, struct nft_set *set, ...@@ -416,6 +422,7 @@ void nf_tables_unbind_set(const struct nft_ctx *ctx, struct nft_set *set,
* @NFT_SET_EXT_EXPIRATION: element expiration time * @NFT_SET_EXT_EXPIRATION: element expiration time
* @NFT_SET_EXT_USERDATA: user data associated with the element * @NFT_SET_EXT_USERDATA: user data associated with the element
* @NFT_SET_EXT_EXPR: expression assiociated with the element * @NFT_SET_EXT_EXPR: expression assiociated with the element
* @NFT_SET_EXT_OBJREF: stateful object reference associated with element
* @NFT_SET_EXT_NUM: number of extension types * @NFT_SET_EXT_NUM: number of extension types
*/ */
enum nft_set_extensions { enum nft_set_extensions {
...@@ -426,6 +433,7 @@ enum nft_set_extensions { ...@@ -426,6 +433,7 @@ enum nft_set_extensions {
NFT_SET_EXT_EXPIRATION, NFT_SET_EXT_EXPIRATION,
NFT_SET_EXT_USERDATA, NFT_SET_EXT_USERDATA,
NFT_SET_EXT_EXPR, NFT_SET_EXT_EXPR,
NFT_SET_EXT_OBJREF,
NFT_SET_EXT_NUM NFT_SET_EXT_NUM
}; };
...@@ -554,6 +562,11 @@ static inline struct nft_set_ext *nft_set_elem_ext(const struct nft_set *set, ...@@ -554,6 +562,11 @@ static inline struct nft_set_ext *nft_set_elem_ext(const struct nft_set *set,
return elem + set->ops->elemsize; return elem + set->ops->elemsize;
} }
static inline struct nft_object **nft_set_ext_obj(const struct nft_set_ext *ext)
{
return nft_set_ext(ext, NFT_SET_EXT_OBJREF);
}
void *nft_set_elem_init(const struct nft_set *set, void *nft_set_elem_init(const struct nft_set *set,
const struct nft_set_ext_tmpl *tmpl, const struct nft_set_ext_tmpl *tmpl,
const u32 *key, const u32 *data, const u32 *key, const u32 *data,
...@@ -875,6 +888,7 @@ unsigned int nft_do_chain(struct nft_pktinfo *pkt, void *priv); ...@@ -875,6 +888,7 @@ unsigned int nft_do_chain(struct nft_pktinfo *pkt, void *priv);
* @list: used internally * @list: used internally
* @chains: chains in the table * @chains: chains in the table
* @sets: sets in the table * @sets: sets in the table
* @objects: stateful objects in the table
* @hgenerator: handle generator state * @hgenerator: handle generator state
* @use: number of chain references to this table * @use: number of chain references to this table
* @flags: table flag (see enum nft_table_flags) * @flags: table flag (see enum nft_table_flags)
...@@ -885,6 +899,7 @@ struct nft_table { ...@@ -885,6 +899,7 @@ struct nft_table {
struct list_head list; struct list_head list;
struct list_head chains; struct list_head chains;
struct list_head sets; struct list_head sets;
struct list_head objects;
u64 hgenerator; u64 hgenerator;
u32 use; u32 use;
u16 flags:14, u16 flags:14,
...@@ -934,6 +949,80 @@ void nft_unregister_expr(struct nft_expr_type *); ...@@ -934,6 +949,80 @@ void nft_unregister_expr(struct nft_expr_type *);
int nft_verdict_dump(struct sk_buff *skb, int type, int nft_verdict_dump(struct sk_buff *skb, int type,
const struct nft_verdict *v); const struct nft_verdict *v);
/**
* struct nft_object - nf_tables stateful object
*
* @list: table stateful object list node
* @table: table this object belongs to
* @type: pointer to object type
* @data: pointer to object data
* @name: name of this stateful object
* @genmask: generation mask
* @use: number of references to this stateful object
* @data: object data, layout depends on type
*/
struct nft_object {
struct list_head list;
char name[NFT_OBJ_MAXNAMELEN];
struct nft_table *table;
u32 genmask:2,
use:30;
/* runtime data below here */
const struct nft_object_type *type ____cacheline_aligned;
unsigned char data[]
__attribute__((aligned(__alignof__(u64))));
};
static inline void *nft_obj_data(const struct nft_object *obj)
{
return (void *)obj->data;
}
#define nft_expr_obj(expr) *((struct nft_object **)nft_expr_priv(expr))
struct nft_object *nf_tables_obj_lookup(const struct nft_table *table,
const struct nlattr *nla, u32 objtype,
u8 genmask);
int nft_obj_notify(struct net *net, struct nft_table *table,
struct nft_object *obj, u32 portid, u32 seq,
int event, int family, int report, gfp_t gfp);
/**
* struct nft_object_type - stateful object type
*
* @eval: stateful object evaluation function
* @list: list node in list of object types
* @type: stateful object numeric type
* @size: stateful object size
* @owner: module owner
* @maxattr: maximum netlink attribute
* @policy: netlink attribute policy
* @init: initialize object from netlink attributes
* @destroy: release existing stateful object
* @dump: netlink dump stateful object
*/
struct nft_object_type {
void (*eval)(struct nft_object *obj,
struct nft_regs *regs,
const struct nft_pktinfo *pkt);
struct list_head list;
u32 type;
unsigned int size;
unsigned int maxattr;
struct module *owner;
const struct nla_policy *policy;
int (*init)(const struct nlattr * const tb[],
struct nft_object *obj);
void (*destroy)(struct nft_object *obj);
int (*dump)(struct sk_buff *skb,
struct nft_object *obj,
bool reset);
};
int nft_register_obj(struct nft_object_type *obj_type);
void nft_unregister_obj(struct nft_object_type *obj_type);
/** /**
* struct nft_traceinfo - nft tracing information and state * struct nft_traceinfo - nft tracing information and state
* *
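The hunk above introduces a named object with type-specific data in a trailing flexible array, plus a per-type operations table with eval and a dump that can reset. A userspace sketch of that pattern, using a counter as the object type, might look as follows. It is an illustration only: the names are hypothetical, the netlink and generation-mask parts are dropped, and dump copies into a plain struct instead of an sk_buff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct obj;

struct obj_type {
	size_t size; /* bytes of private data behind the object */
	void (*eval)(struct obj *o, uint32_t pkt_len);
	void (*dump)(struct obj *o, void *out, bool reset);
};

struct obj {
	char name[32];
	const struct obj_type *type;
	uint64_t data[]; /* flexible array, 8-byte aligned private data */
};

struct counter { uint64_t packets, bytes; };

/* Account one packet against the counter. */
static void counter_eval(struct obj *o, uint32_t pkt_len)
{
	struct counter *c = (struct counter *)o->data;

	c->packets++;
	c->bytes += pkt_len;
}

/* Copy the current values out; optionally zero them afterwards. */
static void counter_dump(struct obj *o, void *out, bool reset)
{
	struct counter *c = (struct counter *)o->data;

	memcpy(out, c, sizeof(*c));
	if (reset)
		memset(c, 0, sizeof(*c));
}

static const struct obj_type counter_type = {
	sizeof(struct counter), counter_eval, counter_dump
};

/* Allocate a zeroed object with room for the type's private data. */
static struct obj *obj_new(const char *name, const struct obj_type *type)
{
	struct obj *o;

	assert(name && type);
	o = calloc(1, sizeof(*o) + type->size);
	if (!o)
		return NULL;
	strncpy(o->name, name, sizeof(o->name) - 1);
	o->type = type;
	return o;
}
```

Because the counter lives in a named object rather than inside a rule, it can be read, or read and reset, without touching the rules that reference it.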
...@@ -981,6 +1070,9 @@ void nft_trace_notify(struct nft_traceinfo *info); ...@@ -981,6 +1070,9 @@ void nft_trace_notify(struct nft_traceinfo *info);
#define MODULE_ALIAS_NFT_SET() \ #define MODULE_ALIAS_NFT_SET() \
MODULE_ALIAS("nft-set") MODULE_ALIAS("nft-set")
#define MODULE_ALIAS_NFT_OBJ(type) \
MODULE_ALIAS("nft-obj-" __stringify(type))
/* /*
* The gencursor defines two generations, the currently active and the * The gencursor defines two generations, the currently active and the
* next one. Objects contain a bitmask of 2 bits specifying the generations * next one. Objects contain a bitmask of 2 bits specifying the generations
...@@ -1157,4 +1249,11 @@ struct nft_trans_elem { ...@@ -1157,4 +1249,11 @@ struct nft_trans_elem {
#define nft_trans_elem(trans) \ #define nft_trans_elem(trans) \
(((struct nft_trans_elem *)trans->data)->elem) (((struct nft_trans_elem *)trans->data)->elem)
struct nft_trans_obj {
struct nft_object *obj;
};
#define nft_trans_obj(trans) \
(((struct nft_trans_obj *)trans->data)->obj)
#endif /* _NET_NF_TABLES_H */ #endif /* _NET_NF_TABLES_H */
...@@ -45,6 +45,7 @@ struct nft_payload_set { ...@@ -45,6 +45,7 @@ struct nft_payload_set {
enum nft_registers sreg:8; enum nft_registers sreg:8;
u8 csum_type; u8 csum_type;
u8 csum_offset; u8 csum_offset;
u8 csum_flags;
}; };
extern const struct nft_expr_ops nft_payload_fast_ops; extern const struct nft_expr_ops nft_payload_fast_ops;
......
...@@ -6,6 +6,12 @@ ...@@ -6,6 +6,12 @@
#include <linux/atomic.h> #include <linux/atomic.h>
#include <linux/workqueue.h> #include <linux/workqueue.h>
#include <linux/netfilter/nf_conntrack_tcp.h> #include <linux/netfilter/nf_conntrack_tcp.h>
#ifdef CONFIG_NF_CT_PROTO_DCCP
#include <linux/netfilter/nf_conntrack_dccp.h>
#endif
#ifdef CONFIG_NF_CT_PROTO_SCTP
#include <linux/netfilter/nf_conntrack_sctp.h>
#endif
#include <linux/seqlock.h> #include <linux/seqlock.h>
struct ctl_table_header; struct ctl_table_header;
...@@ -48,12 +54,49 @@ struct nf_icmp_net { ...@@ -48,12 +54,49 @@ struct nf_icmp_net {
unsigned int timeout; unsigned int timeout;
}; };
#ifdef CONFIG_NF_CT_PROTO_DCCP
struct nf_dccp_net {
struct nf_proto_net pn;
int dccp_loose;
unsigned int dccp_timeout[CT_DCCP_MAX + 1];
};
#endif
#ifdef CONFIG_NF_CT_PROTO_SCTP
struct nf_sctp_net {
struct nf_proto_net pn;
unsigned int timeouts[SCTP_CONNTRACK_MAX];
};
#endif
#ifdef CONFIG_NF_CT_PROTO_UDPLITE
enum udplite_conntrack {
UDPLITE_CT_UNREPLIED,
UDPLITE_CT_REPLIED,
UDPLITE_CT_MAX
};
struct nf_udplite_net {
struct nf_proto_net pn;
unsigned int timeouts[UDPLITE_CT_MAX];
};
#endif
struct nf_ip_net { struct nf_ip_net {
struct nf_generic_net generic; struct nf_generic_net generic;
struct nf_tcp_net tcp; struct nf_tcp_net tcp;
struct nf_udp_net udp; struct nf_udp_net udp;
struct nf_icmp_net icmp; struct nf_icmp_net icmp;
struct nf_icmp_net icmpv6; struct nf_icmp_net icmpv6;
#ifdef CONFIG_NF_CT_PROTO_DCCP
struct nf_dccp_net dccp;
#endif
#ifdef CONFIG_NF_CT_PROTO_SCTP
struct nf_sctp_net sctp;
#endif
#ifdef CONFIG_NF_CT_PROTO_UDPLITE
struct nf_udplite_net udplite;
#endif
}; };
struct ct_pcpu { struct ct_pcpu {
......
...@@ -17,5 +17,11 @@ struct netns_nf { ...@@ -17,5 +17,11 @@ struct netns_nf {
struct ctl_table_header *nf_log_dir_header; struct ctl_table_header *nf_log_dir_header;
#endif #endif
struct nf_hook_entry __rcu *hooks[NFPROTO_NUMPROTO][NF_MAX_HOOKS]; struct nf_hook_entry __rcu *hooks[NFPROTO_NUMPROTO][NF_MAX_HOOKS];
#if IS_ENABLED(CONFIG_NF_DEFRAG_IPV4)
bool defrag_ipv4;
#endif
#if IS_ENABLED(CONFIG_NF_DEFRAG_IPV6)
bool defrag_ipv6;
#endif
}; };
#endif #endif
...@@ -2,7 +2,10 @@ ...@@ -2,7 +2,10 @@
#define _NF_CONNTRACK_TUPLE_COMMON_H #define _NF_CONNTRACK_TUPLE_COMMON_H
#include <linux/types.h> #include <linux/types.h>
#ifndef __KERNEL__
#include <linux/netfilter.h> #include <linux/netfilter.h>
#endif
#include <linux/netfilter/nf_conntrack_common.h> /* IP_CT_IS_REPLY */
enum ip_conntrack_dir { enum ip_conntrack_dir {
IP_CT_DIR_ORIGINAL, IP_CT_DIR_ORIGINAL,
......
...@@ -4,6 +4,7 @@ ...@@ -4,6 +4,7 @@
#define NFT_TABLE_MAXNAMELEN 32 #define NFT_TABLE_MAXNAMELEN 32
#define NFT_CHAIN_MAXNAMELEN 32 #define NFT_CHAIN_MAXNAMELEN 32
#define NFT_SET_MAXNAMELEN 32 #define NFT_SET_MAXNAMELEN 32
#define NFT_OBJ_MAXNAMELEN 32
#define NFT_USERDATA_MAXLEN 256 #define NFT_USERDATA_MAXLEN 256
/** /**
...@@ -85,6 +86,10 @@ enum nft_verdicts { ...@@ -85,6 +86,10 @@ enum nft_verdicts {
* @NFT_MSG_NEWGEN: announce a new generation, only for events (enum nft_gen_attributes) * @NFT_MSG_NEWGEN: announce a new generation, only for events (enum nft_gen_attributes)
* @NFT_MSG_GETGEN: get the rule-set generation (enum nft_gen_attributes) * @NFT_MSG_GETGEN: get the rule-set generation (enum nft_gen_attributes)
* @NFT_MSG_TRACE: trace event (enum nft_trace_attributes) * @NFT_MSG_TRACE: trace event (enum nft_trace_attributes)
* @NFT_MSG_NEWOBJ: create a stateful object (enum nft_obj_attributes)
* @NFT_MSG_GETOBJ: get a stateful object (enum nft_obj_attributes)
* @NFT_MSG_DELOBJ: delete a stateful object (enum nft_obj_attributes)
* @NFT_MSG_GETOBJ_RESET: get and reset a stateful object (enum nft_obj_attributes)
*/ */
enum nf_tables_msg_types { enum nf_tables_msg_types {
NFT_MSG_NEWTABLE, NFT_MSG_NEWTABLE,
...@@ -105,6 +110,10 @@ enum nf_tables_msg_types { ...@@ -105,6 +110,10 @@ enum nf_tables_msg_types {
NFT_MSG_NEWGEN, NFT_MSG_NEWGEN,
NFT_MSG_GETGEN, NFT_MSG_GETGEN,
NFT_MSG_TRACE, NFT_MSG_TRACE,
NFT_MSG_NEWOBJ,
NFT_MSG_GETOBJ,
NFT_MSG_DELOBJ,
NFT_MSG_GETOBJ_RESET,
NFT_MSG_MAX, NFT_MSG_MAX,
}; };
...@@ -246,6 +255,7 @@ enum nft_rule_compat_attributes { ...@@ -246,6 +255,7 @@ enum nft_rule_compat_attributes {
* @NFT_SET_MAP: set is used as a dictionary * @NFT_SET_MAP: set is used as a dictionary
* @NFT_SET_TIMEOUT: set uses timeouts * @NFT_SET_TIMEOUT: set uses timeouts
* @NFT_SET_EVAL: set contains expressions for evaluation * @NFT_SET_EVAL: set contains expressions for evaluation
* @NFT_SET_OBJECT: set contains stateful objects
*/ */
enum nft_set_flags { enum nft_set_flags {
NFT_SET_ANONYMOUS = 0x1, NFT_SET_ANONYMOUS = 0x1,
@@ -254,6 +264,7 @@ enum nft_set_flags {
 	NFT_SET_MAP = 0x8,
 	NFT_SET_TIMEOUT = 0x10,
 	NFT_SET_EVAL = 0x20,
+	NFT_SET_OBJECT = 0x40,
 };
 /**
@@ -295,6 +306,7 @@ enum nft_set_desc_attributes {
  * @NFTA_SET_TIMEOUT: default timeout value (NLA_U64)
  * @NFTA_SET_GC_INTERVAL: garbage collection interval (NLA_U32)
  * @NFTA_SET_USERDATA: user data (NLA_BINARY)
+ * @NFTA_SET_OBJ_TYPE: stateful object type (NLA_U32: NFT_OBJECT_*)
  */
 enum nft_set_attributes {
 	NFTA_SET_UNSPEC,
@@ -312,6 +324,7 @@ enum nft_set_attributes {
 	NFTA_SET_GC_INTERVAL,
 	NFTA_SET_USERDATA,
 	NFTA_SET_PAD,
+	NFTA_SET_OBJ_TYPE,
 	__NFTA_SET_MAX
 };
 #define NFTA_SET_MAX (__NFTA_SET_MAX - 1)
@@ -335,6 +348,7 @@ enum nft_set_elem_flags {
  * @NFTA_SET_ELEM_EXPIRATION: expiration time (NLA_U64)
  * @NFTA_SET_ELEM_USERDATA: user data (NLA_BINARY)
  * @NFTA_SET_ELEM_EXPR: expression (NLA_NESTED: nft_expr_attributes)
+ * @NFTA_SET_ELEM_OBJREF: stateful object reference (NLA_STRING)
  */
 enum nft_set_elem_attributes {
 	NFTA_SET_ELEM_UNSPEC,
@@ -346,6 +360,7 @@ enum nft_set_elem_attributes {
 	NFTA_SET_ELEM_USERDATA,
 	NFTA_SET_ELEM_EXPR,
 	NFTA_SET_ELEM_PAD,
+	NFTA_SET_ELEM_OBJREF,
 	__NFTA_SET_ELEM_MAX
 };
 #define NFTA_SET_ELEM_MAX (__NFTA_SET_ELEM_MAX - 1)
@@ -659,6 +674,10 @@ enum nft_payload_csum_types {
 	NFT_PAYLOAD_CSUM_INET,
 };
+enum nft_payload_csum_flags {
+	NFT_PAYLOAD_L4CSUM_PSEUDOHDR = (1 << 0),
+};
 /**
  * enum nft_payload_attributes - nf_tables payload expression netlink attributes
  *
@@ -669,6 +688,7 @@ enum nft_payload_csum_types {
  * @NFTA_PAYLOAD_SREG: source register to load data from (NLA_U32: nft_registers)
  * @NFTA_PAYLOAD_CSUM_TYPE: checksum type (NLA_U32)
  * @NFTA_PAYLOAD_CSUM_OFFSET: checksum offset relative to base (NLA_U32)
+ * @NFTA_PAYLOAD_CSUM_FLAGS: checksum flags (NLA_U32)
  */
 enum nft_payload_attributes {
 	NFTA_PAYLOAD_UNSPEC,
@@ -679,6 +699,7 @@ enum nft_payload_attributes {
 	NFTA_PAYLOAD_SREG,
 	NFTA_PAYLOAD_CSUM_TYPE,
 	NFTA_PAYLOAD_CSUM_OFFSET,
+	NFTA_PAYLOAD_CSUM_FLAGS,
 	__NFTA_PAYLOAD_MAX
 };
 #define NFTA_PAYLOAD_MAX (__NFTA_PAYLOAD_MAX - 1)
@@ -968,6 +989,7 @@ enum nft_queue_attributes {
 enum nft_quota_flags {
 	NFT_QUOTA_F_INV = (1 << 0),
+	NFT_QUOTA_F_DEPLETED = (1 << 1),
 };
 /**
@@ -975,12 +997,14 @@ enum nft_quota_flags {
  *
  * @NFTA_QUOTA_BYTES: quota in bytes (NLA_U16)
  * @NFTA_QUOTA_FLAGS: flags (NLA_U32)
+ * @NFTA_QUOTA_CONSUMED: quota already consumed in bytes (NLA_U64)
  */
 enum nft_quota_attributes {
 	NFTA_QUOTA_UNSPEC,
 	NFTA_QUOTA_BYTES,
 	NFTA_QUOTA_FLAGS,
 	NFTA_QUOTA_PAD,
+	NFTA_QUOTA_CONSUMED,
 	__NFTA_QUOTA_MAX
 };
 #define NFTA_QUOTA_MAX (__NFTA_QUOTA_MAX - 1)
@@ -1124,6 +1148,26 @@ enum nft_fwd_attributes {
 };
 #define NFTA_FWD_MAX (__NFTA_FWD_MAX - 1)
+/**
+ * enum nft_objref_attributes - nf_tables stateful object expression netlink attributes
+ *
+ * @NFTA_OBJREF_IMM_TYPE: object type for immediate reference (NLA_U32: nft_register)
+ * @NFTA_OBJREF_IMM_NAME: object name for immediate reference (NLA_STRING)
+ * @NFTA_OBJREF_SET_SREG: source register of the data to look for (NLA_U32: nft_registers)
+ * @NFTA_OBJREF_SET_NAME: name of the set where to look for (NLA_STRING)
+ * @NFTA_OBJREF_SET_ID: id of the set where to look for in this transaction (NLA_U32)
+ */
+enum nft_objref_attributes {
+	NFTA_OBJREF_UNSPEC,
+	NFTA_OBJREF_IMM_TYPE,
+	NFTA_OBJREF_IMM_NAME,
+	NFTA_OBJREF_SET_SREG,
+	NFTA_OBJREF_SET_NAME,
+	NFTA_OBJREF_SET_ID,
+	__NFTA_OBJREF_MAX
+};
+#define NFTA_OBJREF_MAX (__NFTA_OBJREF_MAX - 1)
 /**
  * enum nft_gen_attributes - nf_tables ruleset generation attributes
  *
@@ -1172,6 +1216,32 @@ enum nft_fib_flags {
 	NFTA_FIB_F_OIF = 1 << 4, /* restrict to oif */
 };
+#define NFT_OBJECT_UNSPEC 0
+#define NFT_OBJECT_COUNTER 1
+#define NFT_OBJECT_QUOTA 2
+#define __NFT_OBJECT_MAX 3
+#define NFT_OBJECT_MAX (__NFT_OBJECT_MAX - 1)
+/**
+ * enum nft_object_attributes - nf_tables stateful object netlink attributes
+ *
+ * @NFTA_OBJ_TABLE: name of the table containing the expression (NLA_STRING)
+ * @NFTA_OBJ_NAME: name of this expression type (NLA_STRING)
+ * @NFTA_OBJ_TYPE: stateful object type (NLA_U32)
+ * @NFTA_OBJ_DATA: stateful object data (NLA_NESTED)
+ * @NFTA_OBJ_USE: number of references to this expression (NLA_U32)
+ */
+enum nft_object_attributes {
+	NFTA_OBJ_UNSPEC,
+	NFTA_OBJ_TABLE,
+	NFTA_OBJ_NAME,
+	NFTA_OBJ_TYPE,
+	NFTA_OBJ_DATA,
+	NFTA_OBJ_USE,
+	__NFTA_OBJ_MAX
+};
+#define NFTA_OBJ_MAX (__NFTA_OBJ_MAX - 1)
 /**
  * enum nft_trace_attributes - nf_tables trace netlink attributes
  *