1. 01 Mar, 2019 11 commits
    • netfilter: nf_tables: nat: merge nft_masq protocol specific modules · a9ce849e
      Florian Westphal authored
      The family-specific masq modules are way too small to warrant
      an extra module; just place all of them in nft_masq.
      
      before:
         text    data     bss     dec     hex filename
         1001     832       0    1833     729 nft_masq.ko
          766     896       0    1662     67e nft_masq_ipv4.ko
          764     896       0    1660     67c nft_masq_ipv6.ko

      after:
         text    data     bss     dec     hex filename
         2010     960       0    2970     b9a nft_masq.ko
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nf_tables: nat: merge nft_redir protocol specific modules · c78efc99
      Florian Westphal authored
      before:
         text    data     bss     dec     hex filename
          990     832       0    1822     71e nft_redir.ko
          697     896       0    1593     639 nft_redir_ipv4.ko
          713     896       0    1609     649 nft_redir_ipv6.ko

      after:
         text    data     bss     dec     hex filename
         1910     960       0    2870     b36 nft_redir.ko
      
      Size is reduced, and all helpers from nft_redir.ko can be made static.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: xt_IDLETIMER: fix sysfs callback function type · 20fdaf6e
      Sami Tolvanen authored
      Use struct device_attribute instead of struct idletimer_tg_attr, and
      the correct callback function type to avoid indirect call mismatches
      with Control Flow Integrity checking.
      Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
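
      As a rough illustration of why the callback type matters: with Control
      Flow Integrity checking, an indirect call is expected to land on a
      function whose prototype matches the pointer type used at the call site.
      The userspace sketch below uses simplified stand-ins for struct device
      and struct device_attribute (they are not the kernel definitions), and
      example_timer, example_show and read_expires_via_attr are hypothetical
      names. It shows a show callback declared with exactly the pointer's
      type, recovering its enclosing object with a container_of-style macro.

      ```c
      #include <assert.h>
      #include <stddef.h>
      #include <stdio.h>
      #include <stdlib.h>

      /* Simplified stand-ins for the kernel types (assumption, not the real layout). */
      struct device { const char *name; };
      struct device_attribute {
      	const char *name;
      	long (*show)(struct device *dev, struct device_attribute *attr, char *buf);
      };

      #define EXAMPLE_BUF_SIZE 32

      /* Recover the enclosing structure from a pointer to one of its members. */
      #define example_container_of(ptr, type, member) \
      	((type *)((char *)(ptr) - offsetof(type, member)))

      /* Hypothetical timer object embedding the generic attribute. */
      struct example_timer {
      	struct device_attribute attr;
      	unsigned long expires;
      };

      /* Prototype matches the show pointer type exactly, so a call-site type check passes. */
      static long example_show(struct device *dev, struct device_attribute *attr, char *buf)
      {
      	struct example_timer *t = example_container_of(attr, struct example_timer, attr);

      	(void)dev;
      	return snprintf(buf, EXAMPLE_BUF_SIZE, "%lu\n", t->expires);
      }

      /* Dispatch through the attribute's function pointer and parse the text back. */
      static unsigned long read_expires_via_attr(unsigned long expires)
      {
      	struct example_timer t = { { "timer", example_show }, expires };
      	struct device dev = { "dev0" };
      	char buf[EXAMPLE_BUF_SIZE];
      	long n = t.attr.show(&dev, &t.attr, buf);

      	assert(n > 0 && n < EXAMPLE_BUF_SIZE);
      	return strtoul(buf, NULL, 10);
      }
      ```

      Calling read_expires_via_attr(42) goes through the function pointer and
      returns the value the callback printed. The design point is that no
      cast of the function pointer is needed anywhere.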
    • netfilter: nf_conntrack: ensure that CONNTRACK_LOCKS is power of 2 · 2e7b162c
      Li RongQing authored
      CONNTRACK_LOCKS is the divisor when computing the array index. If it is
      a power of 2, the compiler will optimize the modulo operation into a
      bitwise AND; otherwise the modulo will lower performance.
      Suggested-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Li RongQing <lirongqing@baidu.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
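
      The equivalence this commit relies on can be sketched in plain C. The
      constant EXAMPLE_LOCKS and both helper names are hypothetical and stand
      in for the real lock-array size and index computation; the compile-time
      check is one possible way to enforce the property, not necessarily the
      one the patch uses.

      ```c
      #include <assert.h>
      #include <stdint.h>

      /* Hypothetical lock-array size, used only for this illustration. */
      #define EXAMPLE_LOCKS 1024u

      /* A non-zero power of two has exactly one bit set. */
      _Static_assert(EXAMPLE_LOCKS != 0 && (EXAMPLE_LOCKS & (EXAMPLE_LOCKS - 1)) == 0,
      	       "EXAMPLE_LOCKS must be a power of 2");

      /* Index computed with the modulo operator. */
      static inline uint32_t lock_index_mod(uint32_t hash)
      {
      	return hash % EXAMPLE_LOCKS;
      }

      /* Same index computed with a mask; valid only for power-of-two sizes. */
      static inline uint32_t lock_index_and(uint32_t hash)
      {
      	return hash & (EXAMPLE_LOCKS - 1);
      }
      ```

      For any unsigned hash the two helpers return the same index, which is
      why a compiler can replace the division with a single AND when the
      divisor is a constant power of 2.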
    • netfilter: nf_tables: check the result of dereferencing base_chain->stats · a9f5e78c
      Li RongQing authored
      Check the result of dereferencing base_chain->stats, instead of
      comparing the result of this_cpu_ptr with NULL.

      base_chain->stats may be changed to NULL when a chain is updated and a
      new NULL counter can be attached.

      We do not need to check the return value of this_cpu_ptr: if
      base_chain->stats is non-NULL, it comes from the percpu allocator, so
      this_cpu_ptr returns a valid value.

      Also fix two sparse errors by replacing rcu_access_pointer and
      rcu_dereference with READ_ONCE under rcu_read_lock.

      Thanks to Eric for his help in finishing this patch.
      
      Fixes: 00924094 ("netfilter: nf_tables: don't assume chain stats are set when jumplabel is set")
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
      Signed-off-by: Li RongQing <lirongqing@baidu.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: bridge: Don't sabotage nf_hook calls for an l3mdev slave · cd642898
      David Ahern authored
      Followup to a173f066 ("netfilter: bridge: Don't sabotage nf_hook
      calls from an l3mdev"). Some packets (e.g., ndisc) do not have the skb
      device flipped to the l3mdev (e.g., VRF) device. Update ip_sabotage_in
      to not drop packets for slave devices too. Currently, neighbor
      solicitation packets for 'dev -> bridge (addr) -> vrf' setups are getting
      dropped. This patch enables IPv6 communications for bridges with an
      address that are enslaved to a VRF.
      
      Fixes: 73e20b76 ("net: vrf: Add support for PREROUTING rules on vrf device")
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • ipvs: get sctphdr by sctphoff in sctp_csum_check · f52a40fb
      Xin Long authored
      sctp_csum_check() is called by sctp_s/dnat_handler(), which calls
      skb_make_writable() to ensure that sctphdr is linearized.
      
      So there's no need to get sctphdr by calling skb_header_pointer()
      in sctp_csum_check().
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: Julian Anastasov <ja@ssi.bg>
      Acked-by: Simon Horman <horms@verge.net.au>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: convert the proto argument from u8 to u16 · 11d4dd0b
      Li RongQing authored
      The proto in struct xt_match and struct xt_target is u16, but the
      proto argument of xt_check_target/match is u8, which causes
      truncation. This is harmless for IP packets, since the IP proto
      is u8.

      If an ebtables match/target has a proto that is u16, the truncation
      causes the check to fail.

      Also convert be16 to short in bridge/netfilter/ebtables.c.
      Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
      Signed-off-by: Li RongQing <lirongqing@baidu.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
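
      The truncation described in this commit can be reproduced in a small
      userspace sketch. The helpers proto_ok_u8 and proto_ok_u16 are
      hypothetical, simplified stand-ins for a check that compares a required
      protocol with the one passed in; they are not the xt_check_match or
      xt_check_target implementations. The value 0x0806 is used as an example
      of a protocol number that does not fit in 8 bits.

      ```c
      #include <assert.h>
      #include <stdint.h>

      /* Hypothetical check: 0 means "any protocol", otherwise values must match. */
      static int proto_ok_u8(uint16_t required, uint8_t proto)
      {
      	return required == 0 || required == proto;
      }

      /* Same check with a 16-bit argument, so no bits are lost. */
      static int proto_ok_u16(uint16_t required, uint16_t proto)
      {
      	return required == 0 || required == proto;
      }

      /* Pass the same 16-bit value as both requirement and argument. */
      static int old_check_passes(uint16_t proto)
      {
      	return proto_ok_u8(proto, proto);	/* argument narrowed to 8 bits */
      }

      static int new_check_passes(uint16_t proto)
      {
      	return proto_ok_u16(proto, proto);
      }
      ```

      With an 8-bit value such as 6 both variants agree. With 0x0806 the
      narrowed argument becomes 0x06, so the 8-bit variant reports a mismatch
      even though the caller passed exactly the required value.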
    • netfilter: nft_tunnel: Add dst_cache support · 3e511d56
      wenxu authored
      The metadata_dst does not initialize the dst_cache field. This causes
      problems for ip_md_tunnel_xmit(), since it cannot use this cache, hence
      triggering a route lookup for every packet.
      Signed-off-by: wenxu <wenxu@ucloud.cn>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: conntrack: tcp: only close if RST matches exact sequence · be0502a3
      Florian Westphal authored
      TCP resets cause instant transition from established to closed state
      provided the reset is in-window.  Endpoints that implement RFC 5961
      require resets to match the next expected sequence number.
      RST segments that are in-window (but that do not match RCV.NXT) are
      ignored, and a "challenge ACK" is sent back.
      
      The main problem for conntrack is that it's a middlebox, i.e. whereas an
      end host might have ACK'd SEQ (and would thus accept an RST with this
      sequence number), conntrack might not have seen this ACK (yet).
      
      Therefore we can't simply flag RSTs with non-exact match as invalid.
      
      This updates RST processing as follows:
      
      1. If the connection is in a state other than ESTABLISHED, nothing is
         changed, RST is subject to normal in-window check.
      
      2. If the RST's sequence number exactly matches RCV.NXT,
         connection state moves to CLOSE.
      
      3. The same applies if the RST sequence number aligns with a previous
         packet in the same direction.
      
      In all other cases, the connection remains in ESTABLISHED state.
      If the normal in-window check passes, the timeout will be lowered
      to that of CLOSE.

      If the peer sends a challenge ACK, connection timeout will be reset.
      
      If the challenge ACK triggers another RST (RST was valid after all),
      this 2nd RST will match expected sequence and conntrack state changes to
      CLOSE.
      
      If no challenge ACK is received, the connection will time out after
      CLOSE seconds (10 seconds by default), just like without this patch.
      
      Packetdrill test case:
      
      0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
      0.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
      0.000 bind(3, ..., ...) = 0
      0.000 listen(3, 1) = 0
      
      0.100 < S 0:0(0) win 32792 <mss 1460,sackOK,nop,nop,nop,wscale 7>
      0.100 > S. 0:0(0) ack 1 win 64240 <mss 1460,nop,nop,sackOK,nop,wscale 7>
      0.200 < . 1:1(0) ack 1 win 257
      0.200 accept(3, ..., ...) = 4
      
      // Receive a segment.
      0.210 < P. 1:1001(1000) ack 1 win 46
      0.210 > . 1:1(0) ack 1001
      
      // Application writes 1000 bytes.
      0.250 write(4, ..., 1000) = 1000
      0.250 > P. 1:1001(1000) ack 1001
      
      // First reset, old sequence. Conntrack (correctly) considers this
      // invalid due to failed window validation (regardless of this patch).
      0.260 < R  2:2(0) ack 1001 win 260
      
      // 2nd reset, but too far ahead sequence.  Same: correctly handled
      // as invalid.
      0.270 < R 99990001:99990001(0) ack 1001 win 260
      
      // in-window, but not exact sequence.
      // Current Linux kernels might reply with a challenge ack, and do not
      // remove connection.
      // Without this patch, conntrack state moves to CLOSE.
      // With patch, timeout is lowered like CLOSE, but connection stays
      // in ESTABLISHED state.
      0.280 < R 1010:1010(0) ack 1001 win 260
      
      // Expect challenge ACK
      0.281 > . 1001:1001(0) ack 1001 win 501
      
      // With or without this patch, RST will cause connection
      // to move to CLOSE (sequence number matches)
      // 0.282 < R 1001:1001(0) ack 1001 win 260
      
      // ACK
      0.300 < . 1001:1001(0) ack 1001 win 257
      
      // more data could be exchanged here, connection
      // is still established
      
      // Client closes the connection.
      0.610 < F. 1001:1001(0) ack 1001 win 260
      0.650 > . 1001:1001(0) ack 1002
      
      // Close the connection without reading outstanding data
      0.700 close(4) = 0
      
      // so one more reset.  Will be deemed acceptable with patch as well:
      // connection is already closing.
      0.701 > R. 1001:1001(0) ack 1002 win 501
      // End packetdrill test case.
      
      With patch, this generates the following conntrack events:
         [NEW] 120 SYN_SENT src=10.0.2.1 dst=10.0.0.1 sport=5437 dport=80 [UNREPLIED]
      [UPDATE] 60 SYN_RECV src=10.0.2.1 dst=10.0.0.1 sport=5437 dport=80
      [UPDATE] 432000 ESTABLISHED src=10.0.2.1 dst=10.0.0.1 sport=5437 dport=80 [ASSURED]
      [UPDATE] 120 FIN_WAIT src=10.0.2.1 dst=10.0.0.1 sport=5437 dport=80 [ASSURED]
      [UPDATE] 60 CLOSE_WAIT src=10.0.2.1 dst=10.0.0.1 sport=5437 dport=80 [ASSURED]
      [UPDATE] 10 CLOSE src=10.0.2.1 dst=10.0.0.1 sport=5437 dport=80 [ASSURED]
      
      Without patch, first RST moves connection to close, whereas socket state
      does not change until FIN is received.
         [NEW] 120 SYN_SENT src=10.0.2.1 dst=10.0.0.1 sport=5141 dport=80 [UNREPLIED]
      [UPDATE] 60 SYN_RECV src=10.0.2.1 dst=10.0.0.1 sport=5141 dport=80
      [UPDATE] 432000 ESTABLISHED src=10.0.2.1 dst=10.0.0.1 sport=5141 dport=80 [ASSURED]
      [UPDATE] 10 CLOSE src=10.0.2.1 dst=10.0.0.1 sport=5141 dport=80 [ASSURED]
      
      Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
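
      The three numbered RST rules in the commit message above can be read as
      a small decision function. The sketch below is only a model of that
      description: classify_rst, the enum values and the parameters are
      hypothetical names, "aligns with a previous packet in the same
      direction" is assumed here to mean equality with a remembered sequence
      number, and the separate in-window check that decides whether the
      timeout is lowered is not modelled.

      ```c
      #include <assert.h>
      #include <stdint.h>

      enum rst_verdict {
      	RST_NORMAL_WINDOW_CHECK,	/* rule 1: state is not ESTABLISHED */
      	RST_MOVE_TO_CLOSE,		/* rules 2 and 3 */
      	RST_STAY_ESTABLISHED,		/* all other cases */
      };

      static enum rst_verdict classify_rst(int established, uint32_t rst_seq,
      				     uint32_t rcv_nxt, int have_prev,
      				     uint32_t prev_seq_same_dir)
      {
      	if (!established)
      		return RST_NORMAL_WINDOW_CHECK;

      	/* Rule 2: exact match with the next expected sequence number. */
      	if (rst_seq == rcv_nxt)
      		return RST_MOVE_TO_CLOSE;

      	/* Rule 3 (assumed as equality): matches an earlier packet, same direction. */
      	if (have_prev && rst_seq == prev_seq_same_dir)
      		return RST_MOVE_TO_CLOSE;

      	return RST_STAY_ESTABLISHED;
      }
      ```

      Using the numbers from the packetdrill script, an RST with sequence
      1010 while 1001 is expected keeps the connection in ESTABLISHED, and a
      follow-up RST with sequence 1001 moves it to CLOSE.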
    • ipvs: change some data types from int to bool · f25a9b85
      Andrea Claudi authored
      Change the data type of the following variables from int to bool
      across ipvs code:
      
        - found
        - loop
        - need_full_dest
        - need_full_svc
        - payload_csum
      
      Also change the following functions to use bool full_entry param
      instead of int:
      
        - ip_vs_genl_parse_dest()
        - ip_vs_genl_parse_service()
      
      This patch does not change any functionality but makes the source
      code slightly easier to read.
      Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
      Acked-by: Julian Anastasov <ja@ssi.bg>
      Acked-by: Simon Horman <horms@verge.net.au>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  2. 27 Feb, 2019 18 commits
  3. 26 Feb, 2019 11 commits