1. 15 Sep, 2018 8 commits
    • Florian Westphal's avatar
      tcp: do not restart timewait timer on rst reception · e4b6c5fd
      Florian Westphal authored
      [ Upstream commit 63cc357f ]
      
      RFC 1337 says:
       ''Ignore RST segments in TIME-WAIT state.
         If the 2 minute MSL is enforced, this fix avoids all three hazards.''
      
      So with net.ipv4.tcp_rfc1337=1, expected behaviour is to have TIME-WAIT sk
      expire rather than removing it instantly when a reset is received.
      
      However, Linux will also re-start the TIME-WAIT timer.
      
      This causes connect to fail when tying to re-use ports or very long
      delays (until syn retry interval exceeds MSL).
      
      packetdrill test case:
      // Demonstrate bogus rearming of TIME-WAIT timer in rfc1337 mode.
      `sysctl net.ipv4.tcp_rfc1337=1`
      
      0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
      0.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
      0.000 bind(3, ..., ...) = 0
      0.000 listen(3, 1) = 0
      
      0.100 < S 0:0(0) win 29200 <mss 1460,nop,nop,sackOK,nop,wscale 7>
      0.100 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7>
      0.200 < . 1:1(0) ack 1 win 257
      0.200 accept(3, ..., ...) = 4
      
      // Receive first segment
      0.310 < P. 1:1001(1000) ack 1 win 46
      
      // Send one ACK
      0.310 > . 1:1(0) ack 1001
      
      // read 1000 byte
      0.310 read(4, ..., 1000) = 1000
      
      // Application writes 100 bytes
      0.350 write(4, ..., 100) = 100
      0.350 > P. 1:101(100) ack 1001
      
      // ACK
      0.500 < . 1001:1001(0) ack 101 win 257
      
      // close the connection
      0.600 close(4) = 0
      0.600 > F. 101:101(0) ack 1001 win 244
      
      // Our side is in FIN_WAIT_1 & waits for ack to fin
      0.7 < . 1001:1001(0) ack 102 win 244
      
      // Our side is in FIN_WAIT_2 with no outstanding data.
      0.8 < F. 1001:1001(0) ack 102 win 244
      0.8 > . 102:102(0) ack 1002 win 244
      
      // Our side is now in TIME_WAIT state, send ack for fin.
      0.9 < F. 1002:1002(0) ack 102 win 244
      0.9 > . 102:102(0) ack 1002 win 244
      
      // Peer reopens with in-window SYN:
      1.000 < S 1000:1000(0) win 9200 <mss 1460,nop,nop,sackOK,nop,wscale 7>
      
      // Therefore, reply with ACK.
      1.000 > . 102:102(0) ack 1002 win 244
      
      // Peer sends RST for this ACK.  Normally this RST results
      // in tw socket removal, but rfc1337=1 setting prevents this.
      1.100 < R 1002:1002(0) win 244
      
      // second syn. Due to rfc1337=1 expect another pure ACK.
      31.0 < S 1000:1000(0) win 9200 <mss 1460,nop,nop,sackOK,nop,wscale 7>
      31.0 > . 102:102(0) ack 1002 win 244
      
      // .. and another RST from peer.
      31.1 < R 1002:1002(0) win 244
      31.2 `echo no timer restart;ss -m -e -a -i -n -t -o state TIME-WAIT`
      
      // third syn after one minute.  Time-Wait socket should have expired by now.
      63.0 < S 1000:1000(0) win 9200 <mss 1460,nop,nop,sackOK,nop,wscale 7>
      
      // so we expect a syn-ack & 3whs to proceed from here on.
      63.0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7>
      
      Without this patch, 'ss' shows restarts of tw timer and last packet is
      thus just another pure ack, more than one minute later.
      
      This restores the original code from commit 283fd6cf0be690a83
      ("Merge in ANK networking jumbo patch") in netdev-vger-cvs.git .
      
      For some reason the else branch was removed/lost in 1f28b683339f7
      ("Merge in TCP/UDP optimizations and [..]") and timer restart became
      unconditional.
      Reported-by: default avatarMichal Tesar <mtesar@redhat.com>
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e4b6c5fd
    • Anthony Wong's avatar
      r8169: add support for NCube 8168 network card · 3eada53d
      Anthony Wong authored
      [ Upstream commit 9fd0e09a ]
      
      This card identifies itself as:
        Ethernet controller [0200]: NCube Device [10ff:8168] (rev 06)
        Subsystem: TP-LINK Technologies Co., Ltd. Device [7470:3468]
      
      Adding a new entry to rtl8169_pci_tbl makes the card work.
      
      Link: http://launchpad.net/bugs/1788730Signed-off-by: default avatarAnthony Wong <anthony.wong@ubuntu.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3eada53d
    • Manish Chopra's avatar
      qlge: Fix netdev features configuration. · d19688e3
      Manish Chopra authored
      [ Upstream commit 6750c870 ]
      
      qlge_fix_features() is not supposed to modify hardware or
      driver state, rather it is supposed to only fix requested
      fetures bits. Currently qlge_fix_features() also goes for
      interface down and up unnecessarily if there is not even
      any change in features set.
      
      This patch changes/fixes following -
      
      1) Move reload of interface or device re-config from
         qlge_fix_features() to qlge_set_features().
      2) Reload of interface in qlge_set_features() only if
         relevant feature bit (NETIF_F_HW_VLAN_CTAG_RX) is changed.
      3) Get rid of qlge_fix_features() since driver is not really
         required to fix any features bit.
      Signed-off-by: default avatarManish <manish.chopra@cavium.com>
      Reviewed-by: default avatarBenjamin Poirier <bpoirier@suse.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d19688e3
    • Kees Cook's avatar
      net: sched: Fix memory exposure from short TCA_U32_SEL · 7f1e6ec4
      Kees Cook authored
      [ Upstream commit 98c8f125 ]
      
      Via u32_change(), TCA_U32_SEL has an unspecified type in the netlink
      policy, so max length isn't enforced, only minimum. This means nkeys
      (from userspace) was being trusted without checking the actual size of
      nla_len(), which could lead to a memory over-read, and ultimately an
      exposure via a call to u32_dump(). Reachability is CAP_NET_ADMIN within
      a namespace.
      Reported-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: netdev@vger.kernel.org
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Acked-by: default avatarJamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7f1e6ec4
    • Anssi Hannula's avatar
      net: macb: do not disable MDIO bus at open/close time · cb765f5c
      Anssi Hannula authored
      [ Upstream commit 0da70f80 ]
      
      macb_reset_hw() is called from macb_close() and indirectly from
      macb_open(). macb_reset_hw() zeroes the NCR register, including the MPE
      (Management Port Enable) bit.
      
      This will prevent accessing any other PHYs for other Ethernet MACs on
      the MDIO bus, which remains registered at macb_reset_hw() time, until
      macb_init_hw() is called from macb_open() which sets the MPE bit again.
      
      I.e. currently the MDIO bus has a short disruption at open time and is
      disabled at close time until the interface is opened again.
      
      Fix that by only touching the RE and TE bits when enabling and disabling
      RX/TX.
      
      v2: Make macb_init_hw() NCR write a single statement.
      
      Fixes: 6c36a707 ("macb: Use generic PHY layer")
      Signed-off-by: default avatarAnssi Hannula <anssi.hannula@bitwise.fi>
      Reviewed-by: default avatarClaudiu Beznea <claudiu.beznea@microchip.com>
      Tested-by: default avatarClaudiu Beznea <claudiu.beznea@microchip.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cb765f5c
    • Doug Berger's avatar
      net: bcmgenet: use MAC link status for fixed phy · 1ef819e4
      Doug Berger authored
      [ Upstream commit c3c397c1 ]
      
      When using the fixed PHY with GENET (e.g. MOCA) the PHY link
      status can be determined from the internal link status captured
      by the MAC. This allows the PHY state machine to use the correct
      link state with the fixed PHY even if MAC link event interrupts
      are missed when the net device is opened.
      
      Fixes: 8d88c6eb ("net: bcmgenet: enable MoCA link state change detection")
      Signed-off-by: default avatarDoug Berger <opendmb@gmail.com>
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1ef819e4
    • Eric Dumazet's avatar
      ipv4: tcp: send zero IPID for RST and ACK sent in SYN-RECV and TIME-WAIT state · a16405ad
      Eric Dumazet authored
      [ Upstream commit 431280ee ]
      
      tcp uses per-cpu (and per namespace) sockets (net->ipv4.tcp_sk) internally
      to send some control packets.
      
      1) RST packets, through tcp_v4_send_reset()
      2) ACK packets in SYN-RECV and TIME-WAIT state, through tcp_v4_send_ack()
      
      These packets assert IP_DF, and also use the hashed IP ident generator
      to provide an IPv4 ID number.
      
      Geoff Alexander reported this could be used to build off-path attacks.
      
      These packets should not be fragmented, since their size is smaller than
      IPV4_MIN_MTU. Only some tunneled paths could eventually have to fragment,
      regardless of inner IPID.
      
      We really can use zero IPID, to address the flaw, and as a bonus,
      avoid a couple of atomic operations in ip_idents_reserve()
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarGeoff Alexander <alexandg@cs.unm.edu>
      Tested-by: default avatarGeoff Alexander <alexandg@cs.unm.edu>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a16405ad
    • Cong Wang's avatar
      act_ife: fix a potential use-after-free · a08d7ea1
      Cong Wang authored
      [ Upstream commit 6d784f16 ]
      
      Immediately after module_put(), user could delete this
      module, so e->ops could be already freed before we call
      e->ops->release().
      
      Fix this by moving module_put() after ops->release().
      
      Fixes: ef6980b6 ("introduce IFE action")
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a08d7ea1
  2. 09 Sep, 2018 32 commits