1. 27 Feb, 2019 4 commits
    • Rajasingh Thavamani's avatar
      net: phy: Micrel KSZ8061: link failure after cable connect · 232ba3a5
      Rajasingh Thavamani authored
      With Micrel KSZ8061 PHY, the link may occasionally not come up after
      Ethernet cable connect. The vendor's (Microchip, former Micrel) errata
      sheet 80000688A.pdf descripes the problem and possible workarounds in
      detail, see below.
      The batch implements workaround 1, which permanently fixes the issue.
      
      DESCRIPTION
      Link-up may not occur properly when the Ethernet cable is initially
      connected. This issue occurs more commonly when the cable is connected
      slowly, but it may occur any time a cable is connected. This issue occurs
      in the auto-negotiation circuit, and will not occur if auto-negotiation
      is disabled (which requires that the two link partners be set to the
      same speed and duplex).
      
      END USER IMPLICATIONS
      When this issue occurs, link is not established. Subsequent cable
      plug/unplaug cycle will not correct the issue.
      
      WORk AROUND
      There are four approaches to work around this issue:
      1. This issue can be prevented by setting bit 15 in MMD device address 1,
         register 2, prior to connecting the cable or prior to setting the
         Restart Auto-negotiation bit in register 0h. The MMD registers are
         accessed via the indirect access registers Dh and Eh, or via the Micrel
         EthUtil utility as shown here:
         . if using the EthUtil utility (usually with a Micrel KSZ8061
           Evaluation Board), type the following commands:
           > address 1
           > mmd 1
           > iw 2 b61a
         . Alternatively, write the following registers to write to the
           indirect MMD register:
           Write register Dh, data 0001h
           Write register Eh, data 0002h
           Write register Dh, data 4001h
           Write register Eh, data B61Ah
      2. The issue can be avoided by disabling auto-negotiation in the KSZ8061,
         either by the strapping option, or by clearing bit 12 in register 0h.
         Care must be taken to ensure that the KSZ8061 and the link partner
         will link with the same speed and duplex. Note that the KSZ8061
         defaults to full-duplex when auto-negotiation is off, but other
         devices may default to half-duplex in the event of failed
         auto-negotiation.
      3. The issue can be avoided by connecting the cable prior to powering-up
         or resetting the KSZ8061, and leaving it plugged in thereafter.
      4. If the above measures are not taken and the problem occurs, link can
         be recovered by setting the Restart Auto-Negotiation bit in
         register 0h, or by resetting or power cycling the device. Reset may
         be either hardware reset or software reset (register 0h, bit 15).
      
      PLAN
      This errata will not be corrected in the future revision.
      
      Fixes: 7ab59dc1 ("drivers/net/phy/micrel_phy: Add support for new PHYs")
      Signed-off-by: default avatarAlexander Onnasch <alexander.onnasch@landisgyr.com>
      Signed-off-by: default avatarRajasingh Thavamani <T.Rajasingh@landisgyr.com>
      Reviewed-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      232ba3a5
    • Andy Shevchenko's avatar
      enc28j60: Correct description of debug module parameter · 287beb28
      Andy Shevchenko authored
      The netif_msg_init() API takes on input the amount of bits to be set. The
      description of debug parameter in the enc28j60 module is misleading in this
      sense and passing 0xffff does not give an expected behaviour.
      
      Fix the description of debug module parameter to show what exactly is expected.
      Signed-off-by: default avatarAndy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      287beb28
    • Andy Shevchenko's avatar
      net: dev: Use unsigned integer as an argument to left-shift · f4d7b3e2
      Andy Shevchenko authored
      1 << 31 is Undefined Behaviour according to the C standard.
      Use U type modifier to avoid theoretical overflow.
      Signed-off-by: default avatarAndy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f4d7b3e2
    • Michael Chan's avatar
      bnxt_en: Drop oversize TX packets to prevent errors. · 2b3c6885
      Michael Chan authored
      There have been reports of oversize UDP packets being sent to the
      driver to be transmitted, causing error conditions.  The issue is
      likely caused by the dst of the SKB switching between 'lo' with
      64K MTU and the hardware device with a smaller MTU.  Patches are
      being proposed by Mahesh Bandewar <maheshb@google.com> to fix the
      issue.
      
      In the meantime, add a quick length check in the driver to prevent
      the error.  The driver uses the TX packet size as index to look up an
      array to setup the TX BD.  The array is large enough to support all MTU
      sizes supported by the driver.  The oversize TX packet causes the
      driver to index beyond the array and put garbage values into the
      TX BD.  Add a simple check to prevent this.
      Signed-off-by: default avatarMichael Chan <michael.chan@broadcom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2b3c6885
  2. 26 Feb, 2019 6 commits
    • Tung Nguyen's avatar
      tipc: fix race condition causing hung sendto · bfd07f3d
      Tung Nguyen authored
      When sending multicast messages via blocking socket,
      if sending link is congested (tsk->cong_link_cnt is set to 1),
      the sending thread will be put into sleeping state. However,
      tipc_sk_filter_rcv() is called under socket spin lock but
      tipc_wait_for_cond() is not. So, there is no guarantee that
      the setting of tsk->cong_link_cnt to 0 in tipc_sk_proto_rcv() in
      CPU-1 will be perceived by CPU-0. If that is the case, the sending
      thread in CPU-0 after being waken up, will continue to see
      tsk->cong_link_cnt as 1 and put the sending thread into sleeping
      state again. The sending thread will sleep forever.
      
      CPU-0                                | CPU-1
      tipc_wait_for_cond()                 |
      {                                    |
       // condition_ = !tsk->cong_link_cnt |
       while ((rc_ = !(condition_))) {     |
        ...                                |
        release_sock(sk_);                 |
        wait_woken();                      |
                                           | if (!sock_owned_by_user(sk))
                                           |  tipc_sk_filter_rcv()
                                           |  {
                                           |   ...
                                           |   tipc_sk_proto_rcv()
                                           |   {
                                           |    ...
                                           |    tsk->cong_link_cnt--;
                                           |    ...
                                           |    sk->sk_write_space(sk);
                                           |    ...
                                           |   }
                                           |   ...
                                           |  }
        sched_annotate_sleep();            |
        lock_sock(sk_);                    |
        remove_wait_queue();               |
       }                                   |
      }                                    |
      
      This commit fixes it by adding memory barrier to tipc_sk_proto_rcv()
      and tipc_wait_for_cond().
      Acked-by: default avatarJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: default avatarTung Nguyen <tung.q.nguyen@dektech.com.au>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      bfd07f3d
    • Haiyang Zhang's avatar
      hv_netvsc: Fix IP header checksum for coalesced packets · bf48648d
      Haiyang Zhang authored
      Incoming packets may have IP header checksum verified by the host.
      They may not have IP header checksum computed after coalescing.
      This patch re-compute the checksum when necessary, otherwise the
      packets may be dropped, because Linux network stack always checks it.
      Signed-off-by: default avatarHaiyang Zhang <haiyangz@microsoft.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      bf48648d
    • David S. Miller's avatar
      Merge branch 'net-fail-route' · d8e96745
      David S. Miller authored
      David Ahern says:
      
      ====================
      net: Fail route add with unsupported nexthop attribute
      
      RTA_VIA was added for MPLS as a way of specifying a gateway from a
      different address family. IPv4 and IPv6 do not currently support RTA_VIA
      so using it leads to routes that are not what the user intended. Catch
      and fail - returning a proper error message.
      
      MPLS on the other hand does not support RTA_GATEWAY since it does not
      make sense to have a nexthop from the MPLS address family. Similarly,
      catch and fail - returning a proper error message.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d8e96745
    • David Ahern's avatar
      mpls: Return error for RTA_GATEWAY attribute · be48220e
      David Ahern authored
      MPLS does not support nexthops with an MPLS address family.
      Specifically, it does not handle RTA_GATEWAY attribute. Make it
      clear by returning an error.
      
      Fixes: 03c05665 ("mpls: Netlink commands to add, remove, and dump routes")
      Signed-off-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      be48220e
    • David Ahern's avatar
      ipv6: Return error for RTA_VIA attribute · e3818541
      David Ahern authored
      IPv6 currently does not support nexthops outside of the AF_INET6 family.
      Specifically, it does not handle RTA_VIA attribute. If it is passed
      in a route add request, the actual route added only uses the device
      which is clearly not what the user intended:
      
        $ ip -6 ro add 2001:db8:2::/64 via inet 172.16.1.1 dev eth0
        $ ip ro ls
        ...
        2001:db8:2::/64 dev eth0 metric 1024 pref medium
      
      Catch this and fail the route add:
        $ ip -6 ro add 2001:db8:2::/64 via inet 172.16.1.1 dev eth0
        Error: IPv6 does not support RTA_VIA attribute.
      
      Fixes: 03c05665 ("mpls: Netlink commands to add, remove, and dump routes")
      Signed-off-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e3818541
    • David Ahern's avatar
      ipv4: Return error for RTA_VIA attribute · b6e9e5df
      David Ahern authored
      IPv4 currently does not support nexthops outside of the AF_INET family.
      Specifically, it does not handle RTA_VIA attribute. If it is passed
      in a route add request, the actual route added only uses the device
      which is clearly not what the user intended:
      
        $ ip ro add 172.16.1.0/24 via inet6 2001:db8:1::1 dev eth0
        $ ip ro ls
        ...
        172.16.1.0/24 dev eth0
      
      Catch this and fail the route add:
        $ ip ro add 172.16.1.0/24 via inet6 2001:db8:1::1 dev eth0
        Error: IPv4 does not support RTA_VIA attribute.
      
      Fixes: 03c05665 ("mpls: Netlink commands to add, remove, and dump routes")
      Signed-off-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b6e9e5df
  3. 25 Feb, 2019 10 commits
    • Nazarov Sergey's avatar
      net: avoid use IPCB in cipso_v4_error · 3da1ed7a
      Nazarov Sergey authored
      Extract IP options in cipso_v4_error and use __icmp_send.
      Signed-off-by: default avatarSergey Nazarov <s-nazarov@yandex.ru>
      Acked-by: default avatarPaul Moore <paul@paul-moore.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3da1ed7a
    • Nazarov Sergey's avatar
      net: Add __icmp_send helper. · 9ef6b42a
      Nazarov Sergey authored
      Add __icmp_send function having ip_options struct parameter
      Signed-off-by: default avatarSergey Nazarov <s-nazarov@yandex.ru>
      Reviewed-by: default avatarPaul Moore <paul@paul-moore.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9ef6b42a
    • Timur Celik's avatar
      tun: remove unnecessary memory barrier · ecef67cb
      Timur Celik authored
      Replace set_current_state with __set_current_state since no memory
      barrier is needed at this point.
      Signed-off-by: default avatarTimur Celik <mail@timurcelik.de>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ecef67cb
    • Eric Biggers's avatar
      net: socket: set sock->sk to NULL after calling proto_ops::release() · ff7b11aa
      Eric Biggers authored
      Commit 9060cb71 ("net: crypto set sk to NULL when af_alg_release.")
      fixed a use-after-free in sockfs_setattr() when an AF_ALG socket is
      closed concurrently with fchownat().  However, it ignored that many
      other proto_ops::release() methods don't set sock->sk to NULL and
      therefore allow the same use-after-free:
      
          - base_sock_release
          - bnep_sock_release
          - cmtp_sock_release
          - data_sock_release
          - dn_release
          - hci_sock_release
          - hidp_sock_release
          - iucv_sock_release
          - l2cap_sock_release
          - llcp_sock_release
          - llc_ui_release
          - rawsock_release
          - rfcomm_sock_release
          - sco_sock_release
          - svc_release
          - vcc_release
          - x25_release
      
      Rather than fixing all these and relying on every socket type to get
      this right forever, just make __sock_release() set sock->sk to NULL
      itself after calling proto_ops::release().
      
      Reproducer that produces the KASAN splat when any of these socket types
      are configured into the kernel:
      
          #include <pthread.h>
          #include <stdlib.h>
          #include <sys/socket.h>
          #include <unistd.h>
      
          pthread_t t;
          volatile int fd;
      
          void *close_thread(void *arg)
          {
              for (;;) {
                  usleep(rand() % 100);
                  close(fd);
              }
          }
      
          int main()
          {
              pthread_create(&t, NULL, close_thread, NULL);
              for (;;) {
                  fd = socket(rand() % 50, rand() % 11, 0);
                  fchownat(fd, "", 1000, 1000, 0x1000);
                  close(fd);
              }
          }
      
      Fixes: 86741ec2 ("net: core: Add a UID field to struct sock.")
      Signed-off-by: default avatarEric Biggers <ebiggers@google.com>
      Acked-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ff7b11aa
    • Vlad Buslov's avatar
      net: sched: act_tunnel_key: fix NULL pointer dereference during init · a3df633a
      Vlad Buslov authored
      Metadata pointer is only initialized for action TCA_TUNNEL_KEY_ACT_SET, but
      it is unconditionally dereferenced in tunnel_key_init() error handler.
      Verify that metadata pointer is not NULL before dereferencing it in
      tunnel_key_init error handling code.
      
      Fixes: ee28bb56 ("net/sched: fix memory leak in act_tunnel_key_init()")
      Signed-off-by: default avatarVlad Buslov <vladbu@mellanox.com>
      Reviewed-by: default avatarDavide Caratti <dcaratti@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a3df633a
    • Wen Yang's avatar
      net: dsa: fix a leaked reference by adding missing of_node_put · 9919a363
      Wen Yang authored
      The call to of_parse_phandle returns a node pointer with refcount
      incremented thus it must be explicitly decremented after the last
      usage.
      
      Detected by coccinelle with the following warnings:
      ./net/dsa/port.c:294:1-7: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 284, but without a corresponding object release within this function.
      ./net/dsa/dsa2.c:627:3-9: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 618, but without a corresponding object release within this function.
      ./net/dsa/dsa2.c:630:3-9: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 618, but without a corresponding object release within this function.
      ./net/dsa/dsa2.c:636:3-9: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 618, but without a corresponding object release within this function.
      ./net/dsa/dsa2.c:639:1-7: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 618, but without a corresponding object release within this function.
      Signed-off-by: default avatarWen Yang <wen.yang99@zte.com.cn>
      Reviewed-by: default avatarVivien Didelot <vivien.didelot@gmail.com>
      Cc: Andrew Lunn <andrew@lunn.ch>
      Cc: Vivien Didelot <vivien.didelot@gmail.com>
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Vivien Didelot <vivien.didelot@gmail.com>
      Cc: netdev@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9919a363
    • Timur Celik's avatar
      tun: fix blocking read · 71828b22
      Timur Celik authored
      This patch moves setting of the current state into the loop. Otherwise
      the task may end up in a busy wait loop if none of the break conditions
      are met.
      Signed-off-by: default avatarTimur Celik <mail@timurcelik.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      71828b22
    • Hauke Mehrtens's avatar
      net: dsa: lantiq: Add GPHY firmware files · cffde201
      Hauke Mehrtens authored
      This adds the file names of the FW files which this driver handles into
      the module description.
      Signed-off-by: default avatarHauke Mehrtens <hauke@hauke-m.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      cffde201
    • Davide Caratti's avatar
      net/sched: act_skbedit: fix refcount leak when replace fails · 6191da98
      Davide Caratti authored
      when act_skbedit was converted to use RCU in the data plane, we added an
      error path, but we forgot to drop the action refcount in case of failure
      during a 'replace' operation:
      
       # tc actions add action skbedit ptype otherhost pass index 100
       # tc action show action skbedit
       total acts 1
      
               action order 0: skbedit  ptype otherhost pass
                index 100 ref 1 bind 0
       # tc actions replace action skbedit ptype otherhost drop index 100
       RTNETLINK answers: Cannot allocate memory
       We have an error talking to the kernel
       # tc action show action skbedit
       total acts 1
      
               action order 0: skbedit  ptype otherhost pass
                index 100 ref 2 bind 0
      
      Ensure we call tcf_idr_release(), in case 'params_new' allocation failed,
      also when the action is being replaced.
      
      Fixes: c749cdda ("net/sched: act_skbedit: don't use spinlock in the data path")
      Signed-off-by: default avatarDavide Caratti <dcaratti@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6191da98
    • Davide Caratti's avatar
      net/sched: act_ipt: fix refcount leak when replace fails · 8f67c90e
      Davide Caratti authored
      After commit 4e8ddd7f ("net: sched: don't release reference on action
      overwrite"), the error path of all actions was converted to drop refcount
      also when the action was being overwritten. But we forgot act_ipt_init(),
      in case allocation of 'tname' was not successful:
      
       # tc action add action xt -j LOG --log-prefix hello index 100
       tablename: mangle hook: NF_IP_POST_ROUTING
               target:  LOG level warning prefix "hello" index 100
       # tc action show action xt
       total acts 1
      
               action order 0: tablename: mangle  hook: NF_IP_POST_ROUTING
               target  LOG level warning prefix "hello"
               index 100 ref 1 bind 0
       # tc action replace action xt -j LOG --log-prefix world index 100
       tablename: mangle hook: NF_IP_POST_ROUTING
               target:  LOG level warning prefix "world" index 100
       RTNETLINK answers: Cannot allocate memory
       We have an error talking to the kernel
       # tc action show action xt
       total acts 1
      
               action order 0: tablename: mangle  hook: NF_IP_POST_ROUTING
               target  LOG level warning prefix "hello"
               index 100 ref 2 bind 0
      
      Ensure we call tcf_idr_release(), in case 'tname' allocation failed, also
      when the action is being replaced.
      
      Fixes: 4e8ddd7f ("net: sched: don't release reference on action overwrite")
      Signed-off-by: default avatarDavide Caratti <dcaratti@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8f67c90e
  4. 24 Feb, 2019 8 commits
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · c3619a48
      Linus Torvalds authored
      Pull KVM fixes from Paolo Bonzini:
       "Bug fixes"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: MMU: record maximum physical address width in kvm_mmu_extended_role
        kvm: x86: Return LA57 feature based on hardware capability
        x86/kvm/mmu: fix switch between root and guest MMUs
        s390: vsie: Use effective CRYCBD.31 to check CRYCBD validity
      c3619a48
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · c4eb1e18
      Linus Torvalds authored
      Pull networking fixes from David Miller:
       "Hopefully the last pull request for this release. Fingers crossed:
      
         1) Only refcount ESP stats on full sockets, from Martin Willi.
      
         2) Missing barriers in AF_UNIX, from Al Viro.
      
         3) RCU protection fixes in ipv6 route code, from Paolo Abeni.
      
         4) Avoid false positives in untrusted GSO validation, from Willem de
            Bruijn.
      
         5) Forwarded mesh packets in mac80211 need more tailroom allocated,
            from Felix Fietkau.
      
         6) Use operstate consistently for linkup in team driver, from George
            Wilkie.
      
         7) ThunderX bug fixes from Vadim Lomovtsev. Mostly races between VF
            and PF code paths.
      
         8) Purge ipv6 exceptions during netdevice removal, from Paolo Abeni.
      
         9) nfp eBPF code gen fixes from Jiong Wang.
      
        10) bnxt_en firmware timeout fix from Michael Chan.
      
        11) Use after free in udp/udpv6 error handlers, from Paolo Abeni.
      
        12) Fix a race in x25_bind triggerable by syzbot, from Eric Dumazet"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (65 commits)
        net: phy: realtek: Dummy IRQ calls for RTL8366RB
        tcp: repaired skbs must init their tso_segs
        net/x25: fix a race in x25_bind()
        net: dsa: Remove documentation for port_fdb_prepare
        Revert "bridge: do not add port to router list when receives query with source 0.0.0.0"
        selftests: fib_tests: sleep after changing carrier. again.
        net: set static variable an initial value in atl2_probe()
        net: phy: marvell10g: Fix Multi-G advertisement to only advertise 10G
        bpf, doc: add bpf list as secondary entry to maintainers file
        udp: fix possible user after free in error handler
        udpv6: fix possible user after free in error handler
        fou6: fix proto error handler argument type
        udpv6: add the required annotation to mib type
        mdio_bus: Fix use-after-free on device_register fails
        net: Set rtm_table to RT_TABLE_COMPAT for ipv6 for tables > 255
        bnxt_en: Wait longer for the firmware message response to complete.
        bnxt_en: Fix typo in firmware message timeout logic.
        nfp: bpf: fix ALU32 high bits clearance bug
        nfp: bpf: fix code-gen bug on BPF_ALU | BPF_XOR | BPF_K
        Documentation: networking: switchdev: Update port parent ID section
        ...
      c4eb1e18
    • Linus Walleij's avatar
      net: phy: realtek: Dummy IRQ calls for RTL8366RB · 4c8e0459
      Linus Walleij authored
      This fixes a regression introduced by
      commit 0d2e778e
      "net: phy: replace PHY_HAS_INTERRUPT with a check for
      config_intr and ack_interrupt".
      
      This assumes that a PHY cannot trigger interrupt unless
      it has .config_intr() or .ack_interrupt() implemented.
      A later patch makes the code assume both need to be
      implemented for interrupts to be present.
      
      But this PHY (which is inside a DSA) will happily
      fire interrupts without either callback.
      
      Implement dummy callbacks for .config_intr() and
      .ack_interrupt() in the phy header to fix this.
      
      Tested on the RTL8366RB on D-Link DIR-685.
      
      Fixes: 0d2e778e ("net: phy: replace PHY_HAS_INTERRUPT with a check for config_intr and ack_interrupt")
      Cc: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: default avatarLinus Walleij <linus.walleij@linaro.org>
      Reviewed-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4c8e0459
    • Eric Dumazet's avatar
      tcp: repaired skbs must init their tso_segs · bf50b606
      Eric Dumazet authored
      syzbot reported a WARN_ON(!tcp_skb_pcount(skb))
      in tcp_send_loss_probe() [1]
      
      This was caused by TCP_REPAIR sent skbs that inadvertenly
      were missing a call to tcp_init_tso_segs()
      
      [1]
      WARNING: CPU: 1 PID: 0 at net/ipv4/tcp_output.c:2534 tcp_send_loss_probe+0x771/0x8a0 net/ipv4/tcp_output.c:2534
      Kernel panic - not syncing: panic_on_warn set ...
      CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.0.0-rc7+ #77
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       <IRQ>
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x172/0x1f0 lib/dump_stack.c:113
       panic+0x2cb/0x65c kernel/panic.c:214
       __warn.cold+0x20/0x45 kernel/panic.c:571
       report_bug+0x263/0x2b0 lib/bug.c:186
       fixup_bug arch/x86/kernel/traps.c:178 [inline]
       fixup_bug arch/x86/kernel/traps.c:173 [inline]
       do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:271
       do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:290
       invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:973
      RIP: 0010:tcp_send_loss_probe+0x771/0x8a0 net/ipv4/tcp_output.c:2534
      Code: 88 fc ff ff 4c 89 ef e8 ed 75 c8 fb e9 c8 fc ff ff e8 43 76 c8 fb e9 63 fd ff ff e8 d9 75 c8 fb e9 94 f9 ff ff e8 bf 03 91 fb <0f> 0b e9 7d fa ff ff e8 b3 03 91 fb 0f b6 1d 37 43 7a 03 31 ff 89
      RSP: 0018:ffff8880ae907c60 EFLAGS: 00010206
      RAX: ffff8880a989c340 RBX: 0000000000000000 RCX: ffffffff85dedbdb
      RDX: 0000000000000100 RSI: ffffffff85dee0b1 RDI: 0000000000000005
      RBP: ffff8880ae907c90 R08: ffff8880a989c340 R09: ffffed10147d1ae1
      R10: ffffed10147d1ae0 R11: ffff8880a3e8d703 R12: ffff888091b90040
      R13: ffff8880a3e8d540 R14: 0000000000008000 R15: ffff888091b90860
       tcp_write_timer_handler+0x5c0/0x8a0 net/ipv4/tcp_timer.c:583
       tcp_write_timer+0x10e/0x1d0 net/ipv4/tcp_timer.c:607
       call_timer_fn+0x190/0x720 kernel/time/timer.c:1325
       expire_timers kernel/time/timer.c:1362 [inline]
       __run_timers kernel/time/timer.c:1681 [inline]
       __run_timers kernel/time/timer.c:1649 [inline]
       run_timer_softirq+0x652/0x1700 kernel/time/timer.c:1694
       __do_softirq+0x266/0x95a kernel/softirq.c:292
       invoke_softirq kernel/softirq.c:373 [inline]
       irq_exit+0x180/0x1d0 kernel/softirq.c:413
       exiting_irq arch/x86/include/asm/apic.h:536 [inline]
       smp_apic_timer_interrupt+0x14a/0x570 arch/x86/kernel/apic/apic.c:1062
       apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:807
       </IRQ>
      RIP: 0010:native_safe_halt+0x2/0x10 arch/x86/include/asm/irqflags.h:58
      Code: ff ff ff 48 89 c7 48 89 45 d8 e8 59 0c a1 fa 48 8b 45 d8 e9 ce fe ff ff 48 89 df e8 48 0c a1 fa eb 82 90 90 90 90 90 90 fb f4 <c3> 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 f4 c3 90 90 90 90 90 90
      RSP: 0018:ffff8880a98afd78 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff13
      RAX: 1ffffffff1125061 RBX: ffff8880a989c340 RCX: 0000000000000000
      RDX: dffffc0000000000 RSI: 0000000000000001 RDI: ffff8880a989cbbc
      RBP: ffff8880a98afda8 R08: ffff8880a989c340 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
      R13: ffffffff889282f8 R14: 0000000000000001 R15: 0000000000000000
       arch_cpu_idle+0x10/0x20 arch/x86/kernel/process.c:555
       default_idle_call+0x36/0x90 kernel/sched/idle.c:93
       cpuidle_idle_call kernel/sched/idle.c:153 [inline]
       do_idle+0x386/0x570 kernel/sched/idle.c:262
       cpu_startup_entry+0x1b/0x20 kernel/sched/idle.c:353
       start_secondary+0x404/0x5c0 arch/x86/kernel/smpboot.c:271
       secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:243
      Kernel Offset: disabled
      Rebooting in 86400 seconds..
      
      Fixes: 79861919 ("tcp: fix TCP_REPAIR xmit queue setup")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Cc: Andrey Vagin <avagin@openvz.org>
      Cc: Soheil Hassas Yeganeh <soheil@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Acked-by: default avatarSoheil Hassas Yeganeh <soheil@google.com>
      Acked-by: default avatarNeal Cardwell <ncardwell@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      bf50b606
    • Eric Dumazet's avatar
      net/x25: fix a race in x25_bind() · 797a22bd
      Eric Dumazet authored
      syzbot was able to trigger another soft lockup [1]
      
      I first thought it was the O(N^2) issue I mentioned in my
      prior fix (f657d22ee1f "net/x25: do not hold the cpu
      too long in x25_new_lci()"), but I eventually found
      that x25_bind() was not checking SOCK_ZAPPED state under
      socket lock protection.
      
      This means that multiple threads can end up calling
      x25_insert_socket() for the same socket, and corrupt x25_list
      
      [1]
      watchdog: BUG: soft lockup - CPU#0 stuck for 123s! [syz-executor.2:10492]
      Modules linked in:
      irq event stamp: 27515
      hardirqs last  enabled at (27514): [<ffffffff81006673>] trace_hardirqs_on_thunk+0x1a/0x1c
      hardirqs last disabled at (27515): [<ffffffff8100668f>] trace_hardirqs_off_thunk+0x1a/0x1c
      softirqs last  enabled at (32): [<ffffffff8632ee73>] x25_get_neigh+0xa3/0xd0 net/x25/x25_link.c:336
      softirqs last disabled at (34): [<ffffffff86324bc3>] x25_find_socket+0x23/0x140 net/x25/af_x25.c:341
      CPU: 0 PID: 10492 Comm: syz-executor.2 Not tainted 5.0.0-rc7+ #88
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:__sanitizer_cov_trace_pc+0x4/0x50 kernel/kcov.c:97
      Code: f4 ff ff ff e8 11 9f ea ff 48 c7 05 12 fb e5 08 00 00 00 00 e9 c8 e9 ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 55 48 89 e5 <48> 8b 75 08 65 48 8b 04 25 40 ee 01 00 65 8b 15 38 0c 92 7e 81 e2
      RSP: 0018:ffff88806e94fc48 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff13
      RAX: 1ffff1100d84dac5 RBX: 0000000000000001 RCX: ffffc90006197000
      RDX: 0000000000040000 RSI: ffffffff86324bf3 RDI: ffff88806c26d628
      RBP: ffff88806e94fc48 R08: ffff88806c1c6500 R09: fffffbfff1282561
      R10: fffffbfff1282560 R11: ffffffff89412b03 R12: ffff88806c26d628
      R13: ffff888090455200 R14: dffffc0000000000 R15: 0000000000000000
      FS:  00007f3a107e4700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f3a107e3db8 CR3: 00000000a5544000 CR4: 00000000001406f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       __x25_find_socket net/x25/af_x25.c:327 [inline]
       x25_find_socket+0x7d/0x140 net/x25/af_x25.c:342
       x25_new_lci net/x25/af_x25.c:355 [inline]
       x25_connect+0x380/0xde0 net/x25/af_x25.c:784
       __sys_connect+0x266/0x330 net/socket.c:1662
       __do_sys_connect net/socket.c:1673 [inline]
       __se_sys_connect net/socket.c:1670 [inline]
       __x64_sys_connect+0x73/0xb0 net/socket.c:1670
       do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x457e29
      Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007f3a107e3c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
      RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457e29
      RDX: 0000000000000012 RSI: 0000000020000200 RDI: 0000000000000005
      RBP: 000000000073c040 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007f3a107e46d4
      R13: 00000000004be362 R14: 00000000004ceb98 R15: 00000000ffffffff
      Sending NMI from CPU 0 to CPUs 1:
      NMI backtrace for cpu 1
      CPU: 1 PID: 10493 Comm: syz-executor.3 Not tainted 5.0.0-rc7+ #88
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:__read_once_size include/linux/compiler.h:193 [inline]
      RIP: 0010:queued_write_lock_slowpath+0x143/0x290 kernel/locking/qrwlock.c:86
      Code: 4c 8d 2c 01 41 83 c7 03 41 0f b6 45 00 41 38 c7 7c 08 84 c0 0f 85 0c 01 00 00 8b 03 3d 00 01 00 00 74 1a f3 90 41 0f b6 55 00 <41> 38 d7 7c eb 84 d2 74 e7 48 89 df e8 cc aa 4e 00 eb dd be 04 00
      RSP: 0018:ffff888085c47bd8 EFLAGS: 00000206
      RAX: 0000000000000300 RBX: ffffffff89412b00 RCX: 1ffffffff1282560
      RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff89412b00
      RBP: ffff888085c47c70 R08: 1ffffffff1282560 R09: fffffbfff1282561
      R10: fffffbfff1282560 R11: ffffffff89412b03 R12: 00000000000000ff
      R13: fffffbfff1282560 R14: 1ffff11010b88f7d R15: 0000000000000003
      FS:  00007fdd04086700(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007fdd04064db8 CR3: 0000000090be0000 CR4: 00000000001406e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       queued_write_lock include/asm-generic/qrwlock.h:104 [inline]
       do_raw_write_lock+0x1d6/0x290 kernel/locking/spinlock_debug.c:203
       __raw_write_lock_bh include/linux/rwlock_api_smp.h:204 [inline]
       _raw_write_lock_bh+0x3b/0x50 kernel/locking/spinlock.c:312
       x25_insert_socket+0x21/0xe0 net/x25/af_x25.c:267
       x25_bind+0x273/0x340 net/x25/af_x25.c:703
       __sys_bind+0x23f/0x290 net/socket.c:1481
       __do_sys_bind net/socket.c:1492 [inline]
       __se_sys_bind net/socket.c:1490 [inline]
       __x64_sys_bind+0x73/0xb0 net/socket.c:1490
       do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x457e29
      
      Fixes: 90c27297 ("X.25 remove bkl in bind")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: andrew hendry <andrew.hendry@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      797a22bd
    • Hauke Mehrtens's avatar
      net: dsa: Remove documentation for port_fdb_prepare · 99407d8f
      Hauke Mehrtens authored
      This callback was removed some time ago, also remove the documentation.
      
      Fixes: 1b6dd556 ("net: dsa: Remove prepare phase for FDB")
      Signed-off-by: default avatarHauke Mehrtens <hauke@hauke-m.de>
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      99407d8f
    • Hangbin Liu's avatar
      Revert "bridge: do not add port to router list when receives query with source 0.0.0.0" · 278e2148
      Hangbin Liu authored
      This reverts commit 5a2de63f ("bridge: do not add port to router list
      when receives query with source 0.0.0.0") and commit 0fe5119e ("net:
      bridge: remove ipv6 zero address check in mcast queries")
      
      The reason is RFC 4541 is not a standard but suggestive. Currently we
      will elect 0.0.0.0 as Querier if there is no ip address configured on
      bridge. If we do not add the port which recives query with source
      0.0.0.0 to router list, the IGMP reports will not be about to forward
      to Querier, IGMP data will also not be able to forward to dest.
      
      As Nikolay suggested, revert this change first and add a boolopt api
      to disable none-zero election in future if needed.
      Reported-by: default avatarLinus Lüssing <linus.luessing@c0d3.blue>
      Reported-by: default avatarSebastian Gottschall <s.gottschall@newmedia-net.de>
      Fixes: 5a2de63f ("bridge: do not add port to router list when receives query with source 0.0.0.0")
      Fixes: 0fe5119e ("net: bridge: remove ipv6 zero address check in mcast queries")
      Signed-off-by: default avatarHangbin Liu <liuhangbin@gmail.com>
      Acked-by: default avatarNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      278e2148
    • Thadeu Lima de Souza Cascardo's avatar
      selftests: fib_tests: sleep after changing carrier. again. · af548a27
      Thadeu Lima de Souza Cascardo authored
      Just like commit e2ba732a ("selftests: fib_tests: sleep after
      changing carrier"), wait one second to allow linkwatch to propagate the
      carrier change to the stack.
      
      There are two sets of carrier tests. The first slept after the carrier
      was set to off, and when the second set ran, it was likely that the
      linkwatch would be able to run again without much delay, reducing the
      likelihood of a race. However, if you run 'fib_tests.sh -t carrier' on a
      loop, you will quickly notice the failures.
      
      Sleeping on the second set of tests make the failures go away.
      
      Cc: David Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarThadeu Lima de Souza Cascardo <cascardo@canonical.com>
      Reviewed-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      af548a27
  5. 23 Feb, 2019 12 commits