1. 18 Jun, 2019 2 commits
  2. 17 Jun, 2019 33 commits
    • Merge branch 'UDP-GSO-audit-tests' · f97252a8
      David S. Miller authored
      Fred Klassen says:
      
      ====================
      UDP GSO audit tests
      
      Updates to UDP GSO selftests to optionally stress test the CMSG
      subsystem, and report the reliability and performance of both
      TX Timestamping and ZEROCOPY messages.
      ====================
      Acked-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/udpgso_bench.sh test fails on error · 4ffc37f5
      Fred Klassen authored
      Ensure that failure on any individual test results in an overall
      failure of the test script.
      Signed-off-by: Fred Klassen <fklassen@appneta.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/udpgso_bench.sh add UDP GSO audit tests · ade90d69
      Fred Klassen authored
      Audit tests count the total number of messages sent and compare it
      with the total number of CMSGs received on the error queue. Example:
      
          udp gso zerocopy timestamp audit
          udp rx:   1599 MB/s  1166414 calls/s
          udp tx:   1615 MB/s    27395 calls/s  27395 msg/s
          udp rx:   1634 MB/s  1192261 calls/s
          udp tx:   1633 MB/s    27699 calls/s  27699 msg/s
          udp rx:   1633 MB/s  1191358 calls/s
          udp tx:   1631 MB/s    27678 calls/s  27678 msg/s
          Summary over 4.000 seconds...
          sum udp tx:   1665 MB/s      82772 calls (27590/s)      82772 msgs (27590/s)
          Tx Timestamps:               82772 received                 0 errors
          Zerocopy acks:               82772 received
      
      An error is reported if the CMSG count does not equal the send
      count. Example:
      
          Summary over 4.000 seconds...
          sum tcp tx:   7451 MB/s     493706 calls (123426/s)     493706 msgs (123426/s)
          ./udpgso_bench_tx: Unexpected number of Zerocopy completions:    493706 expected    493704 received
      
      Also reduce individual test time from 4 to 3 seconds so that
      overall test time does not increase significantly.
      
      v3: Enhancements as per Willem de Bruijn <willemb@google.com>
          - document -P option for TCP audit
      Signed-off-by: Fred Klassen <fklassen@appneta.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
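
The audit in the commit above boils down to comparing counters once the run has finished. A minimal sketch in C, assuming hypothetical names (`struct audit_counters`, `audit_check`) and a hypothetical timestamp error message; none of this is taken from udpgso_bench_tx.c itself:

```c
#include <stdio.h>

/* Hypothetical counters; the real benchmark keeps its own state. */
struct audit_counters {
    unsigned long sent;          /* messages handed to the send call */
    unsigned long zerocopy_acks; /* zerocopy completions read from the error queue */
    unsigned long tx_timestamps; /* TX timestamp CMSGs read from the error queue */
};

/* Return 0 when every sent message was matched, 1 otherwise. */
static int audit_check(const struct audit_counters *c)
{
    int failed = 0;

    if (c->zerocopy_acks != c->sent) {
        fprintf(stderr,
                "Unexpected number of Zerocopy completions: %lu expected %lu received\n",
                c->sent, c->zerocopy_acks);
        failed = 1;
    }
    if (c->tx_timestamps != c->sent) {
        /* Message text is illustrative, not copied from the tool. */
        fprintf(stderr,
                "Unexpected number of TX timestamps: %lu expected %lu received\n",
                c->sent, c->tx_timestamps);
        failed = 1;
    }
    return failed;
}
```

A caller would fill the counters while sending and draining the error queue, then use the return value as the process exit status so that the wrapping shell script can fail on any mismatch.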
    • net/udpgso_bench_tx: options to exercise TX CMSG · 79ebc3c2
      Fred Klassen authored
      This enhancement adds options that facilitate load testing with
      additional TX CMSG options, and that optionally print the results
      of various send CMSG operations.
      
      These options are especially useful in isolating situations
      where error-queue messages are lost when combined with other
      CMSG operations (e.g. SO_ZEROCOPY).
      
      New options:
          -a - count all CMSG messages and match to sent messages
          -T - add TX CMSG that requests TX software timestamps
          -H - similar to -T except request TX hardware timestamps
          -P - call poll() before reading error queue
          -v - print detailed results
      
      v2: Enhancements as per Willem de Bruijn <willemb@google.com>
          - Updated control and buffer parameters for recvmsg
          - poll() parameter cleanup
          - fail on bad audit results
          - remove TOS options
          - improved reporting
      
      v3: Enhancements as per Willem de Bruijn <willemb@google.com>
          - add SOF_TIMESTAMPING_OPT_TSONLY to eliminate MSG_TRUNC
          - general code cleanup
      Signed-off-by: Fred Klassen <fklassen@appneta.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
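
For readers unfamiliar with the timestamping flags named in the commit above, here is a minimal sketch in C of requesting software TX timestamps with `SOF_TIMESTAMPING_OPT_TSONLY`. It is an assumption-laden illustration: the helper name `enable_tx_sw_timestamps` is hypothetical, and it sets the flags socket-wide with `setsockopt()`, whereas the commit describes requesting them through a TX CMSG.

```c
#include <linux/net_tstamp.h>
#include <sys/socket.h>

/*
 * Ask the kernel for software TX timestamps on an existing socket.
 * SOF_TIMESTAMPING_OPT_TSONLY makes the error-queue message carry only
 * the timestamp rather than a copy of the sent packet.
 * Returns 0 on success, -1 on error (errno is set by setsockopt).
 */
static int enable_tx_sw_timestamps(int fd)
{
    int flags = SOF_TIMESTAMPING_TX_SOFTWARE |
                SOF_TIMESTAMPING_SOFTWARE |
                SOF_TIMESTAMPING_OPT_TSONLY;

    return setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &flags, sizeof(flags));
}
```

The timestamps would then be read back with `recvmsg()` and `MSG_ERRQUEUE`, which is the path the new `-a` audit option counts.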
    • Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 29f785ff
      Linus Torvalds authored
      Pull vfs fixes from Al Viro:
       "MS_MOVE regression fix + breakage in fsmount(2) (also introduced in
        this cycle, along with fsmount(2) itself).
      
        I'm still digging through the piles of mail, so there might be more
        fixes to follow, but these two are obvious and self-contained, so
        there's no point delaying those..."
      
      * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        fs/namespace: fix unprivileged mount propagation
        vfs: fsmount: add missing mntget()
    • Merge branch 'net-ipv4-remove-erroneous-advancement-of-list-pointer' · 4bd366ce
      David S. Miller authored
      Florian Westphal says:
      
      ====================
      net: ipv4: remove erroneous advancement of list pointer
      
      Tariq reported a soft lockup on net-next that Mellanox was able to
      bisect to 2638eb8b ("net: ipv4: provide __rcu annotation for ifa_list").
      
      While reviewing above patch I found a regression when addresses have a
      lifetime specified.
      
      Second patch extends rtnetlink.sh to trigger crash
      (without first patch applied).
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: rtnetlink: add addresses with fixed life time · 3cfa1488
      Florian Westphal authored
      This exercises kernel code paths that deal with addresses that have
      a limited lifetime.
      
      Without the previous fix, this triggers the following crash on net-next:
       BUG: KASAN: null-ptr-deref in check_lifetime+0x403/0x670
       Read of size 8 at addr 0000000000000010 by task kworker [..]
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipv4: remove erroneous advancement of list pointer · 40008e92
      Florian Westphal authored
      Causes a crash when the lifetime expires on an address, as garbage
      is dereferenced soon after.
      
      This used to look like this:
      
       for (ifap = &ifa->ifa_dev->ifa_list;
            *ifap != NULL; ifap = &(*ifap)->ifa_next) {
                if (*ifap == ifa) ...
      
      but this was changed to:
      
      struct in_ifaddr *tmp;
      
      ifap = &ifa->ifa_dev->ifa_list;
      tmp = rtnl_dereference(*ifap);
      while (tmp) {
         tmp = rtnl_dereference(tmp->ifa_next); // Bogus
         if (rtnl_dereference(*ifap) == ifa) {
           ...
         ifap = &tmp->ifa_next;		// Can be NULL
         tmp = rtnl_dereference(*ifap);	// Dereference
         }
      }
      
      Remove the bogus assignment/list entry skip.
      
      Fixes: 2638eb8b ("net: ipv4: provide __rcu annotation for ifa_list")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
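
The pointer-to-pointer walk that the commit above restores can be illustrated outside the kernel. The following is a minimal userspace sketch in C, assuming a hypothetical `struct node` and helper `list_unlink`; it leaves out RCU and `rtnl_dereference()` entirely and only shows why the cursor must advance exactly one entry per iteration.

```c
#include <stddef.h>

/* Hypothetical singly linked list node, standing in for struct in_ifaddr. */
struct node {
    int value;
    struct node *next;
};

/*
 * Unlink `target` from the list rooted at *head.
 * `link` always points at the pointer that leads to the current node and
 * moves forward one entry per iteration. Advancing a second cursor as well
 * would skip entries and could leave a NULL pointer to be dereferenced.
 * Returns 1 if the node was found and unlinked, 0 otherwise.
 */
static int list_unlink(struct node **head, struct node *target)
{
    struct node **link;

    for (link = head; *link != NULL; link = &(*link)->next) {
        if (*link == target) {
            *link = target->next;
            target->next = NULL;
            return 1;
        }
    }
    return 0;
}
```

Because `link` is re-derived from the current node on every step, removing the last element or the head needs no special case.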
    • net: dsa: sja1105: fix ptp link error · 78fe8a28
      Arnd Bergmann authored
      Due to a reversed dependency, it is possible to build
      the lower ptp driver as a loadable module and the actual
      driver using it as built-in, causing a link error:
      
      drivers/net/dsa/sja1105/sja1105_spi.o: In function `sja1105_static_config_upload':
      sja1105_spi.c:(.text+0x6f0): undefined reference to `sja1105_ptp_reset'
      drivers/net/dsa/sja1105/sja1105_spi.o:(.data+0x2d4): undefined reference to `sja1105et_ptp_cmd'
      drivers/net/dsa/sja1105/sja1105_spi.o:(.data+0x604): undefined reference to `sja1105pqrs_ptp_cmd'
      drivers/net/dsa/sja1105/sja1105_main.o: In function `sja1105_remove':
      sja1105_main.c:(.text+0x8d4): undefined reference to `sja1105_ptp_clock_unregister'
      drivers/net/dsa/sja1105/sja1105_main.o: In function `sja1105_rxtstamp_work':
      sja1105_main.c:(.text+0x964): undefined reference to `sja1105_tstamp_reconstruct'
      drivers/net/dsa/sja1105/sja1105_main.o: In function `sja1105_setup':
      sja1105_main.c:(.text+0xb7c): undefined reference to `sja1105_ptp_clock_register'
      drivers/net/dsa/sja1105/sja1105_main.o: In function `sja1105_port_deferred_xmit':
      sja1105_main.c:(.text+0x1fa0): undefined reference to `sja1105_ptpegr_ts_poll'
      sja1105_main.c:(.text+0x1fc4): undefined reference to `sja1105_tstamp_reconstruct'
      drivers/net/dsa/sja1105/sja1105_main.o:(.rodata+0x5b0): undefined reference to `sja1105_get_ts_info'
      
      Change the Makefile logic to always build the ptp module
      the same way as the rest. Another option would be to
      just add it to the same module and remove the exports,
      but I don't know if there was a good reason to keep them
      separate.
      
      Fixes: bb77f36a ("net: dsa: sja1105: Add support for the PTP clock")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: stmmac: fix unused-variable warning · c63d1e5c
      Arnd Bergmann authored
      When building without CONFIG_OF, we get a harmless build warning:
      
      drivers/net/ethernet/stmicro/stmmac/stmmac_main.c: In function 'stmmac_phy_setup':
      drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:973:22: error: unused variable 'node' [-Werror=unused-variable]
        struct device_node *node = priv->plat->phy_node;
      
      Reword it so we always use the local variable, by making it the
      fwnode pointer instead of the device_node.
      
      Fixes: 74371272 ("net: stmmac: Convert to phylink and remove phylib logic")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · da0f3820
      Linus Torvalds authored
      Pull networking fixes from David Miller:
       "Lots of bug fixes here:
      
         1) Out of bounds access in __bpf_skc_lookup, from Lorenz Bauer.
      
         2) Fix rate reporting in cfg80211_calculate_bitrate_he(), from John
            Crispin.
      
         3) Use after free in psock backlog workqueue, from John Fastabend.
      
         4) Fix source port matching in fdb peer flow rule of mlx5, from Raed
            Salem.
      
         5) Use atomic_inc_not_zero() in fl6_sock_lookup(), from Eric Dumazet.
      
         6) Network header needs to be set for packet redirect in nfp, from
            John Hurley.
      
         7) Fix udp zerocopy refcnt, from Willem de Bruijn.
      
         8) Don't assume linear buffers in vxlan and geneve error handlers,
            from Stefano Brivio.
      
         9) Fix TOS matching in mlxsw, from Jiri Pirko.
      
        10) More SCTP cookie memory leak fixes, from Neil Horman.
      
        11) Fix VLAN filtering in rtl8366, from Linus Walleij.
      
        12) Various TCP SACK payload size and fragmentation memory limit fixes
            from Eric Dumazet.
      
        13) Use after free in pneigh_get_next(), also from Eric Dumazet.
      
        14) LAPB control block leak fix from Jeremy Sowden"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (145 commits)
        lapb: fixed leak of control-blocks.
        tipc: purge deferredq list for each grp member in tipc_group_delete
        ax25: fix inconsistent lock state in ax25_destroy_timer
        neigh: fix use-after-free read in pneigh_get_next
        tcp: fix compile error if !CONFIG_SYSCTL
        hv_sock: Suppress bogus "may be used uninitialized" warnings
        be2net: Fix number of Rx queues used for flow hashing
        net: handle 802.1P vlan 0 packets properly
        tcp: enforce tcp_min_snd_mss in tcp_mtu_probing()
        tcp: add tcp_min_snd_mss sysctl
        tcp: tcp_fragment() should apply sane memory limits
        tcp: limit payload size of sacked skbs
        Revert "net: phylink: set the autoneg state in phylink_phy_change"
        bpf: fix nested bpf tracepoints with per-cpu data
        bpf: Fix out of bounds memory access in bpf_sk_storage
        vsock/virtio: set SOCK_DONE on peer shutdown
        net: dsa: rtl8366: Fix up VLAN filtering
        net: phylink: set the autoneg state in phylink_phy_change
        net: add high_order_alloc_disable sysctl/static key
        tcp: add tcp_tx_skb_cache sysctl
        ...
    • fs/namespace: fix unprivileged mount propagation · d728cf79
      Christian Brauner authored
      When propagating mounts across mount namespaces owned by different user
      namespaces it is not possible anymore to move or umount the mount in the
      less privileged mount namespace.
      
      Here is a reproducer:
      
        sudo mount -t tmpfs tmpfs /mnt
        sudo mount --make-rshared /mnt
      
        # create unprivileged user + mount namespace and preserve propagation
        unshare -U -m --map-root --propagation=unchanged
      
        # now change back to the original mount namespace in another terminal:
        sudo mkdir /mnt/aaa
        sudo mount -t tmpfs tmpfs /mnt/aaa
      
        # now in the unprivileged user + mount namespace
        mount --move /mnt/aaa /opt
      
      Unfortunately, this is a pretty big deal for userspace since this is
      e.g. used to inject mounts into running unprivileged containers.
      So this regression really needs to go away rather quickly.
      
      The problem is that a recent change falsely locked the root of the newly
      added mounts by setting MNT_LOCKED. Fix this by only locking the mounts
      on copy_mnt_ns() and not when adding a new mount.
      
      Fixes: 3bd045cc ("separate copying and locking mount tree on cross-userns copies")
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: <stable@vger.kernel.org>
      Tested-by: Christian Brauner <christian@brauner.io>
      Acked-by: Christian Brauner <christian@brauner.io>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Christian Brauner <christian@brauner.io>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • vfs: fsmount: add missing mntget() · 1b0b9cc8
      Eric Biggers authored
      sys_fsmount() needs to take a reference to the new mount when adding it
      to the anonymous mount namespace.  Otherwise the filesystem can be
      unmounted while it's still in use, as found by syzkaller.
      Reported-by: Mark Rutland <mark.rutland@arm.com>
      Reported-by: syzbot+99de05d099a170867f22@syzkaller.appspotmail.com
      Reported-by: syzbot+7008b8b8ba7df475fdc8@syzkaller.appspotmail.com
      Fixes: 93766fbd ("vfs: syscall: Add fsmount() to create a mount for a superblock")
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • net: sched: cls_matchall: allow to delete filter · f517f271
      Jiri Pirko authored
      Currently the user is unable to delete the filter. See the following example:
      $ tc filter add dev ens16np1 ingress pref 1 handle 1 matchall action drop
      $ tc filter show dev ens16np1 ingress
      filter protocol all pref 1 matchall chain 0
      filter protocol all pref 1 matchall chain 0 handle 0x1
        in_hw
              action order 1: gact action drop
               random type none pass val 0
               index 1 ref 1 bind 1
      
      $ tc filter del dev ens16np1 ingress pref 1 handle 1 matchall action drop
      RTNETLINK answers: Operation not supported
      
      Implement tcf_proto_ops->delete() op and allow user to delete the filter.
      Reported-by: Eli Cohen <eli@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns3: fix dereference of ae_dev before it is null checked · ad9bf545
      Colin Ian King authored
      Pointer ae_dev is null checked, but before that check it is
      dereferenced when pointer ops is assigned. Fix this by assigning ops
      only after ae_dev has been null checked.
      
      Addresses-Coverity: ("Dereference before null check")
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
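
The ordering problem fixed in the commit above is a common pattern. A minimal sketch in C of the corrected order, assuming hypothetical types (`struct dev_ops`, `struct dev_handle`) and a hypothetical helper `do_reset` rather than the hns3 driver's real structures:

```c
#include <stddef.h>

/* Hypothetical stand-ins for the driver's device and ops structures. */
struct dev_ops {
    int (*reset)(void);
};

struct dev_handle {
    const struct dev_ops *ops;
};

/*
 * Check the outer pointer first and only then read a member through it.
 * Initialising `ops` from dev->ops in the declaration would dereference
 * `dev` before the NULL check had a chance to run.
 * Returns -1 if anything required is missing, else the callback's result.
 */
static int do_reset(struct dev_handle *dev)
{
    const struct dev_ops *ops;

    if (dev == NULL)
        return -1;

    ops = dev->ops; /* safe: dev is known to be non-NULL here */
    if (ops == NULL || ops->reset == NULL)
        return -1;

    return ops->reset();
}
```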
    • Merge branch 'net-sched-act_ctinfo-fixes' · 43321251
      David S. Miller authored
      Kevin Darbyshire-Bryant says:
      
      ====================
      net: sched: act_ctinfo: fixes
      
      This is first attempt at sending a small series.  Order is important
      because one bug (policy validation) prevents us from encountering the
      more important 'OOPS' generating bug in action creation.  Fix the OOPS
      first.
      
      Confession time: Until very recently, development of this module has
      been done on 'net-next' tree to 'clean compile' level with run-time
      testing on backports to 4.14 & 4.19 kernels under openwrt.  It turns out
      that sched: action: based code has been under more active change than I
      realised.
      
      During the back & forward porting during development & testing, the
      critical ACT_P_CREATED return code got missed despite being in the 4.14
      & 4.19 backports.  I have now gone through the init functions, using
      act_csum as reference with a fine toothed comb and am happy they do the
      same things.
      
      This issue hadn't been caught till now due to another issue caused by
      new strict nla_parse_nested function failing parsing validation before
      action creation.
      
      Thanks to Marcelo Leitner <marcelo.leitner@gmail.com> for flagging
      extack deficiency (fixed in 733f0766 sched: act_ctinfo: use extack
      error reporting) which led to b424e432 ("netlink: add validation of
      NLA_F_NESTED flag") and 8cb08174 ("netlink: make validation more
      configurable for future strictness") which led to the policy validation
      fix, which then led to the action creation fix both contained in this
      series.
      
      If I ever get to a developer conference please feel free to
      tar/feather/apply cone of shame.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: act_ctinfo: fix policy validation · c197d636
      Kevin Darbyshire-Bryant authored
      Fix nla_policy definition by specifying an exact length type attribute
      to CTINFO action parameter block structure.  Without this change,
      netlink parsing will fail validation and the action will not be
      instantiated.
      
      8cb08174 ("netlink: make validation more configurable for future")
      introduced much stricter checking to attributes being passed via
      netlink.  Existing actions were updated to use less restrictive
      deprecated versions of nla_parse_nested.
      
      As a new module, act_ctinfo should be designed to use the strict
      checking model otherwise, well, what was the point of implementing it.
      
      Confession time: Until very recently, development of this module has
      been done on 'net-next' tree to 'clean compile' level with run-time
      testing on backports to 4.14 & 4.19 kernels under openwrt.  This is how
      I managed to miss the run-time impacts of the new strict
      nla_parse_nested function.  I hopefully have learned something from this
      (glances toward laptop running a net-next kernel)
      
      There is however a still outstanding implication on iproute2 user space
      in that it needs to be told to pass nested netlink messages with the
      nested attribute actually set.  So even with this kernel fix to do
      things correctly you still cannot instantiate a new 'strict'
      nla_parse_nested based action such as act_ctinfo with iproute2's tc.
      Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: act_ctinfo: fix action creation · a658c2e4
      Kevin Darbyshire-Bryant authored
      Use correct return value on action creation: ACT_P_CREATED.
      
      The use of incorrect return value could result in a situation where the
      system thought a ctinfo module was listening but actually wasn't
      instantiated correctly leading to an OOPS in tcf_generic_walker().
      
      Confession time: Until very recently, development of this module has
      been done on 'net-next' tree to 'clean compile' level with run-time
      testing on backports to 4.14 & 4.19 kernels under openwrt.  During the
      back & forward porting during development & testing, the critical
      ACT_P_CREATED return code got missed despite being in the 4.14 & 4.19
      backports.  I have now gone through the init functions, using act_csum
      as reference with a fine toothed comb.  Bonus, no more OOPSes.  I
      managed to also miss this issue till now due to the new strict
      nla_parse_nested function failing validation before action creation.
      
      As an inexperienced developer I've learned that
      copy/pasting/backporting/forward porting code correctly is hard.  If I
      ever get to a developer conference I shall don the cone of shame.
      Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vhost_net: disable zerocopy by default · 098eadce
      Jason Wang authored
      Vhost_net is known to suffer from HOL[1] issues which are not easy
      to fix. Several downstreams disable the feature by default. What's
      more, the datapath was split and the datacopy path recently gained
      batching and XDP support, which makes it faster than the zerocopy
      path for small packet transmission.
      
      It looks to me that disabling zerocopy by default is more
      appropriate. It could be enabled by default again in the future if we
      fix the above issues.
      
      [1] https://patchwork.kernel.org/patch/3787671/
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipv4: move tcp_fastopen server side code to SipHash library · c681edae
      Ard Biesheuvel authored
      Using a bare block cipher in non-crypto code is almost always a bad idea,
      not only for security reasons (and we've seen some examples of this in
      the kernel in the past), but also for performance reasons.
      
      In the TCP fastopen case, we call into the bare AES block cipher one or
      two times (depending on whether the connection is IPv4 or IPv6). On most
      systems, this results in a call chain such as
      
        crypto_cipher_encrypt_one(ctx, dst, src)
          crypto_cipher_crt(tfm)->cit_encrypt_one(crypto_cipher_tfm(tfm), ...);
            aesni_encrypt
              kernel_fpu_begin();
              aesni_enc(ctx, dst, src); // asm routine
              kernel_fpu_end();
      
      It is highly unlikely that the use of special AES instructions has a
      benefit in this case, especially since we are doing the above twice
      for IPv6 connections, instead of using a transform which can process
      the entire input in one go.
      
      We could switch to the cbcmac(aes) shash, which would at least get
      rid of the duplicated overhead in *some* cases (i.e., today, only
      arm64 has an accelerated implementation of cbcmac(aes), while x86 will
      end up using the generic cbcmac template wrapping the AES-NI cipher,
      which basically ends up doing exactly the above). However, in the given
      context, it makes more sense to use a light-weight MAC algorithm that
      is more suitable for the purpose at hand, such as SipHash.
      
      Since the output size of SipHash already matches our chosen value for
      TCP_FASTOPEN_COOKIE_SIZE, and given that it accepts arbitrary input
      sizes, this greatly simplifies the code as well.
      
      NOTE: Server farms backing a single server IP for load balancing purposes
            and sharing a single fastopen key will be adversely affected by
            this change unless all systems in the pool receive their kernel
            upgrades at the same time.
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: include retrans failure detection for unicast · 6a6b5c8b
      Tuong Lien authored
      In the patch series containing commit 9195948f ("tipc: improve TIPC
      throughput by Gap ACK blocks"), detection of repeated retransmit
      failures in the function "tipc_link_retrans()" was, for simplicity,
      kept for broadcast retransmissions only.
      
      This commit now reapplies this feature to link unicast retransmissions,
      which are done via the function "tipc_link_advance_transmq()".
      
      Also, "tipc_link_retrans()" is renamed to "tipc_link_bc_retrans()"
      as it is used only for broadcast.
      Acked-by: Jon Maloy <jon.maloy@ericsson.se>
      Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • team: add ethtool get_link_ksettings · 9ed68ca0
      Hangbin Liu authored
      Like bond, add ethtool get_link_ksettings to show the total speed.
      
      v2: no update, just repost.
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'tcp-fixes' · 4fddbf8a
      David S. Miller authored
      Eric Dumazet says:
      
      ====================
      tcp: make sack processing more robust
      
      Jonathan Looney brought to our attention multiple problems
      in TCP stack at the sender side.
      
      SACK processing can be abused by malicious peers to either
      cause overflows or increase memory usage.
      
      First two patches fix the immediate problems.
      
      Since the malicious peers abuse senders by advertising a very
      small MSS in their SYN or SYNACK packet, the last two
      patches add a new sysctl so that admins can choose a higher
      limit for MSS clamping.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
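
To see why a very small advertised MSS is costly for the sender, it helps to count how many segments a fixed amount of data is split into. The following is only an arithmetic sketch in C; the helper name `segments_needed` and the byte counts in the usage note are illustrative assumptions and do not come from the patches.

```c
/*
 * Number of segments needed to carry `len` payload bytes when each
 * segment holds at most `mss` bytes (ceiling division).
 * Returns 0 for an invalid MSS of 0.
 */
static unsigned long segments_needed(unsigned long len, unsigned long mss)
{
    if (mss == 0)
        return 0;
    return (len + mss - 1) / mss;
}
```

With these assumed numbers, 65536 bytes at an MSS of 1460 need 45 segments, while the same data at an MSS of 8 needs 8192, each with its own per-segment bookkeeping on the sender. Raising the floor for MSS clamping bounds that growth.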
    • Merge tag 'riscv-for-v5.2/fixes-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · eb7c825b
      Linus Torvalds authored
      Pull RISC-V fixes from Paul Walmsley:
       "This contains fixes, defconfig, and DT data changes for the v5.2-rc
        series.
      
        The fixes are relatively straightforward:
      
         - Addition of a TLB fence in the vmalloc_fault path, so the CPU
           doesn't enter an infinite page fault loop
      
         - Readdition of the pm_power_off export, so device drivers that
           reassign it can now be built as modules
      
         - A udelay() fix for RV32, fixing a miscomputation of the delay time
      
         - Removal of deprecated smp_mb__*() barriers
      
        This also adds initial DT data infrastructure for arch/riscv, along
        with initial data for the SiFive FU540-C000 SoC and the corresponding
        HiFive Unleashed board.
      
        We also update the RV64 defconfig to include some core drivers for the
        FU540 in the build"
      
      * tag 'riscv-for-v5.2/fixes-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        riscv: remove unused barrier defines
        riscv: mm: synchronize MMU after pte change
        riscv: dts: add initial board data for the SiFive HiFive Unleashed
        riscv: dts: add initial support for the SiFive FU540-C000 SoC
        dt-bindings: riscv: convert cpu binding to json-schema
        dt-bindings: riscv: sifive: add YAML documentation for the SiFive FU540
        arch: riscv: add support for building DTB files from DT source data
        riscv: Fix udelay in RV32.
        riscv: export pm_power_off again
        RISC-V: defconfig: enable clocks, serial console
    • riscv: remove unused barrier defines · 259931fd
      Rolf Eike Beer authored
      They were introduced in commit fab957c1 ("RISC-V: Atomic and
      Locking Code") long after commit 2e39465a ("locking: Remove
      deprecated smp_mb__() barriers") removed the remnants of all previous
      instances from the tree.
      Signed-off-by: Rolf Eike Beer <eb@emlix.com>
      [paul.walmsley@sifive.com: stripped spurious mbox header from patch
       description; fixed commit references in patch header]
      Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
    • riscv: mm: synchronize MMU after pte change · bf587caa
      ShihPo Hung authored
      Because RISC-V compliant implementations can cache invalid entries
      in the TLB, an SFENCE.VMA is necessary after changes to the page table.
      This patch adds an SFENCE.VMA for the vmalloc_fault path.
      Signed-off-by: ShihPo Hung <shihpo.hung@sifive.com>
      [paul.walmsley@sifive.com: reversed tab->whitespace conversion,
       wrapped comment lines]
      Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Palmer Dabbelt <palmer@sifive.com>
      Cc: Albert Ou <aou@eecs.berkeley.edu>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: linux-riscv@lists.infradead.org
      Cc: stable@vger.kernel.org
    • riscv: dts: add initial board data for the SiFive HiFive Unleashed · c35f1b87
      Paul Walmsley authored
      Add initial board data for the SiFive HiFive Unleashed A00.
      
      Currently the data populated in this DT file describes the board
      DRAM configuration and the external clock sources that supply the
      PRCI.
      Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
      Signed-off-by: Paul Walmsley <paul@pwsan.com>
      Tested-by: Loys Ollivier <lollivier@baylibre.com>
      Tested-by: Kevin Hilman <khilman@baylibre.com>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Palmer Dabbelt <palmer@sifive.com>
      Cc: Albert Ou <aou@eecs.berkeley.edu>
      Cc: Antony Pavlov <antonynpavlov@gmail.com>
      Cc: devicetree@vger.kernel.org
      Cc: linux-riscv@lists.infradead.org
      Cc: linux-kernel@vger.kernel.org
    • Paul Walmsley's avatar
      riscv: dts: add initial support for the SiFive FU540-C000 SoC · 72296bde
      Paul Walmsley authored
      Add initial support for the SiFive FU540-C000 SoC.  This is a 28nm SoC
      based around the SiFive U54-MC core complex and a TileLink
      interconnect.
      
      This file is expected to grow as more device drivers are added to the
      kernel.
      
      This patch includes a fix to the QSPI memory map due to a
      documentation bug, found by ShihPo Hung <shihpo.hung@sifive.com>, adds
      entries for the I2C controller, and merges all DT changes that
      formerly were made dynamically by the riscv-pk BBL proxy kernel.
      Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
      Signed-off-by: Paul Walmsley <paul@pwsan.com>
      Tested-by: Loys Ollivier <lollivier@baylibre.com>
      Tested-by: Kevin Hilman <khilman@baylibre.com>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Palmer Dabbelt <palmer@sifive.com>
      Cc: Albert Ou <aou@eecs.berkeley.edu>
      Cc: ShihPo Hung <shihpo.hung@sifive.com>
      Cc: devicetree@vger.kernel.org
      Cc: linux-riscv@lists.infradead.org
      Cc: linux-kernel@vger.kernel.org
      72296bde
    • Paul Walmsley's avatar
      dt-bindings: riscv: convert cpu binding to json-schema · 4fd669a8
      Paul Walmsley authored
      At Rob's request, we're starting to migrate our DT binding
      documentation to json-schema YAML format.  Start by converting our cpu
      binding documentation.  While doing so, document more properties and
      nodes.  This includes adding binding documentation support for the E51
      and U54 CPU cores ("harts") that are present on this SoC.  These cores
      are described in:
      
          https://static.dev.sifive.com/FU540-C000-v1.0.pdf
      
      This cpus.yaml file is intended to be a starting point and to
      evolve over time.  It passes dt-doc-validate as of the yaml-bindings
      commit 4c79d42e9216.
      
      This patch was originally based on the ARM json-schema binding
      documentation as added by commit 672951cb ("dt-bindings: arm: Convert
      cpu binding to json-schema").
      Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
      Signed-off-by: Paul Walmsley <paul@pwsan.com>
      Reviewed-by: Rob Herring <robh@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Cc: devicetree@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-riscv@lists.infradead.org
      4fd669a8
    • Paul Walmsley's avatar
      dt-bindings: riscv: sifive: add YAML documentation for the SiFive FU540 · c7af5598
      Paul Walmsley authored
      Add YAML DT binding documentation for the SiFive FU540 SoC.  This
      SoC is documented at:
      
          https://static.dev.sifive.com/FU540-C000-v1.0.pdf
      
      Passes dt-doc-validate, as of yaml-bindings commit 4c79d42e9216.
      Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
      Signed-off-by: Paul Walmsley <paul@pwsan.com>
      Reviewed-by: Rob Herring <robh@kernel.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Palmer Dabbelt <palmer@sifive.com>
      Cc: Albert Ou <aou@eecs.berkeley.edu>
      Cc: devicetree@vger.kernel.org
      Cc: linux-riscv@lists.infradead.org
      Cc: linux-kernel@vger.kernel.org
      c7af5598
    • Paul Walmsley's avatar
      arch: riscv: add support for building DTB files from DT source data · 8d4e048d
      Paul Walmsley authored
      Similar to ARM64, add support for building DTB files from DT source
      data for RISC-V boards.
      
      This patch starts with the infrastructure needed for SiFive boards.
      Boards from other vendors would add support here in a similar form.
      Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
      Signed-off-by: Paul Walmsley <paul@pwsan.com>
      Tested-by: Loys Ollivier <lollivier@baylibre.com>
      Tested-by: Kevin Hilman <khilman@baylibre.com>
      Cc: Palmer Dabbelt <palmer@sifive.com>
      Cc: Albert Ou <aou@eecs.berkeley.edu>
      8d4e048d
    • Jeremy Sowden's avatar
      lapb: fixed leak of control-blocks. · 6be8e297
      Jeremy Sowden authored
      lapb_register calls lapb_create_cb, which initializes the control-
      block's ref-count to one, and __lapb_insert_cb, which increments it when
      adding the new block to the list of blocks.
      
      lapb_unregister calls __lapb_remove_cb, which decrements the ref-count
      when removing control-block from the list of blocks, and calls lapb_put
      itself to decrement the ref-count before returning.
      
      However, lapb_unregister also calls __lapb_devtostruct to look up the
      right control-block for the given net_device, and __lapb_devtostruct
      also bumps the ref-count, which means that when lapb_unregister returns
      the ref-count is still 1 and the control-block is leaked.
      
      Call lapb_put after __lapb_devtostruct to fix leak.
      
      Reported-by: syzbot+afb980676c836b4a0afa@syzkaller.appspotmail.com
      Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6be8e297
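      The reference counting described in the lapb commit above can be
      modelled in user space. This is only a sketch: struct cb and the
      model_* helpers are hypothetical stand-ins for the kernel's
      control-block, lapb_register, __lapb_devtostruct and lapb_unregister,
      not the actual kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model of a ref-counted control block. */
struct cb {
	int refcnt;
	int on_list;
	int freed;
};

static void cb_hold(struct cb *c)
{
	c->refcnt++;
}

static void cb_put(struct cb *c)
{
	assert(c->refcnt > 0);
	if (--c->refcnt == 0)
		c->freed = 1;	/* the kernel would free the block here */
}

/* Models lapb_create_cb (count starts at 1) plus __lapb_insert_cb (+1). */
static void model_register(struct cb *c)
{
	c->refcnt = 1;
	c->freed = 0;
	cb_hold(c);
	c->on_list = 1;
}

/* Models __lapb_devtostruct: a successful lookup takes a reference. */
static struct cb *model_lookup(struct cb *c)
{
	if (!c->on_list)
		return NULL;
	cb_hold(c);
	return c;
}

/*
 * Models lapb_unregister. With with_fix set, the reference taken by the
 * lookup is dropped, as the commit describes. Returns the remaining count.
 */
static int model_unregister(struct cb *c, int with_fix)
{
	struct cb *found = model_lookup(c);

	if (!found)
		return -1;
	if (with_fix)
		cb_put(found);	/* drop the lookup reference */
	found->on_list = 0;
	cb_put(found);		/* list removal drops the list reference */
	cb_put(found);		/* final put before returning */
	return found->refcnt;
}
```

      Without the extra put the model ends with a count of 1 and the block
      is never freed; with it the count reaches 0.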
    • Xin Long's avatar
      tipc: purge deferredq list for each grp member in tipc_group_delete · 5cf02612
      Xin Long authored
      Syzbot reported a memleak caused by grp members' deferredq lists not
      being purged when the grp is deleted.
      
      The issue occurs when more(msg_grp_bc_seqno(hdr), m->bc_rcv_nxt) is
      true in tipc_group_filter_msg(), so the skb stays in deferredq.
      
      So fix it by calling __skb_queue_purge for each member's deferredq
      in tipc_group_delete() when a tipc sk leaves the grp.
      
      Fixes: b87a5ea3 ("tipc: guarantee group unicast doesn't bypass group broadcast")
      Reported-by: syzbot+78fbe679c8ca8d264a8d@syzkaller.appspotmail.com
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5cf02612
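      The tipc fix above follows a general pattern: a container that owns
      per-member queues must purge each queue before freeing the member. A
      minimal user-space sketch, assuming hypothetical types (struct buf,
      struct member, struct group) and a live_bufs counter that exists only
      to make the leak observable; none of this is TIPC code.

```c
#include <stdlib.h>

static int live_bufs;	/* outstanding buffer allocations (illustration only) */

struct buf {
	struct buf *next;
};

struct member {
	struct buf *deferredq;	/* buffers parked until they can be delivered */
	struct member *next;
};

struct group {
	struct member *members;
};

/* Park one buffer on a member's deferred queue. */
static void defer_buf(struct member *m)
{
	struct buf *b = malloc(sizeof(*b));

	if (!b)
		return;
	b->next = m->deferredq;
	m->deferredq = b;
	live_bufs++;
}

/* Free every buffer still sitting on the member's deferred queue. */
static void queue_purge(struct member *m)
{
	while (m->deferredq) {
		struct buf *b = m->deferredq;

		m->deferredq = b->next;
		free(b);
		live_bufs--;
	}
}

static struct member *add_member(struct group *g)
{
	struct member *m = calloc(1, sizeof(*m));

	if (!m)
		return NULL;
	m->next = g->members;
	g->members = m;
	return m;
}

/* Delete the group: purge each member's queue before freeing the member. */
static void group_delete(struct group *g)
{
	while (g->members) {
		struct member *m = g->members;

		g->members = m->next;
		queue_purge(m);	/* skipping this step leaks the parked buffers */
		free(m);
	}
}
```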
  3. 16 Jun, 2019 5 commits
    • Willem de Bruijn's avatar
      selftests/net: fix warnings in TFO key rotation selftest · f464100f
      Willem de Bruijn authored
      One warning each on signedness, unused variable and return type.
      
      Fixes: 10fbcdd1 ("selftests/net: add TFO key rotation selftest")
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f464100f
    • Jeremy Sowden's avatar
      x25_asy: fixed function name in error message. · 8e6a4817
      Jeremy Sowden authored
      Replaced incorrect hard-coded function-name in error message with
      __func__.
      Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8e6a4817
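      The x25_asy change above relies on __func__, the C99 predefined
      identifier that expands to the name of the enclosing function, so the
      message stays correct if the function is renamed. A small user-space
      sketch; demo_report_failure and the message text are hypothetical and
      not taken from the driver.

```c
#include <stdio.h>

/*
 * Format an error message prefixed with the name of this function.
 * __func__ is filled in by the compiler, so no name is hard-coded.
 */
static int demo_report_failure(char *buf, size_t len)
{
	return snprintf(buf, len, "%s: operation failed", __func__);
}
```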
    • Jeremy Sowden's avatar
      lapb: moved export of lapb_register. · 4201c926
      Jeremy Sowden authored
      The EXPORT_SYMBOL for lapb_register was next to a different function.
      Moved it to the right place.
      Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4201c926
    • Eric Dumazet's avatar
      ax25: fix inconsistent lock state in ax25_destroy_timer · d4d5d8e8
      Eric Dumazet authored
      Before a thread in process context uses bh_lock_sock(),
      we must disable bh.
      
      syzbot reported:
      
      WARNING: inconsistent lock state
      5.2.0-rc3+ #32 Not tainted
      
      inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
      blkid/26581 [HC0[0]:SC1[1]:HE1:SE0] takes:
      00000000e0da85ee (slock-AF_AX25){+.?.}, at: spin_lock include/linux/spinlock.h:338 [inline]
      00000000e0da85ee (slock-AF_AX25){+.?.}, at: ax25_destroy_timer+0x53/0xc0 net/ax25/af_ax25.c:275
      {SOFTIRQ-ON-W} state was registered at:
        lock_acquire+0x16f/0x3f0 kernel/locking/lockdep.c:4303
        __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
        _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151
        spin_lock include/linux/spinlock.h:338 [inline]
        ax25_rt_autobind+0x3ca/0x720 net/ax25/ax25_route.c:429
        ax25_connect.cold+0x30/0xa4 net/ax25/af_ax25.c:1221
        __sys_connect+0x264/0x330 net/socket.c:1834
        __do_sys_connect net/socket.c:1845 [inline]
        __se_sys_connect net/socket.c:1842 [inline]
        __x64_sys_connect+0x73/0xb0 net/socket.c:1842
        do_syscall_64+0xfd/0x680 arch/x86/entry/common.c:301
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      irq event stamp: 2272
      hardirqs last  enabled at (2272): [<ffffffff810065f3>] trace_hardirqs_on_thunk+0x1a/0x1c
      hardirqs last disabled at (2271): [<ffffffff8100660f>] trace_hardirqs_off_thunk+0x1a/0x1c
      softirqs last  enabled at (1522): [<ffffffff87400654>] __do_softirq+0x654/0x94c kernel/softirq.c:320
      softirqs last disabled at (2267): [<ffffffff81449010>] invoke_softirq kernel/softirq.c:374 [inline]
      softirqs last disabled at (2267): [<ffffffff81449010>] irq_exit+0x180/0x1d0 kernel/softirq.c:414
      
      other info that might help us debug this:
       Possible unsafe locking scenario:
      
             CPU0
             ----
        lock(slock-AF_AX25);
        <Interrupt>
          lock(slock-AF_AX25);
      
       *** DEADLOCK ***
      
      1 lock held by blkid/26581:
       #0: 0000000010fd154d ((&ax25->dtimer)){+.-.}, at: lockdep_copy_map include/linux/lockdep.h:175 [inline]
       #0: 0000000010fd154d ((&ax25->dtimer)){+.-.}, at: call_timer_fn+0xe0/0x720 kernel/time/timer.c:1312
      
      stack backtrace:
      CPU: 1 PID: 26581 Comm: blkid Not tainted 5.2.0-rc3+ #32
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       <IRQ>
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x172/0x1f0 lib/dump_stack.c:113
       print_usage_bug.cold+0x393/0x4a2 kernel/locking/lockdep.c:2935
       valid_state kernel/locking/lockdep.c:2948 [inline]
       mark_lock_irq kernel/locking/lockdep.c:3138 [inline]
       mark_lock+0xd46/0x1370 kernel/locking/lockdep.c:3513
       mark_irqflags kernel/locking/lockdep.c:3391 [inline]
       __lock_acquire+0x159f/0x5490 kernel/locking/lockdep.c:3745
       lock_acquire+0x16f/0x3f0 kernel/locking/lockdep.c:4303
       __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
       _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151
       spin_lock include/linux/spinlock.h:338 [inline]
       ax25_destroy_timer+0x53/0xc0 net/ax25/af_ax25.c:275
       call_timer_fn+0x193/0x720 kernel/time/timer.c:1322
       expire_timers kernel/time/timer.c:1366 [inline]
       __run_timers kernel/time/timer.c:1685 [inline]
       __run_timers kernel/time/timer.c:1653 [inline]
       run_timer_softirq+0x66f/0x1740 kernel/time/timer.c:1698
       __do_softirq+0x25c/0x94c kernel/softirq.c:293
       invoke_softirq kernel/softirq.c:374 [inline]
       irq_exit+0x180/0x1d0 kernel/softirq.c:414
       exiting_irq arch/x86/include/asm/apic.h:536 [inline]
       smp_apic_timer_interrupt+0x13b/0x550 arch/x86/kernel/apic/apic.c:1068
       apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:806
       </IRQ>
      RIP: 0033:0x7f858d5c3232
      Code: 8b 61 08 48 8b 84 24 d8 00 00 00 4c 89 44 24 28 48 8b ac 24 d0 00 00 00 4c 8b b4 24 e8 00 00 00 48 89 7c 24 68 48 89 4c 24 78 <48> 89 44 24 58 8b 84 24 e0 00 00 00 89 84 24 84 00 00 00 8b 84 24
      RSP: 002b:00007ffcaf0cf5c0 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13
      RAX: 00007f858d7d27a8 RBX: 00007f858d7d8820 RCX: 00007f858d3940d8
      RDX: 00007ffcaf0cf798 RSI: 00000000f5e616f3 RDI: 00007f858d394fee
      RBP: 0000000000000000 R08: 00007ffcaf0cf780 R09: 00007f858d7db480
      R10: 0000000000000000 R11: 0000000009691a75 R12: 0000000000000005
      R13: 00000000f5e616f3 R14: 0000000000000000 R15: 00007ffcaf0cf798
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d4d5d8e8
    • Eric Dumazet's avatar
      neigh: fix use-after-free read in pneigh_get_next · f3e92cb8
      Eric Dumazet authored
      Nine years ago, I added RCU handling to neighbours, not pneighbours.
      (pneigh are not commonly used)
      
      Unfortunately, I missed that /proc dump operations would use
      common entry and exit points: neigh_seq_start() and neigh_seq_stop().
      
      We need to read_lock(tbl->lock) or risk use-after-free while
      iterating the pneigh structures.
      
      We might later convert pneigh to RCU and revert this patch.
      
      syzbot reported:
      
      BUG: KASAN: use-after-free in pneigh_get_next.isra.0+0x24b/0x280 net/core/neighbour.c:3158
      Read of size 8 at addr ffff888097f2a700 by task syz-executor.0/9825
      
      CPU: 1 PID: 9825 Comm: syz-executor.0 Not tainted 5.2.0-rc4+ #32
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x172/0x1f0 lib/dump_stack.c:113
       print_address_description.cold+0x7c/0x20d mm/kasan/report.c:188
       __kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
       kasan_report+0x12/0x20 mm/kasan/common.c:614
       __asan_report_load8_noabort+0x14/0x20 mm/kasan/generic_report.c:132
       pneigh_get_next.isra.0+0x24b/0x280 net/core/neighbour.c:3158
       neigh_seq_next+0xdb/0x210 net/core/neighbour.c:3240
       seq_read+0x9cf/0x1110 fs/seq_file.c:258
       proc_reg_read+0x1fc/0x2c0 fs/proc/inode.c:221
       do_loop_readv_writev fs/read_write.c:714 [inline]
       do_loop_readv_writev fs/read_write.c:701 [inline]
       do_iter_read+0x4a4/0x660 fs/read_write.c:935
       vfs_readv+0xf0/0x160 fs/read_write.c:997
       kernel_readv fs/splice.c:359 [inline]
       default_file_splice_read+0x475/0x890 fs/splice.c:414
       do_splice_to+0x127/0x180 fs/splice.c:877
       splice_direct_to_actor+0x2d2/0x970 fs/splice.c:954
       do_splice_direct+0x1da/0x2a0 fs/splice.c:1063
       do_sendfile+0x597/0xd00 fs/read_write.c:1464
       __do_sys_sendfile64 fs/read_write.c:1525 [inline]
       __se_sys_sendfile64 fs/read_write.c:1511 [inline]
       __x64_sys_sendfile64+0x1dd/0x220 fs/read_write.c:1511
       do_syscall_64+0xfd/0x680 arch/x86/entry/common.c:301
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x4592c9
      Code: fd b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 cb b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007f4aab51dc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000028
      RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00000000004592c9
      RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000005
      RBP: 000000000075bf20 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000080000000 R11: 0000000000000246 R12: 00007f4aab51e6d4
      R13: 00000000004c689d R14: 00000000004db828 R15: 00000000ffffffff
      
      Allocated by task 9827:
       save_stack+0x23/0x90 mm/kasan/common.c:71
       set_track mm/kasan/common.c:79 [inline]
       __kasan_kmalloc mm/kasan/common.c:489 [inline]
       __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:462
       kasan_kmalloc+0x9/0x10 mm/kasan/common.c:503
       __do_kmalloc mm/slab.c:3660 [inline]
       __kmalloc+0x15c/0x740 mm/slab.c:3669
       kmalloc include/linux/slab.h:552 [inline]
       pneigh_lookup+0x19c/0x4a0 net/core/neighbour.c:731
       arp_req_set_public net/ipv4/arp.c:1010 [inline]
       arp_req_set+0x613/0x720 net/ipv4/arp.c:1026
       arp_ioctl+0x652/0x7f0 net/ipv4/arp.c:1226
       inet_ioctl+0x2a0/0x340 net/ipv4/af_inet.c:926
       sock_do_ioctl+0xd8/0x2f0 net/socket.c:1043
       sock_ioctl+0x3ed/0x780 net/socket.c:1194
       vfs_ioctl fs/ioctl.c:46 [inline]
       file_ioctl fs/ioctl.c:509 [inline]
       do_vfs_ioctl+0xd5f/0x1380 fs/ioctl.c:696
       ksys_ioctl+0xab/0xd0 fs/ioctl.c:713
       __do_sys_ioctl fs/ioctl.c:720 [inline]
       __se_sys_ioctl fs/ioctl.c:718 [inline]
       __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
       do_syscall_64+0xfd/0x680 arch/x86/entry/common.c:301
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Freed by task 9824:
       save_stack+0x23/0x90 mm/kasan/common.c:71
       set_track mm/kasan/common.c:79 [inline]
       __kasan_slab_free+0x102/0x150 mm/kasan/common.c:451
       kasan_slab_free+0xe/0x10 mm/kasan/common.c:459
       __cache_free mm/slab.c:3432 [inline]
       kfree+0xcf/0x220 mm/slab.c:3755
       pneigh_ifdown_and_unlock net/core/neighbour.c:812 [inline]
       __neigh_ifdown+0x236/0x2f0 net/core/neighbour.c:356
       neigh_ifdown+0x20/0x30 net/core/neighbour.c:372
       arp_ifdown+0x1d/0x21 net/ipv4/arp.c:1274
       inetdev_destroy net/ipv4/devinet.c:319 [inline]
       inetdev_event+0xa14/0x11f0 net/ipv4/devinet.c:1544
       notifier_call_chain+0xc2/0x230 kernel/notifier.c:95
       __raw_notifier_call_chain kernel/notifier.c:396 [inline]
       raw_notifier_call_chain+0x2e/0x40 kernel/notifier.c:403
       call_netdevice_notifiers_info+0x3f/0x90 net/core/dev.c:1749
       call_netdevice_notifiers_extack net/core/dev.c:1761 [inline]
       call_netdevice_notifiers net/core/dev.c:1775 [inline]
       rollback_registered_many+0x9b9/0xfc0 net/core/dev.c:8178
       rollback_registered+0x109/0x1d0 net/core/dev.c:8220
       unregister_netdevice_queue net/core/dev.c:9267 [inline]
       unregister_netdevice_queue+0x1ee/0x2c0 net/core/dev.c:9260
       unregister_netdevice include/linux/netdevice.h:2631 [inline]
       __tun_detach+0xd8a/0x1040 drivers/net/tun.c:724
       tun_detach drivers/net/tun.c:741 [inline]
       tun_chr_close+0xe0/0x180 drivers/net/tun.c:3451
       __fput+0x2ff/0x890 fs/file_table.c:280
       ____fput+0x16/0x20 fs/file_table.c:313
       task_work_run+0x145/0x1c0 kernel/task_work.c:113
       tracehook_notify_resume include/linux/tracehook.h:185 [inline]
       exit_to_usermode_loop+0x273/0x2c0 arch/x86/entry/common.c:168
       prepare_exit_to_usermode arch/x86/entry/common.c:199 [inline]
       syscall_return_slowpath arch/x86/entry/common.c:279 [inline]
       do_syscall_64+0x58e/0x680 arch/x86/entry/common.c:304
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      The buggy address belongs to the object at ffff888097f2a700
       which belongs to the cache kmalloc-64 of size 64
      The buggy address is located 0 bytes inside of
       64-byte region [ffff888097f2a700, ffff888097f2a740)
      The buggy address belongs to the page:
      page:ffffea00025fca80 refcount:1 mapcount:0 mapping:ffff8880aa400340 index:0x0
      flags: 0x1fffc0000000200(slab)
      raw: 01fffc0000000200 ffffea000250d548 ffffea00025726c8 ffff8880aa400340
      raw: 0000000000000000 ffff888097f2a000 0000000100000020 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff888097f2a600: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
       ffff888097f2a680: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      >ffff888097f2a700: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
                         ^
       ffff888097f2a780: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
       ffff888097f2a800: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      
      Fixes: 767e97e1 ("neigh: RCU conversion of struct neighbour")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f3e92cb8