1. 05 Jul, 2019 3 commits
    • Cong Wang's avatar
      hsr: implement dellink to clean up resources · b9a1e627
      Cong Wang authored
      hsr_link_ops implements ->newlink() but not ->dellink(),
      which leads that resources not released after removing the device,
      particularly the entries in self_node_db and node_db.
      
      So add ->dellink() implementation to replace the priv_destructor.
      This also makes the code slightly easier to understand.
      
      Reported-by: syzbot+c6167ec3de7def23d1e8@syzkaller.appspotmail.com
      Cc: Arvid Brodin <arvid.brodin@alten.se>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b9a1e627
    • Cong Wang's avatar
      hsr: fix a memory leak in hsr_del_port() · 619afef0
      Cong Wang authored
      hsr_del_port() should release all the resources allocated
      in hsr_add_port().
      
      As a consequence of this change, hsr_for_each_port() is no
      longer safe to work with hsr_del_port(), switch to
      list_for_each_entry_safe() as we always hold RTNL lock.
      
      Cc: Arvid Brodin <arvid.brodin@alten.se>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      619afef0
    • David S. Miller's avatar
      Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec · 114b5b35
      David S. Miller authored
      Steffen Klassert says:
      
      ====================
      pull request (net): ipsec 2019-07-05
      
      1)  Fix xfrm selector prefix length validation for
          inter address family tunneling.
          From Anirudh Gupta.
      
      2) Fix a memleak in pfkey.
         From Jeremy Sowden.
      
      3) Fix SA selector validation to allow empty selectors again.
         From Nicolas Dichtel.
      
      4) Select crypto ciphers for xfrm_algo, this fixes some
         randconfig builds. From Arnd Bergmann.
      
      5) Remove a duplicated assignment in xfrm_bydst_resize.
         From Cong Wang.
      
      6) Fix a hlist corruption on hash rebuild.
         From Florian Westphal.
      
      7) Fix a memory leak when creating xfrm interfaces.
         From Nicolas Dichtel.
      
      Please pull or let me know if there are problems.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      114b5b35
  2. 03 Jul, 2019 15 commits
    • Cong Wang's avatar
      bonding: validate ip header before check IPPROTO_IGMP · 9d1bc24b
      Cong Wang authored
      bond_xmit_roundrobin() checks for IGMP packets but it parses
      the IP header even before checking skb->protocol.
      
      We should validate the IP header with pskb_may_pull() before
      using iph->protocol.
      
      Reported-and-tested-by: syzbot+e5be16aa39ad6e755391@syzkaller.appspotmail.com
      Fixes: a2fd940f ("bonding: fix broken multicast with round-robin mode")
      Cc: Jay Vosburgh <j.vosburgh@gmail.com>
      Cc: Veaceslav Falico <vfalico@gmail.com>
      Cc: Andy Gospodarek <andy@greyhouse.net>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9d1bc24b
    • David S. Miller's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · c3ead2df
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2019-07-03
      
      The following pull-request contains BPF updates for your *net* tree.
      
      The main changes are:
      
      1) Fix the interpreter to properly handle BPF_ALU32 | BPF_ARSH
         on BE architectures, from Jiong.
      
      2) Fix several bugs in the x32 BPF JIT for handling shifts by 0,
         from Luke and Xi.
      
      3) Fix NULL pointer deref in btf_type_is_resolve_source_only(),
         from Stanislav.
      
      4) Properly handle the check that forwarding is enabled on the device
         in bpf_ipv6_fib_lookup() helper code, from Anton.
      
      5) Fix UAPI bpf_prog_info fields alignment for archs that have 16 bit
         alignment such as m68k, from Baruch.
      
      6) Fix kernel hanging in unregister_netdevice loop while unregistering
         device bound to XDP socket, from Ilya.
      
      7) Properly terminate tail update in xskq_produce_flush_desc(), from Nathan.
      
      8) Fix broken always_inline handling in test_lwt_seg6local, from Jiri.
      
      9) Fix bpftool to use correct argument in cgroup errors, from Jakub.
      
      10) Fix detaching dummy prog in XDP redirect sample code, from Prashant.
      
      11) Add Jonathan to AF_XDP reviewers, from Björn.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c3ead2df
    • Yonglong Liu's avatar
      net: hns: add support for vlan TSO · 0d581ba3
      Yonglong Liu authored
      The hip07 chip support vlan TSO, this patch adds NETIF_F_TSO
      and NETIF_F_TSO6 flags to vlan_features to improve the
      performance after adding vlan to the net ports.
      Signed-off-by: default avatarYonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0d581ba3
    • Xin Long's avatar
      sctp: count data bundling sack chunk for outctrlchunks · 7af03301
      Xin Long authored
      Now all ctrl chunks are counted for asoc stats.octrlchunks and net
      SCTP_MIB_OUTCTRLCHUNKS either after queuing up or bundling, other
      than the chunk maked and bundled in sctp_packet_bundle_sack, which
      caused 'outctrlchunks' not consistent with 'inctrlchunks' in peer.
      
      This issue exists since very beginning, here to fix it by increasing
      both net SCTP_MIB_OUTCTRLCHUNKS and asoc stats.octrlchunks when sack
      chunk is maked and bundled in sctp_packet_bundle_sack.
      Reported-by: default avatarJa Ram Jeon <jajeon@redhat.com>
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Acked-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7af03301
    • Hayes Wang's avatar
      r8152: move calling r8153b_rx_agg_chg_indicate() · 9fae5418
      Hayes Wang authored
      r8153b_rx_agg_chg_indicate() needs to be called after enabling TX/RX and
      before calling rxdy_gated_en(tp, false). Otherwise, the change of the
      settings of RX aggregation wouldn't work.
      
      Besides, adjust rtl8152_set_coalesce() for the same reason. If
      rx_coalesce_usecs is changed, restart TX/RX to let the setting work.
      Signed-off-by: default avatarHayes Wang <hayeswang@realtek.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9fae5418
    • Stephen Hemminger's avatar
      net: don't warn in inet diag when IPV6 is disabled · 1e64d7cb
      Stephen Hemminger authored
      If IPV6 was disabled, then ss command would cause a kernel warning
      because the command was attempting to dump IPV6 socket information.
      The fix is to just remove the warning.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=202249
      Fixes: 432490f9 ("net: ip, diag -- Add diag interface for raw sockets")
      Signed-off-by: default avatarStephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1e64d7cb
    • Ilya Maximets's avatar
      xdp: fix hang while unregistering device bound to xdp socket · 455302d1
      Ilya Maximets authored
      Device that bound to XDP socket will not have zero refcount until the
      userspace application will not close it. This leads to hang inside
      'netdev_wait_allrefs()' if device unregistering requested:
      
        # ip link del p1
        < hang on recvmsg on netlink socket >
      
        # ps -x | grep ip
        5126  pts/0    D+   0:00 ip link del p1
      
        # journalctl -b
      
        Jun 05 07:19:16 kernel:
        unregister_netdevice: waiting for p1 to become free. Usage count = 1
      
        Jun 05 07:19:27 kernel:
        unregister_netdevice: waiting for p1 to become free. Usage count = 1
        ...
      
      Fix that by implementing NETDEV_UNREGISTER event notification handler
      to properly clean up all the resources and unref device.
      
      This should also allow socket killing via ss(8) utility.
      
      Fixes: 965a9909 ("xsk: add support for bind for Rx")
      Signed-off-by: default avatarIlya Maximets <i.maximets@samsung.com>
      Acked-by: default avatarJonathan Lemon <jonathan.lemon@gmail.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      455302d1
    • Ilya Maximets's avatar
      xdp: hold device for umem regardless of zero-copy mode · 162c820e
      Ilya Maximets authored
      Device pointer stored in umem regardless of zero-copy mode,
      so we heed to hold the device in all cases.
      
      Fixes: c9b47cc1 ("xsk: fix bug when trying to use both copy and zero-copy on one queue id")
      Signed-off-by: default avatarIlya Maximets <i.maximets@samsung.com>
      Acked-by: default avatarJonathan Lemon <jonathan.lemon@gmail.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      162c820e
    • Jiri Benc's avatar
      selftests: bpf: fix inlines in test_lwt_seg6local · 11aca65e
      Jiri Benc authored
      Selftests are reporting this failure in test_lwt_seg6local.sh:
      
      + ip netns exec ns2 ip -6 route add fb00::6 encap bpf in obj test_lwt_seg6local.o sec encap_srh dev veth2
      Error fetching program/map!
      Failed to parse eBPF program: Operation not permitted
      
      The problem is __attribute__((always_inline)) alone is not enough to prevent
      clang from inserting those functions in .text. In that case, .text is not
      marked as relocateable.
      
      See the output of objdump -h test_lwt_seg6local.o:
      
      Idx Name          Size      VMA               LMA               File off  Algn
        0 .text         00003530  0000000000000000  0000000000000000  00000040  2**3
                        CONTENTS, ALLOC, LOAD, READONLY, CODE
      
      This causes the iproute bpf loader to fail in bpf_fetch_prog_sec:
      bpf_has_call_data returns true but bpf_fetch_prog_relo fails as there's no
      relocateable .text section in the file.
      
      To fix this, convert to 'static __always_inline'.
      
      v2: Use 'static __always_inline' instead of 'static inline
          __attribute__((always_inline))'
      
      Fixes: c99a84ea ("selftests/bpf: test for seg6local End.BPF action")
      Signed-off-by: default avatarJiri Benc <jbenc@redhat.com>
      Acked-by: default avatarYonghong Song <yhs@fb.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      11aca65e
    • Luke Nelson's avatar
      selftests: bpf: add tests for shifts by zero · ac8786c7
      Luke Nelson authored
      There are currently no tests for ALU64 shift operations when the shift
      amount is 0. This adds 6 new tests to make sure they are equivalent
      to a no-op. The x32 JIT had such bugs that could have been caught by
      these tests.
      
      Cc: Xi Wang <xi.wang@gmail.com>
      Signed-off-by: default avatarLuke Nelson <luke.r.nels@gmail.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      ac8786c7
    • Luke Nelson's avatar
      bpf, x32: Fix bug with ALU64 {LSH, RSH, ARSH} BPF_K shift by 0 · 6fa632e7
      Luke Nelson authored
      The current x32 BPF JIT does not correctly compile shift operations when
      the immediate shift amount is 0. The expected behavior is for this to
      be a no-op.
      
      The following program demonstrates the bug. The expexceted result is 1,
      but the current JITed code returns 2.
      
        r0 = 1
        r1 = 1
        r1 <<= 0
        if r1 == 1 goto end
        r0 = 2
      end:
        exit
      
      This patch simplifies the code and fixes the bug.
      
      Fixes: 03f5781b ("bpf, x86_32: add eBPF JIT compiler for ia32")
      Co-developed-by: default avatarXi Wang <xi.wang@gmail.com>
      Signed-off-by: default avatarXi Wang <xi.wang@gmail.com>
      Signed-off-by: default avatarLuke Nelson <luke.r.nels@gmail.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      6fa632e7
    • Luke Nelson's avatar
      bpf, x32: Fix bug with ALU64 {LSH, RSH, ARSH} BPF_X shift by 0 · 68a8357e
      Luke Nelson authored
      The current x32 BPF JIT for shift operations is not correct when the
      shift amount in a register is 0. The expected behavior is a no-op, whereas
      the current implementation changes bits in the destination register.
      
      The following example demonstrates the bug. The expected result of this
      program is 1, but the current JITed code returns 2.
      
        r0 = 1
        r1 = 1
        r2 = 0
        r1 <<= r2
        if r1 == 1 goto end
        r0 = 2
      end:
        exit
      
      The bug is caused by an incorrect assumption by the JIT that a shift by
      32 clear the register. On x32 however, shifts use the lower 5 bits of
      the source, making a shift by 32 equivalent to a shift by 0.
      
      This patch fixes the bug using double-precision shifts, which also
      simplifies the code.
      
      Fixes: 03f5781b ("bpf, x86_32: add eBPF JIT compiler for ia32")
      Co-developed-by: default avatarXi Wang <xi.wang@gmail.com>
      Signed-off-by: default avatarXi Wang <xi.wang@gmail.com>
      Signed-off-by: default avatarLuke Nelson <luke.r.nels@gmail.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      68a8357e
    • Nicolas Dichtel's avatar
      xfrm interface: fix memory leak on creation · 56c5ee1a
      Nicolas Dichtel authored
      The following commands produce a backtrace and return an error but the xfrm
      interface is created (in the wrong netns):
      $ ip netns add foo
      $ ip netns add bar
      $ ip -n foo netns set bar 0
      $ ip -n foo link add xfrmi0 link-netnsid 0 type xfrm dev lo if_id 23
      RTNETLINK answers: Invalid argument
      $ ip -n bar link ls xfrmi0
      2: xfrmi0@lo: <NOARP,M-DOWN> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
          link/none 00:00:00:00:00:00 brd 00:00:00:00:00:00
      
      Here is the backtrace:
      [   79.879174] WARNING: CPU: 0 PID: 1178 at net/core/dev.c:8172 rollback_registered_many+0x86/0x3c1
      [   79.880260] Modules linked in: xfrm_interface nfsv3 nfs_acl auth_rpcgss nfsv4 nfs lockd grace sunrpc fscache button parport_pc parport serio_raw evdev pcspkr loop ext4 crc16 mbcache jbd2 crc32c_generic ide_cd_mod ide_gd_mod cdrom ata_$
      eneric ata_piix libata scsi_mod 8139too piix psmouse i2c_piix4 ide_core 8139cp mii i2c_core floppy
      [   79.883698] CPU: 0 PID: 1178 Comm: ip Not tainted 5.2.0-rc6+ #106
      [   79.884462] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
      [   79.885447] RIP: 0010:rollback_registered_many+0x86/0x3c1
      [   79.886120] Code: 01 e8 d7 7d c6 ff 0f 0b 48 8b 45 00 4c 8b 20 48 8d 58 90 49 83 ec 70 48 8d 7b 70 48 39 ef 74 44 8a 83 d0 04 00 00 84 c0 75 1f <0f> 0b e8 61 cd ff ff 48 b8 00 01 00 00 00 00 ad de 48 89 43 70 66
      [   79.888667] RSP: 0018:ffffc900015ab740 EFLAGS: 00010246
      [   79.889339] RAX: ffff8882353e5700 RBX: ffff8882353e56a0 RCX: ffff8882353e5710
      [   79.890174] RDX: ffffc900015ab7e0 RSI: ffffc900015ab7e0 RDI: ffff8882353e5710
      [   79.891029] RBP: ffffc900015ab7e0 R08: ffffc900015ab7e0 R09: ffffc900015ab7e0
      [   79.891866] R10: ffffc900015ab7a0 R11: ffffffff82233fec R12: ffffc900015ab770
      [   79.892728] R13: ffffffff81eb7ec0 R14: ffff88822ed6cf00 R15: 00000000ffffffea
      [   79.893557] FS:  00007ff350f31740(0000) GS:ffff888237a00000(0000) knlGS:0000000000000000
      [   79.894581] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   79.895317] CR2: 00000000006c8580 CR3: 000000022c272000 CR4: 00000000000006f0
      [   79.896137] Call Trace:
      [   79.896464]  unregister_netdevice_many+0x12/0x6c
      [   79.896998]  __rtnl_newlink+0x6e2/0x73b
      [   79.897446]  ? __kmalloc_node_track_caller+0x15e/0x185
      [   79.898039]  ? pskb_expand_head+0x5f/0x1fe
      [   79.898556]  ? stack_access_ok+0xd/0x2c
      [   79.899009]  ? deref_stack_reg+0x12/0x20
      [   79.899462]  ? stack_access_ok+0xd/0x2c
      [   79.899927]  ? stack_access_ok+0xd/0x2c
      [   79.900404]  ? __module_text_address+0x9/0x4f
      [   79.900910]  ? is_bpf_text_address+0x5/0xc
      [   79.901390]  ? kernel_text_address+0x67/0x7b
      [   79.901884]  ? __kernel_text_address+0x1a/0x25
      [   79.902397]  ? unwind_get_return_address+0x12/0x23
      [   79.903122]  ? __cmpxchg_double_slab.isra.37+0x46/0x77
      [   79.903772]  rtnl_newlink+0x43/0x56
      [   79.904217]  rtnetlink_rcv_msg+0x200/0x24c
      
      In fact, each time a xfrm interface was created, a netdev was allocated
      by __rtnl_newlink()/rtnl_create_link() and then another one by
      xfrmi_newlink()/xfrmi_create(). Only the second one was registered, it's
      why the previous commands produce a backtrace: dev_change_net_namespace()
      was called on a netdev with reg_state set to NETREG_UNINITIALIZED (the
      first one).
      
      CC: Lorenzo Colitti <lorenzo@google.com>
      CC: Benedict Wong <benedictwong@google.com>
      CC: Steffen Klassert <steffen.klassert@secunet.com>
      CC: Shannon Nelson <shannon.nelson@oracle.com>
      CC: Antony Antony <antony@phenome.org>
      CC: Eyal Birger <eyal.birger@gmail.com>
      Fixes: f203b76d ("xfrm: Add virtual xfrm interfaces")
      Reported-by: default avatarJulien Floret <julien.floret@6wind.com>
      Signed-off-by: default avatarNicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: default avatarSteffen Klassert <steffen.klassert@secunet.com>
      56c5ee1a
    • Florian Westphal's avatar
      xfrm: policy: fix bydst hlist corruption on hash rebuild · fd709721
      Florian Westphal authored
      syzbot reported following spat:
      
      BUG: KASAN: use-after-free in __write_once_size include/linux/compiler.h:221
      BUG: KASAN: use-after-free in hlist_del_rcu include/linux/rculist.h:455
      BUG: KASAN: use-after-free in xfrm_hash_rebuild+0xa0d/0x1000 net/xfrm/xfrm_policy.c:1318
      Write of size 8 at addr ffff888095e79c00 by task kworker/1:3/8066
      Workqueue: events xfrm_hash_rebuild
      Call Trace:
       __write_once_size include/linux/compiler.h:221 [inline]
       hlist_del_rcu include/linux/rculist.h:455 [inline]
       xfrm_hash_rebuild+0xa0d/0x1000 net/xfrm/xfrm_policy.c:1318
       process_one_work+0x814/0x1130 kernel/workqueue.c:2269
      Allocated by task 8064:
       __kmalloc+0x23c/0x310 mm/slab.c:3669
       kzalloc include/linux/slab.h:742 [inline]
       xfrm_hash_alloc+0x38/0xe0 net/xfrm/xfrm_hash.c:21
       xfrm_policy_init net/xfrm/xfrm_policy.c:4036 [inline]
       xfrm_net_init+0x269/0xd60 net/xfrm/xfrm_policy.c:4120
       ops_init+0x336/0x420 net/core/net_namespace.c:130
       setup_net+0x212/0x690 net/core/net_namespace.c:316
      
      The faulting address is the address of the old chain head,
      free'd by xfrm_hash_resize().
      
      In xfrm_hash_rehash(), chain heads get re-initialized without
      any hlist_del_rcu:
      
       for (i = hmask; i >= 0; i--)
          INIT_HLIST_HEAD(odst + i);
      
      Then, hlist_del_rcu() gets called on the about to-be-reinserted policy
      when iterating the per-net list of policies.
      
      hlist_del_rcu() will then make chain->first be nonzero again:
      
      static inline void __hlist_del(struct hlist_node *n)
      {
         struct hlist_node *next = n->next;   // address of next element in list
         struct hlist_node **pprev = n->pprev;// location of previous elem, this
                                              // can point at chain->first
              WRITE_ONCE(*pprev, next);       // chain->first points to next elem
              if (next)
                      next->pprev = pprev;
      
      Then, when we walk chainlist to find insertion point, we may find a
      non-empty list even though we're supposedly reinserting the first
      policy to an empty chain.
      
      To fix this first unlink all exact and inexact policies instead of
      zeroing the list heads.
      
      Add the commands equivalent to the syzbot reproducer to xfrm_policy.sh,
      without fix KASAN catches the corruption as it happens, SLUB poisoning
      detects it a bit later.
      
      Reported-by: syzbot+0165480d4ef07360eeda@syzkaller.appspotmail.com
      Fixes: 1548bc4e ("xfrm: policy: delete inexact policies from inexact list on hash rebuild")
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarSteffen Klassert <steffen.klassert@secunet.com>
      fd709721
    • Po-Hsu Lin's avatar
      selftests/net: skip psock_tpacket test if KALLSYMS was not enabled · ff95bf28
      Po-Hsu Lin authored
      The psock_tpacket test will need to access /proc/kallsyms, this would
      require the kernel config CONFIG_KALLSYMS to be enabled first.
      
      Apart from adding CONFIG_KALLSYMS to the net/config file here, check the
      file existence to determine if we can run this test will be helpful to
      avoid a false-positive test result when testing it directly with the
      following commad against a kernel that have CONFIG_KALLSYMS disabled:
          make -C tools/testing/selftests TARGETS=net run_tests
      Signed-off-by: default avatarPo-Hsu Lin <po-hsu.lin@canonical.com>
      Acked-by: default avatarShuah Khan <skhan@linuxfoundation.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ff95bf28
  3. 02 Jul, 2019 22 commits
    • David Howells's avatar
      rxrpc: Fix oops in tracepoint · 99f0eae6
      David Howells authored
      If the rxrpc_eproto tracepoint is enabled, an oops will be cause by the
      trace line that rxrpc_extract_header() tries to emit when a protocol error
      occurs (typically because the packet is short) because the call argument is
      NULL.
      
      Fix this by using ?: to assume 0 as the debug_id if call is NULL.
      
      This can then be induced by:
      
      	echo -e '\0\0\0\0\0\0\0\0' | ncat -4u --send-only <addr> 20001
      
      where addr has the following program running on it:
      
      	#include <stdio.h>
      	#include <stdlib.h>
      	#include <string.h>
      	#include <unistd.h>
      	#include <sys/socket.h>
      	#include <arpa/inet.h>
      	#include <linux/rxrpc.h>
      	int main(void)
      	{
      		struct sockaddr_rxrpc srx;
      		int fd;
      		memset(&srx, 0, sizeof(srx));
      		srx.srx_family			= AF_RXRPC;
      		srx.srx_service			= 0;
      		srx.transport_type		= AF_INET;
      		srx.transport_len		= sizeof(srx.transport.sin);
      		srx.transport.sin.sin_family	= AF_INET;
      		srx.transport.sin.sin_port	= htons(0x4e21);
      		fd = socket(AF_RXRPC, SOCK_DGRAM, AF_INET6);
      		bind(fd, (struct sockaddr *)&srx, sizeof(srx));
      		sleep(20);
      		return 0;
      	}
      
      It results in the following oops.
      
      	BUG: kernel NULL pointer dereference, address: 0000000000000340
      	#PF: supervisor read access in kernel mode
      	#PF: error_code(0x0000) - not-present page
      	...
      	RIP: 0010:trace_event_raw_event_rxrpc_rx_eproto+0x47/0xac
      	...
      	Call Trace:
      	 <IRQ>
      	 rxrpc_extract_header+0x86/0x171
      	 ? rcu_read_lock_sched_held+0x5d/0x63
      	 ? rxrpc_new_skb+0xd4/0x109
      	 rxrpc_input_packet+0xef/0x14fc
      	 ? rxrpc_input_data+0x986/0x986
      	 udp_queue_rcv_one_skb+0xbf/0x3d0
      	 udp_unicast_rcv_skb.isra.8+0x64/0x71
      	 ip_protocol_deliver_rcu+0xe4/0x1b4
      	 ip_local_deliver+0xf0/0x154
      	 __netif_receive_skb_one_core+0x50/0x6c
      	 netif_receive_skb_internal+0x26b/0x2e9
      	 napi_gro_receive+0xf8/0x1da
      	 rtl8169_poll+0x303/0x4c4
      	 net_rx_action+0x10e/0x333
      	 __do_softirq+0x1a5/0x38f
      	 irq_exit+0x54/0xc4
      	 do_IRQ+0xda/0xf8
      	 common_interrupt+0xf/0xf
      	 </IRQ>
      	 ...
      	 ? cpuidle_enter_state+0x23c/0x34d
      	 cpuidle_enter+0x2a/0x36
      	 do_idle+0x163/0x1ea
      	 cpu_startup_entry+0x1d/0x1f
      	 start_secondary+0x157/0x172
      	 secondary_startup_64+0xa4/0xb0
      
      Fixes: a25e21f0 ("rxrpc, afs: Use debug_ids rather than pointers in traces")
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Reviewed-by: default avatarMarc Dionne <marc.dionne@auristor.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      99f0eae6
    • Phong Tran's avatar
      net: usb: asix: init MAC address buffers · 78226f6e
      Phong Tran authored
      This is for fixing bug KMSAN: uninit-value in ax88772_bind
      
      Tested by
      https://groups.google.com/d/msg/syzkaller-bugs/aFQurGotng4/eB_HlNhhCwAJ
      
      Reported-by: syzbot+8a3fc6674bbc3978ed4e@syzkaller.appspotmail.com
      
      syzbot found the following crash on:
      
      HEAD commit:    f75e4cfe kmsan: use kmsan_handle_urb() in urb.c
      git tree:       kmsan
      console output: https://syzkaller.appspot.com/x/log.txt?x=136d720ea00000
      kernel config:
      https://syzkaller.appspot.com/x/.config?x=602468164ccdc30a
      dashboard link:
      https://syzkaller.appspot.com/bug?extid=8a3fc6674bbc3978ed4e
      compiler:       clang version 9.0.0 (/home/glider/llvm/clang
      06d00afa61eef8f7f501ebdb4e8612ea43ec2d78)
      syz repro:
      https://syzkaller.appspot.com/x/repro.syz?x=12788316a00000
      C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=120359aaa00000
      
      ==================================================================
      BUG: KMSAN: uninit-value in is_valid_ether_addr
      include/linux/etherdevice.h:200 [inline]
      BUG: KMSAN: uninit-value in asix_set_netdev_dev_addr
      drivers/net/usb/asix_devices.c:73 [inline]
      BUG: KMSAN: uninit-value in ax88772_bind+0x93d/0x11e0
      drivers/net/usb/asix_devices.c:724
      CPU: 0 PID: 3348 Comm: kworker/0:2 Not tainted 5.1.0+ #1
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
      Google 01/01/2011
      Workqueue: usb_hub_wq hub_event
      Call Trace:
        __dump_stack lib/dump_stack.c:77 [inline]
        dump_stack+0x191/0x1f0 lib/dump_stack.c:113
        kmsan_report+0x130/0x2a0 mm/kmsan/kmsan.c:622
        __msan_warning+0x75/0xe0 mm/kmsan/kmsan_instr.c:310
        is_valid_ether_addr include/linux/etherdevice.h:200 [inline]
        asix_set_netdev_dev_addr drivers/net/usb/asix_devices.c:73 [inline]
        ax88772_bind+0x93d/0x11e0 drivers/net/usb/asix_devices.c:724
        usbnet_probe+0x10f5/0x3940 drivers/net/usb/usbnet.c:1728
        usb_probe_interface+0xd66/0x1320 drivers/usb/core/driver.c:361
        really_probe+0xdae/0x1d80 drivers/base/dd.c:513
        driver_probe_device+0x1b3/0x4f0 drivers/base/dd.c:671
        __device_attach_driver+0x5b8/0x790 drivers/base/dd.c:778
        bus_for_each_drv+0x28e/0x3b0 drivers/base/bus.c:454
        __device_attach+0x454/0x730 drivers/base/dd.c:844
        device_initial_probe+0x4a/0x60 drivers/base/dd.c:891
        bus_probe_device+0x137/0x390 drivers/base/bus.c:514
        device_add+0x288d/0x30e0 drivers/base/core.c:2106
        usb_set_configuration+0x30dc/0x3750 drivers/usb/core/message.c:2027
        generic_probe+0xe7/0x280 drivers/usb/core/generic.c:210
        usb_probe_device+0x14c/0x200 drivers/usb/core/driver.c:266
        really_probe+0xdae/0x1d80 drivers/base/dd.c:513
        driver_probe_device+0x1b3/0x4f0 drivers/base/dd.c:671
        __device_attach_driver+0x5b8/0x790 drivers/base/dd.c:778
        bus_for_each_drv+0x28e/0x3b0 drivers/base/bus.c:454
        __device_attach+0x454/0x730 drivers/base/dd.c:844
        device_initial_probe+0x4a/0x60 drivers/base/dd.c:891
        bus_probe_device+0x137/0x390 drivers/base/bus.c:514
        device_add+0x288d/0x30e0 drivers/base/core.c:2106
        usb_new_device+0x23e5/0x2ff0 drivers/usb/core/hub.c:2534
        hub_port_connect drivers/usb/core/hub.c:5089 [inline]
        hub_port_connect_change drivers/usb/core/hub.c:5204 [inline]
        port_event drivers/usb/core/hub.c:5350 [inline]
        hub_event+0x48d1/0x7290 drivers/usb/core/hub.c:5432
        process_one_work+0x1572/0x1f00 kernel/workqueue.c:2269
        process_scheduled_works kernel/workqueue.c:2331 [inline]
        worker_thread+0x189c/0x2460 kernel/workqueue.c:2417
        kthread+0x4b5/0x4f0 kernel/kthread.c:254
        ret_from_fork+0x35/0x40 arch/x86/entry/entry_64.S:355
      Signed-off-by: default avatarPhong Tran <tranmanphong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      78226f6e
    • David S. Miller's avatar
      Merge branch 'macsec-fix-some-bugs-in-the-receive-path' · bc389fd1
      David S. Miller authored
      Andreas Steinmetz says:
      
      ====================
      macsec: fix some bugs in the receive path
      
      This series fixes some bugs in the receive path of macsec. The first
      is a use after free when processing macsec frames with a SecTAG that
      has the TCI E bit set but the C bit clear. In the 2nd bug, the driver
      leaves an invalid checksumming state after decrypting the packet.
      
      This is a combined effort of Sabrina Dubroca <sd@queasysnail.net> and me.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      bc389fd1
    • Andreas Steinmetz's avatar
      macsec: fix checksumming after decryption · 7d8b16b9
      Andreas Steinmetz authored
      Fix checksumming after decryption.
      Signed-off-by: default avatarAndreas Steinmetz <ast@domdv.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7d8b16b9
    • Andreas Steinmetz's avatar
      macsec: fix use-after-free of skb during RX · 095c02da
      Andreas Steinmetz authored
      Fix use-after-free of skb when rx_handler returns RX_HANDLER_PASS.
      Signed-off-by: default avatarAndreas Steinmetz <ast@domdv.de>
      Acked-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      095c02da
    • David Howells's avatar
      rxrpc: Fix send on a connected, but unbound socket · e835ada0
      David Howells authored
      If sendmsg() or sendmmsg() is called on a connected socket that hasn't had
      bind() called on it, then an oops will occur when the kernel tries to
      connect the call because no local endpoint has been allocated.
      
      Fix this by implicitly binding the socket if it is in the
      RXRPC_CLIENT_UNBOUND state, just like it does for the RXRPC_UNBOUND state.
      
      Further, the state should be transitioned to RXRPC_CLIENT_BOUND after this
      to prevent further attempts to bind it.
      
      This can be tested with:
      
      	#include <stdio.h>
      	#include <stdlib.h>
      	#include <string.h>
      	#include <sys/socket.h>
      	#include <arpa/inet.h>
      	#include <linux/rxrpc.h>
      	static const unsigned char inet6_addr[16] = {
      		0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, -1, 0xac, 0x14, 0x14, 0xaa
      	};
      	int main(void)
      	{
      		struct sockaddr_rxrpc srx;
      		struct cmsghdr *cm;
      		struct msghdr msg;
      		unsigned char control[16];
      		int fd;
      		memset(&srx, 0, sizeof(srx));
      		srx.srx_family = 0x21;
      		srx.srx_service = 0;
      		srx.transport_type = AF_INET;
      		srx.transport_len = 0x1c;
      		srx.transport.sin6.sin6_family = AF_INET6;
      		srx.transport.sin6.sin6_port = htons(0x4e22);
      		srx.transport.sin6.sin6_flowinfo = htons(0x4e22);
      		srx.transport.sin6.sin6_scope_id = htons(0xaa3b);
      		memcpy(&srx.transport.sin6.sin6_addr, inet6_addr, 16);
      		cm = (struct cmsghdr *)control;
      		cm->cmsg_len	= CMSG_LEN(sizeof(unsigned long));
      		cm->cmsg_level	= SOL_RXRPC;
      		cm->cmsg_type	= RXRPC_USER_CALL_ID;
      		*(unsigned long *)CMSG_DATA(cm) = 0;
      		msg.msg_name = NULL;
      		msg.msg_namelen = 0;
      		msg.msg_iov = NULL;
      		msg.msg_iovlen = 0;
      		msg.msg_control = control;
      		msg.msg_controllen = cm->cmsg_len;
      		msg.msg_flags = 0;
      		fd = socket(AF_RXRPC, SOCK_DGRAM, AF_INET);
      		connect(fd, (struct sockaddr *)&srx, sizeof(srx));
      		sendmsg(fd, &msg, 0);
      		return 0;
      	}
      
      Leading to the following oops:
      
      	BUG: kernel NULL pointer dereference, address: 0000000000000018
      	#PF: supervisor read access in kernel mode
      	#PF: error_code(0x0000) - not-present page
      	...
      	RIP: 0010:rxrpc_connect_call+0x42/0xa01
      	...
      	Call Trace:
      	 ? mark_held_locks+0x47/0x59
      	 ? __local_bh_enable_ip+0xb6/0xba
      	 rxrpc_new_client_call+0x3b1/0x762
      	 ? rxrpc_do_sendmsg+0x3c0/0x92e
      	 rxrpc_do_sendmsg+0x3c0/0x92e
      	 rxrpc_sendmsg+0x16b/0x1b5
      	 sock_sendmsg+0x2d/0x39
      	 ___sys_sendmsg+0x1a4/0x22a
      	 ? release_sock+0x19/0x9e
      	 ? reacquire_held_locks+0x136/0x160
      	 ? release_sock+0x19/0x9e
      	 ? find_held_lock+0x2b/0x6e
      	 ? __lock_acquire+0x268/0xf73
      	 ? rxrpc_connect+0xdd/0xe4
      	 ? __local_bh_enable_ip+0xb6/0xba
      	 __sys_sendmsg+0x5e/0x94
      	 do_syscall_64+0x7d/0x1bf
      	 entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Fixes: 2341e077 ("rxrpc: Simplify connect() implementation and simplify sendmsg() op")
      Reported-by: syzbot+7966f2a0b2c7da8939b4@syzkaller.appspotmail.com
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Reviewed-by: default avatarMarc Dionne <marc.dionne@auristor.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e835ada0
    • David S. Miller's avatar
      Merge branch 'bridge-stale-ptrs' · f2f17175
      David S. Miller authored
      Nikolay Aleksandrov says:
      
      ====================
      net: bridge: fix possible stale skb pointers
      
      In the bridge driver we have a couple of places which call pskb_may_pull
      but we've cached skb pointers before that and use them after which can
      lead to out-of-bounds/stale pointer use. I've had these in my "to fix"
      list for some time and now we got a report (patch 01) so here they are.
      Patches 02-04 are fixes based on code inspection. Also patch 01 was
      tested by Martin Weinelt, Martin if you don't mind please add your
      tested-by tag to it by replying with Tested-by: name <email>.
      I've also briefly tested the set by trying to exercise those code paths.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f2f17175
    • Nikolay Aleksandrov's avatar
      net: bridge: stp: don't cache eth dest pointer before skb pull · 2446a68a
      Nikolay Aleksandrov authored
      Don't cache eth dest pointer before calling pskb_may_pull.
      
      Fixes: cf0f02d0 ("[BRIDGE]: use llc for receiving STP packets")
      Signed-off-by: default avatarNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2446a68a
    • Nikolay Aleksandrov's avatar
      net: bridge: don't cache ether dest pointer on input · 3d26eb8a
      Nikolay Aleksandrov authored
      We would cache ether dst pointer on input in br_handle_frame_finish but
      after the neigh suppress code that could lead to a stale pointer since
      both ipv4 and ipv6 suppress code do pskb_may_pull. This means we have to
      always reload it after the suppress code so there's no point in having
      it cached just retrieve it directly.
      
      Fixes: 057658cb ("bridge: suppress arp pkts on BR_NEIGH_SUPPRESS ports")
      Fixes: ed842fae ("bridge: suppress nd pkts on BR_NEIGH_SUPPRESS ports")
      Signed-off-by: default avatarNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3d26eb8a
    • Nikolay Aleksandrov's avatar
      net: bridge: mcast: fix stale ipv6 hdr pointer when handling v6 query · 3b26a5d0
      Nikolay Aleksandrov authored
      We get a pointer to the ipv6 hdr in br_ip6_multicast_query but we may
      call pskb_may_pull afterwards and end up using a stale pointer.
      So use the header directly, it's just 1 place where it's needed.
      
      Fixes: 08b202b6 ("bridge br_multicast: IPv6 MLD support.")
      Signed-off-by: default avatarNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Tested-by: default avatarMartin Weinelt <martin@linuxlounge.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3b26a5d0
    • Nikolay Aleksandrov's avatar
      net: bridge: mcast: fix stale nsrcs pointer in igmp3/mld2 report handling · e57f6185
      Nikolay Aleksandrov authored
      We take a pointer to grec prior to calling pskb_may_pull and use it
      afterwards to get nsrcs so record nsrcs before the pull when handling
      igmp3 and we get a pointer to nsrcs and call pskb_may_pull when handling
      mld2 which again could lead to reading 2 bytes out-of-bounds.
      
       ==================================================================
       BUG: KASAN: use-after-free in br_multicast_rcv+0x480c/0x4ad0 [bridge]
       Read of size 2 at addr ffff8880421302b4 by task ksoftirqd/1/16
      
       CPU: 1 PID: 16 Comm: ksoftirqd/1 Tainted: G           OE     5.2.0-rc6+ #1
       Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
       Call Trace:
        dump_stack+0x71/0xab
        print_address_description+0x6a/0x280
        ? br_multicast_rcv+0x480c/0x4ad0 [bridge]
        __kasan_report+0x152/0x1aa
        ? br_multicast_rcv+0x480c/0x4ad0 [bridge]
        ? br_multicast_rcv+0x480c/0x4ad0 [bridge]
        kasan_report+0xe/0x20
        br_multicast_rcv+0x480c/0x4ad0 [bridge]
        ? br_multicast_disable_port+0x150/0x150 [bridge]
        ? ktime_get_with_offset+0xb4/0x150
        ? __kasan_kmalloc.constprop.6+0xa6/0xf0
        ? __netif_receive_skb+0x1b0/0x1b0
        ? br_fdb_update+0x10e/0x6e0 [bridge]
        ? br_handle_frame_finish+0x3c6/0x11d0 [bridge]
        br_handle_frame_finish+0x3c6/0x11d0 [bridge]
        ? br_pass_frame_up+0x3a0/0x3a0 [bridge]
        ? virtnet_probe+0x1c80/0x1c80 [virtio_net]
        br_handle_frame+0x731/0xd90 [bridge]
        ? select_idle_sibling+0x25/0x7d0
        ? br_handle_frame_finish+0x11d0/0x11d0 [bridge]
        __netif_receive_skb_core+0xced/0x2d70
        ? virtqueue_get_buf_ctx+0x230/0x1130 [virtio_ring]
        ? do_xdp_generic+0x20/0x20
        ? virtqueue_napi_complete+0x39/0x70 [virtio_net]
        ? virtnet_poll+0x94d/0xc78 [virtio_net]
        ? receive_buf+0x5120/0x5120 [virtio_net]
        ? __netif_receive_skb_one_core+0x97/0x1d0
        __netif_receive_skb_one_core+0x97/0x1d0
        ? __netif_receive_skb_core+0x2d70/0x2d70
        ? _raw_write_trylock+0x100/0x100
        ? __queue_work+0x41e/0xbe0
        process_backlog+0x19c/0x650
        ? _raw_read_lock_irq+0x40/0x40
        net_rx_action+0x71e/0xbc0
        ? __switch_to_asm+0x40/0x70
        ? napi_complete_done+0x360/0x360
        ? __switch_to_asm+0x34/0x70
        ? __switch_to_asm+0x40/0x70
        ? __schedule+0x85e/0x14d0
        __do_softirq+0x1db/0x5f9
        ? takeover_tasklets+0x5f0/0x5f0
        run_ksoftirqd+0x26/0x40
        smpboot_thread_fn+0x443/0x680
        ? sort_range+0x20/0x20
        ? schedule+0x94/0x210
        ? __kthread_parkme+0x78/0xf0
        ? sort_range+0x20/0x20
        kthread+0x2ae/0x3a0
        ? kthread_create_worker_on_cpu+0xc0/0xc0
        ret_from_fork+0x35/0x40
      
       The buggy address belongs to the page:
       page:ffffea0001084c00 refcount:0 mapcount:-128 mapping:0000000000000000 index:0x0
       flags: 0xffffc000000000()
       raw: 00ffffc000000000 ffffea0000cfca08 ffffea0001098608 0000000000000000
       raw: 0000000000000000 0000000000000003 00000000ffffff7f 0000000000000000
       page dumped because: kasan: bad access detected
      
       Memory state around the buggy address:
       ffff888042130180: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
       ffff888042130200: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
       > ffff888042130280: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                                           ^
       ffff888042130300: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
       ffff888042130380: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
       ==================================================================
       Disabling lock debugging due to kernel taint
      
      Fixes: bc8c20ac ("bridge: multicast: treat igmpv3 report with INCLUDE and no sources as a leave")
      Reported-by: default avatarMartin Weinelt <martin@linuxlounge.net>
      Signed-off-by: default avatarNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Tested-by: default avatarMartin Weinelt <martin@linuxlounge.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e57f6185
    • Cong Wang's avatar
      xfrm: remove a duplicated assignment · 52e63a4e
      Cong Wang authored
      Fixes: 30846090 ("xfrm: policy: add sequence count to sync with hash resize")
      Cc: Florian Westphal <fw@strlen.de>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarSteffen Klassert <steffen.klassert@secunet.com>
      52e63a4e
    • Hayes Wang's avatar
      r8152: fix the setting of detecting the linking change for runtime suspend · 13e04fbf
      Hayes Wang authored
      1. Rename r8153b_queue_wake() to r8153_queue_wake().
      
      2. Correct the setting. The enable bit should be 0xd38c bit 0. Besides,
         the 0xd38a bit 0 and 0xd398 bit 8 have to be cleared for both enabled
         and disabled.
      Signed-off-by: default avatarHayes Wang <hayeswang@realtek.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      13e04fbf
    • Jakub Kicinski's avatar
      net/tls: make sure offload also gets the keys wiped · acd3e96d
      Jakub Kicinski authored
      Commit 86029d10 ("tls: zero the crypto information from tls_context
      before freeing") added memzero_explicit() calls to clear the key material
      before freeing struct tls_context, but it missed tls_device.c has its
      own way of freeing this structure. Replace the missing free.
      
      Fixes: 86029d10 ("tls: zero the crypto information from tls_context before freeing")
      Signed-off-by: default avatarJakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: default avatarDirk van der Merwe <dirk.vandermerwe@netronome.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      acd3e96d
    • Jakub Kicinski's avatar
      net/tls: reject offload of TLS 1.3 · 618bac45
      Jakub Kicinski authored
      Neither drivers nor the tls offload code currently supports TLS
      version 1.3. Check the TLS version when installing connection
      state. TLS 1.3 will just fallback to the kernel crypto for now.
      
      Fixes: 130b392c ("net: tls: Add tls 1.3 support")
      Signed-off-by: default avatarJakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: default avatarDirk van der Merwe <dirk.vandermerwe@netronome.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      618bac45
    • David S. Miller's avatar
      Merge branch 'idr-fix-overflow-cases-on-32-bit-CPU' · 8a534f8f
      David S. Miller authored
      Cong Wang says:
      
      ====================
      idr: fix overflow cases on 32-bit CPU
      
      idr_get_next_ul() is problematic by design, it can't handle
      the following overflow case well on 32-bit CPU:
      
      u32 id = UINT_MAX;
      idr_alloc_u32(&id);
      while (idr_get_next_ul(&id) != NULL)
       id++;
      
      when 'id' overflows and becomes 0 after UINT_MAX, the loop
      goes infinite.
      
      Fix this by eliminating external users of idr_get_next_ul()
      and migrating them to idr_for_each_entry_continue_ul(). And
      add an additional parameter for these iteration macros to detect
      overflow properly.
      
      Please merge this through networking tree, as all the users
      are in networking subsystem.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8a534f8f
    • Davide Caratti's avatar
    • Cong Wang's avatar
      idr: introduce idr_for_each_entry_continue_ul() · d39d7149
      Cong Wang authored
      Similarly, other callers of idr_get_next_ul() suffer the same
      overflow bug as they don't handle it properly either.
      
      Introduce idr_for_each_entry_continue_ul() to help these callers
      iterate from a given ID.
      
      cls_flower needs more care here because it still has overflow when
      does arg->cookie++, we have to fold its nested loops into one
      and remove the arg->cookie++.
      
      Fixes: 01683a14 ("net: sched: refactor flower walk to iterate over idr")
      Fixes: 12d6066c ("net/mlx5: Add flow counters idr")
      Reported-by: default avatarLi Shuang <shuali@redhat.com>
      Cc: Davide Caratti <dcaratti@redhat.com>
      Cc: Vlad Buslov <vladbu@mellanox.com>
      Cc: Chris Mi <chrism@mellanox.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Tested-by: default avatarDavide Caratti <dcaratti@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d39d7149
    • Cong Wang's avatar
      idr: fix overflow case for idr_for_each_entry_ul() · e33d2b74
      Cong Wang authored
      idr_for_each_entry_ul() is buggy as it can't handle overflow
      case correctly. When we have an ID == UINT_MAX, it becomes an
      infinite loop. This happens when running on 32-bit CPU where
      unsigned long has the same size with unsigned int.
      
      There is no better way to fix this than casting it to a larger
      integer, but we can't just 64 bit integer on 32 bit CPU. Instead
      we could just use an additional integer to help us to detect this
      overflow case, that is, adding a new parameter to this macro.
      Fortunately tc action is its only user right now.
      
      Fixes: 65a206c0 ("net/sched: Change act_api and act_xxx modules to use IDR")
      Reported-by: default avatarLi Shuang <shuali@redhat.com>
      Tested-by: default avatarDavide Caratti <dcaratti@redhat.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Chris Mi <chrism@mellanox.com>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e33d2b74
    • David S. Miller's avatar
      Merge branch 'vsock-virtio-fixes' · eb1f5c02
      David S. Miller authored
      Stefano Garzarella says:
      
      ====================
      vsock/virtio: several fixes in the .probe() and .remove()
      
      During the review of "[PATCH] vsock/virtio: Initialize core virtio vsock
      before registering the driver", Stefan pointed out some possible issues
      in the .probe() and .remove() callbacks of the virtio-vsock driver.
      
      This series tries to solve these issues:
      - Patch 1 adds RCU critical sections to avoid use-after-free of
        'the_virtio_vsock' pointer.
      - Patch 2 stops workers before to call vdev->config->reset(vdev) to
        be sure that no one is accessing the device.
      - Patch 3 moves the works flush at the end of the .remove() to avoid
        use-after-free of 'vsock' object.
      
      v2:
      - Patch 1: use RCU to protect 'the_virtio_vsock' pointer
      - Patch 2: no changes
      - Patch 3: flush works only at the end of .remove()
      - Removed patch 4 because virtqueue_detach_unused_buf() returns all the buffers
        allocated.
      
      v1: https://patchwork.kernel.org/cover/10964733/
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      eb1f5c02
    • Stefano Garzarella's avatar
      vsock/virtio: fix flush of works during the .remove() · 0d20e56e
      Stefano Garzarella authored
      This patch moves the flush of works after vdev->config->del_vqs(vdev),
      because we need to be sure that no workers run before to free the
      'vsock' object.
      
      Since we stopped the workers using the [tx|rx|event]_run flags,
      we are sure no one is accessing the device while we are calling
      vdev->config->reset(vdev), so we can safely move the workers' flush.
      
      Before the vdev->config->del_vqs(vdev), workers can be scheduled
      by VQ callbacks, so we must flush them after del_vqs(), to avoid
      use-after-free of 'vsock' object.
      Suggested-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarStefano Garzarella <sgarzare@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0d20e56e
    • Stefano Garzarella's avatar
      vsock/virtio: stop workers during the .remove() · 17dd1367
      Stefano Garzarella authored
      Before to call vdev->config->reset(vdev) we need to be sure that
      no one is accessing the device, for this reason, we add new variables
      in the struct virtio_vsock to stop the workers during the .remove().
      
      This patch also add few comments before vdev->config->reset(vdev)
      and vdev->config->del_vqs(vdev).
      Suggested-by: default avatarStefan Hajnoczi <stefanha@redhat.com>
      Suggested-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarStefano Garzarella <sgarzare@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      17dd1367