  1. 28 Sep, 2021 4 commits
    • bpf, cgroup: Assign cgroup in cgroup_sk_alloc when called from interrupt · 78cc316e
      Daniel Borkmann authored
      If cgroup_sk_alloc() is called from interrupt context, then just assign the
      root cgroup to skcd->cgroup. Prior to commit 8520e224 ("bpf, cgroups:
      Fix cgroup v2 fallback on v1/v2 mixed mode") we would just return, and later
      on in sock_cgroup_ptr(), we were NULL-testing the cgroup in fast-path, and
      iff indeed NULL returning the root cgroup (v ?: &cgrp_dfl_root.cgrp). Rather
      than re-adding the NULL-test to the fast-path we can just assign it once from
      cgroup_sk_alloc() given v1/v2 handling has been simplified. The migration from
      NULL test with returning &cgrp_dfl_root.cgrp to assigning &cgrp_dfl_root.cgrp
      directly does /not/ change behavior for callers of sock_cgroup_ptr().
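
      The equivalence claimed above can be sketched in userspace C. This is a
      minimal model, not the kernel code: the struct layouts, root_cgrp (standing
      in for cgrp_dfl_root.cgrp) and the *_old/*_new function names are all
      hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct cgroup { int id; };
struct sock_cgroup_data { struct cgroup *cgroup; };

struct cgroup root_cgrp = { .id = 1 };	/* models cgrp_dfl_root.cgrp */

/* Old scheme: in interrupt context the pointer is left NULL ... */
void sk_alloc_old(struct sock_cgroup_data *skcd, bool in_irq,
		  struct cgroup *task_cgrp)
{
	skcd->cgroup = in_irq ? NULL : task_cgrp;
}

/* ... and every reader pays for a NULL test with a root fallback. */
struct cgroup *sock_cgroup_ptr_old(const struct sock_cgroup_data *skcd)
{
	return skcd->cgroup ? skcd->cgroup : &root_cgrp;
}

/* New scheme: the root cgroup is assigned once at allocation time ... */
void sk_alloc_new(struct sock_cgroup_data *skcd, bool in_irq,
		  struct cgroup *task_cgrp)
{
	skcd->cgroup = in_irq ? &root_cgrp : task_cgrp;
}

/* ... so the reader is a plain load and the pointer is never NULL. */
struct cgroup *sock_cgroup_ptr_new(const struct sock_cgroup_data *skcd)
{
	assert(skcd->cgroup != NULL);
	return skcd->cgroup;
}
```

      In this model both schemes hand the same pointer to callers, while only
      the new one also guarantees a non-NULL skcd->cgroup for the free path.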
      
      syzkaller was able to trigger a splat in the legacy netrom code base, where
      the RX handler in nr_rx_frame() calls nr_make_new() which calls sk_alloc()
      and therefore cgroup_sk_alloc() with in_interrupt() condition. Thus the NULL
      skcd->cgroup, where it trips over on cgroup_sk_free() side given it expects
      a non-NULL object. There are a few other candidates aside from netrom which
      have similar pattern where in their accept-like implementation, they just call
      to sk_alloc() and thus cgroup_sk_alloc() instead of sk_clone_lock() with the
      corresponding cgroup_sk_clone() which then inherits the cgroup from the parent
      socket. None of them are related to core protocols where BPF cgroup programs
      are running from. However, in future, they should follow to implement a similar
      inheritance mechanism.
      
      Additionally, with a !CONFIG_CGROUP_NET_PRIO and !CONFIG_CGROUP_NET_CLASSID
      configuration, the same issue was exposed also prior to 8520e224 due to
      commit e876ecc6 ("cgroup: memcg: net: do not associate sock with unrelated
      cgroup") which added the early in_interrupt() return back then.
      
      Fixes: 8520e224 ("bpf, cgroups: Fix cgroup v2 fallback on v1/v2 mixed mode")
      Fixes: e876ecc6 ("cgroup: memcg: net: do not associate sock with unrelated cgroup")
      Reported-by: syzbot+df709157a4ecaf192b03@syzkaller.appspotmail.com
      Reported-by: syzbot+533f389d4026d86a2a95@syzkaller.appspotmail.com
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Tested-by: syzbot+df709157a4ecaf192b03@syzkaller.appspotmail.com
      Tested-by: syzbot+533f389d4026d86a2a95@syzkaller.appspotmail.com
      Acked-by: Tejun Heo <tj@kernel.org>
      Link: https://lore.kernel.org/bpf/20210927123921.21535-1-daniel@iogearbox.net
    • libbpf: Fix segfault in static linker for objects without BTF · bcfd367c
      Kumar Kartikeya Dwivedi authored
      When a BPF object is compiled without BTF info (without -g),
      trying to link such objects using bpftool causes a SIGSEGV due to
      btf__get_nr_types() accessing obj->btf, which is NULL. Fix this by
      checking for the NULL pointer and returning an error.
      
      Reproducer:
      $ cat a.bpf.c
      extern int foo(void);
      int bar(void) { return foo(); }
      $ cat b.bpf.c
      int foo(void) { return 0; }
      $ clang -O2 -target bpf -c a.bpf.c
      $ clang -O2 -target bpf -c b.bpf.c
      $ bpftool gen obj out a.bpf.o b.bpf.o
      Segmentation fault (core dumped)
      
      After fix:
      $ bpftool gen obj out a.bpf.o b.bpf.o
      libbpf: failed to find BTF info for object 'a.bpf.o'
      Error: failed to link 'a.bpf.o': Unknown error -22 (-22)
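
      The shape of such a guard can be sketched as follows. This is an
      illustrative model only: struct btf_model, struct src_obj_model and
      linker_count_types() are hypothetical names, and libbpf's real structures
      and call sites differ.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-ins; libbpf's real structures differ. */
struct btf_model { unsigned int nr_types; };

struct src_obj_model {
	const char *filename;
	struct btf_model *btf;	/* NULL when the object was built without -g */
};

/* Returns the number of BTF types, or -EINVAL if the object has no BTF. */
int linker_count_types(const struct src_obj_model *obj)
{
	assert(obj != NULL);

	if (!obj->btf) {
		fprintf(stderr, "failed to find BTF info for object '%s'\n",
			obj->filename);
		return -EINVAL;	/* -22 on Linux, matching the output above */
	}
	return (int)obj->btf->nr_types;
}
```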
      
      Fixes: a4634922 ("libbpf: Add linker extern resolution support for functions and global variables")
      Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20210924023725.70228-1-memxor@gmail.com
    • MAINTAINERS: Add btf headers to BPF · b3aa173d
      Dave Marchevsky authored
      BPF folks maintain these and they're not picked up by the current
      MAINTAINERS entries.
      
      Files caught by the added globs:
      
        include/linux/btf.h
        include/linux/btf_ids.h
        include/uapi/linux/btf.h
      Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20210924193557.3081469-1-davemarchevsky@fb.com
    • bpf: Exempt CAP_BPF from checks against bpf_jit_limit · 8a98ae12
      Lorenz Bauer authored
      When introducing CAP_BPF, bpf_jit_charge_modmem() was not changed to treat
      programs with CAP_BPF as privileged for the purpose of JIT memory allocation.
      This means that a process without CAP_BPF can block a process with CAP_BPF
      from loading a program.
      
      Fix this by checking bpf_capable() in bpf_jit_charge_modmem().
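
      A userspace model of the idea, assuming a page counter that is charged
      first and rolled back only for unprivileged callers. The names
      jit_current_pages, jit_limit_pages and jit_charge_pages(), the limit of 4
      pages, and the boolean standing in for bpf_capable() are all illustrative.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace model; names and the limit value are illustrative only. */
static atomic_long jit_current_pages;
static const long jit_limit_pages = 4;

/*
 * Charge 'pages' against the shared budget. Callers for which 'capable'
 * is true (modelling bpf_capable()) are allowed to exceed the limit.
 */
int jit_charge_pages(long pages, bool capable)
{
	assert(pages > 0);

	long total = atomic_fetch_add(&jit_current_pages, pages) + pages;

	if (total > jit_limit_pages && !capable) {
		/* Undo the charge before rejecting the request. */
		atomic_fetch_sub(&jit_current_pages, pages);
		return -EPERM;
	}
	return 0;
}
```

      With this shape, an unprivileged caller that has used up the budget can no
      longer make a privileged caller's charge fail.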
      
      Fixes: 2c78ee89 ("bpf: Implement CAP_BPF")
      Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20210922111153.19843-1-lmb@cloudflare.com
  2. 15 Sep, 2021 1 commit
    • bpf, mips: Validate conditional branch offsets · 37cb28ec
      Piotr Krysiuk authored
      The conditional branch instructions on MIPS use 18-bit signed offsets
      allowing for a branch range of 128 KBytes (backward and forward).
      However, this limit is not observed by the cBPF JIT compiler, and so
      the JIT compiler emits out-of-range branches when translating certain
      cBPF programs. A specific example of such a cBPF program is included in
      the "BPF_MAXINSNS: exec all MSH" test from lib/test_bpf.c that executes
      anomalous machine code containing incorrect branch offsets under JIT.
      
      Furthermore, this issue can be abused to craft undesirable machine
      code, where the control flow is hijacked to execute arbitrary kernel
      code.
      
      The following steps can be used to reproduce the issue:
      
        # echo 1 > /proc/sys/net/core/bpf_jit_enable
        # modprobe test_bpf test_name="BPF_MAXINSNS: exec all MSH"
      
      This should produce multiple warnings from build_bimm() similar to:
      
        ------------[ cut here ]------------
        WARNING: CPU: 0 PID: 209 at arch/mips/mm/uasm-mips.c:210 build_insn+0x558/0x590
        Micro-assembler field overflow
        Modules linked in: test_bpf(+)
        CPU: 0 PID: 209 Comm: modprobe Not tainted 5.14.3 #1
        Stack : 00000000 807bb824 82b33c9c 801843c0 00000000 00000004 00000000 63c9b5ee
                82b33af4 80999898 80910000 80900000 82fd6030 00000001 82b33a98 82087180
                00000000 00000000 80873b28 00000000 000000fc 82b3394c 00000000 2e34312e
                6d6d6f43 809a180f 809a1836 6f6d203a 80900000 00000001 82b33bac 80900000
                00027f80 00000000 00000000 807bb824 00000000 804ed790 001cc317 00000001
        [...]
        Call Trace:
        [<80108f44>] show_stack+0x38/0x118
        [<807a7aac>] dump_stack_lvl+0x5c/0x7c
        [<807a4b3c>] __warn+0xcc/0x140
        [<807a4c3c>] warn_slowpath_fmt+0x8c/0xb8
        [<8011e198>] build_insn+0x558/0x590
        [<8011e358>] uasm_i_bne+0x20/0x2c
        [<80127b48>] build_body+0xa58/0x2a94
        [<80129c98>] bpf_jit_compile+0x114/0x1e4
        [<80613fc4>] bpf_prepare_filter+0x2ec/0x4e4
        [<8061423c>] bpf_prog_create+0x80/0xc4
        [<c0a006e4>] test_bpf_init+0x300/0xba8 [test_bpf]
        [<8010051c>] do_one_initcall+0x50/0x1d4
        [<801c5e54>] do_init_module+0x60/0x220
        [<801c8b20>] sys_finit_module+0xc4/0xfc
        [<801144d0>] syscall_common+0x34/0x58
        [...]
        ---[ end trace a287d9742503c645 ]---
      
      Then the anomalous machine code executes:
      
      => 0xc0a18000:  addiu   sp,sp,-16
         0xc0a18004:  sw      s3,0(sp)
         0xc0a18008:  sw      s4,4(sp)
         0xc0a1800c:  sw      s5,8(sp)
         0xc0a18010:  sw      ra,12(sp)
         0xc0a18014:  move    s5,a0
         0xc0a18018:  move    s4,zero
         0xc0a1801c:  move    s3,zero
      
         # __BPF_STMT(BPF_LDX | BPF_B | BPF_MSH, 0)
         0xc0a18020:  lui     t6,0x8012
         0xc0a18024:  ori     t4,t6,0x9e14
         0xc0a18028:  li      a1,0
         0xc0a1802c:  jalr    t4
         0xc0a18030:  move    a0,s5
         0xc0a18034:  bnez    v0,0xc0a1ffb8           # incorrect branch offset
         0xc0a18038:  move    v0,zero
         0xc0a1803c:  andi    s4,s3,0xf
         0xc0a18040:  b       0xc0a18048
         0xc0a18044:  sll     s4,s4,0x2
         [...]
      
         # __BPF_STMT(BPF_LDX | BPF_B | BPF_MSH, 0)
         0xc0a1ffa0:  lui     t6,0x8012
         0xc0a1ffa4:  ori     t4,t6,0x9e14
         0xc0a1ffa8:  li      a1,0
         0xc0a1ffac:  jalr    t4
         0xc0a1ffb0:  move    a0,s5
         0xc0a1ffb4:  bnez    v0,0xc0a1ffb8           # incorrect branch offset
         0xc0a1ffb8:  move    v0,zero
         0xc0a1ffbc:  andi    s4,s3,0xf
         0xc0a1ffc0:  b       0xc0a1ffc8
         0xc0a1ffc4:  sll     s4,s4,0x2
      
         # __BPF_STMT(BPF_LDX | BPF_B | BPF_MSH, 0)
         0xc0a1ffc8:  lui     t6,0x8012
         0xc0a1ffcc:  ori     t4,t6,0x9e14
         0xc0a1ffd0:  li      a1,0
         0xc0a1ffd4:  jalr    t4
         0xc0a1ffd8:  move    a0,s5
         0xc0a1ffdc:  bnez    v0,0xc0a3ffb8           # correct branch offset
         0xc0a1ffe0:  move    v0,zero
         0xc0a1ffe4:  andi    s4,s3,0xf
         0xc0a1ffe8:  b       0xc0a1fff0
         0xc0a1ffec:  sll     s4,s4,0x2
         [...]
      
         # epilogue
         0xc0a3ffb8:  lw      s3,0(sp)
         0xc0a3ffbc:  lw      s4,4(sp)
         0xc0a3ffc0:  lw      s5,8(sp)
         0xc0a3ffc4:  lw      ra,12(sp)
         0xc0a3ffc8:  addiu   sp,sp,16
         0xc0a3ffcc:  jr      ra
         0xc0a3ffd0:  nop
      
      To mitigate this issue, we assert the branch ranges for each emit call
      that could generate an out-of-range branch.
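
      A range check of this kind can be sketched as below. It relies on the MIPS
      rule that a conditional branch target is relative to the delay slot
      (branch address + 4) and that the 16-bit word offset covers -0x20000 to
      +0x1fffc bytes. The helper name mips_branch_in_range() is hypothetical and
      this is not the code added by the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true if a conditional branch located at 'branch_pc' can reach
 * 'target'. The offset is measured from the delay slot (branch_pc + 4)
 * and must fit an 18-bit signed, word-aligned byte offset.
 */
bool mips_branch_in_range(uint32_t branch_pc, uint32_t target)
{
	/* Instructions are 4-byte aligned. */
	assert((branch_pc & 3) == 0 && (target & 3) == 0);

	int64_t off = (int64_t)target - ((int64_t)branch_pc + 4);

	return off >= -0x20000 && off <= 0x1fffc;
}
```

      Applied to the listing above, and assuming the intended target of every
      bnez is the epilogue at 0xc0a3ffb8, the branches at 0xc0a18034 and
      0xc0a1ffb4 are out of range, while the one at 0xc0a1ffdc fits.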
      
      Fixes: 36366e36 ("MIPS: BPF: Restore MIPS32 cBPF JIT")
      Fixes: c6610de3 ("MIPS: net: Add BPF JIT")
      Signed-off-by: Piotr Krysiuk <piotras@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Tested-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
      Acked-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
      Cc: Paul Burton <paulburton@kernel.org>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Link: https://lore.kernel.org/bpf/20210915160437.4080-1-piotras@gmail.com
  3. 14 Sep, 2021 5 commits
    • bpf: Handle return value of BPF_PROG_TYPE_STRUCT_OPS prog · 356ed649
      Hou Tao authored
      Currently, if a function pointer in struct_ops has a return value, its
      caller gets a random return value from it, because the return value of
      the related BPF_PROG_TYPE_STRUCT_OPS prog is simply dropped.
      
      So add a new flag, BPF_TRAMP_F_RET_FENTRY_RET, to tell the BPF trampoline
      to save and return the return value of the struct_ops prog if ret_size of
      the function pointer is greater than 0. Also restrict the flag to be
      used alone.
      
      Fixes: 85d33df3 ("bpf: Introduce BPF_MAP_TYPE_STRUCT_OPS")
      Signed-off-by: Hou Tao <houtao1@huawei.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Link: https://lore.kernel.org/bpf/20210914023351.3664499-1-houtao1@huawei.com
    • Revert "Revert "ipv4: fix memory leaks in ip_cmsg_send() callers"" · d198b277
      Eric Dumazet authored
      This reverts commit d7807a9a.
      
      As mentioned in https://lkml.org/lkml/2021/9/13/1819, the 5-year-old
      commit 91948309 ("ipv4: fix memory leaks in ip_cmsg_send() callers")
      was a correct fix.
      
        ip_cmsg_send() can loop over multiple cmsghdr()
      
        If IP_RETOPTS has been successful, but a following cmsghdr generates an error,
        we do not free ipc.opt
      
        If IP_RETOPTS is not successful, we have freed the allocated temporary space,
        not the one currently in ipc.opt.
      
      Sure, code could be refactored, but let's not bring back old bugs.
      
      Fixes: d7807a9a ("Revert "ipv4: fix memory leaks in ip_cmsg_send() callers"")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Yajun Deng <yajun.deng@linux.dev>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: fix tp->undo_retrans accounting in tcp_sacktag_one() · 4f884f39
      zhenggy authored
      Commit 10d3be56 ("tcp-tso: do not split TSO packets at retransmit
      time") may directly retransmit a multi-segment TSO/GSO packet without
      splitting it. Since that commit, we can no longer assume that a
      retransmitted packet is a single segment.
      
      This patch fixes the tp->undo_retrans accounting in tcp_sacktag_one()
      to use the actual segment count (pcount) of the retransmitted packet.
      
      Before that commit (10d3be56), the assumption underlying the
      tp->undo_retrans-- seems correct.
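
      The accounting change can be illustrated with a small model. The function
      undo_retrans_after_sack() is hypothetical, and clamping the result at zero
      is an assumption of this sketch rather than something stated above.

```c
#include <assert.h>

/*
 * Hypothetical model of the counter update, not the kernel function.
 * 'pcount' is the number of segments carried by the SACKed retransmitted
 * packet. The old logic effectively assumed pcount == 1.
 */
int undo_retrans_after_sack(int undo_retrans, int pcount)
{
	assert(pcount >= 1);

	int left = undo_retrans - pcount;

	/* Assumption of this sketch: never let the counter go negative. */
	return left > 0 ? left : 0;
}
```

      For a single-segment packet this reduces to the old decrement by one; for
      a three-segment TSO packet the counter drops by three.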
      
      Fixes: 10d3be56 ("tcp-tso: do not split TSO packets at retransmit time")
      Signed-off-by: zhenggy <zhenggy@chinatelecom.cn>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 2865ba82
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2021-09-14
      
      The following pull-request contains BPF updates for your *net* tree.
      
      We've added 7 non-merge commits during the last 13 day(s) which contain
      a total of 18 files changed, 334 insertions(+), 193 deletions(-).
      
      The main changes are:
      
      1) Fix mmap_lock lockdep splat in BPF stack map's build_id lookup, from Yonghong Song.
      
      2) Fix BPF cgroup v2 program bypass upon net_cls/prio activation, from Daniel Borkmann.
      
      3) Fix kvcalloc() BTF line info splat on oversized allocation attempts, from Bixuan Cui.
      
      4) Fix BPF selftest build of task_pt_regs test for arm64/s390, from Jean-Philippe Brucker.
      
      5) Fix BPF's disasm.{c,h} to dual-license so that it is aligned with bpftool given the former
         is a build dependency for the latter, from Daniel Borkmann with ACKs from contributors.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net-caif: avoid user-triggerable WARN_ON(1) · 550ac9c1
      Eric Dumazet authored
      syzbot triggers this warning, which looks like something
      we can easily prevent.
      
      If we initialize priv->list_field in chnl_net_init(),
      then always use list_del_init(), we can remove robust_list_del()
      completely.
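
      Why this makes a defensive delete helper unnecessary can be shown with a
      minimal userspace list. The struct list_node type and the helper names are
      hypothetical stand-ins for the kernel's list_head API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct list_node { struct list_node *next, *prev; };

void list_init(struct list_node *n)
{
	n->next = n;
	n->prev = n;
}

bool list_is_empty(const struct list_node *n)
{
	return n->next == n;
}

void list_add_node(struct list_node *n, struct list_node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/*
 * Unlink 'n' and leave it self-linked. On an initialized node that is not
 * on any list this is a harmless no-op, so no membership search is needed.
 */
void list_del_and_init(struct list_node *n)
{
	assert(n->next != NULL && n->prev != NULL);	/* must be initialized */
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}
```

      Because a self-linked node can be deleted any number of times, the
      teardown path no longer has to care whether the entry was ever added.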
      
      WARNING: CPU: 0 PID: 3233 at net/caif/chnl_net.c:67 robust_list_del net/caif/chnl_net.c:67 [inline]
      WARNING: CPU: 0 PID: 3233 at net/caif/chnl_net.c:67 chnl_net_uninit+0xc9/0x2e0 net/caif/chnl_net.c:375
      Modules linked in:
      CPU: 0 PID: 3233 Comm: syz-executor.3 Not tainted 5.14.0-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:robust_list_del net/caif/chnl_net.c:67 [inline]
      RIP: 0010:chnl_net_uninit+0xc9/0x2e0 net/caif/chnl_net.c:375
      Code: 89 eb e8 3a a3 ba f8 48 89 d8 48 c1 e8 03 42 80 3c 28 00 0f 85 bf 01 00 00 48 81 fb 00 14 4e 8d 48 8b 2b 75 d0 e8 17 a3 ba f8 <0f> 0b 5b 5d 41 5c 41 5d e9 0a a3 ba f8 4c 89 e3 e8 02 a3 ba f8 4c
      RSP: 0018:ffffc90009067248 EFLAGS: 00010202
      RAX: 0000000000008780 RBX: ffffffff8d4e1400 RCX: ffffc9000fd34000
      RDX: 0000000000040000 RSI: ffffffff88bb6e49 RDI: 0000000000000003
      RBP: ffff88802cd9ee08 R08: 0000000000000000 R09: ffffffff8d0e6647
      R10: ffffffff88bb6dc2 R11: 0000000000000000 R12: ffff88803791ae08
      R13: dffffc0000000000 R14: 00000000e600ffce R15: ffff888073ed3480
      FS:  00007fed10fa0700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000001b2c322000 CR3: 00000000164a6000 CR4: 00000000001506e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       register_netdevice+0xadf/0x1500 net/core/dev.c:10347
       ipcaif_newlink+0x4c/0x260 net/caif/chnl_net.c:468
       __rtnl_newlink+0x106d/0x1750 net/core/rtnetlink.c:3458
       rtnl_newlink+0x64/0xa0 net/core/rtnetlink.c:3506
       rtnetlink_rcv_msg+0x413/0xb80 net/core/rtnetlink.c:5572
       netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2504
       netlink_unicast_kernel net/netlink/af_netlink.c:1314 [inline]
       netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1340
       netlink_sendmsg+0x86d/0xdb0 net/netlink/af_netlink.c:1929
       sock_sendmsg_nosec net/socket.c:704 [inline]
       sock_sendmsg+0xcf/0x120 net/socket.c:724
       __sys_sendto+0x21c/0x320 net/socket.c:2036
       __do_sys_sendto net/socket.c:2048 [inline]
       __se_sys_sendto net/socket.c:2044 [inline]
       __x64_sys_sendto+0xdd/0x1b0 net/socket.c:2044
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Fixes: cc36a070 ("net-caif: add CAIF netdevice")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 13 Sep, 2021 20 commits
  5. 12 Sep, 2021 3 commits
  6. 11 Sep, 2021 1 commit
    • net: stmmac: allow CSR clock of 300MHz · 08dad2f4
      Jesper Nilsson authored
      The Synopsys Ethernet IP uses the CSR clock as a base clock for MDC.
      The divisor used is set in the MAC_MDIO_Address register field CR
      (Clock Rate).
      
      The divisor is there to change the CSR clock into a clock that falls
      below the IEEE 802.3 specified max frequency of 2.5MHz.
      
      If the CSR clock is 300MHz, the code falls back to using the reset
      value in the MAC_MDIO_Address register, as described in the comment
      above this code.
      
      However, 300MHz is actually an allowed value and the proper divider
      can be estimated quite easily (it's just a 1Hz difference!)
      
      A CSR frequency of 300MHz with the maximum clock rate value of 0x5
      (STMMAC_CSR_250_300M, a divisor of 124) gives roughly 2.42MHz,
      which is below the IEEE 802.3 specified maximum.
      
      For the ARTPEC-8 SoC, the CSR clock is this problematic 300MHz,
      and unfortunately, the reset-value of the MAC_MDIO_Address CR field
      is 0x0.
      
      This leads to a clock rate of zero and a divisor of 42, and gives an
      MDC frequency of ~7.14MHz.
      
      Allow CSR clock of 300MHz by making the comparison inclusive.
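
      The arithmetic above can be checked with a short sketch. The macro and
      function names here are hypothetical; only the divisors 124 and 42, the
      2.5MHz limit and the 250MHz to 300MHz range come from the text.

```c
#include <assert.h>
#include <stdbool.h>

#define CSR_250MHZ	250000000UL	/* hypothetical names for the range bounds */
#define CSR_300MHZ	300000000UL
#define MDC_MAX_HZ	2500000UL	/* IEEE 802.3 maximum MDC frequency */

/* MDC frequency obtained by dividing the CSR clock (integer Hz). */
unsigned long mdc_hz(unsigned long csr_hz, unsigned long divisor)
{
	assert(divisor != 0);
	return csr_hz / divisor;
}

/* Exclusive upper bound: a 300MHz CSR clock falls outside the range. */
bool csr_in_250_300_exclusive(unsigned long csr_hz)
{
	return csr_hz >= CSR_250MHZ && csr_hz < CSR_300MHZ;
}

/* Inclusive upper bound: a 300MHz CSR clock selects the /124 divisor. */
bool csr_in_250_300_inclusive(unsigned long csr_hz)
{
	return csr_hz >= CSR_250MHZ && csr_hz <= CSR_300MHZ;
}
```

      With a 300MHz input, mdc_hz() yields 2419354Hz for a divisor of 124 and
      7142857Hz for a divisor of 42, so only the former stays under MDC_MAX_HZ.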
      Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 10 Sep, 2021 6 commits
    • bpf, mm: Fix lockdep warning triggered by stack_map_get_build_id_offset() · 2f1aaf3e
      Yonghong Song authored
      Currently the bpf selftest "get_stack_raw_tp" triggered the warning:
      
        [ 1411.304463] WARNING: CPU: 3 PID: 140 at include/linux/mmap_lock.h:164 find_vma+0x47/0xa0
        [ 1411.304469] Modules linked in: bpf_testmod(O) [last unloaded: bpf_testmod]
        [ 1411.304476] CPU: 3 PID: 140 Comm: systemd-journal Tainted: G        W  O      5.14.0+ #53
        [ 1411.304479] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
        [ 1411.304481] RIP: 0010:find_vma+0x47/0xa0
        [ 1411.304484] Code: de 48 89 ef e8 ba f5 fe ff 48 85 c0 74 2e 48 83 c4 08 5b 5d c3 48 8d bf 28 01 00 00 be ff ff ff ff e8 2d 9f d8 00 85 c0 75 d4 <0f> 0b 48 89 de 48 8
        [ 1411.304487] RSP: 0018:ffffabd440403db8 EFLAGS: 00010246
        [ 1411.304490] RAX: 0000000000000000 RBX: 00007f00ad80a0e0 RCX: 0000000000000000
        [ 1411.304492] RDX: 0000000000000001 RSI: ffffffff9776b144 RDI: ffffffff977e1b0e
        [ 1411.304494] RBP: ffff9cf5c2f50000 R08: ffff9cf5c3eb25d8 R09: 00000000fffffffe
        [ 1411.304496] R10: 0000000000000001 R11: 00000000ef974e19 R12: ffff9cf5c39ae0e0
        [ 1411.304498] R13: 0000000000000000 R14: 0000000000000000 R15: ffff9cf5c39ae0e0
        [ 1411.304501] FS:  00007f00ae754780(0000) GS:ffff9cf5fba00000(0000) knlGS:0000000000000000
        [ 1411.304504] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        [ 1411.304506] CR2: 000000003e34343c CR3: 0000000103a98005 CR4: 0000000000370ee0
        [ 1411.304508] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        [ 1411.304510] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        [ 1411.304512] Call Trace:
        [ 1411.304517]  stack_map_get_build_id_offset+0x17c/0x260
        [ 1411.304528]  __bpf_get_stack+0x18f/0x230
        [ 1411.304541]  bpf_get_stack_raw_tp+0x5a/0x70
        [ 1411.305752] RAX: 0000000000000000 RBX: 5541f689495641d7 RCX: 0000000000000000
        [ 1411.305756] RDX: 0000000000000001 RSI: ffffffff9776b144 RDI: ffffffff977e1b0e
        [ 1411.305758] RBP: ffff9cf5c02b2f40 R08: ffff9cf5ca7606c0 R09: ffffcbd43ee02c04
        [ 1411.306978]  bpf_prog_32007c34f7726d29_bpf_prog1+0xaf/0xd9c
        [ 1411.307861] R10: 0000000000000001 R11: 0000000000000044 R12: ffff9cf5c2ef60e0
        [ 1411.307865] R13: 0000000000000005 R14: 0000000000000000 R15: ffff9cf5c2ef6108
        [ 1411.309074]  bpf_trace_run2+0x8f/0x1a0
        [ 1411.309891] FS:  00007ff485141700(0000) GS:ffff9cf5fae00000(0000) knlGS:0000000000000000
        [ 1411.309896] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        [ 1411.311221]  syscall_trace_enter.isra.20+0x161/0x1f0
        [ 1411.311600] CR2: 00007ff48514d90e CR3: 0000000107114001 CR4: 0000000000370ef0
        [ 1411.312291]  do_syscall_64+0x15/0x80
        [ 1411.312941] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        [ 1411.313803]  entry_SYSCALL_64_after_hwframe+0x44/0xae
        [ 1411.314223] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        [ 1411.315082] RIP: 0033:0x7f00ad80a0e0
        [ 1411.315626] Call Trace:
        [ 1411.315632]  stack_map_get_build_id_offset+0x17c/0x260
      
      To reproduce, first build `test_progs` binary:
      
        make -C tools/testing/selftests/bpf -j60
      
      and then run the binary at tools/testing/selftests/bpf directory:
      
        ./test_progs -t get_stack_raw_tp
      
      The warning is due to commit 5b78ed24 ("mm/pagemap: add mmap_assert_locked()
      annotations to find_vma*()") which added mmap_assert_locked() in find_vma()
      function. The mmap_assert_locked() function asserts that mm->mmap_lock needs
      to be held. But this is not the case for bpf_get_stack() or bpf_get_stackid()
      helper (kernel/bpf/stackmap.c), which uses mmap_read_trylock_non_owner()
      instead. Since mm->mmap_lock is not held in bpf_get_stack[id]() use case,
      the above warning is emitted during test run.
      
      This patch fixes the issue by (1) using mmap_read_trylock() instead of
      mmap_read_trylock_non_owner() to satisfy lockdep checking in find_vma(), and
      (2) dropping lockdep for mmap_lock right before the irq_work_queue(). The
      function mmap_read_trylock_non_owner() is also removed since after this
      patch nobody calls it any more.
      
      Fixes: 5b78ed24 ("mm/pagemap: add mmap_assert_locked() annotations to find_vma*()")
      Suggested-by: Jason Gunthorpe <jgg@ziepe.ca>
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
      Cc: Luigi Rizzo <lrizzo@google.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: linux-mm@kvack.org
      Link: https://lore.kernel.org/bpf/20210909155000.1610299-1-yhs@fb.com
    • qlcnic: Remove redundant initialization of variable ret · 666eb96d
      Colin Ian King authored
      The variable ret is being initialized with a value that is never read; it
      is updated later on. The assignment is redundant and can be removed.
      
      Addresses-Coverity: ("Unused value")
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qed: Handle management FW error · 20e100f5
      Shai Malin authored
      Handle MFW (management FW) error response in order to avoid a crash
      during recovery flows.
      
      Changes from v1:
      - Add "Fixes tag".
      
      Fixes: 5e7ba042 ("qed: Fix reading stale configuration information")
      Signed-off-by: Ariel Elior <aelior@marvell.com>
      Signed-off-by: Shai Malin <smalin@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/packet: clarify source of pr_*() messages · dc41c4a9
      Baruch Siach authored
      Add pr_fmt macro to spell out the source of messages in prefix.
      
      Before this patch:
      
        packet size is too long (1543 > 1518)
      
      With this patch:
      
        af_packet: packet size is too long (1543 > 1518)
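
      The mechanism is compile-time string concatenation of a prefix onto every
      format string. A userspace sketch, where MODNAME stands in for the
      kernel's KBUILD_MODNAME and the two helper functions are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MODNAME "af_packet"		/* stands in for KBUILD_MODNAME */
#define pr_fmt(fmt) MODNAME ": " fmt	/* prefix is glued on at compile time */

/* Formats the "too long" message with the module prefix into 'buf'. */
int format_too_long(char *buf, size_t size, int len, int max)
{
	assert(buf != NULL && size > 0);
	return snprintf(buf, size,
			pr_fmt("packet size is too long (%d > %d)"), len, max);
}

/* True if 'msg' starts with the module prefix. */
bool has_module_prefix(const char *msg)
{
	const char *prefix = MODNAME ": ";

	return strncmp(msg, prefix, strlen(prefix)) == 0;
}
```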
      Signed-off-by: Baruch Siach <baruch@tkos.co.il>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r6040: Restore MDIO clock frequency after MAC reset · e3f0cc1a
      Florian Fainelli authored
      A number of users have reported that they were not able to get the PHY
      to successfully link up, especially after commit c36757eb ("net:
      phy: consider AN_RESTART status when reading link status") where we
      stopped reading just BMSR, but we also read BMCR to determine the link
      status.
      
      Andrius at NetBSD did a wonderful job at debugging the problem
      and found out that the MDIO bus clock frequency would be incorrectly set
      back to its default value which would prevent the MDIO bus controller
      from reading PHY registers properly. Back when we only read BMSR, if we
      read all 1s, we could falsely indicate a link status, though in general
      there is a cable plugged in, so this went unnoticed. After a second read
      of BMCR was added, a wrong read will lead to the inability to determine
      a link UP condition which is when it started to be visibly broken, even
      if it was long before that.
      
      The fix consists in restoring the value of the MD_CSR register that was
      set prior to the MAC reset.
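
      The save-and-restore pattern can be sketched against a fake register
      file. Everything here is a model: struct fake_mac, the register indices
      and the assumption that a reset zeroes all registers are hypothetical and
      do not describe the real r6040 hardware or driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register indices for a fake device model. */
enum { REG_MCR0, REG_MD_CSR, REG_COUNT };

struct fake_mac { uint16_t regs[REG_COUNT]; };

/* Models the hardware side effect: a reset returns registers to zero. */
void hw_reset(struct fake_mac *mac)
{
	for (int i = 0; i < REG_COUNT; i++)
		mac->regs[i] = 0;
}

/* Reset the MAC but carry the MDIO clock setting across the reset. */
void mac_reset_preserving_md_csr(struct fake_mac *mac)
{
	assert(mac != NULL);

	uint16_t md_csr = mac->regs[REG_MD_CSR];	/* save before reset */

	hw_reset(mac);
	mac->regs[REG_MD_CSR] = md_csr;			/* restore afterwards */
}
```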
      
      Link: http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=53494
      Fixes: 90f750a8 ("r6040: consolidate MAC reset to its own function")
      Reported-by: Andrius V <vezhlys@gmail.com>
      Reported-by: Darek Strugacz <darek.strugacz@op.pl>
      Tested-by: Darek Strugacz <darek.strugacz@op.pl>
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ice: Correctly deal with PFs that do not support RDMA · bfe84435
      Dave Ertman authored
      There are two cases where the current PF does not support RDMA
      functionality.  The first is if the NVM loaded on the device is set
      to not support RDMA (common_caps.rdma is false).  The second is if
      the kernel bonding driver has included the current PF in an active
      link aggregate.
      
      When the driver has determined that this PF does not support RDMA, then
      auxiliary devices should not be created on the auxiliary bus.  Without
      a device on the auxiliary bus, even if the irdma driver is present, there
      will be no RDMA activity attempted on this PF.
      
      Currently, in the reset flow, an attempt to create auxiliary devices is
      performed without regard to the ability of the PF.  There needs to be a
      check in ice_aux_plug_dev (as the central point that creates auxiliary
      devices) to see if the PF is in a state to support the functionality.
      
      When disabling and re-enabling RDMA due to the inclusion/removal of the PF
      in a link aggregate, we also need to set/clear the bit which controls
      auxiliary device creation so that a reset recovery in a link aggregate
      situation doesn't try to create auxiliary devices when it shouldn't.
      
      Fixes: f9f5301e ("ice: Register auxiliary device to provide RDMA")
      Reported-by: Yongxin Liu <yongxin.liu@windriver.com>
      Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>