1. 07 Dec, 2023 15 commits
  2. 06 Dec, 2023 25 commits
    • selftests/bpf: Add test for early update in prog_array_map_poke_run · ffed24ef
      Jiri Olsa authored
      Add a test that tries to trigger the BUG_ON during an early map update
      in the prog_array_map_poke_run function.
      
      The idea is to share a prog array map between a thread that constantly
      updates it and another one loading a program that uses that prog
      array.
      
      Eventually we will hit a place where the program is ok to be updated
      (poke->tailcall_target_stable check) but the address is still not
      registered in kallsyms, so bpf_arch_text_poke returns -EINVAL
      and causes an imbalance for the next tail call update check, which will
      fail with -EBUSY in bpf_arch_text_poke as described in the previous fix.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
      Link: https://lore.kernel.org/bpf/20231206083041.1306660-3-jolsa@kernel.org
    • bpf: Fix prog_array_map_poke_run map poke update · 4b7de801
      Jiri Olsa authored
      Lee pointed out an issue found by syzkaller [0] hitting a BUG in the prog
      array map poke update in the prog_array_map_poke_run function, due to an
      error value returned from the bpf_arch_text_poke function.
      
      There's a race window where bpf_arch_text_poke can fail due to missing
      bpf program kallsym symbols, which is accounted for with a check for
      -EINVAL in that BUG_ON call.
      
      The problem is that in such a case we won't update the tail call jump,
      which causes an imbalance for the next tail call update check, which
      will fail with -EBUSY in bpf_arch_text_poke.
      
      I'm hitting the following race during the program load:
      
        CPU 0                             CPU 1
      
        bpf_prog_load
          bpf_check
            do_misc_fixups
              prog_array_map_poke_track
      
                                          map_update_elem
                                            bpf_fd_array_map_update_elem
                                              prog_array_map_poke_run
      
                                                bpf_arch_text_poke returns -EINVAL
      
          bpf_prog_kallsyms_add
      
      After bpf_arch_text_poke (CPU 1) fails to update the tail call jump, the next
      poke update fails on expected jump instruction check in bpf_arch_text_poke
      with -EBUSY and triggers the BUG_ON in prog_array_map_poke_run.
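
The imbalance described above can be sketched in userspace. This is only a toy model, not the kernel code: `toy_site`, `toy_text_poke` and `toy_replay_race` are hypothetical names, and the model assumes a poke that rejects unregistered addresses with -EINVAL and rejects an unexpected old instruction with -EBUSY, as the commit message describes.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-ins for what can sit at a tail call site (hypothetical). */
enum toy_insn { TOY_NOP, TOY_JMP_A, TOY_JMP_B };

struct toy_site {
	enum toy_insn insn;  /* what is really patched in at the site */
	bool in_kallsyms;    /* has the owning program been registered yet? */
};

/*
 * Simplified model of a text poke: refuse unknown addresses with -EINVAL,
 * refuse with -EBUSY when the site does not hold the expected old
 * instruction, otherwise patch in the new one.
 */
int toy_text_poke(struct toy_site *site, enum toy_insn old_insn,
		  enum toy_insn new_insn)
{
	if (!site->in_kallsyms)
		return -EINVAL;
	if (site->insn != old_insn)
		return -EBUSY;
	site->insn = new_insn;
	return 0;
}

/* Replays the race and returns the result of the second map update. */
int toy_replay_race(void)
{
	struct toy_site site = { .insn = TOY_NOP, .in_kallsyms = false };
	int err;

	/* First map update lands before the program is in kallsyms. */
	err = toy_text_poke(&site, TOY_NOP, TOY_JMP_A);
	assert(err == -EINVAL);   /* tolerated, but the site still holds NOP */

	site.in_kallsyms = true;  /* program load finishes */

	/* The map now holds A, so the next update expects JMP_A at the site. */
	return toy_text_poke(&site, TOY_JMP_A, TOY_JMP_B);
}
```

In this model `toy_replay_race()` returns -EBUSY: the first poke was skipped, so the site and the map disagree about the old instruction.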
      
      A similar race exists on program unload.
      
      Fix this by moving the update to the bpf_arch_poke_desc_update function,
      which makes sure we call __bpf_arch_text_poke, which skips the bpf address
      check.
      
      Each architecture has a slightly different approach to looking up the bpf
      address in bpf_arch_text_poke, so instead of splitting the function or
      adding a new 'checkip' argument as in the previous version, it seems best
      to move the whole map_poke_run update into arch specific code.
      
        [0] https://syzkaller.appspot.com/bug?extid=97a4fe20470e9bc30810
      
      Fixes: ebf7d1f5 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT")
      Reported-by: syzbot+97a4fe20470e9bc30810@syzkaller.appspotmail.com
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Yonghong Song <yonghong.song@linux.dev>
      Cc: Lee Jones <lee@kernel.org>
      Cc: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
      Link: https://lore.kernel.org/bpf/20231206083041.1306660-2-jolsa@kernel.org
    • netfilter: xt_owner: Fix for unsafe access of sk->sk_socket · 7ae836a3
      Phil Sutter authored
      A concurrently running sock_orphan() may NULL the sk_socket pointer in
      between check and deref. Follow other users (like nft_meta.c for
      instance) and acquire sk_callback_lock before dereferencing sk_socket.
      
      Fixes: 0265ab44 ("[NETFILTER]: merge ipt_owner/ip6t_owner in xt_owner")
      Reported-by: Jann Horn <jannh@google.com>
      Signed-off-by: Phil Sutter <phil@nwl.cc>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nf_tables: validate family when identifying table via handle · f6e1532a
      Pablo Neira Ayuso authored
      Validate the table family when looking it up via NFTA_TABLE_HANDLE.
      
      Fixes: 3ecbfd65 ("netfilter: nf_tables: allocate handle and delete objects via handle")
      Reported-by: Xingyuan Mo <hdthky0@gmail.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nf_tables: bail out on mismatching dynset and set expressions · 3701cd39
      Pablo Neira Ayuso authored
      If the number of dynset expressions provided by userspace is larger than
      the number of declared set expressions, then bail out.
      
      Fixes: 48b0ae04 ("netfilter: nftables: netlink support for several set element expressions")
      Reported-by: Xingyuan Mo <hdthky0@gmail.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nf_tables: fix 'exist' matching on bigendian arches · 63331e37
      Florian Westphal authored
      Maze reports "tcp option fastopen exists" fails to match on
      OpenWrt 22.03.5, r20134-5f15225c1e (5.10.176) router.
      
      "tcp option fastopen exists" translates to:
      inet
        [ exthdr load tcpopt 1b @ 34 + 0 present => reg 1 ]
        [ cmp eq reg 1 0x00000001 ]
      
      .. but existing nft userspace generates a 1-byte compare.
      
      On LSB (x86), "*reg32 = 1" is identical to nft_reg_store8(reg32, 1), but
      not on MSB, which will place the 1 last. IOW, on bigendian arches the cmp8
      is always false.
      
      Make sure we store this in a consistent fashion, so existing userspace
      will also work on MSB (bigendian).
      
      Regardless of this patch we can also change nft userspace to generate
      'reg32 == 0' and 'reg32 != 0' instead of u8 == 0 // u8 == 1 when
      adding 'option x missing/exists' expressions as well.
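
The byte-order effect described above can be reproduced with a small userspace sketch. The helpers below are hypothetical stand-ins (they are not the kernel's nft_reg_store8() or the nft compare code); they only assume that a 1-byte compare looks at the first byte of a 32-bit register.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Byte layout of a 32-bit register holding `val` for a given byte order. */
void toy_store32(uint8_t reg[4], uint32_t val, int big_endian)
{
	for (int i = 0; i < 4; i++) {
		int shift = big_endian ? 8 * (3 - i) : 8 * i;
		reg[i] = (uint8_t)(val >> shift);
	}
}

/* Store one byte at the start of the register and zero the rest. */
void toy_store8(uint8_t reg[4], uint8_t val)
{
	memset(reg, 0, 4);
	reg[0] = val;
}

/* A 1-byte compare only ever looks at the first byte of the register. */
int toy_cmp8_eq(const uint8_t reg[4], uint8_t val)
{
	return reg[0] == val;
}

/* Walks through the three cases from the commit message. */
void toy_demo_exists_match(void)
{
	uint8_t reg[4];

	toy_store32(reg, 1, 0);          /* "*reg32 = 1" on LSB */
	assert(toy_cmp8_eq(reg, 1));

	toy_store32(reg, 1, 1);          /* "*reg32 = 1" on MSB: the 1 is last */
	assert(!toy_cmp8_eq(reg, 1));

	toy_store8(reg, 1);              /* byte-sized store: same on both */
	assert(toy_cmp8_eq(reg, 1));
}
```

Storing the flag with a byte-sized store gives the same layout regardless of byte order, which is why the 1-byte compare then matches everywhere.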
      
      Fixes: 3c1fece8 ("netfilter: nft_exthdr: Allow checking TCP option presence, too")
      Fixes: b9f9a485 ("netfilter: nft_exthdr: add boolean DCCP option matching")
      Fixes: 055c4b34 ("netfilter: nft_fib: Support existence check")
      Reported-by: Maciej Żenczykowski <zenczykowski@gmail.com>
      Closes: https://lore.kernel.org/netfilter-devel/CAHo-OozyEqHUjL2-ntATzeZOiuftLWZ_HU6TOM_js4qLfDEAJg@mail.gmail.com/
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Acked-by: Phil Sutter <phil@nwl.cc>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nft_set_pipapo: skip inactive elements during set walk · 317eb968
      Florian Westphal authored
      Otherwise set elements can be deactivated twice which will cause a crash.
      Reported-by: Xingyuan Mo <hdthky0@gmail.com>
      Fixes: 3c4287f6 ("nf_tables: Add set type for arbitrary concatenation of ranges")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: bpf: fix bad registration on nf_defrag · 1834d62a
      D. Wythe authored
      We should pass a pointer to global_hook to get_proto_defrag_hook()
      instead of its value, since the passed value won't be updated even if
      the requested module was loaded successfully.
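
The difference between passing the value and passing a pointer can be shown with a minimal userspace sketch. All names here (`toy_hook`, `toy_global_hook`, `toy_request_module`, and the two getters) are hypothetical and only mimic the shape of the problem; the real function signature is not reproduced.

```c
#include <assert.h>
#include <stddef.h>

struct toy_hook { const char *name; };

static struct toy_hook toy_defrag_hook = { "toy_defrag" };

/* Stand-in for a global hook pointer that a module fills in on load. */
static struct toy_hook *toy_global_hook;

/* Pretend module load: registers the hook in the global pointer. */
static void toy_request_module(void)
{
	toy_global_hook = &toy_defrag_hook;
}

/* Buggy shape: the caller passes the pointer's current (NULL) value. */
struct toy_hook *toy_get_hook_by_value(struct toy_hook *hook)
{
	if (!hook)
		toy_request_module();
	return hook;  /* still the stale copy taken before the load */
}

/* Fixed shape: the caller passes the address of the global pointer. */
struct toy_hook *toy_get_hook_by_ref(struct toy_hook **hookp)
{
	assert(hookp != NULL);
	if (!*hookp)
		toy_request_module();
	return *hookp;  /* re-read after the load, so the update is seen */
}

/* Each helper starts from an unloaded state; returns 1 if a hook was found. */
int toy_attach_by_value(void)
{
	toy_global_hook = NULL;
	return toy_get_hook_by_value(toy_global_hook) != NULL;
}

int toy_attach_by_ref(void)
{
	toy_global_hook = NULL;
	return toy_get_hook_by_ref(&toy_global_hook) != NULL;
}
```

In this sketch `toy_attach_by_value()` fails even though the "module" loaded, while `toy_attach_by_ref()` succeeds.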
      
      Log:
      
      [   54.915713] nf_defrag_ipv4 has bad registration
      [   54.915779] WARNING: CPU: 3 PID: 6323 at net/netfilter/nf_bpf_link.c:62 get_proto_defrag_hook+0x137/0x160
      [   54.915835] CPU: 3 PID: 6323 Comm: fentry Kdump: loaded Tainted: G            E      6.7.0-rc2+ #35
      [   54.915839] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
      [   54.915841] RIP: 0010:get_proto_defrag_hook+0x137/0x160
      [   54.915844] Code: 4f 8c e8 2c cf 68 ff 80 3d db 83 9a 01 00 0f 85 74 ff ff ff 48 89 ee 48 c7 c7 8f 12 4f 8c c6 05 c4 83 9a 01 01 e8 09 ee 5f ff <0f> 0b e9 57 ff ff ff 49 8b 3c 24 4c 63 e5 e8 36 28 6c ff 4c 89 e0
      [   54.915849] RSP: 0018:ffffb676003fbdb0 EFLAGS: 00010286
      [   54.915852] RAX: 0000000000000023 RBX: ffff9596503d5600 RCX: ffff95996fce08c8
      [   54.915854] RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffff95996fce08c0
      [   54.915855] RBP: ffffffff8c4f12de R08: 0000000000000000 R09: 00000000fffeffff
      [   54.915859] R10: ffffb676003fbc70 R11: ffffffff8d363ae8 R12: 0000000000000000
      [   54.915861] R13: ffffffff8e1f75c0 R14: ffffb676003c9000 R15: 00007ffd15e78ef0
      [   54.915864] FS:  00007fb6e9cab740(0000) GS:ffff95996fcc0000(0000) knlGS:0000000000000000
      [   54.915867] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   54.915868] CR2: 00007ffd15e75c40 CR3: 0000000101e62006 CR4: 0000000000360ef0
      [   54.915870] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   54.915871] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   54.915873] Call Trace:
      [   54.915891]  <TASK>
      [   54.915894]  ? __warn+0x84/0x140
      [   54.915905]  ? get_proto_defrag_hook+0x137/0x160
      [   54.915908]  ? __report_bug+0xea/0x100
      [   54.915925]  ? report_bug+0x2b/0x80
      [   54.915928]  ? handle_bug+0x3c/0x70
      [   54.915939]  ? exc_invalid_op+0x18/0x70
      [   54.915942]  ? asm_exc_invalid_op+0x1a/0x20
      [   54.915948]  ? get_proto_defrag_hook+0x137/0x160
      [   54.915950]  bpf_nf_link_attach+0x1eb/0x240
      [   54.915953]  link_create+0x173/0x290
      [   54.915969]  __sys_bpf+0x588/0x8f0
      [   54.915974]  __x64_sys_bpf+0x20/0x30
      [   54.915977]  do_syscall_64+0x45/0xf0
      [   54.915989]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
      [   54.915998] RIP: 0033:0x7fb6e9daa51d
      [   54.916001] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2b 89 0c 00 f7 d8 64 89 01 48
      [   54.916003] RSP: 002b:00007ffd15e78ed8 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
      [   54.916006] RAX: ffffffffffffffda RBX: 00007ffd15e78fc0 RCX: 00007fb6e9daa51d
      [   54.916007] RDX: 0000000000000040 RSI: 00007ffd15e78ef0 RDI: 000000000000001c
      [   54.916009] RBP: 000000000000002d R08: 00007fb6e9e73a60 R09: 0000000000000001
      [   54.916010] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000006
      [   54.916012] R13: 0000000000000006 R14: 0000000000000000 R15: 0000000000000000
      [   54.916014]  </TASK>
      [   54.916015] ---[ end trace 0000000000000000 ]---
      
      Fixes: 91721c2d ("netfilter: bpf: Support BPF_F_NETFILTER_IP_DEFRAG in netfilter link")
      Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
      Acked-by: Daniel Xu <dxu@dxuuu.xyz>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • leds: trigger: netdev: fix RTNL handling to prevent potential deadlock · fe2b1226
      Heiner Kallweit authored
      When working on LED support for r8169 I got the following lockdep
      warning. Easiest way to prevent this scenario seems to be to take
      the RTNL lock before the trigger_data lock in set_device_name().
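
Why a consistent lock order avoids the warning can be sketched with a toy order checker. This is not lockdep: it is a hypothetical two-lock model that only detects a direct inversion, and it assumes the old set_device_name() path took the trigger_data lock before RTNL, as the dependency chain in the report suggests.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_lock { TOY_RTNL, TOY_TRIGGER_DATA, TOY_NR_LOCKS };

struct toy_lockdep {
	/* seen[a][b] is true once b was acquired while a was held. */
	bool seen[TOY_NR_LOCKS][TOY_NR_LOCKS];
	bool held[TOY_NR_LOCKS];
};

/* Returns false if acquiring `l` now inverts an order recorded earlier. */
bool toy_acquire(struct toy_lockdep *ld, enum toy_lock l)
{
	bool ok = true;

	assert(l < TOY_NR_LOCKS);
	for (int h = 0; h < TOY_NR_LOCKS; h++) {
		if (!ld->held[h])
			continue;
		if (ld->seen[l][h])
			ok = false;      /* l before h was seen; now h before l */
		ld->seen[h][l] = true;
	}
	ld->held[l] = true;
	return ok;
}

void toy_release(struct toy_lockdep *ld, enum toy_lock l)
{
	ld->held[l] = false;
}

/* Runs both code paths; returns true if no inversion was detected. */
bool toy_run(bool rtnl_first)
{
	struct toy_lockdep ld = {0};
	enum toy_lock first = rtnl_first ? TOY_RTNL : TOY_TRIGGER_DATA;
	enum toy_lock second = rtnl_first ? TOY_TRIGGER_DATA : TOY_RTNL;
	bool ok = true;

	/* set_device_name()-like path */
	ok = toy_acquire(&ld, first) && ok;
	ok = toy_acquire(&ld, second) && ok;
	toy_release(&ld, second);
	toy_release(&ld, first);

	/* notifier-like path: RTNL is already held, then trigger_data lock */
	ok = toy_acquire(&ld, TOY_RTNL) && ok;
	ok = toy_acquire(&ld, TOY_TRIGGER_DATA) && ok;
	toy_release(&ld, TOY_TRIGGER_DATA);
	toy_release(&ld, TOY_RTNL);

	return ok;
}
```

Here `toy_run(false)` reports an inversion, while `toy_run(true)`, which takes RTNL first in both paths, does not.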
      
      ======================================================
      WARNING: possible circular locking dependency detected
      6.7.0-rc2-next-20231124+ #2 Not tainted
      ------------------------------------------------------
      bash/383 is trying to acquire lock:
      ffff888103aa1c68 (&trigger_data->lock){+.+.}-{3:3}, at: netdev_trig_notify+0xec/0x190 [ledtrig_netdev]
      
      but task is already holding lock:
      ffffffff8cddf808 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock+0x12/0x20
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #1 (rtnl_mutex){+.+.}-{3:3}:
             __mutex_lock+0x9b/0xb50
             mutex_lock_nested+0x16/0x20
             rtnl_lock+0x12/0x20
             set_device_name+0xa9/0x120 [ledtrig_netdev]
             netdev_trig_activate+0x1a1/0x230 [ledtrig_netdev]
             led_trigger_set+0x172/0x2c0
             led_trigger_write+0xf1/0x140
             sysfs_kf_bin_write+0x5d/0x80
             kernfs_fop_write_iter+0x15d/0x210
             vfs_write+0x1f0/0x510
             ksys_write+0x6c/0xf0
             __x64_sys_write+0x14/0x20
             do_syscall_64+0x3f/0xf0
             entry_SYSCALL_64_after_hwframe+0x6c/0x74
      
      -> #0 (&trigger_data->lock){+.+.}-{3:3}:
             __lock_acquire+0x1459/0x25a0
             lock_acquire+0xc8/0x2d0
             __mutex_lock+0x9b/0xb50
             mutex_lock_nested+0x16/0x20
             netdev_trig_notify+0xec/0x190 [ledtrig_netdev]
             call_netdevice_register_net_notifiers+0x5a/0x100
             register_netdevice_notifier+0x85/0x120
             netdev_trig_activate+0x1d4/0x230 [ledtrig_netdev]
             led_trigger_set+0x172/0x2c0
             led_trigger_write+0xf1/0x140
             sysfs_kf_bin_write+0x5d/0x80
             kernfs_fop_write_iter+0x15d/0x210
             vfs_write+0x1f0/0x510
             ksys_write+0x6c/0xf0
             __x64_sys_write+0x14/0x20
             do_syscall_64+0x3f/0xf0
             entry_SYSCALL_64_after_hwframe+0x6c/0x74
      
      other info that might help us debug this:
      
       Possible unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(rtnl_mutex);
                                     lock(&trigger_data->lock);
                                     lock(rtnl_mutex);
        lock(&trigger_data->lock);
      
       *** DEADLOCK ***
      
      8 locks held by bash/383:
       #0: ffff888103ff33f0 (sb_writers#3){.+.+}-{0:0}, at: ksys_write+0x6c/0xf0
       #1: ffff888103aa1e88 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x114/0x210
       #2: ffff8881036f1890 (kn->active#82){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x11d/0x210
       #3: ffff888108e2c358 (&led_cdev->led_access){+.+.}-{3:3}, at: led_trigger_write+0x30/0x140
       #4: ffffffff8cdd9e10 (triggers_list_lock){++++}-{3:3}, at: led_trigger_write+0x75/0x140
       #5: ffff888108e2c270 (&led_cdev->trigger_lock){++++}-{3:3}, at: led_trigger_write+0xe3/0x140
       #6: ffffffff8cdde3d0 (pernet_ops_rwsem){++++}-{3:3}, at: register_netdevice_notifier+0x1c/0x120
       #7: ffffffff8cddf808 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock+0x12/0x20
      
      stack backtrace:
      CPU: 0 PID: 383 Comm: bash Not tainted 6.7.0-rc2-next-20231124+ #2
      Hardware name: Default string Default string/Default string, BIOS ADLN.M6.SODIMM.ZB.CY.015 08/08/2023
      Call Trace:
       <TASK>
       dump_stack_lvl+0x5c/0xd0
       dump_stack+0x10/0x20
       print_circular_bug+0x2dd/0x410
       check_noncircular+0x131/0x150
       __lock_acquire+0x1459/0x25a0
       lock_acquire+0xc8/0x2d0
       ? netdev_trig_notify+0xec/0x190 [ledtrig_netdev]
       __mutex_lock+0x9b/0xb50
       ? netdev_trig_notify+0xec/0x190 [ledtrig_netdev]
       ? __this_cpu_preempt_check+0x13/0x20
       ? netdev_trig_notify+0xec/0x190 [ledtrig_netdev]
       ? __cancel_work_timer+0x11c/0x1b0
       ? __mutex_lock+0x123/0xb50
       mutex_lock_nested+0x16/0x20
       ? mutex_lock_nested+0x16/0x20
       netdev_trig_notify+0xec/0x190 [ledtrig_netdev]
       call_netdevice_register_net_notifiers+0x5a/0x100
       register_netdevice_notifier+0x85/0x120
       netdev_trig_activate+0x1d4/0x230 [ledtrig_netdev]
       led_trigger_set+0x172/0x2c0
       ? preempt_count_add+0x49/0xc0
       led_trigger_write+0xf1/0x140
       sysfs_kf_bin_write+0x5d/0x80
       kernfs_fop_write_iter+0x15d/0x210
       vfs_write+0x1f0/0x510
       ksys_write+0x6c/0xf0
       __x64_sys_write+0x14/0x20
       do_syscall_64+0x3f/0xf0
       entry_SYSCALL_64_after_hwframe+0x6c/0x74
      RIP: 0033:0x7f269055d034
      Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d 35 c3 0d 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 48 83 ec 28 48 89 54 24 18 48
      RSP: 002b:00007ffddb7ef748 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
      RAX: ffffffffffffffda RBX: 0000000000000007 RCX: 00007f269055d034
      RDX: 0000000000000007 RSI: 000055bf5f4af3c0 RDI: 0000000000000001
      RBP: 000055bf5f4af3c0 R08: 0000000000000073 R09: 0000000000000001
      R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000007
      R13: 00007f26906325c0 R14: 00007f269062ff20 R15: 0000000000000000
       </TASK>
      
      Fixes: d5e01266 ("leds: trigger: netdev: add additional specific link speed mode")
      Cc: stable@vger.kernel.org
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Acked-by: Lee Jones <lee@kernel.org>
      Link: https://lore.kernel.org/r/fb5c8294-2a10-4bf5-8f10-3d2b77d2757e@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'octeontx2-af-miscellaneous-fixes' · 2078a341
      Paolo Abeni authored
      Geetha sowjanya says:
      
      ====================
      octeontx2-af: miscellaneous fixes
      
      The series of patches fixes various issues related to mcs
      and NIX link registers.
      
      v3-v4:
       Used FIELD_PREP macro and proper data types.
      
      v2-v3:
       Fixed typo error in patch 4 commit message.
      
      v1-v2:
       Fixed author name for patch 5.
       Added Reviewed-by.
      ====================
      
      Link: https://lore.kernel.org/r/20231205080434.27604-1-gakula@marvell.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • octeontx2-af: Update Tx link register range · 7336fc19
      Rahul Bhansali authored
      On new silicon, the TX channels for the transmit level have increased.
      This patch fixes the respective register offset range to
      configure the newly added channels.
      
      Fixes: b279bbb3 ("octeontx2-af: NIX Tx scheduler queue config support")
      Signed-off-by: Rahul Bhansali <rbhansali@marvell.com>
      Signed-off-by: Geetha sowjanya <gakula@marvell.com>
      Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • octeontx2-af: Add missing mcs flr handler call · d431abd0
      Geetha sowjanya authored
      If mcs resources are attached to a PF/VF, these resources need
      to be freed on FLR. This patch adds the missing mcs flr call on PF FLR.
      
      Fixes: bd69476e ("octeontx2-af: cn10k: mcs: Install a default TCAM for normal traffic")
      Signed-off-by: Geetha sowjanya <gakula@marvell.com>
      Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • octeontx2-af: Fix mcs stats register address · 3ba98a8c
      Geetha sowjanya authored
      This patch adds the missing mcs stats register
      for mcs supported platforms.
      
      Fixes: 9312150a ("octeontx2-af: cn10k: mcs: Support for stats collection")
      Signed-off-by: Geetha sowjanya <gakula@marvell.com>
      Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • octeontx2-af: Fix mcs sa cam entries size · 9723b2cc
      Geetha sowjanya authored
      On latest silicon versions SA cam entries increased to 256.
      This patch fixes the datatype of sa_entries in mcs_hw_info
      struct to u16 to hold 256 entries.
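
A minimal sketch of why an 8-bit field cannot hold this count. The two structs are hypothetical stand-ins, not the real mcs_hw_info layout, and the sketch assumes the field was previously 8 bits wide, which the commit message implies but does not state.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical "before" and "after" layouts of the entry count field. */
struct toy_hw_info_old { uint8_t sa_entries; };
struct toy_hw_info_new { uint16_t sa_entries; };

/* Stores n in the 8-bit field and returns what the field then holds. */
unsigned int toy_store_old(unsigned int n)
{
	struct toy_hw_info_old hw;

	hw.sa_entries = (uint8_t)n;   /* value is reduced modulo 256 */
	return hw.sa_entries;
}

/* Stores n in the 16-bit field and returns what the field then holds. */
unsigned int toy_store_new(unsigned int n)
{
	struct toy_hw_info_new hw;

	assert(n <= UINT16_MAX);
	hw.sa_entries = (uint16_t)n;
	return hw.sa_entries;
}
```

With these helpers, `toy_store_old(256)` yields 0 while `toy_store_new(256)` yields 256.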
      
      Fixes: 080bbd19 ("octeontx2-af: cn10k: mcs: Add mailboxes for port related operations")
      Signed-off-by: Geetha sowjanya <gakula@marvell.com>
      Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • octeontx2-af: Adjust Tx credits when MCS external bypass is disabled · dca6fa86
      Nithin Dabilpuram authored
      When MCS external bypass is disabled, MCS returns an additional
      2 credits (32B) for every packet Tx'ed on LMAC. To account for
      these extra credits, NIX_AF_TX_LINKX_NORM_CREDIT.CC_MCS_CNT
      needs to be configured, as otherwise NIX Tx credits would overflow
      and would never return to the idle state credit count,
      causing issues with credit control and MTU change.
      
      This patch fixes this by configuring CC_MCS_CNT at probe
      time for MCS enabled SoCs.
      
      Fixes: bd69476e ("octeontx2-af: cn10k: mcs: Install a default TCAM for normal traffic")
      Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
      Signed-off-by: Geetha sowjanya <gakula@marvell.com>
      Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
      Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • Merge branch 'tcp-ao-fixes' · 3142dbf0
      Paolo Abeni authored
      Dmitry Safonov says:
      
      ====================
      TCP-AO fixes
      
      Changes from v4:
      - Dropped 2 patches on which there's no consensus. They will require
        more work, TBD if they may be made acceptable. Those are:
        o "net/tcp: Allow removing current/rnext TCP-AO keys on TCP_LISTEN sockets"
        o "net/tcp: Store SNEs + SEQs on ao_info"
      
      Changes from v3:
      - Don't restrict adding any keys on TCP-AO connection in VRF, but only
        the ones that don't match l3index (David)
      
      Changes from v2:
      - rwlocks are problematic in net code (Paolo)
        Changed the SNE code to avoid spin/rw locks on RX/TX fastpath by
        double-accounting SEQ numbers for TCP-AO enabled connections.
      
      Changes from v1:
      - Use tcp_can_repair_sock() helper to limit TCP_AO_REPAIR (Eric)
      - Instead of hook to listen() syscall, allow removing current/rnext keys
        on TCP_LISTEN (addressing Eric's objection)
      - Add sne_lock to protect snd_sne/rcv_sne
      - Don't move used_tcp_ao in struct tcp_request_sock (Eric)
      
      I've been working on TCP-AO key-rotation selftests and as a result
      exercised some corner-cases that are not usually met in production.
      
      Here are a bunch of semi-related fixes:
      - Documentation typo (reported by Markus Elfring)
      - Proper alignment for TCP-AO option in TCP header that has MAC length
        of non 4 bytes (now a selftest with randomized maclen/algorithm/etc
        passes)
      - 3 uAPI restricting patches that disallow more things to userspace in
        order to prevent it shooting itself in any parts of the body
      - SNEs READ_ONCE()/WRITE_ONCE() that went missing by my human factor
      - Avoid storing MAC length from SYN header as SYN-ACK will use
        rnext_key.maclen (drops an extra check that fails on new selftests)
      ====================
      
      Link: https://lore.kernel.org/r/
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net/tcp: Don't store TCP-AO maclen on reqsk · 9396c4ee
      Dmitry Safonov authored
      This extra check doesn't work for a handshake when SYN segment has
      (current_key.maclen != rnext_key.maclen). It could be amended to
      preserve rnext_key.maclen instead of current_key.maclen, but that
      requires a lookup on listen socket.
      
      Originally, this extra maclen check was introduced just because it was
      cheap. Drop it and convert tcp_request_sock::maclen into boolean
      tcp_request_sock::used_tcp_ao.
      
      Fixes: 06b22ef2 ("net/tcp: Wire TCP-AO to request sockets")
      Signed-off-by: Dmitry Safonov <dima@arista.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net/tcp: Don't add key with non-matching VRF on connected sockets · 12083d72
      Dmitry Safonov authored
      If the connection was established, don't allow adding TCP-AO keys that
      don't match the peer. Currently, there are checks for ip-address
      matching, but L3 index check is missing. Add it to restrict userspace
      shooting itself somewhere.
      
      Yet, nothing restricts the CAP_NET_RAW user from trying to shoot
      themselves by performing setsockopt(SO_BINDTODEVICE) or
      setsockopt(SO_BINDTOIFINDEX) over an established TCP-AO connection.
      So, this is just "minimum effort" to potentially save someone's
      debugging time, rather than a full restriction on doing weird things.
      
      Fixes: 248411b8 ("net/tcp: Wire up l3index to TCP-AO")
      Signed-off-by: Dmitry Safonov <dima@arista.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net/tcp: Limit TCP_AO_REPAIR to non-listen sockets · 965c00e4
      Dmitry Safonov authored
      Listen socket is not an established TCP connection, so
      setsockopt(TCP_AO_REPAIR) doesn't have any impact.
      
      Restrict this uAPI for listen sockets.
      
      Fixes: faadfaba ("net/tcp: Add TCP_AO_REPAIR")
      Signed-off-by: Dmitry Safonov <dima@arista.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net/tcp: Consistently align TCP-AO option in the header · da7dfaa6
      Dmitry Safonov authored
      Currently functions that pre-calculate TCP header options length use
      unaligned TCP-AO header + MAC-length for skb reservation.
      And the functions that actually write TCP-AO options into skb do align
      the header. Nothing good can come out of this for ((maclen % 4) != 0).
      
      Provide tcp_ao_len_aligned() helper and use it everywhere for TCP
      header options space calculations.
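
The mismatch for ((maclen % 4) != 0) can be sketched as follows. The helper names are hypothetical (the real tcp_ao_len_aligned() takes a key, not a raw length), and the 4-byte option header is an assumption based on the TCP-AO option layout in RFC 5925 (Kind, Length, KeyID, RNextKeyID).

```c
#include <assert.h>
#include <stddef.h>

/* Assumed option header size: Kind, Length, KeyID, RNextKeyID. */
#define TOY_AO_HDR_LEN 4u

/* Unaligned option length: header plus MAC. */
size_t toy_ao_len(size_t maclen)
{
	return TOY_AO_HDR_LEN + maclen;
}

/* Option length rounded up to the next multiple of 4 bytes. */
size_t toy_ao_len_aligned(size_t maclen)
{
	size_t len = toy_ao_len(maclen);
	size_t aligned = (len + 3) & ~(size_t)3;

	assert(aligned % 4 == 0 && aligned >= len);
	return aligned;
}
```

For a 12-byte MAC both helpers return 16, so reservation and writing agree. For a 13-byte MAC the unaligned length is 17 but the aligned one is 20, which is the gap between what was reserved and what was written.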
      
      Fixes: 1e03d32b ("net/tcp: Add TCP-AO sign to outgoing packets")
      Signed-off-by: Dmitry Safonov <dima@arista.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • Documentation/tcp: Fix an obvious typo · 714589c2
      Dmitry Safonov authored
      Yep, my VIM spellchecker is not good enough for typos like this one.
      
      Fixes: 7fe0e38b ("Documentation/tcp: Add TCP-AO documentation")
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: linux-doc@vger.kernel.org
      Reported-by: Markus Elfring <Markus.Elfring@web.de>
      Closes: https://lore.kernel.org/all/2745ab4e-acac-40d4-83bf-37f2600d0c3d@web.de/
      Signed-off-by: Dmitry Safonov <dima@arista.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • Merge branch 'there-are-some-bugfix-for-the-hns-ethernet-driver' · 6b07b522
      Paolo Abeni authored
      Jijie Shao says:
      
      ====================
      There are some bugfixes for the HNS ethernet driver
      
      There are some bugfixes for the HNS ethernet driver
      ---
      changeLog:
      v2 -> v3:
        - Refine the commit msg as Wojciech suggested
        - Reconstruct the "hns_mac_link_anti_shake" function as suggested by Wojciech
        v2: https://lore.kernel.org/all/20231204011051.4055031-1-shaojijie@huawei.com/
      v1 -> v2:
        - Fixed the issue of an internal function not being declared static, as suggested by Jakub
        v1: https://lore.kernel.org/all/20231201102703.4134592-1-shaojijie@huawei.com/
      ---
      ====================
      
      Link: https://lore.kernel.org/r/20231204143232.3221542-1-shaojijie@huawei.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: hns: fix fake link up on xge port · f708aba4
      Yonglong Liu authored
      If an xge port is connected only to an optical module, with no fiber,
      it may report a fake link up because there may be interference on
      the hardware. This patch adds an anti-shake to avoid the problem.
      The anti-shake time is based on tests.
      
      Fixes: b917078c ("net: hns: Add ACPI support to check SFP present")
      Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: Jijie Shao <shaojijie@huawei.com>
      Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: hns: fix wrong head when modify the tx feature when sending packets · 84757d08
      Yonglong Liu authored
      Upon changing the tx feature, the hns driver modifies the
      maybe_stop_tx() and fill_desc() functions. If the modification happens
      during packet sending, the hardware and software pointers
      no longer match, and the port can not work anymore.
      
      This patch removes the modification of the maybe_stop_tx() and
      fill_desc() functions when setting the tx feature, and uses skb_is_gro()
      to determine which functions to use in the tx path.
      
      Fixes: 38f616da ("net:hns: Add support of ethtool TSO set option for Hip06 in HNS")
      Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: Jijie Shao <shaojijie@huawei.com>
      Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: atlantic: Fix NULL dereference of skb pointer in · cbe860be
      Daniil Maximov authored
      If is_ptp_ring == true in the loop of __aq_ring_xdp_clean function,
      then a timestamp is stored from a packet in a field of skb object,
      which is not allocated at the moment of the call (skb == NULL).
      
      Generalize aq_ptp_extract_ts and other affected functions so they don't
      work with struct sk_buff*, but with struct skb_shared_hwtstamps*.
      
      Found by Linux Verification Center (linuxtesting.org) with SVACE
      
      Fixes: 26efaef7 ("net: atlantic: Implement xdp data plane")
      Signed-off-by: Daniil Maximov <daniil31415it@gmail.com>
      Reviewed-by: Igor Russkikh <irusskikh@marvell.com>
      Link: https://lore.kernel.org/r/20231204085810.1681386-1-daniil31415it@gmail.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>