1. 06 Jun, 2024 7 commits
    • selftests/bpf: Use start_test in test_dctcp in bpf_tcp_ca · cd984b2e
      Geliang Tang authored
      The "if (sk_stg_map)" block in do_test() is only used by test_dctcp(),
      so it makes sense to move it from do_test() into test_dctcp(). After
      that, do_test() can be used by the tests other than test_dctcp().
      Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/9938916627b9810c877e5c03a621bc0ba5acf5c5.1717054461.git.tanggeliang@kylinos.cn
    • selftests/bpf: Use start_test in test_dctcp_fallback in bpf_tcp_ca · 224eeb55
      Geliang Tang authored
      The newly added helper start_test() can be used in test_dctcp_fallback()
      too, replacing start_server_str() and connect_to_fd_opts(). This way,
      two separate network_helper_opts, srv_opts and cli_opts, are used
      instead of the previously shared opts.
      Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/792ca3bb013fa06e618176da02d75e4f79a76733.1717054461.git.tanggeliang@kylinos.cn
    • selftests/bpf: Add start_test helper in bpf_tcp_ca · fee97d0c
      Geliang Tang authored
      To move the "if (sk_stg_map)" block out of do_test(), extract the
      code before this block into a new function, start_test(). It creates
      the server-side and client-side sockets and returns them to the caller.
      Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/48f2921ff9be958f5d3d28fe6bb7269a61cafa9f.1717054461.git.tanggeliang@kylinos.cn
    • selftests/bpf: Use connect_to_fd_opts in do_test in bpf_tcp_ca · 9abdfd8a
      Geliang Tang authored
      This patch replaces connect_fd_to_fd() and settcpca() in do_test() in
      prog_tests/bpf_tcp_ca.c with connect_to_fd_opts(), which accepts a struct
      network_helper_opts argument.
      
      Then define a dctcp dedicated post_socket_cb callback stg_post_socket_cb(),
      invoking both settcpca() and bpf_map_update_elem() in it, and set it in
      test_dctcp(). For passing map_fd into stg_post_socket_cb() callback, a new
      member map_fd is added in struct cb_opts.
      
      Add another "const struct network_helper_opts *cli_opts" to do_test() to
      separate it from the server "opts".
      Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/876ec90430865bc468e3b7f6fb2648420b075548.1717054461.git.tanggeliang@kylinos.cn
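The callback pattern described in this commit can be sketched roughly as follows. This is a simplified stand-in, not the selftest code: `demo_cb_opts`, `demo_post_socket_cb()` and `demo_socket_with_cb()` are hypothetical names, the real `struct cb_opts` also carries the `map_fd` mentioned above, and the `bpf_map_update_elem()` step is left out so the sketch builds without libbpf. It only assumes that setting the congestion control comes down to a `TCP_CONGESTION` setsockopt.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical, reduced options struct: only the congestion control name. */
struct demo_cb_opts {
	const char *cc;
};

/* Runs right after socket creation, before the socket is used. */
static int demo_post_socket_cb(int fd, void *opts)
{
	struct demo_cb_opts *cb = opts;

	return setsockopt(fd, IPPROTO_TCP, TCP_CONGESTION,
			  cb->cc, strlen(cb->cc));
}

/* Hypothetical helper: create a TCP socket, then invoke the callback. */
static int demo_socket_with_cb(int (*post_socket_cb)(int fd, void *opts),
			       void *cb_opts)
{
	int fd = socket(AF_INET, SOCK_STREAM, 0);

	if (fd < 0)
		return -1;
	if (post_socket_cb && post_socket_cb(fd, cb_opts)) {
		close(fd);
		return -1;
	}
	return fd;
}
```

Keeping the per-test setup in a callback lets the generic connect path stay unaware of which test is running; a dctcp-specific callback can do extra work with the same hook.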
    • libbpf: Auto-attach struct_ops BPF maps in BPF skeleton · 08ac454e
      Mykyta Yatsenko authored
      Similarly to `bpf_program`, support `bpf_map` automatic attachment in
      `bpf_object__attach_skeleton`. Currently only struct_ops maps can be
      attached.
      
      On bpftool side, code-generate links in skeleton struct for struct_ops maps.
      Similarly to `bpf_program_skeleton`, set links in `bpf_map_skeleton`.
      
      On libbpf side, extend `bpf_map` with new `autoattach` field to support
      enabling or disabling autoattach functionality, introducing
      getter/setter for this field.
      
      `bpf_object__(attach|detach)_skeleton` is extended with
      attaching/detaching struct_ops maps logic.
      Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20240605175135.117127-1-yatsenko@meta.com
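The attach rule described in this commit can be illustrated with a toy model. None of the code below is libbpf code: `toy_map`, `toy_attach_skeleton()` and `toy_detach_skeleton()` are hypothetical names, and the model only captures the rule that a map is attached when it is a struct_ops map and its autoattach flag is enabled. Skipping maps that already have a link is a further assumption of the sketch. In upstream libbpf the flag is, as far as I know, exposed through `bpf_map__set_autoattach()` and `bpf_map__autoattach()`.

```c
#include <stdbool.h>
#include <stddef.h>

enum toy_map_type { TOY_MAP_HASH, TOY_MAP_STRUCT_OPS };

struct toy_map {
	enum toy_map_type type;
	bool autoattach;	/* models the new per-map flag */
	bool linked;		/* models the skeleton's link slot being set */
};

/* Attach every eligible map and return how many were attached. */
static size_t toy_attach_skeleton(struct toy_map *maps, size_t n)
{
	size_t attached = 0;

	for (size_t i = 0; i < n; i++) {
		struct toy_map *m = &maps[i];

		/* Only struct_ops maps with autoattach enabled qualify. */
		if (m->type != TOY_MAP_STRUCT_OPS || !m->autoattach || m->linked)
			continue;
		m->linked = true;
		attached++;
	}
	return attached;
}

/* Drop all links created by toy_attach_skeleton(). */
static void toy_detach_skeleton(struct toy_map *maps, size_t n)
{
	for (size_t i = 0; i < n; i++)
		maps[i].linked = false;
}
```

A caller that wants to attach one struct_ops map by hand would clear its flag before the attach step, and the loop then leaves that map alone.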
    • selftests/bpf: Add btf_field_iter selftests · b24862ba
      Alan Maguire authored
      The added selftests verify that for every BTF kind we iterate correctly
      over constituent strings and ids.
      Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20240605153314.3727466-1-alan.maguire@oracle.com
    • selftests/bpf: Fix send_signal test with nested CONFIG_PARAVIRT · 7015843a
      Yonghong Song authored
      Alexei reported that the send_signal test may fail with nested CONFIG_PARAVIRT
      configs. In this particular case, the base VM is an AMD machine with 166 cpus;
      I ran the selftests with regular qemu on top of it and the send_signal test
      indeed failed. I also tried an Intel box with 80 cpus and saw no issue.
      
      The main qemu command line includes:
      
        -enable-kvm -smp 16 -cpu host
      
      The failure log looks like:
      
        $ ./test_progs -t send_signal
        [   48.501588] watchdog: BUG: soft lockup - CPU#9 stuck for 26s! [test_progs:2225]
        [   48.503622] Modules linked in: bpf_testmod(O)
        [   48.503622] CPU: 9 PID: 2225 Comm: test_progs Tainted: G           O       6.9.0-08561-g2c1713a8-dirty #69
        [   48.507629] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
        [   48.511635] RIP: 0010:handle_softirqs+0x71/0x290
        [   48.511635] Code: [...] 10 0a 00 00 00 31 c0 65 66 89 05 d5 f4 fa 7e fb bb ff ff ff ff <49> c7 c2 cb
        [   48.518527] RSP: 0018:ffffc90000310fa0 EFLAGS: 00000246
        [   48.519579] RAX: 0000000000000000 RBX: 00000000ffffffff RCX: 00000000000006e0
        [   48.522526] RDX: 0000000000000006 RSI: ffff88810791ae80 RDI: 0000000000000000
        [   48.523587] RBP: ffffc90000fabc88 R08: 00000005a0af4f7f R09: 0000000000000000
        [   48.525525] R10: 0000000561d2f29c R11: 0000000000006534 R12: 0000000000000280
        [   48.528525] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
        [   48.528525] FS:  00007f2f2885cd00(0000) GS:ffff888237c40000(0000) knlGS:0000000000000000
        [   48.531600] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        [   48.535520] CR2: 00007f2f287059f0 CR3: 0000000106a28002 CR4: 00000000003706f0
        [   48.537538] Call Trace:
        [   48.537538]  <IRQ>
        [   48.537538]  ? watchdog_timer_fn+0x1cd/0x250
        [   48.539590]  ? lockup_detector_update_enable+0x50/0x50
        [   48.539590]  ? __hrtimer_run_queues+0xff/0x280
        [   48.542520]  ? hrtimer_interrupt+0x103/0x230
        [   48.544524]  ? __sysvec_apic_timer_interrupt+0x4f/0x140
        [   48.545522]  ? sysvec_apic_timer_interrupt+0x3a/0x90
        [   48.547612]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
        [   48.547612]  ? handle_softirqs+0x71/0x290
        [   48.547612]  irq_exit_rcu+0x63/0x80
        [   48.551585]  sysvec_apic_timer_interrupt+0x75/0x90
        [   48.552521]  </IRQ>
        [   48.553529]  <TASK>
        [   48.553529]  asm_sysvec_apic_timer_interrupt+0x1a/0x20
        [   48.555609] RIP: 0010:finish_task_switch.isra.0+0x90/0x260
        [   48.556526] Code: [...] 9f 58 0a 00 00 48 85 db 0f 85 89 01 00 00 4c 89 ff e8 53 d9 bd 00 fb 66 90 <4d> 85 ed 74
        [   48.562524] RSP: 0018:ffffc90000fabd38 EFLAGS: 00000282
        [   48.563589] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff83385620
        [   48.563589] RDX: ffff888237c73ae4 RSI: 0000000000000000 RDI: ffff888237c6fd00
        [   48.568521] RBP: ffffc90000fabd68 R08: 0000000000000000 R09: 0000000000000000
        [   48.569528] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8881009d0000
        [   48.573525] R13: ffff8881024e5400 R14: ffff88810791ae80 R15: ffff888237c6fd00
        [   48.575614]  ? finish_task_switch.isra.0+0x8d/0x260
        [   48.576523]  __schedule+0x364/0xac0
        [   48.577535]  schedule+0x2e/0x110
        [   48.578555]  pipe_read+0x301/0x400
        [   48.579589]  ? destroy_sched_domains_rcu+0x30/0x30
        [   48.579589]  vfs_read+0x2b3/0x2f0
        [   48.579589]  ksys_read+0x8b/0xc0
        [   48.583590]  do_syscall_64+0x3d/0xc0
        [   48.583590]  entry_SYSCALL_64_after_hwframe+0x4b/0x53
        [   48.586525] RIP: 0033:0x7f2f28703fa1
        [   48.587592] Code: [...] 00 00 00 0f 1f 44 00 00 f3 0f 1e fa 80 3d c5 23 14 00 00 74 13 31 c0 0f 05 <48> 3d 00 f0
        [   48.593534] RSP: 002b:00007ffd90f8cf88 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
        [   48.595589] RAX: ffffffffffffffda RBX: 00007ffd90f8d5e8 RCX: 00007f2f28703fa1
        [   48.595589] RDX: 0000000000000001 RSI: 00007ffd90f8cfb0 RDI: 0000000000000006
        [   48.599592] RBP: 00007ffd90f8d2f0 R08: 0000000000000064 R09: 0000000000000000
        [   48.602527] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
        [   48.603589] R13: 00007ffd90f8d608 R14: 00007f2f288d8000 R15: 0000000000f6bdb0
        [   48.605527]  </TASK>
      
      In the test, two processes communicate through a pipe. Further debugging
      with strace showed that the above splat is triggered because the read()
      syscall could not receive the data even though the corresponding write()
      syscall in the other process had successfully written data into the pipe.
      
      The failed subtest is "send_signal_perf". The corresponding perf event has
      sample_period 1 and config PERF_COUNT_SW_CPU_CLOCK. sample_period 1 means every
      overflow event will trigger a call to the BPF program, so I suspect this may
      overwhelm the system. After I increased the sample_period to 100,000, the test
      passed. With sample_period 10,000 the test still failed.
      
      In other parts of the selftests, e.g., [1], sample_freq is used instead. So I
      decided to use sample_freq = 1,000, since the test passes with it as well.
      
        [1] https://lore.kernel.org/bpf/20240604070700.3032142-1-song@kernel.org/

      Reported-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20240605201203.2603846-1-yonghong.song@linux.dev
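The switch from a fixed period to a sampling frequency can be sketched as below. This is a minimal illustration, not the selftest's actual code: `init_cpu_clock_attr()` is a hypothetical helper, and the sketch only fills in `struct perf_event_attr` without opening the event. It relies on `sample_period` and `sample_freq` sharing a union in the UAPI header, with the `freq` bit selecting which one the kernel reads.

```c
#include <linux/perf_event.h>
#include <string.h>

/*
 * Hypothetical helper: describe a CPU-clock software event that is sampled
 * at roughly freq_hz samples per second instead of on every overflow.
 */
static void init_cpu_clock_attr(struct perf_event_attr *attr,
				unsigned long long freq_hz)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_SOFTWARE;
	attr->config = PERF_COUNT_SW_CPU_CLOCK;
	attr->freq = 1;			/* treat the union as sample_freq */
	attr->sample_freq = freq_hz;	/* e.g. 1000, as chosen above */
}
```

With `freq` set, the kernel adjusts the period itself to approach the requested rate, so the number of BPF program invocations per second stays bounded regardless of how fast the clock event overflows.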
  2. 05 Jun, 2024 5 commits
  3. 04 Jun, 2024 12 commits
  4. 03 Jun, 2024 14 commits
  5. 01 Jun, 2024 1 commit
  6. 30 May, 2024 1 commit
    • Merge branch 'Notify user space when a struct_ops object is detached/unregistered' · 3f8fde31
      Martin KaFai Lau authored
      Kui-Feng Lee says:
      
      ====================
      The subsystems managing struct_ops objects may need to detach a
      struct_ops object due to errors or other reasons. It would be useful
      to notify user space programs so that error recovery or logging can be
      carried out.
      
      This patch set enables the detach feature for struct_ops links and
      sends an event to epoll when a link is detached.  Subsystems could call
      link->ops->detach() to detach a link and notify user space programs
      through epoll.
      
      The signatures of callback functions in "struct bpf_struct_ops" have
      been changed as well to pass an extra link argument to
      subsystems. Subsystems could detach the links received from reg() and
      update() callbacks if there are any. This also provides a way for
      subsystems to distinguish registrations of an object that has been
      registered multiple times for several links.
      
      However, bpf struct_ops maps without BPF_F_LINK have no link.
      Subsystems will receive a NULL link pointer in this case.
      ---
      Changes from v6:
      
       - Fix the missing header at patch 5.
      
       - Move RCU_INIT_POINTER() back to its original position.
      
      Changes from v5:
      
       - Change the commit title of the patch for bpftool.
      
      Changes from v4:
      
       - Change error code for bpf_struct_ops_map_link_update()
      
       - Always return 0 for bpf_struct_ops_map_link_detach()
      
       - Hold update_mutex in bpf_struct_ops_link_create()
      
       - Add a separate instance of file_operations for links supporting
          poll.
      
       - Fix bpftool for bpf_link_fops_poll.
      
      Changes from v3:
      
       - Add a comment to explain why holding update_mutex is not necessary
          in bpf_struct_ops_link_create()
      
       - Use rcu_access_pointer() in bpf_struct_ops_map_link_poll().
      
      Changes from v2:
      
       - Rephrased commit logs and comments.
      
       - Addressed some mistakes from patch splitting.
      
       - Replace mutex with spinlock in bpf_testmod.c to address lockdep
          splat and simplify the implementation.
      
       - Fix an argument passing to rcu_dereference_protected().
      
      Changes from v1:
      
       - Pass a link to reg, unreg, and update callbacks.
      
       - Provide a function to detach a link from underlying subsystems.
      
       - Add a kfunc to mimic detachments from subsystems, and provide a
          flexible way to control when to do detachments.
      
       - Add two tests to detach a link from the subsystem after the refcount
          of the link drops to zero.
      
      v6: https://lore.kernel.org/bpf/20240524223036.318800-1-thinker.li@gmail.com/
      v5: https://lore.kernel.org/all/20240523230848.2022072-1-thinker.li@gmail.com/
      v4: https://lore.kernel.org/all/20240521225121.770930-1-thinker.li@gmail.com/
      v3: https://lore.kernel.org/all/20240510002942.1253354-1-thinker.li@gmail.com/
      v2: https://lore.kernel.org/all/20240507055600.2382627-1-thinker.li@gmail.com/
      v1: https://lore.kernel.org/all/20240429213609.487820-1-thinker.li@gmail.com/
      ====================
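From user space, waiting for such a detach notification could look roughly like the sketch below. The cover letter only says that an event is sent to epoll; that the event shows up as `EPOLLHUP` on the link file descriptor is an assumption here, and `wait_for_hup()` is a hypothetical helper that works on any pollable file descriptor.

```c
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait up to timeout_ms for fd to report EPOLLHUP.
 * Returns 1 if the hang-up was seen, 0 on timeout or another event,
 * and -1 on error.
 */
static int wait_for_hup(int fd, int timeout_ms)
{
	struct epoll_event ev, out;
	int epfd, n, ret;

	epfd = epoll_create1(0);
	if (epfd < 0)
		return -1;

	ev.events = EPOLLHUP;	/* always reported; listed for clarity */
	ev.data.fd = fd;
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
		close(epfd);
		return -1;
	}

	n = epoll_wait(epfd, &out, 1, timeout_ms);
	if (n < 0)
		ret = -1;
	else if (n == 0)
		ret = 0;
	else
		ret = (out.events & EPOLLHUP) ? 1 : 0;

	close(epfd);
	return ret;
}
```

A program holding a struct_ops link fd could call this in its event loop and start error recovery or logging once it returns 1. The same helper can be tried on the read end of a pipe, which reports a hang-up once the write end is closed.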
      Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>