1. 12 Jul, 2023 16 commits
    • bpf: Let free_all() return the number of freed elements. · 9de3e815
      Alexei Starovoitov authored
      Let free_all() helper return the number of freed elements.
      It's not used in this patch, but helps in debug/development of bpf_mem_alloc.
      
      For example this diff for __free_rcu():
      -       free_all(llist_del_all(&c->waiting_for_gp_ttrace), !!c->percpu_size);
      +       printk("cpu %d freed %d objs after tasks trace\n", raw_smp_processor_id(),
      +       	free_all(llist_del_all(&c->waiting_for_gp_ttrace), !!c->percpu_size));
      
      would show how busy RCU tasks trace is.
      In an artificial benchmark where one CPU is allocating and a different CPU
      is freeing, RCU tasks trace won't be able to keep up, and the list of
      objects would keep growing from thousands to millions, eventually causing an OOM.
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Hou Tao <houtao1@huawei.com>
      Link: https://lore.kernel.org/bpf/20230706033447.54696-4-alexei.starovoitov@gmail.com
      9de3e815
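The idea of a counting free_all() can be sketched in userspace as follows. This is only an illustration, not the kernel code: `struct node` and `push_node()` are hypothetical stand-ins for the kernel's lock-less list, and the per-cpu handling of the real helper is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct llist_node. */
struct node {
	struct node *next;
};

/* Push a freshly allocated node onto the list; returns the new head. */
static struct node *push_node(struct node *head)
{
	struct node *n = malloc(sizeof(*n));

	assert(n != NULL);
	n->next = head;
	return n;
}

/* Free every node on the list and report how many were freed. */
static int free_all(struct node *head)
{
	int cnt = 0;

	while (head) {
		struct node *next = head->next;	/* save before freeing */

		free(head);
		head = next;
		cnt++;
	}
	return cnt;
}
```

A caller that ignores the return value behaves exactly as before, which is why returning the count costs nothing when it is not used for debugging.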
    • bpf: Rename few bpf_mem_alloc fields. · 12c8d0f4
      Alexei Starovoitov authored
      Rename:
      -       struct rcu_head rcu;
      -       struct llist_head free_by_rcu;
      -       struct llist_head waiting_for_gp;
      -       atomic_t call_rcu_in_progress;
      +       struct llist_head free_by_rcu_ttrace;
      +       struct llist_head waiting_for_gp_ttrace;
      +       struct rcu_head rcu_ttrace;
      +       atomic_t call_rcu_ttrace_in_progress;
      ...
      -	static void do_call_rcu(struct bpf_mem_cache *c)
      +	static void do_call_rcu_ttrace(struct bpf_mem_cache *c)
      
      to better indicate intended use.
      
      The 'tasks trace' is shortened to 'ttrace' to reduce verbosity.
      No functional changes.
      
      Later patches will add free_by_rcu/waiting_for_gp fields to be used with normal RCU.
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Hou Tao <houtao1@huawei.com>
      Link: https://lore.kernel.org/bpf/20230706033447.54696-2-alexei.starovoitov@gmail.com
      12c8d0f4
    • selftests/bpf: extend existing map resize tests for per-cpu use case · c21de5fc
      Andrii Nakryiko authored
      Add a per-cpu array resizing use case and demonstrate how
      bpf_get_smp_processor_id() can be used to directly access proper data
      with no extra checks.
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/r/20230711232400.1658562-2-andrii@kernel.org
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      c21de5fc
    • bpf: teach verifier actual bounds of bpf_get_smp_processor_id() result · f42bcd16
      Andrii Nakryiko authored
      The bpf_get_smp_processor_id() helper returns the CPU on which the BPF
      program is currently running. It can't return a value larger than the
      maximum allowed number of CPUs (minus one, due to zero indexing). Teach the
      BPF verifier to recognize that. This makes it possible to use the
      bpf_get_smp_processor_id() result to index into arrays without extra
      checks, as demonstrated in the subsequent selftests/bpf patch.
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/r/20230711232400.1658562-1-andrii@kernel.org
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      f42bcd16
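The reasoning behind the bounds change can be modeled in plain C. The sketch below is a toy model, not verifier code: `struct val_range`, `cpu_id_range()` and `index_is_safe()` are hypothetical names, and the real verifier tracks considerably more state than a single unsigned range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the unsigned range a verifier tracks for a value. */
struct val_range {
	uint64_t umin;
	uint64_t umax;
};

/* Range for a CPU id when at most nr_cpus CPUs exist: [0, nr_cpus - 1]. */
static struct val_range cpu_id_range(uint64_t nr_cpus)
{
	struct val_range r;

	assert(nr_cpus > 0);
	r.umin = 0;
	r.umax = nr_cpus - 1;
	return r;
}

/* An index is provably safe only if its largest possible value is in bounds. */
static bool index_is_safe(struct val_range idx, uint64_t nr_elems)
{
	return idx.umax < nr_elems;
}
```

With an unknown range such as [0, UINT64_MAX] the access is rejected unless the program adds its own check; once the range is known to be [0, nr_cpus - 1], an array with nr_cpus elements can be indexed directly.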
    • Merge branch 'bpf: Support ->fill_link_info for kprobe_multi and perf_event links' · 87e098e6
      Alexei Starovoitov authored
      Yafang Shao says:
      
      ====================
      This patchset enhances the usability of kprobe_multi program by introducing
      support for ->fill_link_info. This allows users to easily determine the
      probed functions associated with a kprobe_multi program. While
      `bpftool perf show` already provides information about functions probed by
      perf_event programs, supporting ->fill_link_info ensures consistent access
      to this information across all bpf links.
      
      In addition, this patchset extends support to generic perf events, which are
      currently not covered by `bpftool perf show`. Only the perf type and config
      are exposed to userspace; other attributes such as sample_period and
      sample_freq are not reported.
      
      To ensure accurate identification of probed functions, it is preferable to
      expose the address directly rather than relying solely on the symbol name.
      However, this implementation respects the kptr_restrict setting and avoids
      exposing the address if it is not permitted.
      
      v6->v7:
      - From Daniel
        - No need to explicitly cast in many places
        - Use ptr_to_u64() instead of the cast
        - return -ENOMEM when calloc fails
        - Simplify the code in bpf_get_kprobe_info() further
        - Squash #9 with #8
        - And other coding style improvement
      - From Andrii
        - Comment improvement
        - Use ENOSPC instead of E2BIG
        - Use strlen only when buf is not NULL
      - Clear probe_addr in bpf_get_uprobe_info()
      
      v5->v6:
      - From Andrii
        - if ucount is too small, copy ucount items and return -E2BIG
        - zero out kmulti_link->cnt elements if it is not permitted by kptr
        - avoid leaking information when ucount is greater than kmulti_link->cnt
        - drop the flags, and add BPF_PERF_EVENT_[UK]RETPROBE
      - From Quentin
        - use jsonw_null instead when we have no module name
        - add explanation on perf_type_name in the commit log
        - avoid the unnecessary out label
      
      v4->v5:
      - Print "func [module]" in the kprobe_multi header (Andrii)
      - Remove MAX_BPF_PERF_EVENT_TYPE (Alexei)
      - Add padding field for future reuse (Yonghong)
      
      v3->v4:
      - From Quentin
        - Rename MODULE_NAME_LEN to MODULE_MAX_NAME
        - Convert retprobe to boolean for json output
        - Trim the square brackets around module names for json output
        - Move perf names into link.c
        - Use a generic helper to get perf names
        - Show address before func name, for consistency
        - Use switch-case instead of if-else
        - Increase the buff len to PATH_MAX
        - Move macros to the top of the file
      - From Andrii
        - kprobe_multi flags should always be returned
        - Keep it single line if it fits in under 100 characters
        - Change the output format when showing kprobe_multi
        - Improve the format of perf_event names
        - Rename struct perf_link to struct perf_event, and change the names of
          the enum consequently
      - From Yonghong
        - Avoid disallowing extensions for all structs in the big union
      - From Jiri
        - Add flags to bpf_kprobe_multi_link
        - Report kprobe_multi selftests errors
        - Rename bpf_perf_link_fill_name and make it a separate patch
        - Avoid breaking compilation when CONFIG_KPROBE_EVENTS or
          CONFIG_UPROBE_EVENTS options are not defined
      
      v2->v3:
      - Expose flags instead of retprobe (Andrii)
      - Simplify the check on kmulti_link->cnt (Andrii)
      - Use kallsyms_show_value() instead (Andrii)
      - Show also the module name for kprobe_multi (Andrii)
      - Add new enum bpf_perf_link_type (Andrii)
      - Move perf event names into bpftool (Andrii, Quentin, Jiri)
      - Keep perf event names in sync with perf tools (Jiri)
      
      v1->v2:
      - Fix sparse warning (Stanislav, lkp@intel.com)
      - Fix BPF CI build error
      - Reuse kernel_syms_load() (Alexei)
      - Print 'name' instead of 'func' (Alexei)
      - Show whether the probe is retprobe or not (Andrii)
      - Add comment for the meaning of perf_event name (Andrii)
      - Add support for generic perf event
      - Adhere to the kptr_restrict setting
      
      RFC->v1:
      - Use a single copy_to_user() instead (Jiri)
      - Show also the symbol name in bpftool (Quentin, Alexei)
      - Use calloc() instead of malloc() in bpftool (Quentin)
      - Avoid having conditional entries in the JSON output (Quentin)
      - Drop ->show_fdinfo (Alexei)
      - Use __u64 instead of __aligned_u64 for the field addr (Alexei)
      - Avoid the contradiction in perf_event name length (Alexei)
      - Address a build warning reported by kernel test robot <lkp@intel.com>
      ====================
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      87e098e6
    • bpftool: Show perf link info · 88d61607
      Yafang Shao authored
      Enhance bpftool to display comprehensive information about exposed
      perf_event links, covering uprobe, kprobe, tracepoint, and generic perf
      event. The resulting output will include the following details:
      
      $ tools/bpf/bpftool/bpftool link show
      3: perf_event  prog 14
              event software:cpu-clock
              bpf_cookie 0
              pids perf_event(19483)
      4: perf_event  prog 14
              event hw-cache:LLC-load-misses
              bpf_cookie 0
              pids perf_event(19483)
      5: perf_event  prog 14
              event hardware:cpu-cycles
              bpf_cookie 0
              pids perf_event(19483)
      6: perf_event  prog 19
              tracepoint sched_switch
              bpf_cookie 0
              pids tracepoint(20947)
      7: perf_event  prog 26
              uprobe /home/dev/waken/bpf/uprobe/a.out+0x1338
              bpf_cookie 0
              pids uprobe(21973)
      8: perf_event  prog 27
              uretprobe /home/dev/waken/bpf/uprobe/a.out+0x1338
              bpf_cookie 0
              pids uprobe(21973)
      10: perf_event  prog 43
              kprobe ffffffffb70a9660 kernel_clone
              bpf_cookie 0
              pids kprobe(35275)
      11: perf_event  prog 41
              kretprobe ffffffffb70a9660 kernel_clone
              bpf_cookie 0
              pids kprobe(35275)
      
      $ tools/bpf/bpftool/bpftool link show -j
      [{"id":3,"type":"perf_event","prog_id":14,"event_type":"software","event_config":"cpu-clock","bpf_cookie":0,"pids":[{"pid":19483,"comm":"perf_event"}]},{"id":4,"type":"perf_event","prog_id":14,"event_type":"hw-cache","event_config":"LLC-load-misses","bpf_cookie":0,"pids":[{"pid":19483,"comm":"perf_event"}]},{"id":5,"type":"perf_event","prog_id":14,"event_type":"hardware","event_config":"cpu-cycles","bpf_cookie":0,"pids":[{"pid":19483,"comm":"perf_event"}]},{"id":6,"type":"perf_event","prog_id":19,"tracepoint":"sched_switch","bpf_cookie":0,"pids":[{"pid":20947,"comm":"tracepoint"}]},{"id":7,"type":"perf_event","prog_id":26,"retprobe":false,"file":"/home/dev/waken/bpf/uprobe/a.out","offset":4920,"bpf_cookie":0,"pids":[{"pid":21973,"comm":"uprobe"}]},{"id":8,"type":"perf_event","prog_id":27,"retprobe":true,"file":"/home/dev/waken/bpf/uprobe/a.out","offset":4920,"bpf_cookie":0,"pids":[{"pid":21973,"comm":"uprobe"}]},{"id":10,"type":"perf_event","prog_id":43,"retprobe":false,"addr":18446744072485508704,"func":"kernel_clone","offset":0,"bpf_cookie":0,"pids":[{"pid":35275,"comm":"kprobe"}]},{"id":11,"type":"perf_event","prog_id":41,"retprobe":true,"addr":18446744072485508704,"func":"kernel_clone","offset":0,"bpf_cookie":0,"pids":[{"pid":35275,"comm":"kprobe"}]}]
      
      For generic perf events, the displayed information in bpftool is limited to
      the type and configuration, while other attributes such as sample_period,
      sample_freq, etc., are not included.
      
      The kernel function address won't be exposed if kptr_restrict does not
      permit it. The result is as follows when kptr_restrict is 2:
      
      $ tools/bpf/bpftool/bpftool link show
      3: perf_event  prog 14
              event software:cpu-clock
      4: perf_event  prog 14
              event hw-cache:LLC-load-misses
      5: perf_event  prog 14
              event hardware:cpu-cycles
      6: perf_event  prog 19
              tracepoint sched_switch
      7: perf_event  prog 26
              uprobe /home/dev/waken/bpf/uprobe/a.out+0x1338
      8: perf_event  prog 27
              uretprobe /home/dev/waken/bpf/uprobe/a.out+0x1338
      10: perf_event  prog 43
              kprobe kernel_clone
      11: perf_event  prog 41
              kretprobe kernel_clone
      Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
      Reviewed-by: Quentin Monnet <quentin@isovalent.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: https://lore.kernel.org/r/20230709025630.3735-11-laoar.shao@gmail.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      88d61607
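The plain and JSON outputs above show the same values in different bases: the JSON form prints addresses and offsets in decimal, while the plain form prints them in hex. A small sketch of the conversion follows; `format_kaddr()` and `format_offset()` are hypothetical helper names chosen for this example, not bpftool functions.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Render a 64-bit kernel address as 16 lowercase hex digits. */
static void format_kaddr(uint64_t addr, char *buf, size_t len)
{
	assert(buf != NULL && len >= 17);
	snprintf(buf, len, "%016" PRIx64, addr);
}

/* Render a uprobe file offset as 0x-prefixed hex. */
static void format_offset(uint64_t off, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	snprintf(buf, len, "0x%" PRIx64, off);
}
```

Feeding the JSON value 18446744072485508704 to `format_kaddr()` gives ffffffffb70a9660, the address shown for kernel_clone, and the JSON offset 4920 becomes 0x1338.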
    • bpftool: Add perf event names · 62b57e3d
      Yafang Shao authored
      Add new functions and macros to get perf event names. These names, except
      for perf_type_name, are all copied from
      tools/perf/util/{parse-events,evsel}.c, so that in the future we will
      have a good chance to use the same code.
      Suggested-by: Jiri Olsa <olsajiri@gmail.com>
      Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
      Reviewed-by: Quentin Monnet <quentin@isovalent.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: https://lore.kernel.org/r/20230709025630.3735-10-laoar.shao@gmail.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      62b57e3d
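A name table of this kind might look like the sketch below. It is not the bpftool code: the `sketch_` identifiers are made up for this example, the numeric ids are assumed to mirror the perf type ids in the perf_event UAPI header, and the strings are the ones visible in the `bpftool link show` output (software, hw-cache, hardware).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Local mirror of the perf type ids; values assumed to match perf_event.h. */
enum sketch_perf_type {
	SKETCH_PERF_TYPE_HARDWARE = 0,
	SKETCH_PERF_TYPE_SOFTWARE = 1,
	SKETCH_PERF_TYPE_TRACEPOINT = 2,
	SKETCH_PERF_TYPE_HW_CACHE = 3,
	SKETCH_PERF_TYPE_MAX
};

static const char *const sketch_type_name[SKETCH_PERF_TYPE_MAX] = {
	[SKETCH_PERF_TYPE_HARDWARE] = "hardware",
	[SKETCH_PERF_TYPE_SOFTWARE] = "software",
	[SKETCH_PERF_TYPE_TRACEPOINT] = "tracepoint",
	[SKETCH_PERF_TYPE_HW_CACHE] = "hw-cache",
};

/* Return the type name, or NULL when the id is outside the table. */
static const char *perf_type_to_name(unsigned int type)
{
	if (type >= SKETCH_PERF_TYPE_MAX)
		return NULL;
	return sketch_type_name[type];
}

/* Build "type:config"; returns 0 on success, -1 for an unknown type. */
static int format_event(unsigned int type, const char *config,
			char *buf, size_t len)
{
	const char *name = perf_type_to_name(type);

	assert(config != NULL && buf != NULL);
	if (!name)
		return -1;
	snprintf(buf, len, "%s:%s", name, config);
	return 0;
}
```

Returning NULL for an unknown id lets the caller decide how to print it, for example by falling back to the raw number.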
    • bpf: Support ->fill_link_info for perf_event · 1b715e1b
      Yafang Shao authored
      By introducing support for ->fill_link_info to the perf_event link, users
      gain the ability to inspect it using `bpftool link show`. While the current
      approach involves accessing this information via `bpftool perf show`,
      consolidating link information for all link types in one place offers
      greater convenience. Additionally, this patch extends support to the
      generic perf event, which is not currently accommodated by
      `bpftool perf show`. While only the perf type and config are exposed to
      userspace, other attributes such as sample_period and sample_freq are
      ignored. Note that if the kptr_restrict setting does not permit it, the
      probed address will not be exposed.
      
      A new enum bpf_perf_event_type is introduced to help the user understand
      which struct is relevant.
      Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: https://lore.kernel.org/r/20230709025630.3735-9-laoar.shao@gmail.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      1b715e1b
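The role of a type enum that tells the user which struct is relevant can be illustrated with a tagged union. The layout below is a simplified sketch: the `sketch_` names and fields are assumptions made for this example and do not reproduce the UAPI struct in struct bpf_link_info.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical tags; the kernel's enum bpf_perf_event_type has its own names. */
enum sketch_event_type {
	SKETCH_UPROBE,
	SKETCH_KPROBE,
	SKETCH_TRACEPOINT,
};

/* The tag says which union member carries valid data. */
struct sketch_perf_info {
	enum sketch_event_type type;
	union {
		struct { const char *file; uint64_t offset; } uprobe;
		struct { const char *func; uint64_t addr; } kprobe;
		struct { const char *name; } tracepoint;
	} u;
};

/* Render one line of text for the member selected by the tag. */
static int describe(const struct sketch_perf_info *info, char *buf, size_t len)
{
	assert(info != NULL && buf != NULL);
	switch (info->type) {
	case SKETCH_UPROBE:
		snprintf(buf, len, "uprobe %s+0x%" PRIx64,
			 info->u.uprobe.file, info->u.uprobe.offset);
		return 0;
	case SKETCH_KPROBE:
		snprintf(buf, len, "kprobe %016" PRIx64 " %s",
			 info->u.kprobe.addr, info->u.kprobe.func);
		return 0;
	case SKETCH_TRACEPOINT:
		snprintf(buf, len, "tracepoint %s", info->u.tracepoint.name);
		return 0;
	}
	return -1;	/* unknown tag */
}
```

A consumer switches on the tag first and only then reads the matching member, so fields belonging to other event kinds are never interpreted by mistake.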
    • bpf: Add a common helper bpf_copy_to_user() · 57d48537
      Yafang Shao authored
      Add a common helper bpf_copy_to_user(), which will be used at multiple
      places.
      No functional change.
      Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Acked-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/r/20230709025630.3735-8-laoar.shao@gmail.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      57d48537
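The commit message does not show the helper's body. As a rough userspace analogue only, a bounded string copy that truncates and reports the shortfall could look like this. The function name `copy_str_bounded()` and the exact policy (truncate, keep the NUL terminator, return -ENOSPC) are assumptions for this sketch, loosely based on the "Use ENOSPC instead of E2BIG" note in the cover letter.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Copy the NUL-terminated string src (length len, excluding the NUL) into
 * dst, which holds dst_len bytes. A NULL dst means the caller did not ask
 * for the string.
 */
static int copy_str_bounded(char *dst, size_t dst_len,
			    const char *src, size_t len)
{
	assert(src != NULL);
	if (!dst)
		return 0;
	if (len < dst_len) {
		memcpy(dst, src, len + 1);	/* fits, including the NUL */
		return 0;
	}
	if (dst_len == 0)
		return -EINVAL;
	memcpy(dst, src, dst_len - 1);		/* truncate ... */
	dst[dst_len - 1] = '\0';		/* ... but stay NUL-terminated */
	return -ENOSPC;
}
```

Centralizing this logic in one helper keeps every caller's truncation behavior and error code identical.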
    • bpf: Expose symbol's respective address · cd3910d0
      Yafang Shao authored
      Since different symbols can share the same name, it is insufficient to only
      expose the symbol name. It is essential to also expose the symbol address
      so that users can accurately identify which one is being probed.
      Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: https://lore.kernel.org/r/20230709025630.3735-7-laoar.shao@gmail.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      cd3910d0
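Why a name alone is ambiguous can be shown with a toy symbol table. Everything in this sketch is illustrative: `struct sym` and both lookup functions are hypothetical, and any table passed to them in an example holds made-up names and addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical symbol table entry. */
struct sym {
	uint64_t addr;
	const char *name;
};

/* Count table entries whose name matches; more than one means ambiguity. */
static size_t count_by_name(const struct sym *tab, size_t n, const char *name)
{
	size_t hits = 0;

	assert(tab != NULL && name != NULL);
	for (size_t i = 0; i < n; i++)
		if (strcmp(tab[i].name, name) == 0)
			hits++;
	return hits;
}

/* Find the entry with exactly this address, or NULL if there is none. */
static const struct sym *find_by_addr(const struct sym *tab, size_t n,
				      uint64_t addr)
{
	assert(tab != NULL);
	for (size_t i = 0; i < n; i++)
		if (tab[i].addr == addr)
			return &tab[i];
	return NULL;
}
```

When two entries share a name, `count_by_name()` returns 2 and the name cannot identify the probe target, whereas `find_by_addr()` always resolves to at most one entry.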
    • bpf: Clear the probe_addr for uprobe · 5125e757
      Yafang Shao authored
      To avoid returning uninitialized or random values when querying the file
      descriptor (fd) and accessing probe_addr, it is necessary to clear the
      variable prior to its use.
      
      Fixes: 41bdc4b4 ("bpf: introduce bpf subcommand BPF_TASK_FD_QUERY")
      Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
      Acked-by: Yonghong Song <yhs@fb.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: https://lore.kernel.org/r/20230709025630.3735-6-laoar.shao@gmail.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      5125e757
    • bpf: Protect probed address based on kptr_restrict setting · f1a41453
      Yafang Shao authored
      The probed address can be accessed by userspace through querying the task
      file descriptor (fd). However, it is crucial to adhere to the kptr_restrict
      setting and refrain from exposing the address if it is not permitted.
      Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: https://lore.kernel.org/r/20230709025630.3735-5-laoar.shao@gmail.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      f1a41453
    • bpftool: Show kprobe_multi link info · edd7f49b
      Yafang Shao authored
      Show the already exposed kprobe_multi link info in bpftool. The result is as
      follows:
      
      $ tools/bpf/bpftool/bpftool link show
      91: kprobe_multi  prog 244
              kprobe.multi  func_cnt 7
              addr             func [module]
              ffffffff98c44f20 schedule_timeout_interruptible
              ffffffff98c44f60 schedule_timeout_killable
              ffffffff98c44fa0 schedule_timeout_uninterruptible
              ffffffff98c44fe0 schedule_timeout_idle
              ffffffffc075b8d0 xfs_trans_get_efd [xfs]
              ffffffffc0768a10 xfs_trans_get_buf_map [xfs]
              ffffffffc076c320 xfs_trans_get_dqtrx [xfs]
              pids kprobe_multi(188367)
      92: kprobe_multi  prog 244
              kretprobe.multi  func_cnt 7
              addr             func [module]
              ffffffff98c44f20 schedule_timeout_interruptible
              ffffffff98c44f60 schedule_timeout_killable
              ffffffff98c44fa0 schedule_timeout_uninterruptible
              ffffffff98c44fe0 schedule_timeout_idle
              ffffffffc075b8d0 xfs_trans_get_efd [xfs]
              ffffffffc0768a10 xfs_trans_get_buf_map [xfs]
              ffffffffc076c320 xfs_trans_get_dqtrx [xfs]
              pids kprobe_multi(188367)
      
      $ tools/bpf/bpftool/bpftool link show -j
      [{"id":91,"type":"kprobe_multi","prog_id":244,"retprobe":false,"func_cnt":7,"funcs":[{"addr":18446744071977586464,"func":"schedule_timeout_interruptible","module":null},{"addr":18446744071977586528,"func":"schedule_timeout_killable","module":null},{"addr":18446744071977586592,"func":"schedule_timeout_uninterruptible","module":null},{"addr":18446744071977586656,"func":"schedule_timeout_idle","module":null},{"addr":18446744072643524816,"func":"xfs_trans_get_efd","module":"xfs"},{"addr":18446744072643578384,"func":"xfs_trans_get_buf_map","module":"xfs"},{"addr":18446744072643592992,"func":"xfs_trans_get_dqtrx","module":"xfs"}],"pids":[{"pid":188367,"comm":"kprobe_multi"}]},{"id":92,"type":"kprobe_multi","prog_id":244,"retprobe":true,"func_cnt":7,"funcs":[{"addr":18446744071977586464,"func":"schedule_timeout_interruptible","module":null},{"addr":18446744071977586528,"func":"schedule_timeout_killable","module":null},{"addr":18446744071977586592,"func":"schedule_timeout_uninterruptible","module":null},{"addr":18446744071977586656,"func":"schedule_timeout_idle","module":null},{"addr":18446744072643524816,"func":"xfs_trans_get_efd","module":"xfs"},{"addr":18446744072643578384,"func":"xfs_trans_get_buf_map","module":"xfs"},{"addr":18446744072643592992,"func":"xfs_trans_get_dqtrx","module":"xfs"}],"pids":[{"pid":188367,"comm":"kprobe_multi"}]}]
      
      When kptr_restrict is 2, the result is:
      
      $ tools/bpf/bpftool/bpftool link show
      91: kprobe_multi  prog 244
              kprobe.multi  func_cnt 7
      92: kprobe_multi  prog 244
              kretprobe.multi  func_cnt 7
      Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
      Reviewed-by: Quentin Monnet <quentin@isovalent.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: https://lore.kernel.org/r/20230709025630.3735-4-laoar.shao@gmail.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      edd7f49b
    • bpftool: Dump the kernel symbol's module name · dc651944
      Yafang Shao authored
      If the kernel symbol is in a module, we will dump the module name as
      well. The square brackets around the module name are trimmed.
      Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
      Reviewed-by: Quentin Monnet <quentin@isovalent.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: https://lore.kernel.org/r/20230709025630.3735-3-laoar.shao@gmail.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      dc651944
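Trimming the square brackets from a module name such as "[xfs]" can be sketched as below. `trim_module_name()` is a hypothetical helper written for this example; it is not the bpftool implementation.

```c
#include <assert.h>
#include <string.h>

/*
 * Copy a module name into out without surrounding square brackets,
 * e.g. "[xfs]" becomes "xfs". Returns 0 on success, -1 if out is too small.
 */
static int trim_module_name(const char *in, char *out, size_t out_len)
{
	size_t len;

	assert(in != NULL && out != NULL);
	len = strlen(in);
	if (len >= 2 && in[0] == '[' && in[len - 1] == ']') {
		in++;		/* skip '[' */
		len -= 2;	/* drop both brackets */
	}
	if (len >= out_len)
		return -1;
	memcpy(out, in, len);
	out[len] = '\0';
	return 0;
}
```

A name that arrives without brackets is copied unchanged, so the helper is safe to call on either form.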
    • bpf: Support ->fill_link_info for kprobe_multi · 7ac8d0d2
      Yafang Shao authored
      With the addition of support for fill_link_info to the kprobe_multi link,
      users will gain the ability to inspect it conveniently using the
      `bpftool link show`. This enhancement provides valuable information to the
      user, including the count of probed functions and their respective
      addresses. Note that if the kptr_restrict setting does not permit it,
      the probed addresses will not be exposed.
      Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Acked-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/r/20230709025630.3735-2-laoar.shao@gmail.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      7ac8d0d2
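The behavior described in the cover letter's changelog (always report the count, copy at most what fits, zero the entries when addresses may not be shown, and signal a short buffer) can be approximated in userspace as follows. This is a sketch under those assumptions, not the kernel implementation: `fill_addrs()` and its parameters are hypothetical, and the `may_show` flag stands in for the kernel's kptr_restrict check.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Fill the caller's buffer uaddrs (room for ucount entries) from a link's
 * cnt probed addresses. On return *out_cnt holds the total count.
 */
static int fill_addrs(uint64_t *uaddrs, size_t ucount,
		      const uint64_t *addrs, size_t cnt,
		      bool may_show, size_t *out_cnt)
{
	size_t n = ucount < cnt ? ucount : cnt;	/* never write past either side */

	assert(addrs != NULL && out_cnt != NULL);
	*out_cnt = cnt;
	if (!uaddrs || ucount == 0)
		return 0;			/* caller only wanted the count */
	if (may_show)
		memcpy(uaddrs, addrs, n * sizeof(*uaddrs));
	else
		memset(uaddrs, 0, n * sizeof(*uaddrs));	/* hide the addresses */
	return ucount < cnt ? -ENOSPC : 0;
}
```

Reporting the total count even when addresses are hidden matches the kptr_restrict output above, where func_cnt is still printed but no addresses follow.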
  2. 11 Jul, 2023 5 commits
  3. 10 Jul, 2023 4 commits
  4. 09 Jul, 2023 1 commit
  5. 07 Jul, 2023 1 commit
  6. 06 Jul, 2023 10 commits
  7. 05 Jul, 2023 3 commits