    bpftool: Add struct_ops support · 65c93628
    Martin KaFai Lau authored
    This patch adds struct_ops support to the bpftool.
    
    To recap the recent bpf_struct_ops feature on the kernel side:
    At a high level, a bpf_struct_ops is a struct_ops map populated
    with a number of bpf progs.  It currently supports implementing
    "struct tcp_congestion_ops" in bpf.  However, the bpf_struct_ops
    design is generic enough that other kernel struct ops can be
    supported in the future.
    
    Although struct_ops is map+progs at a high level, there are differences
    in details.  For example,
    1) After registering a struct_ops, the struct_ops is held by the kernel
       subsystem (e.g. tcp-cc).  Thus, there is no need to pin a
       struct_ops map or its progs in order to keep them around.
    2) To iterate all struct_ops in a system, bpftool iterates all maps
       of type BPF_MAP_TYPE_STRUCT_OPS.  BPF_MAP_TYPE_STRUCT_OPS is
       currently the usual filter.  In the future, it may need to
       filter by other struct_ops specific properties, e.g. by
       tcp_congestion_ops or other kernel subsystem ops.
    3) struct_ops requires the running kernel to have BTF info.  That allows
       more flexibility in handling other kernel structs.  e.g. it can
       always dump the latest bpf_map_info.
    4) Also, the "struct_ops" command is not intended to repeat all features
       already provided by "map" or "prog".  For example, if there really
       is a need to pin the struct_ops map, the user can use the "map" cmd
       to do that.
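
The filtering described in 2) can be sketched as follows. This is only an illustration under stated assumptions: struct mock_map_info and filter_struct_ops() are hypothetical names that do not exist in bpftool, and the sketch walks an in-memory array where a real tool would query the kernel for each map's bpf_map_info. The value 26 is taken from the "type" field of the sample dump output.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Matches "type": 26 in the sample dump (BPF_MAP_TYPE_STRUCT_OPS). */
#define MAP_TYPE_STRUCT_OPS 26

/* Hypothetical, trimmed-down stand-in for struct bpf_map_info. */
struct mock_map_info {
	unsigned int id;
	unsigned int type;
	char name[16];
};

/*
 * Copy the ids of all struct_ops maps into out[] (at most out_len entries).
 * If name is non-NULL, keep only the maps with that exact name; this stands
 * in for a struct_ops specific filter on top of the map type.
 * Returns the number of ids written to out[].
 */
size_t filter_struct_ops(const struct mock_map_info *maps, size_t nr_maps,
			 const char *name, unsigned int *out, size_t out_len)
{
	size_t i, n = 0;

	assert(maps != NULL || nr_maps == 0);
	assert(out != NULL || out_len == 0);

	for (i = 0; i < nr_maps && n < out_len; i++) {
		if (maps[i].type != MAP_TYPE_STRUCT_OPS)
			continue;	/* not a struct_ops map */
		if (name && strcmp(maps[i].name, name) != 0)
			continue;	/* fails the extra name filter */
		out[n++] = maps[i].id;
	}

	return n;
}
```

Passing NULL as name selects every struct_ops map, while passing a name such as "cubic" narrows the result in the way "dump name cubic" does in the sample output.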
    
    While the first attempt was to reuse parts of map/prog.c, it turned
    out that there is not a lot to share.  The only obvious item is
    map_parse_fds(), but that would still require modifications to
    accommodate struct_ops map specific filtering (for the immediate and
    future needs).  Together with the differences mentioned earlier, it
    is better to keep struct_ops separate from map/prog.c.
    
    The initial set of subcmds is: register, unregister, show, and dump.
    
    The register subcmd registers all struct_ops maps that can be found
    in an obj file.  An option can be added in the future to specify a
    particular struct_ops map.  Also, the common bpf_tcp_cc is stateless
    (e.g. bpf_cubic.c and bpf_dctcp.c).  The "reuse map" feature is not
    implemented in this patch and can also be considered later.
    
    For other subcmds, please see the man doc for details.
    
    A sample output of dump:
    [root@arch-fb-vm1 bpf]# bpftool struct_ops dump name cubic
    [{
            "bpf_map_info": {
                "type": 26,
                "id": 64,
                "key_size": 4,
                "value_size": 256,
                "max_entries": 1,
                "map_flags": 0,
                "name": "cubic",
                "ifindex": 0,
                "btf_vmlinux_value_type_id": 18452,
                "netns_dev": 0,
                "netns_ino": 0,
                "btf_id": 52,
                "btf_key_type_id": 0,
                "btf_value_type_id": 0
            }
        },{
            "bpf_struct_ops_tcp_congestion_ops": {
                "refcnt": {
                    "refs": {
                        "counter": 1
                    }
                },
                "state": "BPF_STRUCT_OPS_STATE_INUSE",
                "data": {
                    "list": {
                        "next": 0,
                        "prev": 0
                    },
                    "key": 0,
                    "flags": 0,
                    "init": "void (struct sock *) bictcp_init/prog_id:138",
                    "release": "void (struct sock *) 0",
                    "ssthresh": "u32 (struct sock *) bictcp_recalc_ssthresh/prog_id:141",
                    "cong_avoid": "void (struct sock *, u32, u32) bictcp_cong_avoid/prog_id:140",
                    "set_state": "void (struct sock *, u8) bictcp_state/prog_id:142",
                    "cwnd_event": "void (struct sock *, enum tcp_ca_event) bictcp_cwnd_event/prog_id:139",
                    "in_ack_event": "void (struct sock *, u32) 0",
                    "undo_cwnd": "u32 (struct sock *) tcp_reno_undo_cwnd/prog_id:144",
                    "pkts_acked": "void (struct sock *, const struct ack_sample *) bictcp_acked/prog_id:143",
                    "min_tso_segs": "u32 (struct sock *) 0",
                    "sndbuf_expand": "u32 (struct sock *) 0",
                    "cong_control": "void (struct sock *, const struct rate_sample *) 0",
                    "get_info": "size_t (struct sock *, u32, int *, union tcp_cc_info *) 0",
                    "name": "bpf_cubic",
                    "owner": 0
                }
            }
        }
    ]
    Signed-off-by: Martin KaFai Lau <kafai@fb.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Quentin Monnet <quentin@isovalent.com>
    Link: https://lore.kernel.org/bpf/20200318171656.129650-1-kafai@fb.com