    bpf: avoid stack copy and use skb ctx for event output · 555c8a86
    Daniel Borkmann authored
    This work addresses a couple of issues the bpf_skb_event_output()
    helper currently has: i) We need two copies instead of just a
    single one for the skb data when it should be part of a sample.
    The data can be non-linear and thus needs to be extracted via
    bpf_skb_load_bytes() helper first, and then copied once again
    into the ring buffer slot. ii) Since bpf_skb_load_bytes()
    currently needs to be used first, the helper needs to see a
    constant size on the passed stack buffer to make sure BPF
    verifier can do sanity checks on it during verification time.
    Thus, just passing skb->len (or any other non-constant value)
    wouldn't work, but changing bpf_skb_load_bytes() is also not
    the proper solution, since the two copies are generally still
    needed. iii) bpf_skb_load_bytes() is just for rather small
    buffers like headers, since they need to sit on the limited
    BPF stack anyway. Instead of working around this in
    bpf_skb_load_bytes(), this work improves the
    bpf_skb_event_output() helper to address all three issues at once.
    
    We can make use of the passed in skb context that we have in
    the helper anyway, and use some of the reserved flag bits as
    a length argument. The helper will use the new __output_custom()
    facility from perf side with bpf_skb_copy() as callback helper
    to walk and extract the data. It will pass the data for setup
    to bpf_event_output(), which generates and pushes the raw record
    with an additional frag part. The linear data used in the first
    frag of the record serves as programmatically defined meta data
    passed along with the appended sample.
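
A minimal userspace sketch of the idea described above, not the kernel code itself: the length is carried in otherwise unused upper bits of the 64-bit flags argument, and a copy callback walks the non-linear packet so its bytes land in the record slot with a single copy, right after the caller-supplied meta data. All names here (SK_INDEX_MASK, SK_CTXLEN_MASK, struct frag, struct pkt, pkt_copy(), event_output()) and the exact bit layout are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout: low 32 bits select an index, the next 20 bits carry
 * the number of context (packet) bytes to append to the sample. */
#define SK_INDEX_MASK  0xffffffffULL
#define SK_CTXLEN_MASK (0xfffffULL << 32)

/* Hypothetical stand-in for a non-linear packet: data split over frags. */
struct frag { const uint8_t *data; size_t len; };
struct pkt  { const struct frag *frags; size_t nr_frags; size_t len; };

/* Copy callback type: copies len bytes starting at offset off from ctx
 * into dst and returns the number of bytes it could NOT copy. */
typedef size_t (*copy_cb_t)(void *dst, const void *ctx, size_t off, size_t len);

/* Walks the fragments and copies the requested range into dst. */
static size_t pkt_copy(void *dst, const void *ctx, size_t off, size_t len)
{
    const struct pkt *p = ctx;
    uint8_t *out = dst;

    for (size_t i = 0; i < p->nr_frags && len > 0; i++) {
        const struct frag *f = &p->frags[i];
        size_t n;

        if (off >= f->len) {        /* range starts in a later fragment */
            off -= f->len;
            continue;
        }
        n = f->len - off;
        if (n > len)
            n = len;
        memcpy(out, f->data + off, n);
        out += n;
        len -= n;
        off = 0;                    /* later fragments are read from their start */
    }
    return len;
}

/* Builds one record in slot: meta data first, then ctx_len packet bytes
 * copied straight from the packet via the callback (no bounce buffer).
 * Returns the record length or a negative errno value. */
static long event_output(uint8_t *slot, size_t slot_len, const struct pkt *p,
                         uint64_t flags, const void *meta, size_t meta_len,
                         copy_cb_t copy)
{
    uint64_t ctx_len = (flags & SK_CTXLEN_MASK) >> 32;

    assert(slot && p && meta && copy);
    if (flags & ~(SK_CTXLEN_MASK | SK_INDEX_MASK))
        return -EINVAL;             /* unknown flag bits set */
    if (ctx_len > p->len)
        return -EFAULT;             /* asked for more than the packet holds */
    if (meta_len > slot_len || ctx_len > slot_len - meta_len)
        return -ENOSPC;             /* record would not fit in the slot */

    memcpy(slot, meta, meta_len);
    if (copy(slot + meta_len, p, 0, (size_t)ctx_len))
        return -EFAULT;
    return (long)(meta_len + ctx_len);
}
```

In this sketch a caller would request, say, the first 5 packet bytes with flags = ((uint64_t)5 << 32) | SK_INDEX_MASK. Because the length travels in the flags value at run time, it does not have to be a constant-size stack buffer.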
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>