• John Fastabend's avatar
    bpf: sockmap, convert bpf_compute_data_pointers to bpf_*_sk_skb · 0ea488ff
    John Fastabend authored
    In commit
    
      'bpf: bpf_compute_data uses incorrect cb structure' (8108a775)
    
    we added the routine bpf_compute_data_end_sk_skb() to compute the
    correct data_end values, but this has since been lost. In kernel
    v4.14 this was correct and the above patch was applied in it
    entirety. Then when v4.14 was merged into v4.15-rc1 net-next tree
    we lost the piece that renamed bpf_compute_data_pointers to the
    new function bpf_compute_data_end_sk_skb. This was done here,
    
    e1ea2f98 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
    
    When it conflicted with the following rename patch,
    
    6aaae2b6 ("bpf: rename bpf_compute_data_end into bpf_compute_data_pointers")
    
    Finally, after a refactor I thought even the function
    bpf_compute_data_end_sk_skb() was no longer needed and it was
    erroneously removed.
    
    However, we never reverted the sk_skb_convert_ctx_access() usage of
    tcp_skb_cb which had been committed and survived the merge conflict.
    Here we fix this by adding back the helper and *_data_end_sk_skb()
    usage. Using the bpf_skc_data_end mapping is not correct because it
    expects a qdisc_skb_cb object but at the sock layer this is not the
    case. Even though it happens to work here because we don't overwrite
    any data in-use at the socket layer and the cb structure is cleared
    later this has potential to create some subtle issues. But, even
    more concretely the filter.c access check uses tcp_skb_cb.
    
    And by some act of chance though,
    
    struct bpf_skb_data_end {
            struct qdisc_skb_cb        qdisc_cb;             /*     0    28 */
    
            /* XXX 4 bytes hole, try to pack */
    
            void *                     data_meta;            /*    32     8 */
            void *                     data_end;             /*    40     8 */
    
            /* size: 48, cachelines: 1, members: 3 */
            /* sum members: 44, holes: 1, sum holes: 4 */
            /* last cacheline: 48 bytes */
    };
    
    and then tcp_skb_cb,
    
    struct tcp_skb_cb {
    	[...]
                    struct {
                            __u32      flags;                /*    24     4 */
                            struct sock * sk_redir;          /*    32     8 */
                            void *     data_end;             /*    40     8 */
                    } bpf;                                   /*          24 */
            };
    
    So when we use offset_of() to track down the byte offset we get 40 in
    either case and everything continues to work. Fix this mess and use
    correct structures its unclear how long this might actually work for
    until someone moves the structs around.
    Reported-by: default avatarMartin KaFai Lau <kafai@fb.com>
    Fixes: e1ea2f98 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
    Fixes: 6aaae2b6 ("bpf: rename bpf_compute_data_end into bpf_compute_data_pointers")
    Signed-off-by: default avatarJohn Fastabend <john.fastabend@gmail.com>
    Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
    0ea488ff
sockmap.c 59.3 KB