    net: Avoid address overwrite in kernel_connect · 0bdf3993
    Jordan Rife authored
    BPF programs that run on connect can rewrite the connect address. For
    the connect system call this isn't a problem, because a copy of the address
    is made when it is moved into kernel space. However, kernel_connect
    simply passes through the address it is given, so the caller may observe
    its address value unexpectedly change.
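
The difference between passing the caller's address straight through and copying it first can be sketched in user space. This is an illustrative model only, not the kernel code: `rewrite_hook`, `proto_connect`, `connect_passthrough`, `connect_with_copy`, and `run_demo` are hypothetical names, and the addresses and port are made-up values. The sketch assumes the copy is made into a local `struct sockaddr_storage` before the address is handed down.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical stand-in for a BPF connect hook: rewrites the address in place. */
static void rewrite_hook(struct sockaddr *addr)
{
	struct sockaddr_in *in = (struct sockaddr_in *)addr;

	in->sin_addr.s_addr = htonl(0x0A000063); /* 10.0.0.99, a made-up backend */
}

/* Hypothetical stand-in for the protocol connect op that runs the hook. */
static int proto_connect(struct sockaddr *addr, int addrlen)
{
	(void)addrlen;
	rewrite_hook(addr);
	return 0;
}

/* Pass-through: the hook writes directly into the caller's buffer. */
static int connect_passthrough(struct sockaddr *addr, int addrlen)
{
	return proto_connect(addr, addrlen);
}

/* Copy first: the hook only ever sees a private copy on this stack frame. */
static int connect_with_copy(struct sockaddr *addr, int addrlen)
{
	struct sockaddr_storage address;

	if (addrlen < 0 || (size_t)addrlen > sizeof(address))
		return -1;
	memcpy(&address, addr, (size_t)addrlen);
	return proto_connect((struct sockaddr *)&address, addrlen);
}

/* Returns 0 if the copy protects the caller and the pass-through does not. */
static int run_demo(void)
{
	struct sockaddr_in vip;
	in_addr_t original;
	int ret;

	memset(&vip, 0, sizeof(vip));
	vip.sin_family = AF_INET;
	vip.sin_port = htons(2049);
	vip.sin_addr.s_addr = htonl(0xC0A80001); /* 192.168.0.1, a made-up virtual IP */
	original = vip.sin_addr.s_addr;

	ret = connect_with_copy((struct sockaddr *)&vip, (int)sizeof(vip));
	assert(ret == 0);
	if (vip.sin_addr.s_addr != original)
		return 1; /* the copy failed to protect the caller's address */

	ret = connect_passthrough((struct sockaddr *)&vip, (int)sizeof(vip));
	assert(ret == 0);
	if (vip.sin_addr.s_addr == original)
		return 2; /* the rewrite was expected to be visible to the caller */

	return 0;
}
```

In this model the caller's virtual IP survives `connect_with_copy` but is overwritten by `connect_passthrough`, which is the behavior the paragraph above describes for a caller that keeps only one copy of its address.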
    
    A practical example where this is problematic is where NFS is combined
    with a system such as Cilium which implements BPF-based load balancing.
    A common pattern in software-defined storage systems is to have an NFS
    mount that connects to a persistent virtual IP which in turn maps to an
    ephemeral server IP. This is usually done to achieve high availability:
    if your server goes down you can quickly spin up a replacement and remap
    the virtual IP to that endpoint. With BPF-based load balancing, mounts
    will forget the virtual IP address when the address rewrite occurs
    because a pointer to the only copy of that address is passed down the
    stack. Server failover then breaks, because clients have forgotten the
    virtual IP address. Reconnects fail and mounts remain broken. This patch
    was tested by setting up a scenario like this and ensuring that NFS
    reconnects worked after applying the patch.
    Signed-off-by: Jordan Rife <jrife@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
socket.c 89.2 KB