• Martin Belanger's avatar
    nvme-tcp: allow selecting the network interface for connections · 3ede8f72
    Martin Belanger authored
    In our application, we need a way to force TCP connections to go out a
    specific IP interface instead of letting Linux select the interface
    based on the routing tables.
    
    Add the 'host-iface' option to allow specifying the interface to use.
    When the option host-iface is specified, the driver uses the specified
    interface to set the option SO_BINDTODEVICE on the TCP socket before
    connecting.
    
    This new option is needed in addtion to the existing host-traddr for
    the following reasons:
    
    Specifying an IP interface by its associated IP address is less
    intuitive than specifying the actual interface name and, in some cases,
    simply doesn't work. That's because the association between interfaces
    and IP addresses is not predictable. IP addresses can be changed or can
    change by themselves over time (e.g. DHCP). Interface names are
    predictable [1] and will persist over time. Consider the following
    configuration.
    
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state ...
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 100.0.0.100/24 scope global lo
           valid_lft forever preferred_lft forever
    2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc ...
        link/ether 08:00:27:21:65:ec brd ff:ff:ff:ff:ff:ff
        inet 100.0.0.100/24 scope global enp0s3
           valid_lft forever preferred_lft forever
    3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc ...
        link/ether 08:00:27:4f:95:5c brd ff:ff:ff:ff:ff:ff
        inet 100.0.0.100/24 scope global enp0s8
           valid_lft forever preferred_lft forever
    
    The above is a VM that I configured with the same IP address
    (100.0.0.100) on all interfaces. Doing a reverse lookup to identify the
    unique interface associated with 100.0.0.100 does not work here. And
    this is why the option host_iface is required. I understand that the
    above config does not represent a standard host system, but I'm using
    this to prove a point: "We can never know how users will configure
    their systems". By te way, The above configuration is perfectly fine
    by Linux.
    
    The current TCP implementation for host_traddr performs a
    bind()-before-connect(). This is a common construct to set the source
    IP address on a TCP socket before connecting. This has no effect on how
    Linux selects the interface for the connection. That's because Linux
    uses the Weak End System model as described in RFC1122 [2]. On the other
    hand, setting the Source IP Address has benefits and should be supported
    by linux-nvme. In fact, setting the Source IP Address is a mandatory
    FedGov requirement (e.g. connection to a RADIUS/TACACS+ server).
    Consider the following configuration.
    
    $ ip addr list dev enp0s8
    3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc ...
        link/ether 08:00:27:4f:95:5c brd ff:ff:ff:ff:ff:ff
        inet 192.168.56.101/24 brd 192.168.56.255 scope global enp0s8
           valid_lft 426sec preferred_lft 426sec
        inet 192.168.56.102/24 scope global secondary enp0s8
           valid_lft forever preferred_lft forever
        inet 192.168.56.103/24 scope global secondary enp0s8
           valid_lft forever preferred_lft forever
        inet 192.168.56.104/24 scope global secondary enp0s8
           valid_lft forever preferred_lft forever
    
    Here we can see that several addresses are associated with interface
    enp0s8. By default, Linux always selects the default IP address,
    192.168.56.101, as the source address when connecting over interface
    enp0s8. Some users, however, want the ability to specify a different
    source address (e.g., 192.168.56.102, 192.168.56.103, ...). The option
    host_traddr can be used as-is to perform this function.
    
    In conclusion, I believe that we need 2 options for TCP connections.
    One that can be used to specify an interface (host-iface). And one that
    can be used to set the source address (host-traddr). Users should be
    allowed to use one or the other, or both, or none. Of course, the
    documentation for host_traddr will need some clarification. It should
    state that when used for TCP connection, this option only sets the
    source address. And the documentation for host_iface should say that
    this option is only available for TCP connections.
    
    References:
    [1] https://www.freedesktop.org/wiki/Software/systemd/PredictableNetworkInterfaceNames/
    [2] https://tools.ietf.org/html/rfc1122
    
    Tested both IPv4 and IPv6 connections.
    Signed-off-by: default avatarMartin Belanger <martin.belanger@dell.com>
    Reviewed-by: default avatarSagi Grimberg <sagi@grimberg.me>
    Reviewed-by: default avatarHannes Reinecke <hare@suse.de>
    Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
    3ede8f72
tcp.c 65.6 KB