    ipv4/icmp: l3mdev: Perform icmp error route lookup on source device routing table (v2) · e1e84eb5
    Mathieu Desnoyers authored
    As per RFC792, ICMP errors should be sent to the source host.
    
    However, in configurations with Virtual Routing and Forwarding tables,
    looking up which routing table to use is currently done by using the
    destination net_device.
    
    commit 9d1a6c4e ("net: icmp_route_lookup should use rt dev to
    determine L3 domain") changes the interface passed to
    l3mdev_master_ifindex() and inet_addr_type_dev_table() from skb_in->dev
    to skb_dst(skb_in)->dev. This effectively uses the destination device
    rather than the source device for choosing which routing table should be
    used to look up where to send the ICMP error.
    
    Therefore, if the source and destination interfaces are within separate
    VRFs, or one in the global routing table and the other in a VRF, looking
    up the source host in the destination interface's routing table will
    fail if the destination interface's routing table contains no route to
    the source host.
    
    One observable effect of this issue is that traceroute does not work in
    the following cases:
    
    - Route leaking between global routing table and VRF
    - Route leaking between VRFs
    
    Preferably use the source device routing table when sending ICMP error
    messages. If no source device is set, fall back on the destination
    device routing table. If neither is set, use the main routing table
    (index 0).
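The selection order described above can be sketched as a small userspace model. This is not the kernel code from this commit: the struct and function names (model_dev, model_skb, model_route_lookup_dev, model_route_lookup_table) are hypothetical stand-ins for struct net_device, struct sk_buff and the l3mdev lookup, and the per-device table field is an assumption made to keep the example self-contained.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct net_device (not the kernel type). */
struct model_dev {
    int ifindex;
    int l3mdev_table;   /* assumed: table id of the enslaving VRF, 0 if none */
};

/* Hypothetical stand-in for struct sk_buff (not the kernel type). */
struct model_skb {
    const struct model_dev *dev;      /* source (ingress) device, may be NULL */
    const struct model_dev *dst_dev;  /* device of the attached route, may be NULL */
};

/* Prefer the source device; fall back on the destination device. */
static const struct model_dev *
model_route_lookup_dev(const struct model_skb *skb)
{
    if (skb->dev)
        return skb->dev;
    if (skb->dst_dev)
        return skb->dst_dev;
    return NULL;
}

/* With no device at all, use the main routing table (index 0). */
static int model_route_lookup_table(const struct model_skb *skb)
{
    const struct model_dev *dev = model_route_lookup_dev(skb);

    return dev ? dev->l3mdev_table : 0;
}
```

In this model, a packet that entered through a device in table 10 and is routed out through a device in table 20 has its ICMP error looked up in table 10, which is the table expected to hold a route back to the source host.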
    
    [ It has been pointed out that a similar issue may exist with ICMP
      errors triggered when forwarding between network namespaces. It would
      be worthwhile to investigate, but is outside of the scope of this
      investigation. ]
    
    [ It has also been pointed out that a similar issue exists with
      unreachable / fragmentation needed messages, which can be triggered by
      changing the MTU of eth1 in r1 to 1400 and running:
    
      ip netns exec h1 ping -s 1450 -Mdo -c1 172.16.2.2
    
      Some investigation points to raw_icmp_error() and raw_err() as being
      involved in this last scenario. The focus of this patch is TTL expired
      ICMP messages, which go through icmp_route_lookup.
      Investigation of failure modes related to raw_icmp_error() is beyond
      this investigation's scope. ]
    
    Fixes: 9d1a6c4e ("net: icmp_route_lookup should use rt dev to determine L3 domain")
    Link: https://tools.ietf.org/html/rfc792
    Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    Reviewed-by: David Ahern <dsahern@gmail.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
icmp.c 32.6 KB