    tcp: seq_file: Replace listening_hash with lhash2 · 05c0b357
    Martin KaFai Lau authored
    This patch moves the TCP seq_file iteration over listeners
    from the port-only listening_hash to the port+addr lhash2.
    
    When iterating from the bpf iter, the next patch will need to
    lock the socket so that the bpf iter can call setsockopt (e.g. to
    change TCP_CONGESTION).  To avoid locking the bucket and then locking
    the sock, the bpf iter will first batch some sockets from the same bucket
    and then unlock the bucket.  If the bucket is small (which it
    usually is), it is easier to batch the whole bucket, which makes it less
    likely that a socket misses a setsockopt because the bucket changed.
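
The batch-then-unlock ordering described above can be modelled in a few lines of userspace C. This is only a single-threaded sketch under stated assumptions, not the kernel implementation: `toy_sock`, `toy_bucket`, `batch_bucket()`, `visit_batch()` and `BATCH_MAX` are hypothetical names, the "locks" are plain flags, and taking a reference on each batched socket is an assumption of the sketch about how a socket could stay valid once the bucket is unlocked.

```c
#include <assert.h>
#include <stddef.h>

#define BATCH_MAX 8              /* hypothetical batch capacity */

struct toy_sock {
    int id;
    int refcnt;                  /* assumed: keeps the socket alive after unlock */
    int locked;                  /* flag standing in for the socket lock */
    int cc;                      /* stand-in for an option such as TCP_CONGESTION */
    struct toy_sock *next;
};

struct toy_bucket {
    int locked;                  /* flag standing in for the bucket lock */
    struct toy_sock *head;
};

/* Phase 1: hold only the bucket lock, remember up to max sockets, unlock. */
static size_t batch_bucket(struct toy_bucket *b, struct toy_sock **batch,
                           size_t max)
{
    size_t n = 0;

    assert(!b->locked);
    b->locked = 1;
    for (struct toy_sock *sk = b->head; sk && n < max; sk = sk->next) {
        sk->refcnt++;            /* take a reference while the bucket is stable */
        batch[n++] = sk;
    }
    b->locked = 0;
    return n;
}

/* Phase 2: the bucket lock is no longer held; lock one socket at a time. */
static void visit_batch(const struct toy_bucket *b, struct toy_sock **batch,
                        size_t n, int new_cc)
{
    for (size_t i = 0; i < n; i++) {
        struct toy_sock *sk = batch[i];

        assert(!b->locked);      /* bucket and socket locks are never nested */
        sk->locked = 1;
        sk->cc = new_cc;         /* models the setsockopt done by the iterator */
        sk->locked = 0;
        sk->refcnt--;            /* drop the reference taken in phase 1 */
    }
}
```

In this model a bucket that fits entirely into one batch is visited in a single pass, which is the case the paragraph above calls the easy one. A bucket larger than `BATCH_MAX` would need a second pass after the bucket lock has been dropped, and the bucket may have changed in between.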
    
    However, the port-only listening_hash could have many listeners
    hashed to a bucket (e.g. many individual VIP(s):443, further
    multiplied by the number of SO_REUSEPORT sockets).  We have seen bucket
    sizes in the tens of thousands.  The chance of changes
    in some popular port buckets (e.g. 443) is also high.
    
    The port+addr lhash2 was introduced to solve this large listener bucket
    issue.  Also, the listening_hash usage has already been replaced with
    lhash2 in the fast path inet[6]_lookup_listener().  This patch follows
    the same direction on moving to lhash2 and iterates the lhash2
    instead of listening_hash.
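
To illustrate why a port-only hash concentrates listeners while a port+addr hash spreads them, here is a small userspace model. It is a sketch under stated assumptions: the bucket count, the multiplicative hash constant, the 10.0.0.0-based addresses and the function names are all illustrative and are not the kernel's actual hash functions or table sizes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BUCKET_BITS 5u                     /* illustrative table size: 32 buckets */
#define NBUCKETS    (1u << BUCKET_BITS)

/* Port-only model: every listener on the same port shares one bucket. */
static unsigned port_only_bucket(uint16_t port)
{
    return port & (NBUCKETS - 1u);
}

/* Port+addr model: a simple multiplicative hash over address and port.
 * This is a stand-in chosen for the sketch, not the kernel's hash. */
static unsigned port_addr_bucket(uint32_t addr, uint16_t port)
{
    uint32_t key = addr + ((uint32_t)port << 16);

    return (uint32_t)(key * 2654435761u) >> (32u - BUCKET_BITS);
}

/* Size of the fullest bucket for n_addrs listeners, one per address,
 * all bound to the same port. */
static size_t max_bucket_len(size_t n_addrs, uint16_t port, int use_addr)
{
    size_t counts[NBUCKETS] = {0};
    size_t max = 0;

    for (size_t i = 0; i < n_addrs; i++) {
        uint32_t addr = 0x0A000000u + (uint32_t)i;   /* 10.0.0.0 + i */
        unsigned b = use_addr ? port_addr_bucket(addr, port)
                              : port_only_bucket(port);

        assert(b < NBUCKETS);
        if (++counts[b] > max)
            max = counts[b];
    }
    return max;
}
```

Calling `max_bucket_len(1024, 443, 0)` puts all 1024 listeners into a single bucket, while `max_bucket_len(1024, 443, 1)` spreads them over the table, so the fullest bucket is much smaller. The smaller the buckets, the more practical it is to batch a whole bucket in one pass.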
    
    Signed-off-by: Martin KaFai Lau <kafai@fb.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Acked-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
    Acked-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/bpf/20210701200606.1035783-1-kafai@fb.com