    IPVS: Allow boot time change of hash size · 6f7edb48
    Catalin(ux) M. BOIE authored
    I was very frustrated that I had to recompile the kernel to change
    the hash size. So, I created this patch.
    
    If IPVS is built-in, you can append ip_vs.conn_tab_bits=?? to the kernel
    command line, or, if you built IPVS as modules, you can add
    options ip_vs conn_tab_bits=?? to the modprobe configuration.
    
    To keep everything backward compatible, you can still select the size at
    compile time, and that value will be used as the default.
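How a bit count turns into a table size and index mask can be illustrated with a minimal userspace sketch. This is not the patch's code: the struct, the function name, and the rule that out-of-range requests fall back to the default are assumptions made for illustration. The default of 12 bits matches the 4096-entry default quoted in this message.

```c
#include <stdint.h>

/* Hypothetical stand-in for the compile-time default (IP_VS_CONN_TAB_BITS). */
#ifndef DEFAULT_CONN_TAB_BITS
#define DEFAULT_CONN_TAB_BITS 12
#endif

/* Hypothetical container for the values derived from the bit count. */
struct conn_tab_geometry {
    unsigned int bits;  /* log2 of the number of hash buckets */
    uint32_t size;      /* number of hash buckets, 1 << bits */
    uint32_t mask;      /* size - 1, used to reduce a hash to an index */
};

/*
 * Derive the table geometry from a requested bit count.
 * Assumption: requests outside 1..31 fall back to the compile-time
 * default; the real patch may validate differently.
 */
static struct conn_tab_geometry conn_tab_geometry_from_bits(int requested_bits)
{
    struct conn_tab_geometry g;

    if (requested_bits < 1 || requested_bits > 31)
        g.bits = DEFAULT_CONN_TAB_BITS;
    else
        g.bits = (unsigned int)requested_bits;

    g.size = (uint32_t)1 << g.bits;
    g.mask = g.size - 1;
    return g;
}
```

Calling conn_tab_geometry_from_bits(18) yields a size of 262144 and a mask of 262143, so a bucket index is simply hash & mask.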
    
    It has been about a year since this patch was originally posted
    and subsequently dropped on the basis of insufficient test data.
    
    Mark Bergsma has provided the following test results which seem
    to strongly support the need for larger hash table sizes:
    
    We do however run into the same problem with the default setting (2^12 =
    4096 entries), as most of our LVS balancers handle around a million
    connections/SLAB entries at any point in time (around 100-150 kpps
    load). With only 4096 hash table entries this implies that each entry
    consists of a linked list of 256 connections *on average*.
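The chain-length arithmetic above can be checked with a small sketch. The function name is hypothetical, and the model assumes connections spread evenly over the buckets, which a real hash only approximates.

```c
#include <stdint.h>

/*
 * Average number of connections per hash bucket, assuming a perfectly
 * even distribution over 2^tab_bits buckets (tab_bits must be below 64).
 */
static double avg_chain_length(uint64_t connections, unsigned int tab_bits)
{
    uint64_t buckets = (uint64_t)1 << tab_bits;

    return (double)connections / (double)buckets;
}
```

With 12 bits, 2^20 (1048576) connections give exactly 256 per bucket, and an even million gives roughly 244, which matches the "around a million" figure quoted above. Each additional bit halves the average chain.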
    
    To provide some statistics, I did an oprofile run on a 2.6.31 kernel,
    with both the default 4096 table size, and the same kernel recompiled
    with IP_VS_CONN_TAB_BITS set to 18 (2^18 = 262144 entries). I built a
    quick test setup with a part of Wikimedia/Wikipedia's live traffic
    mirrored by the switch to the test host.
    
    With the default setting, at ~ 120 kpps packet load we saw a typical %si
    CPU usage of around 30-35%, and oprofile reported a hot spot in
    ip_vs_conn_in_get:
    
    samples  %        image name               app name      symbol name
    1719761  42.3741  ip_vs.ko                 ip_vs.ko      ip_vs_conn_in_get
    302577    7.4554  bnx2                     bnx2          /bnx2
    181984    4.4840  vmlinux                  vmlinux       __ticket_spin_lock
    128636    3.1695  vmlinux                  vmlinux       ip_route_input
    74345     1.8318  ip_vs.ko                 ip_vs.ko      ip_vs_conn_out_get
    68482     1.6874  vmlinux                  vmlinux       mwait_idle
    
    After loading the recompiled kernel with 2^18 entries, %si CPU usage
    dropped in half to around 12-18%, and oprofile looks much healthier,
    with only 7% spent in ip_vs_conn_in_get:
    
    samples  %        image name               app name     symbol name
    265641   14.4616  bnx2                     bnx2         /bnx2
    143251    7.7986  vmlinux                  vmlinux      __ticket_spin_lock
    140661    7.6576  ip_vs.ko                 ip_vs.ko     ip_vs_conn_in_get
    94364     5.1372  vmlinux                  vmlinux      mwait_idle
    86267     4.6964  vmlinux                  vmlinux      ip_route_input
    
    [ horms@verge.net.au: trivial up-port and minor style fixes ]
    Signed-off-by: Catalin(ux) M. BOIE <catab@embedromix.ro>
    Cc: Mark Bergsma <mark@wikimedia.org>
    Signed-off-by: Simon Horman <horms@verge.net.au>
    Signed-off-by: Patrick McHardy <kaber@trash.net>