1. 06 Feb, 2023 1 commit
    • neigh: make sure used and confirmed times are valid · c1d2ecdf
      Julian Anastasov authored
      Entries can linger in the cache without a timer for days, thanks to
      the gc_thresh1 limit. As a result, without traffic, the confirmed
      time can become outdated and appear to be in the future. Later,
      on traffic, NUD_STALE entries can switch to NUD_DELAY and start
      the timer, which can see the invalid confirmed time and wrongly
      switch to the NUD_REACHABLE state instead of NUD_PROBE. As a result,
      the timer is set many days in the future. This is more visible on
      32-bit platforms with a higher HZ value.
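
      The wraparound described above can be illustrated in userspace. This is
      a minimal sketch, not kernel code: it assumes 32-bit tick counters and
      HZ=1000, and the helper names (time_after32, looks_confirmed) are
      hypothetical stand-ins that mirror the signed-difference comparison the
      kernel's time_after()/time_before_eq() macros use. With these
      assumptions, a timestamp older than 2^31 ticks (about 24.9 days)
      compares as being in the future.

```c
#include <stdint.h>

/* Hypothetical 32-bit stand-in for time_after(a, b): true if a is after b.
 * The signed cast is what makes timestamps older than 2^31 ticks wrap
 * around and look like future times. */
static int time_after32(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}

/* Hypothetical check in the spirit of the timer decision: the entry looks
 * recently confirmed if now <= confirmed + window (wrap-safe compare). */
static int looks_confirmed(uint32_t confirmed, uint32_t now, uint32_t window)
{
    return (int32_t)((confirmed + window) - now) >= 0;
}
```

      For example, with confirmed = 0 and now = 2160000000 (25 days at the
      assumed HZ=1000), time_after32(confirmed, now) is true, and
      looks_confirmed() with a 5000-tick window also returns true, even
      though the entry has not been confirmed for weeks.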
      
      Why is this a problem? While we expect unused entries to expire,
      such entries stay in the REACHABLE state for too long, locked in
      the cache. They are not expired normally, only when the cache is full.
      
      The problem and the wrong state change were reported by Zhang Changzhong:
      
      172.16.1.18 dev bond0 lladdr 0a:0e:0f:01:12:01 ref 1 used 350521/15994171/350520 probes 4 REACHABLE
      
      350520 seconds have elapsed since this entry was last updated, but it is
      still in the REACHABLE state (base_reachable_time_ms is 30000),
      preventing lladdr from being updated through probe.
      
      Fix it by ensuring the timer is started with valid used/confirmed
      times. Considering the valid time range is LONG_MAX jiffies,
      we try not to go too far into the past while we are in
      DELAY/PROBE state. There are also places that need
      used/updated times to be validated while the timer is not running.
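
      The idea of keeping used/confirmed within the valid range before the
      timer starts can be sketched as follows. This is a userspace
      illustration under stated assumptions, not the patch itself: it uses
      32-bit ticks, an assumed HZ of 1000, an assumed one-day safety margin
      before the wrap point, and hypothetical names (entry_times,
      clamp_times, time_in_range32).

```c
#include <stdint.h>

#define HZ          1000u            /* assumed tick rate */
#define SAFE_MARGIN (86400u * HZ)    /* assumed margin: one day of ticks */

/* Hypothetical stand-in for a neighbour entry's timestamps. */
struct entry_times {
    uint32_t used;
    uint32_t confirmed;
};

/* Wrap-safe test that lo <= a <= hi. */
static int time_in_range32(uint32_t a, uint32_t lo, uint32_t hi)
{
    return (int32_t)(a - lo) >= 0 && (int32_t)(hi - a) >= 0;
}

/* Pull timestamps that drifted too far into the past back to the oldest
 * value that still compares correctly against 'now'. */
static void clamp_times(struct entry_times *e, uint32_t now)
{
    uint32_t oldest = now - (INT32_MAX - SAFE_MARGIN);

    if (!time_in_range32(e->confirmed, oldest, now))
        e->confirmed = oldest;
    /* Keep used from being older than confirmed. */
    if ((int32_t)(e->used - e->confirmed) < 0)
        e->used = e->confirmed;
}
```

      After clamping, a stale confirmed time no longer compares as being in
      the future, so a timer check against it would not mistake the entry
      for a recently confirmed one. Timestamps already inside the valid
      range are left unchanged.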
      Reported-by: Zhang Changzhong <zhangchangzhong@huawei.com>
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Tested-by: Zhang Changzhong <zhangchangzhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 04 Feb, 2023 7 commits
  3. 03 Feb, 2023 1 commit
  4. 02 Feb, 2023 31 commits