1. 28 Jul, 2014 11 commits
    • Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next · f1b714bb
      David S. Miller authored
      Jeff Kirsher says:
      
      ====================
      Intel Wired LAN Driver Updates 2014-07-25
      
      This series contains updates to e1000e, ixgbe and ixgbevf.
      
      Mark provides all the changes for ixgbe and ixgbevf.  Converts some udelay()
      calls to the preferred usleep_range().  Fixes a spurious release of the
      semaphore in several functions when the semaphore was never acquired in
      the first place.  Fixes an X540 semaphore error where an incorrect check
      treated success as failure and vice versa.  Fixes ixgbe_write_mbx(), which
      returned no error code when it was called with no mbx->ops.write method
      defined; the corresponding read function, like other functions,
      explicitly returns an error in that case.  Removes unused (dead) code.
      Finally, makes return values more direct, eliminating some gotos and
      otherwise unneeded conditionals, which allows the removal of some local
      variables.
      
      David provides all the changes for e1000e.  Fixes CRC errors with jumbo
      traffic for 82579, i217 and i218 client parts by increasing the gap
      between the read and write pointers in the transmit FIFO.  Adds code
      to check and respond to previously ignored return values from NVM
      access functions.  Adds support for EEE in Sx states and fixes EEE in
      S5 with runtime PM enabled.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
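      The spurious-release fix mentioned in the series summary can be
      illustrated with a small user-space model. This is only a sketch of the
      general pattern, not the ixgbe code: FakeHw, try_acquire(), release()
      and read_register() are hypothetical names, and a threading.Lock stands
      in for the hardware semaphore.

```python
import threading


class FakeHw:
    """Hypothetical device guarded by a semaphore (modelled with a Lock)."""

    def __init__(self):
        self._sem = threading.Lock()
        self.releases = 0

    def try_acquire(self):
        # Non-blocking attempt; returns False when the semaphore is busy.
        return self._sem.acquire(blocking=False)

    def release(self):
        # Releasing a semaphore we do not hold is the bug being avoided.
        self.releases += 1
        self._sem.release()


def read_register(hw, value=0x1234):
    """Return (status, value): status 0 on success, negative on error."""
    if not hw.try_acquire():
        # Acquire failed: return directly instead of jumping to a shared
        # exit path that would release a semaphore we never took.
        return -1, None
    try:
        return 0, value
    finally:
        hw.release()
```

      Returning directly on the failure path is also what makes a shared
      "goto out" label and its status variable unnecessary.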
    • Merge branch 'inet_frag_kill_lru_list' · 6ceed786
      David S. Miller authored
      Nikolay Aleksandrov says:
      
      ====================
      inet: frag: cleanup and update
      
      The end goal of this patchset is to remove the LRU list and to move the
      frag eviction to a work queue. It also does a couple of necessary cleanups
      and fixes. Brief patch descriptions:
      Patches 1 - 3 inclusive: necessary clean ups
      Patch 4 moves the eviction from the softirqs to a workqueue.
      Patch 5 removes the nqueues counter which was protected by the LRU lock
      Patch 6 removes the, by now unused, lru list.
      Patch 7 moves the rebuild timer to the workqueue and schedules the rebuilds
              only if we've hit the maximum queue length on some of the chains.
      Patch 8 migrates the rwlock to a seqlock since the rehash is usually a rare
              operation.
      Patch 9 introduces an artificial global memory limit based on the value of
              init_net's high_thresh which is used to cap the high_thresh of the
              other namespaces. Also introduces some sane limits on the other
              tunables, and makes it impossible to have low_thresh > high_thresh.
      
      Here are some numbers from running netperf before and after the patchset:
      Each test uses the following settings: -I 95,5 -i 15,10
      
      1. Bound test (-T 4,4)
      1.1 Virtio before the patchset -
      MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.122.177 () port 0 AF_INET : +/-2.500% @ 95% conf.  : cpu bind
      Socket  Message  Elapsed      Messages                   CPU      Service
      Size    Size     Time         Okay Errors   Throughput   Util     Demand
      bytes   bytes    secs            #      #   10^6bits/sec % SS     us/KB
      
      212992   64000   30.00      722177      0    12325.1     34.55    2.025
      212992           30.00      368020            6280.9     34.05    0.752
      
      1.2 Virtio after the patchset -
      MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.122.177 () port 0 AF_INET : +/-2.500% @ 95% conf.  : cpu bind
      Socket  Message  Elapsed      Messages                   CPU      Service
      Size    Size     Time         Okay Errors   Throughput   Util     Demand
      bytes   bytes    secs            #      #   10^6bits/sec % SS     us/KB
      
      212992   64000   30.00      727030      0    12407.9     35.45    1.876
      212992           30.00      505405            8625.5     34.92    0.693
      
      2. Virtio unbound test
      2.1 Before the patchset
      MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.122.177 () port 0 AF_INET : +/-2.500% @ 95% conf.
      Socket  Message  Elapsed      Messages
      Size    Size     Time         Okay Errors   Throughput
      bytes   bytes    secs            #      #   10^6bits/sec
      
      212992   64000   30.00      730008      0    12458.77
      212992           30.00      416721           7112.02
      
      2.2 After the patchset
      MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.122.177 () port 0 AF_INET : +/-2.500% @ 95% conf.
      Socket  Message  Elapsed      Messages
      Size    Size     Time         Okay Errors   Throughput
      bytes   bytes    secs            #      #   10^6bits/sec
      
      212992   64000   30.00      731129      0    12477.89
      212992           30.00      487707           8323.50
      
      3. 10 gig unbound tests
      3.1 Before the patchset
      MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.133.1 () port 0 AF_INET : +/-2.500% @ 95% conf.
      Socket  Message  Elapsed      Messages
      Size    Size     Time         Okay Errors   Throughput
      bytes   bytes    secs            #      #   10^6bits/sec
      
      212992   64000   30.00      417209      0    7120.33
      212992           30.00      416740           7112.33
      
      3.2 After the patchset
      MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.133.1 () port 0 AF_INET : +/-2.500% @ 95% conf.
      Socket  Message  Elapsed      Messages
      Size    Size     Time         Okay Errors   Throughput
      bytes   bytes    secs            #      #   10^6bits/sec
      
      212992   64000   30.00      438009      0    7475.33
      212992           30.00      437630           7468.87
      
      Given the options, each netperf ran between 10 and 15 times for 30 seconds
      to get the necessary confidence; the tests themselves ran 3 times and
      were consistent.
      Another set of tests that I ran was parallel stress tests which consisted
      of flooding the machine with fragmented packets from different sources with
      frag timeout set to 0 (so there're lots of timeouts) and low_thresh set to
      1 byte (so evictions are happening all the time) and on top of that running
      a namespace create/destroy endless loop with network interfaces and
      addresses that got flooded (for the brief periods they were up) in parallel.
      This test ran for an hour without any issues.
      ====================
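      To make the netperf tables easier to compare, the relative change can
      be computed from the second row of each table, which in a netperf
      UDP_STREAM report is the receive side. The snippet below is only a
      convenience calculation over the figures quoted above; the dictionary
      labels are ad hoc.

```python
# Receive-side throughput in 10^6 bits/sec: (before, after) the patchset.
results = {
    "virtio bound (-T 4,4)": (6280.9, 8625.5),
    "virtio unbound": (7112.02, 8323.50),
    "10 gig unbound": (7112.33, 7468.87),
}


def percent_gain(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


gains = {name: percent_gain(b, a) for name, (b, a) in results.items()}

for name, gain in gains.items():
    print(f"{name}: {gain:+.1f}%")
```

      This works out to gains of roughly 37%, 17% and 5% respectively, while
      the send-side rows change by under 1% in the virtio runs.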
    • inet: frag: set limits and make init_net's high_thresh limit global · 1bab4c75
      Nikolay Aleksandrov authored
      This patch makes init_net's high_thresh limit the maximum for all
      namespaces, thus introducing a global memory limit threshold equal to the
      sum of the individual high_thresh limits, each of which is capped.
      It also introduces some sane minimums for low_thresh as it shouldn't be
      able to drop below 0 (or > high_thresh in the unsigned case), and
      overall low_thresh should not ever be above high_thresh, so we make the
      following relations for a namespace:
      init_net:
       high_thresh = max(not capped), min(init_net low_thresh)
       low_thresh = max(init_net high_thresh), min(0)
      
      all other namespaces:
       high_thresh = max(init_net high_thresh), min(namespace's low_thresh)
       low_thresh = max(namespace's high_thresh), min(0)
      
      The major issue with having low_thresh > high_thresh is that we'll
      schedule eviction but never evict anything and thus rely only on the
      timers.
      Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
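      The min/max relations listed in this commit message can be modelled as
      range checks on each write. The sketch below is a user-space
      illustration only: FragLimits, set_high_thresh() and set_low_thresh()
      are hypothetical names, and rejecting an out-of-range write (rather
      than clamping it) is an assumption about how the limits are enforced.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FragLimits:
    """Per-namespace fragment memory thresholds, in bytes."""
    high_thresh: int
    low_thresh: int


def set_high_thresh(ns: FragLimits, value: int,
                    init_net: Optional[FragLimits] = None) -> bool:
    """Try to set ns.high_thresh; init_net=None means ns is init_net itself.

    Allowed range: [ns.low_thresh, init_net.high_thresh], with no upper
    cap for init_net.
    """
    if value < ns.low_thresh:
        return False
    if init_net is not None and value > init_net.high_thresh:
        return False
    ns.high_thresh = value
    return True


def set_low_thresh(ns: FragLimits, value: int) -> bool:
    """Try to set ns.low_thresh; allowed range is [0, ns.high_thresh]."""
    if value < 0 or value > ns.high_thresh:
        return False
    ns.low_thresh = value
    return True
```

      With these checks low_thresh can never end up above high_thresh, which
      is the situation the message warns about: eviction gets scheduled but
      nothing is ever evicted.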
    • inet: frag: use seqlock for hash rebuild · ab1c724f
      Florian Westphal authored
      Rehash is a rare operation, so don't force readers to take
      the read-side rwlock.
      
      Instead, we only have to detect the (rare) case where
      the secret was altered while we are trying to insert
      a new inetfrag queue into the table.
      
      If it was changed, drop the bucket lock and recompute
      the hash to get the 'new' chain bucket that we have to
      insert into.
      
      Joint work with Nikolay Aleksandrov.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
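      The retry scheme described in this commit message can be sketched as a
      single-threaded model: sample a sequence counter, compute the bucket,
      take the bucket lock, and retry if the counter moved in the meantime.
      FragTable and its methods are hypothetical, the hash mix is arbitrary,
      and the race parameter exists only to simulate a rebuild happening
      between the hash computation and the lock. The real seqlock primitives
      also handle a writer that is mid-update, which this model ignores.

```python
import threading


class FragTable:
    """Toy hash table whose bucket choice depends on a changeable secret."""

    def __init__(self, nbuckets=8):
        self.seq = 0          # bumped twice per rebuild; even means stable
        self.secret = 1
        self.buckets = [[] for _ in range(nbuckets)]
        self.locks = [threading.Lock() for _ in range(nbuckets)]

    def bucket_index(self, key):
        return ((key * 2654435761) ^ self.secret) % len(self.buckets)

    def rebuild(self, new_secret):
        """Change the secret and rehash every entry (the rare write side)."""
        self.seq += 1
        entries = [k for bucket in self.buckets for k in bucket]
        self.secret = new_secret
        for bucket in self.buckets:
            bucket.clear()
        for k in entries:
            self.buckets[self.bucket_index(k)].append(k)
        self.seq += 1

    def insert(self, key, race=None):
        """Insert key, retrying if the secret changed under us."""
        while True:
            start = self.seq                  # sample the sequence counter
            idx = self.bucket_index(key)
            if race is not None:              # test hook: simulated rebuild
                race()
                race = None
            with self.locks[idx]:
                if self.seq == start:         # secret unchanged: safe
                    self.buckets[idx].append(key)
                    return idx
            # Secret changed: the bucket lock is dropped here and the
            # loop recomputes the hash to find the new chain.
```

      The common path pays only for a counter read and compare instead of a
      read-side rwlock acquisition.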
    • inet: frag: remove periodic secret rebuild timer · e3a57d18
      Florian Westphal authored
      merge functionality into the eviction workqueue.
      
      Instead of rebuilding every n seconds, take advantage of the upper
      hash chain length limit.
      
      If we hit it, mark table for rebuild and schedule workqueue.
      To prevent frequent rebuilds when we're completely overloaded,
      don't rebuild more than once every 5 seconds.
      
      The ipfrag_secret_interval sysctl is now obsolete and has been marked as
      deprecated. It can still be changed, so scripts won't be broken, but it
      won't have any effect. A comment is left above each unused secret_timer
      variable to avoid confusion.
      
      Joint work with Nikolay Aleksandrov.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
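      The rebuild policy described in this commit message (mark the table
      when a chain gets too long, then rebuild from the worker at most once
      every 5 seconds) can be sketched as follows. All names are
      hypothetical, MAX_CHAIN_DEPTH is an assumed value, and leaving the
      request pending while rate limited is a modelling choice rather than
      something the message specifies.

```python
MAX_CHAIN_DEPTH = 128        # assumed upper limit on hash chain length
MIN_REBUILD_INTERVAL = 5.0   # seconds, per the commit message


class RebuildState:
    def __init__(self):
        self.rebuild_pending = False
        self.last_rebuild = None   # time of the last rebuild, if any
        self.rebuilds = 0


def note_chain_depth(state, depth):
    """Lookup path: return True if the worker should be scheduled."""
    if depth > MAX_CHAIN_DEPTH:
        state.rebuild_pending = True
        return True
    return False


def worker(state, now):
    """Worker: rebuild if requested and not rate limited; True if rebuilt."""
    if not state.rebuild_pending:
        return False
    if (state.last_rebuild is not None
            and now - state.last_rebuild < MIN_REBUILD_INTERVAL):
        return False           # too soon; the request stays pending
    state.rebuild_pending = False
    state.last_rebuild = now
    state.rebuilds += 1        # a real rebuild would pick a new secret here
    return True
```

      Passing the current time in as an argument keeps the sketch
      deterministic; a caller could supply time.monotonic().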
    • inet: frag: remove lru list · 3fd588eb
      Florian Westphal authored
      no longer used.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • inet: frag: don't account number of fragment queues · 434d3054
      Florian Westphal authored
      The 'nqueues' counter is protected by the lru list lock;
      once that's removed, this would need to be converted to an atomic
      counter.  Given this isn't used for anything except for
      reporting it to userspace via /proc, just remove it.
      
      We still report the memory currently used by fragment
      reassembly queues.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • inet: frag: move eviction of queues to work queue · b13d3cbf
      Florian Westphal authored
      When the high_thresh limit is reached we try to toss the 'oldest'
      incomplete fragment queues until memory limits are below the low_thresh
      value.  This happens in softirq/packet processing context.
      
      This has two drawbacks:
      
      1) processors might evict a queue that was about to be completed
      by another cpu, because they will compete wrt. resource usage and
      resource reclaim.
      
      2) LRU list maintenance is expensive.
      
      But when constantly overloaded, even the 'least recently used' element is
      recent, so removing 'lru' queue first is not 'fairer' than removing any
      other fragment queue.
      
      This moves eviction out of the fast path:
      
      When the low threshold is reached, a work queue is scheduled
      which then iterates over the table and removes the queues that exceed
      the memory limits of the namespace. It sets a new flag called
      INET_FRAG_EVICTED on the evicted queues so the proper counters will get
      incremented when the queue is forcefully expired.
      
      When the high threshold is reached, no more fragment queues are
      created until we're below the limit again.
      
      The LRU list is now unused and will be removed in a followup patch.
      
      Joint work with Nikolay Aleksandrov.
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
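      The two-threshold behaviour described in this commit message can be
      modelled in a few lines: crossing low_thresh only schedules the worker,
      crossing high_thresh refuses new queues, and the worker evicts until
      usage is back at or under low_thresh. This is a simplified sketch with
      hypothetical names; it skips the lookup of existing queues, and the
      evicted list merely stands in for the INET_FRAG_EVICTED flag.

```python
class NetnsFrags:
    """Toy per-namespace fragment accounting."""

    def __init__(self, high_thresh, low_thresh):
        self.high_thresh = high_thresh
        self.low_thresh = low_thresh
        self.mem = 0
        self.queues = {}            # key -> bytes held by that queue
        self.evicted = []           # keys that were forcefully expired
        self.work_scheduled = False


def frag_find(ns, key, size):
    """Fast path: create a queue of `size` bytes unless over high_thresh."""
    if ns.mem > ns.low_thresh:
        ns.work_scheduled = True    # defer eviction to the worker
    if ns.mem > ns.high_thresh:
        return False                # no new queues until we are below
    ns.queues[key] = size
    ns.mem += size
    return True


def evict_worker(ns):
    """Worker: walk the table and evict until usage is within low_thresh."""
    ns.work_scheduled = False
    for key in list(ns.queues):
        if ns.mem <= ns.low_thresh:
            break
        ns.mem -= ns.queues.pop(key)
        ns.evicted.append(key)      # flagged so counters can be updated
```

      Because the fast path never evicts, one CPU can no longer throw away a
      queue that another CPU was about to complete just to make room for its
      own.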
    • inet: frag: move evictor calls into frag_find function · 86e93e47
      Florian Westphal authored
      First step to move eviction handling into a work queue.
      
      We lose two spots that accounted evicted fragments in MIB counters.
      
      Accounting will be restored since the upcoming work-queue evictor
      invokes the frag queue timer callbacks instead.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • inet: frag: remove hash size assumptions from callers · fb3cfe6e
      Florian Westphal authored
      Hide the actual hash size from individual users: the _find
      function will now fold the given hash value into the required range.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
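      Folding a hash into the table range, as this commit message describes,
      is commonly done with a mask when the table size is a power of two.
      The sketch below assumes that approach; HASH_SIZE and fold_hash() are
      illustrative names and the size of 1024 is an assumption, not a value
      taken from the patch.

```python
HASH_SIZE = 1024  # assumed table size; must be a power of two


def fold_hash(hash_value, table_size=HASH_SIZE):
    """Reduce an arbitrary non-negative hash to a valid bucket index."""
    if table_size <= 0 or table_size & (table_size - 1):
        raise ValueError("table_size must be a power of two")
    return hash_value & (table_size - 1)
```

      Callers can then return a full-width hash and leave the range
      reduction to the lookup function, so the table size can change without
      touching them.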
  2. 26 Jul, 2014 12 commits
  3. 25 Jul, 2014 17 commits